[HTTPS-Everywhere] Style for HTTPS Everywhere rules that cover a lot of domains

David Batley httpseverywhere at dbatley.com
Sat Feb 19 19:17:23 PST 2011


Oops, didn't realise there was a https-everywhere-rules list :(

A catch-all rule is a bit trickier in this case, as there are a couple
of domains which use the "secure" subdomain for https (i.e.
http://www.example.com to https://secure.example.com).

If the order in which the rule statements are executed is guaranteed, it could do:
  <!-- these need redirect to secure subdomain -->
  <rule from="^http://www\.(domains-needing-redirect-to-secure)/"
        to="https://secure.$1/" />
  <!-- everything else uses the same domain for https -->
  <rule from="^http://www\.([^/]+)/" to="https://www.$1/" />

This would remove the text-editor-melting-ly long line. I think the
code will apply these two in the correct order, although if that ever
changes then it could break things unexpectedly.
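
To illustrate why the order matters, here is a minimal Python sketch. It
assumes rules are tried top to bottom and the first match wins; that is an
assumption for the sake of the example, not confirmed behaviour of the
extension. The names RULES and rewrite are hypothetical, and example.com
stands in for the domains that need the "secure" subdomain.

```python
import re

# Hypothetical ordered rule list: (pattern, replacement) pairs.
# ASSUMPTION: rules are tried in order and the first match wins.
RULES = [
    # Special case: domains that serve https from the "secure" subdomain.
    (re.compile(r"^http://www\.(example\.com)/"), r"https://secure.\1/"),
    # Catch-all: everything else keeps the same host name.
    (re.compile(r"^http://www\.([^/]+)/"), r"https://www.\1/"),
]


def rewrite(url):
    """Return url rewritten by the first matching rule, or unchanged."""
    for pattern, replacement in RULES:
        new_url, count = pattern.subn(replacement, url, count=1)
        if count:
            return new_url
    return url
```

If the two entries in RULES were swapped, the catch-all would match first and
http://www.example.com/ would never reach the "secure" rule, which is the
breakage described above.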

--dave

On 20 February 2011 02:12, Peter Eckersley <pde at eff.org> wrote:
> Thanks for the ruleset!  Theoretically rulesets should be posted to the
> https-everywhere-rules mailing list, but this one raises an interesting
> general question of style:
>
>  If you're trying to write a rule that covers dozens of domains at once,
>  should you write a single regexp to do all of that, or a different regexp
>  for each case?
>
> Readability argues for the latter.  Performance argues for the former.  Which
> wins?  The answer is that readability would win [*], but the best solution
> lets you have it both ways.  Just copy this excellent hack that Jeroen van der
> Gun used in the rulesets for the Netherlands government:
>
> https://gitweb.torproject.org/https-everywhere.git/blob/70f2a1999e135a94129283f332a1ef1ebf8edb2b:/src/chrome/content/rules/Nederland.xml
>
> [*] The <target host> mechanism should guarantee acceptable performance in all
> cases regardless of how baroque rulesets are.
>
> On Sat, Feb 19, 2011 at 09:38:27PM +0000, David Batley wrote:
>> Hi,
>>
>> I've made some rules for local government websites in the UK.
>>
>> http://dbatley.com/https/localgov/UKLocalGovernment.xml
>>
>> There's about 450 local councils, so I decided to automate the process
>> by fetching the homepage of each website (following the 302 redirects
>> to get to the actual website), then fetching the http and https
>> version of the homepage and comparing them. Although I've uploaded the
>> script to http://dbatley.com/https/localgov/ , it's really bad
>> hacked-together code: unstructured code with no parallelisation, no
>> retry on failure, no cookie support, and with plenty of hard-coded
>> special cases.
>>
>> The script is really only good for getting the initial list of
>> websites, and weeding out the un-interesting cases. It doesn't (for
>> example) check the validity of the ssl cert. So I have visited each of
>> the entries by hand to check they work.
>>
>> I've done all these councils in one file because there's a lot of them
>> (would spam the config dialog), and most people are only likely to
>> visit their local one. Hope that's ok?
>>
>> --dave
>>
>> PS: https://www.derrycity.gov.uk/ was an interesting case - they're on
>> a shared hosting plan and aren't the default https entry :)
>
> --
> Peter Eckersley                            pde at eff.org
> Senior Staff Technologist         Tel  +1 415 436 9333 x131
> Electronic Frontier Foundation    Fax  +1 415 436 9993
>
