[HTTPS-Everywhere] Style for HTTPS Everywhere rules that cover a lot of domains

Peter Eckersley pde at eff.org
Sat Feb 19 18:12:28 PST 2011


Thanks for the ruleset!  Theoretically, rulesets should be posted to the
https-everywhere-rules mailing list, but this one raises an interesting
general question of style:

  If you're trying to write a rule that covers dozens of domains at once,
  should you write a single regexp to do all of that, or a different regexp
  for each case?

Readability argues for the latter.  Performance argues for the former.  Which
wins?  The answer is that readability would win [*], but the best solution
lets you have it both ways.  Just copy this excellent hack that Jeroen van der
Gun used in the rulesets for the Netherlands government:

https://gitweb.torproject.org/https-everywhere.git/blob/70f2a1999e135a94129283f332a1ef1ebf8edb2b:/src/chrome/content/rules/Nederland.xml

[*] The <target host> mechanism should guarantee acceptable performance in all
cases regardless of how baroque rulesets are.
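
To make the two styles concrete, here is a minimal sketch of each.  The
ruleset names and the council-a/council-b domains under .example are
hypothetical placeholders, not taken from the UK ruleset or from
Nederland.xml, and this is not meant to reproduce Jeroen's hack; it only
contrasts the two basic options.  One rule per domain:

```xml
<!-- Hypothetical example: one <rule> per site, easy to read and audit. -->
<ruleset name="Example Councils (one rule per domain)">
  <target host="www.council-a.example" />
  <target host="www.council-b.example" />

  <rule from="^http://www\.council-a\.example/"
          to="https://www.council-a.example/" />
  <rule from="^http://www\.council-b\.example/"
          to="https://www.council-b.example/" />
</ruleset>
```

And a single regexp covering the same (hypothetical) domains:

```xml
<!-- Hypothetical example: one <rule> with an alternation; $1 carries the
     matched name into the rewritten URL. -->
<ruleset name="Example Councils (single regexp)">
  <target host="www.council-a.example" />
  <target host="www.council-b.example" />

  <rule from="^http://www\.(council-a|council-b)\.example/"
          to="https://www.$1.example/" />
</ruleset>
```

With two domains the difference is small; with hundreds, the alternation in
the second form becomes hard to review, which is the readability cost
discussed above.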

On Sat, Feb 19, 2011 at 09:38:27PM +0000, David Batley wrote:
> Hi,
> 
> I've made some rules for local government websites in the UK.
> 
> http://dbatley.com/https/localgov/UKLocalGovernment.xml
> 
> There's about 450 local councils, so I decided to automate the process
> by fetching the homepage of each website (following the 302 redirects
> to get to the actual website), then fetching the http and https
> version of the homepage and comparing them. Although I've uploaded the
> script to http://dbatley.com/https/localgov/ , it's really bad
> hacked-together code: unstructured code with no parallelization, no
> retry on failure, no cookie support, and with plenty of hard-coded
> special cases.
> 
> The script is really only good for getting the initial list of
> websites, and weeding out the un-interesting cases. It doesn't (for
> example) check the validity of the ssl cert. So I have visited each of
> the entries by hand to check they work.
> 
> I've done all these councils in one file because there's a lot of them
> (would spam the config dialog), and most people are only likely to
> visit their local one. Hope that's ok?
> 
> --dave
> 
> PS: https://www.derrycity.gov.uk/ was an interesting case - they're on
> a shared hosting plan and aren't the default https entry :)
> _______________________________________________
> HTTPS-everywhere mailing list
> HTTPS-everywhere at mail1.eff.org
> https://mail1.eff.org/mailman/listinfo/https-everywhere

-- 
Peter Eckersley                            pde at eff.org
Senior Staff Technologist         Tel  +1 415 436 9333 x131
Electronic Frontier Foundation    Fax  +1 415 436 9993


