[HTTPS-Everywhere] Target Host format in rules / performance

Tue Jan 7 20:33:17 PST 2014

Hey all,

First post, so feel free to swat down anything I say that is way off base
or totally ignorant.

I started looking at the code recently, with a specific focus on the code that
handles matching rules to a given host (much of which is in the
"potentiallyApplicableRulesets" method in rules.js). Some tracing there led
me to start strategically grepping the rule files a bit and I noticed that
some rules had target host patterns of the following form: "x.y.z.*".

I suppose my first and possibly naive question is why are target hosts of
that form allowed (or necessary)? One example rule file is
"GoogleMainSearch.xml", and while it has several instances of
"google.something.*", it doesn't seem to "use" these cases in the from/to
mappings. I also foresee security problems if a rule is allowed to match
on something like "www.google.*".
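
To make the concern concrete, here is a minimal sketch of how a
trailing-wildcard target might be matched. The helper name
"matchesTrailingWildcard" and the reading of "*" as exactly one label are
my own assumptions for illustration; this is not what rules.js actually does.

```javascript
// Hypothetical helper, not taken from rules.js.
// Assumption: a trailing "*" in a target stands for exactly one DNS label.
function matchesTrailingWildcard(host, target) {
  var hostLabels = host.toLowerCase().split(".");
  var targetLabels = target.toLowerCase().split(".");

  // Under the one-label assumption, both sides need the same label count.
  if (hostLabels.length !== targetLabels.length) {
    return false;
  }

  for (var i = 0; i < targetLabels.length; i++) {
    var isLast = (i === targetLabels.length - 1);
    if (isLast && targetLabels[i] === "*") {
      // The wildcard accepts any non-empty final label.
      if (hostLabels[i] === "") {
        return false;
      }
    } else if (hostLabels[i] !== targetLabels[i]) {
      return false;
    }
  }
  return true;
}

// Matches the intended host, but also the same name under any other
// final label:
console.log(matchesTrailingWildcard("www.google.com", "www.google.*"));
console.log(matchesTrailingWildcard("www.google.example", "www.google.*"));
// A two-label suffix such as "co.uk" is not covered under this assumption:
console.log(matchesTrailingWildcard("www.google.co.uk", "www.google.*"));
```

If the real matcher behaves anything like this, a target such as
"www.google.*" covers hosts that the rule author may never have intended.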

I won't go into much detail about my motivation, but the reason I ask the
above question is to help guide my thinking about possible new data
structures that could represent the entire set of rules, as well as more
efficient algorithms for finding and applying potentially applicable rules.

Cheers!

-- 
- John K. Stinson