[HTTPS-Everywhere] Proposal: ruleset maintainers and test URLs
Jacob S Hoffman-Andrews
jsha at eff.org
Wed Aug 13 07:32:29 PDT 2014
Hi all,
As HTTPS Everywhere encompasses more sites, we're having trouble
validating and maintaining rulesets in a scalable way. I'd like to
propose two small changes that should help keep things running
smoothly: Every ruleset should have a maintainer, and every ruleset
should have a set of test URLs.
The maintainer requirement is pretty straightforward. We would
probably specify it on the top-level ruleset node, e.g.:
<ruleset name="Twitter" maintainer="jsha at eff.org (Jacob Hoffman-Andrews)">
For the test URLs, we would like at least one URL that exercises
each rewriting rule, so it makes sense to hang test URLs off of the
rule tag:
(old)
<rule from="^http://(?:www\.)?t\.co/"
to="https://t.co/" />
(new)
<rule from="^http://(?:www\.)?t\.co/"
to="https://t.co/">
<test href="http://t.co/kU0aUmcm4u" />
<test href="http://www.t.co/kU0aUmcm4u" />
</rule>
Maintainers would be responsible for choosing the right number of
tests to get adequate coverage when a pattern covers multiple hosts,
but there would be a test-enforced minimum of two URLs per rule. In
the case of bare domain rewrites, the test URLs should cover both
'/' and some page other than the root.
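As a rough illustration of how the maintainer and minimum-test checks could be enforced, here is a minimal Python sketch. The names lint_ruleset and MIN_TESTS_PER_RULE are hypothetical, and it assumes the "maintainer" attribute and <test href="..."> element look as proposed above. It also checks that each test URL matches its rule's "from" pattern, using Python's regex engine as a stand-in for the JavaScript one the extension uses.

```python
import re
import xml.etree.ElementTree as ET

MIN_TESTS_PER_RULE = 2  # the proposed test-enforced minimum (hypothetical name)


def lint_ruleset(xml_text):
    """Return a list of human-readable problems found in one ruleset."""
    problems = []
    ruleset = ET.fromstring(xml_text.strip())
    name = ruleset.get("name", "<unnamed>")

    # Every ruleset should name a maintainer on the top-level node.
    if not ruleset.get("maintainer"):
        problems.append(f"{name}: missing maintainer attribute")

    for rule in ruleset.iter("rule"):
        pattern = rule.get("from", "")
        tests = [t.get("href", "") for t in rule.findall("test")]

        if len(tests) < MIN_TESTS_PER_RULE:
            problems.append(
                f"{name}: rule {pattern!r} has {len(tests)} test URL(s), "
                f"need at least {MIN_TESTS_PER_RULE}"
            )

        # A test URL that the pattern does not match cannot exercise the rule.
        # Assumption: the pattern is simple enough for Python's re module.
        for url in tests:
            if not re.search(pattern, url):
                problems.append(f"{name}: test URL {url} does not match {pattern!r}")

    return problems


EXAMPLE = r"""
<ruleset name="Twitter" maintainer="jsha at eff.org (Jacob Hoffman-Andrews)">
  <rule from="^http://(?:www\.)?t\.co/" to="https://t.co/">
    <test href="http://t.co/kU0aUmcm4u" />
    <test href="http://www.t.co/kU0aUmcm4u" />
  </rule>
</ruleset>
"""

if __name__ == "__main__":
    for problem in lint_ruleset(EXAMPLE):
        print(problem)
```

The check for '/' plus a non-root page on bare domain rewrites is left out here, since it depends on how we decide to classify a rule as a bare domain rewrite.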
To test these URLs, we would first fetch all of them with curl. Any
URLs that return 4xx or 5xx would be marked as ignored for that run.
A special maintainer mode would flag for replacement all URLs that
return 4xx or 5xx.
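The pre-filtering step might look something like the following Python sketch. The function names and the maintainer_mode flag are hypothetical. It shells out to curl with -s, -o, -w and --max-time; whether to follow redirects and how to treat connection failures are not settled by this proposal, so the sketch does neither specially.

```python
import subprocess


def fetch_status(url, timeout=30):
    """Return the HTTP status code curl reports for url (0 if no response)."""
    result = subprocess.run(
        ["curl", "-s", "-o", "/dev/null", "-w", "%{http_code}",
         "--max-time", str(timeout), url],
        capture_output=True,
        text=True,
    )
    out = result.stdout.strip()
    # curl prints 000 when it could not get an HTTP response at all.
    return int(out) if out.isdigit() else 0


def partition_urls(urls, get_status=fetch_status, maintainer_mode=False):
    """Split urls into (usable, ignored), where ignored holds 4xx/5xx results."""
    usable, ignored = [], []
    for url in urls:
        status = get_status(url)
        if 400 <= status <= 599:
            ignored.append((url, status))
        else:
            # Assumption: anything else, including status 0, passes through,
            # because the proposal only mentions 4xx and 5xx.
            usable.append(url)

    if maintainer_mode:
        # In maintainer mode, flag the failing URLs for replacement.
        for url, status in ignored:
            print(f"REPLACE: {url} returned {status}")

    return usable, ignored
```

Passing get_status as a parameter keeps the filtering logic testable without network access.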
After filtering out failing URLs, we'd load up a headless Firefox
instance with the extension (see starting Travis config at
https://github.com/EFForg/https-everywhere/pull/421), and load each
test URL in turn. We would validate that the rewrite rule actually
gets triggered, that the page context gets a 200 response, and that
no more than 3 subresources fail, where a failure is any of: blocked
as mixed content, 4xx, or 5xx.
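To make the pass/fail criteria concrete, here is a small Python sketch of the verdict for a single page load. The PageLoad record and its field names are hypothetical; how the headless Firefox harness would actually report these values is not specified here.

```python
from dataclasses import dataclass, field

MAX_BAD_SUBRESOURCES = 3  # threshold from the proposal (hypothetical name)


@dataclass
class PageLoad:
    """What the harness is assumed to report for one test URL."""
    rule_triggered: bool
    page_status: int
    subresource_statuses: list = field(default_factory=list)
    mixed_content_blocked: int = 0


def validate_load(load):
    """Return a list of failure reasons; an empty list means the load passed."""
    failures = []

    if not load.rule_triggered:
        failures.append("rewrite rule was not triggered")

    if load.page_status != 200:
        failures.append(f"page returned {load.page_status}, expected 200")

    # Count subresources that were blocked as mixed content or got 4xx/5xx.
    bad = load.mixed_content_blocked + sum(
        1 for status in load.subresource_statuses if 400 <= status <= 599
    )
    if bad > MAX_BAD_SUBRESOURCES:
        failures.append(
            f"{bad} subresources failed, limit is {MAX_BAD_SUBRESOURCES}"
        )

    return failures
```

With this reading, exactly 3 failing subresources still passes and a 4th causes the load to fail.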
We'd run ruleset validation for all URLs daily or weekly, and any
changes to a given ruleset would automatically trigger validation
for that ruleset.
When a ruleset fails during the daily/weekly check, we'd disable it
- either manually or automatically - until someone has time to fix it.
What do you think?
Thanks,
Jacob