[HTTPS-Everywhere] Maintainability changes to HTTPS Everywhere

Mon Jan 26 14:41:31 PST 2015

Hi all,

HTTPS Everywhere is growing rapidly. We have more rulesets than ever
(14k+), and more volunteers than ever (thank you)! The downside of our
growth is that it's increasingly hard to provide a high-quality product.
I'm planning, with your help, to add two new processes to improve our
collective maintenance capability: better automated testing and a
clearer branching strategy.

Better automated testing

Right now we have 3.2k rulesets in the stable branch and 14k in master.
It's time to promote the master branch to stable. Among other things,
master has e10s support, while stable does not. However, there are a
large number of rules that completely break their target sites. See for
instance:

https://github.com/EFForg/https-everywhere/issues/529
https://github.com/EFForg/https-everywhere/issues/849
https://github.com/EFForg/https-everywhere/pull/931

Promoting the master branch to stable as it is would break many websites
for many users, which is not okay. We have a set of ruleset tests from 2013
(https://github.com/EFForg/https-everywhere/blob/master/src/chrome/content/ruleset-tests.js),
but they are designed only to detect mixed content. By default they also
run against all 14k rulesets, which means loading 24k URLs, and they use
a very simple heuristic to decide which URLs to load.

We need to automate and improve the ruleset tests. We need to add a
test-url syntax to our ruleset files so we can specify which URLs to
load. Newly added rulesets must include test URLs, and we need to
retroactively add test URLs to existing rulesets (at first with an
automated process, then later with manual maintenance). And the ruleset
tests need to produce output that is easily used to disable failing
rulesets. I've broken this down into tasks here:
https://github.com/EFForg/https-everywhere/labels/Ruleset%20Testing
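
To make the test-url idea concrete, here is a minimal sketch in Python. It
assumes a hypothetical <test url="..."/> element inside each ruleset file;
the actual syntax is not decided in this message, and the function names
are illustrative only. The sketch does not load any pages. It only checks
that a ruleset declares test URLs and that each one is rewritten by some
rule (assuming the first matching rule wins), which is the kind of
machine-readable output a later step could use to flag or disable rulesets.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical ruleset. The <test url="..."/> element is an assumed syntax,
# not one defined in this message.
EXAMPLE_RULESET = r"""
<ruleset name="Example">
  <target host="example.com" />
  <target host="www.example.com" />
  <test url="http://example.com/" />
  <test url="http://www.example.com/login" />
  <rule from="^http://(www\.)?example\.com/" to="https://$1example.com/" />
</ruleset>
"""


def to_python_replacement(to_attr):
    # Rulesets use JavaScript-style "$1" group references; Python's re
    # module expects "\g<1>" in replacement templates.
    return re.sub(r"\$(\d)", r"\\g<\1>", to_attr)


def check_ruleset(xml_text):
    """Report which declared test URLs are rewritten by the ruleset's rules."""
    root = ET.fromstring(xml_text.strip())
    rules = [
        (re.compile(rule.get("from")), to_python_replacement(rule.get("to")))
        for rule in root.findall("rule")
    ]
    test_urls = [test.get("url") for test in root.findall("test")]

    rewritten, unmatched = {}, []
    for url in test_urls:
        for pattern, replacement in rules:
            new_url, count = pattern.subn(replacement, url, count=1)
            if count:
                rewritten[url] = new_url  # first matching rule wins
                break
        else:
            unmatched.append(url)  # no rule applied to this test URL

    return {
        "name": root.get("name"),
        "has_tests": bool(test_urls),
        "rewritten": rewritten,
        "unmatched": unmatched,
    }


if __name__ == "__main__":
    print(check_ruleset(EXAMPLE_RULESET))
```

A real test run would additionally fetch each rewritten URL and compare
the result against the plain HTTP page; that part is left out here.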

Clearer branching strategy

Right now it's not clear when you should merge to 4.0 (stable) vs when
you should merge to master. Often we merge ruleset fixes to master, then
later decide that they are important enough to cherry-pick into 4.0.
This cherry-picking makes it very challenging to merge 4.0 into master:
each cherry-pick creates a new commit with a different hash, so git
cannot recognize that the change is already merged.

I propose a new branching strategy:

- Code: All bug fixes must be merged to 4.0 first.
- Code: All new features or refactorings must be merged to master.
- Rulesets: Any change to a ruleset that exists in 4.0 (stable) must be
  merged to 4.0 first.
- Rulesets: Any new ruleset, or change to a ruleset that does not yet
  exist in 4.0, must be merged to master, and will not be cherry-picked
  into 4.0, barring exceptional circumstances.
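
The four rules above can be written down as a small decision function.
This is only a sketch to make the rules unambiguous; the enum and function
names are made up for illustration, and the "exceptional circumstances"
escape hatch is deliberately not modeled.

```python
from enum import Enum

STABLE = "4.0"
MASTER = "master"


class Change(Enum):
    CODE_BUG_FIX = "code bug fix"
    CODE_FEATURE_OR_REFACTOR = "code feature or refactoring"
    RULESET = "ruleset change"


def target_branch(change, ruleset_in_stable=False):
    """Return the branch a change should be merged to first.

    ruleset_in_stable only matters for ruleset changes: it says whether
    the ruleset being touched already exists in the 4.0 branch.
    """
    if change is Change.CODE_BUG_FIX:
        return STABLE
    if change is Change.CODE_FEATURE_OR_REFACTOR:
        return MASTER
    if change is Change.RULESET:
        # Fixes to rulesets that shipped in stable go to stable first;
        # new rulesets and master-only rulesets stay on master.
        return STABLE if ruleset_in_stable else MASTER
    raise ValueError(f"unknown change type: {change!r}")


if __name__ == "__main__":
    print(target_branch(Change.RULESET, ruleset_in_stable=True))
```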

Since GitHub automatically opens new pull requests against master, it
will be easy to make mistakes. I propose to write a webhook, similar to
Travis CI, that will check new pull requests to see if they are made
against the right branch, and add an indicator if they are not
(https://github.com/EFForg/https-everywhere/issues/982). It will then be
the responsibility of the requestor to change the target branch, since
repo owners can't change that.
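
As a rough sketch of the check such a webhook could perform, the Python
below looks only at ruleset files, since whether a code change is a bug
fix or a feature probably cannot be decided automatically. It assumes the
hook has already obtained the pull request's base branch (the base.ref
field of GitHub's pull_request payload) and its list of changed files
(entries with a "filename" key, as returned by GitHub's pull request
files API). The rules directory path, the function names, and the message
wording are assumptions for illustration.

```python
STABLE = "4.0"
MASTER = "master"
RULES_DIR = "src/chrome/content/rules/"  # assumed location of ruleset files


def expected_base(changed_files, stable_rulesets):
    """Return the expected base branch, "split", or None if undecidable.

    changed_files: list of dicts with a "filename" key.
    stable_rulesets: set of ruleset paths that exist in the 4.0 branch.
    """
    touched = [
        entry["filename"]
        for entry in changed_files
        if entry["filename"].startswith(RULES_DIR)
    ]
    if not touched:
        return None  # code-only change: needs a human decision
    in_stable = [path for path in touched if path in stable_rulesets]
    if in_stable and len(in_stable) < len(touched):
        return "split"  # mixes stable rulesets with master-only ones
    return STABLE if in_stable else MASTER


def check_pull_request(base_ref, changed_files, stable_rulesets):
    """Return a warning string for a mis-targeted pull request, else None."""
    expected = expected_base(changed_files, stable_rulesets)
    if expected is None or expected == base_ref:
        return None
    if expected == "split":
        return ("This pull request touches both rulesets that exist in "
                "4.0 and rulesets that do not; please split it.")
    return (f"This pull request targets '{base_ref}' but should target "
            f"'{expected}'; please reopen it against that branch.")


if __name__ == "__main__":
    stable = {RULES_DIR + "Example.xml"}
    files = [{"filename": RULES_DIR + "Example.xml"}]
    print(check_pull_request("master", files, stable))
```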

All of these changes are a significant amount of work. If you would like
to pitch in and help bring us closer to a 5.0-stable release, please
comment on the specific issue you'd like to take.

Thanks again for all your help. HTTPS Everywhere has a great community.

Jacob