[HTTPS-Everywhere] Draft specification for file used to check for ruleset updates

Seth David Schoen schoen at eff.org
Tue May 20 17:01:03 PDT 2014


Red writes:

> I've written my draft of the (hopefully not overly) simple specification
> for the contents of the update file and briefly explained how it will be
> used.  I posted the draft as a Gist on Github for anyone to read and
> comment on publicly: https://gist.github.com/redwire/2e1d8377ea58e43edb40
> Let me know if there is anything any of you feel should be better
> explained or modified.

Hi Zack,

Thanks for your work on this!  I'm not sure how much you've discussed this
issue with Yan, but I wanted to raise a slightly higher-level topic and make
sure it's on the radar.

Clearly HTTPS Everywhere is very highly trusted software in the context of
a user's browser, because it has the ability to do arbitrary rewrites.
Someone who can create malicious rules and get them applied in a user's
browser can do quite harmful things to the user.  The most familiar
case is sending a user to a phishing site in order to steal their username
and password (maybe the phishing site will have a name that looks very
similar to the intended site -- for example, the same domain name but in
a different TLD, or the domain name with "-login" appended, or something).
Another case is what the HTTPS Everywhere build scripts call "downgrade
rules", where a user is sent from a secure HTTPS site to an insecure HTTP
site through the action of a rewrite rule.

A more subtle case might be rewriting the origin of a copy of jquery.js
that a site tries to embed, so that it instead gets loaded from a
malicious domain that redefines jquery code to have malicious side effects
(like leaking user data).

I and others have put some sanity checks into the build scripts to try to
warn about some of these possibilities, but as the number of rules has
exploded, there are now a large number of harmless warnings generated
whenever the extension is built, and it's not very likely that every
warning is investigated, since historically they've all been false alarms.
We also rely a fair amount on manual review of contributed rules.
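
As a rough illustration of the kind of sanity check described above (this
is a sketch, not the code in the actual build scripts), the Python below
parses one ruleset and warns about two of the cases mentioned: a rule whose
destination is plain HTTP, and a rule whose destination host is not one of
the ruleset's declared targets.  The function name check_ruleset and the
warning strings are made up for this example, and the ruleset shape
(target elements with a host attribute, rule elements with from and to
attributes) is assumed.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit


def check_ruleset(xml_text):
    """Return warning strings for one ruleset given as an XML string.

    Hypothetical helper for illustration; not the real build-script check.
    """
    root = ET.fromstring(xml_text)
    targets = {t.get("host") for t in root.iter("target")}
    warnings = []
    for rule in root.iter("rule"):
        to = rule.get("to", "")
        # A rule that rewrites to plain HTTP is a "downgrade rule".
        if to.startswith("http:"):
            warnings.append("downgrade: " + to)
        # A destination host outside the declared targets might be a
        # legitimate move or a phishing redirect, so flag it for review.
        # Destinations containing backreferences ($1 etc.) are skipped.
        host = urlsplit(to).hostname
        if host and "$" not in host and host not in targets:
            warnings.append("off-target host: " + host)
    return warnings


if __name__ == "__main__":
    sample = (
        '<ruleset name="Example">'
        '<target host="example.com"/>'
        '<rule from="^http://example\\.com/" to="https://example-login.net/"/>'
        '</ruleset>'
    )
    for warning in check_ruleset(sample):
        print(warning)
```

A check this simple will produce false alarms (for instance, for sites that
legitimately serve content from a different host), which is the warning
fatigue problem described above.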

The update signing is particularly sensitive because, if it fails, anyone
who controls the network that any HTTPS Everywhere user is on, and who
can momentarily convince that user's browser to trust a particular HTTPS
origin as the update server, can install new malicious rules in the user's
browser and permanently undermine the security of visits to other sites.
(Maybe we shouldn't speak lightly of the need to "convince [the] browser
to trust a particular HTTPS origin" -- which is fairly challenging,
since it probably requires subverting a certificate authority or finding
and exploiting a bug in the browser certificate verification!)

One thing we might want as an extra cryptographic precaution is pinning
the certificate of the HTTPS server that the ruleset updates are supposed
to come from.  (Notably, right now we don't have that for _extension_
updates, so if we did, upgrading the whole extension would probably
become a weaker link than just upgrading the ruleset library.)  That is,
instead of using the browser's own HTTPS validation based on chaining back
to a set of trusted root CAs, we could have a policy that says that only
a particular entity (key, set of keys, CA, or set of CAs) is acceptable.
I think I have heard someone say that we don't actually have a clean way
to do this in browser extensions... but maybe that's gotten better lately?
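
To make the pinning policy concrete, here is a minimal Python sketch of the
comparison involved.  It is not browser-extension code and says nothing
about which extension APIs could support this.  The pin set, the
placeholder fingerprint, and the function names are all assumptions for
illustration.  Note that the fetch helper still performs ordinary CA
validation and the pin check would be applied on top of it, whereas the
policy described above could replace CA validation entirely.

```python
import hashlib
import socket
import ssl

# Hypothetical pin set: hex SHA-256 fingerprints of acceptable certificates.
PINNED_SHA256 = {
    "0" * 64,  # placeholder, not a real fingerprint
}


def fingerprint(der_cert):
    """Hex SHA-256 digest of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()


def is_pinned(der_cert, pins=PINNED_SHA256):
    """True if the certificate's fingerprint is in the pin set."""
    return fingerprint(der_cert) in pins


def fetch_leaf_certificate(host, port=443):
    """Return the server's leaf certificate as DER bytes (needs network)."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert(binary_form=True)
```

An update client would refuse the ruleset download unless
is_pinned(fetch_leaf_certificate(host)) is true.  Pinning a whole
certificate breaks when the certificate is renewed, which is one reason
pinning a key or a CA, as mentioned above, may be preferable.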

A further concern for me -- which, again, is also not yet mitigated for
the extension update process itself -- is coercion attacks against the
developers of HTTPS Everywhere (that is, us).  Since we have the ability
to release updates that contain bad rules that break user security,
someone might try to force us to use that ability to attack a particular
user or users.  Concretely, I think defending against this boils down to
having a mechanism to let users know if they got the same update (whether
extension update or ruleset update) that everyone else in the world got,
or at least whether everyone was _offered_ the same set of updates.
There are many ideas about this and I think it's quite an interesting
problem for anyone publishing software to a large user base over the
Internet.

Since Yan previously put a framework in place for reproducible builds of
the distributed package files, and since the source code is in git and
releases are made using signed tags, we definitely have a convenient
situation for being able to _talk about_ the contents of releases (for
example, by referencing a git commit ID that was used to create them?).  I
might suggest adding the commit ID as a field to your update manifest,
although I don't think that that alone will fully solve the problem I've
described (since a malicious release could refer to a commit ID that simply
doesn't exist, or refer to the commit ID of a different genuine release,
on the theory that most users won't check).

It's also interesting to think about what data the signature should be
taken over.  Someone might think that it should just be a signature of
the hash field, but that would be bad, because it would allow a replay
attack where someone uses an old hash and signature, updates the version
and date, and then claims that the old version of the rulesets is actually
newer than the current version, causing victims to downgrade.  Probably
the signature should be taken over _all_ of the fields in the manifest,
which even makes me tempted to propose a structure something like

{
  update: {
    name     : <name of the ruleset release>,
    changes  : <a short description of recent changes>,
    version  : <a descriptive version scheme/number>,
    date     : <the date the new db was released>,
    hash     : <the hash of the db file>,
    source   : <the URL serving the updated ruleset db>
  },
  signature: <the signature that must be verified>
}

Then the signature could, for example, be taken over a serialization of
the "update" JSON object.
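
A minimal Python sketch of that idea follows, showing why covering every
field defeats the replay described above.  Everything here is an
assumption for illustration: the field values are invented, the
serialization (sorted keys, no whitespace) is just one deterministic
choice, and HMAC-SHA256 with a demo key is only a standard-library
stand-in for a real public-key signature, which is what an actual update
mechanism would need.

```python
import hashlib
import hmac
import json

# Stand-in only: a real deployment would use a public-key signature so that
# clients hold no signing secret.  HMAC keeps this sketch stdlib-only.
DEMO_KEY = b"hypothetical demo key"


def canonical(update):
    """Serialize the update object deterministically (sorted keys, no spaces)."""
    return json.dumps(update, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(data):
    """Stand-in signature over raw bytes."""
    return hmac.new(DEMO_KEY, data, hashlib.sha256).hexdigest()


def verify_manifest(manifest):
    """True only if the signature covers every field of manifest["update"]."""
    expected = sign(canonical(manifest["update"]))
    return hmac.compare_digest(expected, manifest["signature"])


# Invented example values.
old = {"name": "rulesets", "changes": "initial", "version": "1",
       "date": "2014-01-01", "hash": "aaaa", "source": "https://example.org/db"}
manifest = {"update": old, "signature": sign(canonical(old))}

# Replay attempt: keep the old hash and signature, bump version and date.
replayed = {"update": dict(old, version="99", date="2014-12-31"),
            "signature": manifest["signature"]}

if __name__ == "__main__":
    print(verify_manifest(manifest))
    print(verify_manifest(replayed))
```

Here verify_manifest accepts the genuine manifest and rejects the replayed
one, because the changed version and date alter the signed bytes.  A scheme
that signed only the hash field would accept both, since the hash is
unchanged.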

I've heard that there was a recently-designed update framework for
software packages (originally for one language, maybe Ruby or Node?)
which was being generalized for use by other projects.  Does anyone
remember what that framework is called?  It might be good to look at
what they do for their manifests and whether they've received some expert
cryptographic advice in their design process.

-- 
Seth Schoen  <schoen at eff.org>
Senior Staff Technologist                       https://www.eff.org/
Electronic Frontier Foundation                  https://www.eff.org/join
815 Eddy Street, San Francisco, CA  94109       +1 415 436 9333 x107

