[HTTPS-Everywhere] Automatic testing of rules to discover rules that broke (e.g. by site redesign)

Colonel Graff graffatcolmingov at gmail.com
Thu May 17 15:26:58 PDT 2012


On Thu, May 17, 2012 at 5:34 PM, Peter Eckersley <pde at eff.org> wrote:
> By the way, Seth managed to use this script to author a bugfix commit like this:
>
> https://gitweb.torproject.org/https-everywhere.git/commitdiff/31e73569f7ef0670490b7111cf069a8fae758d5d
>
Awesome, I had picked off a few as well. I'm glad it's somewhat
useful. Out of curiosity, where does the platform tag come from? Or
do I just need to look at the FAQs more often?

> But there are still hundreds of other rulesets that your scripts generate
> warnings about that need to be examined.  It will be quite a lot of work to do
> this automatically (for instance, your code will generate warnings about
> "transvalid" certs that are usually valid in real-world Firefoxes because of
> cached intermediate CAs, but the Python requests library doesn't know about
> this caching.  We could hook things up to the SSL Observatory to try to
> deal with this, but it's quite a bit of work:
> https://git.eff.org/?p=observatory.git;a=blob;f=transvalid.py;h=b0a5ce6ab89e35c30ea74384acbe8c45781b8695;hb=HEAD )
>
Well I finished my last final last night so I have a lot more time on
my hands. I'll look more into this and see if there's a good way of
checking it without having to hook into the SSL Observatory.
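
One rough way to separate "possibly transvalid" failures from other
certificate errors without the SSL Observatory is to look at the OpenSSL
verify code. The sketch below is only an illustration, not the script
discussed in this thread: it uses Python 3's standard ssl module rather
than requests, the labels and function names are made up, and codes 20
and 21 can also mean a genuinely unknown CA, so a hit is a hint to
re-check the site in a real browser, not proof.

```python
import socket
import ssl

# OpenSSL verify codes that often (but not always) mean the server did not
# send an intermediate certificate:
#   20: unable to get local issuer certificate
#   21: unable to verify the first certificate
INCOMPLETE_CHAIN_CODES = {20, 21}


def classify_verify_code(code):
    """Map an OpenSSL verify code to a coarse, hypothetical label."""
    if code == 0:
        return "valid"
    if code in INCOMPLETE_CHAIN_CODES:
        return "possibly-transvalid"
    return "invalid"


def check_host(host, port=443, timeout=10):
    """Attempt a verified TLS handshake with host and classify the outcome."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return "valid"
    except ssl.SSLCertVerificationError as err:
        return classify_verify_code(err.verify_code)
```

Rulesets whose hosts come back as "possibly-transvalid" could be put in
a separate bucket for manual review instead of being reported as broken.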

> Another option would be to start running these kinds of scripts on a nightly
> basis, and checking the output into git or otherwise publishing it, so that
> ruleset authors can see when there's possible breakage to be investigated.
>

Well, my concern would be that these websites might notice that every
night they're getting the same requests from the same IP addresses and
get suspicious. We can obviously send a User-Agent header with each
request to identify the script, but even then, suspicious system
administrators might IP-block it.
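
For illustration, identifying the script could look roughly like the
sketch below. It uses Python 3's standard urllib; the User-Agent string
and the contact URL are placeholders, not values anyone in this thread
agreed on.

```python
import urllib.request

# Hypothetical identifier and contact URL; replace with real values.
USER_AGENT = "https-everywhere-rule-checker/0.1 (+https://example.org/rule-checker)"


def build_request(url):
    """Build a GET request that identifies the checker via User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch_status(url, timeout=10):
    """Send the identified request and return the HTTP status code."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.status
```

A contact URL in the header gives an administrator somewhere to ask
questions before reaching for a block.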

On the other hand, with git hooks, we could have a post-receive hook
that runs every time you push to the git repository on the torproject
website. Would that be better, perhaps? It might run every day, but
not on days when there are no changes.

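
A post-receive hook along those lines might look like the sketch below.
Git feeds the hook one line per updated ref on stdin, in the form
"<old> <new> <ref>". The branch name and the checker command are
assumptions made for the example.

```python
#!/usr/bin/env python3
import subprocess
import sys

WATCHED_REF = "refs/heads/master"              # assumed branch name
CHECK_COMMAND = ["python3", "check_rules.py"]  # hypothetical checker script


def pushed_refs(lines):
    """Parse post-receive stdin lines of the form '<old> <new> <ref>'."""
    refs = []
    for line in lines:
        parts = line.split()
        if len(parts) == 3:
            refs.append(parts[2])
    return refs


def should_run(lines):
    """Return True if the watched branch is among the updated refs."""
    return WATCHED_REF in pushed_refs(lines)


def main():
    # Run the checks only when the watched branch was actually updated.
    if should_run(sys.stdin):
        subprocess.run(CHECK_COMMAND, check=False)
```

Installed as hooks/post-receive with a call to main() at the bottom of
the file, this would run the checks only on pushes that touch the
watched branch, so quiet days generate no traffic at all.
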
> On Wed, May 09, 2012 at 09:13:56PM -0400, Colonel Graff wrote:
>> Hey Ondrej,
>>
>> Peter and Scott have both been made aware of this, but these are the
>> first steps towards what you suggested. Already with them I've found a
>> few broken websites that I've fixed. Don't run them on your machine
>> though. They're highly imperfect. Feel free to add/improve them
>> though.
>> https://gitorious.org/https-everywhere-fork/https-everywhere-fork/blobs/graff/single_rule_response.py
>> https://gitorious.org/https-everywhere-fork/https-everywhere-fork/blobs/graff/trivial-response.py
>>
>> For now these will just test the destination of the redirects that
>> don't have backreferences. No breadth first search yet. That wouldn't
>> be trivial. I might go ahead and write that later, but not right now.
>> As a non-CS person (I'm far more of a hobbyist) I haven't had the joy
>> of dealing with databases yet. If you'd like to add that to this or
>> send me a good tutorial for the future, I'd appreciate it.
>>
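
As an illustration of the "no backreferences" restriction mentioned
above (this is not the code from the linked scripts), the sketch below
picks out the rules in a ruleset whose "to" attribute contains no $1,
$2, ... and can therefore be fetched as-is. The sample ruleset and the
function name are made up for the example.

```python
import re
import xml.etree.ElementTree as ET

# Matches $1, $2, ... in a rule's "to" attribute.
BACKREFERENCE = re.compile(r"\$\d")


def static_destinations(ruleset_xml):
    """Return the 'to' URLs of rules whose destination has no backreference."""
    root = ET.fromstring(ruleset_xml)
    return [
        rule.get("to")
        for rule in root.iter("rule")
        if rule.get("to") and not BACKREFERENCE.search(rule.get("to"))
    ]


# Made-up ruleset in the HTTPS Everywhere XML format.
SAMPLE = """
<ruleset name="Example">
  <target host="example.com" />
  <rule from="^http://example\\.com/" to="https://example.com/" />
  <rule from="^http://(www|m)\\.example\\.com/" to="https://$1.example.com/" />
</ruleset>
"""

destinations = static_destinations(SAMPLE)
```

Only the first rule's destination ends up in the list. The second one
needs a concrete input URL before its destination is known, which is
where something like a breadth-first crawl would come in.
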
>> On Tue, May 1, 2012 at 10:03 AM, Colonel Graff
>> <graffatcolmingov at gmail.com> wrote:
>> > On Sun, Apr 29, 2012 at 12:15 AM, Peter Eckersley <pde at eff.org> wrote:
>> >>
>> >> On Sat, Apr 28, 2012 at 06:50:01PM +0200, Ondrej Mikle wrote:
>> >> > Hi,
>> >> >
>> >> > seeing how many times "broken rule" topics appear in
>> >> > https-everywhere-rules,
>> >> > have you considered some automatic tests? (Re-reading the text of this
>> >> > mail
>> >> > again, I might be attempting to solve a problem you may not really
>> >> > have.)
>> >>
>> >> We would /love/ to have someone implement a proper test suite of this
>> >> sort!
>> >> We even offered it as a Google Summer of Code project this year, though we
>> >> didn't get any takers :(
>> >>
>> >> --
>> >> Peter Eckersley                            pde at eff.org
>> >> Technology Projects Director      Tel  +1 415 436 9333 x131
>> >> Electronic Frontier Foundation    Fax  +1 415 436 9993
>> >>
>> >> _______________________________________________
>> >> HTTPS-everywhere mailing list
>> >> HTTPS-everywhere at mail1.eff.org
>> >> https://mail1.eff.org/mailman/listinfo/https-everywhere
>> >
>> >
>> > I'm not entirely certain of the best way to do this, but if anyone's looking
>> > at doing it in Python (since we already use python for rule validation)
>> > check out Kenneth Reitz's excellent requests library. It handles https (if I
>> > remember correctly) and makes URL handling a lot easier.
>>
>



