[PrivacyBadger] tracking CGI args

Greg Lindahl lindahl at pbm.com
Sat Apr 2 14:50:36 PDT 2016


1. IA makes a list of potential args for, say, the top million Alexa
sites.

2. Someone writes a little QA gizmo that tries to double-check the
list from (1). (We haven't done this sort of evaluation yet.)

Given this reliable list of candidate args, either:

3a. Privacy Badger evaluates the arguments the same way it evaluates
cookies, e.g. ?utm_campaign=email doesn't look like tracking, but
?uuid=3463476237676 does; or

3b. Privacy Badger drops them all.
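
A rough sketch of how steps 1 and 3 might fit together, in Python.
Every name here (candidate_args, looks_like_identifier, strip_args) and
the length and digit thresholds are made up for illustration. The value
check is a crude stand-in, not Privacy Badger's actual cookie heuristic,
and the crawl input is assumed to be (url, content_hash) pairs rather
than any real IA data format.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def candidate_args(crawl):
    """Step 1 (assumed input: iterable of (url, content_hash) pairs).

    An arg becomes a candidate when URLs that differ only in that arg's
    value all delivered pages with the same hash.
    """
    # (arg name, URL with that arg removed) -> set of (value, hash) seen
    groups = defaultdict(set)
    for url, digest in crawl:
        parts = urlsplit(url)
        pairs = parse_qsl(parts.query, keep_blank_values=True)
        for i, (name, value) in enumerate(pairs):
            rest = urlencode(pairs[:i] + pairs[i + 1:])
            base = urlunsplit(parts._replace(query=rest))
            groups[(name, base)].add((value, digest))
    found = set()
    for (name, _base), seen in groups.items():
        values = {v for v, _ in seen}
        hashes = {h for _, h in seen}
        if len(values) > 1 and len(hashes) == 1:
            found.add(name)
    return found


def looks_like_identifier(value, min_len=8, min_digit_fraction=0.5):
    """Step 3a stand-in: long, mostly-numeric values look unique."""
    if len(value) < min_len:
        return False
    digits = sum(ch.isdigit() for ch in value)
    return digits / len(value) >= min_digit_fraction


def strip_args(url, candidates, drop_all=False):
    """Drop candidate args: all of them (3b) or only suspicious ones (3a)."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not (name in candidates and (drop_all or looks_like_identifier(value)))
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

With drop_all=False this behaves like 3a: a candidate arg is removed
only when its value looks like an identifier. With drop_all=True it
behaves like 3b. Args that are not on the candidate list are always
left alone.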

I'll open a GitHub issue with this suggestion; I hope that'll get
more commentary. Thanks!

-- greg

On Fri, Apr 01, 2016 at 05:51:08PM -0700, Cooper Quintin wrote:
> Hi Greg,
> I think that this is super interesting and I definitely want to root out
> this type of tracking. I think maybe the best place to raise this idea
> would be on GitHub. I would love to hear more, though, about how you
> think we could detect this sort of behavior.
> 
> - Cooper
> 
> On 03/22/2016 12:36 PM, Greg Lindahl wrote:
> > For a long while I've been annoyed by tracking CGI args, like the
> > Urchin/Google Analytics utm_* args. Like cookies, sometimes they have
> > long, unique-looking values, other times they have short values.
> > 
> > There are a few browser plugins in this area:
> > 
> > https://addons.mozilla.org/en-US/firefox/addon/au-revoir-utm/?src=cb-dl-toprated
> > https://chrome.google.com/webstore/detail/tracking-token-stripper/kcpnkledgcbobhkgimpbmejgockkplob?hl=en
> > 
> > but they are driven by static lists (like utm_*), which is not as
> > flexible and comprehensive as the Privacy Badger approach.
> > 
> > I recently started working at the Internet Archive, where our crawler
> > and Wayback Machine playback are both needlessly confused by these CGI
> > args, in both the long, unique form and the short form. Our crawler
> > has the ability to discover some of these tracking args by noticing
> > that multiple URLs, differing only in their CGI args, deliver pages
> > with the same hash.
> > 
> > If you're interested at all, I could imagine several ways you might
> > want to proceed. I'm happy to provide data from IA's crawler, and I'd
> > also like to consume any data that you folks generate.
> > 
> > -- greg
> > 
> > _______________________________________________
> > PrivacyBadger mailing list
> > PrivacyBadger at eff.org
> > https://lists.eff.org/mailman/listinfo/privacybadger
> > 

