[PrivacyBadger] Making Privacy Badger break fewer sites

Cooper Quintin cooperq at eff.org
Thu Dec 8 07:30:20 PST 2016


FWIW I am also unable to reproduce the blocking of jsdelivr.net.

I agree that the debugging info we need is something like which fqdn was
detected tracking on which fqdn and why. But that is a lot of data to
store just for debugging purposes, and the data is pretty personal. So
I am reluctant to store it by default.
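
To make the trade-off concrete, here is a rough sketch of what an opt-in
version of that debugging record could look like. Nothing here is
existing Privacy Badger code: makeDebugLog, the field names, and the
size cap are all hypothetical, chosen only to illustrate "off by
default, bounded when on".

```javascript
// Hypothetical sketch, not Privacy Badger's actual code.
// Keeps "which fqdn was seen tracking on which fqdn, and why",
// but only when explicitly enabled, and only the most recent entries.
function makeDebugLog(enabled, maxEntries) {
  const entries = [];
  return {
    record(trackerFqdn, siteFqdn, reason) {
      if (!enabled) {
        return; // nothing is stored unless the user opted in
      }
      entries.push({ trackerFqdn, siteFqdn, reason });
      if (entries.length > maxEntries) {
        entries.shift(); // drop the oldest entry to bound what is kept
      }
    },
    dump() {
      return entries.slice(); // copy, so callers cannot mutate the log
    },
  };
}

// Usage: disabled by default, so this call stores nothing.
const log = makeDebugLog(false, 100);
log.record('cdn.example.net', 'site.example.org', 'cookie');
console.log(log.dump().length); // 0
```

The example hostnames and the 'cookie' reason string are placeholders.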

For the Algolia issue they are actually setting a cookie and having a
problem with the DNT policy, so I think that's a separate issue.

As for amazonaws I think the problem there is that
getBaseDomain('css.patheos.com.s3.amazonaws.com')
returns 'com.s3.amazonaws.com'
when it should probably return one more label, i.e.
patheos.com.s3.amazonaws.com

This is because com.s3.amazonaws.com is not in the public suffix list
(PSL). The result is that any *.com.s3.amazonaws.com domain which is
tracking will end up blocking all of those domains. We should file an
issue against the PSL for this.
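
Roughly, the effect looks like the sketch below. This is a simplified
stand-in, not the real getBaseDomain: baseDomain and the two toy suffix
sets are made up for illustration, and I am assuming s3.amazonaws.com is
treated as a public suffix, which is what the returned value suggests.
It also ignores the PSL's wildcard and exception rules.

```javascript
// Simplified, hypothetical stand-in for a PSL-based base domain lookup.
// Base domain = longest matching public suffix plus one label to its left.
function baseDomain(fqdn, suffixes) {
  const labels = fqdn.toLowerCase().split('.');
  // Walk from the longest candidate to the shortest, so the first hit
  // is the longest listed suffix.
  for (let i = 0; i < labels.length; i++) {
    const candidate = labels.slice(i).join('.');
    if (suffixes.has(candidate)) {
      return i === 0 ? candidate : labels.slice(i - 1).join('.');
    }
  }
  // No listed suffix matched: fall back to the last two labels.
  return labels.slice(-2).join('.');
}

// Toy suffix lists (assumptions, not the real PSL contents).
const today = new Set(['com', 's3.amazonaws.com']);
const proposed = new Set(['com', 's3.amazonaws.com', 'com.s3.amazonaws.com']);

const host = 'css.patheos.com.s3.amazonaws.com';
console.log(baseDomain(host, today));    // com.s3.amazonaws.com
console.log(baseDomain(host, proposed)); // patheos.com.s3.amazonaws.com
```

With the first list, an unrelated bucket such as
tracker.example.com.s3.amazonaws.com (a made-up name) maps to the same
base domain as the patheos one, so a tracking decision about one would
spill over onto the other.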

In conclusion I suspect there are a number of different reasons why CDNs
break. I suspect some of them set cookies, some are misconfigured, and
some are breaking because of incorrect base domains.

On 12/02/2016 11:04 AM, Alexei Miagkov wrote:
> Another CDN getting incorrectly (?) blocked, possibly the same root
> problem: https://github.com/EFForg/privacybadgerchrome/issues/1015
> 
> On Mon, Nov 21, 2016 at 12:25 PM, Alexei Miagkov <amiagkov at gmail.com>
> wrote:
> 
>     Let's take this just filed CDN issue for
>     example: https://github.com/EFForg/privacybadgerchrome/issues/998.
>     How did PB decide to block cdn.jsdelivr.net?
>     Perhaps a "why did PB decide to block this" feature would indeed be
>     useful, for debugging by devs/advanced users. I visited a few sites
>     using this CDN (list below), but I don't see PB getting triggered by it.
> 
>     http://fontawesome.io/
>     http://www.urbandictionary.com/
>     https://www.ryanair.com/gb/en/
>     http://www.kicker.de/
>     https://woocommerce.com/
>     https://www.britannica.com/
>     http://www.ehowenespanol.com/
> 
>     On Mon, Nov 21, 2016 at 10:49 AM, Alexei Miagkov
>     <amiagkov at gmail.com> wrote:
> 
>         Hey everyone,
> 
>         Alex, Cooper and I talked last week about what's most
>         important to work on for Privacy Badger (PB). We seem to agree
>         that the top pain point for PB users is site breakage. Here
>         are my notes:
> 
>         Users experience problems with sites while using PB, get
>         frustrated. Some uninstall PB. A few send emails, open issues on
>         GitHub; this is extra work for both users and developers.
> 
>         The cookieblock/yellow list is relied upon to fix breakages but
>         requires manual maintenance (high effort), and involves users
>         experiencing site problems for at least some time. (An automated
>         way to discover site breakage would help, but seems like a hard
>         problem.)
> 
>         We can reduce breakage by making the (cookie?) heuristic more
>         tolerant (more accurate/discerning in calling domains out as
>         trackers). The original parts of the heuristic haven't been
>         reviewed in a while.
> 
>         Why do so many CDNs seem to get blocked? They shouldn't,
>         right? If not, let's see what triggers blocking for them and
>         whether the heuristic can be adjusted.
> 
>         What are the other top examples of PB breaking sites? Breakages
>         caused by tightly-coupled scripts getting blocked might now be
>         fixable with the new surrogate script system (when doesn't this
>         work? scripts where tracking and functionality are inseparable?
>         such as?).
> 
> 
> 

