[HTTPS-E Rulesets] Example of site broken by Voxel ruleset; some missing default_off fields; etc.

Christopher Liu cmliu00151 at gmail.com
Fri Aug 24 22:19:47 PDT 2012


To whom it may concern:

To clarify my previous email: An example of a site actually broken by
the Voxel ruleset is www.barstoolsports.com , which uses
3432.voxcdn.com for most images and some stylesheets.
As stated previously, the Voxel requests return 404, and I suspect
that Voxel allows https access only to buckets of premium account
holders.

Syllabusshare-mismatches and openDesktop-mismatches have had their
default_off fields mistakenly omitted. Someone else filed
https://trac.torproject.org/projects/tor/ticket/6394 for the latter,
which was mistakenly closed as wontfix because no one realized what
the problem was exactly. No one seems to have entered a bug for the
former.

Observations on some tickets:
-Regarding https://trac.torproject.org/projects/tor/ticket/6592 ,
Zemanta seems to have a WordPress plugin on the server side and a
Firefox extension on the client side (
http://www.zemanta.com/download/ ) - thus this may be a dependency of
#3190, and there may not be a ruleset fix possible.
-The bug reported at
https://trac.torproject.org/projects/tor/ticket/6556 (which concerns
Dailymotion) is probably related to the one I mentioned a few emails
ago. I did successfully test some embedded videos after making the
changes I suggested.

Also, regarding a couple of things from me just recently committed to
the repository:
-The "EchoEnabled" ruleset's name may need to be reconsidered, as the
company's name is just Echo. See aboutecho.com
-As for Cheezburger Network, I've been working on a ruleset that
covers the actual sites and not just their image content.
I was originally going to hold this until after the 3.0 release,
seeing as you are not taking new rulesets for that branch; I
reconsidered after seeing the image-only ruleset in the repository (master).
I may as well attach what I have now, which is a bit too complicated
to explain here. I've no longer marked it as partial, for the
following reasons: (1) As far as I know, it is as comprehensive as
possible. (2) Most pages are fully secured when this is used, with
help from a few other (existing) rulesets. (3) The wildcard rule
should catch all future user-created sites.
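
To illustrate point (3), here is a rough sketch of how the wildcard rule and the first exclusion in the attachment interact. It assumes Python's re module behaves like the extension's JavaScript regex engine for these patterns, and rewrite() is a hypothetical helper, not part of HTTPS Everywhere. Only one of the ruleset's exclusions is modeled.

```python
import re

# Patterns copied from the attached ruleset. Python's `re` stands in for the
# JavaScript regex engine the extension uses (an assumption for this sketch).
EXCLUSION = re.compile(
    r"^http://(advertise|blog|corp|developer|feedback|images|jobs|static)"
    r"\.cheezburger\.com/"
)
WILDCARD = re.compile(r"^http://([\w\-]+\.)?cheezburger\.com/")


def rewrite(url: str) -> str:
    """Hypothetical helper: check the exclusion first, then apply the wildcard rule."""
    if EXCLUSION.search(url):
        return url  # excluded hosts stay on http
    # Group 1 is the optional subdomain (with trailing dot); it is None for the bare domain.
    return WILDCARD.sub(
        lambda m: "https://" + (m.group(1) or "") + "cheezburger.com/",
        url,
        count=1,
    )
```

A subdomain that does not exist yet, such as the made-up somenewsite.cheezburger.com, is rewritten to https without any change to the ruleset, while blog.cheezburger.com is left alone.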

C. Liu

P.S. Expect another email within a few days ... Again, thank you for
your time and help.
-------------- next part --------------
<ruleset name="Cheezburger Network">
<!-- Note on terminology:
     The "old" Cheezburger Network sites are icanhascheezburger.com, failblog.org, memebase.com, thedailywh.at
     and subdomains thereof.
     They have cert mismatches because of WordPress.com hosting; thus they are not handled here.
     They may go away soon per http://blog.cheezburger.com/announcements/beta-real-deal/

     The "new" Cheezburger Network sites are subdomains of cheezburger.com, with pages for individual posts at
     //cheezburger.com/\d+
-->
<!-- cheezburger.uservoice.com homepage redirects back to feedback.cheezburger.com
     Most images/CSS on feedback come from cdn.uservoice.com

     jobs.cheezburger.com IN CNAME theresumator.com

     advertise redirects to advertising
     (advertising|sites). IN CNAME unbouncepages.com
     unbouncepages-com.s3.amazonaws.com/(advertising|sites).cheezburger.com/

     (wac.1ddb|gs1.wac).edgecastcdn.net found in CNAME records of:
     - images.cheezburger.com: one image on jobs (only?)
     - static.cheezburger.com: at least one CSS file on the old sites; not used on the new sites
     - (a|i[0-3]).kym-cdn.com: image content on knowyourmeme
     i\d.kym-cdn originates from the S3 bucket kym-assets (which the site used directly before switching to EdgeCast).
     !work: https://gs1.wac.edgecastcdn.net/801DDB/kym-assets.s3.amazonaws.com/
     Other origins unknown - probably not S3, based on HTTP headers
-->
<!-- Other nonfunctional:
     - knowyourmeme.com
     - barium.cheezdev.com (protocol-relative URLs point to it, but no https response?
                            seems to be a tracking script from Amazon Elastic Load Balancing.
                            canonical name is BariumLoadBalancer-1809558234.us-east-1.elb.amazonaws.com
                            it provides no obvious functionality, so might not be worth a downgrade)

     The following are 404 even in http, hence not bugs in this ruleset:
     - cheezburger.com/home/log
       intended as a tracking script, found on sites.cheezburger.com
     - cookie.chzbgr.com/Home/Contact3rdPartyCookie[12]/
       found on contact form (IN CNAME cheezburger.com, no rule written because of cert mismatch)
     In both cases, the URLs have query parameters not shown here.
-->
   <target host="cheezburger.com" />
   <target host="*.cheezburger.com" />
      <exclusion pattern="^http://(advertise|blog|corp|developer|feedback|images|jobs|static)\.cheezburger\.com/" />
      <exclusion pattern="^http://advertising\.cheezburger\.com/(?!clkn/https?/((www\.)?dropbox|www\.isocket|s\.chzbgr)\.com/)" />
      <exclusion pattern="^http://sites\.cheezburger\.com/(?!clkn/https?/([\w\-]+\.)?cheezburger\.com/)" />
      <exclusion pattern="^http://sites\.cheezburger\.com/clkn/http/(advertise|advertising|blog|corp|developer|feedback|images|jobs|sites|static)\.cheezburger\.com/" />
      <!-- four defunct redirectors, now 403 even in http - links found in knowyourmeme.com's bottom navbar
           They were created several domain-name strategy changes ago, long before the new sites existed.
           They would probably have cert mismatches, but this needs testing. To what hosting service do they currently point? -->
      <exclusion pattern="^http://(derp|historiclols|intertuberecords|totsandgiggles)\.cheezburger\.com/" />
      <exclusion pattern="^http://(app|builder|profile|www)\.cheezburger\.com/sites/redirect\?url=(?!http%3A%2F%2F(dolanpls|olympics\.cheezburger)\.com%2F$)" />
   <target host="i.chzbgr.com" />
   <target host="s.chzbgr.com" />
   <target host="t.chzbgr.com" />
   <target host="i0.kym-cdn.com" />
   <target host="i1.kym-cdn.com" />
   <target host="i2.kym-cdn.com" />
   <target host="i3.kym-cdn.com" />
   <target host="chzb.gr" />
   <target host="dolanpls.com" />
   <target host="lolmart.com" />

<!-- Since there are cross-domain cookies, securecookie would be welcome, but testing assistance is needed
     from registered Cheezburger Network users who regularly post comments and use other features.

     Observed cross-domain cookies:
     - s_vsn_chzglobal_1
     - cheez_voting_id
     - ChzPid5
     - seenWelcomeMsg
     - bp_channel_id (related to Echo commenting system?)
     - s_sess
     - s_pers
     and Google Analytics cookies

     Cookies specific to app.cheezburger.com:
     - useg-1-ut-useg
     - ASP.NET_SessionId
     - Coyote-2-d0737673 (numbers can vary; similar cookies set by each user-created site?)
     Cookies specific to each of ^((failblog|icanhas|memebase|thedailywhat)\.)?cheezburger\.com$:
     - backplane-channel (related to Echo commenting system?)
     - __RequestVerificationToken_Lw__

     The totally unsecurable domains are mostly static information
     (except for blog|corp, but that doesn't use the main Cheezburger login).
     No paths on otherwise-secured domains are known to redirect to http.

     [ist].chzbgr.com don't appear to set any cookies
-->

<!-- Specially handle a couple of cases whose final destination is a new site, so as not to hit the downgrade -->
   <rule from="^https?://(?:app|builder|profile|www)\.cheezburger\.com/sites/redirect\?url=http%3A%2F%2Fdolanpls\.com%2F$"
           to="https://memebase.cheezburger.com/dolanpls" />
   <rule from="^https?://(?:app|builder|profile|www)\.cheezburger\.com/sites/redirect\?url=http%3A%2F%2Folympics\.cheezburger\.com%2F$"
           to="https://icanhas.cheezburger.com/roflympics/" />
<!-- When fetched over https, this always redirects to https, breaking most of the links
     on the (app|builder|profile|www).cheezburger.com homepage, which point to old sites. -->
   <rule from="^https://(app|builder|profile|www)\.cheezburger\.com/sites/redirect\?url="
           to="http://$1.cheezburger.com/sites/redirect?url=" downgrade="1" />

   <rule from="^http://advertising\.cheezburger\.com/clkn/https?/(?:www\.)?dropbox\.com/"
           to="https://www.dropbox.com/" />
   <rule from="^http://advertising\.cheezburger\.com/clkn/https?/www\.isocket\.com/"
           to="https://www.isocket.com/" />
   <rule from="^http://advertising\.cheezburger\.com/clkn/https?/s\.chzbgr\.com/"
           to="https://s.chzbgr.com/" />
   <rule from="^http://sites\.cheezburger\.com/clkn/https?/([\w\-]+\.)?cheezburger\.com/"
           to="https://$1cheezburger.com/" />

<!-- This may have query parameters SecretHostnameOverride and OnoBetaOptIn.
     When they are present, this normally redirects as:
     chzb.gr/id1234?SecretHostnameOverride=referring.site&OnoBetaOptIn=true
     - bit.ly/id1234?cc=someHexDigitsHere
     - target.site?OnoBetaOptIn=true

     Since the first bounce is beyond our ability to reproduce, just strip the parameters.
     bit.ly will redirect us to the right page anyway, sans OnoBetaOptIn
     which is only for tracking purposes.
     The capture is [^?]+ rather than .+ so that the query string is excluded from $1. -->
   <rule from="^https?://chzb\.gr/([^?]+)(?:\?.*)?$"
           to="https://bit.ly/$1" />

<!-- Known working:
     - //
     - app
     - builder
     - failblog
     - icanhas
     - lolmart
     - memebase
     - profile
     - search
     - thedailywhat
     - www
     - User-created sites (too many to list here)

     app, builder, profile, and www are all equivalent to each other but not to //. -->
   <rule from="^http://([\w\-]+\.)?cheezburger\.com/"
           to="https://$1cheezburger.com/" />

   <rule from="^http://([a-z])\.chzbgr\.com/"
           to="https://$1.chzbgr.com/" />

<!-- This deliberately uses both forms of S3 URLs.
     Mainly, this attempts to avoid throttling issues such as those seen on Tumblr
     (https://gitweb.torproject.org/https-everywhere.git/commitdiff/4d2e2e5173f4393c0b24ed4271a728574f24c400),
     which are a theoretical concern on the image gallery pages.
     Domain sharding is partly preserved, with an arbitrary choice of numbering,
     as all i\d are equivalent to each other and to the S3 bucket anyway.
-->
   <rule from="^http://i[02]\.kym-cdn\.com/"
           to="https://s3.amazonaws.com/kym-assets/" />
   <rule from="^http://i[13]\.kym-cdn\.com/"
           to="https://kym-assets.s3.amazonaws.com/" />

   <rule from="^https?://dolanpls\.com/?$"
           to="https://memebase.cheezburger.com/dolanpls" />
   <rule from="^https?://lolmart\.com/"
           to="https://lolmart.cheezburger.com/" />

<!-- Bad protocol-relative links in the navbars of failblog|icanhas|memebase|thedailywhat -->
   <rule from="^https://(advertising|blog|developer|feedback|jobs|sites)\.cheezburger\.com/?$"
           to="http://$1.cheezburger.com/" downgrade="1" />
   <rule from="^https://corp\.cheezburger\.com/(terms-of-service|privacy-policy|copyright-infringement-notification)/$"
           to="http://corp.cheezburger.com/$1/" downgrade="1" />
</ruleset>
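
The parameter stripping described in the chzb.gr comment can be checked outside the browser. The following is a minimal sketch, assuming Python's re module matches the extension's JavaScript regex engine for these patterns; to_bitly() and the two pattern names are hypothetical. It contrasts a capture that stops at the first "?" with a greedy ".+" capture, which would carry the query string into the bit.ly URL.

```python
import re

# Python's `re` stands in for the extension's JavaScript regex engine (assumption).
# STRIPPING stops its capture at the first "?", so the query never reaches group 1.
STRIPPING = re.compile(r"^https?://chzb\.gr/([^?]+)(?:\?.*)?$")
# For contrast: a greedy ".+" swallows the query, leaving the optional group empty.
GREEDY = re.compile(r"^https?://chzb\.gr/(.+)(?:\?.*)?$")


def to_bitly(url, pattern=STRIPPING):
    """Hypothetical helper: rewrite a chzb.gr short link to its bit.ly form."""
    match = pattern.match(url)
    return f"https://bit.ly/{match.group(1)}" if match else url
```

With the example URL from the comment, the stripping pattern yields https://bit.ly/id1234, whereas the greedy pattern keeps both SecretHostnameOverride and OnoBetaOptIn in the result.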

