<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
The ruleset style guide is now checked in, but we are still
accepting modifications if you have suggestions on how it should be
different:
<a class="moz-txt-link-freetext" href="https://github.com/EFForg/https-everywhere/blob/master/ruleset-style.md">https://github.com/EFForg/https-everywhere/blob/master/ruleset-style.md</a><br>
<br>
The main notable change here is that we are now encouraging explicit
listing of subdomains in &lt;target&gt; tags, *unless* you have a rule
that rewrites every subdomain automatically (but these are
rare-ish). This makes it much easier to achieve sufficient test URL
coverage.<br>
<br>
Here's an example of a rule updated for the new style:
<a class="moz-txt-link-freetext" href="https://github.com/EFForg/https-everywhere/pull/1050/files#diff-dd70919bf072f3db6882e974123c2058L30">https://github.com/EFForg/https-everywhere/pull/1050/files#diff-dd70919bf072f3db6882e974123c2058L30</a><br>
<br>
Thanks!<br>
<br>
--------------------------------------------------<br>
<h1>Ruleset Style Guide</h1>
<p>Goal: rules should be written in a way that is consistent, easy
for humans to
read and debug, reduces the chance of errors, and makes testing
easy.</p>
<p>To that end, here are some style guidelines for writing or
modifying rulesets.
They are intended to help and simplify in places where choices are
ambiguous,
but like all guidelines they can be broken if the circumstances
require it.</p>
<p>Avoid using the left-wildcard ("&lt;target
host='*.example.com'&gt;") unless you
really mean it. Many rules today specify a left-wildcard target,
but the
rewrite rules only rewrite an explicit list of hostnames.</p>
<p>Instead, prefer listing explicit target hosts and a single
rewrite from "^http:" to
"^https:". This saves you time as a ruleset author because each
explicit target
host automatically creates an implicit test URL, reducing the
need to add your
own test URLs. These also make it easier for someone reading the
ruleset to figure out
which subdomains are covered.</p>
<p>If you know all subdomains of a given domain support HTTPS, go
ahead and use a
left-wildcard, along with a plain rewrite from "^http:" to
"^https:". Make sure
to add a bunch of test URLs for the more important subdomains. If
you're not
sure what subdomains might exist, check the 'subdomain' tab on
Wolfram Alpha:
<a
href="http://www.wolframalpha.com/input/?i=_YOUR_DOMAIN_GOES_HERE_">http://www.wolframalpha.com/input/?i=_YOUR_DOMAIN_GOES_HERE_</a>.</p>
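<p>As a sketch (example.com and its subdomains here are placeholders,
not a real ruleset), such a wildcard ruleset might look like:</p>
<pre><code>&lt;ruleset name="Example.com"&gt;
  &lt;target host="example.com" /&gt;
  &lt;target host="*.example.com" /&gt;
  &lt;test url="http://www.example.com/" /&gt;
  &lt;test url="http://blog.example.com/" /&gt;
  &lt;test url="http://shop.example.com/" /&gt;
  &lt;rule from="^http:" to="https:" /&gt;
&lt;/ruleset&gt;
</code></pre>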
<p>If there are a handful of tricky subdomains, but most subdomains
can handle the
plain rewrite from "^http:" to "^https:", specify the rules for
the tricky
subdomains first, and then the plain rule last. Earlier rules
take
precedence, and processing stops at the first matching rule. There
may be a tiny
performance hit for processing exception cases earlier in the
ruleset and the
common case last, but in most cases the performance issue is
trumped by readability.</p>
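<p>For example, using a hypothetical domain where one subdomain needs
a special rewrite:</p>
<pre><code>&lt;ruleset name="Example.com"&gt;
  &lt;target host="example.com" /&gt;
  &lt;target host="www.example.com" /&gt;
  &lt;target host="cdn.example.com" /&gt;
  &lt;!-- Tricky case first: this host only serves HTTPS from another name --&gt;
  &lt;rule from="^http://cdn\.example\.com/" to="https://secure.example.com/" /&gt;
  &lt;!-- Plain rule last; it catches everything the rule above did not --&gt;
  &lt;rule from="^http:" to="https:" /&gt;
&lt;/ruleset&gt;
</code></pre>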
<p>Avoid regexes with long strings of subdomains, e.g. &lt;rule
from="^http://(foo|bar|baz|bananas).example.com" /&gt;. These are
hard to read and
maintain, and are usually better expressed with a longer list of
target hosts,
plus a plain rewrite from "^http:" to "^https:".</p>
<p>Prefer dashes over underscores in filenames. Dashes are easier to
type.</p>
<p>When matching an arbitrary DNS label (a single component of a
hostname), prefer
<code>([\w-]+)</code> for a single label (e.g. www), or <code>([\w-.]+)</code>
for multiple labels
(e.g. www.beta). Avoid more
visually complicated options like <code>([^/:@\.]+\.)?</code>.</p>
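<p>For instance, a rule that rewrites any single-label subdomain of a
hypothetical example.com:</p>
<pre><code>&lt;rule from="^http://([\w-]+)\.example\.com/" to="https://$1.example.com/" /&gt;
</code></pre>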
<p>For <code>securecookie</code> tags, it's common to match any
cookie name. For these, prefer
<code>.+</code> over <code>.*</code>. They are functionally
equivalent, but it's nice to be
consistent.</p>
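<p>A typical securecookie tag in this style, matching every cookie
name on every matched host, would be:</p>
<pre><code>&lt;securecookie host=".+" name=".+" /&gt;
</code></pre>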
<p>Avoid the negative lookahead operator <code>?!</code>. This is
almost always better
expressed using positive rule tags and negative exclusion tags.
Some rulesets
have exclusion tags that contain negative lookahead operators,
which is very
confusing.</p>
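<p>As a sketch with a hypothetical example.com: rather than a
lookahead such as <code>^http://(?!login\.)([\w-]+)\.example\.com/</code>,
carve the exception out with an exclusion tag and keep the rule
positive:</p>
<pre><code>&lt;exclusion pattern="^http://login\.example\.com/" /&gt;
&lt;rule from="^http:" to="https:" /&gt;
</code></pre>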
<p>Prefer capturing groups <code>(www\.)?</code> over non-capturing
<code>(?:www\.)?</code>. The
non-capturing form adds extra line noise that makes rules harder
to read.
Generally you can achieve the same effect by choosing a
correspondingly higher
index for your replacement group to account for the groups you
don't care about.</p>
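<p>For instance (again with a hypothetical example.com), converting a
non-capturing group to a capturing one simply shifts the group you
care about to a higher index:</p>
<pre><code>&lt;!-- Non-capturing form: the label is group 1 --&gt;
&lt;rule from="^http://(?:www|m)\.([\w-]+)\.example\.com/" to="https://$1.example.com/" /&gt;

&lt;!-- Preferred capturing form: the label becomes group 2 --&gt;
&lt;rule from="^http://(www|m)\.([\w-]+)\.example\.com/" to="https://$2.example.com/" /&gt;
</code></pre>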
<p>Here is an example ruleset today:</p>
<pre><code>&lt;ruleset name="WHATWG.org"&gt;
  &lt;target host="whatwg.org" /&gt;
  &lt;target host="*.whatwg.org" /&gt;
  &lt;rule from="^http://((?:developers|html-differences|images|resources|\w+\.spec|wiki|www)\.)?whatwg\.org/"
        to="https://$1whatwg.org/" /&gt;
&lt;/ruleset&gt;
</code></pre>
<p>Here is how you could rewrite it according to these style
guidelines, including
test URLs:</p>
<pre><code>&lt;ruleset name="WHATWG.org"&gt;
  &lt;target host="whatwg.org" /&gt;
  &lt;target host="developers.whatwg.org" /&gt;
  &lt;target host="html-differences.whatwg.org" /&gt;
  &lt;target host="images.whatwg.org" /&gt;
  &lt;target host="resources.whatwg.org" /&gt;
  &lt;target host="*.spec.whatwg.org" /&gt;
  &lt;target host="wiki.whatwg.org" /&gt;
  &lt;target host="www.whatwg.org" /&gt;
  &lt;test url="http://html.spec.whatwg.org/" /&gt;
  &lt;test url="http://fetch.spec.whatwg.org/" /&gt;
  &lt;test url="http://xhr.spec.whatwg.org/" /&gt;
  &lt;test url="http://dom.spec.whatwg.org/" /&gt;
  &lt;rule from="^http:"
        to="https:" /&gt;
&lt;/ruleset&gt;
</code></pre>
<br>
</body>
</html>