IEInternals

A look at Internet Explorer from the inside out. @EricLaw left Microsoft in 2012, but was named an IE MVP in '13 & an IE userAgent (http://useragents.ie) in '14

Understanding Domain Names in Internet Explorer


Web browsers use domain names for a variety of purposes, but how they’re used is much more complicated than most developers realize. In this post, I’ll attempt to cover the most important aspects of this topic.

Definitions

When talking about “domains” the terminology alone is confusing (and contentious).  So, let’s start with some simplistic definitions for terms used in this post:

  • A label is a single component of a domain name string, delimited by periods. For instance, “www”, “microsoft”, and “com” are the three labels in the domain name “www.microsoft.com”.
  • A plainhostname is an unqualified, single label hostname like “Payroll”, which typically refers to a server on a local intranet.
  • A FQDN is an absolute, fully-qualified domain name, like “www.microsoft.com”.
  • A Public Suffix is the suffix portion of a FQDN under which independent entities may register subdomains. For example, ltd.co.im is a Public Suffix. A Public Suffix contains one or more labels. Sometimes the term “effective TLD” is used as a synonym.
  • A TLD is a top-level-domain, the right-most label of a domain name
  • A gTLD is a generic TLD, like “.com”, “.net”, “.gov”, etc.
  • A ccTLD is a country-code TLD, like “.us” or “.ru”
  • ICANN (the Internet Corporation for Assigned Names and Numbers) is responsible for the creation and management of TLDs

When web developers talk about “the domain,” they’re often referring to what this post calls the Private Domain:

  • A Private Domain is a single label with a Public Suffix appended.

For instance, the two Private Domains “Acme.ltd.co.im” and “Bayden.ltd.co.im”, are each independently operated subdomains of Public Suffix “ltd.co.im”.
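To make these definitions concrete, here is a minimal Python sketch. It assumes the Public Suffix is already known (how a browser finds it is the hard part, covered later in this post); the function names split_labels and private_domain are made up for this illustration.

```python
def split_labels(name: str) -> list[str]:
    """Split a domain name into its labels, ignoring a trailing dot."""
    return name.rstrip(".").split(".")


def private_domain(fqdn: str, public_suffix: str) -> str:
    """Return the Private Domain: one label plus the given Public Suffix.

    Returns "" when the name is the Public Suffix itself or does not
    end with it (an assumption made for this sketch).
    """
    name = split_labels(fqdn.lower())
    suffix = split_labels(public_suffix.lower())
    if len(name) <= len(suffix) or name[-len(suffix):] != suffix:
        return ""
    # Keep exactly one label to the left of the Public Suffix.
    return ".".join(name[-(len(suffix) + 1):])


print(split_labels("www.microsoft.com"))
print(private_domain("www.Acme.ltd.co.im", "ltd.co.im"))
```

With the suffix “ltd.co.im”, any host under “Acme.ltd.co.im” maps to the same Private Domain, while “Bayden.ltd.co.im” maps to a different one.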

Okay, now on to the fun stuff.

Domains and the IURI Interface

First, some foreshadowing…

IE7 and above use a Consolidated URI handling feature which exposes the IURI interface.  Let’s have a quick look at a partial list of IURI property values from a sample URI: http://www.example.com/path/file.ext?query=val#frag

Uri_PROPERTY_ABSOLUTE_URI "http://www.example.com/path/file.ext?query=val#frag"
Uri_PROPERTY_DISPLAY_URI "http://www.example.com/path/file.ext?query=val#frag"
Uri_PROPERTY_RAW_URI "http://www.example.com/path/file.ext?query=val#frag"
Uri_PROPERTY_SCHEME_NAME "http"
Uri_PROPERTY_DOMAIN (aka Private Domain) "example.com"
Uri_PROPERTY_HOST (aka FQDN or Plainhostname) "www.example.com"
Uri_PROPERTY_HOST_TYPE 1
Uri_PROPERTY_PORT 80
Uri_PROPERTY_PATH "/path/file.ext"
Uri_PROPERTY_QUERY "?query=val"

It’s important to note that if the URI contains only a plainhostname (e.g. “http://example/”) or a Public Suffix (e.g. “http://co.uk/”), then Uri_PROPERTY_DOMAIN is null.

Why Do Browsers Care About Domains?

Every browser must be able to determine the Private Domain for a number of uses, but in this post I’ll concentrate on IE’s use of this information.

1. Domain Highlighting in the Address Bar

IE8’s Domain Highlighting feature renders the Private Domain in black text and the rest of the URL in gray to help prevent the use of misleading URLs in spoofing attacks.

If the URL contains a plainhostname, the address bar will render the plainhostname in black instead.

2. Quota management for Local Storage

IE8 applies a per-Private Domain quota to values stored using the HTML5 Local Storage API.

If the Uri_PROPERTY_DOMAIN is null (because the URL contains a plainhostname) the browser will enforce the quota against Uri_PROPERTY_HOST instead.

3. document.domain relaxation

Same-Origin-Policy typically means that two pages must have exactly-matching FQDNs in order to script against each other’s DOM. However, HTML allows a page to relax its document.domain property to a suffix of its current value to enable cross-host DOM communication within a single Private Domain. Script is not permitted to change its document.domain property to a string shorter than the Private Domain. This prevents sites from unrelated organizations from intentionally or inadvertently scripting against each other’s DOM.
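The relaxation rule can be sketched as a small check. This is only an illustration in Python, not browser code: the function name can_relax_document_domain is hypothetical, and the handling of hosts with no Private Domain (exact match only) is an assumption.

```python
def can_relax_document_domain(current: str, proposed: str,
                              private_domain: str) -> bool:
    """Return True if document.domain may change from current to proposed."""
    current = current.lower()
    proposed = proposed.lower()
    private = private_domain.lower()
    if not private:
        # Assumption: with no Private Domain (e.g. a plainhostname),
        # only the unchanged value is accepted.
        return proposed == current
    # The new value must be a label-aligned suffix of the current value...
    is_suffix = current == proposed or current.endswith("." + proposed)
    # ...and must not be shorter than the Private Domain.
    keeps_private = proposed == private or proposed.endswith("." + private)
    return is_suffix and keeps_private


print(can_relax_document_domain("www.example.com", "example.com", "example.com"))
print(can_relax_document_domain("www.example.com", "com", "example.com"))
```

The first call is permitted; the second is refused because “com” is shorter than the Private Domain.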

4. HTTP Cookies

When setting a cookie, a website may specify which hosts the cookie should be sent to using the domain attribute. The browser must block attempts to set a cookie where the domain attribute does not end with the current page’s Private Domain. Failure to do so results in privacy and security concerns.

Privacy: Allowing unrelated domains to share cookies can result in “super-cookies”: cookies which are sent to multiple unrelated organizations that happen to share a Public Suffix.

Security: If a good site and an evil site share a Public Suffix, the evil site could mount a session-fixation attack by setting a malicious cookie on the Public Suffix, so that the good site is sent the evil cookie.
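As a rough sketch of the cookie rule described above, the check below rejects a domain attribute that does not end with the page’s Private Domain. It is written in Python for illustration only. The function names are hypothetical, and two details are assumptions rather than anything stated here: that the page’s host must itself fall under the domain attribute, and that a host with no Private Domain only accepts an exact match.

```python
def ends_with_domain(name: str, suffix: str) -> bool:
    """True if name equals suffix or ends with it on a label boundary."""
    return name == suffix or name.endswith("." + suffix)


def cookie_domain_allowed(domain_attr: str, page_host: str,
                          private_domain: str) -> bool:
    """Decide whether a Set-Cookie domain attribute should be accepted."""
    attr = domain_attr.lower().lstrip(".")  # a leading dot is ignored
    host = page_host.lower()
    private = private_domain.lower()
    if not private:
        # Assumption: no Private Domain means exact host match only.
        return attr == host
    # Block anything wider than the Private Domain (e.g. a Public Suffix),
    # and anything the current host does not belong to.
    return ends_with_domain(attr, private) and ends_with_domain(host, attr)


host, private = "www.acme.ltd.co.im", "acme.ltd.co.im"
print(cookie_domain_allowed("acme.ltd.co.im", host, private))  # accepted
print(cookie_domain_allowed(".ltd.co.im", host, private))      # blocked
```

Blocking “.ltd.co.im” is what prevents both the super-cookie and the session-fixation scenarios, since that cookie would otherwise also be sent to “Bayden.ltd.co.im”.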

5. Security Zones – Mapping Domains to Zones

Because Public Suffixes are typically shared by multiple unrelated organizations, URLMon does not permit users to add all sites in a given Public Suffix to a security zone.

We are aware that there are scenarios where such assignments may be desirable to some organizations (e.g. perhaps I would like to assign *.mil to the Trusted Sites Zone).

6. Security Zones – Automatic Zone Determination

URLMon (subject to some caveats) is configured by default to map plainhostnames to the Intranet zone.

7. Per-site ActiveX

When the user uses the Information Bar to allow an ActiveX control to run, Internet Explorer 8’s Per-Site ActiveX feature adds the current Private Domain to the Allow list for that control.

8. Compatibility View

Internet Explorer 8’s Compatibility View button adds the current Private Domain to the compatibility view list.

9. XSS Filter

IE8’s XSS Filter uses the Private Domain to determine whether a given navigation crosses from one Private Domain to another.

10. InPrivate Filtering

IE8’s InPrivate Filtering feature uses Private Domain information to help determine whether a given request is being sent to a 3rd party site.

11. Preserve Favorite Website Data

IE8’s Delete Browsing History feature includes a new “Preserve Favorites website data” option. As I described back in this post from June, this feature relies on the Private Domain to help determine whether stored data is related to one of the user’s favorite websites.

The Challenge of ccTLDs

In the early days of the web, most ccTLDs were organized in such a way that it was relatively easy to heuristically determine the Public Suffix of any FQDN. Over time, however, different ccTLDs decided that they wanted to create new Public Suffixes within their ccTLD, or decided to allow registration of Private Domains that the heuristics would incorrectly treat as Public Suffixes. Some nations (like Tuvalu) have outsourced registration of subdomains and allow anyone to obtain Private Domains within their ccTLD (.TV).

Prior to IE8, there was no one codepath in IE where the Private Domain was calculated, so over time several point-fixes were made to liberalize cookie setting in certain ccTLDs.

The heuristic Private Domain determination algorithm in IE5+ is:

1> If the final label is empty, drop it for the purposes of this algorithm 
Otherwise "www.example.com." would have four labels "www", "example", "com", "".  Instead, we drop the final label.

2> Name the labels Ln,...,L3,L2,L1; decreasing from start (Leftmost=Ln) to finish (Rightmost=L1). 
If at any point in this algorithm the result demands >n labels, getPrivateDomain returns "".

3> Check n > 1.  If not, there's no PublicSuffix, just a plainhostname. Return ""; exit. 
Dotless FQDNs consist of a host only; there is no domain.

4> Check L1 == "tv".  If so, getPrivateDomain returns L2.L1; exit. 
"tv" is a special-case "completely flat" ccTLD for historical reasons.

5> Check Len(L1) > 2.  If so, getPrivateDomain returns L2.L1; exit. 
Len(L1)>2 suggests L1 is a gTLD rather than a ccTLD. 
If Len(L1)<=2 we assume L1 is a part of a ccTLD.

6> Check if L2 in gTLD list "com,edu,net,org,gov,mil,int".  If so, getPrivateDomain returns L3.L2.L1; exit. 
gTLDs, when they appear immediately left of a ccTLD (modulo exception in step 4), are considered a part of the Public Suffix.

7> If L1 is in the list "GR,PL" AND L2 is NOT in the gTLD list, getPrivateDomain returns L2.L1; exit. 
GR and PL are considered "flat" ccTLDs EXCEPT when a gTLD appears in L2. 
getPrivateDomain("a.pl") returns "a.pl" 
getPrivateDomain("a.uk") returns ""

8> If Len(L2) < 3 getPrivateDomain returns L3.L2.L1; exit. 
getPrivateDomain("aa.bb.cc") returns "aa.bb.cc"

9> Otherwise, getPrivateDomain returns L2.L1 
getPrivateDomain("aa.bbb.cc") returns "bbb.cc"
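To make the steps above concrete, here is a minimal Python sketch of the heuristic as described. It is an illustration only, not IE’s actual code: the function name get_private_domain, the helper last_labels, and the lowercase normalization are assumptions made for this example.

```python
GTLD_LABELS = {"com", "edu", "net", "org", "gov", "mil", "int"}
FLAT_CCTLDS = {"gr", "pl"}


def get_private_domain(host: str) -> str:
    labels = host.lower().split(".")
    # Step 1: drop an empty final label ("www.example.com.").
    if labels and labels[-1] == "":
        labels.pop()
    n = len(labels)

    def last_labels(count: int) -> str:
        # Step 2: if the result demands more than n labels, return "".
        return ".".join(labels[-count:]) if count <= n else ""

    # Step 3: a single label is a plainhostname; there is no domain.
    if n <= 1:
        return ""
    l1, l2 = labels[-1], labels[-2]
    # Step 4: "tv" is treated as completely flat.
    if l1 == "tv":
        return last_labels(2)
    # Step 5: more than two characters suggests a gTLD.
    if len(l1) > 2:
        return last_labels(2)
    # Step 6: a gTLD label left of a ccTLD is part of the Public Suffix.
    if l2 in GTLD_LABELS:
        return last_labels(3)
    # Step 7: flat ccTLDs (L2 is already known not to be a gTLD label).
    if l1 in FLAT_CCTLDS:
        return last_labels(2)
    # Step 8: a short L2 is assumed to be part of the Public Suffix.
    if len(l2) < 3:
        return last_labels(3)
    # Step 9: otherwise the ccTLD alone is the Public Suffix.
    return last_labels(2)


for name in ("a.pl", "a.uk", "aa.bb.cc", "aa.bbb.cc", "co.uk"):
    print(name, "->", repr(get_private_domain(name)))
```

Running the loop reproduces the examples given in the steps: “a.pl” yields “a.pl”, “a.uk” yields an empty string, “aa.bb.cc” yields “aa.bb.cc”, and “aa.bbb.cc” yields “bbb.cc”.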

While this heuristic worked pretty well for many years (and still works reasonably well in general), it was clearly becoming increasingly complicated because each ccTLD established different operating practices (and those, in turn, changed over time).

Changes in Internet Explorer 8

For IE8, we’ve updated major codepaths to use CURI’s Uri_PROPERTY_DOMAIN for Private Domain determination, helping to ensure consistency throughout the various browser components.

IE8's version of URLMon maintains a list of special-cases which are used as exceptions to the default heuristics that CURI uses. You can click this link to view the list maintained as an XML resource inside URLMon.dll. The list contains elements which should be treated as Public Suffixes (the XML nodes named “tld”) and elements which should be treated as private domains (the XML nodes named “domain”).
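Here is a hedged Python sketch of how an exception list like this could be layered over a heuristic. Only the “tld” and “domain” node names come from the description above. The root element name, the order in which entries are consulted, and the function names are all assumptions, and the heuristic is passed in as a placeholder callable. The three sample entries are names mentioned in the comments on this post.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the root element name "exceptions" is an assumption.
SAMPLE_XML = """
<exceptions>
  <tld>ltd.co.im</tld>
  <domain>nu.nl</domain>
  <domain>ya.ru</domain>
</exceptions>
"""


def load_exceptions(xml_text: str) -> tuple[set[str], set[str]]:
    """Return (public_suffixes, private_domains) read from the XML."""
    root = ET.fromstring(xml_text.strip())
    suffixes = {e.text.strip().lower() for e in root.iter("tld") if e.text}
    domains = {e.text.strip().lower() for e in root.iter("domain") if e.text}
    return suffixes, domains


def private_domain(host, suffixes, domains, heuristic):
    """Consult the exception list first, then fall back to the heuristic."""
    labels = host.lower().rstrip(".").split(".")
    # Assumption: check candidates from the longest suffix to the shortest.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in domains:
            return candidate  # listed explicitly as a Private Domain
        if candidate in suffixes:
            # Private Domain = one more label to the left, if there is one.
            return ".".join(labels[i - 1:]) if i > 0 else ""
    return heuristic(host)


suffixes, domains = load_exceptions(SAMPLE_XML)
print(private_domain("www.nu.nl", suffixes, domains, lambda h: ""))
print(private_domain("www.acme.ltd.co.im", suffixes, domains, lambda h: ""))
```

Under the default heuristic, “nu.nl” would look like a Public Suffix; the “domain” entry is what lets it be treated as a Private Domain instead.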

From a browser architecture perspective, lists like this one are the option of last resort, for a number of important reasons. However, there’s currently no standard that promises relief. One proposal which has been discussed in a few forums is to allow the DNS itself to indicate (via a new record) which names are part of a Public Suffix and which are part of a Private Domain, but that approach is not without problems.

The (Coming) Challenges with gTLDs

ICANN recently voted to allow organizations to create new generic TLDs. Introduction of new gTLDs may introduce additional problems, because previously most of the “special cases” were found only in ccTLDs. Other parties (like Certificate Authorities) would also likely be significantly impacted by this liberalization of gTLDs.

As this area is still developing, it will likely be the topic of a future post. (For now, see this one)

Until then…

-Eric

  • I'd just like to point out that I've never heard of the ltd.co.uk suffix and am, in fact, struggling to find anything that uses it :D

    googling for site:.ltd.co.uk finds nothing, for instance

  • @frymaster: ltd.co.uk was only used as an example; if you check the list in URLMon, you'll find similar examples, like ltd.co.im and plc.co.im. I'll update the post to remove any confusion.

    FWIW: Ltd.uk is an example of a public suffix which isn't matched by the default heuristics (e.g. see https://bugzilla.mozilla.org/show_bug.cgi?id=252342)

  • Hi,

    I remember in the IE8 testing days, users complaining of not being able to access domains with two character top level domains. eg. nl.com that also matched an ICANN country code.

    I presume the list of domains in the res://urlmon.dll/ietldlist.xml resource

    <domain>cn.ca</domain> <domain>dr.dk</domain> <domain>rd.dk</domain> 
    <domain>m6.fr</domain> <domain>cz.nl</domain> <domain>nu.nl</domain>
    <domain>ou.nl</domain> <domain>vg.no</domain> <domain>km.ru</domain> 
    <domain>nm.ru</domain> <domain>ot.ru</domain> <domain>ya.ru</domain>

    flags these domains.

    Are there any standards to stop domain registrants from using "xx.yy" as a Private Domain or will this list have to be updated over time?

    Regards.

  • @Rob, I'm not sure I recall any such reports. No, "nl.com" and the like are not at all impacted by the heuristics, as "nl.com" doesn't match the "xx.yy" pattern of a ccTLD Public Suffix.

    As discussed in the text above, by default, FQDNs of the form "xx.yy" are treated as a public suffix. This has implications for cookies and so on, as described in the "Why do browsers care about domains" section.

  • Mozilla, and I think other browsers, used the Public Suffix List:

    http://publicsuffix.org/

  • @Tor: Yup. There's a discussion of this over at http://weblogs.mozillazine.org/gerv/archives/2009/11/ie_8_and_the_public_suffix_list.html

  • We have a two-letter domain, xx.tt, which gets an error if we set document.domain="xx.tt" in JavaScript. It works in all browsers except IE (all versions).

    What can we do to work around this?

  • @Tobias: Your question is basically answered by the text of the article. For IE7 and below, you can restructure your domain; that's the only workaround. For IE8 and later, you could request to have your domain added to IE's TLD list. You'll need to contact Microsoft Support for this-- it's not something I can help you with.

  • Thanks for your answer. We did try to do a hack with www.xx.tt, but weren't able to make it work - but it might have been a bug on our part.

    Now we are playing around with http://easyxdm.net/ to see if that helps us.

    -Tobias

  • We would like to see .bv.nl added to the IE list. The Public Suffix is on the http://publicsuffix.org/ list. But how do we get this added in IE?

  • @Pieter: As I mentioned a few comments back, domain owners/registries may contact Microsoft Product support to validate their ownership and request changes in the TLD list. Thanks!

  • But how do updates get into the list used by IE? I remember getting certificate and time-zone updates all the time through WU but I have yet to see a domain suffix list update or are they (an undocumented?) part of the security updates? How quickly can we expect these updates to reach over 90% of the users?

  • @Bruno: The XML files containing the TLD list are a part of the bi-monthly cumulative update to Internet Explorer. They live as a resource inside URLMon.dll (mentioned in the comments above) in IE9 and lower.

  • @Jean-Marc, the problem is that I've got 2 different web apps on the same JBoss, and the root context of the second begins with that of the first. My first web app has /cas as its root context. The second has /cas-services. IE10 sends the JSESSIONID cookie created for /cas with /cas-services calls. The bug can be followed here: connect.microsoft.com/.../ie10-redirect-on-ajax-request-not-always-followed

    JM.

    EricLaw: Your comment doesn't really belong on this post, since it has nothing to do with domain names. Nevertheless, the behavior you describe is by design.

    http://web.archive.org/web/20080205173011/wp.netscape.com/newsref/std/cookie_spec.html
    path=PATH The path attribute is used to specify the subset of URLs in a domain for which the cookie is valid. If a cookie has already passed domain matching, then the pathname component of the URL is compared with the path attribute, and if there is a match, the cookie is considered valid and is sent along with the URL request. The path "/foo" would match "/foobar" and "/foo/bar.html". The path "/" is the most general path.

  • List of new gTLDs: newgtlds.icann.org/.../delegated-strings
