IEInternals

A look at Internet Explorer from the inside out. @EricLaw left Microsoft in 2012, but was named an IE MVP in '13 & an IE userAgent (http://useragents.ie) in '14

IE and the Accept Header


RFC 2616 describes the Accept request header as follows:

The Accept request-header field can be used to specify certain media types which are acceptable for the response. Accept headers can be used to indicate that the request is specifically limited to a small set of desired types, as in the case of a request for an in-line image.

While I’ll spare you a debate about the merits and pitfalls of server-driven content negotiation, suffice it to say that I think that such negotiation is impractical (and/or suboptimal) for most scenarios encountered by general-purpose web browsers. The primary reason to spare the debate (and inevitable flame war) is that it’s mostly a moot point, because all versions of Internet Explorer are seriously limited when it comes to support of the Accept header. Note: other browsers actually suffer from similar limitations in certain cases; I’m focused on IE in this post because it’s the browser I work on.

Let’s take a look at what IE sends in the Accept header.

Install a recent version of Fiddler, run it, and click Rules > Customize Rules.  Scroll to the static function Main() block, and add the following line within:

FiddlerObject.UI.lvSessions.AddBoundColumn("Accept", 50, "@request.Accept");

Save the file, and you’ll see that a new column titled Accept appears within the Fiddler UI, showing the value of the Accept request header.

Navigate to websites, and watch the value of the Accept header sent in each request.  You’ll quickly notice that in almost all cases, the Accept header contains */*, meaning that IE is willing to accept documents of any MIME type. This is technically accurate, insofar as IE will offer to download and save MIME-types it doesn’t know how to render.

However, in some navigations, you’ll see that IE sends a more complete string, containing a wide variety of MIME-types.  For instance:

image/jpeg, application/x-ms-application, image/gif, application/xaml+xml, image/pjpeg, application/x-ms-xbap, application/msword, application/vnd.ms-excel, application/x-shockwave-flash, */*

Hit F5 to refresh that page, and IE will probably go back to sending */* again.  Clearly, IE is inconsistent in what it chooses to send in the Accept header, but by now you’re probably curious where these MIME types even come from.

IE generates this list by enumerating the values listed in the following registry key:

HKLM\Software\Microsoft\Windows\CurrentVersion\Internet Settings\Accepted Documents

Any application can, upon install, attempt to advertise the MIME-types it supports using this registry key.  However, I strongly recommend that developers not list MIME types here. 

Why not?  Well, first of all, the list isn’t typically sent, so you cannot write server code which checks for your MIME-type and conditionally serves content of that type only if the client asks for it.  In almost all cases, servers will end up getting requests for content of type */* and return content in your format anyway.
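To see why a server can't rely on the list, here is a minimal Python sketch of the kind of check a server might attempt. The helper names parse_accept and explicitly_lists are hypothetical, and the parsing is deliberately simplified (it drops q-values and other parameters rather than honoring them).

```python
def parse_accept(header_value):
    """Split an Accept header into lowercase media types, dropping parameters."""
    types = []
    for part in header_value.split(","):
        media_type = part.split(";", 1)[0].strip().lower()
        if media_type:
            types.append(media_type)
    return types


def explicitly_lists(header_value, mime_type):
    """True only if the client named mime_type itself, not merely a wildcard."""
    return mime_type.lower() in parse_accept(header_value)


# The two shapes of Accept header that IE sends, as shown earlier in this post.
full_list = ", ".join([
    "image/jpeg", "application/x-ms-application", "image/gif",
    "application/xaml+xml", "image/pjpeg", "application/x-ms-xbap",
    "application/msword", "application/vnd.ms-excel",
    "application/x-shockwave-flash", "*/*",
])
wildcard_only = "*/*"

print(explicitly_lists(full_list, "application/xaml+xml"))      # True
print(explicitly_lists(wildcard_only, "application/xaml+xml"))  # False
```

The same browser on the same machine can produce either result depending on the navigation, so the check tells the server nothing dependable about what the client supports.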

More importantly, it turns out that Accept headers containing custom types cause two serious problems:

  1. Slower performance due to bloated requests
  2. Server errors on sites expecting headers of fixed maximum length

The first problem is pretty obvious: any time the full MIME-type list is sent, significant request bandwidth is wasted.  The Accept header above, for instance, is 191 characters long, and due to the asymmetrical nature of bandwidth for most users (upload bandwidth is usually a small fraction of download bandwidth) such waste can quickly add up. 
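The arithmetic can be checked with a short Python sketch. The request count of 1,000 is an arbitrary assumption chosen only for illustration, and the sketch counts just the header value, not the rest of the request.

```python
# MIME types from the example Accept header shown earlier in this post.
mime_types = [
    "image/jpeg", "application/x-ms-application", "image/gif",
    "application/xaml+xml", "image/pjpeg", "application/x-ms-xbap",
    "application/msword", "application/vnd.ms-excel",
    "application/x-shockwave-flash", "*/*",
]
accept_full = ", ".join(mime_types)
accept_wildcard = "*/*"

# Extra characters the full list costs on each request compared with "*/*".
extra_bytes_per_request = len(accept_full) - len(accept_wildcard)

# Hypothetical workload: 1,000 requests that all carry the full list.
request_count = 1000

print(len(accept_full))                         # 191
print(extra_bytes_per_request)                  # 188
print(extra_bytes_per_request * request_count)  # 188000
```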

The second problem is less obvious but more serious: Many web server devices and frameworks expect HTTP headers to be shorter than a certain length and will return HTTP error codes (HTTP/400 and HTTP/406 are popular) if overlong headers are received. Beyond the immediate annoyance of such errors, there’s almost never any indication to the user what has gone wrong and how to fix it. Users cannot be expected to know that the problem is an overlong header and find the above-mentioned registry key to start deleting entries.
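The failure mode can be sketched in a few lines of Python. The 128-character limit and the check_headers function are assumptions made up for this illustration; real servers and devices use their own limits and may respond with different status codes.

```python
MAX_HEADER_VALUE_LENGTH = 128  # hypothetical limit, not taken from any real server


def check_headers(headers):
    """Return 400 if any header value exceeds the limit, otherwise 200."""
    for value in headers.values():
        if len(value) > MAX_HEADER_VALUE_LENGTH:
            return 400
    return 200


# The full Accept header from the example earlier in this post.
accept_full = ", ".join([
    "image/jpeg", "application/x-ms-application", "image/gif",
    "application/xaml+xml", "image/pjpeg", "application/x-ms-xbap",
    "application/msword", "application/vnd.ms-excel",
    "application/x-shockwave-flash", "*/*",
])

print(check_headers({"Accept": "*/*"}))        # 200
print(check_headers({"Accept": accept_full}))  # 400
```

Nothing in the 400 response tells the user which header was too long, which is why the problem is so hard to diagnose from the browser side.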

Often, an IE user encountering this problem will try another browser and find that it works fine, because other browsers generate their Accept headers from other lists that are less likely to be updated by installed applications. For instance, Firefox sends the value of its network.http.accept.default preference as the content of the Accept header.

Alert readers will notice that Microsoft applications are culprits in Accept header bloat, and this is something that IE will be working with other teams around the company to help mitigate.  In a future IE version, we may remove or substantially change Accept-header generation logic to help eliminate this problem.

User-Agent string extensibility causes a similar problem, and that issue is so prevalent that it will be the subject of a future post: Internet Explorer User-Agent: Use and Abuse.

Thanks for reading!

-Eric

Update: The IE9 Release Candidate significantly changes IE's use of the Accept header in IE9 Browser Mode. IE9 deprecates registry-based extensibility of the Accept header, and rather than sending */* for most downloads, now sends a more-specific Accept header based on which HTML element initiated the request. I wrote about the details of IE9 Accept Headers on the Fiddler Blog.

  • Thx, this was straight to the point. I have a question though: "I strongly recommend that developers not list MIME types here". Is there another way to instruct browser to send a particular accept header? I believe it is possible to write an addon for IE, but this could be a heavy solution.

    Radu

  • @Ruxi: There's no way to do so that would work reliably. I don't believe adding the header inside BeforeNavigate would work, because Trident will override it. You could use an APP-wrapper around the HTTP/HTTPS protocols, but this suffers from tons of performance, reliability, and maintainability issues.  Wrapping intrinsic protocols with APPs is strongly discouraged and will be the subject of a future post.

  • I have to say that the "two serious problems" ring pretty hollow given the User-Agent string that violates the same principles, to the tune of >400 characters, often.

    At least the Accept header is meant for the purpose many of the extraneous UA tokens are being used for. #1 is obviously a red herring, or the UA wouldn't be so bloated in the first place. And if #2 were a real concern, the UA would choke those servers anyway. These are arguments against putting browser capabilities in the request *at all*, which clearly isn't going to happen in the real world. Discouraging using the Accept header is just going to push the data to some other header (today the UA), with the same problems.

    I know you've been working to suggest to Microsoft teams internally to avoid spamming the UA (the .NET CLR team is the worst culprit, but there is plenty of blame to go around), and I appreciate that. Hopefully these groups will start to take this suggestion a bit more seriously.

    At least as important is re-evaluating whether every random IE Addon or Windows install should be able to add tokens to the UA. We are seeing a large number of UAs that include a nested, conflicting UA. This means I now have to question whether my web stats are evaluating the UA correctly, as well as any internal apps.

    Specifically, we have an internal app that correlates user IDs with UA so we can provide accurate support. The database originally overallocated 200 characters to the UA, but now we have UAs longer than 600 characters in some cases, many greater than 400.

    This is resulting in truncated data and inaccurate parsing, and will require re-engineering of the database and app, not to mention the increased bandwidth and storage space costs. I'm certain that I'm not the only one with this problem.

    Please, please, increase your emphasis to Microsoft about the cost these tokens have on your customers, and ask if 3rd parties really need to add more tokens of arbitrary length and content to the UA.

  • @Brianary: I think you're confused. There are two points I make in this post:

    1> Adding tokens to the Accept header simply doesn't work properly in IE, and thus it should be avoided to avoid introducing problems.

    2> "Spamming" is a significant problem for the Accept header, and it's a significant problem for the User-Agent header as well.

    The User-Agent length issue is severe enough that it will be the topic of a future post.

    <<"re-evaluating whether every random IE Addon or Windows install should be able to add tokens to the UA.">>

    Indeed; as with the Accept header, we'll be looking at whether or not we could change the UA code in the future to not send custom tokens. However, you must keep in mind that other browsers also allow such tokens (e.g. see general.useragent in Firefox), and that some deployed servers and services rely on such tokens (even if it is a bad practice).

  • No, I understand and mostly agree. I was mostly addressing your concerns about the size of the headers overall, and individual header size in particular.

    However, to eliminate spamming from both Accept and UA will likely simply result in an X-* header (or many) in order to communicate what the browser supports. X-* usage is already exploding for response headers to support IE, and we've already seen a UA-CPU header because the standard UA is too spammed to use. This doesn't do anything to reduce header bloat overall. Breaking everything into separate specific, well-defined HTTP headers will address the size and format problems with UA, but also balloon the header size overall.

    Perhaps the coolest thing about REST, which seems to have gained great acceptance at Microsoft, is that it uses as much existing HTTP infrastructure as possible for simplicity and efficiency.

    We're really talking about future IE versions, since it isn't likely that Microsoft will release a patch to cull UA tokens anytime soon, so working around a buggy IE Accept implementation, rather than fixing it, seems a bit short-sighted.

    I'm hoping that many tokens will simply disappear, but realistically there's going to be some signalling of support that happens. Long term, it would just be nice if that were MIME types in the Accept header, which is much better defined and constrained, than long English freeform text detail or repetitive enumeration of CLR versions in a new X-CLR header.

    Thanks for indulging this pedantic discussion. :)

  • <<will likely simply result in an X-* header>>

    Ah, but it's extremely difficult to add such headers in the browser, even if your code is already running. Nearly none of the folks spamming the UA would be willing to go to the trouble of adding a custom header.

    <<seen a UA-CPU header because the standard UA is too spammed to use.>>

    I don't know what led you to that conclusion, but I don't find it likely. Among other things, IE communicates bitness information in the UA-string already anyway.

    <<there's going to be some signalling of support that happens>>

    I believe that standardizing JavaScript-accessible Capabilities APIs on the Navigator object is the right approach here.

    Time will tell.

  • <<User-Agent string extensibility causes a similar problem, and that issue is so prevalent that it will be the subject of an upcoming post.>>

    Any idea when?

    I'd like a UA string as follows:

    "MSIE/8.0 (Windows NT 7.0; 64bit; Trident/4.0)"

    The following parts are redundant since nobody checks for them any more, and any websites that did check for them (but haven't been updated since) probably won't work for other reasons anyway: claiming to be NN4 (Mozilla), the word "compatible" which is rather meaningless, encryption strength identifier (which only meant anything for US-made browsers anyway), and all that ".NET CLR" crp that you put in there which doesn't mean anything and often appears three or four times with different numbers.

  • @Nicholas: Simply wishing something doesn't make it so. You've provided no data backing your claims, and statements like:

    "since nobody checks for them any more..." are simply untrue.

    While it's a fine thing to want to re-envision the IE UA string, sticking one's head in the sand and ignoring the impact that the proposed changes will have helps no one.

    IE does not include an "encryption strength indicator" in the UA string.

  • I suspect that closing the door to the UA will not be the end of it, I guess. It just seems likely that Microsoft, to keep the parties that were spamming the UA in the first place happy, will provide an alternative to them in the form of X-*, particularly teams within Microsoft.

    If the UA-CPU isn't there because of muddy UAs that already include it, then what is the purpose?

    JavaScript signalling for such specialized info as CLR, et al., seems a pretty fair solution. I hope you are right.

    In your reply to Nicholas Shanks, you call him out for not providing supporting data, but then you do the same thing when you say that claims that nobody is still checking Mozilla, compatible, &c. "are simply untrue"! I'm sure if anyone has access to that data, it's your team, but it would be nice to see it. It seems unlikely to me that there is significant legacy content with markup updated to newer Trident layout, but still using ancient UA sniffing.

    If there is no potential for truly cleaning up the UA (expunging the "Mozilla", "compatible", spamming, &c.), maybe Microsoft could suggest a new well-structured header to one or more of the standards bodies they participate in (W3C, WHAT-WG, ISO). Something complete and concise like "%LayoutEngine% %LEver%,%Browser% %Bbits% %Bver%,%OS% %OSbits% %OSver%", where the layout engine portion would instead specify the downlevel engine in use when a compatibility mode is active.

  • Some notes: Opera dropped the "Mozilla/4.0 (compatible;" years ago, and works fine for every page I've tried it with.

    I've used the Firefox User Agent Switcher extension to change my Firefox UA to "Firefox/3.5.1 (Windows NT 5.2; en-US; rv:1.9.1.1) Gecko/20090715". Google's UI does seem to change, but nothing else so far.

    I've created a quick VBScript that changes my IE UA to "MSIE/7.0 (Windows NT 5.2; Win64; x64)":

    Set sh = WScript.CreateObject("WScript.Shell")
    sh.RegWrite "HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Internet Settings\5.0\User Agent\", _
        "MSIE/7.0 (Windows NT 5.2; Win64; x64)" & vbCrLf & _
        "X-User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; Win64; x64; ", "REG_SZ"
    WScript.Echo "New User-Agent: " & vbCrLf & _
        sh.RegRead("HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Internet Settings\5.0\User Agent\")

    According to http://www.enhanceie.com/ua.aspx , this does seem to confound ASP.NET's HTTPBrowserCapabilities.

  • @Brianary: I intend no offense by noting this, but I absolutely guarantee that we've spent more time thinking about user-agents over the last few years than you have. A number of people on the IE team have spent dozens of hours researching here, and investigating options for making changes.

    1> Opera isn't compatible with every site that IE is. 2> Opera uses a service to "lie" about its UA string to incompatible sites. 3> Even if Opera works, there are sites hard-coded to expect specific IE user-agents that are also hard-coded to expect certain Opera UAs.  Sites break when these change.

    Your registry script to change the UA string does not do what you think it does; it mangles your text into the actual UA string.

    I encourage you to use my user-agent switcher (www.enhanceie.com/dl/uapicksetup.exe) to try out sites, with the caveat that you'll be visiting well under 1% of the Internet sites in use today, so even if you don't see broken sites, you can rest assured that they're out there.

  • Oh, I completely agree that Microsoft has more data about this than I'll ever see. My concern, I guess, is just that sites that still do browser sniffing may be overrepresented in terms of active feedback, and should mature a bit to a more modern approach, which they are unlikely to ever do until it becomes a practical consideration.

    Point taken with Opera. I haven't used it as a primary browser even on my tiny corner of the web.

    Not that it really matters, but are you sure about the script? It seems to work for me when I echo request headers back.

    Again, I appreciate your time. I'd sure like to see the legacy "Mozilla/4.0 (compatible;" stuff retired, but getting rid of much of the UA spamming would certainly reduce the constantly swelling UA string length. Thanks for your continued advocacy.

  • <<may be overrepresented in terms of active feedback>>

    To be fair, we don't get any feedback from sites that aren't actively maintained, which make up the bulk of the compatibility risk. We actively seek out broken sites.

    <<are you sure about the script>>

    I took a closer look at what your registry script does. It cleverly relies on a WinINET bug, failure to remove 0D 0A sequences from the registry value. So, while the key in question only is intended to contain the replacement for "Mozilla/4.0", you've injected a full UA and relied on the WinINET bug to shove the rest of the UA into the next header. This works in IE8, although it's a bug which should be closed in a future version.
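The effect of an unstripped 0D 0A pair can be illustrated with a small Python sketch. This is not WinINET's code: serialize_header, strip_crlf, and parse_header_lines are hypothetical helpers, and the sketch only shows how a value containing CR LF turns into two header lines when it is written out unsanitized.

```python
def serialize_header(name, value):
    """Write a header line without sanitizing the value (the buggy behavior)."""
    return f"{name}: {value}\r\n"


def strip_crlf(value):
    """Remove CR and LF so a value cannot spill into a new header line."""
    return value.replace("\r", "").replace("\n", "")


def parse_header_lines(raw):
    """Split raw header text into (name, value) pairs, one per line."""
    headers = []
    for line in raw.split("\r\n"):
        if line:
            name, _, value = line.partition(": ")
            headers.append((name, value))
    return headers


# The value the registry script above writes, with vbCrLf spelled out.
injected = ("MSIE/7.0 (Windows NT 5.2; Win64; x64)" + "\r\n" +
            "X-User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; "
            "Windows NT 5.2; Win64; x64; ")

unsafe = parse_header_lines(serialize_header("User-Agent", injected))
safe = parse_header_lines(serialize_header("User-Agent", strip_crlf(injected)))

print([name for name, _ in unsafe])  # ['User-Agent', 'X-User-Agent']
print([name for name, _ in safe])    # ['User-Agent']
```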

  • <<you call him out for not providing supporting data, but then you do the same thing>>

    Well, for one thing, I happen to be right, and he's not. :-)

    Seriously though, he's asserting that billions and billions of pages don't do a particular thing, and I'm asserting that of those billions and billions of pages, some of them do such a thing. Who do you think is more likely to be correct?

    As for the specifics, I posted a few examples quite some time ago the last time such ridiculous claims were made, and as expected the reaction was: "Oh, you only listed a few major sites, which means that every other site must work properly."  

    It's utterly tiresome to conduct endless arguments with essentially anonymous folks who have no "skin in the game."

  • An appeal to authority is certainly a fair play. It would be pretty impressive, though, if Microsoft would open up the list of sites and their status on Microsoft Connect. There would be much better transparency, and we smaller players would perhaps understand the scale of the situation better, without having to tie up an IE team member (over and over each time this comes up). :)

    I'd say I have skin in the game, just on the other side, I guess.
