IEInternals

A look at Internet Explorer from the inside out. @EricLaw left Microsoft in 2012, but was named an IE MVP in '13 & an IE userAgent (http://useragents.ie) in '14

IE and the Accept Header


RFC 2616 describes the Accept request header as follows:

The Accept request-header field can be used to specify certain media types which are acceptable for the response. Accept headers can be used to indicate that the request is specifically limited to a small set of desired types, as in the case of a request for an in-line image.

While I’ll spare you a debate about the merits and pitfalls of server-driven content negotiation, suffice it to say that I think such negotiation is impractical (and/or suboptimal) for most scenarios encountered by general-purpose web browsers. The primary reason to spare the debate (and inevitable flame war) is that it’s mostly a moot point, because all versions of Internet Explorer are seriously limited when it comes to support of the Accept header. Note: other browsers actually suffer from similar limitations in certain cases; I’m focused on IE in this post because it’s the browser I work on.

Let’s take a look at what IE sends in the Accept header.

Install a recent version of Fiddler, run it, and click Rules > Customize Rules.  Scroll to the static function Main() block, and add the following line within:

FiddlerObject.UI.lvSessions.AddBoundColumn("Accept", 50, "@request.Accept");

Save the file, and you’ll see that a new column titled Accept appears within the Fiddler UI, showing the value of the Accept request header.

Navigate to websites, and watch the value of the Accept header sent in each request.  You’ll quickly notice that in almost all cases, the Accept header contains */*, meaning that IE is willing to accept documents of any MIME type. This is technically accurate, insofar as IE will offer to download and save MIME-types it doesn’t know how to render.

However, in some navigations, you’ll see that IE sends a more complete string, containing a wide variety of MIME-types.  For instance:

image/jpeg, application/x-ms-application, image/gif, application/xaml+xml, image/pjpeg, application/x-ms-xbap, application/msword, application/vnd.ms-excel, application/x-shockwave-flash, */*

Hit F5 to refresh that page, and IE will probably go back to sending */* again.  Clearly, IE is inconsistent in what it chooses to send in the Accept header, but by now you’re probably curious where these MIME types even come from.

IE generates this list by enumerating the values listed in the following registry key:

HKLM\Software\Microsoft\Windows\CurrentVersion\Internet Settings\Accepted Documents

Any application can, upon install, attempt to advertise the MIME-types it supports using this registry key.  However, I strongly recommend that developers not list MIME types here. 

Why not?  Well, first of all, the list isn’t typically sent, so you cannot write server code which checks for your MIME-type and conditionally serves content of that type only if the client asks for it.  In almost all cases, servers will end up getting requests for content of type */* and return content in your format anyway.
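To make the point concrete, here is a minimal sketch of the kind of server-side check that such a registration would seem to enable. The helper name accepts_explicitly and the shortened sample headers are assumptions made up for illustration; they are not taken from any real server framework.

```python
def accepts_explicitly(accept_header: str, mime_type: str) -> bool:
    """Return True only if mime_type is listed by name in the Accept header."""
    # Drop any parameters such as ";q=0.8" and compare case-insensitively.
    listed = [part.split(";")[0].strip().lower() for part in accept_header.split(",")]
    return mime_type.lower() in listed


# Hypothetical sample values: a shortened full list and the usual wildcard.
full_list = "image/jpeg, application/x-ms-xbap, application/msword, */*"
wildcard_only = "*/*"

print(accepts_explicitly(full_list, "application/x-ms-xbap"))      # True
print(accepts_explicitly(wildcard_only, "application/x-ms-xbap"))  # False
```

Because most requests carry only */*, the second case is the common one: the check fails even on a machine where the type is registered, so the server cannot rely on it.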

More importantly, it turns out that Accept headers containing custom types cause two serious problems:

  1. Slower performance due to bloated requests
  2. Server errors on sites expecting headers of fixed maximum length

The first problem is pretty obvious: any time the full MIME-type list is sent, significant request bandwidth is wasted.  The Accept header above, for instance, is 191 characters long, and due to the asymmetrical nature of bandwidth for most users (upload bandwidth is usually a small fraction of download bandwidth) such waste can quickly add up. 
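The arithmetic is easy to check. The short sketch below measures the example value shown earlier and compares the full header line with the */* form; the variable names are only illustrative.

```python
# The example Accept value from earlier in this post, split only for readability.
accept_value = (
    "image/jpeg, application/x-ms-application, image/gif, "
    "application/xaml+xml, image/pjpeg, application/x-ms-xbap, "
    "application/msword, application/vnd.ms-excel, "
    "application/x-shockwave-flash, */*"
)

# On the wire, the header also carries its name and the trailing CRLF.
full_line = "Accept: " + accept_value + "\r\n"
short_line = "Accept: */*\r\n"

print(len(accept_value))                 # 191
print(len(full_line))                    # 201
print(len(full_line) - len(short_line))  # 188 extra bytes per request
```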

The second problem is less obvious but more serious: Many web server devices and frameworks expect HTTP headers to be shorter than a certain length and will return HTTP error codes (HTTP/400 and HTTP/406 are popular) if overlong headers are received. Beyond the immediate annoyance of such errors, there’s almost never any indication to the user what has gone wrong and how to fix it. Users cannot be expected to know that the problem is an overlong header and find the above-mentioned registry key to start deleting entries.
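As a rough illustration of how this failure arises, the sketch below mimics a server that rejects any header line longer than a fixed limit. The 255-byte limit, the check_header helper, and the application/x-hypothetical-type-N names are all assumptions chosen for the example, not the behavior of any particular product.

```python
MAX_HEADER_LINE = 255  # hypothetical limit, chosen only for illustration


def check_header(name: str, value: str) -> int:
    """Return 200 if the full header line fits within the limit, else 400."""
    line = f"{name}: {value}\r\n"
    return 200 if len(line.encode("latin-1")) <= MAX_HEADER_LINE else 400


# A short Accept value, and one bloated by ten made-up registered types.
base = "image/jpeg, image/gif, image/pjpeg, */*"
bloated = ", ".join(
    [f"application/x-hypothetical-type-{i}" for i in range(10)] + ["*/*"]
)

print(check_header("Accept", base))     # 200
print(check_header("Accept", bloated))  # 400
```

Nothing in the 400 response tells the user that the Accept header was the cause, which is what makes the problem so hard to diagnose from the browser side.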

Often, an IE user encountering this problem will try another browser and find that it works fine, because other browsers generate their Accept headers from other lists that are less likely to be updated by installed applications. For instance, Firefox sends the value of its network.http.accept.default preference as the content of the Accept header.

Alert readers will notice that Microsoft applications are culprits in Accept header bloat, and this is something that the IE team will be working on with other teams around the company to help mitigate.  In a future IE version, we may remove or substantially change Accept-header generation logic to help eliminate this problem.

User-Agent string extensibility causes a similar problem, and that issue is so prevalent that it will be the subject of a future post: Internet Explorer User-Agent: Use and Abuse.

Thanks for reading!

-Eric

Update: The IE9 Release Candidate significantly changes IE's use of the Accept header in IE9 Browser Mode. IE9 deprecates registry-based extensibility of the Accept header, and rather than sending */* for most downloads, now sends a more-specific Accept header based on which HTML element initiated the request. I wrote about the details of IE9 Accept Headers on the Fiddler Blog.

  • Accept header: "image/jpeg, application/x-ms-application, image/gif, application/xaml+xml, image/pjpeg, application/x-ms-xbap, application/msword, application/vnd.ms-excel, application/x-shockwave-flash, */*"

    is the same as just using Accept header: "*/*".

    Read section "14.1 Accept" of http://www.ietf.org/rfc/rfc2616.txt; it clearly states that if you do not include a "q=" parameter, all Accept header values are equal and the server is free to choose any one of them.

  • @Fireblaze: Actually, no... you missed a section which covers this:

    Media ranges can be overridden by more specific media ranges or specific media types. If more than one media range applies to a given type, the most specific reference has precedence. For example,

          Accept: text/*, text/html, text/html;level=1, */*

      have the following precedence:

          1) text/html;level=1

          2) text/html

          3) text/*

          4) */*

  • Eric,

     Excellent post! I'm sure you might not have realized how popular such a topic might be until after you continue to receive feedback about it several months later. Personally I have found this article very helpful as it helped me to determine how to get IE to properly handshake with a proprietary HTTP server. It couldn't handle any Accept header greater than 255 characters ("Accept: " and "\r\n" included). Now the trick is to get the rest of the organization to buy into the obvious fix of removing some of these MIME types from the registry. Alternate browsers have been used up until now as a workaround to allow us to interface with the site. With today's discovery we can hopefully move away from that and also reduce our Accept-header bloat a bit.

     The interesting thing to note is that my only concern is an organizational one. I suspect that this is also one of the many things your team would face when attempting to make changes to the Accept-header compilation mechanism. While it might make sense from your perspective (and many people in the world, including myself), that's only 1% of the battle. You have the task of working with other teams inside a huge corporation to try and change the way they've been working for a while. Regardless of the fact that Microsoft is a tech company with a lot of good employees I'm sure, they are still people and people don't like change. I can also see that not everyone in MS has the same level of understanding of the HTTP RFC you have. This further complicates the issue of change.

     This might be showing my ignorance a bit, but I'm OK with that. How many times does the MIME type actually affect what format the server uses for the target data? I could see this happen more in the case of a web service where you could specify, say, JSON vs. XML. However, IE isn't really used to interface with web services that often. It's meant for people to use, not machines. So, the question remains as to whether or not "*/*" wouldn't be sufficient in many, many cases...if not all. I'm sure you've got a bit more knowledge here and I'd appreciate your insight. Have you guys run across a website that changes its content's format based on different Accept-header MIME types?

     Sincerely,

     -Archimedes
