IEInternals

A look at Internet Explorer from the inside out. @EricLaw left Microsoft in 2012, but was named an IE MVP in '13 & an IE userAgent (http://useragents.ie) in '14

Bugs in IE8's Lookahead Downloader

All bugs mentioned in this post are now fixed. 

Internet Explorer has a number of features designed to render pages more quickly. One of these features is called the "Lookahead Downloader". It quickly scans the page as it comes in, looking for the URLs of resources which will be needed later in the rendering of the page (specifically, JavaScript files). The lookahead downloader runs ahead of the main parser and is much simpler; its sole job is to hunt for those resource URLs and get requests into the network request queue as quickly as possible. These download requests are called "speculative downloads" because it is not known whether the resources will actually be needed by the time that the main parser reaches the tags containing the URLs. For instance, inline JavaScript runs during the main rendering phase, and such script could (in theory) remove the tags which triggered the speculative downloads in the first place. However, this "speculative miss" corner case isn't often encountered, and even if it happens, it's basically harmless: the speculative request simply downloads a file which is never used.

IE8 Bugs and their impact
Unfortunately, since shipping IE8, we've discovered two problems in the lookahead downloader code that cause Internet Explorer to make speculative requests for incorrect URLs. Generally this has no direct impact on the visitor's experience, because when the parser actually reaches a tag that requires a subdownload, if the speculative downloader has not already requested the proper resource, the main parser will at that time request download of the proper resource. If your page encounters one of these two problems, typically:

  • The visitor will not notice any problems like script errors, etc
  • The visitor will have a slightly slower experience when rendering the page because the speculative requests all "miss"
  • Your IIS/Apache logs will note requests for non-existent or incorrect resources

If your server is configured to respond in some unusual way (e.g. logging the user out) upon request of a non-existent URL, the impact on your user-experience may be more severe.

The BASE Bug

Update: The BASE bug is now fixed.

The first problem is that the speculative downloader "loses" the <BASE> element after its first use. This means that if your page at URL A contains a tag sequence as follows:

<html><head><base href=B><script src=relC><script src=relD><script src=relE><body>

which requests 3 JavaScript files from the path specified in "B", IE8's speculative downloader will incorrectly request download of URLs "B+relC", "A+relD", and "A+relE". Correct behavior is to request download of URLs "B+relC", "B+relD", and "B+relE". Hence, in this case, two incorrect requests are sent, usually resulting in 404s from the server. Of course, when the main parser gets to these script tags, it will determine that "B+relC" is already available but that "B+relD" and "B+relE" have not yet been requested, and it will request those two correct URLs and complete rendering of the page.
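To make the difference concrete, here is a minimal Python sketch that models the two resolution behaviors with the standard library's urllib.parse.urljoin. The URLs are hypothetical stand-ins for A, B, relC, relD, and relE, and the function names are made up for illustration; this only mimics the behavior described above and is not IE's actual code.

```python
from urllib.parse import urljoin

# Hypothetical stand-ins for the placeholders used in the text.
PAGE_URL = "http://example.com/app/page.html"    # "A"
BASE_HREF = "http://cdn.example.com/scripts/"    # "B"
SCRIPT_SRCS = ["relC.js", "relD.js", "relE.js"]  # relC, relD, relE


def correct_requests(base_href, srcs):
    """Resolve every script URL against the <base href>, as the main parser does."""
    return [urljoin(base_href, src) for src in srcs]


def buggy_speculative_requests(page_url, base_href, srcs):
    """Model the bug: the base is honored for the first script only;
    later scripts are resolved against the page's own URL."""
    urls = []
    for index, src in enumerate(srcs):
        base = base_href if index == 0 else page_url
        urls.append(urljoin(base, src))
    return urls


correct = correct_requests(BASE_HREF, SCRIPT_SRCS)
speculative = buggy_speculative_requests(PAGE_URL, BASE_HREF, SCRIPT_SRCS)

# Speculative requests that went to the wrong URL (typically 404s).
wasted = [url for url in speculative if url not in correct]
```

With three scripts, `wasted` holds the two requests that were resolved against the page URL instead of the base.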

At present, there is no simple workaround for this issue. Technically, the following syntax will result in proper behavior:

 <html><head><base href=B><script src=relC><base href=B><script src=relD><base href=B><script src=relE><body>

...but this is not standards-compliant and is not recommended. If the page removes its reliance upon the BASE tag, the problem will no longer occur.

Remember: The BASE bug is now fixed.

The Missing 4k Bug

Update: The 4k bug is now fixed. 

The second problem is significantly more obscure, although a number of web developers have noticed it and filed a bug on Connect. Basically, the problem here is that there are a number of tags which will cause the parser and lookahead downloader to restart scanning of the page from the beginning. One such tag is the META HTTP-EQUIV Content-Type tag which contains a CHARSET directive. Since the CHARSET specified in this tag defines what encoding is used for the page, the parser must restart to ensure that it is parsing the bytes of the page in the encoding intended by the author. Unfortunately, IE8 has a bug where the restart of the parser may cause incorrect behavior in the Lookahead downloader, depending on certain timing and network conditions.

The incorrect behavior occurs if your page contains a JavaScript URL which spans exactly the 4096th byte of the HTTP response. If such a URL is present, under certain timing conditions the lookahead downloader will attempt to download a malformed URL consisting of the part of the URL preceding the 4096th byte combined with whatever text follows the 8192nd byte, up to the next quotation mark. Web developers encountering this problem will find that their logs contain requests for bogus URLs with long strings of URLEncoded HTML at the end.
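The shape of the bogus URL can be illustrated with a small Python sketch. It assumes the boundaries fall at byte offsets 4096 and 8192 exactly (the precise off-by-one is a guess), and the response body, script URL, and function name are all hypothetical. It models only the symptom described above, not the timing conditions or IE's internals.

```python
from urllib.parse import quote

CHUNK = 4096  # boundary named in the text; the 8192nd byte is 2 * CHUNK


def malformed_url(response: bytes, url_start: int) -> bytes:
    """Model the bogus URL described above.

    Joins the URL bytes that precede offset 4096 with the bytes that
    follow offset 8192, up to (not including) the next quotation mark.
    """
    head = response[url_start:CHUNK]
    tail_end = response.index(b'"', 2 * CHUNK)
    return head + response[2 * CHUNK:tail_end]


# Build a hypothetical response whose script URL straddles offset 4096.
url = b"/scripts/library.js"
opening = b"<html><head>"
tag = b'<script src="'
url_start = CHUNK - len(b"/scripts/")
padding = b" " * (url_start - len(opening) - len(tag))
response = opening + padding + tag + url + b'"></script>'
response += b" " * (2 * CHUNK - len(response))  # filler up to offset 8192
response += b'<meta name="description" content="demo"></head></html>'

bogus = malformed_url(response, url_start)
logged = quote(bogus)  # roughly what would show up in a server log
```

For this input, `logged` is `/scripts/%3Cmeta%20name%3D`: the start of the real script path followed by URL-encoded HTML, which matches the pattern web developers report seeing in their logs.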

As with the previous bug, end users will not typically notice this problem, but examination of the IIS logs will show the issue.

For many instances of this bug, a workaround is available-- the problem only appears to occur when the parser restarts, so by avoiding parser restarts, you can avoid the bug.  By declaring the CHARSET of the page using the HTTP Content-Type header rather than specifying it within the page, you can remove one cause of parser restarts.

So, rather than putting

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">

in your HEAD tag, send the following HTTP response header:

Content-Type: text/html; charset=utf-8

Note that specification of the charset in the HTTP header results in improved performance in all browsers, because the browser's parsers need not restart parsing from the beginning upon encountering the character set declaration. Furthermore, using the HTTP header helps mitigate certain XSS attack vectors.
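As one way to apply this on the server, here is a minimal WSGI sketch in Python that declares the charset in the Content-Type response header and omits the META tag from the markup. The application name and page body are hypothetical; any server stack that lets you set response headers can do the same thing.

```python
def app(environ, start_response):
    """Hypothetical WSGI app: the charset is declared in the HTTP header only."""
    # Note that the markup contains no META HTTP-EQUIV Content-Type tag.
    body = (
        "<html><head><title>Example</title></head>"
        "<body>caf\u00e9</body></html>"
    ).encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, the app can be passed to `wsgiref.simple_server.make_server` from the standard library.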

Unfortunately, however, suspension of the parser (e.g. when it encounters an XML Namespace declaration) can also result in this problem, and it's not feasible for a web developer to avoid suspension of the parser.

But, remember: The 4k bug is now fixed. 

Summary
While these problems are significant, they are not so dire as some readers will conclude at first glance. The second bug, in particular, is quite rarely encountered due to its timing-related nature and the requirement that the page have a JavaScript URL spanning a particular byte in the response. The second issue is not nearly as prevalent as some web developers believe. For instance, we've heard claims that IE6, IE7, and Firefox all have this problem, which is entirely untrue. Readers can easily determine if a page is hitting either bug by examining server logs, or by watching network requests with Fiddler.
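For the server-log check, a rough Python sketch follows. It assumes Apache-style access log lines with a quoted request line and uses a simple heuristic: flag any request path that contains HTML angle brackets once URL-decoded. The regular expression, function name, and sample log lines are all hypothetical, and the heuristic can miss cases or flag unrelated requests.

```python
import re
from urllib.parse import unquote

# Matches the quoted request line, e.g. "GET /path HTTP/1.1".
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+) HTTP/[\d.]+"')


def suspicious_paths(log_lines):
    """Return request paths that contain '<' or '>' after URL-decoding."""
    hits = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue
        path = match.group(1)
        decoded = unquote(path)
        if "<" in decoded or ">" in decoded:
            hits.append(path)
    return hits


# Hypothetical sample lines, not taken from a real log.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2009:13:55:36 -0700] '
    '"GET /scripts/library.js HTTP/1.1" 200 2326',
    '203.0.113.5 - - [10/Oct/2009:13:55:37 -0700] '
    '"GET /scripts/%3Cmeta%20name%3D HTTP/1.1" 404 209',
]

flagged = suspicious_paths(SAMPLE_LOG)
```

Requests flagged this way are only candidates. The BASE bug would instead show up as 404s for script paths resolved against the page's own directory.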

The IE team will continue our investigation into these bugs and, as with any reported issues, may choose to make available an IE8 update to resolve the issues.

Remember: All bugs mentioned in this post are now fixed. 

Apologies for the inconvenience, and thanks for reading!

-Eric

  • @EricLaw:  Thanks for your help.  Is there anything I can do to help push the release?  Something on Connect that I and other users can vote up? Do you know whether, when it is released, it will be part of a critical fix so everyone will get it?

    Thanks

    Mike

  • Note: The BASE issue was fixed for Windows 7 users in today's cumulative update, so the IE8 BASE bug is now resolved on all platforms.

  • @EricLaw: Unfortunately the bug seems to be fixed only for base tag hrefs that start with http://. If the base tag points to a "file://" directory on the local computer, the "old" behaviour is still there. If I change the compatibility mode for that page to IE7, it works fine with a base tag that has a file:/// href, but when switching to IE8 mode it does NOT work!

    When will this issue be fixed?

  • @Joe: No, you are seeing a different issue.

    Your FILE base tag is getting ignored because your FILE URI is invalid. Please .ZIP up a repro of the problem and email it to me (ericlaw@microsoft).

  • @EricLaw:  Not sure if you had anything to do with getting this pushed faster...  But it looks like it is fixed.  Thanks for your help.

    Mike

  • @EricLaw: I sent you the details by e-mail. If you place a base tag like this in the head of a document:

    <base href="file://C:/Users/User Name/Desktop/" />

    An image that is referenced like this <img src="test.png" /> will not be found. IE8 (if set to IE8 Standards mode) will look for the image file in the same path as the HTML file. In other words, the BASE tag is ignored. It should actually find the image on the Desktop in this case (the same problem occurs with any other path!). Also, using file:///C|/Users/... does not help.

    Any help would be appreciated!

  • We also just started seeing these ViewState errors in our ASP.NET application after we switched to new servers. Our codebase is the same; only the servers are new. We have not been able to figure out whether the new servers have some new patches, or whether the old servers had patches that prevented this issue. The same ASP.NET application used to work on the old servers, but it throws these ViewState 4K-byte errors on the new servers.

    Can anyone please shed some light on whether there is a server patch that will prevent this issue?

    Thanks.

  • @Sam: If your ViewState errors are caused by the 4k bug, then no, there's no server patch that will prevent that client bug.

  • @Eric: This is what surprised me as well. Here we are all discussing that it is a client-side issue, and it must be client-side, but we started seeing these errors only when we switched to new servers. We are having this exact issue with the ScriptResource.axd file: an invalid "d" parameter causes an Invalid ViewState error, and it's happening at the 4K byte position of the HTTP response. However, this does not happen for every single request. It's intermittent.

    So, I was expecting there to be some server-level fix or configuration, as we started seeing this issue after we moved to new servers with the same ASP.NET codebase.

    The only difference I have seen so far between our new servers and the old ones is in the machine.config settings.

    Our new servers were missing these machine.config settings:

    <system.net>
       <connectionManagement>
         <add address="*" maxconnection="48"/>
       </connectionManagement>
    </system.net>

    <processModel maxWorkerThreads="100" maxIoThreads="100" minWorkerThreads="50"/>

    We used to have the above configuration on our old servers. However, we haven't applied these changes to the new servers, as we are still in the process of investigating this issue.

    So, could the processModel thread numbers and the connectionManagement connection limit somehow cause this issue? Is it possible that low resource allocation on the server side could cause it?

  • @Sam: As discussed above, the issue is timing related. If the bytes are received by the client with different timings, you'll potentially see different results. However, you don't really have control over this from the server because you don't control the full path to the client.

  • Thanks Eric.

    I know we don't have control over timing and the client, but I just wanted to mention in this thread that we started having this issue only after we switched to new servers. So, I don't think it has anything to do with the ASP.NET code. The same codebase that used to work on the old servers is not working on the new servers and throws this Invalid ViewState error.

  • Just to reiterate for folks who aren't reading the comment thread:

    Unfortunately I'm not able to make any statements or speculations about IE code fixes (either availability or timeframe). I can say that this is an issue that we're getting a significant amount of customer escalations about because the workarounds are unappealing.

  • @Cindy: The problem you describe doesn't sound likely to be related to the lookahead issue in any way.

  • Since we switched (on Apache server with perl CGI) our charset from iso-8859-1 to utf-8, we have been flooded with damaged requests in the server's errorlog.

    We analyzed the problem for several days. Nearly all requests come from IE 8, a few from IE 7. As I understand now, these are probably IE 8 in compatibility mode reporting user agent IE 7.

    The requests in our served pages are mainly for .js files or .css files. They sit in the html head section, near the beginning of the page, after some meta tags (description, keywords etc).

    These resource links were 'trashed', i.e. overwritten with content from the html page which was placed exactly 4096 bytes further down in the page.

    We serve a few million page requests per day, about 25% of them to IE8 clients. We had a few hundred such trashed requests.

    We have had some success in reducing the number of trashed requests by moving the requests closer to the beginning of the page, before the meta tags. There, they were hit less often by the 'drop 4k bug'.

    Tonight I found this blog via google, and I removed the line

    meta http-equiv="Content-Type" content="text/html; charset=utf-8

    from the html pages.

    We serve correct charset info in the http-header anyway.

    Since I removed this line, about two hours ago, the trashed requests have stopped appearing in the logfile.

  • Microsoft's comment on Bug 467062 (https://connect.microsoft.com/IE/feedback/details/467062/bug-ie8-4k-dropped-invalid-viewstate-when-loading-scriptresource-axd-or-webresource-axd-asp-net)

    --------------

    Thank you for submitting your feedback on Internet Explorer 8. The information in this bug has been reviewed carefully by the team and will help inform future releases of Internet Explorer. We are now closing down the Internet Explorer 8 Feedback Program in preparation for soliciting feedback for our next version and as such, per the Microsoft Connect guidelines, all remaining bugs for IE8 will be resolved as “Postponed” for now. When the feedback program for the next release is ready, we will look at transferring this bug, along with any applicable status update.

    Thanks again for all your support and feedback.

    The IE Team

    --------------

    Does this mean they are starting IE9 and have stopped patching IE8?

    Thanks,

    Eric
