The browser’s networking subsystem is a crucial component for delivering a high-performance Web experience. In today’s post, I’ll demonstrate this using some real-world measurements, showing that even in a “fast” network environment, the network time dominates the time spent in other subsystems that affect page load time. Next, I’ll provide a brief tutorial of how Web browsers use the network, and outline the improvements we’ve made in Internet Explorer 9 to help ensure speedy page loads.
Last year, we analyzed top sites to measure how much of the overall page load time was related to time spent retrieving content from servers. We ran the tests in what we consider a “fast” environment: an Intel Core 2 machine running at 2.4 GHz on an 802.11n network connected to a 20MB/sec cable modem. For each iteration of each site, we took four measurements of page load time (from about:blank to top-level document complete): PLT1 (clear cache) and PLT2 (primed cache), each measured both in “real world” conditions and in an “ideal” world in which network time was zeroed by returning all resources instantly from a proxy located on the local computer.
The results from six representative sites are shown in the following set of charts, one for each site tested. Each ring represents the ratio of “non-network” time (red) to “network” time (blue). The outer ring represents the ratio for the first page load (“clear cache”), while the inner ring represents the ratio for the second visit (“primed cache”).
As you can see, there’s a wide range of network performance ratios. For sites that make good use of caching (e.g. those in the third column), very little time is spent on the network for PLT2 because the vast majority of content is cached locally in the browser cache. For other sites, while the overall load time in PLT2 is much faster than PLT1, network time still represents the biggest component of page load time because even a single network request takes a relatively long time.
These findings show that even in a “fast” network environment, the network time dominates the page load time. On the initial page load, downloads account for 73% of the time needed to load the page; on a subsequent reload, 63% of the page load time is spent waiting for resources.
Stated another way, this set of sites initially loaded in a total of 17 seconds, which would be reduced to 4.6 seconds if not for download time. On reload, the sites loaded in 6.7 seconds, which would be reduced to 2.5 seconds without download time.
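The percentages and the totals above are two views of the same numbers. A minimal check of the arithmetic, using only the figures quoted in this post (the function name is just for illustration):

```python
def network_share(total_seconds: float, non_network_seconds: float) -> float:
    """Fraction of the page load time that was spent on the network."""
    return (total_seconds - non_network_seconds) / total_seconds


# First load: 17 seconds in total, 4.6 seconds with network time removed.
first_load = network_share(17.0, 4.6)
# Reload: 6.7 seconds in total, 2.5 seconds with network time removed.
reload = network_share(6.7, 2.5)

print(f"first load: {first_load:.0%}")  # first load: 73%
print(f"reload:     {reload:.0%}")      # reload:     63%
```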
Now that we’ve looked at the “real world” impact of network performance, let’s examine the sources of network delay.
Every time your browser loads a Web site, an intricate set of steps occurs behind the scenes. A brief summary of these steps follows:
If your computer is not running behind a Web proxy, then your browser uses the network to perform a DNS lookup to convert the hostname you’ve typed (e.g. www.microsoft.com) into the network address (e.g. 126.96.36.199). The browser must then establish a TCP/IP network connection to the target address.
If your computer is running behind a Web proxy, your browser must first locate the proxy. The proxy might be specified directly within your browser settings, or located via a process called Web Proxy Auto-Discovery Protocol (WPAD). After the hostname of the proxy is determined, the browser uses a DNS lookup to convert the proxy hostname to a network address. The browser then establishes a network connection to the proxy server.
If the URL calls for a secure connection (HTTPS) then an SSL or TLS handshake must take place, and the certificates from the server must be verified. This can result in one or more additional network requests, called “revocation checks,” to Certificate Authorities to ensure that none of the certificates in the chain have been revoked.
After the connection has been successfully established, the browser must send an HTTP request to the server. The server, upon receiving the request, must load (or generate) the requested file and begin transmitting it to the client. If the document is an HTML file, it will typically contain references to other resources (e.g. images, script, stylesheets) that must also be loaded in order to fully display the Web page. For each resource referenced by the page, the browser must typically repeat many of the previous steps in order to download the resource needed.
Many of these operations must be performed serially (for instance, you cannot establish a TCP/IP connection without first obtaining the IP address using a DNS lookup) and hence delays in any one operation can dramatically increase the load time of the page.
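To make the serial dependency concrete, here is a small sketch that times the first two steps for a single host using Python’s standard socket module. It illustrates the ordering only and is not how Internet Explorer is implemented; the function name and the returned keys are made up for this example.

```python
import socket
import time


def time_connection_steps(host: str, port: int, timeout: float = 5.0) -> dict:
    """Time the DNS lookup and the TCP connect for one host, in order."""
    start = time.perf_counter()
    # Step 1: resolve the hostname. Nothing else can start until this returns.
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    resolved = time.perf_counter()

    family, socktype, proto, _canonname, sockaddr = infos[0]
    # Step 2: open the TCP connection to the address obtained in step 1.
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        sock.connect(sockaddr)
        connected = time.perf_counter()

    return {
        "address": sockaddr[0],
        "dns_seconds": resolved - start,
        "connect_seconds": connected - resolved,
        "total_seconds": connected - start,
    }
```

Calling time_connection_steps("www.microsoft.com", 80) on a live connection reports how long each step took; because the steps run one after the other, the total is the sum of the two.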
In some cases, operations can be parallelized or caches can be introduced to reduce the risk of delay. Internet Explorer 9 makes use of both techniques for enhanced performance, as I’ll explain in the remainder of this post.
The Domain Name System (DNS) allows the client browser to convert a hostname into a network address. This process may take anywhere from a few milliseconds to several seconds; one recent study pegged the US median at approximately 150ms, although there is wide variance. Because the browser can only begin establishing a network connection after determining the remote address, DNS is the first bottleneck in network performance.
Internet Explorer 9 includes three optimizations for improved DNS performance; all of these optimizations are based on the fact that multiple DNS requests may be issued in parallel.
After the third character you type into the address bar, Internet Explorer will issue DNS resolutions for the top 5 hostnames in the dropdown. The results of these resolutions will be stored in the local operating system cache for quick reuse in the future. In this way, if you subsequently navigate to one of these top matches, the browser will have a small head start and will spend less time waiting for DNS results to be returned.
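A rough sketch of this idea follows. The three-character threshold and the top-5 cutoff come from the description above; everything else (the function names, the thread pool, and the injectable resolver) is an assumption made for illustration and does not reflect IE’s internals.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def hosts_to_preresolve(typed, ranked_hostnames, min_chars=3, max_hosts=5):
    """Pick the hostnames to resolve once enough characters have been typed.

    ranked_hostnames is the dropdown's suggestion list, best match first.
    """
    if len(typed) < min_chars:
        return []
    return ranked_hostnames[:max_hosts]


def resolve_in_parallel(hostnames, resolver=socket.gethostbyname):
    """Resolve all hostnames at the same time; failed lookups map to None."""
    def safe_resolve(name):
        try:
            return resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            return None

    if not hostnames:
        return {}
    # One worker per hostname, so no lookup waits behind another.
    with ThreadPoolExecutor(max_workers=len(hostnames)) as pool:
        addresses = list(pool.map(safe_resolve, hostnames))
    return dict(zip(hostnames, addresses))
```

Because the lookups run side by side, the total wait is roughly that of the slowest lookup rather than the sum of all five.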
When you visit a page, Internet Explorer 9 will annotate the page’s history entry with the list of hostnames that were contacted to download content used by the page.
For instance, when loading the IEBlog, the HTML references resources from five other sites (i.msdn.microsoft.com, cdn-smooth.ms-studiosmedia.com, go.microsoft.com, ie.microsoft.com, and silverlight.dlservice.microsoft.com).
Internet Explorer 9 stores these five hostnames in the history entry for the IEBlog site. If you begin to navigate to the IEBlog in a future browser session, Internet Explorer will immediately issue DNS resolution requests for these five stored hostnames, in parallel with establishment of a connection to the blogs.msdn.com server. This helps ensure that when the blog’s HTML is subsequently returned from the server, the local DNS cache will typically already contain the network addresses needed to download the page’s embedded resources.
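The bookkeeping behind this can be pictured as a map from a page to the hostnames its resources came from. The sketch below is illustrative only: the class name, the in-memory dictionary, and the choice to skip the page’s own hostname are assumptions, not a description of IE’s history store.

```python
from urllib.parse import urlsplit


class HistoryHostIndex:
    """Remembers which other hostnames each visited page pulled content from."""

    def __init__(self):
        self._hosts_by_page = {}

    def record_subresource(self, page_url, resource_url):
        """Note the hostname of a resource that was downloaded for page_url."""
        page_host = urlsplit(page_url).hostname
        resource_host = urlsplit(resource_url).hostname
        # Assumption: the page's own host is skipped, since the browser is
        # already resolving it as part of the navigation itself.
        if resource_host and resource_host != page_host:
            self._hosts_by_page.setdefault(page_url, set()).add(resource_host)

    def hosts_for(self, page_url):
        """Hostnames worth pre-resolving when navigating to page_url again."""
        return sorted(self._hosts_by_page.get(page_url, set()))
```

On a later navigation, the list returned by hosts_for can be handed to whatever issues the parallel DNS lookups while the connection to the page’s own server is being set up.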
Internet Explorer will also pre-resolve any hostnames that the Web developer specifies in a LINK REL=PREFETCH element. For instance, with the following markup:
<link rel="prefetch" href="http://www.example.com/someresource.htm">
...Internet Explorer will kick off a DNS resolution for www.example.com. That way, if a connection is later made to www.example.com, the DNS resolution will have already occurred and the browser can immediately connect to the server without pausing to look up its address.
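To see which hostnames a page asks to have resolved this way, one could scan its markup for prefetch links. The following sketch uses Python’s standard html.parser; the class and function names are invented for this example, and it only extracts the hostnames without resolving them.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class PrefetchHostCollector(HTMLParser):
    """Collects the hostnames named by <link rel="prefetch"> elements."""

    def __init__(self):
        super().__init__()
        self.hosts = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_tokens = (attributes.get("rel") or "").lower().split()
        host = urlsplit(attributes.get("href") or "").hostname
        if "prefetch" in rel_tokens and host and host not in self.hosts:
            self.hosts.append(host)


def prefetch_hosts(html):
    """Return the distinct prefetch hostnames in document order."""
    collector = PrefetchHostCollector()
    collector.feed(html)
    return collector.hosts


markup = '<link rel="prefetch" href="http://www.example.com/someresource.htm">'
print(prefetch_hosts(markup))  # ['www.example.com']
```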
Many corporate users are configured to browse the Web using proxy servers, and proxies can have a network performance impact. To that end, we’ve made two key changes to improve performance in an environment that requires a proxy.
First, Internet Explorer has moved its proxy determination logic from the tab processes to the single unified frame process. This helps ensure that when the Web Proxy Auto-Discovery (WPAD) feature runs (and it may take from tens of milliseconds to multiple seconds) it only does so once per browser session, no matter how many browser tabs you open. This also saves some CPU time and nearly 500kb of memory per tab process. This change can significantly speed up starting a new browser tab and loading a site into it.
Second, Internet Explorer 9 increases the connections-per-proxy limit to 12. This allows the browser to perform more downloads in parallel as compared to IE8, which enforced a six connection limit when using a Web proxy. While the browser’s connections-per-host limit remains 6, many sites use resources from multiple domains and thus benefit from the higher connections-per-proxy limit.
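A simplified model shows why the higher limit matters for pages that draw on several hosts. It considers only the two limits named above (6 per host, and 6 or 12 per proxy); the function name and the example hostnames are hypothetical.

```python
def max_parallel_connections(pending_by_host, per_host_limit=6, per_proxy_limit=None):
    """Upper bound on simultaneous downloads under the two limits.

    pending_by_host maps a hostname to the number of resources waiting on it.
    per_proxy_limit is None when no proxy is in use.
    """
    wanted = sum(min(count, per_host_limit) for count in pending_by_host.values())
    return wanted if per_proxy_limit is None else min(wanted, per_proxy_limit)


# A page with five resources pending on each of three hosts.
page = {"www.example.com": 5, "img.example.com": 5, "cdn.example.net": 5}

print(max_parallel_connections(page, per_proxy_limit=6))   # 6  (IE8 behind a proxy)
print(max_parallel_connections(page, per_proxy_limit=12))  # 12 (IE9 behind a proxy)
```

A page served entirely from one host sees no difference, because the per-host limit of 6 is reached first.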
As mentioned in the DNS Pre-resolution section, doing more work in parallel is a great way to improve overall performance. Some of the browser’s network operations are not inherently sequential and time can be saved by doing work in multiple threads simultaneously.
For instance, we noticed that virtually all sites end up using more than one HTTP connection per hostname, and time could be saved if we open a second “background” connection when establishing the first connection. This background connection is available for the next HTTP request without forcing it to wait for the original connection to become available, and without the delay of opening a new connection in a “just in time” manner. Only one background connection is opened per host, and this improvement can save tens to hundreds of milliseconds when loading a page.
Next, we noticed that in some cases we had a blocking behavior in our connection handling, whereby there were multiple (e.g. 3) open connections to a server, but opening another connection to the server (#4) was delayed until a prior request had completed, even though the connections-per-host limit of 6 meant that we could open a new connection right away. We’ve removed that blocking behavior and allowed more work to finish in parallel.
Lastly, we observed that we could allow the browser to get download requests for images out onto the network more quickly by enabling the lookahead downloader to kick off image file downloads. In support of the HTML5 specification, we also tweaked the image download code to prevent images with an empty source (e.g. <img src="" />) from making a network request.
Of course, the most effective way to ensure great network performance is to eliminate network time entirely. The Temporary Internet Files cache allows IE to reuse previously downloaded content without re-downloading the content over the network. Last summer, I summarized a number of caching-related improvements in IE9 that help the browser make better use of the cache. Today, I’ll build upon that post by explaining the other work we’ve done to enhance the functionality of the cache.
Internet Explorer 6, 7, and 8 limit the Web content cache size to 1/32 of the disk capacity by default; in IE7 and IE8, the default capacity value is capped at 50 megabytes. Virtually all hard disks in use today are larger than 1.6 gigabytes, which means that nearly all users of IE7 and IE8 are browsing with a content cache limited to 50mb.
Internet Explorer 7’s default 50mb cache size was introduced because analysis in the Windows Vista timeframe showed that the browser’s cache-hit ratio was not significantly improved with caches larger than that size. Beyond consuming more disk space, larger caches can take longer to enumerate and clear (for instance, when performing the Delete Browser History operation). Therefore, an important design strategy is to only increase the cache size when an improved hit rate will result.
In IE9, we took a much closer look at our cache behaviors to better understand our surprising finding that larger caches were rarely improving our hit rate. We found a number of functional problems related to what IE treats as cacheable and how the cache cleanup algorithm works. After fixing these issues, we found larger cache sizes were again resulting in better hit rates, and as a result, we’ve changed our default cache size algorithm to provide a larger default cache.
Internet Explorer 9 has a default cache size limit of 1/256th of the total disk capacity, with the default capped at 250 megabytes. The new ratio was chosen to help ensure that low-end netbooks with tiny solid state disks are not impacted by (relatively) huge caches, while desktops with large disks will default to a cache five times the size of IE8’s. A disk as small as 16gb will result in an increase in the default cache size between IE8 (50mb) and IE9 (64mb). Any disk larger than 64gb will reach the default cap of 250mb. In all versions of Internet Explorer, you may manually adjust the cache size limit (within the range of 8mb to 1gb) using the Tools > Internet Options > General > Browsing History Settings dialog box.
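The two default rules are easy to compare side by side. This sketch assumes binary units (1gb = 1024mb) and whole-megabyte rounding, neither of which is stated above; the function names are for illustration only.

```python
def default_cache_limit_mb(disk_mb, divisor, cap_mb):
    """Default cache limit: a fixed fraction of the disk, up to a cap."""
    return min(disk_mb // divisor, cap_mb)


def ie8_default_mb(disk_mb):
    return default_cache_limit_mb(disk_mb, divisor=32, cap_mb=50)


def ie9_default_mb(disk_mb):
    return default_cache_limit_mb(disk_mb, divisor=256, cap_mb=250)


GB = 1024  # megabytes per gigabyte (assumed binary units)

for disk_gb in (8, 16, 64, 500):
    disk_mb = disk_gb * GB
    print(disk_gb, ie8_default_mb(disk_mb), ie9_default_mb(disk_mb))
# 8 50 32
# 16 50 64
# 64 50 250
# 500 50 250
```

The first row shows the netbook case: on an 8gb disk the IE9 default is actually smaller than IE8’s, which is the intent of the new ratio.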
If your Temporary Internet Files are stored on a high capacity disk drive with plenty of free space, you may be wondering if you would benefit from configuring an even larger cache. It is definitely possible, but our analysis suggests that increasing the limit above 250mb only results in benefits for some browsing patterns. Importantly, the caching system is internally limited to storing approximately 60,000 objects. Depending on the size of your cache and the size of the resources you download, you may encounter the object count limit before you encounter the size limit.
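Which limit is reached first depends on the average size of the cached objects. A back-of-the-envelope sketch, treating the approximate 60,000-object limit as exact and assuming every object is the same size:

```python
MB = 1024 * 1024


def first_limit_reached(cache_limit_bytes, average_object_bytes, max_objects=60_000):
    """Report whether the object-count limit or the size limit is hit first."""
    objects_that_fit = cache_limit_bytes // average_object_bytes
    return "object count" if objects_that_fit > max_objects else "size"


print(first_limit_reached(250 * MB, 2 * 1024))   # object count
print(first_limit_reached(250 * MB, 20 * 1024))  # size
```

With the 250mb default, the crossover is at an average object size of a little over 4kb (250mb divided by 60,000); for a 1gb cache, it rises to roughly 17kb.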
Technical Note: Internet Explorer maintains two caches, one for content used in Protected Mode (Internet and Restricted Zones) and one for content used outside of Protected Mode (Intranet, Trusted, and Local Computer zones). Each cache is individually capped at the limit, so the aggregate maximum disk space consumed by caches is double the individual limit.
As mentioned in the previous section, our analysis of the cache cleanup algorithms used in IE8 and below indicated significant room for improvement. As a result of that analysis, we’ve substantially rewritten the cache scavenger (which removes low value entries to free up cache space) for Internet Explorer 9. The new scavenger is significantly more likely to retain valuable entries (those cache entries which are likely to be used again) by discarding less valuable entries.
Internet Explorer’s cache scavenger is kicked off as the cache limit (size or object count) is approached. Its goal is to remove the least valuable 10% (by size) of the objects in the cache. It does this by enumerating the objects in the cache. In the first pass, it assigns each a score between 0 and 66,000. In the second pass, it deletes the entries which are ranked in the lowest 10% of the scores.
In IE9, we’ve made significant changes to how the scores are calculated to better ensure that the most valuable entries are kept and the least valuable entries are discarded.
Of the 66,000 possible points, 40,000 points are determined by how recently a given resource was used. 20,000 points derive from how often the resource has been used, and 6,000 points derive from the presence of validator information (HTTP response headers like Last-Modified and ETag) that allow conditional revalidation of resources after their freshness period expires.
We also take the MIME type of the resource into account; script, CSS, and HTML/XHTML resources receive full credit, while other resource types (e.g. images, audio, etc) receive only half of their allocated points. This helps ensure that the resources which have the greatest impact on page load time survive in the cache longer than lower priority resources.
Cache entries that have been used more than once receive an increasing number of reuse points; to get more than 10,000 points, the resource must have been reused over a period longer than 12 hours. This helps prevent resources that were reused frequently but only within a short period (e.g. when you’re browsing around a site you rarely visit) from getting the same amount of credit as a resource that is reused frequently across a long period (e.g. a script on a site that you visit every day).
Entries which have validators collect validator points, but the biggest impact of validators occurs after resources are no longer fresh. Any resources which have expired and do not contain validators receive a score of 0 points (since they can be neither reused nor revalidated), while those with validators retain 70% of their score. Expired resources with validators retain most of their score because they allow the browser to quickly recheck the freshness of a cached resource: the server may reply with a small HTTP/304 response (with no body) indicating that the cached copy may be reused.
The scavenger is also sensitive to a number of special cases—for instance, no resource will be scavenged for the first 10 minutes after it was downloaded, and resources that exceed the cache size (e.g. downloading a 4gb ISO file) are temporarily exempted from scavenging to permit the download to successfully complete.
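The scoring rules described above can be pulled together into a rough model. The point budgets, the half credit for non-script/CSS/HTML types, the 12-hour rule, the 70% and 0-point rules for expired entries, and the 10-minute grace period come from this post. The exact curves are not given here, so the linear recency decay over 30 days, the reuse count of 20 for full reuse credit, the list of MIME types, and applying the half credit to the whole score are all assumptions made purely for illustration.

```python
from dataclasses import dataclass

RECENCY_POINTS, REUSE_POINTS, VALIDATOR_POINTS = 40_000, 20_000, 6_000
FULL_CREDIT_TYPES = {"text/html", "application/xhtml+xml", "text/css",
                     "text/javascript", "application/javascript"}


@dataclass
class CacheEntry:
    url: str
    size: int
    mime_type: str
    hours_since_last_use: float
    reuse_count: int
    reuse_span_hours: float   # time between first and latest reuse
    has_validators: bool      # Last-Modified or ETag present
    expired: bool
    minutes_since_download: float


def score(entry, recency_window_hours=720.0, reuse_for_full_credit=20):
    """Score an entry from 0 to 66,000; higher means more worth keeping."""
    if entry.expired and not entry.has_validators:
        return 0  # can be neither reused nor revalidated
    # ASSUMPTION: recency credit decays linearly to zero over the window.
    recency = RECENCY_POINTS * max(0.0, 1 - entry.hours_since_last_use / recency_window_hours)
    # ASSUMPTION: reuse credit grows linearly up to reuse_for_full_credit uses.
    reuse = REUSE_POINTS * min(entry.reuse_count, reuse_for_full_credit) / reuse_for_full_credit
    if entry.reuse_span_hours <= 12:
        reuse = min(reuse, 10_000)  # bursts of reuse earn at most half
    total = recency + reuse + (VALIDATOR_POINTS if entry.has_validators else 0)
    if entry.mime_type not in FULL_CREDIT_TYPES:
        total *= 0.5  # images, audio, etc. receive half credit
    if entry.expired:
        total *= 0.7  # expired but revalidatable entries keep 70%
    return round(total)


def scavenge(entries, fraction=0.10, grace_minutes=10):
    """Return URLs to delete: lowest scores first, until 10% of bytes are freed."""
    target = fraction * sum(e.size for e in entries)
    candidates = sorted(
        (e for e in entries if e.minutes_since_download >= grace_minutes), key=score
    )
    freed, victims = 0, []
    for entry in candidates:
        if freed >= target:
            break
        victims.append(entry.url)
        freed += entry.size
    return victims
```

Under these assumptions, a recently used, frequently reused stylesheet with validators scores the full 66,000 points, while the same entry stored as an image scores 33,000.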
Internet Explorer 9 also includes new Tracking Protection and ActiveX Filtering features. Both of these features can improve your overall browser performance by preventing download and execution of undesired content. For instance, when loading the homepage of one popular news site, enabling the ActiveX Filter and one popular Tracking Protection list results in a significantly faster page load:
The page load time is improved by more than 50% because the browser is able to avoid downloading and running content which is not critical to the display of the page.
As you can see, we’ve taken a very broad approach to improving Internet Explorer’s network performance, in support of our engineering goal of building the world’s fastest browser. Of course, the speed of your own network connection remains an important factor in the equation, and Web developers should continue to follow best practices for performance, but IE9’s investment in network performance helps us deliver a significantly faster page load time across a wide variety of pages.