Tuesday, October 31, 2006

Tips to reduce your page load time

It is widely accepted that fast-loading pages improve the user experience. In recent years, many sites have started using AJAX techniques to reduce latency. Rather than round-trip through the server retrieving a completely new page with every click, often the browser can either alter the layout of the page instantly or fetch a small amount of HTML, XML, or javascript from the server and alter the existing page. In either case, this significantly decreases the amount of time between a user click and the browser finishing rendering the new content.

However, for many sites that reference dozens of external objects, the majority of the page load time is spent in separate HTTP requests for images, javascript, and stylesheets. AJAX probably could help, but speeding up or eliminating these separate HTTP requests might help more, yet there isn't a common body of knowledge about how to do so.

Try some of the following:

  • Turn on HTTP keepalives for external objects. Otherwise you add an extra round-trip for every HTTP request. If you are worried about hitting global server connection limits, set the keepalive timeout to something short, like 5-10 seconds. Also look into serving your static content from a different webserver than your dynamic content. Having thousands of connections open to a stripped down static file webserver can happen in like 10 megs of RAM total, whereas your main webserver might easily eat 10 megs of RAM per connection.

  • Load fewer external objects. Figure out how to globally reference the same one or two javascript files and one or two external stylesheets instead of many; try preprocessing them when you publish them. If your UI uses dozens of tiny GIFs all over the place, consider switching to CSS, which tends to not need so many of these.

  • If your users regularly load a dozen or more uncached or uncachable objects per page, consider evenly spreading those objects over four hostnames. This usually means your users can have 4x as many outstanding connections to you. Without HTTP pipelining, this results in their average latency dropping to about 1/4 of what it was before.

    Evenly spreading your images over four hostnames is most easily done with a hash function, like MD5. Rather than loading all of your objects from http://static.example.com/, create four hostnames (e.g. static0.example.com, static1.example.com, static2.example.com, static3.example.com) and use two bits from an MD5 of the image path to select between them.

    Beware that each additional hostname adds the overhead of an extra DNS lookup and an extra TCP three-way handshake. If your users have pipelining enabled or a given page loads fewer than around a dozen objects, they will see no benefit from the increased concurrency and the site may actually load more slowly. The benefits only become apparent on pages with larger numbers of objects. Be sure to measure the difference seen by your users if you implement this.

  • Allow static images, stylesheets, and javascript to be cached by the browser. This won't help the first page load for a new user, but can substantially speed up subsequent ones.

    Set an Expires header on everything you can, with a date days or even months into the future. This tells the browser it is okay to not revalidate on every request, which can add latency of at least one round-trip per object per page load for no reason.

    Instead of relying on the browser to revalidate its cache, if you change an object, change its URL. One simple way to do this for static objects if you have staged pushes is to have the push process create a new directory named by the build number, and teach your site to always reference objects out of the current build's base URL. (Instead of <img src="http://example.com/logo.gif"> you'd use <img src="http://example.com/build/1234/logo.gif">. When you do another build next week, all references change to <img src="http://example.com/build/1235/logo.gif">.) This also nicely solves problems with browsers sometimes caching things longer than they should -- since the URL changed, they think it is a completely different object.

    If you conditionally gzip HTML, javascript, or CSS, you probably want to add a "Cache-Control: private" if you set an Expires header. This will prevent problems with caching by proxies that won't understand that your gzipped content can't be served to everyone. (The Vary header was designed to do this more elegantly, but you can't use it because of IE brokenness.)

    For anything where you always serve the exact same content when given the same URL (e.g. static images), add "Cache-Control: public" to give proxies explicit permission to cache the result and serve it to different users. If a local cache has the content, it is likely to have much less latency than you; why not let it serve your static objects if it can?

    Avoid the use of query params in image URLs, etc. At least the Squid cache refuses to cache any URL containing a question mark by default. I've heard rumors that other things won't cache those URLs at all, but I don't have more information.

  • Minimize HTTP request size. Often cookies are set domain-wide, which means they are also unnecessarily sent by the browser with every image request from within that domain. What might've been a 400 byte request for an image could easily turn into 1000 bytes or more once you add the cookie headers. If you have a lot of uncached or uncachable objects per page and big, domain-wide cookies, consider using a separate domain to host static content, and be sure to never set any cookies in it.

  • Minimize HTTP response size by enabling gzip compression for HTML and XML for browsers that support it. For example, the 17k document you are reading takes 90ms of the full downstream bandwidth of a user on 1.5Mbit DSL. Or it will take 37ms when compressed to 6.8k. That's 53ms off of the full page load time for a simple change. If your HTML is bigger and more redundant, you'll see an even greater improvement.

    If you are brave, you could also try to figure out which set of browsers will handle compressed Javascript properly. (Hint: IE4 through IE6 asks for its javascript compressed, then breaks badly if you send it that way.) Or look into Javascript obfuscators that strip out whitespace, comments, etc and usually get it down to 1/3 to 1/2 its original size.

  • Consider locating your small objects (or a mirror or cache of them) closer to your users in terms of network latency. For larger sites with a global reach, either use a commercial Content Delivery Network, or add a colo within 50ms of 80% of your users and use one of the many available methods for routing user requests to your colo nearest them.

  • Regularly use your site from a realistic net connection. Convincing the web developers on my project to use a "slow proxy" that simulates bad DSL in New Zealand (768Kbit down, 128Kbit up, 250ms RTT, 1% packet loss) rather than the gig ethernet a few milliseconds from the servers in the U.S. was a huge win. We found and fixed a number of usability and functional problems very quickly.

  • (Optional) Petition browser vendors to turn on HTTP pipelining by default on new browsers. Doing so will remove some of the need for these tricks and make much of the web feel much faster for the average user. (Firefox has this disabled supposedly because some proxies and some versions of IIS choke on pipelined requests. But there are workarounds.)

The above list covers improving the speed of communication between browser and server and can be applied generally to many sites, regardless of what web server software they use or what language the code behind their site is written in.

The techniques of measurement and experimentation can be applied to other sources of latency. For example, try benchmarking common pages on your site with ab, which comes with the Apache webserver. If your server is taking longer than 5 or 10 milliseconds to generate a page, you should make sure you have a good understanding of where it is spending its time.

Get more information

Can't find what you're looking for? Try Google Search!
Google
 
Web eshwar123.blogspot.com

Comments on "Tips to reduce your page load time"