Leading by example - How do leading websites balance server load with interesting content?

How do major websites balance the need to reduce server load to give a faster delivery, while presenting lots of interesting content?

Below, we consider how three popular UK sites - Google, the BBC and the Inland Revenue (did we really say 'popular'?) - achieve this and the trade-offs they make.

Content
Google is one of the fastest sites out there. Visually it has changed in the last year or two, but it was originally very uncluttered, almost spartan, with only 3 page elements. Although the majority of packets contain images, only 12 KBytes (17 packets) are required to deliver the homepage (to a broadband user). Given that most of Google's remaining pages are dynamically generated for specific search criteria, there is probably some clever proprietary caching going on at the database level to keep it so fast.

In contrast, the BBC and Inland Revenue have much more content, yet they are still very responsive.

The BBC homepage is 93 KBytes (205 packets) and has 35 page elements. Interestingly, the site does not use persistent connections. This generates more traffic, as a TCP handshake is necessary for every object: three additional packets per object, hence the high packet count. The BBC could reduce its 36 KBytes of HTML by almost 10% merely by removing white space.
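To put a rough number on the cost of non-persistent connections, the sketch below multiplies the figures quoted above. It assumes, for illustration only, that every one of the 35 elements is fetched over its own connection and that a persistent setup would reuse a single connection; real browsers open several connections in parallel, so the true saving would be somewhat smaller. The function name `setup_packets` is ours.

```python
def setup_packets(objects: int, packets_per_setup: int = 3, persistent: bool = False) -> int:
    """Packets spent purely on connection setup, not on content."""
    # Idealised: a persistent connection is set up once and reused for every object.
    connections = 1 if persistent else objects
    return connections * packets_per_setup

BBC_OBJECTS = 35         # page elements quoted in the article
BBC_TOTAL_PACKETS = 205  # total packets quoted in the article

per_object = setup_packets(BBC_OBJECTS)                 # one connection per object
reused = setup_packets(BBC_OBJECTS, persistent=True)    # one shared connection
saved = per_object - reused

print(per_object, reused, saved)  # 105 3 102
print(f"{saved / BBC_TOTAL_PACKETS:.0%} of the homepage's packets")  # 50%
```

On these assumptions, roughly half of the 205 packets are connection setup rather than content.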

With typical homepage sizes somewhere between 80 KBytes and 130 KBytes, the Inland Revenue homepage at 102 KBytes (130 packets) is about average. It has 39 page elements and takes 20 seconds for a modem user to download. Eight of those elements are delivered from other central government servers to make up the "HM Government" top frame.
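As a sanity check on the 20 second figure, the arithmetic can be sketched as follows. The 56 kbit/s line rate is our assumption (the modem speed is not stated above), as is reading a KByte as 1024 bytes; the function names are ours.

```python
KBYTE = 1024  # assumption: "KBytes" means 1024 bytes

def download_seconds(kbytes: float, bits_per_second: float) -> float:
    """Ideal transfer time, ignoring latency and protocol overhead."""
    return kbytes * KBYTE * 8 / bits_per_second

def implied_rate(kbytes: float, seconds: float) -> float:
    """Effective throughput in bit/s implied by a measured download time."""
    return kbytes * KBYTE * 8 / seconds

# 102 KBytes over an assumed nominal 56 kbit/s modem line
print(f"{download_seconds(102, 56_000):.1f} s")       # 14.9 s
# Throughput implied by the 20 seconds quoted in the article
print(f"{implied_rate(102, 20) / 1000:.1f} kbit/s")   # 41.8 kbit/s
```

The gap between the ideal 15 seconds or so and the quoted 20 seconds is consistent with the packet and latency overheads discussed in this article, though the sketch cannot say how the 20 seconds was measured.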

Compression
Often the number of packets has nearly as much effect on delivery speed as the total data payload. Every packet carries its own headers, and the sender incurs latency each time it sends a packet and waits for an acknowledgement.
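The header part of that cost can be estimated with a short sketch. It assumes the minimum IPv4 and TCP header sizes (20 bytes each, no options) and that the KByte figures quoted in this article are payload only, which the measurements may or may not reflect. Latency, usually the larger effect, is not modelled here.

```python
HEADER_BYTES = 20 + 20  # minimum IPv4 header + minimum TCP header, no options
KBYTE = 1024            # assumption: "KBytes" means 1024 bytes

def header_overhead(packets: int) -> int:
    """Bytes of IP/TCP headers carried by the given number of packets."""
    return packets * HEADER_BYTES

# (payload in KBytes, packets) as quoted in the article
sites = {
    "Google": (12, 17),
    "BBC": (93, 205),
    "Inland Revenue": (102, 130),
}

for name, (kbytes, packets) in sites.items():
    overhead = header_overhead(packets)
    share = overhead / (kbytes * KBYTE)
    print(f"{name}: {overhead} header bytes, {share:.1%} relative to the payload")
```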

Google squeezes its homepage HTML into just 1350 Bytes by delivering it compressed; uncompressed, the page is actually 3730 Bytes. Looking at a unique page returning search results, Google continues to compress the HTML, with an overall saving of nearly 50%. This efficient use of compression is unexpected given that the majority of Google's pages are dynamically generated, so compressing each one on the fly potentially generates a large CPU load.
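The saving on the homepage follows directly from the two sizes quoted. The sketch below works it out and then applies the same measure to a gzip-compressed sample. The sample markup is made up for illustration and is not Google's page, so its ratio says nothing about the real site; we also assume gzip, as the compression scheme is not named above.

```python
import gzip

def saving(original_bytes: int, compressed_bytes: int) -> float:
    """Fraction of bytes saved by compression."""
    return 1 - compressed_bytes / original_bytes

# Homepage figures quoted in the article: 3730 Bytes raw, 1350 Bytes compressed
print(f"{saving(3730, 1350):.0%}")  # 64%

# Hypothetical, highly repetitive markup standing in for a results page
html = (
    "<html><body>"
    + "<p class='result'>An example search result line</p>" * 50
    + "</body></html>"
).encode("utf-8")

packed = gzip.compress(html)
assert gzip.decompress(packed) == html  # compression is lossless

print(len(html), len(packed), f"{saving(len(html), len(packed)):.0%}")
```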

Neither the BBC nor the Inland Revenue uses compression. For the BBC this would have significant benefits for the 36 KBytes of HTML. Compression would also benefit the Inland Revenue site, as it contains much static content, which is very easy to compress. Even compression on the fly would in principle be easier to perform, as the site is not as busy as the others.
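For static content, the CPU cost can be paid once rather than per request. A minimal sketch of that idea, assuming gzip and a deliberately simplified reading of the browser's Accept-Encoding header; the class and function names are hypothetical and not taken from any of the sites discussed:

```python
import gzip

def accepts_gzip(accept_encoding: str) -> bool:
    """Simplified check: is gzip listed with a non-zero quality value?"""
    for part in accept_encoding.split(","):
        token, _, params = part.strip().partition(";")
        if token.strip().lower() == "gzip":
            params = params.strip()
            if params.startswith("q="):
                try:
                    return float(params[2:]) > 0
                except ValueError:
                    return False
            return True
    return False

class StaticPage:
    """Holds a page in both raw and pre-compressed form."""

    def __init__(self, raw: bytes):
        self.raw = raw
        self.gzipped = gzip.compress(raw)  # compressed once, up front

    def respond(self, accept_encoding: str = ""):
        """Return (headers, body) suited to what the browser accepts."""
        headers = {"Vary": "Accept-Encoding"}
        if accepts_gzip(accept_encoding):
            headers["Content-Encoding"] = "gzip"
            return headers, self.gzipped
        return headers, self.raw

page = StaticPage(b"<p>Mostly static guidance text</p>" * 200)
headers, body = page.respond("gzip, deflate")
print(headers, len(page.raw), len(body))
```

Browsers that do not advertise gzip still receive the uncompressed page, so nothing is lost for older clients.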

Caching
Google makes sensible use of caching. All images have HTTP headers with the 'Expires:' field set to a date far into the future - 2038. As a result, once an image has been downloaded initially, the browser can be confident that it is still fresh for a long time. As Google's homepage is only 2 packets of text - with images in a local cache - the saving is 90% of the homepage.
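A far-future 'Expires:' value and the freshness test a cache applies to it can be sketched as below. The exact date is our assumption, since only the year 2038 is given above, and `is_fresh` ignores other caching headers that a real browser would also consult.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def expires_header(when: datetime) -> str:
    """Format an aware UTC datetime as an HTTP date string."""
    return format_datetime(when, usegmt=True)

def is_fresh(expires_value: str, now: datetime) -> bool:
    """True if a cached copy may be reused without asking the server."""
    return parsedate_to_datetime(expires_value) > now

# Assumed date: the article only says the year is 2038
far_future = expires_header(datetime(2038, 1, 1, tzinfo=timezone.utc))
print("Expires:", far_future)

print(is_fresh(far_future, datetime.now(timezone.utc)))
print(is_fresh(far_future, datetime(2039, 1, 1, tzinfo=timezone.utc)))  # False
```

While the copy is fresh, the browser does not need to contact the server about the image at all.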

Due to a high proportion of news content, the BBC site is largely dynamically driven, and this makes caching more difficult. The Inland Revenue does not use caching. However, as the majority of images on the page are likely to be unchanged for months, the site would benefit from it.

Style Sheets
Google does not use cascading style sheets (used to define page layout separately from the page content). Given the simplicity of the site, the performance improvement generally achieved by the use of style sheets may not be great, but style sheets might be considered for the future.

The BBC and the Inland Revenue do implement style sheets. The BBC, however, still has some tags embedded in the HTML that should be in the style sheet. This is perhaps historic, from the days when style sheets were not so well supported. Moving these tags into the style sheet can only help the overall presentation and will also reduce the page size. As an example, one such tag occurs 105 times. This could be defined more efficiently in the style sheet, saving a further 1.5 KBytes.

Favicon
Google provides a Favicon (.ico image) of 1406 Bytes of data (2 packets). Neither the BBC nor the Inland Revenue offer a favicon.ico. When the favicon is requested by the browser, an error page is sent which is never seen by the user. For the BBC this wasted error page is nearly 3 KBytes of data (3 packets) and for the Inland Revenue 6700 Bytes (5 packets).

GIFs as text headings
Although the Inland Revenue site looks to be largely text with only a few images, the welcome heading and over 10 of the sub-headings are in fact images. Neither of the other two sites uses images for pure text headings. To reduce the number of elements in the page and packets used, the Inland Revenue could replace these with text, coloured or formatted appropriately. This would also make the page more accessible for visually impaired users.

Web Farms
While not obvious from the outside, there are likely to be many web servers in the BBC web farm. The BBC site explicitly calls up a second hostname for some of the images in the page: newsimg.bbc.co.uk. This may help to take some load off the main servers, and is perhaps used to allow the large images to be served from satellite servers in closer proximity to the user. Google takes a similar approach, implementing 'content distribution' from akamai.net for its UK site.

Frames
Frames can make a site less accessible and more difficult to use. Moreover, they are not search engine friendly. Neither Google nor the BBC uses frames. The Inland Revenue does, but only to insert the obligatory "HM Government" header.

Metadata
The Inland Revenue site has 1500 Bytes of additional metadata. As mentioned on page 3, this is to comply with the requirement for metadata on government websites - an extra payload the other sites don't have to contend with.


The above website comparison has been carried out independently of the sites discussed and is intended simply as a vehicle to explore the implementation of various web technologies. We welcome any feedback from the websites involved.

 

© SciVisum Limited 2012 - Web Application Testing  Tel +44 (0)1227 768276 -  Contact SciVisum - The Clocktower, 25-39 St George's Street, Canterbury, Kent CT1 2LE