Website wobbles as the snow falls on southern UK

Date: 6th December 2012
Author: Deri Jones

Planning and testing well in advance of Christmas sales is crucial for all eCommerce sites. One of the worst nightmares for support and marketing teams is an unforeseen failure at a time when traffic peaks are at their highest and user patience and sympathy are at their lowest.

As an example of the kind of thing that can go wrong for multichannel retailers in the Christmas sales period, the UK online retailer under the spotlight today is Argos (no hard feelings, I hope, Argos!).

Generally, online retailers plan well in advance of Christmas: they complete their software updates and install extra hardware back in August/September, and use October for final testing, including the all-important website load testing of the live site with all its latest bells and whistles in place.

With all key scenarios tested under realistic conditions, the teams are as close as they can be to being 100% sure the site is ready.

But on the morning of 5th December the Argos site was broken, probably as a result of a rather last-minute tweak in progress, although it may simply have been a short-term failure.

There were two major problems happening around 9am:

  • Search did not work
  • Add to Basket failed

Both are serious problems because they can stop online sales dead in their tracks.

Users were seeing the problems for more than 30 minutes.

What would have made these problems difficult for simple, static URL monitoring, or for a human on the internal team, to catch is that they were not occurring on the homepage or on simple product pages. They would only be revealed when a user was part way through a search and purchase process, so they would be found by a dynamic, automated mystery shopping journey or, worse, by a real user on the site.

The danger here is that the further along the transaction journey a user is, the more extreme the negative reaction to any usability or availability problems they encounter. Disruption in the search and buy process actively prevents them from completing a specific goal, at a time of year when speed and confidence of fulfilment are of the utmost importance.

The problem highlights the need for realistic User Journey monitoring that dynamically makes different choices each time and thus quickly finds problems such as search failing, even if it fails for only one category or subcategory.
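To make the idea concrete, here is a minimal sketch in Python of a journey that picks a different category and product on each run. Everything in it is an assumption made for illustration: the `FakeShop` class stands in for a real site (a real monitor would drive HTTP requests or a browser), and the names `run_journey` and `failing_categories` are hypothetical, not part of any monitoring product.

```python
import random
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class JourneyResult:
    category: str
    failed_step: str  # "" when the whole journey succeeded


class FakeShop:
    """Hypothetical stand-in for a real site: search is broken for one category."""

    def __init__(self, catalogue: Dict[str, List[str]], broken_category: str):
        self.catalogue = catalogue
        self.broken_category = broken_category

    def search(self, category: str) -> List[str]:
        if category == self.broken_category:
            return []  # simulates a search that silently returns nothing
        return list(self.catalogue[category])

    def add_to_basket(self, product_id: str) -> bool:
        return True


def run_journey(shop, categories: List[str], rng: random.Random) -> JourneyResult:
    """Search a randomly chosen category, then add a random product to the basket."""
    category = rng.choice(categories)
    products = shop.search(category)
    if not products:
        return JourneyResult(category, "search")
    if not shop.add_to_basket(rng.choice(products)):
        return JourneyResult(category, "add_to_basket")
    return JourneyResult(category, "")


def failing_categories(shop, categories: List[str], runs: int, seed: int = 0) -> Dict[str, str]:
    """Run the journey many times and report which step failed for which category."""
    rng = random.Random(seed)
    failures: Dict[str, str] = {}
    for _ in range(runs):
        result = run_journey(shop, categories, rng)
        if result.failed_step:
            failures[result.category] = result.failed_step
    return failures
```

A check that only ever requested the homepage would report this fake site as healthy. Because the journey varies its choices, repeated runs reach the one broken category and name the step that failed.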

Tips to take away

A) There’s no such thing as a website that never has problems!

Make sure that your 24/7 monitoring does what customers do, as realistically as possible. Anything less and the alarms may not go off when certain subtle features on the site fail.

Have a problem resolution process in place that is understood by, and involves, all stakeholders from all departments.

B) There’s no such thing as a website that never has slow pages!

For various reasons:

  • Your traffic patterns can change this year compared to last year because of your marketing.
  • New features you’ve had built into your site mean that the load on your systems per visitor is different.
  • Visitors’ shopping habits may change: they may look at more pages per purchase than before.

You can’t be sure which parts of your website will be slower than you’d like. The more complex systems become, the harder it is to predict the results of all interactions without testing.

But you can reduce the risk by load testing in advance on the real system. (An IT test environment is very valuable for certain kinds of test, but not so much for this one.) Use realistic user journeys that together generate a realistic mix of traffic, matching the best evidence you have from your web analytics data of the mix of users, their journeys and how they behave.
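One way to turn analytics figures into a traffic mix is to share out the simulated users in proportion to how often each journey was observed. The Python sketch below does that with largest-remainder rounding, so the totals always add up. The function name and the journey names in the usage note are made up for illustration and do not come from any real site’s data.

```python
from typing import Dict


def allocate_virtual_users(journey_counts: Dict[str, int], total_users: int) -> Dict[str, int]:
    """Split total_users across journeys in proportion to observed journey counts."""
    total = sum(journey_counts.values())
    if total <= 0:
        raise ValueError("journey_counts must contain at least one observed journey")

    # Exact (fractional) share of simulated users for each journey.
    exact = {name: total_users * count / total for name, count in journey_counts.items()}

    # Round down first, then hand the leftover users to the largest remainders.
    allocation = {name: int(share) for name, share in exact.items()}
    leftover = total_users - sum(allocation.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - allocation[name], reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation
```

For example, with invented counts of 6000 browse-only visits, 3000 search-and-buy visits and 1000 account logins, a 200-user test would be split 120, 60 and 20.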

C) You need to reduce friction between Tech and Business teams when problems do arise.

It’s too easy to blame IT teams for problems.

Instead, you need to take more of a listening approach and base your discussions on hard numbers and evidence: 24/7 measurements of speed and errors on the site.

The common language that works best is one based on meaningful user journeys, measured on your site 24/7. That makes it easy to look back at problems dispassionately, see the hard numbers, know whether slowdowns or errors were happening on some parts of the website, and see how they correlate with peak or poor sales hours.
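As a rough illustration of looking at the hard numbers, the Python sketch below computes a Pearson correlation between two hourly series, such as the journey error rate and the number of orders. The function name is an assumption for this example, and any figures you feed it should be your own measurements.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equally long series of hourly measurements."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length with at least two points")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a series never changes")

    return covariance / sqrt(var_x * var_y)
```

A value close to -1 for error rate against orders would suggest that the hours with the most journey errors were also the hours with the weakest sales. Correlation alone does not prove the errors caused the drop, but it gives both teams the same number to discuss.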

Many of the issues we discover for clients could only have been found by taking the approach that we do, and the results speak for themselves.