Friday, February 01, 2013

Risk analysis v. Downtime

Amazon.com experienced an outage yesterday between 2:40 and 3:30 (pm EST). Amazon's revenue is about $5-million per hour. Does that mean Amazon lost $5-million in revenue, or will those customers just wait and come back to try again later?

Generally, it's straight lost revenue. Customers don't come back. At least, that's what large customers from diverse industries tell me.

Large websites are in a position to know. The larger a website, the more predictable the traffic, to the point where you can set your watch by it. When traffic spikes in the morning, you know that it's 9am on the East Coast when everyone gets to work and starts browsing the net.

It's so predictable that you know, within 1%, what the revenue per hour should be. Thus, when the site is down for an hour, you can measure the next few hours to see if customers are coming back to try again later. Even if the effect is only slight, like 5% additional traffic, you'll tease the signal out of the noise. In my experience with large customers, they don't see that additional traffic later. Customers either go elsewhere to purchase, or reconsider and don't buy the thing at all.
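As a rough sketch of that measurement (not anybody's actual tooling), you can compare the revenue you expected in the hours after an outage against what actually came in, and ask what share of the lost revenue showed up later. The function name, the inputs, and the sample numbers below are all made up for illustration; the amounts are in thousands of dollars per hour.

```python
def recovered_fraction(expected, actual, outage_hours, window_hours):
    """Estimate the share of outage-hour revenue that came back afterward.

    expected, actual: hourly revenue for the same hours, starting at the
    first hour of the outage. Both series are hypothetical inputs.
    """
    # Revenue that should have arrived during the outage but did not.
    lost = sum(expected[:outage_hours]) - sum(actual[:outage_hours])
    if lost <= 0:
        raise ValueError("no revenue was lost during the outage hours")

    # Revenue above the forecast in the hours right after the outage.
    end = outage_hours + window_hours
    extra = sum(actual[outage_hours:end]) - sum(expected[outage_hours:end])
    return extra / lost


# Hypothetical site forecast at 5,000 (thousand dollars) per hour.
expected = [5000, 5000, 5000, 5000, 5000]

# Case 1: one hour down, then traffic within the usual 1% noise.
no_return = [0, 5020, 4970, 5010, 5000]

# Case 2: one hour down, then 5% extra revenue for four hours.
some_return = [0, 5250, 5250, 5250, 5250]

print(recovered_fraction(expected, no_return, 1, 4))
print(recovered_fraction(expected, some_return, 1, 4))
```

With a forecast good to within 1%, a sustained 5% bump like the second case would stand out clearly. A result near zero, like the first case, is what straight lost revenue looks like.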

It's not just etailers like Amazon. For instance, consider what happens when the NASDAQ market opens at 9:30am. You'd think there would be a rush of orders compensating for the pent-up demand from overnight. There isn't. Instead, it's roughly a straight line from the moment NASDAQ opens to the moment it closes at 4pm. They have a graph in the data center. It's a square wave, down to the second.

Or consider the oil industry. Let's say a virus stops production on an oil platform, and then they fix the computers and restart production. You'd think they could just pump a little extra, either from that oil platform or another, to make up the loss. Or maybe you think that while the revenue is lost today, the oil company will make up the loss in 20 years as the well goes dry and gas prices shoot through the roof because we'll have passed peak oil. Thus, you'd think lost revenue now equates to greater assets for later. Neither of these things happens. Oil companies account for pumping downtime as a straight loss in oil revenue.


What I'm talking about is how to account for downtime when doing risk analysis. Can you assume that you'll make up the revenue later, and thus the cost of short periods of downtime is small? Or do you account for it as a straight loss of revenue, and thus, the cost of downtime, even for short periods, is high? In my experience with large customers, they always account for it as a straight loss of revenue. Moreover, they base this decision on robust measurements of what really happens with their business.
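To make the two accounting choices concrete, here is a minimal sketch using the Amazon numbers from the top of this post: roughly $5 million per hour, and an outage from 2:40 to 3:30, or 50 minutes. The `recovered_fraction` parameter is my own illustrative knob for "how much revenue you assume comes back later", not something any customer has given me.

```python
def downtime_cost(revenue_per_hour, outage_minutes, recovered_fraction=0.0):
    """Revenue lost to an outage, net of whatever is assumed to return later.

    recovered_fraction is a hypothetical assumption between 0.0 (straight
    loss) and 1.0 (every customer comes back and buys later).
    """
    if not 0.0 <= recovered_fraction <= 1.0:
        raise ValueError("recovered_fraction must be between 0 and 1")
    gross_loss = revenue_per_hour * outage_minutes / 60
    return gross_loss * (1.0 - recovered_fraction)


# Straight loss, the way large customers account for it.
straight = downtime_cost(5_000_000, 50)

# Optimistic assumption that 80% of customers return later (made-up figure).
optimistic = downtime_cost(5_000_000, 50, recovered_fraction=0.8)

print(f"straight loss: ${straight:,.0f}")
print(f"optimistic:    ${optimistic:,.0f}")
```

Under the straight-loss assumption, the 50 minutes cost about $4.2 million. Under the optimistic assumption the same outage looks several times cheaper, which is why the choice matters so much in a risk analysis.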

1 comment:

Tobias said...

I don't think that all people who have decided to buy a specific product and go to a big shop site like Amazon will go elsewhere or not buy at all. If you want to buy on Amazon, you do that because of convenience, price, service, etc., and simply because you're used to it.
If Amazon is down, most people will not bother to search for another shop, check shipping terms, register, and so on, but will simply try again later.
And that's the reason it's hard if not impossible to measure these delayed buys. People don't know when the site is up again and so they will come back at any time after. Some will come back right after it's up again, some an hour, some five, some a day or two after. So they disappear in the noise.
What will definitely be lost are the spontaneous buys and that may be a good fraction of expected volume.