Uptime and Availability - a tale of 2 metrics

By Pilgrim - July 27, 2020

Metrics are a key part of how a company improves its service quality, to make its customers happy, to grow, to succeed.  The metrics uptime and availability may sound similar - but their difference tells the tale of how a successful connected-device business changes as it grows.

The tale starts when a company deploys its first units into the hands of real users - perhaps 100 or so. At DevicePilot we generally call this the "trial" or "pilot" stage: for the first time there are enough units deployed to be able to start gathering statistically-valid feedback about functionality, utility, customer satisfaction - all the aspects which determine whether this new proposition swims, or sinks.

It has taken Herculean effort to get those early units built and deployed, and there will likely be rough edges in every aspect of the product: hardware, software, comms, cloud, UX and so on. Indeed, the very purpose of this pilot is to discover these and address them, so that the next order of magnitude can be deployed with greater confidence - and less-frantic customer support!

Uptime

At this stage the team is all too aware that their devices don't work all the time, so the most natural metric is uptime: what percentage of time are the devices actually working, over a rolling period of - say - the last week? Because the device is connected, this is easy to measure, for example by tracking device heartbeats received in the cloud.
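As a minimal sketch of the heartbeat approach, the snippet below assumes that a device counts as "up" from each heartbeat until a timeout expires, and reports the fraction of a window covered by those intervals. The function name `uptime_fraction`, its parameters and the timeout rule are illustrative assumptions, not DevicePilot's actual method.

```python
from datetime import datetime, timedelta

def uptime_fraction(heartbeats, window_start, window_end, timeout):
    """Fraction of the window during which the device counts as up.

    Assumption: a device is "up" from each heartbeat until `timeout` later.
    """
    covered = timedelta(0)
    covered_until = window_start  # end of the time already counted
    for hb in sorted(heartbeats):
        start = max(hb, covered_until)           # avoid double-counting overlaps
        end = min(hb + timeout, window_end)      # clip to the end of the window
        if end > start:
            covered += end - start
            covered_until = end
    return covered / (window_end - window_start)

# Hypothetical device: hourly heartbeats for 9 hours, then silence.
start = datetime(2020, 7, 20)
beats = [start + timedelta(hours=i) for i in range(9)]
print(uptime_fraction(beats, start, start + timedelta(hours=10), timedelta(hours=1)))  # 0.9
```

For a rolling weekly figure, the same function could be called with a window of the last seven days.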

Obviously we'd like uptime to be 100%, but at trial stage even 90% is good going, because of all those rough edges. That is probably good enough to discover the rough edges, though probably not good enough for production volume. The important thing to realise about trials is that the customer isn't actually paying for the product; they're paying for the service it delivers. So it's not just a product that's being deployed for the first time, it's a process too: the process which will measure and improve the service delivered by the product.

Uptime is a great place to start on the iterative process of improving service quality, but over time it will become apparent that it's quite an inward-facing, technical metric, which is "necessary, but not sufficient". Those readers who have spent time in the IT world may be familiar with the scenario where a customer complains that your application isn't working, yet when you turn to your technical team they claim that everything is working fine - and it turns out that they're measuring server health, yet the application running on the servers has crashed. It's time to move beyond solely technical metrics.

Availability

So let's look at things from the customer's point of view. The customer doesn't care about technical metrics; they just care that when they go to use the device, they can. This is what we call availability. Does 100% uptime imply 100% availability? No.

By looking at a couple of scenarios, we'll now discover how we can achieve high availability with low uptime, and low availability with high uptime. The use case we'll look at is electric vehicle charging sockets - and you may see analogies with your own use case.

[Figure: poor uptime]

Above we see 91.2% uptime, yet 94.3% availability - how can that be? Note that the two figures are measured at different levels: it's socket uptime, but site availability. Several sockets are deployed on each site. If I drive my electric vehicle to a site with 4 sockets and find 3 of them broken and only 1 working, then right now that site has poor socket uptime (25%), but if I'm the only customer wanting to charge my car then I can do so - so service availability at the site is good.
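The difference between the two levels of measurement can be sketched in a few lines. This is an illustrative snapshot only: the site names and data are hypothetical, usage is ignored (one customer per site is assumed), and a real metric would average such snapshots over time.

```python
def socket_uptime(sites):
    """Fraction of all sockets that are working, across every site."""
    sockets = [working for site in sites.values() for working in site]
    return sum(sockets) / len(sockets)

def site_availability(sites):
    """Fraction of sites where at least one socket is working."""
    return sum(any(site) for site in sites.values()) / len(sites)

# Hypothetical snapshot: True means the socket is working.
sites = {
    "site_a": [False, False, False, True],  # 3 broken, 1 working
    "site_b": [True, True, True, True],
}
print(socket_uptime(sites))      # 5 of 8 sockets working: 0.625
print(site_availability(sites))  # both sites can serve a lone customer: 1.0
```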

Now let's look at a different scenario:

[Figure: poor availability]

What's going on here? We have excellent uptime, yet a customer visiting a random site has a more than 1 in 10 chance of not getting a charge, which is clearly a poor user experience. How come? This reflects the fact that availability is driven not just by technical functionality, but also by usage. If I arrive at a site with 4 perfectly working sockets, but they're all in use, then I can't get a charge. It's not a technical problem, but it's a big business problem. If this situation persists, then it's a sign that more sockets are needed at that site.
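To illustrate how usage enters the picture, here is a small sketch in which each socket is free, in use, or broken. A socket counts towards uptime unless it is broken, but a site is only available if at least one socket is free. The `State` enum, site names and data are hypothetical, and this is a single snapshot rather than a figure averaged over time.

```python
from enum import Enum

class State(Enum):
    FREE = "free"
    IN_USE = "in_use"
    BROKEN = "broken"

def socket_uptime(sites):
    """Fraction of sockets that are technically working (free or in use)."""
    sockets = [s for site in sites.values() for s in site]
    return sum(s is not State.BROKEN for s in sockets) / len(sockets)

def site_availability(sites):
    """Fraction of sites where an arriving customer finds a free socket."""
    usable = sum(any(s is State.FREE for s in site) for site in sites.values())
    return usable / len(sites)

# Hypothetical snapshot: every socket works, but one site is fully occupied.
sites = {
    "busy_site": [State.IN_USE] * 4,
    "quiet_site": [State.FREE, State.FREE, State.IN_USE, State.FREE],
}
print(socket_uptime(sites))      # 1.0
print(site_availability(sites))  # 0.5
```

Tracking which sites are repeatedly unavailable despite full uptime is one way to spot where extra sockets might be needed.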

Conclusion

There's a lot more to this topic than we can cover in a short blog piece, but hopefully it's helped explain the two important metrics of uptime and availability, how they are different, and how - whilst uptime is often the initial focus to drive up technical quality - service availability is often a better metric for driving customer satisfaction in the longer term.

 

 
