Managing sensor unreliability

By Pilgrim - January 31, 2019

Connected products are far from 100% reliable. With customers across commercial and consumer markets, DevicePilot has unique insight into typical performance levels, the underlying reasons for failure, and how to mitigate them.

The facts

If you're like most connected product companies today, you'll struggle to achieve "one nine" of uptime (working 90% of the time). This is a much worse experience than for typical unconnected products. To move towards two nines (99% uptime), you must reduce your downtime by a factor of 10.
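To make the "factor of 10" concrete, here is a minimal sketch that converts an uptime percentage into expected downtime per year. The function name and the 365-day year are illustrative assumptions, not part of any DevicePilot API.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years (assumption)


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Hours per year a device is expected to be down at a given uptime."""
    return (100.0 - uptime_percent) / 100.0 * HOURS_PER_YEAR


one_nine = downtime_hours_per_year(90.0)   # about 876 hours, roughly 36.5 days
two_nines = downtime_hours_per_year(99.0)  # about 87.6 hours, roughly 3.65 days

# Going from one nine to two nines cuts downtime roughly tenfold.
print(round(one_nine / two_nines))  # 10
```

In other words, a device at one nine is out of action for over a month each year, while a device at two nines is down for less than four days.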

Why are IoT sensors unreliable (and how do you fix that)?

Here are the main failure modes and ways to mitigate them:

Failure mode

How to mitigate

Sensor application crashes

○ Sending "heartbeats" allows remote management

○ A "watchdog timer" reboots sensor automatically

Sensor battery runs out

○ Track battery state remotely:

• automatically ship batteries just-in-time

• flag any sudden increases in consumption and root-cause (new software version? hardware problem? pathological application state?)

Sensor "falls off" network

○ Rigorous testing of network management code (notoriously complex)

○ Collect diagnostics from the network layer (e.g. cellular)

○ Store-and-forward data in the sensor

○ Don't use wireless unless you have to

○ Provide more than one comms link (e.g. cellular plus LoRa, or use meshing)

Sensor unplugged or damaged

○ Design hardware to detect the condition, for example a broken/missing temperature sensor mustn't report a plausible temperature

In general

○ Implement a "black box", logging to non-volatile memory on the sensor, so that in the worst case individual units can be diagnosed by R&D engineers

○ Add some redundancy in the data you send. Repeat any important "state" information regularly even if it hasn't changed.
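As one illustration of the "heartbeat" idea from the list above, the cloud side can flag any sensor that has gone quiet for too long. This is a minimal sketch under stated assumptions: the five-minute interval, the three-missed-beats policy, the `silent_devices` function and the device IDs are all hypothetical, and a real system would read the last-seen times from its own telemetry store.

```python
from datetime import datetime, timedelta, timezone

# Assumed policy: a device is "silent" after missing three heartbeats in a row.
HEARTBEAT_INTERVAL = timedelta(minutes=5)
MISSED_BEATS_ALLOWED = 3


def silent_devices(last_seen: dict[str, datetime], now: datetime) -> list[str]:
    """Return IDs of devices whose last heartbeat is older than the allowed gap."""
    cutoff = now - HEARTBEAT_INTERVAL * MISSED_BEATS_ALLOWED
    return sorted(dev for dev, seen in last_seen.items() if seen < cutoff)


now = datetime(2019, 1, 31, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "sensor-a": now - timedelta(minutes=2),   # reported recently
    "sensor-b": now - timedelta(minutes=40),  # missed several heartbeats
}
print(silent_devices(last_seen, now))  # ['sensor-b']
```

Because heartbeats are sent even when nothing has changed, silence itself becomes a signal: a device that stops reporting has crashed, lost power, or fallen off the network.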

Missing data

As data trickles through an IoT system from sensors to cloud database, it's inevitable that some of it will go missing. Here are some rules of thumb to make sure you can cope well with that reality:

  1. Design your application to cope with missing data. For example, asking for the average temperature of a million sensors when only 100 are missing shouldn't produce an "error": the average of the sensors that did report is a reasonable answer, albeit with a caveat.

  2. Don't ever "invent" data.

    1. If you're plotting a graph and some of the data is missing, don't smooth over the gap - its existence is important information to show the user.

    2. Likewise, as you chain together pieces of analytics within your application, you might like to pass a "confidence" value with each result, so you never lose track of the quality of the result.

  3. Ultimately, your application may sometimes have to say "I don't know" because the input data is too patchy or too old to allow a high-confidence answer.
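The rules above can be sketched in a few lines. This is only an illustration, assuming that a missing reading is represented as `None` and that "confidence" is simply the fraction of sensors that reported; the `MIN_COVERAGE` threshold and the function name are hypothetical choices, not a description of how DevicePilot computes its results.

```python
from typing import Optional

MIN_COVERAGE = 0.5  # assumed threshold below which we refuse to answer


def average_with_confidence(
    readings: list[Optional[float]],
) -> tuple[Optional[float], float]:
    """Average the readings that arrived; None marks a missing reading.

    Returns (average, coverage), where coverage is the fraction of sensors
    that reported. The average is None ("I don't know") when coverage is
    below MIN_COVERAGE. Missing values are never replaced with invented ones.
    """
    present = [r for r in readings if r is not None]
    coverage = len(present) / len(readings) if readings else 0.0
    if coverage < MIN_COVERAGE:
        return None, coverage
    return sum(present) / len(present), coverage


print(average_with_confidence([20.0, 22.0, None, 21.0]))  # (21.0, 0.75)
print(average_with_confidence([None, None, None, 19.5]))  # (None, 0.25)
```

Passing the coverage value along with the average lets any later stage of analytics, or the user interface, decide how much weight to give the result.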

DevicePilot's Cohort Analysis page is a good example of these principles in action: the colour-density of each bar on the chart shows the number of devices making up that sample, making statistical significance intuitive.

Uptime Percentage by Signal Strength

How good can you get?

Be aware of some "laws of physics" limitations. For example, if your device is using cellular connectivity, is deployed indoors, and you have no control over its exact placement, then you will be lucky to achieve 92% network availability.

If you're deploying a lot of connected products, you can't ignore the challenge of reliability. DevicePilot provides a great way to get the big picture so you can identify, measure, analyze and resolve your smart product challenges.




Erik Fairbairn, CEO at POD Point:
Achieved 99% uptime across device estate

"We're totally data driven at POD Point, and if we can answer a question using data then we think that’s the best way - there’s no guesswork and you can use the facts.

Our DevicePilot dashboards have really let us get that actionable insight out of our devices."