The challenge
Perfect Printing sells industrial printers which it deploys onto several customer sites across the UK. The printers are sold “as a service” by the day, which means that Perfect is responsible for keeping them working.
Any customer with at least two printers on-site is defined as a VIP customer (so Oxford below doesn’t qualify). VIP customers receive a signed Service Level Agreement (SLA) promising that:
80% of our printers on your site will have an uptime of >90% on any given day
Each printer sends a status report to the Cloud every 5 minutes.
The Operations Manager at Perfect wants to see a simple table showing the number of days over the past week that each VIP site has not met its SLA - information which can allow action in time to take remedial action to avoid breaching the monthly metric on which her salary depends!
The structure we’ll need is:
RELATIONAL: Count how many VIP sites have how many days with <80% good printers
RELATIONAL: Identify good printers as those with >90% uptime
STREAMING: Find the up-time ← we’ll look at this first
Finding up-time
First, let’s do a streaming query to find the up-time of each printer. We define “up” for a printer as having heard from it in the last 15 minutes (900 seconds), without any faults reported so our metric of uptime is:
DURATION(`$ts > ago(900) && fault == FALSE`) / DURATION()
If we (GROUP BY 1d, site, ID) we get:
$ts |
site |
$id |
uptime |
0d |
Edinburgh |
Printer 1 |
1.00 |
0d |
Edinburgh |
Printer 2 |
1.00 |
0d |
Edinburgh |
Printer 3 |
1.00 |
0d |
Oxford |
Printer 4 |
1.00 |
... |
|||
1d |
Edinburgh |
Printer 1 |
0.74 |
1d |
Edinburgh |
Printer 2 |
1.00 |
1d |
Edinburgh |
Printer 3 |
1.00 |
1d |
Oxford |
Printer 4 |
1.00 |
... |
i.e. a table with a row for every unique combination of day/site/printer, reporting the uptime of that printer on that day.
Thresholding, counting
We then take the output of this streaming query and apply a couple of SQL queries to it.
- First relational query:
- Threshold the uptime into good/bad
- Anything greater than 90% is up
- Anything lesser is down
- GROUP BY timestamp ts (i.e. by day) and by site, then COUNT the number of rows on each day and site, to get the number of “good printers” on that site.
- Second relational query:
- Select only sites with at least two printers
- Identify when the number of good printers on the site is less than 80% - we call that a “bad day” for that site.
Voila, here’s the table the COO wanted. A live “list of shame” for the operations team’s dashboard as they work towards achieving their goal of “zero bad days on all sites”:
bad_days_in_past_week | |
Edinburgh | 1 |
Cambridge | 0 |
London | 3 |
Conclusion
Combining streaming queries with relational queries makes it easy to track and manage up-time of a connected device estate. Next, read about using it to optimise utilisation of mobile pallets across a supply chain.