The Maintenance KPIs That Actually Matter (and the Vanity Metrics to Drop)

There's a maintenance dashboard out there with forty metrics on it. It's updated nightly, it's beautiful, and nobody has changed a single decision because of it. Forty numbers is the same as zero numbers if none of them moves an action.

Against that, picture five numbers that change what the crew does next week. That's the goal. This is an opinionated, deliberately short list — five KPIs that drive behavior, each with its formula, a healthy target, what it actually tells you, and how people game it. Plus the vanity metrics worth demoting.

The principle: measure what you can act on

A KPI earns its place by changing a decision. If a number goes up or down and nobody does anything differently, it's not a key performance indicator — it's trivia on a screen.

Two ideas keep the list honest. Leading vs. lagging: lagging metrics (like downtime) tell you what already happened; leading metrics (like schedule compliance) tell you whether you're doing the things that cause good outcomes. A good scorecard is weighted toward leading indicators you can still influence. And fewer is better: five metrics people act on beat forty they ignore. Curation is the work.

Where to start if five feels like too many

Five metrics is the destination, not the on-ramp. If you're starting from a forty-metric dashboard or no real metrics at all, trying to stand up all five at once usually means none of them gets defined well. Start with two, get them honest, then add the rest.

The two to start with are schedule compliance and reactive ratio, because together they tell you almost the whole story. Reactive ratio tells you how much of your capacity is still going to firefighting: the size of the problem. Schedule compliance tells you whether the proactive work you committed to is actually happening: whether the fix is taking hold. Watch those two move in opposite directions over a quarter (reactive ratio down, compliance up) and you have strong evidence the program is working, before you've built out wrench-time sampling or backlog measurement.

The reason to resist starting with five is that each metric needs its definition fought out — the on-time window for PM compliance, the sampling method for wrench time, the planned-versus-reactive classification rule. Do that carefully for two, build the habit of acting on them, and the remaining three slot in against an organization that already trusts its numbers. A scorecard people believe and act on, built two metrics at a time, beats five metrics imposed at once and quietly ignored.

The five that matter

Schedule compliance

Formula: scheduled work-hours completed as planned ÷ total scheduled work-hours, over the period. Healthy target: 90%+. What it reveals: whether your plan is real. High compliance means you commit to a week and execute it; low compliance means the schedule is fiction that reality overruns daily. How it gets gamed: schedule almost nothing, or only schedule reactive work you were going to do anyway, and compliance looks great. That's why it can't stand alone (see "Gaming, and how to prevent it" below). It only means something when measured against a frozen schedule of proactive, planned work; the distinction between planning the work and scheduling it is what gives this metric teeth.
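To make the formula concrete, here is a minimal Python sketch. The ScheduledJob record, its field names, and the sample week are illustrative assumptions, not the export format of any particular CMMS. The point to notice is that the denominator is the frozen schedule, not whatever happened to get done.

```python
from dataclasses import dataclass


@dataclass
class ScheduledJob:
    # Hypothetical record for one job on the frozen weekly schedule.
    job_id: str
    scheduled_hours: float
    completed_as_planned: bool  # finished in the scheduled week, by your own rule


def schedule_compliance(frozen_schedule: list[ScheduledJob]) -> float:
    """Scheduled hours completed as planned / total scheduled hours."""
    total = sum(job.scheduled_hours for job in frozen_schedule)
    if total == 0:
        # An empty schedule has no compliance to report; treat it as
        # undefined rather than as 100%.
        raise ValueError("no scheduled hours in the period")
    done = sum(
        job.scheduled_hours for job in frozen_schedule if job.completed_as_planned
    )
    return done / total


# Invented sample week: 20 scheduled hours, one 6-hour job bumped.
week = [
    ScheduledJob("PM-101", 8.0, True),
    ScheduledJob("PM-102", 4.0, True),
    ScheduledJob("CM-201", 6.0, False),
    ScheduledJob("PM-103", 2.0, True),
]
print(f"{schedule_compliance(week):.0%}")  # 14 of 20 scheduled hours -> 70%
```

Weighting by hours rather than by job count is a deliberate choice here: it follows the formula above and keeps a week of tiny completed jobs from hiding one large missed one.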

Wrench time (tool time)

Formula: hands-on-tools time ÷ total paid time, typically sampled rather than tracked continuously. Healthy target: there's no universal number, but planned work commonly runs roughly double the wrench time of reactive work; reactive often sits around 25–35%. What it reveals: how much of your paid labor actually touches equipment versus traveling, waiting, and hunting for parts. It's the clearest measure of the payoff from kitting and job plans. How it gets gamed: sampling bias and generous self-reporting. Use consistent sampling methodology, not honor-system logs.
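Because wrench time is sampled, the arithmetic is just a share of observations. The sketch below assumes a work-sampling study in which an observer records one activity label per random spot check; the labels and the counts are made up for illustration.

```python
from collections import Counter


def wrench_time(observations: list[str], hands_on_label: str = "tools") -> float:
    """Share of sampled observations in which the tech was hands-on-tools."""
    if not observations:
        raise ValueError("no observations sampled")
    return Counter(observations)[hands_on_label] / len(observations)


# 40 hypothetical spot checks across a reactive-heavy week.
samples = ["tools"] * 12 + ["travel"] * 9 + ["waiting"] * 10 + ["parts"] * 9

print(f"wrench time: {wrench_time(samples):.0%}")  # 12 of 40 -> 30%
```

Keeping the non-tools labels in the sample is what makes the number actionable: the same data shows whether the lost time is travel, waiting, or parts.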

PM compliance

Formula: PMs completed within their on-time window ÷ PMs scheduled, over the period. Healthy target: 90%+, with an honest on-time window. What it reveals: whether your proactive program is actually happening or quietly slipping. How it gets gamed: a sloppy on-time window ("done within 30 days of due" on a weekly PM) makes compliance look perfect while PMs run weeks late. Define the window tightly and report it alongside the percentage, or the number lies.
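A short Python sketch shows how much the window matters. The data is invented: ten weekly PMs with made-up completion delays. The on-time rule used here, "completed within window_days of the due date, early or late", is one reasonable assumption among several.

```python
from datetime import date, timedelta
from typing import Optional

PM = tuple[date, Optional[date]]  # (due date, completion date or None if never done)


def pm_compliance(pms: list[PM], window_days: int) -> float:
    """PMs completed within +/- window_days of due / PMs scheduled."""
    if not pms:
        raise ValueError("no PMs scheduled in the period")
    on_time = sum(
        1
        for due, done in pms
        if done is not None and abs((done - due).days) <= window_days
    )
    return on_time / len(pms)


# Ten hypothetical weekly PMs; values are days late (None = never completed).
days_late = [0, 1, 2, 0, 21, 14, 10, 0, 25, None]
first_due = date(2024, 1, 1)
pms: list[PM] = []
for week, late in enumerate(days_late):
    due = first_due + timedelta(weeks=week)
    done = None if late is None else due + timedelta(days=late)
    pms.append((due, done))

print(f"30-day window: {pm_compliance(pms, 30):.0%}")  # 9 of 10 -> 90%
print(f"2-day window:  {pm_compliance(pms, 2):.0%}")   # 5 of 10 -> 50%
```

Same crew, same work, and the number falls from 90% to 50% once the window fits a weekly PM. That is why the window belongs next to the percentage on the report.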

Reactive ratio

Formula: unplanned (reactive) labor hours ÷ total labor hours. Healthy target: trend down; world-class is often cited near 20% or below, and most mid-market shops start at 60–80%. What it reveals: how much of your capacity is consumed by firefighting — and, by proxy, how much of the hidden reactive premium you're still paying. How it gets gamed: reclassify reactive work as "planned" by writing a hasty work order after the fact. Pair it with planned/unplanned hour tracking and audit the classification.
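Here is a minimal sketch of the ratio and of one possible classification audit. The WorkOrder fields are hypothetical. The audit rule is also an assumption for illustration: an order labeled planned still counts as reactive if it was created less than audit_lead before work began. Your own rule may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class WorkOrder:
    # Hypothetical fields; map them to whatever your system records.
    labeled_planned: bool
    hours: float
    created: datetime
    work_started: datetime


def reactive_ratio(
    orders: list[WorkOrder], audit_lead: Optional[timedelta] = None
) -> float:
    """Unplanned labor hours / total labor hours.

    With audit_lead set, an order labeled planned still counts as reactive
    when it was created less than audit_lead before work started.
    """
    total = sum(o.hours for o in orders)
    if total == 0:
        raise ValueError("no labor hours in the period")

    def is_reactive(o: WorkOrder) -> bool:
        if not o.labeled_planned:
            return True
        return audit_lead is not None and (o.work_started - o.created) < audit_lead

    return sum(o.hours for o in orders if is_reactive(o)) / total


start = datetime(2024, 3, 4, 7, 0)
orders = [
    WorkOrder(False, 30.0, start, start),
    WorkOrder(True, 20.0, start - timedelta(days=7), start),
    WorkOrder(True, 10.0, start + timedelta(hours=2), start),  # written after the fact
    WorkOrder(False, 40.0, start, start),
]
print(f"as labeled: {reactive_ratio(orders):.0%}")                     # 70 of 100 -> 70%
print(f"audited:    {reactive_ratio(orders, timedelta(days=1)):.0%}")  # 80 of 100 -> 80%
```

The gap between the labeled and audited figures is itself worth watching: if it grows, reclassification is doing the work that planning should be doing.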

Backlog weeks (ready backlog)

Formula: ready (planned) backlog hours ÷ weekly crew capacity hours. Healthy target: roughly 4–6 weeks of ready work per crew. What it reveals: whether you have enough planned inventory to build full weeks. Too low and you'll run reactive when the schedule hiccups; too high and work is aging faster than you can do it. (The full logic is in healthy maintenance backlog.) How it gets gamed: count total backlog instead of ready backlog, or pad with low-value work to look loaded. Measure ready backlog specifically.
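The difference between total and ready backlog is easy to see with numbers. The sketch below uses invented hours and a hypothetical three-person crew. The 4 to 6 week band comes from the target above, while the status labels are just illustrative names.

```python
def backlog_weeks(backlog_hours: float, weekly_capacity_hours: float) -> float:
    """Backlog hours / weekly crew capacity hours."""
    if weekly_capacity_hours <= 0:
        raise ValueError("weekly capacity must be positive")
    return backlog_hours / weekly_capacity_hours


def backlog_status(weeks: float, low: float = 4.0, high: float = 6.0) -> str:
    """Classify backlog weeks against the target band."""
    if weeks < low:
        return "thin"
    if weeks > high:
        return "aging"
    return "healthy"


capacity = 120.0       # hypothetical: 3 techs x 40 schedulable hours
total_backlog = 600.0  # every open work order
ready_backlog = 100.0  # planned and ready to schedule

total_weeks = backlog_weeks(total_backlog, capacity)
ready_weeks = backlog_weeks(ready_backlog, capacity)
print(f"total: {total_weeks:.1f} weeks ({backlog_status(total_weeks)})")  # 5.0 weeks (healthy)
print(f"ready: {ready_weeks:.1f} weeks ({backlog_status(ready_weeks)})")  # 0.8 weeks (thin)
```

Only the second line describes what the scheduler can actually draw from next week.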

Reading the scorecard together: a worked example

The five matter most when read as a set, because each one catches the others lying. Here's what that looks like in practice.

Suppose schedule compliance reads a healthy 92%. On its own, good news. Now check it against reactive ratio, which reads 70% — meaning seven of ten labor hours are still firefighting. Those two numbers together tell a different story than either alone: you're compliant with a schedule that barely contains any proactive work. The 92% is real but nearly meaningless, because you're scheduling little and executing that little faithfully. The pairing exposed what the headline number hid.

Take another: PM compliance reads 95%, which looks excellent — until you check the on-time window and find it's "within 30 days of due" on PMs that run weekly. With that window, a PM done three weeks late still counts as on-time, so the 95% is measuring almost nothing. Tighten the window to something honest and the number might drop to 60%, which is the truth you needed.

One more: backlog weeks reads a comfortable five weeks — but it's counting total backlog, and the ready slice is under one week. Schedule from that and you'll run dry within days. The healthy-looking headline masked an empty reservoir.

In each case, no single number was a lie exactly; it was incomplete, and the paired metric supplied the missing context. That's the whole argument for a small, deliberately cross-checking set over a large independent one.
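The three cross-checks above are simple enough to automate. The sketch below is one way to do it. The dictionary keys and every threshold (0.90, 0.50, and the 4-week floor) are assumptions chosen to mirror the examples, not industry standards.

```python
def cross_check(card: dict) -> list[str]:
    """Return warnings where a healthy-looking headline is contradicted by its pair."""
    warnings = []
    if card["schedule_compliance"] >= 0.90 and card["reactive_ratio"] >= 0.50:
        warnings.append(
            "schedule compliance is high but most hours are reactive: "
            "the schedule holds little proactive work"
        )
    if card["pm_window_days"] >= card["pm_interval_days"]:
        warnings.append(
            "PM on-time window is at least as long as the PM interval: "
            "PM compliance measures almost nothing"
        )
    if card["total_backlog_weeks"] >= 4.0 and card["ready_backlog_weeks"] < 4.0:
        warnings.append(
            "total backlog looks healthy but ready backlog is short"
        )
    return warnings


# The three scenarios from the worked example, combined into one scorecard.
scorecard = {
    "schedule_compliance": 0.92,
    "reactive_ratio": 0.70,
    "pm_compliance": 0.95,
    "pm_window_days": 30,
    "pm_interval_days": 7,
    "total_backlog_weeks": 5.0,
    "ready_backlog_weeks": 0.8,
}
for warning in cross_check(scorecard):
    print("CHECK:", warning)
```

On this scorecard all three checks fire, even though every headline number would pass on its own.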

Two supporting metrics that make the five trustworthy

Two more belong on the planner's scorecard — not headline KPIs, but the checks that keep the top five honest:

Planned vs. unplanned hours — the raw split underneath the reactive ratio. Tracking it directly makes reclassification gaming visible.
Estimate accuracy — actual job duration vs. estimated. It validates that your planning is real and improving, and it's the natural partner to schedule compliance.
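Estimate accuracy can be computed several ways. One simple option, assumed here for illustration, is the share of jobs whose actual hours land within a tolerance of the estimate. The 20% tolerance and the job data are made up.

```python
def estimate_accuracy(
    jobs: list[tuple[float, float]], tolerance: float = 0.20
) -> float:
    """Share of jobs whose actual hours fall within +/- tolerance of the estimate.

    jobs holds (estimated_hours, actual_hours) pairs.
    """
    if not jobs:
        raise ValueError("no completed jobs to compare")
    within = sum(1 for est, act in jobs if abs(act - est) <= tolerance * est)
    return within / len(jobs)


# Hypothetical week of completed jobs: (estimated hours, actual hours).
jobs = [(4.0, 4.5), (8.0, 12.0), (2.0, 2.0), (6.0, 5.0), (3.0, 6.0)]
print(f"{estimate_accuracy(jobs):.0%}")  # 3 of 5 within 20% -> 60%
```

A share-within-tolerance measure is harder to flatter than an average ratio, because overruns and underruns cannot cancel each other out.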

Vanity metrics to demote

These get reported constantly and change almost nothing:

Raw work-order count: volume isn't performance. Closing more tickets can mean more failures, not better maintenance.
MTTR in isolation: mean time to repair without context can improve simply because you got more practice fixing the same recurring failure. Faster repairs of frequent breakdowns are not a win.
"PMs completed" without a compliance window: a count with no on-time standard hides drift.
Uptime with no context: useful at the plant level, but on its own it doesn't tell you whether maintenance or luck produced it.

Demote, don't necessarily delete — some have a place as supporting context. Just don't let them headline.

The deeper problem with vanity metrics isn't that they're wrong; it's that they're comfortable. Raw work-order count goes up and feels like productivity. Uptime is high this month and feels like success. They reward you for activity without telling you whether the activity is the right kind, and because they rarely deliver bad news, they're the metrics that survive on dashboards long after the useful ones get cut. The discipline is to ask of every number on the screen: when this moves, does anyone do something different? If the answer is no, it's decoration, and decoration crowds out the five numbers that actually drive the week.

Gaming, and how to prevent it

Every metric invites gaming; that's not cynicism, it's a design constraint. The defense is to pair metrics so they check each other. Schedule compliance pairs with estimate accuracy (you can't fake both). Reactive ratio pairs with planned/unplanned hours (the split exposes reclassification). PM compliance pairs with its on-time window (the window exposes drift). A single metric can be gamed; a paired set is much harder to fake without the deception showing up somewhere.

Reporting cadence: match the metric to the audience

The same five numbers don't go to everyone at the same frequency:

Daily, to the crew: today's schedule and yesterday's compliance — operational, actionable now.
Weekly, to the planner: the full five, plus the two supporting metrics — the working scorecard.
Monthly or quarterly, to leadership: trends — reactive ratio falling, compliance climbing, backlog stable — the story, not the daily noise.

Match the metric to the decision the audience makes, and the report gets read.
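If the reports are generated rather than assembled by hand, the cadence can live in a small piece of configuration. The sketch below is one hypothetical layout; the audience keys and the structure are assumptions, and the contents simply restate the list above.

```python
HEADLINE = [
    "schedule compliance",
    "wrench time",
    "PM compliance",
    "reactive ratio",
    "backlog weeks",
]
SUPPORTING = ["planned vs. unplanned hours", "estimate accuracy"]

# Hypothetical report definitions: who gets what, and how often.
REPORTS = {
    "crew": {
        "cadence": "daily",
        "content": ["today's schedule", "yesterday's compliance"],
    },
    "planner": {
        "cadence": "weekly",
        "content": HEADLINE + SUPPORTING,
    },
    "leadership": {
        "cadence": "monthly or quarterly",
        "content": ["reactive ratio trend", "compliance trend", "backlog trend"],
    },
}


def report_for(audience: str) -> dict:
    """Look up the report definition for one audience."""
    if audience not in REPORTS:
        raise ValueError(f"unknown audience: {audience!r}")
    return REPORTS[audience]


print(report_for("planner")["cadence"])       # weekly
print(len(report_for("planner")["content"]))  # 7: the five plus two supporting
```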

The takeaway

Pick five KPIs, define each one precisely (especially the windows and the planned-vs-reactive classification), pair them so they police each other, and report them on a cadence matched to the audience. Five well-chosen, paired metrics that change decisions beat a forty-metric dashboard that changes nothing.
