Preventive Maintenance Optimization: Doing Fewer PMs, Better

A plant ran 1,200 PMs a month and still got blindsided by failures. The leadership instinct was to add more PMs. But when someone finally audited the program, roughly half the PMs were guarding against failures that didn't occur the way the PM assumed: inspecting components that don't fail, lubricating on a calendar schedule the equipment didn't need, and in a few cases physically disturbing healthy equipment often enough to cause the failures they were meant to prevent.

More PMs is not better maintenance. The goal of PM optimization is frequently fewer PMs — sharper ones, aimed at failures that actually occur. This piece is an argument against the "PM equals good, more PM equals better" reflex, and a practical way to run a PM optimization.

The over-maintenance trap

The reflex feels safe: if a little maintenance is good, more must be safer. Reliability logic says otherwise, in three ways.

First, calendar PMs can induce failure. Every intrusive intervention — opening a housing, replacing a healthy part, re-seating a connector — reintroduces infant-mortality risk. A bearing that was running fine can fail early because you disturbed it. Over-maintenance isn't neutral; it can manufacture the breakdowns it's meant to prevent.

Second, many PMs inspect things that don't fail. Effort spent checking components with no meaningful failure mode is pure waste — hours consumed for zero risk reduction.

Third, and most expensively, those hours have an opportunity cost. Every hour on a low-value PM is an hour not spent on a critical asset, a condition-based check, or planned corrective work. A bloated PM program doesn't just waste effort; it starves the work that would actually move reliability.

What PM optimization actually is

PM optimization (PMO) is reviewing each PM against the failure modes it's meant to catch, and deciding its fate: keep it as is, modify it (task or interval), kill it, or replace it with a condition-based task. It's curation, not addition.

PMO is not a one-time purge, either — it's a discipline you apply when building a program and revisit as you learn. It sits a step below full RCM in rigor and is the pragmatic way most mid-market teams should start tightening a PM program they already have.

The four questions for every PM

For each PM in your library, ask four things in order. If any answer is weak, the PM is a candidate to modify, kill, or replace.

1. What failure does this prevent?

If you can't name the specific failure mode the task addresses, that's a red flag. A PM that exists "because we've always done it" with no identified failure mode is the first thing to question.

2. Does that failure actually occur?

Check the history. If the failure mode this PM guards against has never happened on this asset class — and isn't a catastrophic-but-rare event you're deliberately insuring against — the PM may be guarding an empty doorway.

3. Is the task effective against it?

Even for a real failure, does this task actually catch or prevent it? A visual inspection won't detect an internal degradation that only vibration analysis would. An ineffective task is wasted effort dressed as diligence.

4. Is the interval right?

Too frequent wastes hours and risks inducing failure; too infrequent misses the failure it's meant to catch. The right interval comes from failure logic, not habit — which brings us to the P-F interval.

A worked review: three PMs, three fates

The four questions are easier to trust when you see them sort real PMs. Take three from a typical library.

A monthly PM to inspect and re-grease a motor bearing. Question one: what failure does it prevent? Bearing seizure from inadequate lubrication, which is a real, named mode. Question two: does it occur? History shows two lubrication-related bearing failures on this motor class in three years, so yes, occasionally. Question three: is the task effective? Greasing does address it. Question four: is monthly right? Based on how much the motor actually runs, the bearing's re-grease interval is closer to quarterly, and monthly re-greasing risks over-greasing, which blows seals and causes failures. Verdict: modify, stretching the interval to quarterly. The PM was real but mistimed, and the over-frequency was actively harmful.

A quarterly PM to open a gearbox and inspect internal gears on a non-critical conveyor. Question one: prevents gear wear failure. Question two: zero such failures on record for this asset in its life. Question three: opening the housing every quarter introduces contamination and re-assembly risk each time. Verdict: kill — it's guarding an empty doorway while actively introducing risk through intrusion.

A weekly PM to visually check a critical pump's seal for leakage. Question one: prevents seal failure and product loss. Question two: occurs, and the consequence is high — it's a bottleneck. Question three: visual catches late-stage leakage but misses early degradation. Question four: weekly is reasonable, but the task is weak. Verdict: replace the visual check with a condition-based vibration or seal-monitoring task that catches the failure earlier in its P-F window.

Three PMs, three different fates — modify, kill, replace — and only by running the questions did the right call emerge. None of the three should simply have been kept as-is, and none should have been answered with "add more PMs."
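To make the sorting logic concrete, here is a minimal Python sketch of the four questions applied to the three PMs above. The class and field names (PMReview, occurs, insured_rare_event, and so on) are illustrative assumptions, not part of any CMMS or standard, and mapping each weak answer straight to a verdict is a simplification: in a real review a weak answer makes the PM a candidate, and people make the final call.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    KEEP = "keep"
    MODIFY = "modify"
    KILL = "kill"
    REPLACE = "replace"


@dataclass
class PMReview:
    """Answers to the four questions for one PM (field names are hypothetical)."""
    name: str
    failure_mode: Optional[str]   # Q1: the named failure mode, or None if nobody can name one
    occurs: bool                  # Q2: does the history show this failure on the asset class?
    insured_rare_event: bool      # Q2: catastrophic-but-rare mode you deliberately insure against
    task_effective: bool          # Q3: does the task actually catch or prevent the failure?
    interval_ok: bool             # Q4: does the interval follow failure logic rather than habit?
    safety_or_compliance: bool = False  # guardrail: these PMs are not cut by this triage


def triage(pm: PMReview) -> Verdict:
    """Ask the four questions in order; the first weak answer decides the verdict."""
    if pm.safety_or_compliance:
        return Verdict.KEEP       # needs proper analysis, not a quick triage
    if pm.failure_mode is None:
        return Verdict.KILL       # Q1: no identifiable failure mode
    if not pm.occurs and not pm.insured_rare_event:
        return Verdict.KILL       # Q2: guarding an empty doorway
    if not pm.task_effective:
        return Verdict.REPLACE    # Q3: real failure, wrong task
    if not pm.interval_ok:
        return Verdict.MODIFY     # Q4: right task, wrong timing
    return Verdict.KEEP


reviews = [
    PMReview(name="Monthly motor bearing re-grease",
             failure_mode="bearing seizure from inadequate lubrication",
             occurs=True, insured_rare_event=False,
             task_effective=True, interval_ok=False),
    PMReview(name="Quarterly gearbox open-and-inspect",
             failure_mode="gear wear",
             occurs=False, insured_rare_event=False,
             task_effective=False, interval_ok=False),
    PMReview(name="Weekly visual pump seal check",
             failure_mode="seal failure and product loss",
             occurs=True, insured_rare_event=False,
             task_effective=False, interval_ok=True),
]

for review in reviews:
    print(f"{review.name}: {triage(review).value}")
```

Run as written, the three reviews come out as modify, kill, and replace, matching the verdicts in the worked review. The order of the checks matters: a PM with no real failure behind it is killed before anyone spends time debating its interval.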

Interval logic and the P-F interval

There are three broad interval strategies. Fixed-interval (every 30 days) is simplest but blind to actual condition. Usage-based (every 500 run-hours) ties maintenance to wear, which is often more honest. Condition-based (do it when a measured indicator says so) is the most efficient where you can sense the condition.
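As a rough illustration, the three strategies reduce to three different "is this PM due?" checks. The function names, the run-hour figure, and the vibration threshold below are hypothetical values chosen for the example; only the 30-day and 500-run-hour intervals come from the text.

```python
from datetime import date, timedelta


def due_fixed(last_done: date, today: date, every_days: int) -> bool:
    """Fixed-interval: due once the calendar says so, regardless of condition."""
    return today - last_done >= timedelta(days=every_days)


def due_usage(run_hours_since_last_pm: float, every_run_hours: float) -> bool:
    """Usage-based: due once the asset has accumulated enough run time."""
    return run_hours_since_last_pm >= every_run_hours


def due_condition(reading: float, alert_threshold: float) -> bool:
    """Condition-based: due once a measured indicator crosses its alert level."""
    return reading >= alert_threshold


# A lightly used motor, 30 days after its last PM (all readings are made up).
calendar_says = due_fixed(date(2024, 1, 1), date(2024, 1, 31), every_days=30)
usage_says = due_usage(run_hours_since_last_pm=120.0, every_run_hours=500.0)
condition_says = due_condition(reading=2.1, alert_threshold=4.5)

print(calendar_says, usage_says, condition_says)
```

For this lightly used motor, the calendar rule triggers a PM while the usage and condition rules do not, which is the sense in which a fixed interval is blind to actual condition.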

The concept that ties them together is the P-F interval — the window between the point a failure becomes detectable (P, potential failure) and the point it becomes a functional failure (F). Your inspection interval needs to be shorter than the P-F interval, or you'll inspect right past the warning. And here's the counterintuitive part: inspecting more often than necessary within that window doesn't make you safer — it just burns hours. "More often" is not a free safety upgrade; it has a real cost and sometimes a real risk.
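The arithmetic behind this can be sketched in a few lines. If inspections happen every T days and the P-F interval is W days, the worst case is that P occurs just after an inspection, so the warning is first seen T days later and only W minus T days remain before functional failure. The function names, the response_days parameter (time needed to plan and do the repair), and all the numbers below are assumptions for illustration, not figures from a real asset.

```python
def worst_case_warning_days(pf_days: float, inspect_days: float) -> float:
    """Days left before functional failure if P occurs right after an inspection."""
    return pf_days - inspect_days


def assess_interval(pf_days: float, inspect_days: float,
                    response_days: float = 0.0) -> str:
    """Classify an inspection interval against the P-F interval."""
    warning = worst_case_warning_days(pf_days, inspect_days)
    if warning <= 0:
        # Interval is not shorter than P-F: an inspection can miss the warning.
        return "misses the warning"
    if warning < response_days:
        # Detected, but not early enough to plan and complete the repair.
        return "detects too late to respond"
    return "adequate"


def annual_inspection_hours(inspect_days: float, hours_per_inspection: float) -> float:
    """Labor cost of the inspection route over a year."""
    return 365.0 / inspect_days * hours_per_inspection


# Hypothetical asset: 60-day P-F interval, 14 days needed to respond,
# 1.5 labor hours per inspection.
for interval in (90, 30, 7):
    status = assess_interval(pf_days=60, inspect_days=interval, response_days=14)
    hours = annual_inspection_hours(interval, hours_per_inspection=1.5)
    print(f"every {interval} days: {status}, about {hours:.0f} h/year")
```

In this example both the 30-day and 7-day intervals are classed as adequate, but the 7-day route costs roughly four times the labor. Once the interval leaves enough warning to respond, shortening it further buys hours of work rather than safety.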

Reallocating the freed hours

The point of cutting low-value PMs isn't to bank the savings — it's to redeploy them. Move the recovered capacity to where it actually reduces risk: critical assets that deserve more attention, condition-monitoring on equipment where it pays, and planned corrective work that clears the backlog. PMO that just deletes PMs and pockets the hours misses half the value. PMO that reallocates them turns a bloated program into a focused one.

How to know the optimization worked

Cutting PMs feels risky, so it's worth being explicit about how you confirm an optimization helped rather than hurt — because "we did fewer PMs and nothing broke" isn't proof on its own.

Watch three things over the cycle following the changes. First, the failures the killed PMs were guarding against: a killed PM is a hypothesis that the failure won't appear, so track whether that specific failure mode starts showing up in work-order history. If it stays absent through a full cycle, the kill was right; if it returns, you reinstate a sharper version. Second, the reallocated hours: confirm the time you freed actually moved to critical assets and condition-based work rather than quietly evaporating into more reactive firefighting. PMO that frees hours and loses track of them captures only half the value. Third, the overall reactive ratio and PM compliance together: a healthy optimization should hold or improve the reactive ratio while raising PM compliance, because you're now running a smaller set of PMs you can actually complete on time.

The pairing matters. If reactive ratio climbs after a round of cuts, you cut something real and need to walk it back. If PM compliance jumps while reactive ratio holds, you trimmed waste correctly: the program got lighter without getting weaker. Measuring this way turns PMO from a one-time gamble into a controlled, reversible discipline you can run with confidence, because every cut is a hypothesis you've committed to checking rather than a permanent bet.
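One way to read the two metrics together is sketched below. The definitions are assumptions, since the article does not define them: reactive ratio is taken as reactive hours divided by total maintenance hours, and PM compliance as PMs completed on time divided by PMs scheduled. The tolerance that separates a real change from noise is also a hypothetical parameter, and the before/after figures are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class PeriodMetrics:
    """Maintenance totals for one review cycle (field names are illustrative)."""
    reactive_hours: float
    total_maintenance_hours: float
    pms_completed_on_time: int
    pms_scheduled: int

    @property
    def reactive_ratio(self) -> float:
        return self.reactive_hours / self.total_maintenance_hours

    @property
    def pm_compliance(self) -> float:
        return self.pms_completed_on_time / self.pms_scheduled


def read_outcome(before: PeriodMetrics, after: PeriodMetrics,
                 tolerance: float = 0.02) -> str:
    """Interpret reactive ratio and PM compliance as a pair."""
    if after.reactive_ratio > before.reactive_ratio + tolerance:
        return "walk back: reactive work climbed, a cut removed real protection"
    if after.pm_compliance > before.pm_compliance + tolerance:
        return "trimmed waste: compliance rose while reactive ratio held"
    return "inconclusive: keep watching another cycle"


# Invented example cycle: fewer PMs scheduled, more of them finished on time.
before = PeriodMetrics(reactive_hours=400, total_maintenance_hours=1000,
                       pms_completed_on_time=780, pms_scheduled=1200)
after = PeriodMetrics(reactive_hours=380, total_maintenance_hours=1000,
                      pms_completed_on_time=630, pms_scheduled=700)

print(read_outcome(before, after))
```

Checking the reactive ratio first is deliberate: a rise in compliance means little if breakdown work rose at the same time.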

A lightweight PMO sprint

You don't need a year-long initiative. Run a focused sprint:

1. Pick a starting set: your highest-PM-hour assets, or your highest-failure assets. Don't start with the whole plant.
2. Pull the data: failure history, current PM list, hours consumed. (Clean asset data makes this far easier; messy data makes it guesswork.)
3. Run the four questions on each PM in the set, in a cross-functional review.
4. Decide and document: keep, modify, kill, or replace, with the rationale recorded.
5. Reallocate the freed hours deliberately.

A cross-functional review matters: pull in the techs who do the work and know what actually fails, not just the planner and the OEM manual.
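The bookkeeping side of the sprint can be sketched as follows: ranking assets to pick the starting set, and recording each decision with a mandatory rationale so the freed hours can be totaled. The record layouts, asset IDs, and hour figures are hypothetical; a real sprint would pull these from your CMMS export in whatever shape it provides.

```python
from dataclasses import dataclass
from typing import List

VERDICTS = {"keep", "modify", "kill", "replace"}


@dataclass
class AssetSummary:
    asset_id: str
    pm_hours_per_year: float
    failures_per_year: int


@dataclass
class Decision:
    pm_name: str
    verdict: str
    rationale: str
    hours_freed_per_year: float = 0.0

    def __post_init__(self) -> None:
        if self.verdict not in VERDICTS:
            raise ValueError(f"unknown verdict: {self.verdict!r}")
        if not self.rationale.strip():
            # Every decision must be explainable to a future audit.
            raise ValueError("a rationale is required for every decision")


def starting_set(assets: List[AssetSummary], by: str = "pm_hours_per_year",
                 top_n: int = 5) -> List[AssetSummary]:
    """Pick the top_n assets by PM hours or by failure count."""
    if by not in ("pm_hours_per_year", "failures_per_year"):
        raise ValueError(f"cannot rank by {by!r}")
    return sorted(assets, key=lambda a: getattr(a, by), reverse=True)[:top_n]


def total_hours_freed(decisions: List[Decision]) -> float:
    """Hours per year available to reallocate after the sprint."""
    return sum(d.hours_freed_per_year for d in decisions)


# Invented data for illustration only.
assets = [
    AssetSummary("CONV-01", pm_hours_per_year=40, failures_per_year=0),
    AssetSummary("PUMP-07", pm_hours_per_year=95, failures_per_year=3),
    AssetSummary("MTR-12", pm_hours_per_year=60, failures_per_year=1),
]
decisions = [
    Decision("Monthly bearing re-grease", "modify",
             "Run-hours support a quarterly interval; monthly risks over-greasing",
             hours_freed_per_year=4.0),
    Decision("Quarterly gearbox inspection", "kill",
             "No gear wear failures on record; intrusion adds contamination risk",
             hours_freed_per_year=8.0),
]

print([a.asset_id for a in starting_set(assets, top_n=2)])
print(total_hours_freed(decisions))
```

Rejecting a decision with an empty rationale is a small design choice that enforces the "decide and document" step at the point of entry rather than relying on discipline later.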

Guardrails

Optimization is not an excuse to gut the program. Three guardrails:

Never cut PMs on safety- or compliance-critical assets without proper analysis. Regulatory and safety PMs stay, full stop.
Document the rationale for every kill and modify, so a future audit (or a future you) understands the decision.
Review after a cycle. A killed PM is a hypothesis; confirm the failure it guarded against didn't start appearing once you stopped.
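The third guardrail, treating a killed PM as a hypothesis, can be checked mechanically against work-order history. The sketch below assumes work orders carry an open date, an asset class, and a failure-mode code; those field names, the review window, and the sample dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class WorkOrder:
    opened: date
    asset_class: str
    failure_mode: str


@dataclass
class KilledPM:
    name: str
    asset_class: str
    failure_mode: str      # the failure the PM was guarding against
    killed_on: date
    review_after_days: int  # length of one review cycle


def review_kill(pm: KilledPM, work_orders: List[WorkOrder], today: date) -> str:
    """Check whether the guarded failure mode has appeared since the PM was killed."""
    recurrences = [
        wo for wo in work_orders
        if wo.asset_class == pm.asset_class
        and wo.failure_mode == pm.failure_mode
        and wo.opened >= pm.killed_on
    ]
    if recurrences:
        return "reinstate a sharper version"
    if today < pm.killed_on + timedelta(days=pm.review_after_days):
        return "keep watching"   # a full cycle has not elapsed yet
    return "kill confirmed"


# Invented example: gearbox inspection killed on 1 January, reviewed after a year.
gearbox_pm = KilledPM("Quarterly gearbox inspection", "conveyor gearbox",
                      "gear wear", killed_on=date(2024, 1, 1), review_after_days=365)
history = [
    WorkOrder(date(2023, 6, 10), "conveyor gearbox", "gear wear"),   # before the kill
    WorkOrder(date(2024, 3, 2), "conveyor gearbox", "seal leak"),    # different mode
]

print(review_kill(gearbox_pm, history, today=date(2025, 1, 15)))
```

Only failures of the same mode on the same asset class, opened after the kill date, count against the decision. An unrelated breakdown on the same asset does not reopen it.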

The takeaway

A good PM program is curated, not maximal. More PMs isn't better maintenance — it's often worse, wasting hours and sometimes inducing the failures it's meant to prevent. Match each task to a real, addressable failure mode at the right interval, kill the rest, and redeploy the freed hours where failures actually happen.

For the higher-rigor version of this logic applied to your most critical assets, see reliability-centered maintenance without the big consulting project.
