
Unfamiliar patterns carry 3.3× the failures.

We ran NASA's C-MAPSS benchmark through Coherany's standard pipeline. No labels, no custom code. Operating windows that matched no discovered pattern (the anomaly bucket) were 3.3× as likely to be in a critical state as windows that matched one.

Critical premium: 3.3×
Patterns found: 140
Operating windows: 19,731
Labels required: 0
Dataset: NASA C-MAPSS Turbofan Dataset
Source: NASA Prognostics Data Repository
Published: April 2026

Disclosure: Benchmark study on public NASA Prognostics Data Repository data. No customer data was used. Reviewed by Drew Wasem, Founder, Coherany. Methodology available on request.

THE METHODOLOGY

We asked "how far has it drifted from baseline?", not "is it over the limit?"

Every operations leader has had this conversation. An engine fails mid-mission. The post-mortem takes weeks. The customer wants to know why nobody saw it coming. The data was there. Every threshold was monitored. No alarm fired until the cycle in which the engine failed.

Threshold-based maintenance asks one sensor at a time whether its current reading crosses a fixed limit. It catches catastrophic excursions. It misses the drift that precedes most real failures, because the engine that fails next week is producing readings within spec today.

We took a different approach. Every engine has its own starting signature: manufacturing tolerances, component variation, break-in wear. Instead of comparing each reading against a fleet-wide threshold, we compared every sensor against that specific engine's own baseline. Drift-from-baseline is where the failure signal lives.

THE DATA

NASA's benchmark. The most-cited dataset in predictive maintenance.

Engines: 100
Sensors each: 21
Operating windows: 19,731
Avg lifetime (cycles): 206

THE FINDINGS

Drift from baseline beats absolute readings. Every time.

We ran the same pipeline we ran on the wildfire weather data. Same approach. Different column names. That was the only change. Three findings emerged that would have been invisible to any single-sensor alert.

01. Every engine has its own baseline

Manufacturing tolerances, component lot variation, break-in wear: these mean every engine leaves the factory slightly different. The first twenty cycles capture that individual signature. After that, the question isn't what the sensor reads but how far the sensor has drifted from where this engine started. That drift is where the failure signal lives.
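
The idea can be sketched in a few lines. This is a minimal illustration, not Coherany's pipeline: the function name `drift_from_baseline`, the choice to express drift in units of the baseline's standard deviation, the fixed limit, and the two sample engines are all assumptions made for the example. Only the twenty-cycle baseline comes from the text.

```python
from statistics import mean, pstdev

BASELINE_CYCLES = 20  # the "first twenty cycles" from the text

def drift_from_baseline(readings, baseline_cycles=BASELINE_CYCLES):
    """Drift of one sensor on one engine, in units of that engine's baseline spread."""
    baseline = readings[:baseline_cycles]
    center = mean(baseline)
    spread = pstdev(baseline) or 1.0  # assumption: fall back to 1.0 if the baseline is flat
    return [(r - center) / spread for r in readings]

# Hypothetical readings, not C-MAPSS values. Both engines stay under the same fixed limit.
FIXED_LIMIT = 650.0
engine_a = [641.0 + 0.1 * (i % 3) for i in range(60)]                          # steady
engine_b = [638.0 + 0.1 * (i % 3) + 0.05 * max(0, i - 20) for i in range(60)]  # creeping upward

drift_a = drift_from_baseline(engine_a)
drift_b = drift_from_baseline(engine_b)

for name, series, drift in (("engine_a", engine_a, drift_a), ("engine_b", engine_b, drift_b)):
    print(name, "over limit:", max(series) > FIXED_LIMIT, "final drift:", round(drift[-1], 1))
```

In this toy case engine_b never reads as high as engine_a, so a fleet-wide limit would rank it as the healthier engine. Measured against its own starting signature, its drift score is far larger.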

02. 140 distinct operating patterns, discovered without labels

Across nearly twenty thousand operating windows, the pipeline surfaced 140 distinct patterns of engine behavior. No prior knowledge of which sensors mattered. No notion of which combinations preceded failure. Just patterns that emerged from the data itself, and an approval step where a human engineer can confirm, reject, or annotate each one.
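
The study does not name the clustering method, so the sketch below swaps in a deliberately simple one, leader clustering, to show the shape of label-free discovery followed by human review. Everything here is an assumption for illustration: the function names, the `radius` and `min_size` parameters, the status values, and the two-sensor toy windows.

```python
from math import dist

def discover_patterns(windows, radius, min_size):
    """Leader clustering: a window joins the first pattern whose leader is within
    `radius`; otherwise it founds a new group. Groups smaller than `min_size`
    are not promoted to patterns."""
    groups = []
    for idx, window in enumerate(windows):
        for group in groups:
            if dist(window, group["leader"]) <= radius:
                group["members"].append(idx)
                break
        else:
            groups.append({"leader": window, "members": [idx]})
    return [dict(g, status="pending_review", note="")
            for g in groups if len(g["members"]) >= min_size]

def review(pattern, decision, note=""):
    """Record an engineer's decision on one discovered pattern."""
    if decision not in {"confirmed", "rejected"}:
        raise ValueError("decision must be 'confirmed' or 'rejected'")
    pattern["status"] = decision
    pattern["note"] = note
    return pattern

# Hypothetical drift vectors (two sensors per window), not benchmark data.
windows = [(0.1, 0.0), (0.2, 0.1), (0.0, 0.1),
           (3.0, 3.1), (3.1, 2.9), (2.9, 3.0),
           (9.0, -4.0)]

patterns = discover_patterns(windows, radius=0.5, min_size=2)
review(patterns[1], "confirmed", note="hypothetical annotation")
print(len(patterns), [p["members"] for p in patterns])
```

No label is consulted at any point. The lone window at (9.0, -4.0) founds a group of one, falls below `min_size`, and so belongs to no pattern.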

03. The anomaly bucket is where the trouble lives

About 30% of operating windows did not fit any of the 140 discovered patterns. That anomaly bucket turned out to be the most important finding. Windows the system had never seen before were 3.3× more likely to be in a critical state than windows that matched a known pattern. Unfamiliar is the signal.
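
One way to picture the comparison: assign each window to a known pattern or to the anomaly bucket, then use critical-state flags only afterwards, to score the two buckets. The sketch below assumes a distance-to-leader matching rule and uses toy numbers chosen for easy arithmetic. The helper names, the `radius`, and the data are hypothetical, and the study does not state how "critical" was defined.

```python
from math import dist

def bucket_of(window, leaders, radius):
    """Index of the first matching pattern, or None for the anomaly bucket."""
    for i, leader in enumerate(leaders):
        if dist(window, leader) <= radius:
            return i
    return None

def critical_premium(windows, is_critical, leaders, radius):
    """Critical-state rate in the anomaly bucket divided by the rate in matched windows.
    Assumes both buckets are non-empty and the clustered rate is non-zero."""
    counts = {"anomaly": [0, 0], "clustered": [0, 0]}  # [critical, total]
    for window, critical in zip(windows, is_critical):
        key = "anomaly" if bucket_of(window, leaders, radius) is None else "clustered"
        counts[key][0] += int(critical)
        counts[key][1] += 1
    rates = {key: crit / total for key, (crit, total) in counts.items()}
    return rates["anomaly"] / rates["clustered"], rates

# Toy data: 8 windows near a known pattern, 4 that match nothing.
leaders = [(0.0, 0.0), (3.0, 3.0)]
windows = [(0.1, 0.0), (0.0, 0.2), (-0.1, 0.1), (0.2, -0.1),
           (3.0, 3.1), (3.1, 2.9), (2.9, 3.0), (3.2, 3.0),
           (6.0, -2.0), (-4.0, 5.0), (1.5, 1.5), (8.0, 8.0)]
# Flags are used for evaluation only, never for discovery.
is_critical = [False] * 7 + [True] + [True, True, False, False]

premium, rates = critical_premium(windows, is_critical, leaders, radius=0.5)
print(rates, premium)
```

With these toy numbers, one of eight matched windows and two of four unmatched windows are critical, so the premium is 4.0. The benchmark's own figure is the 3.3× reported in this study.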

THE RESULT

Inspect the anomaly bucket first. Everything else can wait.

Critical premium in the anomaly bucket: 3.3×

The clustered bucket (operating windows that matched a known pattern) contained 1.8% critical states. The anomaly bucket (windows the system flagged as unfamiliar) contained 5.9%. That is a 3.3× concentration of critical states (5.9 / 1.8 ≈ 3.3), produced without a single labeled training example.

For an MRO running 500 engines, with one operating window per engine per week: if 30% of those windows fall into the anomaly bucket, the 150 engine-windows in it carry about 3.3 times the concentration of critical states found in the other 350. That is a prioritized inspection queue. It did not exist before the pipeline ran.
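
The arithmetic behind that queue is short enough to write out. The sketch below assumes the benchmark's two rates carry over unchanged to a real fleet, which is an assumption rather than a result, and the function and key names are illustrative.

```python
ANOMALY_RATE = 0.059    # critical share in the anomaly bucket (from the study)
CLUSTERED_RATE = 0.018  # critical share in the clustered bucket (from the study)

def inspection_queue(fleet_windows, anomaly_share):
    """Split a week's windows into the two buckets and estimate critical counts.
    Assumes the benchmark rates apply as-is, which real data may not bear out."""
    anomaly = fleet_windows * anomaly_share
    clustered = fleet_windows - anomaly
    return {
        "anomaly_windows": anomaly,
        "clustered_windows": clustered,
        "expected_critical_anomaly": anomaly * ANOMALY_RATE,
        "expected_critical_clustered": clustered * CLUSTERED_RATE,
    }

queue = inspection_queue(fleet_windows=500, anomaly_share=0.30)
premium = ANOMALY_RATE / CLUSTERED_RATE

print(round(premium, 1))  # 3.3
for key, value in queue.items():
    print(key, round(value, 2))
```

Under those assumptions the 150 anomaly windows would hold roughly 8.9 expected critical states against roughly 6.3 in the other 350, so more than half of the expected trouble sits in under a third of the queue.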

“We ran the same pipeline we ran on the weather stations. Same code. Different column names. The engines' failures were hiding in the same kind of pattern the fires were.”

Drew Wasem
Founder, Coherany

HONEST LIMITS

What this does and doesn't prove.

C-MAPSS is a simulated dataset. Real engines have noisier sensors, more failure modes, and operating conditions the simulation never modeled. The 3.3× premium may move in either direction on real data. What the benchmark proves is that the architecture works: no labels, no custom code, a ranked prioritization that a technician can act on and a regulator can audit. Real validation happens on your fleet, not ours.

This isn't really about jet engines.

Pumps, turbines, compressors, vehicles, any industrial asset with sensor telemetry: the approach is the same. Request the methodology or run it on your own fleet.
