This is a concept-and-design document, not a production product.
It documents the technical basis (FMECA + KG + ML + NLP architecture, 12 engineering gaps, 5-phase roadmap, and an 826-row worldwide FMECA seed dataset). The production industrial-advisor build (Maintenance Intelligence Workbench: 11 screens, multi-tenant RBAC, real CMMS / BMS / edge integrations) is a separate, multi-year initiative tracked in docs/plans/2026-05-25-ai-maintenance-product-roadmap.md. AI is advisory-only by default; physical control remains in SIS / protection relays / BMS engineered sequences (IEC 61508).
Prescriptive Maintenance · FMECA + KG + ML + NLP
AI Engineering Maintenance — Intelligent Advisor System Concept
A concept-and-design brief for a prescriptive maintenance advisor that fuses FMECA documentation
(IEC 60812), a Neo4j knowledge graph, a Random Forest + PCA diagnostic classifier, and a
rule-based NLP layer (Aho-Corasick) into a single closed-loop advisor for engineered assets.
Two operator interaction modes: ask in natural language (By API), or upload sensor data and let
the engine diagnose (By Engine). This page synthesizes the paper, surfaces 12 engineering gaps,
and proposes concrete enhancements with a phased build roadmap.
Source paper: Lin, H. & Ompusunggu, A. P. (2026). Intelligent Advisor System for
Prescriptive Maintenance of Engineered Assets Using FMECA, Knowledge Graph and Machine Learning.
Artificial Intelligence for Engineering (Wiley / IET). DOI:
10.1049/aie2.70019.
Case-study dataset (Cranfield): 10.17862/cranfield.rd.5097649.
Neo4j graph: 10 node types · 12 relationship types. Cypher-queryable single source of truth for asset reasoning.
MODULE 3
Maintenance Analytics (ML)
Random Forest + PCA over 26 statistical features from position-error + motor-current signals. Bayesian-tuned.
MODULE 4
NLP Layer
Rule + dictionary based: Aho-Corasick word segmentation, NER to FMECA entity classes, Cypher template selection.
Architecture — The Paper's Baseline System
The advisor system has four cooperating modules. FMECA is the source-of-truth documentation; the
Knowledge Graph turns the tabular FMECA into a queryable Neo4j instance; the Maintenance
Analytics module diagnoses faults from raw sensor data; and the NLP layer translates
human questions into Cypher queries. The same Cypher-template tail is reused by both interaction modes,
which keeps the answer surface deterministic.
Diagram legend: Knowledge / FMECA path · Data / NLP path · Reply / Advisor output · External integration (abstract in paper)
The Four Modules
Each module has a distinct role, a clear input contract, a clear output contract, and a specific
tech-stack pick. The cards below name each one, with a sub-flowchart showing its internal pipeline.
M1
FMECA Documentation
IEC 60812 tabular
Failure Mode, Effects & Criticality Analysis. The source-of-truth worksheet built from
design knowledge and operational degradation mechanisms. Defines the canonical entities the
rest of the system reasons over.
Converts the tabular FMECA into a graph: 10 node types linked by 12 relationship
types — HAS_COMPONENTS, HAS_FAULTS, LEADS_TO, HAS_MECHANISM, HAS_RPN, HAS_STEPS,
HAS_ACTIONS_OF_FAILURES and similar. Queryable via Cypher templates from both modes.
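A minimal sketch of the tabular-to-graph step, under stated assumptions: the row columns (component, fault, mechanism), the node labels, and the `name` property are hypothetical, and only the relationship names HAS_FAULTS and HAS_MECHANISM come from the brief. Which node types each relationship connects is also assumed. The sketch only pairs a parameterised Cypher statement with each row; it does not call a Neo4j driver.

```python
from typing import Dict, List, Tuple

# Hypothetical FMECA rows; column names are illustrative, not the paper's schema.
FMECA_ROWS: List[Dict[str, str]] = [
    {"component": "bearing", "fault": "Spalling", "mechanism": "surface fatigue"},
    {"component": "nut", "fault": "Backlash", "mechanism": "wear"},
]

# Assumed edge layout: Component -HAS_FAULTS-> Fault -HAS_MECHANISM-> Mechanism.
# MERGE keeps the load idempotent: re-running it does not duplicate nodes or edges.
ROW_TO_CYPHER = (
    "MERGE (c:Component {name: $component}) "
    "MERGE (f:Fault {name: $fault}) "
    "MERGE (m:Mechanism {name: $mechanism}) "
    "MERGE (c)-[:HAS_FAULTS]->(f) "
    "MERGE (f)-[:HAS_MECHANISM]->(m)"
)


def rows_to_statements(rows: List[Dict[str, str]]) -> List[Tuple[str, Dict[str, str]]]:
    """Pair the MERGE template with one parameter dict per FMECA row."""
    return [(ROW_TO_CYPHER, dict(row)) for row in rows]


statements = rows_to_statements(FMECA_ROWS)
for cypher, params in statements:
    print(params["component"], "->", params["fault"], "->", params["mechanism"])
```

Each (statement, parameters) pair could then be handed to whatever graph client the build settles on.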
51 trees · 7 PCA components · 80/20 train/test split
M4
NLP Layer
Aho-Corasick · rule-based
Three-stage rule-based NLP: (a) word segmentation; (b) Named Entity Recognition (NER) maps
words to FMECA entity classes — Action / Component / Effect / Failure / Fault /
Mechanism; (c) question classification picks a Cypher template, runs it, formats the
answer. Deterministic, explainable, no hallucination.
Input
Operator question in natural language
Output
Cypher query · result set · templated answer
Tech
Python pyahocorasick · dictionary built from FMECA entities · lookup-table classifier
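The three stages can be sketched in pure Python. This is an illustration under assumptions, not the paper's code: it hand-rolls a small Aho-Corasick automaton instead of calling pyahocorasick, and the dictionary entries, the cue word "cause", and the Cypher template text are all hypothetical.

```python
from collections import deque

# Hypothetical dictionary: surface term -> FMECA entity class.
ENTITY_DICT = {"spalling": "Fault", "backlash": "Fault", "bearing": "Component"}
# Hypothetical lookup table: (cue word, entity class) -> Cypher template.
TEMPLATES = {
    ("cause", "Fault"): "MATCH (f:Fault {name: $name})-[:HAS_MECHANISM]->(m) RETURN m.name",
}


def build_automaton(patterns):
    """Build Aho-Corasick goto / failure / output tables for the given patterns."""
    goto, fail, out = [{}], [0], [[]]
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)
    queue = deque(goto[0].values())  # depth-1 states fail back to the root
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] = out[t] + out[fail[t]]  # inherit matches ending at the suffix state
    return goto, fail, out


def find_all(text, automaton):
    """Return (start_index, pattern) for every dictionary hit in one pass over text."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        hits.extend((i - len(pat) + 1, pat) for pat in out[state])
    return hits


AUTOMATON = build_automaton(ENTITY_DICT)


def question_to_query(question):
    text = question.lower()
    # (a) segmentation and (b) NER: each dictionary hit carries its entity class.
    entities = [(pat, ENTITY_DICT[pat]) for _, pat in find_all(text, AUTOMATON)]
    # (c) question classification: pick a template by cue word and entity class.
    for (cue, cls), cypher in TEMPLATES.items():
        if cue in text:
            for name, entity_cls in entities:
                if entity_cls == cls:
                    return cypher, {"name": name}
    return None  # no template matches: refuse rather than guess


query = question_to_query("What is the cause of spalling?")
```

Returning `None` when nothing matches keeps the deterministic behaviour described above: the layer either fills a known template or declines.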
Two Interaction Modes
The user reaches the advisor through one of two entry points. Mode A is for the engineer who
knows what to ask; Mode B is for the technician with raw data but no diagnosis. Both modes
terminate at the same Cypher-template tail, so the answer surface is consistent — the upstream
arrival path is the only thing that changes.
Mode A · By API
Natural-Language Query Path
Operator types a question. NLP segments, classifies, picks a Cypher template, runs it on the
FMECA-KG, and returns a templated answer. Bounded to the paper's predefined Cypher templates.
Q: "what is the cause of spalling?"
A: "Spalling has descriptions: Surface defects cause irregular movement and increased
wear. May arise from components: nuts, screws, balls, bearings. Mechanisms:
surface fatigue, contact-stress concentration. Prescribed actions: replace contaminated
grease, inspect raceway, schedule bearing overhaul. RPN tuple = (S, O, D)."
Strength: Explainable. Same query always returns the same answer. No hallucination.
Weakness: Bounded to predefined Cypher templates. Cannot handle open-ended or unseen phrasings.
Mode B · By Engine Algorithms
Sensor-Upload Diagnostic Path
Operator uploads raw measurement data (paper's CLI: predict ./dataset/output_3).
PCA + RF classify the fault. Predicted label feeds the same Cypher tail as Mode A —
same answer surface, different arrival path.
A: "All 10 motion cycles classified as Backlash. Linked components: bearings,
nuts, balls, screws. Failure descriptions and effects attached. Corrective actions:
check pre-load, re-shim, replace nut if wear exceeds tolerance. Steps:
1. de-energise; 2. lock-out; 3. dismantle..."
Strength: Catches faults from raw signal anomalies. No expert framing required.
Weakness: Bounded to 4 known classes. Cannot detect novel failure modes; it will force-fit them.
Case-Study Numbers — What the Paper Actually Reports
Asset under test: a single linear actuator (Cranfield dataset). Four fault states × three loads
× 50 motion cycles = 600 events. Results below are quoted verbatim from the paper —
no fabricated numbers, no averaged-up restatements.
Faults: 4 (Normal, Backlash, Spalling, Lack-of-lubrication)
Loads: 3 (20 kg, 40 kg, -40 kg)
Total events: 600 (12 conditions × 50 cycles)
Sample rate: 25 Hz (16-second motion cycles)
Macro F1: 84.84% (overall, after Bayesian opt)
Precision / Recall: 85.00% / 84.76%
RF trees: 51 (optimum)
PCA components: 7 (optimum)
Fault class | Per-fault F1 | Notes
Lack-of-lubrication | 97.12% | Best performance: all events correctly classified across the 3 loads
Backlash | 83.50% | Solid; backlash signal has distinct overshoot/undershoot signature
Normal | 80.77% | Some leakage with weak Spalling cases (early-stage degradation overlaps)
Spalling | 77.98% | Weakest: early-stage surface defects share signal characteristics with normal/backlash; this is the documented model weakness
The 77.98% Spalling F1 is the canonical opportunity for enhancement
(see Gap #10 below). Spalling is also the most economically valuable diagnosis — catching it
early prevents secondary damage to nuts, balls, and screws.
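As a quick consistency check on the reported figures: macro F1 is the unweighted mean of the per-class F1 scores, so the four per-fault values should average to the overall 84.84%. The sketch below also recomputes the event count and, assuming the whole 16-second cycle is sampled at the stated 25 Hz, the samples per motion cycle.

```python
# Per-fault F1 scores (%) as reported in the brief.
per_class_f1 = {
    "Lack-of-lubrication": 97.12,
    "Backlash": 83.50,
    "Normal": 80.77,
    "Spalling": 77.98,
}

# Macro F1 is the unweighted mean across classes.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)
weakest = min(per_class_f1, key=per_class_f1.get)

events = 4 * 3 * 50          # fault states x loads x motion cycles
samples_per_cycle = 25 * 16  # Hz x seconds, assuming the full cycle is sampled

print(f"macro F1 = {macro_f1:.2f}%, weakest class = {weakest}")
print(f"events = {events}, samples per cycle = {samples_per_cycle}")
```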
Gaps & Proposed Enhancements
Twelve concrete gaps surfaced from a critical read of the paper. Each is paired with a proposed
enhancement that names the algorithm or library, not just the wishful direction. Open any gap below.
01 Single-asset case study — no fleet generalization
Gap
The paper validates on one linear actuator. No discussion of how the KG, ML, or NLP behave across asset families (pumps, chillers, switchgear, gensets) or across multiple sites.
Why it matters
Engineering operations rarely have one asset. A fleet has thousands. Transfer learning, cross-asset reasoning, and federated training are required for operational value.
Enhancement
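Build an ontology-driven KG with asset taxonomies (parent: RotatingMachine, children: Pump, Compressor, Motor; FMECA inherits through the hierarchy). Layer federated learning on the diagnostic model so multiple sites contribute model updates without sharing raw sensor data. FedAvg and FedProx average the parameters of gradient-trained models, so they would apply to a neural replacement for the classifier; a Random Forest has no gradients and would need a tree-level scheme instead (for example, each site trains trees locally and the ensembles are merged).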
02 Static FMECA — no operational feedback loop
Gap
The KG is frozen at design time. Field failures, novel fault patterns, and lessons learned from completed work orders never propagate back into the FMECA or the KG.
Why it matters
The most valuable knowledge in an engineering operation is what the technicians learn doing the job. A static FMECA wastes it. The advisor stays as smart as the day it was commissioned.
Enhancement
Close the loop with an experience-capture pipeline: when a CMMS work order is closed, an NER pass over the completion notes extracts (action, observed-fault, observed-mechanism) triples and proposes a KG diff. A reliability engineer reviews and accepts in a lightweight admin UI. Auditable, reversible, traceable.
03 No CMMS integration — prescription stops at advice
Gap
Paper mentions CMMS work orders abstractly. Advisor output is a text answer — not an action in the maintenance system. No bi-directional sync.
Why it matters
True "prescriptive" maintenance requires the prescription to land in a planner's queue, with parts reserved, technicians assigned, and an estimated MTTR. Otherwise it's just decision support.
Enhancement
Bi-directional CMMS hook via REST or webhook (IBM Maximo, SAP PM, Infor EAM, Fiix, UpKeep). On advisor output: auto-create WO with prescribed steps, BOM, MTTR estimate, criticality from RPN. On WO completion: capture outcome back into the KG via Gap #02's experience-capture pipeline.
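A sketch of the advisor-to-CMMS direction, assuming a generic JSON payload. The field names, the asset ID, and the RPN-to-priority bands are hypothetical and do not reflect any specific CMMS API; only the classical RPN definition (severity × occurrence × detection) is standard FMECA practice.

```python
import json


def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number as the product of the three FMECA ratings."""
    return severity * occurrence * detection


def priority_from_rpn(value: int) -> str:
    # Hypothetical banding; a real site would take thresholds from its own risk matrix.
    if value >= 150:
        return "urgent"
    if value >= 80:
        return "high"
    return "routine"


def build_work_order(asset_id, fault, steps, sod):
    """Assemble a draft work-order payload from an advisor result."""
    value = rpn(*sod)
    return {
        "asset_id": asset_id,
        "fault": fault,
        "rpn": value,
        "priority": priority_from_rpn(value),
        "steps": list(steps),
        "status": "draft",  # draft only: a planner approves before release
    }


work_order = build_work_order(
    "ACT-01", "Backlash", ["de-energise", "lock-out", "dismantle"], (5, 5, 8)
)
payload = json.dumps(work_order)  # body for a REST call or webhook
```

Keeping the status at "draft" matches the advisory-only stance stated at the top of this brief.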
04 PCA opacity — loss of feature interpretability
Gap
PCA collapses the 26 engineering-meaningful features (RMS, skewness, crest factor, overshoot...) into 7 abstract components. A technician cannot tell which signal feature drove a classification.
Why it matters
Maintenance staff need explainable diagnoses. "The RF says backlash" is not actionable; "the RF says backlash because overshoot OVy is 3σ above baseline" is.
Enhancement
Complement PCA with SHAP (TreeSHAP for RF, very fast) or built-in Gini-importance ranking, computed on the original 26 features in parallel. Surface the top-3 driving features per prediction in the advisor UI. Optional: keep PCA only for visualization, run RF on raw features for production scoring.
05 No anomaly detection — novel faults are force-fit
Gap
RF only chooses among the 4 known classes. Any unseen fault mode (e.g. a sudden seal failure) gets force-classified into the nearest known label, silently and confidently wrong.
Why it matters
"Unknown unknowns" are the most expensive failures. A diagnostic system that lies confidently in their presence destroys operator trust.
Enhancement
Add an anomaly pre-filter: Isolation Forest or One-Class SVM trained only on Normal data. Score every new cycle. If the anomaly score exceeds its threshold AND the RF confidence falls below a separate confidence threshold, flag the cycle as "anomalous but unrecognised" and route it to a reliability engineer (human-in-the-loop) instead of force-fitting.
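The routing rule can be sketched as a small gate in front of the advisor. The scores are assumed to be computed upstream (anomaly detector and RF), and the threshold values, field names, and route names here are placeholders, not tuned or sourced figures.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    label: str
    route: str


def gate(rf_label: str, rf_confidence: float, anomaly_score: float,
         anomaly_threshold: float = 0.6, confidence_threshold: float = 0.7) -> Decision:
    """Escalate to a human only when the cycle looks anomalous AND the RF is unsure."""
    if anomaly_score > anomaly_threshold and rf_confidence < confidence_threshold:
        return Decision("anomalous but unrecognised", "reliability_engineer")
    return Decision(rf_label, "advisor")


confident = gate("Backlash", rf_confidence=0.95, anomaly_score=0.2)
novel = gate("Backlash", rf_confidence=0.40, anomaly_score=0.9)
print(confident, novel)
```

Because the rule is a conjunction, a low-confidence but non-anomalous cycle still goes to the advisor; whether that case should also be escalated is a design choice left open here.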
06 No Remaining Useful Life — diagnostic, not prognostic
Gap
The system diagnoses what is failing, not when. RUL prediction is absent. "True" prescriptive maintenance must be time-aware so the planner can schedule before fault, not after.
Why it matters
Without RUL, the advisor can only react. With RUL, the advisor can schedule the work order to land 2 weeks before predicted failure, with parts in the cage and a technician booked.
Enhancement
Add a survival-analysis regressor (Cox Proportional Hazards or DeepSurv) or a recurrent RUL regressor (LSTM/GRU over the feature time-series). Train on degradation paths in the dataset. Output a posterior over RUL with quantile bands, not a single number.
07 No vendor / spare-parts / cost integration
Gap
Prescription stops at "do action X" + "follow steps Y". No link to part availability, vendor catalog, lead-time, or cost. The planner has to do all of that lookup separately.
Why it matters
If the prescribed action requires a bearing the warehouse doesn't carry with a 12-week lead time, the advisor's recommendation is operationally useless.
Enhancement
Link the FMECA-KG to the existing Spares Readiness Calculator spare-parts catalog. KG nodes get HAS_SPARE_PART edges into SPARES_CATALOG entries with availability, lead-time, cost, and supplier. Advisor output becomes: "replace bearing X — in stock at Bekasi warehouse, $420, 0-day lead time".
08 No edge / offline mode — cloud-only assumption
Gap
Architecture assumes cloud Neo4j + cloud ML inference. Industrial sites often have intermittent connectivity, data-sovereignty constraints (data must not leave the plant), and latency requirements that exclude a cloud round-trip.
Why it matters
Petrochemical, defence, and primary-industry sites won't deploy a cloud-only advisor. Many DC operations sites have the same constraint.
Enhancement
Distill the RF to a quantized ONNX / tflite model runnable on a Raspberry Pi 5 or Jetson Nano gateway. Replace Neo4j with SQLite + recursive CTEs or an embedded graph store (TerminusDB-embedded). Sync deltas to cloud KG when connectivity returns. Same Cypher template surface, two backends.
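To illustrate the SQLite fallback, the sketch below stores the graph as a single edge table and walks it with a recursive CTE. The table layout and the edge directions are assumptions; the relationship names are the ones listed for the KG. SQLite does not execute Cypher, so each Cypher template would need an SQL counterpart like this one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE edges (src TEXT, rel TEXT, dst TEXT);
INSERT INTO edges VALUES
  ('bearing',  'HAS_FAULTS',              'Spalling'),
  ('Spalling', 'HAS_MECHANISM',           'surface fatigue'),
  ('Spalling', 'HAS_ACTIONS_OF_FAILURES', 'inspect raceway');
""")

# Walk outward from a start node, stopping at a maximum depth.
REACHABLE = """
WITH RECURSIVE walk(node, depth) AS (
    SELECT ?, 0
    UNION
    SELECT e.dst, w.depth + 1
    FROM edges AS e JOIN walk AS w ON e.src = w.node
    WHERE w.depth < ?
)
SELECT node, depth FROM walk WHERE depth > 0 ORDER BY depth, node
"""


def reachable(start: str, max_depth: int = 3):
    """Return (node, depth) for everything reachable from start within max_depth hops."""
    return conn.execute(REACHABLE, (start, max_depth)).fetchall()


rows = reachable("bearing")
print(rows)
```

The depth limit bounds the recursion even if the edge table contains cycles.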
09 Literal-string NLP with no synonym handling
Gap
Aho-Corasick is a literal-string matcher. If a technician types "lube starvation" instead of "lack of lubrication", or "pitting" instead of "spalling", the NLP fails silently.
Why it matters
Field vocabulary varies by region, language, vendor, and seniority. A diagnostic system that needs the operator to use exact dictionary terms won't survive contact with reality.
Enhancement
Add a small-LLM rewrite stage (Phi-3 mini, Gemma-2B, or Mistral-7B-Instruct quantized) that paraphrases the user's input into canonical FMECA vocabulary, but gate by Cypher-template generation: the LLM never produces free-form output. It only emits a structured Cypher template ID + entity slots, which the deterministic backend executes. No hallucination because the answer surface stays templated.
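The gate can be sketched as a validator that sits between the LLM and the deterministic backend. The template IDs, slot names, and JSON shape below are hypothetical; the point is only that anything outside the allow-list is rejected before a query is ever built.

```python
import json

# Hypothetical allow-list: template ID -> required slot names.
ALLOWED_TEMPLATES = {"fault_causes": {"fault"}, "component_faults": {"component"}}
# Canonical vocabulary per slot, as it would be drawn from the FMECA dictionary.
CANONICAL = {
    "fault": {"Spalling", "Backlash", "Lack-of-lubrication"},
    "component": {"bearing", "nut", "ball", "screw"},
}


def validate_llm_output(raw: str):
    """Return (template_id, slots) for a well-formed, allow-listed reply, else None."""
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError:
        return None  # free-form text never reaches the backend
    if not isinstance(msg, dict):
        return None
    template_id, slots = msg.get("template_id"), msg.get("slots")
    if not isinstance(template_id, str) or template_id not in ALLOWED_TEMPLATES:
        return None
    if not isinstance(slots, dict) or set(slots) != ALLOWED_TEMPLATES[template_id]:
        return None
    for name, value in slots.items():
        if not isinstance(value, str) or value not in CANONICAL[name]:
            return None  # non-canonical term: reject instead of guessing
    return template_id, slots


ok = validate_llm_output('{"template_id": "fault_causes", "slots": {"fault": "Spalling"}}')
rejected = validate_llm_output("Spalling is usually caused by fatigue.")
```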
10 Spalling is the weakest fault class at 77.98% F1
Gap
The paper's documented weakest class. Early-stage spalling has signal characteristics overlapping normal and backlash; the 26 time-domain stats miss the fault-mechanism-specific harmonic content.
Why it matters
Spalling is the most expensive failure mode to miss — it cascades. Early detection saves the raceway, the balls, the seal, and possibly the entire bearing.
Enhancement
Add wavelet-domain features: Continuous Wavelet Transform energy at bearing-fault-mechanism-specific frequencies (BPFO, BPFI, BSF, FTF computed from geometry + speed). Optionally pre-train a self-supervised contrastive encoder (SimCLR-style) on unlabeled motion cycles before RF, so embeddings cluster spalling cases tighter even at early-stage. Expected lift: spalling F1 from 77.98% to >88%.
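The characteristic frequencies named above follow from bearing geometry and shaft speed. The sketch uses the standard kinematic formulas for a rolling-element bearing with a stationary outer race and rotating inner race; whether they carry over unchanged to the ball-screw actuator in the case study is an assumption, and the example geometry is made up.

```python
import math


def bearing_fault_frequencies(shaft_hz, n_balls, ball_d, pitch_d, contact_angle_deg=0.0):
    """Characteristic defect frequencies (Hz); ball_d and pitch_d share the same unit."""
    ratio = (ball_d / pitch_d) * math.cos(math.radians(contact_angle_deg))
    return {
        "FTF": 0.5 * shaft_hz * (1 - ratio),             # cage (fundamental train)
        "BPFO": 0.5 * n_balls * shaft_hz * (1 - ratio),  # ball pass, outer race
        "BPFI": 0.5 * n_balls * shaft_hz * (1 + ratio),  # ball pass, inner race
        "BSF": 0.5 * (pitch_d / ball_d) * shaft_hz * (1 - ratio ** 2),  # ball spin
    }


# Made-up geometry: 8 balls, 10 mm ball diameter, 50 mm pitch diameter, 10 Hz shaft.
freqs = bearing_fault_frequencies(shaft_hz=10.0, n_balls=8, ball_d=10.0, pitch_d=50.0)
for name, hz in freqs.items():
    print(f"{name}: {hz:.2f} Hz")
```

These values would set the centre frequencies at which wavelet energy is extracted as additional features.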
11 No multi-fault concurrency — single-label assumption
Gap
RF emits exactly one class per cycle. Real degraded assets often have multiple concurrent failure modes — e.g. spalling combined with lack-of-lubrication — that the current model collapses to whichever has the higher vote share.
Why it matters
If a bearing is both spalling AND lubrication-starved, prescribing only "re-grease" while ignoring spalling will accelerate the spalling. Single-label diagnosis can prescribe a partially correct, partially harmful action.
Enhancement
Convert to multi-label RF via one-vs-rest binary RFs per class, each with its own probability threshold; or migrate to a multi-head neural classifier (shared encoder, per-class sigmoid heads). The KG response template then aggregates multiple HAS_FAULTS matches into a combined advisory.
12 No uncertainty quantification — raw vote share, not calibrated
Gap
RF outputs vote share, which is correlated with confidence but not calibrated. There is no statistical guarantee on the prediction; no prediction interval.
Why it matters
A reliability engineer needs to know how confident the advisor is before authorising an unplanned shutdown. "85% probable backlash" is not the same as "95% confident the true class is in {backlash, lack-of-lubrication}".
Enhancement
Calibrate the vote shares on a validation set (Platt scaling or isotonic regression suit a Random Forest; temperature scaling assumes logits from a neural model), then layer conformal prediction to emit prediction sets with guaranteed coverage (e.g. 95%). Plain split conformal guarantees marginal coverage; Mondrian conformal extends it to per-class coverage. Advisor UI shows: "Likely Backlash; 95% confidence set = {Backlash, Lack-of-lubrication}".
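A minimal sketch of the prediction-set step using plain split conformal prediction, which gives marginal rather than per-class coverage (the Mondrian variant would compute one threshold per class). The calibration probabilities below are invented for illustration; in practice they would come from a held-out calibration set scored by the classifier.

```python
import math


def conformal_quantile(cal_probs, cal_labels, alpha=0.05):
    """Threshold on the nonconformity score 1 - p(true class) from calibration data."""
    scores = sorted(1.0 - probs[label] for probs, label in zip(cal_probs, cal_labels))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # finite-sample corrected rank
    # Too few calibration points for the requested coverage: include every class.
    return scores[rank - 1] if rank <= n else 1.0


def prediction_set(probs, qhat):
    """All classes whose nonconformity score is within the calibrated threshold."""
    return {label for label, p in probs.items() if 1.0 - p <= qhat}


# Invented calibration set: 18 confident hits and 2 weak ones for the true class.
cal_probs = [{"Backlash": 0.9}] * 18 + [{"Backlash": 0.4}] * 2
cal_labels = ["Backlash"] * 20
qhat = conformal_quantile(cal_probs, cal_labels, alpha=0.1)

new_cycle = {"Backlash": 0.55, "Lack-of-lubrication": 0.42, "Normal": 0.02, "Spalling": 0.01}
print(prediction_set(new_cycle, qhat))
```

A wider set signals lower certainty, which is the information a reliability engineer needs before authorising a shutdown.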
Proposed Enhanced Engine Architecture
The diagram below incorporates the 12 enhancements into a single architecture. Green dashed lines are
the feedback / experience-capture loops. Red blocks are safety nets (anomaly detection,
uncertainty quantification). Amber blocks are knowledge surfaces. Cyan blocks are inference.
Five phases. Phase 1 reproduces the paper's baseline; Phases 2-5 add the 12 enhancements in dependency
order. Phase complexity is rated by effort, not difficulty — the actual hard part is the
cross-functional sign-off, not the code.
Phase 1 · MVP
Paper Baseline
FMECA worksheet for 1 asset
Neo4j KG (10 node types)
26-feature extraction + PCA + RF
Aho-Corasick NLP
CLI for Mode A & Mode B
Replicates 84.84% F1
Effort: 4-6 weeks · 1 ML eng + 1 reliability eng
Phase 2 · Close the loop
CMMS + Spares + SHAP
Gap #03 — bi-directional CMMS hook
Gap #07 — Spares Catalog edges
Gap #04 — SHAP explainer in UI
Web UI replaces CLI
Auth tiering integration (Pro+)
Effort: 6-8 weeks · +1 full-stack eng
Phase 3 · Safety nets
Anomaly + RUL + Uncertainty
Gap #05 — Isolation Forest pre-filter
Gap #06 — RUL regressor (Cox-PH or LSTM)
Gap #12 — conformal prediction
Human-in-the-loop queue UI
Calibration test rig
Effort: 8-10 weeks · +1 ML eng
Phase 4 · Lift the ceiling
LLM NLP + Multi-Label + Ontology
Gap #09 — gated small-LLM rewrite
Gap #11 — multi-label RF
Gap #01 — ontology-driven KG
Gap #10 — wavelet features
Gap #02 — experience-capture
Effort: 10-14 weeks · cross-functional
Phase 5 · Field-ready
Edge + Federated
Gap #08 — ONNX edge model
Gap #08 — SQLite-graph fallback
Gap #01 — FedAvg across sites
Provenance ledger + audit export
Pilot site deployment
Effort: 12+ weeks · field eng + DevOps
Open Questions for the Owner
Eight questions to answer before we can scope an MVP. Each one shifts the architecture materially.
Q1
Target asset class? The paper validates on a linear actuator. Are we targeting data-center rotating machines (chillers, CRAH fans, gensets, UPS rotating mass), electrical assets (switchgear, transformers, breakers), or industrial process equipment? Each implies a different FMECA seed and sensor topology.
Q2
What sensor streams are available today? Vibration accelerometers? Motor-current signature analysis (MCSA)? BMS analog points (temp, pressure, position)? Acoustic emission? The 26-feature spec only fits if the data look like the paper's. List what's available now vs what would need to be retrofitted.
Q3
Do we have existing FMECA documents? A live operations site usually has either nothing, a vendor-supplied OEM FMECA, or an in-house spreadsheet. Whichever it is — that's our M1 seed. If nothing exists, Phase 1 starts with an FMECA workshop, not code.
Q4
Which CMMS, if any? Maximo, SAP PM, Infor EAM, Fiix, UpKeep, or an Excel sheet? This decides the Gap #03 integration shape: REST API hook, ETL batch sync, or webhook listener.
Q5
Deploy target? Pure cloud (Neo4j Aura + sklearn on Vercel/Render)? Edge (Pi5/Jetson per asset)? Hybrid (edge inference + cloud KG)? The choice drives Gap #08 priority and the model-distillation budget.
Q6
Labeling capacity? Who labels the historical sensor data into fault classes — a reliability engineer with hours per week, or do we need self-supervised pre-training because labels are scarce? If labels are scarce, Gap #10's contrastive pre-training jumps to Phase 2.
Q7
Success metric? Macro F1 like the paper? Mean-time-to-detect (MTTD)? Avoided unplanned downtime (hours/year)? Avoided cost? The metric drives the loss function, the validation rig, and the cost-justification story.
Q8
Regulatory / safety case scope? Is the advisor advisory-only (no closed-loop control) or does it actuate anything (auto-shutdown, auto-throttle)? If actuating, Gap #12 conformal prediction + Gap #05 anomaly net become safety-critical, not optional. IEC 61508 / 62443 may apply.
Knowledge Base — Worldwide FMECA Seed Dataset (2026-05-23)
The Lin & Ompusunggu architecture is unopinionated about which failure modes it
ingests. The graph is only as useful as its seed data. To take the concept beyond a
rotating-machinery prototype, we commissioned a parallel research run for worldwide
industrial-asset failure-mode data scoped to the data-center estate. Output dropped
into docs/research/2026-05-23-fmeca-kg-worldwide-asset-failure-data.md
plus eight CSV seed files ready for Neo4j ingestion.
54% of major DC outages are power-related (Uptime Institute 2024 / 2025); 13% cooling; 12% network; ~40% involve human error somewhere in the chain. The graph weights human-error mechanism nodes (M-HUM-001/002/003) accordingly.
Liquid cooling has <10 s ride-through if the coolant pump fails, vs minutes for air. Severity scoring on pump-failure fault F11.2 is one tier hotter than the equivalent CRAH failure — a structural change to the FMECA, not a tweak.
VRLA Arrhenius: battery life halves for every 8.3 °C above 25 °C ambient. A parameterised mechanism node, not a static FMECA cell — the engine evaluates it against actual battery-room temperature in real time.
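The halving rule can be written as a one-line function, which is roughly what a parameterised mechanism node would evaluate. The function and parameter names are hypothetical, and clamping at the reference temperature (no life gain below 25 °C) is an assumption made here because the rule as stated only covers temperatures above it.

```python
def vrla_life_years(rated_life_years: float, ambient_c: float,
                    reference_c: float = 25.0, halving_step_c: float = 8.3) -> float:
    """Expected life, halving for every halving_step_c above reference_c."""
    excess = max(0.0, ambient_c - reference_c)  # assumption: no credit below reference
    return rated_life_years * 0.5 ** (excess / halving_step_c)


for temp_c in (25.0, 33.3, 41.6):
    print(f"{temp_c:.1f} °C -> {vrla_life_years(10.0, temp_c):.2f} years")
```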
Highest RPN in the dataset = 200: diesel microbial contamination (F19.1). Slow to detect, devastating when it hits backup power. Top preventive-action target for any site running extended-runtime fuel storage.
Confidence tiers
Every fault row carries a confidence_tier column (high / medium / thin). Default stance is advisory-only. Confidence tiers gate the engine's recommendation routing, not autonomous action. Engine treatment:
High — standards-body / peer-reviewed / canonical databook sources (IEEE, IEC, ISO, ASHRAE, CIGRE, NETA, NFPA, OREDA, NPRD-2016, IEEE 493). Eligible for draft work-order generation; human approval required before any operational action. AI is advisory only; physical control remains in SIS / protection relays / BMS engineered sequences.
Medium — industry consortium / professional publication / trade-body source (Li-ion datasheets, CRAH/CRAC vendor application notes, Hydraulic Institute, CTI, OCP). Recommendation generated; reliability-engineer review required before draft work-order.
Thin — single vendor / manufacturer / OEM / unattributed (liquid cooling DLC + immersion, busway, magnetic-bearing chillers, flywheel UPS). Surfaced in UI; no auto draft work-order; vendor outreach required. Outreach plan in docs/handoff/2026-05-23-fmeca-vendor-outreach.md.
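The three tiers above can be encoded as a routing table. This is a sketch with hypothetical field names; it only restates the gating described in the list and rejects unknown tier values rather than defaulting them.

```python
# Routing per confidence tier, restating the rules listed above.
ROUTING = {
    "high": {
        "auto_draft_work_order": True,
        "gate": "human approval before any operational action",
    },
    "medium": {
        "auto_draft_work_order": False,
        "gate": "reliability-engineer review before draft work-order",
    },
    "thin": {
        "auto_draft_work_order": False,
        "gate": "surface in UI only; vendor outreach required",
    },
}


def route(confidence_tier: str) -> dict:
    """Look up the routing for a fault row's confidence_tier value."""
    key = confidence_tier.strip().lower()
    if key not in ROUTING:
        raise ValueError(f"unknown confidence tier: {confidence_tier!r}")
    return ROUTING[key]


print(route("High"))
```

No tier enables autonomous action; even the high tier stops at a draft that a human must approve.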
CSV seed inventory
File | Rows | Maps to KG node / edge
components.csv | 144 | Component nodes
faults.csv | 109 | Fault nodes (1 per row)
failures.csv | 109 | Failure-state edges
actions.csv | 138 | Corrective + preventive action nodes
mechanisms.csv | 99 | Physical degradation mechanism nodes
effects.csv | 42 | Effect-of-failure rows (local / system / business)
Gap #13 — Liquid-cooling fault-mode telemetry below industry benchmark (new, surfaced by this dataset)
Liquid-cooling and immersion-cooling primary sources thin out to ASHRAE TC 9.9 + OCP + one ASME paper.
Magnetic-bearing chillers and flywheel UPS depend largely on vendor-stated MTBF rather than independently audited data.
Mitigation: a vendor-outreach handoff doc has been scheduled (Vertiv, CoolIT, Asetek, Boyd for liquid;
Starline / Universal Electric for busway; Trane / York / Daikin for magnetic-bearing chillers; Piller / Hitec /
Active Power for flywheel UPS). NDA-backed telemetry requests in flight.