
Case study
AI-Powered Maintenance Event Prediction for a Dutch Wind Turbine Services Company
Plexteq partnered with a wind turbine maintenance and asset management company in the Netherlands to rebuild the predictive maintenance core of their monitoring platform — replacing threshold-based alerting with a real-time, ML-driven failure prediction engine that increased the prediction rate by 37% and cut telemetry ingestion latency from hours to seconds.
Project Highlights
Industry: Energy & Renewables
Market: EU
Expertise: AI/ML, Anomaly Detection, Predictive Maintenance, Big Data, Real-Time Data Engineering
Cooperation: 2021 – 2023
Technologies: Python, GCP, BigQuery, BigQuery ML, Vertex AI, Apache Flink, pandas, TensorFlow, PyTorch
Business Challenge
Wind turbines are among the most demanding assets in the energy sector to maintain. They operate in remote, often offshore locations, where a single unplanned outage can last for days while crews, vessels, and spare parts are mobilised. Industry experience consistently shows that major failures make up a minority of fault events yet account for the overwhelming majority of total downtime. For maintenance operators, the difference between detecting a degrading gearbox bearing three weeks before failure and detecting it three hours before failure is measured in hundreds of thousands of euros per turbine.
Our client, a Netherlands-based wind turbine maintenance and asset management company, services a fleet of several hundred onshore and offshore turbines across the Netherlands, Belgium, and the German North Sea region on behalf of wind farm owners and operators. Their commercial model is built on availability guarantees: contracts commit them to keeping turbine uptime above agreed thresholds, with financial penalties when those thresholds are missed.
The company already operated a condition monitoring platform built around SCADA data feeds from the turbines under management. However, the predictive layer of that platform had reached its limits. Fault detection relied primarily on static thresholds and rule-based alarms defined per component — an approach that generated a high volume of false positives during normal operational variance (storm fronts, curtailment events, seasonal temperature swings) while simultaneously missing slow-developing degradation patterns that never crossed any single threshold. SCADA telemetry was ingested in nightly batches, meaning the analytical layer was always operating on data that was up to 24 hours stale. Analysts spent a significant share of their time manually triaging alarms rather than planning maintenance interventions.
The client's engineering leadership set out three objectives: detect substantially more genuine failure precursors, detect them earlier, and process telemetry in near real time so that field decisions could be made on live fleet state. Having reviewed Plexteq's published work on ML-based anomaly detection for wind power systems, they engaged Plexteq to design and deliver the new predictive maintenance module end-to-end on Google Cloud Platform.
Key Challenges
Alarm Noise Over Signal
Threshold- and rule-based alerting produced excessive false positives during normal operational variance while missing gradual, multi-signal degradation patterns — eroding analyst trust and burying real failure precursors in alarm noise.
Stale, Fragmented Telemetry
SCADA telemetry from hundreds of turbines — thousands of sensor channels covering gearbox, generator, main bearing, pitch, and hydraulic systems at 10-minute and 1-second resolutions — arrived through heterogeneous feeds and was processed only in nightly batches, making real-time fleet assessment impossible.
No Reliable Failure Labels
Failure events are rare and inconsistently labelled: maintenance work orders, SCADA alarm logs, and component replacement records lived in separate systems with no reliable linkage between a recorded failure and the sensor history that preceded it.
Multi-OEM Generalisation
The fleet spanned multiple turbine OEMs, models, and age profiles, so a single global model underperformed — the solution needed to generalise across turbine types while remaining sensitive to unit-specific normal behaviour.
Solution Delivered
↳ Real-Time Telemetry Ingestion with Apache Flink
The foundation of the engagement was replacing the nightly batch pipeline with a streaming ingestion layer. Plexteq designed and deployed an Apache Flink pipeline consuming SCADA telemetry from the fleet through Pub/Sub, performing in-flight validation, deduplication, sensor-level quality scoring, and unit normalisation before landing the data in BigQuery. Flink's stateful stream processing was used to compute windowed aggregates continuously — rolling means, variances, exponentially weighted trends, and cross-signal ratios (e.g., gearbox bearing temperature relative to power output and ambient temperature) — so that model-ready features were available within seconds of a sensor reading being emitted, rather than the following morning.
The same pipeline reconciled historical data: five years of raw SCADA archives, alarm logs, and maintenance work orders were backfilled into BigQuery, giving the data science workstream a unified, queryable history of fleet behaviour and failure events for the first time.
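The windowed aggregates described above can be illustrated with a minimal plain-Python sketch. It is not the Flink job itself: the class and function names, the window size, the smoothing factor, and the minimum-power cut-off are all assumptions made for illustration.

```python
from collections import deque
from typing import Optional


class RollingFeatures:
    """Rolling mean/variance over the last `window` readings, plus an EWMA trend."""

    def __init__(self, window: int = 6, alpha: float = 0.3):
        # Both defaults are illustrative assumptions, not values from the project.
        self.values = deque(maxlen=window)
        self.alpha = alpha
        self.ewma: Optional[float] = None

    def update(self, x: float) -> dict:
        self.values.append(x)  # deque(maxlen=...) evicts the oldest reading
        n = len(self.values)
        mean = sum(self.values) / n
        var = sum((v - mean) ** 2 for v in self.values) / n  # population variance
        if self.ewma is None:
            self.ewma = x  # seed the trend with the first reading
        else:
            self.ewma = self.alpha * x + (1 - self.alpha) * self.ewma
        return {"mean": mean, "var": var, "ewma": self.ewma}


def temp_rise_per_kw(bearing_c: float, ambient_c: float, power_kw: float,
                     min_power_kw: float = 50.0) -> Optional[float]:
    """Cross-signal ratio: bearing temperature rise over ambient per kW produced.

    Returns None below `min_power_kw` (an assumed cut-off), where the ratio
    becomes unstable because the denominator approaches zero.
    """
    if power_kw < min_power_kw:
        return None
    return (bearing_c - ambient_c) / power_kw
```

In a streaming job, one such state object would typically be kept per turbine and sensor channel, which corresponds to keyed state in a stream processor.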
↳ Failure Labelling and Feature Engineering
Before any modelling could begin, Plexteq's data engineers built a failure event registry by programmatically linking component replacement records and maintenance work orders to SCADA alarm sequences and turbine downtime windows, resolving conflicts through a set of validation rules co-designed with the client's senior maintenance engineers. This produced a labelled dataset of confirmed failure events per component class — gearbox, generator, main bearing, pitch system, and hydraulics — each paired with its full preceding sensor history.
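A simplified sketch of the record-linking idea follows. The record types, field names, and the two-day tolerance are hypothetical; the actual validation rules were co-designed with the client's engineers and are not reproduced here. The sketch links a work order only when exactly one downtime window on the same turbine matches, and sets everything else aside for review.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class WorkOrder:          # hypothetical record shape
    turbine_id: str
    component: str
    completed_at: datetime


@dataclass(frozen=True)
class Downtime:           # hypothetical record shape
    turbine_id: str
    start: datetime
    end: datetime


def link_failures(work_orders, downtimes, tolerance=timedelta(days=2)):
    """Pair each work order with a downtime window on the same turbine.

    A window matches when the order's completion time falls inside it,
    extended by `tolerance` on both sides (an assumed rule). Orders with
    zero or several matching windows are returned as unmatched.
    """
    linked, unmatched = [], []
    for wo in work_orders:
        candidates = [
            d for d in downtimes
            if d.turbine_id == wo.turbine_id
            and d.start - tolerance <= wo.completed_at <= d.end + tolerance
        ]
        if len(candidates) == 1:
            linked.append((wo, candidates[0]))
        else:
            unmatched.append(wo)  # missing or ambiguous linkage
    return linked, unmatched
```

Keeping ambiguous matches out of the labelled set trades some recall for cleaner labels, which usually matters more when failure events are rare.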
Feature engineering was performed in BigQuery SQL for large-scale historical windows and in pandas for experimentation, ultimately producing over 120 engineered features per turbine: lag features across multiple horizons, rolling statistical windows, temperature differentials between paired components, power-curve residuals (deviation between actual and expected output given wind conditions), and operational context flags for curtailment, icing conditions, and grid events.
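Two of the feature families named above, power-curve residuals and lag features, can be sketched in a few lines. This is an illustration under stated assumptions rather than the project's code: the expected power is approximated by the median of healthy-operation samples per wind-speed bin, and the function names, bin width, and lag horizons are hypothetical.

```python
from collections import defaultdict
from statistics import median


def fit_power_curve(samples, bin_width=1.0):
    """Build an empirical power curve from healthy-operation samples.

    `samples` is an iterable of (wind_speed_ms, power_kw) pairs; the result
    maps each wind-speed bin index to the median power observed in that bin.
    """
    bins = defaultdict(list)
    for wind_speed, power in samples:
        bins[int(wind_speed // bin_width)].append(power)
    return {b: median(powers) for b, powers in bins.items()}


def power_residual(curve, wind_speed, power, bin_width=1.0):
    """Actual minus expected power; None if the wind-speed bin was never seen."""
    expected = curve.get(int(wind_speed // bin_width))
    return None if expected is None else power - expected


def lag_features(series, lags=(1, 6, 144)):
    """Value `k` steps before the latest reading, for each lag `k`.

    With 10-minute data, lags of 1, 6 and 144 would correspond to 10 minutes,
    1 hour and 1 day (assumed horizons). None when the history is too short.
    """
    return {
        f"lag_{k}": (series[-1 - k] if len(series) > k else None)
        for k in lags
    }
```

A persistently negative residual at a given wind speed means the turbine is producing less than its own healthy history suggests, which is why the operational context flags (curtailment, icing) matter for interpreting it.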
↳ Layered Anomaly Detection and Failure Prediction Models
Rather than a single model, Plexteq delivered a layered architecture in which each layer answers a progressively more specific question:
- Unsupervised anomaly detection (Isolation Forest + PCA baseline, BigQuery ML)
  As a first screening layer, Isolation Forest models trained per turbine type flag operating states that deviate from learned normal behaviour. PCA-based reconstruction error serves as the interpretable statistical baseline, mirroring the approach validated in Plexteq's earlier wind turbine anomaly detection work. Running these models directly in BigQuery ML keeps the screening layer close to the data and inexpensive to retrain fleet-wide.
- LSTM autoencoders for temporal degradation patterns (TensorFlow)
  The core detection layer consists of LSTM autoencoder networks trained on healthy-operation sequences per component subsystem. The autoencoder learns to reconstruct normal multivariate sensor sequences; a reconstruction error that rises over time signals gradual degradation that no single-threshold rule can capture. Per-turbine calibration layers adapt the shared architecture to unit-specific baselines, addressing the multi-OEM generalisation challenge.
- Supervised failure classification (gradient-boosted trees, PyTorch experimentation on Vertex AI)
  On top of the anomaly signals, gradient-boosted tree classifiers are trained on the labelled failure registry to estimate the probability of a component-class failure within 7-, 14-, and 30-day horizons. SHAP values are computed for every prediction, so each alert reaching an analyst is accompanied by a ranked, human-readable list of the sensor behaviours driving it.
- Remaining useful life estimation (survival analysis)
  For components flagged by the upper layers, a survival analysis model (Cox proportional hazards with gradient-boosted extensions) estimates remaining useful life distributions, allowing the planning team to schedule interventions into existing vessel and crew rotations rather than dispatching emergency call-outs.
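Two small pieces of the layered design lend themselves to a sketch: scoring reconstruction error against a unit-specific baseline, and deriving the 7-, 14-, and 30-day training labels from the failure registry. The code below is an assumption-laden illustration. It replaces the learned calibration layers with a simple z-score against each turbine's healthy-operation error, and all function names are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev

HORIZONS_DAYS = (7, 14, 30)  # horizons named in the text


def calibrate(healthy_errors):
    """Per-turbine baseline: mean and std of reconstruction error on healthy data."""
    return mean(healthy_errors), pstdev(healthy_errors)


def degradation_score(recent_errors, baseline):
    """Z-score of the recent mean reconstruction error against the baseline.

    A score that keeps rising across successive windows is the kind of
    gradual drift a single static threshold would not capture.
    """
    mu, sigma = baseline
    if sigma == 0:
        return 0.0  # degenerate baseline, no meaningful scale
    return (mean(recent_errors) - mu) / sigma


def horizon_labels(window_end, failure_times, horizons=HORIZONS_DAYS):
    """Label a sensor window 1 per horizon if a confirmed failure follows in time.

    A failure counts for horizon h when it occurs after `window_end`
    and no later than h days after it.
    """
    labels = {}
    for h in horizons:
        limit = window_end + timedelta(days=h)
        hit = any(window_end < t <= limit for t in failure_times)
        labels[f"fail_within_{h}d"] = int(hit)
    return labels
```

For example, a window ending ten days before a confirmed gearbox failure would be labelled negative for the 7-day horizon and positive for the 14- and 30-day horizons.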
↳ MLOps on Vertex AI
The full model lifecycle was industrialised on Vertex AI: training pipelines defined as Vertex AI Pipelines, experiment tracking and model registry for every model generation, automated evaluation gates comparing challenger models against production on held-out failure events, and continuous monitoring for feature drift as fleet composition and seasonal conditions change. Retraining runs weekly per turbine type and is also triggered automatically whenever drift metrics exceed configured bounds. Models are served through Vertex AI endpoints consumed by the streaming scoring layer, so every incoming telemetry window is scored against the current production models in near real time.
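The drift-triggered retraining rule can be illustrated with a small sketch. The source does not say which drift metric was used; the Population Stability Index (PSI) below is an assumed stand-in, and the 0.2 bound is a common rule of thumb rather than a project setting. Both inputs are assumed to be non-empty lists of numeric feature values.

```python
from math import log


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index of `actual` against the `expected` reference.

    Bin edges are derived from the reference sample; values outside its range
    are clipped into the first or last bin. Empty bins are floored at `eps`
    so the logarithm stays defined.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant features

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * log(ai / ei) for ei, ai in zip(e, a))


def should_retrain(psi_by_feature, bound=0.2):
    """Trigger retraining when any monitored feature drifts past the bound."""
    return any(value > bound for value in psi_by_feature.values())
```

Computed per feature over a recent window against the training-time distribution, a check of this kind could feed the automatic trigger alongside the weekly schedule.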
↳ Analyst Workflow Integration
Predictions were integrated into the client's existing monitoring platform rather than delivered as a standalone tool. Each alert carries the component-level failure probability per horizon, the SHAP-based explanation, the RUL estimate, and links to the underlying sensor traces, turning alarm triage from a manual investigation into a review-and-schedule workflow.
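As a rough picture of what such an alert might contain, the sketch below assembles the four elements listed above into one record. The field names, the `rank_drivers` helper, and the trace reference format are hypothetical; the client's actual alert schema is not described in the source.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Alert:                      # hypothetical schema
    turbine_id: str
    component: str
    failure_probability: dict     # horizon label -> probability
    top_drivers: list             # (feature, contribution), strongest first
    rul_days_median: float
    trace_ref: str                # pointer to the underlying sensor traces


def rank_drivers(contributions, top_k=3):
    """Order per-feature contributions (e.g. SHAP values) by absolute size."""
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:top_k]


def build_alert(turbine_id, component, probabilities, contributions,
                rul_days_median, trace_ref):
    """Bundle model outputs into one record an analyst can review."""
    return Alert(
        turbine_id=turbine_id,
        component=component,
        failure_probability=probabilities,
        top_drivers=rank_drivers(contributions),
        rul_days_median=rul_days_median,
        trace_ref=trace_ref,
    )


def to_json(alert):
    """Serialise the alert for delivery to the monitoring platform."""
    return json.dumps(asdict(alert))
```

Ranking by absolute contribution keeps features that push the probability down visible next to those that push it up, which helps an analyst check the prediction against the traces.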
Key Features
Real-Time Streaming Ingestion
Apache Flink pipeline replaces nightly batches: telemetry is validated, normalised, and available for model scoring within seconds of leaving the turbine.
Unified Data Platform
All fleet data consolidated in BigQuery: SCADA telemetry, alarm logs, and maintenance records queryable together for the first time.
Layered Detection Architecture
Isolation Forest and PCA screening, LSTM autoencoders for gradual degradation, and gradient-boosted classifiers for failure probability at 7-, 14-, and 30-day horizons.
Remaining Useful Life Estimation
Survival analysis models estimate how long a degrading component will last, so interventions are scheduled into planned crew and vessel rotations.
Explainable Alerts
Every alert carries SHAP-based explanations: a ranked, human-readable list of the sensor behaviours driving the prediction.
Automated MLOps on Vertex AI
Training, evaluation, drift monitoring, and deployment run as fully automated pipelines, with weekly and drift-triggered retraining requiring no manual upkeep.
Per-Turbine Calibration
One shared model architecture adapts to unit-specific baselines, serving a multi-OEM, multi-model fleet without per-turbine rebuilds.
Business Outcome
The new predictive maintenance module went live fleet-wide after a six-month shadow-running period during which its predictions were benchmarked against the legacy rule-based system on identical live data.
37% increase in failure prediction rate
<15 sec telemetry-to-score latency, down from up to 24 hours
21 days average early-warning lead time for major component failures
52% reduction in false-positive alarms
With the system in production, the client's analysts moved from triaging alarm noise to planning interventions. The 37% uplift in prediction rate meant that a substantial share of failures that previously arrived unannounced, forcing emergency call-outs and penalty exposure, were now flagged weeks in advance with an explanation the maintenance team could verify against sensor traces. The streaming pipeline eliminated the 24-hour blind spot entirely: fleet health is now assessed on live data, and fast-developing faults such as pitch system malfunctions are surfaced within minutes rather than the next morning.
The false-positive reduction proved just as commercially significant as the prediction uplift. Halving the alarm noise cut analyst triage workload and restored trust in the alerting layer, a precondition for the planning team acting on predictions rather than second-guessing them. RUL estimates allowed offshore interventions to be bundled into planned vessel rotations, materially reducing per-intervention logistics cost.
Measurable gains from the engagement:
- 37% increase in failure prediction rate versus the legacy rule-based system, validated over a six-month shadow-run on live fleet data
- Telemetry ingestion and scoring latency reduced from up to 24 hours to under 15 seconds end-to-end
- Average early-warning lead time of 21 days for gearbox, generator, and main bearing failures
- 52% reduction in false-positive alarms, cutting analyst triage workload by more than half
- Historical analysis over five years of fleet data reduced from multi-day manual exports to interactive BigQuery queries
- Fully automated weekly retraining with drift-triggered updates, with no manual model maintenance required from the client's team
