The sensor dataset that ruins a project
Here's a pattern we see often: a customer adds a hundred sensors to their line, logs everything, and a year later asks why the data isn't useful for predictive maintenance. The answer, in almost every case: the sensors weren't specified for the question.
Below is the checklist we now run before adding any sensor to a system.
Specify the question first, the sensor second
"Add a temperature sensor to the motor" is not a spec. The spec is:
- What decision will this measurement support?
- At what threshold does the decision flip?
- How fast does the underlying physical process change?
- What's the failure cost of a missed reading?
A motor temperature for stator winding monitoring (slow process, ±2 °C accuracy fine, sample at 0.1 Hz) is a different sensor from motor temperature for thermal runaway detection (fast process, ±5 °C accuracy is fine but latency must be < 100 ms, sample at 100 Hz).
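That distinction between the two motor-temperature sensors falls straight out of the process time constant. A minimal sketch of the reasoning, with illustrative numbers and a hypothetical `required_sample_rate_hz` helper (the rule of thumb of roughly ten samples per time constant is an assumption, not a standard):

```python
# Sketch: derive the sample-rate requirement from the process, not the sensor.
# Assumes a first-order process; the samples-per-tau factor is a rule of thumb.

def required_sample_rate_hz(process_time_constant_s: float,
                            samples_per_tau: float = 10.0) -> float:
    """Sample fast enough to see several points per process time constant."""
    return samples_per_tau / process_time_constant_s

# Stator winding heating: tau on the order of minutes -> ~0.1 Hz is plenty.
slow = required_sample_rate_hz(process_time_constant_s=100.0)

# Thermal runaway: tau on the order of a tenth of a second -> ~100 Hz.
fast = required_sample_rate_hz(process_time_constant_s=0.1)

print(f"slow process: {slow:.1f} Hz, fast process: {fast:.1f} Hz")
```

Same physical quantity, three orders of magnitude apart in sample rate — which is why the question has to come before the part number.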
The sensor specification matrix
For each sensor, write down:
- Range — measurement range, with margin for fault conditions (a motor running hot can hit 150 °C; a 0-100 °C sensor saturates and you miss the event)
- Accuracy — the actual error budget at the operating point, not the catalog "0.1 % FS" figure, which is 0.1 % of the full scale and can be a large absolute error when your operating range is a small slice of the span
- Resolution — for digital sensors, ADC bits; for analog sensors, the noise floor
- Sample rate — based on the process bandwidth, not on what the sensor can do at peak
- Latency — including comms; some IO-Link sensors with sophisticated processing have surprising latency
- Environmental rating — IP, ATEX if relevant, vibration, EMC class
- Calibration story — factory only, field calibratable, drift over time
- Failure mode — what does the sensor output when it fails? Open circuit, fixed value, stuck-at-last-good, garbage?
The last point is the underrated one. A temperature sensor that fails to "0 °C" looks like a working sensor on a cold day. A vibration sensor that fails to "no signal" looks like a healthy bearing.
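The matrix is easy to keep honest if it lives next to the code rather than in a spreadsheet. A sketch of one way to record it, with the failure mode as a first-class field — the field names and example values are illustrative, not a standard schema:

```python
# Sketch: the spec matrix as a per-sensor record. All values illustrative.
from dataclasses import dataclass

@dataclass
class SensorSpec:
    name: str
    range_eng: tuple       # (low, high) in engineering units, with fault margin
    accuracy_abs: float    # absolute error budget at the operating point
    resolution_bits: int   # ADC bits (or note the noise floor for analog)
    sample_rate_hz: float  # from process bandwidth, not sensor capability
    latency_ms: float      # including comms
    env_rating: str        # IP / ATEX / EMC class as applicable
    calibration: str       # factory vs field, plus expected drift
    failure_mode: str      # what the output looks like when it fails

motor_temp = SensorSpec(
    name="motor_stator_temp",
    range_eng=(-20.0, 200.0),   # margin above the 150 degC fault case
    accuracy_abs=2.0,           # +/- 2 degC at the operating point
    resolution_bits=12,
    sample_rate_hz=0.1,
    latency_ms=500.0,
    env_rating="IP67",
    calibration="factory; recheck drift yearly",
    failure_mode="open circuit -> reads upscale",
)
```

Writing the record forces the conversation: if nobody can fill in `failure_mode`, that's the finding.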
Comms protocol affects the data quality more than the sensor
- 4-20 mA analog — the industrial workhorse. Robust, well-understood, requires an ADC at the controller. Open / short detection comes free with NAMUR levels.
- IO-Link — the modern smart-sensor protocol. Two-way, supports parametrisation, exposes diagnostics. Adoption is widespread; we default to IO-Link for new sensor selections where supported.
- Modbus RTU / TCP — common on smart sensors. Cheap, but the polling architecture limits sample rate.
- EtherCAT / Profinet — when you need deterministic high-rate sampling, especially from many sensors.
- CAN bus — common in mobile / vehicle applications. Bandwidth-limited at high node counts.
- Wireless (LoRa / Bluetooth / 802.15.4) — for sensors where wiring isn't viable. Always with a battery / energy budget plan.
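The "open/short detection comes free" point for 4-20 mA is worth making concrete. NAMUR NE43 reserves the bands below 3.6 mA and above 21.0 mA for transmitter faults, so a broken wire is distinguishable from a cold reading. A minimal decode sketch — the NE43 band limits are real, but the 0-150 °C scaling is an assumed example range:

```python
# Sketch: decode a 4-20 mA loop with NAMUR NE43 fault bands.
# <3.6 mA or >21.0 mA means the loop/transmitter is faulted, not the process.

def decode_4_20ma(current_ma: float, lo: float = 0.0, hi: float = 150.0):
    """Return (value, health) for a 4-20 mA signal scaled to [lo, hi]."""
    if current_ma < 3.6 or current_ma > 21.0:
        return None, "faulted"          # open/short circuit or transmitter fault
    value = lo + (current_ma - 4.0) / 16.0 * (hi - lo)
    if current_ma < 4.0 or current_ma > 20.0:
        return value, "out-of-range"    # saturated, but the loop is alive
    return value, "good"

print(decode_4_20ma(12.0))   # mid-span -> (75.0, 'good')
print(decode_4_20ma(2.0))    # broken wire -> (None, 'faulted')
```

With Modbus or a plain 0-10 V input you would have to build this distinction yourself, which is exactly the polling-and-diagnostics gap the protocol choice creates.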
Diagnostics — the thing that makes data useful in 18 months
Every sensor signal we log includes:
- The sensor value
- A sensor-health flag (good / faulted / stale / out-of-range)
- The timestamp of the actual measurement (not the timestamp the database wrote it)
- The sensor's last calibration date (if knowable)
Without sensor-health metadata, your year-of-data is full of zeros from broken sensors that nobody noticed. With it, you can filter properly and the analysis is honest.
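A sketch of what that looks like at analysis time, with an illustrative record shape (the field names are ours, not a standard):

```python
# Sketch: log value + health flag + measurement timestamp together,
# then filter on health before any analysis. Values are illustrative.
from datetime import datetime, timezone

samples = [
    {"value": 71.2, "health": "good",
     "measured_at": datetime(2024, 5, 1, 12, 0, 0, tzinfo=timezone.utc)},
    {"value": 0.0, "health": "faulted",     # broken sensor reading "0 degC"
     "measured_at": datetime(2024, 5, 1, 12, 0, 10, tzinfo=timezone.utc)},
    {"value": 71.9, "health": "good",
     "measured_at": datetime(2024, 5, 1, 12, 0, 20, tzinfo=timezone.utc)},
]

# Without the flag, that 0.0 is indistinguishable from a real cold reading.
usable = [s["value"] for s in samples if s["health"] == "good"]
print(usable)   # [71.2, 71.9]
```

The same filter run a year from now is what turns the archive into an honest dataset instead of a pile of zeros.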
One pattern that always pays off
Redundant sensors on critical signals — two sensors of different make and model measuring the same thing. Disagreement triggers a maintenance event, not a process trip. The cost is two sensors instead of one. The value is a continuous health check on the sensor population, with no extra calibration overhead.
One pattern we'd avoid
Buying the cheapest sensor that "matches the spec on paper". The spec on paper is usually best-case. Field accuracy is what counts, and field accuracy is found in independent testing, not vendor catalogs.
What sensors are giving you trouble? We're also curious whether anyone is using machine-vision-based "virtual sensors" to replace physical ones.