The shape of every vision project we've shipped
After a few dozen vision projects across automotive, packaging, white goods, and metal, we've converged on a six-stage rollout. It's not a Gantt chart and it's not Agile theatre; it's the order in which questions actually need to be answered to avoid the failure modes we've already lived through.
Stage 1 — Feasibility (1–2 weeks)
Goal: prove the physics. Forget the model.
- Capture 200–500 images on the actual line, with the camera you actually plan to deploy.
- Have a human label them. Yes, by hand.
- Look at them. Are the defects visible? If you can't see them, no model can.
Stage 2 — Offline prototype (2–4 weeks)
Goal: hit target accuracy on an offline dataset. Build the dataset properly:
- Train / val / test split by production batch, not random shuffle. Random shuffle leaks information across splits (near-identical parts from the same batch land in both train and test) and makes the test scores meaningless.
- At least one batch from a different shift / week / line.
- Have the customer's QA team review the labels. Don't trust your own.
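As a rough illustration of the batch-wise split described above, here is a minimal Python sketch. The record layout (pairs of image ID and batch ID), the function name `split_by_batch`, and the 15 % / 15 % holdout fractions are all assumptions made for the example, not something our pipeline prescribes.

```python
import random
from collections import defaultdict


def split_by_batch(records, val_frac=0.15, test_frac=0.15, seed=0):
    """Assign whole production batches to train / val / test.

    `records` is an iterable of (image_id, batch_id) pairs. This layout is
    hypothetical; adapt it to however your images are catalogued.
    """
    # Group images by the batch they came from.
    by_batch = defaultdict(list)
    for image_id, batch_id in records:
        by_batch[batch_id].append(image_id)

    # Shuffle the batches, never the individual images.
    batches = sorted(by_batch)
    random.Random(seed).shuffle(batches)

    n = len(batches)
    n_test = max(1, round(n * test_frac))
    n_val = max(1, round(n * val_frac))
    if n_test + n_val >= n:
        raise ValueError("not enough batches to leave any for training")

    assignment = {
        "test": batches[:n_test],
        "val": batches[n_test:n_test + n_val],
        "train": batches[n_test + n_val:],
    }
    # Expand each split back into image IDs.
    return {
        name: [img for b in chosen for img in by_batch[b]]
        for name, chosen in assignment.items()
    }


# Example: 10 batches of 5 images each.
records = [(f"img_{b}_{i}", f"batch_{b}") for b in range(10) for i in range(5)]
splits = split_by_batch(records)
```

The sketch picks holdout batches at random. In practice you would probably pin the batch from a different shift, week, or line into the test split by hand rather than leave it to the shuffle.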
Stage 3 — Bench station (2–3 weeks)
Goal: run on real samples, off the line, in a representative cell. This is where lighting and triggering get finalised. This is where you find out the operator was leaving the door open during the day shift, raising ambient light by 200 lux. (True story.)
Stage 4 — Shadow mode (4–8 weeks)
Goal: run on the live line with no consequences. The station sees every part, makes a call, logs it, but the call doesn't actuate anything. Operators continue inspecting manually.
This is the stage everyone wants to skip. Don't. Every project we've seen go directly from Stage 3 to Stage 5 has come back to Stage 4 within three months, with worse trust than if it had started here.
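To make "makes a call, logs it, doesn't actuate" concrete, here is a minimal Python sketch of a shadow-mode logger. Everything in it is an assumption for illustration: the `ShadowRecord` fields, the CSV log, the 0.5 reject threshold, and the idea of joining the log against the operators' manual dispositions by part ID afterwards.

```python
import csv
import io
from dataclasses import asdict, dataclass
from datetime import datetime, timezone

FIELDS = ["timestamp", "part_id", "score", "model_call"]


@dataclass
class ShadowRecord:
    timestamp: str
    part_id: str
    score: float      # hypothetical defect score in [0, 1]
    model_call: str   # "accept" or "reject"


def shadow_inspect(part_id, score, writer, reject_threshold=0.5):
    """Make a call and log it. Nothing here talks to the reject mechanism."""
    call = "reject" if score >= reject_threshold else "accept"
    record = ShadowRecord(
        timestamp=datetime.now(timezone.utc).isoformat(),
        part_id=part_id,
        score=score,
        model_call=call,
    )
    writer.writerow(asdict(record))
    return record


def disagreement_rate(model_calls, operator_calls):
    """Fraction of parts, seen by both, where model and operator differ."""
    shared = model_calls.keys() & operator_calls.keys()
    if not shared:
        raise ValueError("no parts in common between the two logs")
    return sum(model_calls[p] != operator_calls[p] for p in shared) / len(shared)


# Example: log two parts to an in-memory CSV, then compare with the operators.
log = io.StringIO()
writer = csv.DictWriter(log, fieldnames=FIELDS)
writer.writeheader()
calls = {}
for part_id, score in [("P-001", 0.91), ("P-002", 0.12)]:
    calls[part_id] = shadow_inspect(part_id, score, writer).model_call
operators = {"P-001": "reject", "P-002": "accept"}
rate = disagreement_rate(calls, operators)
```

The point of the design is that the only output is a log line. Disagreements with the operators are reviewed offline, which is what builds the trust the later stages depend on.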
Stage 5 — Scoped production with operator override (4 weeks)
Goal: the station rejects parts, but operators can override every decision with a single button press. Every override is logged. The override rate is the trust metric: if it's above 5 %, you're not ready.
Stage 6 — Full automation
Goal: the override button gets removed, or replaced with a "raise concern" button that flags for offline review but doesn't change the disposition. By the time you reach this stage, the system has earned the right to be the final word on the part.
The numbers we use to know we're done
- False reject rate < 0.5 % over a representative production week
- False accept rate < 0.05 % on a labelled holdout
- Operator override rate < 1 % in scoped production
- Camera/lighting health score > 95 % in any rolling 7-day window
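The four thresholds above can be checked mechanically. The following Python sketch shows one way to do it, assuming the rates have already been measured and are expressed as fractions (0.005 for 0.5 %). The `RolloutMetrics` class and `failed_criteria` function are names invented for this example, and the health score is assumed to be the worst value seen across the rolling 7-day windows.

```python
from dataclasses import dataclass


@dataclass
class RolloutMetrics:
    false_reject_rate: float  # over a representative production week
    false_accept_rate: float  # on a labelled holdout
    override_rate: float      # operator overrides / decisions, scoped production
    health_score: float       # worst rolling 7-day camera/lighting score


def failed_criteria(m):
    """Return the names of the exit criteria that are not yet met."""
    checks = {
        "false reject rate < 0.5 %": m.false_reject_rate < 0.005,
        "false accept rate < 0.05 %": m.false_accept_rate < 0.0005,
        "override rate < 1 %": m.override_rate < 0.01,
        "health score > 95 %": m.health_score > 0.95,
    }
    return [name for name, ok in checks.items() if not ok]


# Example with made-up numbers: the override rate is still too high.
metrics = RolloutMetrics(
    false_reject_rate=0.003,
    false_accept_rate=0.0002,
    override_rate=0.02,
    health_score=0.97,
)
remaining = failed_criteria(metrics)
```

Returning the list of failed criteria, rather than a single pass/fail flag, makes it easier to see in a weekly review which number is holding the rollout back.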
One pattern we'd never repeat
Going to Stage 5 with a model that hadn't seen at least four different production batches across two seasons. Vision systems trained on one month of data will drift in ways you don't predict, and the corrections happen in production, expensively.
What does your rollout look like? We're curious whether anyone has trimmed this down to four stages successfully.