Architecture components
This file describes each component of the architecture diagram in detail. Use it as a reference when implementing or modifying a module.
Per-camera pipeline
IP Camera / Raspberry Pi
- Role: image acquisition.
- Expected count: 2–3 channels.
- Output: continuous video stream.
- Notes:
  - Cameras may be replaced; the impact on recognition quality must be measurable.
  - Positioning tolerance: rotation ±5°, displacement up to 10% of frame.
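The positioning tolerance can be expressed as a simple check. The following is a minimal sketch, assuming the rotation and displacement of a camera have already been measured against a reference frame. The function and parameter names are hypothetical, and applying the 10% limit to each axis separately is an assumption.

```python
MAX_ROTATION_DEG = 5.0      # rotation tolerance: ±5°
MAX_SHIFT_FRACTION = 0.10   # displacement tolerance: 10% of frame


def within_tolerance(rotation_deg, shift_x_px, shift_y_px, frame_w, frame_h):
    """Return True if the measured camera offset is inside the tolerance."""
    if abs(rotation_deg) > MAX_ROTATION_DEG:
        return False
    # Assumption: the 10% limit applies to width and height independently.
    return (abs(shift_x_px) <= MAX_SHIFT_FRACTION * frame_w
            and abs(shift_y_px) <= MAX_SHIFT_FRACTION * frame_h)
```

A check like this could run after a camera replacement, before recognition quality is re-measured.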
Position determination
- Role: decide when to capture a frame.
- Options:
  - Physical sensor on conveyor (simpler, deterministic);
  - AI-based detector (more flexible, needs training).
- Output: trigger signal to frame extractor.
Video stream → frame
- Role: extract still frames from the camera stream on demand.
- Input: trigger signal.
- Output: raw image frame.
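The trigger-driven extraction can be sketched as a generator that passes a frame through only when the trigger fires. This is an illustration only: `extract_on_trigger` and the `triggered` callable are hypothetical names, and polling the trigger once per frame is an assumption about how the signal is delivered.

```python
from typing import Callable, Iterable, Iterator, TypeVar

Frame = TypeVar("Frame")


def extract_on_trigger(stream: Iterable[Frame],
                       triggered: Callable[[], bool]) -> Iterator[Frame]:
    """Yield only those frames of the stream for which the trigger fired."""
    for frame in stream:
        # Assumption: the trigger state is polled once per incoming frame.
        if triggered():
            yield frame
```

Because the trigger is just a callable, a physical sensor and an AI-based detector could both be plugged in behind the same interface.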
Image preparing
- Role: make the image suitable for the YOLO model.
- Operations:
  - normalize pixel values / brightness / contrast;
  - filter noise;
  - crop to region of interest;
  - resize to model input size;
  - rotate to compensate for small alignment errors.
- Must be configurable per camera channel via JSON config.
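A per-channel preprocessing configuration might look like the sketch below. The JSON layout, the key names (`channels`, `crop`, `input_size`, `rotate_deg`, `denoise`) and the `PrepConfig` class are assumptions made for illustration; the actual config schema is not defined in this file.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PrepConfig:
    crop: tuple            # region of interest as (x, y, width, height)
    input_size: int        # side length the model expects, in pixels
    rotate_deg: float = 0.0
    denoise: bool = True


# Hypothetical config fragment: one entry per camera channel.
EXAMPLE_JSON = """
{
  "channels": {
    "1": {"crop": [100, 50, 800, 600], "input_size": 640, "rotate_deg": -1.5},
    "2": {"crop": [0, 0, 1280, 720], "input_size": 640, "denoise": false}
  }
}
"""


def load_prep_configs(text):
    """Parse the JSON text into a {channel number: PrepConfig} mapping."""
    raw = json.loads(text)["channels"]
    return {
        int(channel): PrepConfig(
            crop=tuple(cfg["crop"]),
            input_size=cfg["input_size"],
            rotate_deg=cfg.get("rotate_deg", 0.0),
            denoise=cfg.get("denoise", True),
        )
        for channel, cfg in raw.items()
    }
```

Optional keys fall back to defaults, so a channel only needs to state what differs from the common setup.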
Inference layer
YOLO instance
- Role: run defect detection on a prepared image.
- One instance per camera / tab.
- Each instance has:
  - own model weights;
  - own confidence threshold;
  - own class mapping;
  - own preprocessing parameters.
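As an illustration of per-instance settings, the sketch below applies an instance's own confidence threshold and class mapping to raw detections. The `InstanceSettings` class, the detection tuple layout and the class names are hypothetical; no specific YOLO library API is assumed.

```python
from dataclasses import dataclass


@dataclass
class InstanceSettings:
    weights_path: str            # model weights used by this instance
    confidence_threshold: float  # detections below this value are dropped
    class_mapping: dict          # class index -> human-readable name


def filter_detections(detections, settings):
    """Keep detections at or above the threshold and name their classes.

    `detections` is assumed to be a list of (class_index, confidence, box).
    """
    kept = []
    for class_index, confidence, box in detections:
        if confidence >= settings.confidence_threshold:
            kept.append((settings.class_mapping[class_index], confidence, box))
    return kept
```

Two channels can then share the same detection code while using different thresholds and class names.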
AI Model
- Role: manage model artifacts and retraining lifecycle.
- Responsibilities:
  - load weights for YOLO instances;
  - export updated weights after retraining;
  - keep versioned model history;
  - serve as an abstraction between the training pipeline and inference.
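The versioned history could be kept by a small registry like the one sketched below. `ModelRegistry` and its methods are hypothetical names, and numbering versions from 1 in registration order is an assumption.

```python
class ModelRegistry:
    """In-memory history of model weights, one entry per version."""

    def __init__(self):
        self._versions = []  # weights paths; version N is stored at index N - 1

    def register(self, weights_path):
        """Store new weights (e.g. after retraining) and return their version."""
        self._versions.append(weights_path)
        return len(self._versions)

    def get(self, version):
        """Return the weights path of a specific version."""
        if not 1 <= version <= len(self._versions):
            raise KeyError(f"unknown model version {version}")
        return self._versions[version - 1]

    def latest(self):
        """Return (version, weights_path) of the newest entry."""
        if not self._versions:
            raise LookupError("no model registered yet")
        return len(self._versions), self._versions[-1]
```

Inference code would only call `latest()` or `get()`, while the training side only calls `register()`, which keeps the two decoupled.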
Central services
Main Server
- Role: coordination and API.
- Responsibilities:
  - read JSON config at startup;
  - manage camera channels;
  - dispatch prepared images to the correct YOLO instance;
  - collect detection results;
  - persist events to DB;
  - expose HTTP/WebSocket API for WEB UI;
  - handle retraining requests.
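The dispatch and collection steps can be sketched as follows. This is a simplified illustration: YOLO instances are modelled as plain callables keyed by channel number, and a list stands in for the DB; `dispatch` is a hypothetical name.

```python
def dispatch(instances, events, channel, image):
    """Run the prepared image through the instance of its channel.

    `instances` maps channel number -> callable(image) returning a result.
    `events` is any list-like store standing in for the DB.
    """
    if channel not in instances:
        raise ValueError(f"no YOLO instance configured for channel {channel}")
    result = instances[channel](image)
    # Collect the result so that it can be persisted and shown in the UI.
    events.append({"channel": channel, "result": result})
    return result
```

Rejecting unknown channels explicitly avoids silently running an image through the wrong model.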
Database (DB)
- Role: persistent storage.
- Stores:
  - inspection events (sole ID, timestamp, channel, result);
  - original images;
  - annotated images with bounding boxes;
  - expert verification labels;
  - model versions;
  - configuration snapshots.
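One possible shape for the inspection event storage is sketched below using SQLite from the Python standard library. The database engine, the table and column names, and storing images as file paths rather than binary data are all assumptions; this file does not prescribe them.

```python
import sqlite3

# Hypothetical schema: one row per inspection event.
SCHEMA = """
CREATE TABLE inspection_event (
    id                   INTEGER PRIMARY KEY,
    sole_id              TEXT    NOT NULL,
    timestamp            TEXT    NOT NULL,  -- ISO 8601 string
    channel              INTEGER NOT NULL,
    result               TEXT    NOT NULL,
    original_image_path  TEXT,
    annotated_image_path TEXT,
    expert_label         TEXT,              -- NULL until an expert verifies
    model_version        INTEGER
);
"""


def open_db(path=":memory:"):
    """Open the database and create the table (fresh database assumed)."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_event(conn, sole_id, timestamp, channel, result):
    """Insert one inspection event and return its row id."""
    cur = conn.execute(
        "INSERT INTO inspection_event (sole_id, timestamp, channel, result) "
        "VALUES (?, ?, ?, ?)",
        (sole_id, timestamp, channel, result),
    )
    conn.commit()
    return cur.lastrowid
```

Leaving `expert_label` empty until verification makes it easy to select only verified events later.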
User interface
WEB UI
- Role: human operator and expert interface.
- Views:
  - history list with filtering;
  - detail view: original image, annotated image, probability, status;
  - expert feedback buttons: correct / incorrect;
  - multi-channel tabs (1, 2, 3) with per-channel YOLO settings;
  - live camera preview for mechanical setup;
  - settings form;
  - retraining panel with date-range restriction.
Development / testing components
- Role: offline input generator.
- Used when cameras are not connected.
- Produces synthetic frames or replays recorded frames.
Artificial image generator
- Role: augment datasets to simulate factory disturbances.
- Applied to:
  - initial dataset;
  - learning dataset.
- Transformations:
  - lighting effects;
  - PNG pattern overlays (dust, dirt, lens contamination);
  - rotation (±5°);
  - other noise.
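Two of the transformations can be sketched in plain Python: sampling random augmentation parameters and applying a lighting change to a grayscale image. The function names, the image representation (nested lists of 0–255 values) and the brightness range of 0.7 to 1.3 are assumptions; only the ±5° rotation range comes from this file.

```python
import random


def sample_augmentation(rng):
    """Draw random parameters for one augmented copy of an image."""
    return {
        "rotation_deg": rng.uniform(-5.0, 5.0),  # rotation range from the list above
        "brightness": rng.uniform(0.7, 1.3),     # assumed lighting range
    }


def apply_brightness(pixels, factor):
    """Scale grayscale pixel values by `factor`, clamped to 0..255."""
    return [
        [min(255, max(0, round(p * factor))) for p in row]
        for row in pixels
    ]
```

Passing a seeded `random.Random` instance as `rng` makes a generated dataset reproducible.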
Learning Dataset
- Role: curated data used to train / retrain the model.
- Sources:
  - initial dataset;
  - artificial generator output;
  - verified production data from expert feedback.
Learning / Training module
- Role: run model training and retraining.
- Inputs:
  - Learning Dataset;
  - configuration (hyperparameters, date range).
- Outputs:
  - new model weights;
  - training metrics;
  - updated AI Model entry.
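The date-range restriction on retraining input could be applied as in the sketch below, which selects verified production samples for a retraining run. The sample layout (dicts with `date` and `expert_label` keys) and the function name are hypothetical, and treating the range as inclusive on both ends is an assumption.

```python
from datetime import date


def select_training_samples(samples, start, end):
    """Keep expert-verified samples whose date lies within [start, end]."""
    return [
        s for s in samples
        if s["expert_label"] is not None and start <= s["date"] <= end
    ]
```

For example, `select_training_samples(samples, date(2024, 1, 1), date(2024, 1, 31))` would keep only verified samples from January 2024.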