Architecture components

This file describes each component of the architecture diagram in detail. Use it as a reference when implementing or modifying a module.

Per-camera pipeline

IP Camera / Raspberry Pi

  • Role: image acquisition.
  • Expected count: 2–3 channels.
  • Output: continuous video stream.
  • Notes:
    • Cameras may be replaced; the impact on recognition quality must be measurable.
    • Positioning tolerance: rotation ±5°, displacement up to 10% of frame.
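
The positioning tolerance above can be checked mechanically during camera setup. The following is a minimal sketch, not part of the project code: the `MountingOffset` type and `within_tolerance` function are hypothetical names, and it assumes that "10% of frame" is applied per axis (10% of the width horizontally, 10% of the height vertically).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MountingOffset:
    """Measured deviation of a camera from its reference position (hypothetical type)."""
    rotation_deg: float  # signed rotation relative to the reference pose
    dx_px: float         # horizontal shift in pixels
    dy_px: float         # vertical shift in pixels


MAX_ROTATION_DEG = 5.0     # rotation tolerance stated in this document
MAX_SHIFT_FRACTION = 0.10  # 10% of the frame, applied per axis (assumption)


def within_tolerance(offset: MountingOffset, frame_w: int, frame_h: int) -> bool:
    """Return True if the offset stays inside the stated positioning tolerance."""
    if abs(offset.rotation_deg) > MAX_ROTATION_DEG:
        return False
    if abs(offset.dx_px) > MAX_SHIFT_FRACTION * frame_w:
        return False
    return abs(offset.dy_px) <= MAX_SHIFT_FRACTION * frame_h
```

A check like this could back the live camera preview used for mechanical setup, but how the offset is measured is outside the scope of this sketch.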

Position determination

  • Role: decide when to capture a frame.
  • Options:
    • physical sensor on the conveyor (simpler, deterministic);
    • AI-based detector (more flexible, needs training).
  • Output: trigger signal to frame extractor.
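
Both options can sit behind one small interface so the frame extractor does not care which one is in use. The sketch below is illustrative only: `TriggerSource`, `SensorTrigger` and `should_capture` are hypothetical names, and firing on the rising edge of the sensor signal is an assumption about how a physical sensor would be read.

```python
from typing import Callable, Protocol


class TriggerSource(Protocol):
    """Anything that can tell the frame extractor to capture now (hypothetical interface)."""

    def should_capture(self) -> bool: ...


class SensorTrigger:
    """Fires once each time the sensor reading goes from low to high."""

    def __init__(self, read_sensor: Callable[[], bool]) -> None:
        # read_sensor is a stand-in for whatever reads the physical sensor.
        self._read_sensor = read_sensor
        self._previous = False

    def should_capture(self) -> bool:
        current = bool(self._read_sensor())
        fired = current and not self._previous  # rising edge only
        self._previous = current
        return fired
```

An AI-based detector could implement the same `should_capture` method, which keeps the choice between the two options a configuration matter.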

Video stream → frame

  • Role: extract still frames from the camera stream on demand.
  • Input: trigger signal.
  • Output: raw image frame.

Image preparation

  • Role: make the image suitable for the YOLO model.
  • Operations:
    • normalize pixel values / brightness / contrast;
    • filter noise;
    • crop to region of interest;
    • resize to model input size;
    • rotate to compensate for small alignment errors.
  • Must be configurable per camera channel via JSON config.
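
To illustrate per-channel configuration, here is a minimal sketch of loading preprocessing parameters from JSON. The JSON layout and every key name (`channels`, `preprocessing`, `roi`, `input_size`, `rotation_deg`, `denoise`) are assumptions made for this example; the actual config schema is not defined in this file. Brightness and contrast settings are left out for brevity.

```python
import json
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class PrepConfig:
    """Preprocessing parameters for one camera channel (hypothetical schema)."""
    roi: Optional[Tuple[int, ...]]  # crop region as (x, y, width, height); None means no crop
    input_size: Tuple[int, ...]     # model input size as (width, height)
    rotation_deg: float             # fixed rotation compensating for alignment error
    denoise: bool                   # whether to apply noise filtering


def load_prep_configs(text: str) -> Dict[str, PrepConfig]:
    """Parse a JSON document into one PrepConfig per channel."""
    raw = json.loads(text)
    configs: Dict[str, PrepConfig] = {}
    for channel, entry in raw["channels"].items():
        prep = entry["preprocessing"]
        roi = prep.get("roi")
        configs[channel] = PrepConfig(
            roi=tuple(roi) if roi is not None else None,
            input_size=tuple(prep["input_size"]),
            rotation_deg=float(prep.get("rotation_deg", 0.0)),
            denoise=bool(prep.get("denoise", False)),
        )
    return configs


# Example document; all values are illustrative.
EXAMPLE = """
{
  "channels": {
    "1": {"preprocessing": {"roi": [100, 50, 800, 600], "input_size": [640, 640],
                            "rotation_deg": -1.5, "denoise": true}},
    "2": {"preprocessing": {"input_size": [640, 640]}}
  }
}
"""
```

Optional keys fall back to neutral defaults here, so a channel only needs to list the operations it actually uses.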

Inference layer

YOLO instance

  • Role: run defect detection on a prepared image.
  • One instance per camera / tab.
  • Each instance has:
    • own model weights;
    • own confidence threshold;
    • own class mapping;
    • own preprocessing parameters.
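
The per-instance settings can be pictured as a small record that is applied to raw detector output. This is a sketch under assumptions: the `(class_id, confidence, box)` tuple shape, the `InstanceSettings` type and the `filter_detections` function are hypothetical, and no actual YOLO library call is shown.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# (class_id, confidence, (x1, y1, x2, y2)); this shape is an assumption for the example.
RawDetection = Tuple[int, float, Tuple[float, float, float, float]]


@dataclass(frozen=True)
class InstanceSettings:
    """Settings owned by one YOLO instance (hypothetical type)."""
    weights_path: str
    confidence_threshold: float
    class_names: Dict[int, str]  # class mapping: model class id to label


def filter_detections(raw: Iterable[RawDetection], settings: InstanceSettings) -> List[dict]:
    """Keep detections above the threshold whose class is known to this instance."""
    kept = []
    for class_id, confidence, box in raw:
        if confidence < settings.confidence_threshold:
            continue  # below this channel's confidence threshold
        if class_id not in settings.class_names:
            continue  # class not mapped for this channel
        kept.append({
            "label": settings.class_names[class_id],
            "confidence": confidence,
            "box": box,
        })
    return kept
```

Because each instance carries its own settings object, two channels can run the same detector code with different thresholds and class mappings.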

AI Model

  • Role: manage model artifacts and retraining lifecycle.
  • Responsibilities:
    • load weights for YOLO instances;
    • export updated weights after retraining;
    • keep versioned model history;
    • serve as an abstraction between training pipeline and inference.
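
One way to read these responsibilities is as a small registry that hands out the latest weights and keeps every earlier version. The sketch below is an in-memory illustration only; `ModelRegistry`, `ModelVersion` and keeping a separate history per channel are assumptions, and a real implementation would presumably persist its entries in the DB.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List


@dataclass(frozen=True)
class ModelVersion:
    """One entry in the versioned model history (hypothetical type)."""
    channel: str
    version: int
    weights_path: str
    created_at: datetime


class ModelRegistry:
    """In-memory registry of model versions, one history per channel (assumption)."""

    def __init__(self) -> None:
        self._history: Dict[str, List[ModelVersion]] = {}

    def register(self, channel: str, weights_path: str) -> ModelVersion:
        """Record newly exported weights and assign the next version number."""
        versions = self._history.setdefault(channel, [])
        entry = ModelVersion(channel, len(versions) + 1, weights_path,
                             datetime.now(timezone.utc))
        versions.append(entry)
        return entry

    def latest(self, channel: str) -> ModelVersion:
        """Return the newest version; raises KeyError for an unknown channel."""
        return self._history[channel][-1]

    def history(self, channel: str) -> List[ModelVersion]:
        """Return a copy of the version history, oldest first."""
        return list(self._history.get(channel, []))
```

With this shape, the training module only calls `register`, and a YOLO instance only calls `latest`, so neither needs to know about the other.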

Central services

Main Server

  • Role: coordination and API.
  • Responsibilities:
    • read JSON config at startup;
    • manage camera channels;
    • dispatch prepared images to the correct YOLO instance;
    • collect detection results;
    • persist events to DB;
    • expose HTTP/WebSocket API for WEB UI;
    • handle retraining requests.
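
The dispatch and collection steps can be sketched as a lookup from channel to detector. This is a simplified illustration: the `Dispatcher` class is a hypothetical name, detectors are modelled as plain callables, and persistence and the HTTP/WebSocket API are left out.

```python
from typing import Any, Callable, Dict, List

# A detector takes a prepared image and returns a list of detections (assumed shape).
Detector = Callable[[Any], List[dict]]


class Dispatcher:
    """Routes prepared images to the detector registered for their channel."""

    def __init__(self) -> None:
        self._detectors: Dict[str, Detector] = {}
        self.results: List[dict] = []  # collected detection results, in arrival order

    def register(self, channel: str, detector: Detector) -> None:
        self._detectors[channel] = detector

    def dispatch(self, channel: str, image: Any) -> List[dict]:
        """Run the channel's detector on the image and collect the result."""
        try:
            detector = self._detectors[channel]
        except KeyError:
            raise ValueError(f"no YOLO instance registered for channel {channel!r}") from None
        detections = detector(image)
        self.results.append({"channel": channel, "detections": detections})
        return detections
```

Rejecting an unknown channel explicitly makes a misconfigured channel list visible at the first frame rather than producing silent gaps in the history.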

Database (DB)

  • Role: persistent storage.
  • Stores:
    • inspection events (sole ID, timestamp, channel, result);
    • original images;
    • annotated images with bounding boxes;
    • expert verification labels;
    • model versions;
    • configuration snapshots.
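
As an illustration of the inspection-event part of this list, here is a minimal sketch using SQLite from the Python standard library. The choice of SQLite, the table and column names, and storing images as file paths rather than blobs are all assumptions; model versions and configuration snapshots would need their own tables and are omitted.

```python
import sqlite3
from typing import Optional

SCHEMA = """
CREATE TABLE IF NOT EXISTS inspection_event (
    id             INTEGER PRIMARY KEY AUTOINCREMENT,
    sole_id        TEXT NOT NULL,
    captured_at    TEXT NOT NULL,  -- ISO 8601 timestamp
    channel        TEXT NOT NULL,
    result         TEXT NOT NULL,
    original_path  TEXT,           -- path to the original image (assumption)
    annotated_path TEXT,           -- path to the image with bounding boxes (assumption)
    expert_label   TEXT            -- NULL until an expert verifies the event
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the table if it does not exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_event(conn: sqlite3.Connection, sole_id: str, captured_at: str,
                 channel: str, result: str) -> Optional[int]:
    """Insert one inspection event and return its row id."""
    cur = conn.execute(
        "INSERT INTO inspection_event (sole_id, captured_at, channel, result) "
        "VALUES (?, ?, ?, ?)",
        (sole_id, captured_at, channel, result),
    )
    conn.commit()
    return cur.lastrowid


def set_expert_label(conn: sqlite3.Connection, event_id: int, label: str) -> None:
    """Attach an expert verification label to an existing event."""
    conn.execute("UPDATE inspection_event SET expert_label = ? WHERE id = ?",
                 (label, event_id))
    conn.commit()
```

Keeping the expert label on the event row makes it easy to select verified production data later for retraining.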

User interface

WEB UI

  • Role: human operator and expert interface.
  • Views:
    • history list with filtering;
    • detail view: original image, annotated image, probability, status;
    • expert feedback buttons: correct / incorrect;
    • multi-channel tabs (1, 2, 3) with per-channel YOLO settings;
    • live camera preview for mechanical setup;
    • settings form;
    • retraining panel with date-range restriction.

Development / testing components

Fake input data (factory environment emulation)

  • Role: offline input generator.
  • Used when cameras are not connected.
  • Produces synthetic frames or replays recorded frames.
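
The replay mode could look roughly like the sketch below, which yields recorded frames from a directory in filename order. The function name `replay_frames`, the `*.png` default pattern and the `loop` option are assumptions for illustration; frames are returned as raw bytes so that no image library is required.

```python
from itertools import cycle
from pathlib import Path
from typing import Iterator


def replay_frames(directory: Path, pattern: str = "*.png",
                  loop: bool = False) -> Iterator[bytes]:
    """Yield recorded frames as raw bytes, sorted by filename.

    With loop=True the sequence repeats forever, which emulates a
    continuous stream. Because this is a generator, the check for an
    empty directory runs when the first frame is requested.
    """
    paths = sorted(directory.glob(pattern))
    if not paths:
        raise FileNotFoundError(f"no frames matching {pattern!r} in {directory}")
    source = cycle(paths) if loop else iter(paths)
    for path in source:
        yield path.read_bytes()
```

Feeding this generator into the same entry point as the real frame extractor would let the rest of the pipeline run unchanged when cameras are not connected.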

Artificial image generator

  • Role: augment datasets to simulate factory disturbances.
  • Applied to:
    • initial dataset;
    • learning dataset.
  • Transformations:
    • lighting effects;
    • PNG pattern overlays (dust, dirt, lens contamination);
    • rotation (±5°);
    • other noise.
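
A generator of this kind usually starts by drawing random parameters for each output image. The sketch below covers only that sampling step and does not touch pixels. The ±5° rotation range comes from the list above; the brightness range of 0.7 to 1.3, the overlay probability and all names (`AugmentationParams`, `sample_params`) are assumptions.

```python
import random
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class AugmentationParams:
    """Random parameters for one augmented image (hypothetical type)."""
    rotation_deg: float     # within the ±5° range listed above
    brightness_gain: float  # multiplicative lighting change (range is an assumption)
    overlay: Optional[str]  # PNG pattern to overlay, or None for no overlay


def sample_params(rng: random.Random, overlays: Sequence[str],
                  overlay_probability: float = 0.5) -> AugmentationParams:
    """Draw one set of augmentation parameters from the given random source."""
    use_overlay = bool(overlays) and rng.random() < overlay_probability
    overlay = rng.choice(list(overlays)) if use_overlay else None
    return AugmentationParams(
        rotation_deg=rng.uniform(-5.0, 5.0),
        brightness_gain=rng.uniform(0.7, 1.3),
        overlay=overlay,
    )
```

Passing in a seeded `random.Random` keeps an augmented dataset reproducible, which helps when comparing model versions trained on it.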

Learning Dataset

  • Role: curated data used to train / retrain the model.
  • Sources:
    • initial dataset;
    • artificial generator output;
    • verified production data from expert feedback.

Learning / Training module

  • Role: run model training and retraining.
  • Inputs:
    • Learning Dataset;
    • configuration (hyperparameters, date range).
  • Outputs:
    • new model weights;
    • training metrics;
    • updated AI Model entry.
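
The date-range part of the configuration can be illustrated with a simple selection step over production samples. This is a sketch under assumptions: the `Sample` type and `select_samples` function are hypothetical, the range is treated as inclusive on both ends, and only expert-verified samples are kept. The initial dataset and generator output would be added separately.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Sample:
    """One production image that may be used for retraining (hypothetical type)."""
    image_path: str
    captured_on: date
    verified: bool  # True once an expert has confirmed or corrected the label


def select_samples(samples: Iterable[Sample],
                   start: Optional[date] = None,
                   end: Optional[date] = None) -> List[Sample]:
    """Return verified samples captured within [start, end]; None means unbounded."""
    selected = []
    for sample in samples:
        if not sample.verified:
            continue
        if start is not None and sample.captured_on < start:
            continue
        if end is not None and sample.captured_on > end:
            continue
        selected.append(sample)
    return selected
```

Restricting by date range lets an operator exclude a period with known problems, for example a misaligned camera, from the next retraining run.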