# Plan: Multipart Image Ingest (Option 1)

## Goal
Add a new endpoint `POST /api/v1/ingest/with-images` that accepts:
- `metadata` — JSON string with `source_slug`, `external_id`, `payload`
- `images` — 0–N binary image files via `multipart/form-data`

The parser downloads images from its own source and uploads them directly to our service, so our side no longer has to fetch them from third-party CDNs.

## Architecture

### 1. Endpoint (`router_properties.py`)
- `metadata: str = Form(...)`: parsed and validated as JSON; must contain `source_slug` and a `payload` with at least one of `title` or `description`
- `images: list[UploadFile] = File(default=[])` — streamed to disk, not held in memory
- Flow:
  1. Parse & validate metadata
  2. Save `raw_parsing_data` (status = pending)
  3. Stream each `UploadFile` to `/var/lib/vmk/images/temp/{raw_id}/{idx}{ext}`
  4. Inject `_uploaded_image_paths` into `raw.payload`
  5. **Inline `await pipeline.process(raw.id)`**: runs inside the request rather than through the queue, because the images are already local and we want an immediate result
  6. Return `IngestResponse`
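
Steps 1 and 3 of the flow can be sketched with the standard library alone. This is a minimal illustration, not the final handler: `parse_metadata` and `temp_image_path` are hypothetical helper names, and the error type the endpoint maps to an HTTP status is left open.

```python
import json
from pathlib import Path
from typing import Optional

TEMP_ROOT = Path("/var/lib/vmk/images/temp")  # temp root named in the plan


def parse_metadata(raw: str) -> dict:
    """Parse the `metadata` form field and enforce the required keys."""
    try:
        meta = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"metadata is not valid JSON: {exc}") from exc
    if not isinstance(meta, dict):
        raise ValueError("metadata must be a JSON object")
    if not meta.get("source_slug"):
        raise ValueError("metadata.source_slug is required")
    payload = meta.get("payload")
    if not isinstance(payload, dict):
        raise ValueError("metadata.payload must be a JSON object")
    if not (payload.get("title") or payload.get("description")):
        raise ValueError("metadata.payload needs a title or a description")
    return meta


def temp_image_path(raw_id: int, idx: int, filename: Optional[str]) -> Path:
    """Build the step-3 temp path; only the extension of the client filename is used."""
    ext = Path(filename or "").suffix.lower()
    return TEMP_ROOT / str(raw_id) / f"{idx}{ext}"
```

Taking only the suffix from the client-supplied filename keeps path components under our control; the handler would translate `ValueError` into a 4xx response.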

### 2. Pipeline (`property_pipeline.py`)
- `_stage_process_images` checks `raw.payload.get("_uploaded_image_paths")`
  - If present → `_stage_process_uploaded_images(property_id, paths)`
  - If absent → `_stage_process_remote_images(property_id, urls)` (existing behaviour)
- New helper `_process_uploaded_one`:
  - Reads local file → SHA256, width, height via Pillow
  - Moves the file from `temp/{raw_id}/` to its permanent location `/{property_id}/{hash}{ext}`
  - Creates `PropertyImage` row with `downloaded` status
  - Runs AI image analysis via `OllamaClient.image_to_base64`
- On successful completion: cleans up temp dir for this raw_id
- On failure: leaves temp dir for inspection (cleanup later)
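
The branch selection and the temp-dir policy might look roughly like the sketch below. It is synchronous for brevity, whereas the real stages are presumably `async`. `run_image_stage`, the callable parameters, and the `image_urls` payload key are illustrative assumptions, as is treating an empty `_uploaded_image_paths` list as "present".

```python
import shutil
from pathlib import Path
from typing import Callable


def run_image_stage(
    payload: dict,
    temp_dir: Path,
    process_uploaded: Callable[[list], None],
    process_remote: Callable[[list], None],
) -> str:
    """Pick the image branch from the payload; remove temp_dir only on success."""
    paths = payload.get("_uploaded_image_paths")
    if paths is not None:
        # If process_uploaded raises, the exception propagates and temp_dir
        # stays on disk for later inspection.
        process_uploaded(paths)
        shutil.rmtree(temp_dir, ignore_errors=True)
        return "uploaded"
    # "image_urls" is a hypothetical key standing in for the existing URL list.
    process_remote(payload.get("image_urls", []))
    return "remote"
```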

### 3. Image Processing Helper (`image_downloader.py`)
- New method `ImageDownloader.process_local_file(property_id, temp_path, order)`:
  - Mirrors `download()` return type (`PropertyImageDownloadResult`)
  - No HTTP, just filesystem + Pillow

### 4. Limits & Validation
- Max files per request: 50 (configurable)
- Max file size: 10 MB each (configurable)
- FastAPI's `UploadFile` wraps a `SpooledTemporaryFile`, which already spills large uploads to disk, so we only copy it to the temp path in chunks.
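
One way to enforce both limits while copying is sketched below, assuming the source is any binary file-like object (in FastAPI that would typically be `upload.file`). The constant names, `UploadTooLarge`, `check_file_count`, and `copy_with_limit` are illustrative, not existing code.

```python
from pathlib import Path
from typing import BinaryIO

MAX_FILES = 50                      # per request, configurable
MAX_FILE_BYTES = 10 * 1024 * 1024   # 10 MB per file, configurable
CHUNK_BYTES = 1024 * 1024


class UploadTooLarge(ValueError):
    """Raised when a single upload exceeds the per-file size limit."""


def check_file_count(count: int, max_files: int = MAX_FILES) -> None:
    if count > max_files:
        raise ValueError(f"too many files: {count} > {max_files}")


def copy_with_limit(src: BinaryIO, dest: Path, max_bytes: int = MAX_FILE_BYTES) -> int:
    """Copy src to dest in chunks; abort and delete dest once max_bytes is exceeded."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    written = 0
    try:
        with dest.open("wb") as out:
            while chunk := src.read(CHUNK_BYTES):
                written += len(chunk)
                if written > max_bytes:
                    raise UploadTooLarge(f"file exceeds {max_bytes} bytes")
                out.write(chunk)
    except UploadTooLarge:
        # Do not leave a truncated file behind.
        dest.unlink(missing_ok=True)
        raise
    return written
```

Counting bytes during the copy avoids trusting a client-declared size and stops reading as soon as the limit is crossed.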

### 5. README
- Add `curl -F` example
- Add Python `requests` multipart example
- Explain when to use `/ingest` (URLs) vs `/ingest/with-images` (binary)
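
A possible starting point for the README's Python example is shown below. It assumes the third-party `requests` library on the parser side; `build_ingest_request`, `send_ingest`, and the 60-second timeout are illustrative choices, and the response is assumed to be JSON.

```python
import json
import mimetypes
from pathlib import Path


def build_ingest_request(metadata: dict, image_paths: list) -> tuple:
    """Return (data, files) in the shape requests.post accepts for multipart uploads."""
    data = {"metadata": json.dumps(metadata)}
    files = []
    for p in image_paths:
        path = Path(p)
        content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        # Repeating the "images" field name sends all files as one list.
        files.append(("images", (path.name, path.read_bytes(), content_type)))
    return data, files


def send_ingest(base_url: str, metadata: dict, image_paths: list) -> dict:
    import requests  # third-party; imported here so the builder works without it

    data, files = build_ingest_request(metadata, image_paths)
    resp = requests.post(
        f"{base_url}/api/v1/ingest/with-images", data=data, files=files, timeout=60
    )
    resp.raise_for_status()
    return resp.json()
```

Reading each file fully into memory is acceptable for a parser handling a few images under the 10 MB limit; passing open file objects instead of bytes would avoid that.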

## Files to modify
1. `src/vmk_data_collector/api/v1/router_properties.py`
2. `src/vmk_data_collector/services/property_pipeline.py`
3. `src/vmk_data_collector/services/image_downloader.py`
4. `README.md`

## Why inline pipeline instead of queue?
- Parser already spent resources downloading images; we should not leave them in temp for an unknown queue delay.
- Immediate feedback: parser gets `property_id`, `snapshot_id`, validation result right away.
- Simpler state management: temp files never sit waiting on a queue; only a failed run leaves its temp dir behind.

## Why not base64 in JSON?
- Base64 adds about 33% size overhead, which means huge JSON payloads that are harder to debug and more likely to hit timeouts. Multipart is the standard mechanism for file uploads.
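
The 33% figure follows from Base64 encoding every 3 input bytes as 4 output characters. A quick check, using zero bytes as a stand-in for a 3 MB image:

```python
import base64

raw = bytes(3_000_000)  # stand-in for a 3 MB image
encoded = base64.b64encode(raw)
overhead = len(encoded) / len(raw) - 1
print(f"{len(raw)} -> {len(encoded)} bytes (+{overhead:.0%})")
# prints: 3000000 -> 4000000 bytes (+33%)
```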
