feat: add pgvector semantic search
- Add pgvector dependency and Alembic migration (vector extension, embedding
  column, HNSW index with cosine ops)
- Add nomic-embed-text embedding model to config
- Add OllamaClient.embed() method for /api/embed endpoint
- Add embedding generation stage to PropertyPipeline (_stage_embed)
- Add PropertyRepository.update_embedding() and search_similar() with
  cosine distance + optional filters (deal_type, city, price range)
- Add POST /api/v1/search/similar endpoint with query embedding + filters
- Add SimilarSearchRequest/Response schemas
- Add backfill script for existing listings
- Update docker-compose.yml to pgvector/pgvector:pg16 image
- Update .env to use Docker PostgreSQL on port 5433
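
The search_similar() change described above (cosine distance plus optional filters) can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: the table name `property_listings`, the column names `deal_type`, `city`, `price`, and the helper names are assumptions made for the example. The only pgvector-specific parts are the `<=>` cosine distance operator and the `[x,y,...]` text form of a vector. Parameters use the `:name` style accepted by SQLAlchemy's `text()`.

```python
from typing import Any, Optional, Sequence


def to_vector_literal(embedding: Sequence[float]) -> str:
    """Format floats in pgvector's text input form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_similar_query(
    embedding: Sequence[float],
    deal_type: Optional[str] = None,
    city: Optional[str] = None,
    price_min: Optional[float] = None,
    price_max: Optional[float] = None,
    limit: int = 10,
) -> tuple[str, dict[str, Any]]:
    """Return (sql, params) for a cosine-distance nearest-neighbour search.

    Table and column names are hypothetical; adjust to the real model.
    """
    params: dict[str, Any] = {"q": to_vector_literal(embedding), "limit": limit}
    # Rows that have not been backfilled yet have no embedding to compare.
    where = ["embedding IS NOT NULL"]

    # Each filter is added only when the caller supplied a value.
    if deal_type is not None:
        where.append("deal_type = :deal_type")
        params["deal_type"] = deal_type
    if city is not None:
        where.append("city = :city")
        params["city"] = city
    if price_min is not None:
        where.append("price >= :price_min")
        params["price_min"] = price_min
    if price_max is not None:
        where.append("price <= :price_max")
        params["price_max"] = price_max

    # `<=>` is pgvector's cosine distance operator. Ordering directly by the
    # distance expression with a LIMIT is the query shape an HNSW index built
    # with vector_cosine_ops is designed to serve.
    sql = (
        "SELECT id, embedding <=> CAST(:q AS vector) AS distance "
        "FROM property_listings "
        f"WHERE {' AND '.join(where)} "
        "ORDER BY embedding <=> CAST(:q AS vector) "
        "LIMIT :limit"
    )
    return sql, params
```

Using `CAST(:q AS vector)` instead of `:q::vector` avoids a clash between the PostgreSQL cast syntax and colon-prefixed bind parameters. Filter values never enter the SQL string itself, only the fixed predicate text does.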

Co-Authored-By: Claude <noreply@anthropic.com>
Commit ac1975fdf988e7b65cc398edcf302c9b1e351a12 (1 parent: 1936c70)
Eugene Sukhodolskiy authored 1 day ago
Showing 12 changed files
alembic/versions/2a9410d9738e_add_pgvector_embedding_column_and_hnsw_.py 0 → 100644
docker-compose.yml
pyproject.toml
scripts/backfill_embeddings.py 0 → 100644
src/vmk_data_collector/api/v1/router_properties.py
src/vmk_data_collector/core/config.py
src/vmk_data_collector/db/repositories/property.py
src/vmk_data_collector/models/property_listing.py
src/vmk_data_collector/schemas/search.py 0 → 100644
src/vmk_data_collector/services/ollama_client.py
src/vmk_data_collector/services/pipeline_factory.py
src/vmk_data_collector/services/property_pipeline.py
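
For the OllamaClient.embed() change in src/vmk_data_collector/services/ollama_client.py, the request and response shape of Ollama's /api/embed endpoint can be sketched as below. This is an illustration using only the standard library, not the client code from this commit: the function names, the default base URL `http://localhost:11434` (Ollama's default port), and the timeout are assumptions. The endpoint takes a JSON body with `model` and `input` and returns an `embeddings` list with one vector per input.

```python
import json
import urllib.request
from typing import List, Sequence


def build_embed_request(
    texts: Sequence[str],
    model: str = "nomic-embed-text",
    base_url: str = "http://localhost:11434",  # assumed default Ollama address
) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/embed endpoint."""
    payload = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embed_response(body: bytes) -> List[List[float]]:
    """Extract the list of embedding vectors from the JSON response body."""
    data = json.loads(body)
    return data["embeddings"]


def embed(texts: Sequence[str], timeout: float = 30.0) -> List[List[float]]:
    """Send the texts to a running Ollama server and return their embeddings."""
    request = build_embed_request(texts)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_embed_response(response.read())
```

Splitting request construction and response parsing from the network call keeps both halves checkable without a running Ollama server; only `embed()` needs one.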