Apartment hunting is a nightmare. This makes it slightly less of one.
Sherlock Homes ranks listings against your criteria using NLP, geospatial signals, and OpenAI Vision. Because staring at 47 identical "sun-drenched" listings should not be a full-time job. Currently configured for NYC rentals (previously SF purchases).
- Deployment: Fly app sherlock-homes-nyc (https://sherlock-homes-nyc.fly.dev)
- Active ingestion sources: zillow, streeteasy
- Active criteria file in production: config/nyc_rental_criteria.yaml
- StreetEasy low-count incident is resolved (runbook + verification in docs/OPERATIONS_FLY.md)
Zillow tells you what exists. It does not tell you what is good. The north-facing "garden unit" with the $15k/year HOA and a fire station next door? Zillow will show it. We will not.
Sherlock Homes reviews 200+ signals per listing so you do not learn the obvious after a 40-minute drive.
# Start API
./run_local.sh
# Start frontend (separate terminal)
./run_frontend.sh

- API: http://localhost:8000
- Frontend: http://localhost:5173
Python 3.11/3.12 recommended. If uv is installed, ./run_local.sh will use it automatically.
Local data (SQLite DB, JSON exports) is kept under .local/ by default to avoid repo-root file sprawl. Existing legacy DBs at ./sherlock.db or ./homehog.db are still detected and used automatically.
NLP Scoring: Reads descriptions like a suspicious buyer. Extracts 32+ keywords across categories: natural light, views, outdoor space, high ceilings, parking. Flags the bad stuff too. "Cozy" usually means small.
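A minimal sketch of category-based keyword extraction, to show the general idea. The categories come from the description above, but the specific phrases, the red-flag list, and the function name `extract_signals` are illustrative assumptions, not the contents of app/services/nlp.py.

```python
import re

# Hypothetical keyword map; the real lists are not shown in this README.
POSITIVE = {
    "natural_light": ["sun-drenched", "natural light", "south-facing"],
    "views": ["skyline view", "river view"],
    "outdoor_space": ["balcony", "terrace", "roof deck"],
    "high_ceilings": ["high ceilings", "soaring ceilings"],
    "parking": ["parking", "garage"],
}
# Hypothetical red flags.
NEGATIVE = {
    "small": ["cozy"],
    "dark": ["garden unit", "basement"],
}


def _scan(text, keyword_map):
    """Return {category: [matched phrases]} for whole-phrase matches."""
    hits = {}
    for category, phrases in keyword_map.items():
        found = [
            phrase
            for phrase in phrases
            if re.search(r"\b" + re.escape(phrase) + r"\b", text)
        ]
        if found:
            hits[category] = found
    return hits


def extract_signals(description):
    """Split matched phrases into positive and negative signals."""
    text = description.lower()
    return {"positive": _scan(text, POSITIVE), "negative": _scan(text, NEGATIVE)}
```

Calling `extract_signals("Cozy sun-drenched studio with a balcony.")` would report the light and outdoor hits as positives and "cozy" as a negative.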
Visual Scoring: OpenAI Vision looks at listing photos. Rates modernity, condition, brightness, staging, cleanliness. Catches water stains, tired fixtures, and the telltale signs of a flipper who watched too much HGTV.
Tranquility Score: How close is this place to things that make noise? Freeways, busy streets, fire stations. No API calls. Just local SF data and geometry. Some of us have meetings.
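A rough sketch of the "local data and geometry" idea: measure the distance to the nearest known noise source and turn it into a score. The 300 m quiet radius, the linear falloff, and the function names are assumptions for illustration; the actual calculation in app/services/geospatial.py may differ.

```python
import math

EARTH_RADIUS_M = 6_371_000


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def tranquility_score(lat, lon, noise_sources, quiet_radius_m=300.0):
    """0 means on top of a noise source; 100 means at least
    quiet_radius_m away from every source (assumed linear falloff)."""
    if not noise_sources:
        return 100.0
    nearest = min(
        haversine_m(lat, lon, src_lat, src_lon) for src_lat, src_lon in noise_sources
    )
    return round(100.0 * min(nearest / quiet_radius_m, 1.0), 1)
```

`noise_sources` here is just a list of `(lat, lon)` tuples loaded from whatever local dataset is available, so no network call is needed at scoring time.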
Light Potential: Estimates how much natural light you will actually get. Top floor, corner unit, south-facing equals good. North-facing basement equals lamps.
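One way to picture this heuristic is as a small additive score. The point values, the `Unit` fields, and the 0 to 100 clamp below are made up for illustration and are not taken from the scoring engine.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    floor: int          # 0 or below is treated as a basement (assumption)
    total_floors: int
    is_corner: bool
    facing: str         # "N", "S", "E", or "W"


# Hypothetical weights: south is best, north is worst.
ORIENTATION_POINTS = {"S": 40, "E": 25, "W": 25, "N": 5}


def light_potential(unit):
    """Return an illustrative 0-100 light score for a unit."""
    score = ORIENTATION_POINTS.get(unit.facing.upper(), 15)
    if unit.floor >= unit.total_floors:
        score += 30  # top floor
    elif unit.floor <= 0:
        score -= 20  # basement
    if unit.is_corner:
        score += 20
    return max(0, min(100, score))
```

Under these assumed weights, a south-facing top-floor corner unit scores 90 and a north-facing basement clamps to 0.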
Why This Matched: Every match includes explicit reasons (budget fit, neighborhood focus, recency, light, quiet) plus one tradeoff. No black-box scores.
Change Tracking: Detects meaningful listing changes like price drops, status flips, and photo updates so you do not miss the quiet gems.
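Change detection of this kind can be sketched as a diff between two snapshots of the same listing. The field names (`price`, `status`, `photo_urls`) and the change labels are assumptions; the real models under app/models/ may use different names.

```python
# Hypothetical set of fields worth tracking between scrapes.
TRACKED_FIELDS = ("price", "status", "photo_urls")


def detect_changes(old, new):
    """Compare two listing snapshots (dicts) and describe what changed."""
    changes = []
    for field in TRACKED_FIELDS:
        before, after = old.get(field), new.get(field)
        if before == after:
            continue
        kind = f"{field}_changed"
        # Give price moves a direction so drops are easy to filter on.
        if field == "price" and before is not None and after is not None:
            kind = "price_drop" if after < before else "price_increase"
        changes.append({"field": field, "kind": kind, "old": before, "new": after})
    return changes
```

Fields outside `TRACKED_FIELDS` are ignored, which keeps cosmetic edits from showing up as changes.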
- Light Chaser: For people who need sunlight to function.
- Urban Professional: Walkability uber alles.
- Deal Hunter: Watches for price drops like a hawk.
- Ingestion: Scrapes Zillow + StreetEasy via ZenRows on a recurring scheduler (and on-demand via admin endpoint).
- Enrichment: NLP, geospatial, and visual scoring per listing.
- Matching: Weighted scoring against your preferences with soft and hard caps.
- Ranking: Top matches, with explanations of why.
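The matching and ranking steps above can be sketched as a weighted average with a budget factor. This reads "soft cap" as the price where a penalty starts and "hard cap" as the price where a listing is dropped; that interpretation, the linear slide between the two, and all names below are assumptions rather than the behavior of app/services/advanced_matching.py.

```python
def budget_factor(price, soft_cap, hard_cap):
    """1.0 at or under the soft cap, sliding linearly to 0.0 at the
    hard cap; None means the listing is over the hard cap."""
    if price > hard_cap:
        return None
    if price <= soft_cap:
        return 1.0
    return (hard_cap - price) / (hard_cap - soft_cap)


def match_score(listing, weights, soft_cap, hard_cap):
    """Weighted average of per-signal scores (0-100), scaled by budget fit."""
    factor = budget_factor(listing["price"], soft_cap, hard_cap)
    if factor is None:
        return None
    total_weight = sum(weights.values())
    weighted = sum(
        weight * listing["scores"].get(name, 0.0) for name, weight in weights.items()
    )
    return round(factor * weighted / total_weight, 1)


def rank(listings, weights, soft_cap, hard_cap):
    """Return (score, listing id) pairs, best first, skipping disqualified ones."""
    scored = []
    for listing in listings:
        score = match_score(listing, weights, soft_cap, hard_cap)
        if score is not None:
            scored.append((score, listing["id"]))
    return sorted(scored, reverse=True)
```

A profile like Light Chaser would then just be a `weights` dict that leans on light, for example `{"light": 2, "quiet": 1}`.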
home-hog/
├── app/ # FastAPI backend
│ ├── models/ # SQLAlchemy models
│ ├── services/
│ │ ├── nlp.py # Keyword extraction
│ │ ├── advanced_matching.py # Scoring engine
│ │ ├── geospatial.py # Tranquility calculations
│ │ └── visual_scoring.py # OpenAI Vision
│ └── routes/ # API endpoints
├── frontend/ # Vite + React app
├── scripts/ # Data tools
├── run_local.sh # Start API
├── run_frontend.sh # Start frontend
└── nuke_db.sh # Reset database
| Endpoint | What it does |
|---|---|
| GET /matches/test-user | Your ranked matches |
| GET /listings | All listings, paginated |
| GET /listings/{id} | Single listing |
| GET /listings/{id}/history | Change history for a listing |
| GET /changes | Recent listing changes |
| POST /admin/ingestion/run | Force a data refresh |
| GET /ingestion/status | Ingestion status |
| GET /ping | Health check |
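For poking at the GET endpoints above from a script, a small standard-library client is enough. This is a sketch that assumes the API is running locally on port 8000 and returns JSON; the helper names are hypothetical, and response shapes are not documented here.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # local API address from the quick start


def endpoint_url(path, base=BASE_URL):
    """Join the base URL and an endpoint path without doubling slashes."""
    return f"{base.rstrip('/')}/{path.lstrip('/')}"


def get_json(path):
    """GET an endpoint and decode the body as JSON (assumes a JSON response)."""
    with urlopen(endpoint_url(path), timeout=10) as response:
        return json.load(response)
```

With the API running, `get_json("/ping")` is a quick health check and `get_json("/matches/test-user")` fetches the ranked matches.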
Burn it down and start over:
./nuke_db.sh && ./run_local.sh
python scripts/import_from_json.py
Run visual analysis:
python -m app.scripts.analyze_visual_scores
Production app: https://sherlock-homes-nyc.fly.dev
Use the canonical runbook, docs/OPERATIONS_FLY.md, for deploy, ingestion operations, validation, and rollback.
Create .env.local:
DATABASE_URL=sqlite:///./.local/sherlock.db
ZENROWS_API_KEY=your_key
OPENAI_API_KEY=your_key
# Optional: fallback for text intelligence when OpenAI is rate-limited/unset
DEEPINFRA_API_KEY=your_key
# NYC rental scoring profile (current default for this project)
BUYER_CRITERIA_PATH=config/nyc_rental_criteria.yaml
# Optional StreetEasy runtime guardrails (defaults are safe)
STREETEASY_REQUEST_TIMEOUT_SECONDS=45
STREETEASY_REQUEST_RETRIES=1
STREETEASY_MAX_DETAIL_CALLS=80
Optional alerts (iMessage / email / SMS) are documented in docs/DEVELOPMENT.md.
Production operations are documented in docs/OPERATIONS_FLY.md.
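The .env.local settings above are plain KEY=value lines. As a minimal sketch of how they could be read, the snippet below parses that format and collects the StreetEasy guardrails into one object. The fallback values simply reuse the numbers shown in the example file (an assumption, since the real defaults are not stated), and `parse_env_file` and `StreetEasyLimits` are hypothetical names.

```python
from dataclasses import dataclass


def parse_env_file(text):
    """Parse KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


@dataclass(frozen=True)
class StreetEasyLimits:
    timeout_seconds: int
    retries: int
    max_detail_calls: int


def streeteasy_limits(env):
    """Build limits from parsed env values, falling back to assumed defaults."""
    return StreetEasyLimits(
        timeout_seconds=int(env.get("STREETEASY_REQUEST_TIMEOUT_SECONDS", 45)),
        retries=int(env.get("STREETEASY_REQUEST_RETRIES", 1)),
        max_detail_calls=int(env.get("STREETEASY_MAX_DETAIL_CALLS", 80)),
    )
```

Since the backend already lists Pydantic in its stack, the application may well handle this through its own settings code instead; the sketch only illustrates the file format.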
The current system intentionally prefers "a little bit or more" outdoor access without being brittle:
- Baseline signals: nlp_signals.positive.outdoor
- Stronger boosts for meaningful private space: outdoor_private, outdoor_premium
- Soft penalties for weak/noisy signals: nlp_signals.negative.weak_outdoor
Edit these in config/nyc_rental_criteria.yaml to calibrate strictness without hard-disqualifying viable listings.
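To illustrate the "soft, not brittle" idea, here is a sketch of how those outdoor signals might adjust a score without disqualifying anything. The signal names come from the list above; the numeric weights and the function name are invented for the example and do not reflect the values in config/nyc_rental_criteria.yaml.

```python
# Hypothetical weights; tune the real ones in the criteria file.
OUTDOOR_WEIGHTS = {
    "outdoor": 5.0,           # baseline: any outdoor mention
    "outdoor_private": 10.0,  # stronger boost: private space
    "outdoor_premium": 15.0,  # stronger boost: premium space
    "weak_outdoor": -3.0,     # soft penalty: weak or noisy mention
}


def outdoor_adjustment(signals):
    """Sum the weights of the outdoor signals present on a listing.

    The result only nudges the score up or down; no signal here
    removes a listing from consideration.
    """
    return sum(
        weight for name, weight in OUTDOOR_WEIGHTS.items() if name in signals
    )
```

A listing with both `outdoor` and `outdoor_private` would gain 15 points under these assumed weights, while one with only `weak_outdoor` would lose 3 and still stay in the running.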
- Backend: FastAPI, SQLAlchemy, Pydantic
- Frontend: Vite, React 18, TypeScript, React Query
- Database: SQLite local, PostgreSQL in Docker
- Sources: Zillow (ZenRows), StreetEasy (ZenRows)
- AI: OpenAI (vision + optional text intelligence), DeepInfra fallback
License: not specified (no LICENSE file in this repo).