ultrathink: NHTSA Full Historical Data Sync System - TDD Implementation

# Ralph Loop Configuration
iteration: 0
max_iterations: 500
completion_promise: NHTSA_FULL_SYNC_COMPLETE

## Objective
Implement a comprehensive NHTSA data fetching system that:
1. Fetches ALL 2.1+ million historical complaints from NHTSA
2. Keeps the database continuously updated with fresh data
3. Is thoroughly tested with both programmatic and browser tests

## Critical Context
- **Current state**: Only ~17,000 complaints in database
- **Target state**: 2,165,676+ complaints (full NHTSA history since 1995)
- **Data source**: NHTSA flat file at https://static.nhtsa.gov/odi/ffdd/cmpl/FLAT_CMPL.zip (~1.5GB)
- **Secondary source**: SODA API at https://data.transportation.gov/resource/jhit-z9cc.json (for incremental syncs)
- **Worktree**: /Users/user/Documents/Muhsinun/Projects/GitHub/CaseRadar/CaseRadar-nhtsa-full-sync
- **Branch**: feature/nhtsa-full-historical-sync

## TDD Approach - MANDATORY
This implementation MUST follow Test-Driven Development:
1. **Red**: Write failing tests FIRST (before any implementation)
2. **Green**: Write minimal code to make tests pass
3. **Refactor**: Clean up code while keeping tests green

### Test Categories Required
1. **Unit Tests** (Vitest) - `npm run test`
- Flat file parser tests
- Incremental sync logic tests
- Data validation tests
- Rate limiting tests
- Error handling tests

2. **Integration Tests** (Vitest)
- Database insertion tests
- Deduplication tests
- Batch processing tests

3. **E2E Tests** (Playwright) - `npm run test:e2e`
- Admin dashboard shows sync status
- Manual sync trigger works
- Complaint count updates after sync
- Sync progress is visible

## Implementation Requirements

### Phase 1: Historical Bulk Import (Flat File)
Create a new system to import ALL historical data from NHTSA's flat file:

1. **Flat File Downloader** (`src/lib/nhtsa/flat-file-downloader.ts`)
- Download https://static.nhtsa.gov/odi/ffdd/cmpl/FLAT_CMPL.zip
- Stream download (don't load full 1.5GB into memory)
- Store in temp directory
- Validate file integrity

2. **Flat File Parser** (`src/lib/nhtsa/flat-file-parser.ts`)
- Parse tab-delimited format
- Stream processing (handle 2.1M+ records)
- Field mapping to our schema
- Data validation and sanitization

3. **Bulk Import Service** (`src/lib/nhtsa/bulk-import.ts`)
- Batch inserts (1000 records at a time)
- Progress tracking
- Resume capability (track last imported record)
- Embedding generation (optional, can be backfilled)

4. **Import API Endpoints** (`src/app/api/nhtsa/import/route.ts`, `src/app/api/nhtsa/import/status/route.ts`)
- POST /api/nhtsa/import - trigger bulk import
- GET /api/nhtsa/import/status - get import progress
- Protected by admin authentication
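
The batching step of the bulk import service can be sketched as a generator that sits between the streaming parser and the database layer, so the full 2.1M records are never materialized at once. This is a minimal sketch with a hypothetical `batchRecords` helper; the actual batch size and insert call depend on the ORM in use.

```typescript
// Hypothetical helper: groups parsed records into fixed-size batches for insertion.
// Accepts any iterable, so it can be fed directly from a streaming parser.
function* batchRecords<T>(records: Iterable<T>, size = 1000): Generator<T[]> {
  let batch: T[] = [];
  for (const record of records) {
    batch.push(record);
    if (batch.length === size) {
      yield batch;
      batch = [];
    }
  }
  if (batch.length > 0) yield batch; // flush the final partial batch
}
```

Because each yielded batch is independent, progress tracking and resume can be layered on by recording the last successfully inserted batch.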

### Phase 2: Incremental Sync (Keep Database Fresh)
Enhance existing sync to catch new complaints:

1. **Enhanced Sync Service** (modify `src/lib/nhtsa/sync.ts`)
- Compare local vs remote count
- Fetch only new records (by date)
- Handle API rate limits gracefully
- Retry logic for failures

2. **Scheduled Sync** (modify `src/app/api/cron/sync-nhtsa/route.ts`)
- Run every 6 hours (existing)
- Track sync history
- Alert on failures
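
The "fetch only new records (by date)" step could build a paged SODA query like the following sketch. The `datea` column name is an assumption here — verify it against the actual jhit-z9cc dataset schema before relying on it.

```typescript
// Sketch of an incremental-sync query builder for the Socrata (SODA) endpoint.
// Assumption: the dataset exposes a `datea` (date added) column.
const SODA_URL = "https://data.transportation.gov/resource/jhit-z9cc.json";

function buildIncrementalQuery(sinceIso: string, limit = 1000, offset = 0): string {
  const params = new URLSearchParams({
    $where: `datea > '${sinceIso}'`, // only records added after the last sync
    $order: "datea",                 // stable ordering so paging doesn't skip rows
    $limit: String(limit),
    $offset: String(offset),
  });
  return `${SODA_URL}?${params.toString()}`;
}
```

Paging with a stable `$order` plus `$offset` lets the sync resume cleanly after a rate-limit backoff.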

### Phase 3: Monitoring & Admin UI
1. **Sync Dashboard Component** (`src/components/admin/sync-dashboard.tsx`)
- Total complaints count
- Last sync time
- Sync history
- Manual trigger button
- Progress bar for bulk imports

2. **Sync Status API** (`src/app/api/nhtsa/sync/status/route.ts`)
- GET endpoint for sync statistics
- Import progress if running
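
A possible response shape for the status endpoint, with a hypothetical `SyncStatus` type — the real fields depend on what the sync service records:

```typescript
// Hypothetical payload for GET /api/nhtsa/sync/status.
interface SyncStatus {
  totalComplaints: number;
  lastSyncAt: string | null; // ISO timestamp of last successful sync
  importRunning: boolean;
  importProgress?: { processed: number; total: number };
}

// Renders progress for the dashboard's progress bar label.
function formatProgress(status: SyncStatus): string {
  if (!status.importRunning || !status.importProgress) return "idle";
  const { processed, total } = status.importProgress;
  const pct = total > 0 ? Math.round((processed / total) * 100) : 0;
  return `${processed}/${total} (${pct}%)`;
}
```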

## Flat File Format Reference
The NHTSA FLAT_CMPL.txt is TAB-delimited with these fields (in order):
1. CMPLID - Unique complaint ID (maps to nhtsaId)
2. ODINO - ODI reference number
3. MFR_NAME - Manufacturer name
4. MAKETXT - Vehicle make
5. MODELTXT - Vehicle model
6. YEARTXT - Model year
7. CRASH - Y/N
8. FAILDATE - Date of failure (YYYYMMDD)
9. FIRE - Y/N
10. INJURED - Number injured
11. DEATHS - Number of deaths
12. COMPDESC - Component description
13. CITY - City
14. STATE - State
15. VIN - Partial VIN
16. DATEA - Date added to NHTSA's database (YYYYMMDD)
17. LDATE - Date of last update (YYYYMMDD)
18. MILES - Vehicle mileage at failure
19. OCCURENCES - Number of occurrences
20. CDESCR - Complaint description (the main text)
... (additional fields for internal NHTSA use; verify the full list against the CMPL.txt readme that ships with the download)
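
Mapping the leading fields above into a typed record might look like this sketch. Only the stable leading columns are mapped; a real parser must also handle the remaining columns, malformed rows, and encoding quirks.

```typescript
// Sketch: parse one tab-delimited FLAT_CMPL line into a typed record.
interface ComplaintRecord {
  nhtsaId: string;
  make: string;
  model: string;
  year: number | null;
  crash: boolean;
  failDate: string;
  fire: boolean;
  injured: number;
  deaths: number;
  component: string;
}

function parseFlatLine(line: string): ComplaintRecord {
  const f = line.split("\t");
  // Defensive numeric parse: blank or garbage fields become 0.
  const num = (s: string | undefined) => {
    const n = Number(s);
    return Number.isFinite(n) ? n : 0;
  };
  return {
    nhtsaId: f[0] ?? "",
    make: f[3] ?? "",
    model: f[4] ?? "",
    year: f[5] && /^\d{4}$/.test(f[5]) ? Number(f[5]) : null,
    crash: f[6] === "Y",
    failDate: f[7] ?? "",
    fire: f[8] === "Y",
    injured: num(f[9]),
    deaths: num(f[10]),
    component: f[11] ?? "",
  };
}
```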

## Success Criteria
- [ ] All unit tests pass (`npm run test`)
- [ ] All E2E tests pass (`npm run test:e2e`)
- [ ] Bulk import completes for full flat file
- [ ] Database contains 2M+ complaints after import
- [ ] Incremental sync works (fetches new complaints)
- [ ] Admin can monitor sync status in UI
- [ ] No memory issues during import (streaming)
- [ ] Import can be resumed if interrupted
- [ ] Rate limiting respects NHTSA API limits

## Iteration Log
Track progress here after each iteration.

### Iteration 0 (Initial)
- Status: Starting TDD implementation
- Next: Write failing tests for flat file parser

## Anti-Patterns to Avoid
1. Writing implementation before tests
2. Loading entire 1.5GB file into memory
3. Ignoring rate limits on NHTSA APIs
4. Not handling duplicates properly
5. Skipping browser/E2E tests
6. Making database schema changes without migrations
7. Not tracking import progress for resume capability
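
As a guard against anti-pattern 4, each batch can be deduplicated on the complaint ID before insert. A minimal sketch — at the database layer this would normally be an upsert / ON CONFLICT clause instead, but dropping in-batch duplicates keeps batches idempotent on resume:

```typescript
// Keep only one record per NHTSA complaint ID; later records win,
// mirroring "last update" semantics when a re-run overlaps prior data.
function dedupeById<T extends { nhtsaId: string }>(records: T[]): T[] {
  const seen = new Map<string, T>();
  for (const r of records) {
    seen.set(r.nhtsaId, r);
  }
  return [...seen.values()];
}
```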

## Commands Reference
```bash
# Navigate to worktree
cd /Users/user/Documents/Muhsinun/Projects/GitHub/CaseRadar/CaseRadar-nhtsa-full-sync

# Install dependencies
npm install

# Run unit tests
npm run test

# Run E2E tests
npm run test:e2e

# Run specific test file
npm run test -- src/lib/nhtsa/__tests__/flat-file-parser.test.ts

# Run tests in watch mode
npm run test -- --watch

# Check TypeScript types
npm run typecheck

# Start dev server (for E2E tests)
npm run dev
```

## File Structure to Create
```
src/lib/nhtsa/
├── __tests__/
│   ├── flat-file-parser.test.ts      # Unit tests for parser
│   ├── flat-file-downloader.test.ts  # Unit tests for downloader
│   ├── bulk-import.test.ts           # Unit tests for import service
│   └── sync.test.ts                  # Enhanced sync tests
├── flat-file-parser.ts               # Parser implementation
├── flat-file-downloader.ts           # Downloader implementation
├── bulk-import.ts                    # Bulk import service
├── client.ts                         # (existing)
├── sync.ts                           # (enhance existing)
└── types.ts                          # (add new types)

src/app/api/nhtsa/
├── import/
│   ├── route.ts                      # POST to trigger import
│   └── status/
│       └── route.ts                  # GET import progress
└── sync/
    └── status/
        └── route.ts                  # GET sync statistics

e2e/
└── sync-dashboard.spec.ts            # E2E tests for admin sync UI

src/components/admin/
└── sync-dashboard.tsx                # Admin sync dashboard component
```

## Completion Conditions
When ALL of the following are true, output `<promise>NHTSA_FULL_SYNC_COMPLETE</promise>`:
1. All unit tests pass (new tests for bulk import + existing tests)
2. All E2E tests pass (including new sync status tests)
3. Bulk import tested with real flat file
4. Incremental sync working
5. Admin UI shows sync status
6. Documentation updated
7. No TypeScript errors
8. Code reviewed and clean

## Browser Testing Requirements
E2E tests MUST verify:
1. Navigate to admin sync dashboard
2. See current complaint count
3. See last sync timestamp
4. Trigger manual sync and see progress
5. See sync complete with updated count
6. No console errors during operations