Skip to content

Latest commit

 

History

History
257 lines (207 loc) · 7.94 KB

File metadata and controls

257 lines (207 loc) · 7.94 KB

Database Index Analysis Report

Generated: 2026-01-23 Migration: 0016_add_foreign_key_and_timestamp_indexes.sql

Executive Summary

This analysis identified 45+ missing indexes across 14 tables that will significantly improve query performance for foreign key JOINs, timestamp-based filtering, and composite queries.

Index Categories

1. Foreign Key Indexes (16 indexes)

Critical for JOIN performance and referential integrity checks.

Table Column Purpose
bookmarks agencyId JOIN with agencies
users selectedCountyId JOIN with counties
users homeCountyId JOIN with counties
agencyMembers invitedBy JOIN with users
protocolVersions publishedBy JOIN with users
searchHistory countyId JOIN with counties

2. Timestamp Indexes (18 indexes)

Optimize date-range queries, sorting, and time-based analytics.

Table Column Query Pattern
feedback updatedAt Recent updates, sorting
protocolChunks lastVerifiedAt Stale protocol detection
queries createdAt Timeline analytics
auditLogs createdAt Audit trail timeline
userAuthProviders expiresAt Token cleanup jobs
agencies updatedAt Recent changes
agencyMembers invitedAt, joinedAt Activity tracking
protocolVersions publishedAt, updatedAt Publication timeline
protocolUploads createdAt Upload history
stripeWebhookEvents createdAt, processedAt Event processing queue

3. Composite Indexes (15 indexes)

Optimize multi-column queries and cover common query patterns.

Table Columns Use Case
feedback userId, updatedAt User feedback timeline
protocolChunks countyId, lastVerifiedAt County data quality
queries countyId, createdAt County query analytics
users tier, subscriptionStatus Billing queries
auditLogs entityType, entityId Entity audit trail
auditLogs userId, action, createdAt User action history
agencies state, county Geographic lookup
agencyMembers agencyId, role Permission checks
protocolVersions agencyId, publishedAt Agency version history
userCounties userId, isPrimary Primary county lookup
searchHistory userId, countyId, createdAt Search history by county
stripeWebhookEvents processed, createdAt Pending events queue
integrationLogs partner, agencyId, createdAt Partner analytics

Performance Impact Analysis

High-Impact Indexes (P0)

Immediate performance gains expected

  1. users.email - Used in login, password reset, user search
  2. queries.createdAt - Used in analytics dashboards
  3. protocolChunks.lastVerifiedAt - Used in data quality monitoring
  4. stripeWebhookEvents (processed, createdAt) - Critical for payment processing queue
  5. auditLogs (entityType, entityId) - Essential for security audit trails

Medium-Impact Indexes (P1)

Improves specific features

  1. feedback.updatedAt - Admin dashboard sorting
  2. searchHistory.countyId - County-specific search analytics
  3. userCounties (userId, isPrimary) - User preference lookups
  4. agencyMembers.role - Permission verification
  5. protocolVersions.publishedAt - Version history displays

Low-Impact Indexes (P2)

Nice-to-have optimizations

  1. agencies.county - Geographic filtering
  2. contactSubmissions.email - Contact search
  3. integrationLogs.responseTimeMs - Performance monitoring

Query Pattern Optimization

Before: Table Scan

-- Finding stale protocols (SLOW)
SELECT * FROM protocolChunks
WHERE lastVerifiedAt < DATE_SUB(NOW(), INTERVAL 90 DAY);

After: Index Scan

-- With idx_protocol_chunks_verified (FAST)
SELECT * FROM protocolChunks
WHERE lastVerifiedAt < DATE_SUB(NOW(), INTERVAL 90 DAY);

Before: Multiple JOINs Without Indexes

-- User query history by county (SLOW)
SELECT q.* FROM queries q
JOIN users u ON q.userId = u.id
WHERE u.selectedCountyId = 123
ORDER BY q.createdAt DESC;

After: Indexed JOINs

-- With idx_users_selected_county + idx_queries_created (FAST)
SELECT q.* FROM queries q
JOIN users u ON q.userId = u.id
WHERE u.selectedCountyId = 123
ORDER BY q.createdAt DESC;

Index Coverage by Table

Table Before After Added
bookmarks 3 4 1
feedback 4 6 2
protocolChunks 4 6 2
queries 2 4 2
users 8 13 5
auditLogs 3 6 3
userAuthProviders 2 4 2
agencies 2 5 3
agencyMembers 2 6 4
protocolVersions 2 5 3
protocolUploads 2 3 1
userCounties 2 4 2
searchHistory 2 5 3
stripeWebhookEvents 3 6 3
integrationLogs 3 5 2
counties 2 4 2
contactSubmissions 2 3 1

Total: 48 existing + 45 new = 93 indexes

Storage Impact Estimation

Index Size Calculations

  • Foreign key indexes: ~5-10% of table size
  • Timestamp indexes: ~3-5% of table size
  • Composite indexes: ~10-15% of table size

Estimated Overhead

  • Small tables (<10K rows): <1 MB per index
  • Medium tables (10K-100K rows): 1-10 MB per index
  • Large tables (>100K rows): 10-100 MB per index

Total Estimated Impact

  • protocolChunks (largest table): +50-100 MB
  • users: +10-20 MB
  • queries: +5-10 MB
  • Other tables: +20-30 MB

Total additional storage: ~100-200 MB

Migration Safety

Risk Level: LOW

  • Uses CREATE INDEX IF NOT EXISTS (safe for re-runs)
  • No data modifications
  • No schema changes
  • Can be run on live database with minimal locking

Recommended Execution

# Development/Staging
mysql -u user -p database < migrations/0016_add_foreign_key_and_timestamp_indexes.sql

# Production (monitor lock duration)
pt-online-schema-change --alter "ADD INDEX idx_name (column)" \
  --execute h=host,D=database,t=table

# Or use gh-ost for zero-downtime
gh-ost --database=db --table=table --alter="ADD INDEX idx_name (column)" \
  --execute --allow-on-master

Monitoring Plan

Pre-Migration Metrics

-- Capture baseline query times
SELECT table_name, avg_timer_wait/1000000000 as avg_ms
FROM performance_schema.table_io_waits_summary_by_table
WHERE table_schema = DATABASE()
ORDER BY avg_timer_wait DESC
LIMIT 20;

Post-Migration Verification

-- Verify indexes were created
SHOW INDEXES FROM bookmarks WHERE Key_name LIKE 'idx_%';

-- Check index usage
SELECT * FROM sys.schema_unused_indexes
WHERE object_schema = DATABASE();

-- Monitor query performance improvement
SELECT digest_text, avg_timer_wait/1000000000 as avg_ms
FROM performance_schema.events_statements_summary_by_digest
WHERE schema_name = DATABASE()
ORDER BY avg_timer_wait DESC
LIMIT 20;

Maintenance Recommendations

Regular Index Health Checks

-- Check for duplicate indexes
SELECT table_name, GROUP_CONCAT(index_name) as duplicate_indexes
FROM information_schema.statistics
WHERE table_schema = DATABASE()
GROUP BY table_name, column_name, seq_in_index
HAVING COUNT(*) > 1;

-- Check index cardinality
SELECT table_name, index_name, cardinality
FROM information_schema.statistics
WHERE table_schema = DATABASE()
  AND cardinality < 100
ORDER BY cardinality;

Quarterly Review

  1. Identify unused indexes (drop candidates)
  2. Review slow query log for missing indexes
  3. Analyze index fragmentation
  4. Update statistics for query optimizer

Next Steps

  1. Immediate: Apply migration to staging environment
  2. Testing: Run load tests to verify performance gains
  3. Monitoring: Track query execution times for 1 week
  4. Production: Apply during low-traffic window
  5. Follow-up: Review slow query log after 2 weeks

References

  • Migration File: /drizzle/migrations/0016_add_foreign_key_and_timestamp_indexes.sql
  • Schema File: /drizzle/schema.ts
  • Query Patterns: /server/db.ts

Reviewed By: Database Architecture Agent Status: Ready for Staging Deployment