PDF Conversation History Implementation Checklist

✅ Completed Tasks

Phase 1: Database Migration

  • Create migration script: supabase/migrations/0006_create_conversation_tables.sql
  • Extend user_pdfs table with new columns
  • Create pdf_conversations table
  • Create conversation_messages table
  • Create indexes for performance
  • Enable RLS policies
  • Create helper trigger function

Phase 2: Utility Functions

  • PDF saving utilities: src/lib/pdf/save-pdf-info.ts

    • savePDFInfo() - Save/update PDF metadata
    • createOrGetConversation() - Create/get conversation
    • updatePDFParseStatus() - Update parse status
  • Conversation saving utilities: src/lib/chat/save-conversation.ts

    • saveConversationMessage() - Save message
    • saveConversationExchange() - Save Q&A pair
    • getConversationStats() - Get stats
    • getConversationTokenCount() - Get token count
    • deleteConversationMessages() - Delete messages
  • PDF retrieval utilities: src/lib/pdf/get-pdf-list.ts

    • getPDFList() - Get user's PDFs with stats
    • getPDFWithStats() - Get single PDF
    • userOwnsPDF() - Verify ownership
    • getUserPDFCount() - Get count
  • Conversation retrieval utilities: src/lib/chat/get-conversation-history.ts

    • getConversationHistory() - Get messages
    • getRecentMessages() - Get recent messages
    • getConversationMessageCount() - Get count
    • searchConversationMessages() - Search messages
    • getConversationStats() - Get detailed stats
  • PDF deletion utilities: src/lib/pdf/delete-pdf.ts

    • deletePDF() - Delete PDF and cascade
    • deleteAllUserPDFs() - Delete all user PDFs
    • softDeletePDF() - Soft delete

Phase 3: API Endpoints

  • GET /api/pdfs/list - List user's PDFs
  • GET /api/pdfs/{id}/conversations - Get conversation history
  • DELETE /api/pdfs/{id} - Delete PDF

Phase 4: API Integration

  • Update upload API to save PDF info and create conversation
  • Update chat API to prepare for conversation saving

🔄 In Progress / Next Steps

Phase 5: Complete Chat API Integration

  • Collect full streamed response before saving
  • Save conversation messages after chat completion
  • Update conversation stats automatically
  • Add error handling for conversation saving
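The first, second, and fourth items above can be combined in a single wrapper around the response stream. The following is a minimal sketch, assuming the chat route exposes the model output as an `AsyncIterable<string>`; `collectWhileStreaming` and `onComplete` are hypothetical names, and `onComplete` stands in for a call such as `saveConversationExchange()`.

```typescript
// Hypothetical helper, not part of the existing codebase.
// Forwards each chunk to the caller while accumulating the full response,
// then hands the complete text to `onComplete` once the stream has ended.
async function* collectWhileStreaming(
  source: AsyncIterable<string>,
  onComplete: (fullText: string) => Promise<void>,
): AsyncGenerator<string, void, undefined> {
  let fullText = '';
  for await (const chunk of source) {
    fullText += chunk;
    yield chunk; // the client still receives chunks as they arrive
  }
  try {
    await onComplete(fullText);
  } catch (err) {
    // A failed save is logged but does not fail the chat response.
    console.error('Failed to save conversation:', err);
  }
}
```

If the consumer stops reading early (for example, the client disconnects), the code after the loop does not run, so a partial response is not saved. Whether partial responses should be stored is a design decision this sketch leaves open.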

Phase 6: Frontend Components

  • Create PDF list component
  • Create conversation history viewer
  • Add delete confirmation dialog
  • Integrate with main page
  • Add loading states
  • Add error handling

Phase 7: Testing

  • Unit tests for utility functions
  • Integration tests for API endpoints
  • E2E tests for complete workflows
  • Test permission checks
  • Test cascade deletion
  • Test pagination
  • Test sorting

Phase 8: Deployment & Monitoring

  • Run database migration
  • Deploy API routes
  • Deploy frontend components
  • Monitor performance
  • Monitor error rates
  • Gather user feedback

📋 How to Use

1. Run Database Migration

# Using Supabase CLI
supabase migration up

# Or manually in the Supabase dashboard:
# copy the contents of supabase/migrations/0006_create_conversation_tables.sql
# and run them in the SQL editor

2. Save PDF Info (in upload API)

import { savePDFInfo, createOrGetConversation } from '@/lib/pdf/save-pdf-info';

// Save PDF metadata
await savePDFInfo({
  pdfId: 'uuid',
  userId: 'user-id',
  filename: 'document.pdf',
  fileSize: 1024000,
  storagePath: '/tmp/pdf-chat/...',
  parseStatus: 'pending',
});

// Create conversation record
await createOrGetConversation({
  pdfId: 'uuid',
  userId: 'user-id',
});

3. Save Conversation Messages (in chat API)

import { saveConversationExchange } from '@/lib/chat/save-conversation';

// Save user question and assistant response
await saveConversationExchange(
  conversationId,
  pdfId,
  userId,
  userQuestion,
  assistantResponse,
  tokenCount,
  processingTime
);
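The `processingTime` argument has to be measured somewhere in the chat route. A minimal sketch of one way to do that, assuming the value is expressed in milliseconds (the source does not state the unit); `timed` and `generateAnswer` are hypothetical names.

```typescript
// Hypothetical helper: runs an async task and reports how long it took.
// Assumption: processing time is recorded in milliseconds.
async function timed<T>(
  task: () => Promise<T>,
): Promise<{ result: T; elapsedMs: number }> {
  const start = Date.now();
  const result = await task();
  return { result, elapsedMs: Date.now() - start };
}

// Possible usage inside the chat route (generateAnswer is a placeholder):
//
//   const { result: assistantResponse, elapsedMs } =
//     await timed(() => generateAnswer(userQuestion));
//   await saveConversationExchange(
//     conversationId, pdfId, userId,
//     userQuestion, assistantResponse, tokenCount, elapsedMs,
//   );
```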

4. Get PDF List (frontend)

const response = await fetch('/api/pdfs/list?limit=50&offset=0&sortBy=uploadedAt&sortOrder=desc');
if (!response.ok) throw new Error(`Failed to list PDFs: HTTP ${response.status}`);
const { data } = await response.json();
console.log(data.pdfs); // Array of PDFs with stats
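Because the list endpoint takes `limit` and `offset`, a client that needs every PDF has to walk the pages. A minimal sketch, assuming the last page contains fewer than `limit` items (the source does not say whether the response includes a total count); `fetchAllPages`, `fetchPage`, and `maxPages` are hypothetical names.

```typescript
// Hypothetical helper: walks an offset-paginated endpoint until a short page
// is returned. `fetchPage` is injected so the loop works with fetch() or a
// test double. `maxPages` is a safety cap against runaway loops.
async function fetchAllPages<T>(
  fetchPage: (limit: number, offset: number) => Promise<T[]>,
  limit = 50,
  maxPages = 20,
): Promise<T[]> {
  const all: T[] = [];
  for (let page = 0; page < maxPages; page++) {
    const items = await fetchPage(limit, page * limit);
    all.push(...items);
    if (items.length < limit) break; // a short page means nothing is left
  }
  return all;
}

// Possible usage against the list endpoint:
//
//   const pdfs = await fetchAllPages(async (limit, offset) => {
//     const res = await fetch(`/api/pdfs/list?limit=${limit}&offset=${offset}`);
//     if (!res.ok) throw new Error(`HTTP ${res.status}`);
//     const { data } = await res.json();
//     return data.pdfs;
//   });
```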

5. Get Conversation History (frontend)

const response = await fetch(`/api/pdfs/${pdfId}/conversations?limit=100&offset=0`);
if (!response.ok) throw new Error(`Failed to load conversation history: HTTP ${response.status}`);
const { data } = await response.json();
console.log(data.messages); // Array of messages

6. Delete PDF (frontend)

const response = await fetch(`/api/pdfs/${pdfId}`, { method: 'DELETE' });
if (!response.ok) throw new Error(`Failed to delete PDF: HTTP ${response.status}`);
const { data } = await response.json();
console.log(data.messagesDeleted); // Number of messages deleted

🔍 Testing Queries

Test PDF List Endpoint

curl -X GET 'http://localhost:3000/api/pdfs/list?limit=10&offset=0' \
  -H 'Authorization: Bearer YOUR_TOKEN'

Test Conversation History Endpoint

curl -X GET 'http://localhost:3000/api/pdfs/{PDF_ID}/conversations?limit=50' \
  -H 'Authorization: Bearer YOUR_TOKEN'

Test Delete Endpoint

curl -X DELETE 'http://localhost:3000/api/pdfs/{PDF_ID}' \
  -H 'Authorization: Bearer YOUR_TOKEN'

📊 Database Queries

Get all PDFs for a user with conversation stats

SELECT 
  p.id,
  p.filename,
  p.file_size,
  p.page_count,
  p.parse_status,
  p.created_at,
  COALESCE(c.message_count, 0) as message_count,
  c.last_message_at
FROM user_pdfs p
LEFT JOIN pdf_conversations c ON p.id = c.pdf_id
WHERE p.user_id = 'user-id'
ORDER BY p.created_at DESC;

Get conversation messages for a PDF

SELECT 
  id,
  role,
  content,
  created_at,
  tokens,
  processing_time
FROM conversation_messages
WHERE pdf_id = 'pdf-id' AND user_id = 'user-id'
ORDER BY created_at ASC;

Get conversation statistics

SELECT 
  COUNT(*) as total_messages,
  SUM(CASE WHEN role = 'user' THEN 1 ELSE 0 END) as user_messages,
  SUM(CASE WHEN role = 'assistant' THEN 1 ELSE 0 END) as assistant_messages,
  SUM(COALESCE(tokens, 0)) as total_tokens,
  -- AVG skips NULLs, so rows without a processing time do not lower the average
  COALESCE(AVG(processing_time), 0) as avg_processing_time,
  MIN(created_at) as first_message_at,
  MAX(created_at) as last_message_at
FROM conversation_messages
WHERE pdf_id = 'pdf-id' AND user_id = 'user-id';
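The same aggregation can be reproduced in application code, which is handy for checking the query in a unit test. A minimal sketch, assuming rows carry the column names used in the query above; `MessageRow`, `ConversationSummary`, and `summarizeMessages` are hypothetical names. This sketch averages processing time only over messages that recorded one.

```typescript
interface MessageRow {
  role: 'user' | 'assistant';
  tokens: number | null;
  processing_time: number | null;
  created_at: string; // ISO 8601 timestamp
}

interface ConversationSummary {
  totalMessages: number;
  userMessages: number;
  assistantMessages: number;
  totalTokens: number;
  avgProcessingTime: number;
  firstMessageAt: string | null;
  lastMessageAt: string | null;
}

// Hypothetical in-memory equivalent of the statistics query.
function summarizeMessages(rows: MessageRow[]): ConversationSummary {
  let userMessages = 0;
  let assistantMessages = 0;
  let totalTokens = 0;
  let timeSum = 0;
  let timeCount = 0;
  let first: string | null = null;
  let last: string | null = null;

  for (const row of rows) {
    if (row.role === 'user') userMessages++;
    else if (row.role === 'assistant') assistantMessages++;
    totalTokens += row.tokens ?? 0;
    if (row.processing_time !== null) {
      timeSum += row.processing_time;
      timeCount++;
    }
    const ts = Date.parse(row.created_at);
    if (first === null || ts < Date.parse(first)) first = row.created_at;
    if (last === null || ts > Date.parse(last)) last = row.created_at;
  }

  return {
    totalMessages: rows.length,
    userMessages,
    assistantMessages,
    totalTokens,
    avgProcessingTime: timeCount > 0 ? timeSum / timeCount : 0,
    firstMessageAt: first,
    lastMessageAt: last,
  };
}
```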

🚀 Performance Tips

  1. Use pagination - Always paginate large result sets
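  2. Use indexes - Queries rely on the indexes created by the migration for fast lookups
  3. Denormalization - The message count is stored in pdf_conversations, so listing PDFs does not require counting rows in conversation_messages
  4. Caching - Consider caching the PDF list (for example, with a 5-minute TTL)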
  5. Batch operations - Use batch deletes for multiple PDFs
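The caching tip can be sketched as a small in-memory cache with a fixed time-to-live. This is an illustration under stated assumptions, not the project's implementation: `TtlCache` is a hypothetical class, and an in-memory map only helps when requests reach the same long-lived server process.

```typescript
// Hypothetical in-memory cache with a fixed time-to-live.
// The clock is injectable so expiry can be tested without waiting.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  delete(key: string): void {
    this.entries.delete(key);
  }
}

// Five minutes, as suggested in the tip above.
const PDF_LIST_TTL_MS = 5 * 60 * 1000;
```

The cache key should include the user ID so one user's list is never served to another, and the entry should be deleted whenever that user uploads or deletes a PDF.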

🔒 Security Checklist

  • RLS policies enabled on all tables
  • User ownership verified in all endpoints
  • Input validation on all parameters
  • Proper error responses with HTTP status codes
  • Cascade deletion prevents orphaned records
  • No sensitive data in logs
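As an illustration of the input validation item, the query parameters of the list endpoint could be checked before they reach the database. A minimal sketch under stated assumptions: `parseListParams` is a hypothetical function, the default and maximum `limit` values are invented for the example, and only `uploadedAt` is allowed as a sort field because it is the only one that appears in this document.

```typescript
type SortOrder = 'asc' | 'desc';

interface ListParams {
  limit: number;
  offset: number;
  sortBy: string;
  sortOrder: SortOrder;
}

type ParseResult =
  | { ok: true; value: ListParams }
  | { ok: false; error: string };

// Assumption: extend this set with whichever fields the endpoint supports.
const ALLOWED_SORT_FIELDS = new Set<string>(['uploadedAt']);
const DEFAULT_LIMIT = 50; // assumed default
const MAX_LIMIT = 100; // assumed upper bound

// Accepts only plain non-negative integers within [min, max].
function parseIntParam(
  raw: string | undefined,
  fallback: number,
  min: number,
  max: number,
): number | null {
  if (raw === undefined) return fallback;
  if (!/^\d+$/.test(raw)) return null;
  const n = Number(raw);
  return n >= min && n <= max ? n : null;
}

function parseListParams(query: Record<string, string | undefined>): ParseResult {
  const limit = parseIntParam(query.limit, DEFAULT_LIMIT, 1, MAX_LIMIT);
  if (limit === null) return { ok: false, error: 'limit must be an integer between 1 and 100' };

  const offset = parseIntParam(query.offset, 0, 0, Number.MAX_SAFE_INTEGER);
  if (offset === null) return { ok: false, error: 'offset must be a non-negative integer' };

  const sortBy = query.sortBy ?? 'uploadedAt';
  if (!ALLOWED_SORT_FIELDS.has(sortBy)) return { ok: false, error: 'unsupported sortBy value' };

  const sortOrder = query.sortOrder ?? 'desc';
  if (sortOrder !== 'asc' && sortOrder !== 'desc') {
    return { ok: false, error: 'sortOrder must be asc or desc' };
  }

  return { ok: true, value: { limit, offset, sortBy, sortOrder } };
}
```

Checking `sortBy` against an allowlist keeps arbitrary column names out of the query, and a rejected request can be answered with HTTP 400 and the `error` string.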

📝 Documentation

  • Database schema documented
  • API endpoints documented
  • Utility functions documented
  • Error handling documented
  • Security considerations documented
  • Performance optimizations documented

🎯 Success Criteria

  • Database migration runs without errors
  • All utility functions work correctly
  • All API endpoints return correct responses
  • User ownership is verified
  • Cascade deletion works
  • Pagination works
  • Sorting works
  • Error handling is comprehensive
  • Logging is detailed
  • Documentation is complete

📞 Support

For issues or questions:

  1. Check the console logs
  2. Review the error response from the API
  3. Verify that the database migration ran successfully
  4. Check that RLS policies are enabled
  5. Verify that the user's authentication token is valid