Skip to content

Latest commit

 

History

History
376 lines (304 loc) · 11 KB

File metadata and controls

376 lines (304 loc) · 11 KB

PDF Conversation History Implementation - Phase 1 & 2 Complete

Overview

Implemented the database schema, utility functions, and API endpoints for the PDF conversation history management feature. This enables users to save and retrieve their PDF files and conversation records.

What Was Implemented

Phase 1: Database Migration ✅

File: supabase/migrations/0006_create_conversation_tables.sql

Created three main database components:

  1. Extended user_pdfs table with new columns:

    • page_count - Number of pages in the PDF
    • text_summary - Text summary of the PDF
    • parse_status - Status of PDF parsing (pending, parsing, completed, failed)
    • parsed_at - Timestamp when parsing completed
  2. pdf_conversations table - Tracks conversations per PDF:

    • id - Unique conversation ID
    • pdf_id - Reference to PDF
    • user_id - Reference to user
    • message_count - Denormalized count for performance
    • last_message_at - Timestamp of most recent message
    • Unique constraint on (pdf_id, user_id)
  3. conversation_messages table - Stores individual messages:

    • id - Unique message ID
    • conversation_id - Reference to conversation
    • pdf_id - Reference to PDF
    • user_id - Reference to user
    • role - Either 'user' or 'assistant'
    • content - Message content
    • tokens - Token count for billing
    • processing_time - Processing time in milliseconds

Indexes Created:

  • idx_pdf_conversations_user_id - For user queries
  • idx_pdf_conversations_pdf_id - For PDF queries
  • idx_conversation_messages_conversation_id - For conversation queries
  • idx_conversation_messages_pdf_id - For PDF queries
  • idx_conversation_messages_user_id - For user queries
  • idx_conversation_messages_created_at - For time-based queries

RLS Policies:

  • Users can only view/insert their own conversations and messages
  • Automatic cascade deletion when PDF is deleted

Helper Functions:

  • update_conversation_stats() - Trigger to update message count and last message time

Phase 2: Utility Functions ✅

PDF Management (src/lib/pdf/save-pdf-info.ts)

  • savePDFInfo() - Save or update PDF metadata
  • createOrGetConversation() - Create or retrieve conversation record
  • updatePDFParseStatus() - Update PDF parsing status

Conversation Management (src/lib/chat/save-conversation.ts)

  • saveConversationMessage() - Save individual message
  • saveConversationExchange() - Save user question + assistant response
  • getConversationStats() - Get conversation statistics
  • getConversationTokenCount() - Calculate total tokens used
  • deleteConversationMessages() - Delete all messages in conversation

PDF Retrieval (src/lib/pdf/get-pdf-list.ts)

  • getPDFList() - Get user's PDFs with conversation stats, sorting, pagination
  • getPDFWithStats() - Get single PDF with stats
  • userOwnsPDF() - Verify user ownership
  • getUserPDFCount() - Get total PDF count

Conversation Retrieval (src/lib/chat/get-conversation-history.ts)

  • getConversationHistory() - Get paginated conversation messages
  • getRecentMessages() - Get last N messages
  • getConversationMessageCount() - Get message count
  • searchConversationMessages() - Search messages by content
  • getConversationStats() - Get detailed conversation statistics

PDF Deletion (src/lib/pdf/delete-pdf.ts)

  • deletePDF() - Delete PDF and cascade delete conversations/messages
  • deleteAllUserPDFs() - Delete all PDFs for a user
  • softDeletePDF() - Mark PDF as deleted without removing data

Phase 3: API Endpoints ✅

1. GET /api/pdfs/list

Purpose: Get user's PDF list with conversation statistics

Query Parameters:

  • limit (default: 50, max: 100) - Number of PDFs per page
  • offset (default: 0) - Pagination offset
  • sortBy (default: 'uploadedAt') - Sort field: uploadedAt, conversationCount, lastConversationAt
  • sortOrder (default: 'desc') - Sort direction: asc, desc

Response:

{
  "success": true,
  "data": {
    "total": 10,
    "pdfs": [
      {
        "id": "uuid",
        "filename": "document.pdf",
        "fileSize": 1024000,
        "pageCount": 25,
        "parseStatus": "completed",
        "uploadedAt": "2026-03-10T08:00:00Z",
        "conversationCount": 5,
        "lastConversationAt": "2026-03-10T09:30:00Z"
      }
    ],
    "pagination": {
      "limit": 50,
      "offset": 0,
      "hasMore": false
    }
  },
  "timestamp": "2026-03-10T10:00:00Z"
}

2. GET /api/pdfs/{id}/conversations

Purpose: Get conversation history for a specific PDF

Query Parameters:

  • limit (default: 100, max: 100) - Messages per page
  • offset (default: 0) - Pagination offset

Response:

{
  "success": true,
  "data": {
    "pdfId": "uuid",
    "filename": "document.pdf",
    "pageCount": 25,
    "messages": [
      {
        "id": "uuid",
        "role": "user",
        "content": "What is this document about?",
        "createdAt": "2026-03-10T09:00:00Z"
      },
      {
        "id": "uuid",
        "role": "assistant",
        "content": "This document is about...",
        "createdAt": "2026-03-10T09:01:00Z",
        "tokens": 150,
        "processingTime": 2500
      }
    ],
    "pagination": {
      "limit": 100,
      "offset": 0,
      "total": 10,
      "hasMore": false
    }
  },
  "timestamp": "2026-03-10T10:00:00Z"
}

3. DELETE /api/pdfs/{id}

Purpose: Delete a PDF and all its associated data

Response:

{
  "success": true,
  "data": {
    "pdfId": "uuid",
    "messagesDeleted": 10,
    "conversationsDeleted": 1
  },
  "message": "PDF and all associated data deleted successfully",
  "timestamp": "2026-03-10T10:00:00Z"
}

Error Responses:

  • 401 Unauthorized - User not authenticated
  • 403 Forbidden - User doesn't own the PDF
  • 404 Not Found - PDF not found
  • 400 Bad Request - Invalid parameters
  • 500 Server Error - Internal error

Phase 4: Upload API Integration ✅

File: src/app/api/upload/route.ts

Updated to:

  1. Call savePDFInfo() to save PDF metadata
  2. Call createOrGetConversation() to create conversation record
  3. Handle errors gracefully without blocking upload

Phase 5: Chat API Integration (Partial) ✅

File: src/app/api/chat/route.ts

Updated to:

  1. Call recordQuotaUsage() for AI calls
  2. Prepare for conversation message saving (requires full response collection)

Note: Full conversation saving requires collecting the complete streamed response before saving. This will be enhanced in the next phase.

File Structure

src/
├── lib/
│   ├── pdf/
│   │   ├── save-pdf-info.ts (NEW)
│   │   ├── get-pdf-list.ts (NEW)
│   │   └── delete-pdf.ts (NEW)
│   └── chat/
│       ├── save-conversation.ts (NEW)
│       └── get-conversation-history.ts (NEW)
└── app/
    └── api/
        └── pdfs/
            ├── list/
            │   └── route.ts (NEW)
            ├── [id]/
            │   ├── route.ts (NEW - DELETE)
            │   └── conversations/
            │       └── route.ts (NEW - GET)
            └── (existing endpoints)

supabase/
└── migrations/
    └── 0006_create_conversation_tables.sql (NEW)

Database Schema

user_pdfs (Extended)

ALTER TABLE user_pdfs ADD COLUMN page_count INTEGER;
ALTER TABLE user_pdfs ADD COLUMN text_summary TEXT;
ALTER TABLE user_pdfs ADD COLUMN parse_status TEXT DEFAULT 'pending';
ALTER TABLE user_pdfs ADD COLUMN parsed_at TIMESTAMP;

pdf_conversations (New)

CREATE TABLE pdf_conversations (
  id UUID PRIMARY KEY,
  pdf_id UUID REFERENCES user_pdfs(id) ON DELETE CASCADE,
  user_id UUID REFERENCES auth.users(id) ON DELETE CASCADE,
  message_count INTEGER DEFAULT 0,
  last_message_at TIMESTAMP,
  created_at TIMESTAMP DEFAULT NOW(),
  updated_at TIMESTAMP DEFAULT NOW(),
  UNIQUE(pdf_id, user_id)
);

conversation_messages (New)

CREATE TABLE conversation_messages (
  id UUID PRIMARY KEY,
  conversation_id UUID REFERENCES pdf_conversations(id) ON DELETE CASCADE,
  pdf_id UUID REFERENCES user_pdfs(id) ON DELETE CASCADE,
  user_id UUID REFERENCES auth.users(id) ON DELETE CASCADE,
  role TEXT CHECK (role IN ('user', 'assistant')),
  content TEXT NOT NULL,
  tokens INTEGER,
  processing_time INTEGER,
  created_at TIMESTAMP DEFAULT NOW()
);

Next Steps

Phase 3: Complete Chat API Integration

  • Collect full streamed response before saving
  • Save conversation messages after each chat completion
  • Update conversation stats automatically

Phase 4: Frontend Components

  • Create PDF list component
  • Create conversation history viewer
  • Add delete confirmation dialog
  • Integrate with main page

Phase 5: Testing

  • Unit tests for all utility functions
  • Integration tests for API endpoints
  • E2E tests for complete workflows

Security Considerations

RLS Policies: All tables have row-level security enabled ✅ User Verification: All endpoints verify user ownership ✅ Parameter Validation: All inputs are validated ✅ Error Handling: Proper error responses with appropriate HTTP status codes ✅ Cascade Deletion: Conversations and messages are automatically deleted with PDF

Performance Optimizations

Indexes: Created on frequently queried columns ✅ Denormalization: Message count stored in pdf_conversations for fast queries ✅ Pagination: All list endpoints support pagination ✅ Sorting: Flexible sorting options for better UX

Error Handling

All endpoints include comprehensive error handling:

  • 401 Unauthorized - User not authenticated
  • 403 Forbidden - Access denied
  • 404 Not Found - Resource not found
  • 400 Bad Request - Invalid parameters
  • 500 Server Error - Internal errors

Logging

All functions include detailed console logging for debugging:

  • Function entry/exit
  • Parameter values
  • Query results
  • Error details

Testing Recommendations

  1. Unit Tests:

    • Test each utility function with valid/invalid inputs
    • Mock Supabase client
    • Test error scenarios
  2. Integration Tests:

    • Test complete workflows (upload → chat → retrieve)
    • Test permission checks
    • Test cascade deletion
  3. E2E Tests:

    • Test from user perspective
    • Test multiple users
    • Test concurrent operations

Deployment Steps

  1. Run migration: supabase migration up
  2. Deploy updated API routes
  3. Deploy updated upload/chat APIs
  4. Test all endpoints
  5. Deploy frontend components (in next phase)

Monitoring

Monitor these metrics:

  • API response times
  • Database query performance
  • Error rates
  • User adoption

Future Enhancements

  1. Conversation Search: Full-text search across all conversations
  2. Conversation Export: Export conversations as PDF/Markdown
  3. Conversation Sharing: Share conversation links
  4. Conversation Tags: Tag conversations for organization
  5. Conversation Analytics: Track usage patterns
  6. Batch Operations: Delete multiple PDFs at once
  7. Archiving: Archive old conversations instead of deleting