MindFlow - AI Voice Coach | Product Requirements Document (PRD) v2.3

Metadata	Details
Version	2.3 (Revised with Risk Mitigation)
Date	2025-11-21
Status	Ready for Development
Reviewer	Yuna Tseng

1. Critical Design Philosophy

Human Connection: This AI must feel like a trusted friend who listens, not a therapist who lectures.
Business Viability: We treat data as a strategic asset. We optimize for Unit Economics using intelligent model routing and prove value through User Sentiment Delta tracking.
Technical Robustness: Security and cost-efficiency are built into the architecture from day one.

2. Project Overview

Product Name: MindFlow - AI Voice Coach
Purpose: MindFlow is a web application that helps users maintain emotional wellbeing through voice journaling combined with AI-powered psychological support.
Core Value Proposition:
1. Validation First: AI validates feelings before offering insights.
2. Pattern Recognition (RAG): Discovers blind spots users can't see themselves by recalling past entries.
3. Adaptive Personality: Switches between listening and coaching modes.
4. Measurable Impact: Tracks emotional trajectory to prove wellbeing improvements.
Target Users: Individuals seeking mental wellness support, personal growth, or structured self-reflection.

3. User Flow

2.1 Authentication Flow

User lands on /login.
Supabase Auth handles Email/Password or Google OAuth.
Redirect to /dashboard upon success.

2.2 Voice Journaling Flow (Cost-Optimized)

Mode Selection: User selects Listening / Coaching / Smart Mode.
Recording: User records voice entry.
Transcription: Audio sent to /api/transcribe (Whisper).
Text Review (CRITICAL FOR COST SAVING):
- User sees the transcribed text.
- User is allowed to EDIT the text manually. (This avoids re-recording and re-transcribing costs when there are minor errors.)
Submission: User clicks "Save & Analyze".
Processing: System calculates sentiment, retrieves memories, and generates response.

2.3 AI Analysis & Routing Flow (The "Brain")

Technical Concept: How RAG Works Here

Embedding: Converting user text into mathematical coordinates (vectors).
Retrieval: Finding past entries that are "mathematically close" to current feelings, even if keywords differ (e.g., "tired" matches "exhausted").
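As a rough illustration of "mathematically close", the sketch below computes cosine similarity between toy vectors. The three-dimensional values are hypothetical stand-ins; real text-embedding-3-small vectors have 1536 dimensions, and in production the comparison would happen inside pgvector rather than in application code.

```typescript
// Cosine similarity between two equal-length vectors.
// pgvector's `<=>` operator returns cosine distance, which is 1 - similarity.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("Vectors must be non-empty and the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical toy "embeddings" chosen only to illustrate the idea.
const tired = [0.9, 0.1, 0.0];
const exhausted = [0.8, 0.2, 0.1];
const excited = [0.0, 0.2, 0.9];

// "tired" sits closer to "exhausted" than to "excited".
console.log(cosineSimilarity(tired, exhausted) > cosineSimilarity(tired, excited)); // true
```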

Process Steps:

Tone Detection (Fast/Cheap): System identifies primary emotion and energy level using gpt-4o-mini.
Model Routing:
- Simple Validation → Routes to gpt-4o-mini.
- Complex Coaching → Routes to gpt-4o.
RAG Retrieval: Fetches relevant past entries from pgvector.
- New Cost Control: Implements embedding caching/deduplication to prevent redundant calculations.
Response Generation: AI generates response based on Mode + Tone + History.
Data Logging: Sentiment score, cost of session, and tokens used are logged.
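The embedding caching/deduplication step might look like the minimal in-memory sketch below. Everything here is an assumption for illustration: the `createCachedEmbedder` helper is hypothetical, the embedder is synchronous to keep the example short (a real embedding call would be asynchronous), and a production cache would more likely live in the database keyed by a text hash.

```typescript
// Any function that turns text into a vector. Synchronous here for brevity (assumption).
type Embedder = (text: string) => number[];

// Treat entries that differ only in surrounding or repeated whitespace as identical.
function normalizeText(text: string): string {
  return text.trim().replace(/\s+/g, " ");
}

// Hypothetical wrapper: calls the underlying embedder only for text it has not seen.
function createCachedEmbedder(embed: Embedder) {
  const cache = new Map<string, number[]>();
  let misses = 0;
  return {
    embed(text: string): number[] {
      const key = normalizeText(text);
      const hit = cache.get(key);
      if (hit !== undefined) return hit; // reuse: no new embedding cost
      misses += 1;
      const vector = embed(key);
      cache.set(key, vector);
      return vector;
    },
    // Number of times the underlying embedder was actually invoked.
    get misses(): number {
      return misses;
    },
  };
}
```

Only exact matches after normalization are reused, which keeps the cache from returning an embedding for text the user did not actually write.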

3. Tech Stack & Config

3.1 Core Stack

Framework: Next.js 14 (App Router), TypeScript, React 18.
Database & Auth: Supabase (PostgreSQL + GoTrue).
Vector Store: Supabase pgvector.
AI Models: OpenAI (Whisper, GPT-4o, GPT-4o-mini, text-embedding-3-small).
State Management: Zustand.

3.2 UI Libraries

Shadcn/UI, Tailwind CSS, Lucide React.

3.3 Environment Variables (MANDATORY & SECURITY WARNING)

⚠️ SECURITY WARNING:

SUPABASE_SERVICE_ROLE_KEY: This key possesses full privileges to bypass Row Level Security (RLS). It must NEVER appear in frontend code (NEXT_PUBLIC_...). It is strictly for use in Next.js API Routes or Server Actions.

OpenAI Credit Check: Ensure your OpenAI account has sufficient Credit Balance (Pre-paid credits) before development. Simply having a credit card linked may not be sufficient for API access depending on account tier.

Create a .env.local file with these exact keys:

# Supabase (Retrieve from Project Settings -> API)
NEXT_PUBLIC_SUPABASE_URL=https://your-project.supabase.co
NEXT_PUBLIC_SUPABASE_ANON_KEY=your-anon-key
# BACKEND ONLY: never expose this key with a NEXT_PUBLIC_ prefix
SUPABASE_SERVICE_ROLE_KEY=your-service-role-key

# OpenAI
OPENAI_API_KEY=sk-proj-...

# App Config
NEXT_PUBLIC_APP_URL=http://localhost:3000
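A startup check along the following lines could catch a missing key or an accidentally exposed secret early. This is a sketch, not part of the specified architecture: `checkEnv` is a hypothetical helper, and treating OPENAI_API_KEY as server-only is an assumption based on the template giving it no NEXT_PUBLIC_ prefix.

```typescript
// Keys listed in the .env.local template.
const REQUIRED_KEYS = [
  "NEXT_PUBLIC_SUPABASE_URL",
  "NEXT_PUBLIC_SUPABASE_ANON_KEY",
  "SUPABASE_SERVICE_ROLE_KEY",
  "OPENAI_API_KEY",
  "NEXT_PUBLIC_APP_URL",
] as const;

// Secrets that must never be published under a NEXT_PUBLIC_ name.
// OPENAI_API_KEY is included by assumption.
const SERVER_ONLY_KEYS = ["SUPABASE_SERVICE_ROLE_KEY", "OPENAI_API_KEY"];

type Env = Record<string, string | undefined>;

// Hypothetical helper: returns a list of problems, empty when the env looks fine.
function checkEnv(env: Env): string[] {
  const problems: string[] = [];
  for (const key of REQUIRED_KEYS) {
    if (!env[key]) problems.push(`Missing ${key}`);
  }
  for (const key of SERVER_ONLY_KEYS) {
    if (env[`NEXT_PUBLIC_${key}`] !== undefined) {
      problems.push(`${key} must not be exposed as NEXT_PUBLIC_${key}`);
    }
  }
  return problems;
}
```

On the server this could be called once with `process.env`, failing fast if the returned list is not empty.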

4. Database Schema (Enhanced)

⚠️ CRITICAL SETUP INSTRUCTION: The SQL commands below may not run reliably through migration scripts in every environment. Before starting development, you MUST manually copy and run the SQL in sections 4.1 and 4.2 in the Supabase SQL Editor dashboard so that the vector extension and indexes are correctly activated.

4.1 Extensions

-- Enable Vector extension for RAG
-- Note: Must be enabled manually in Dashboard if this fails
CREATE EXTENSION IF NOT EXISTS vector;

4.2 Tables

Profiles

CREATE TABLE public.profiles (
  id UUID REFERENCES auth.users(id) PRIMARY KEY,
  email TEXT UNIQUE NOT NULL,
  full_name TEXT,
  default_ai_mode TEXT DEFAULT 'smart',
  created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
ALTER TABLE public.profiles ENABLE ROW LEVEL SECURITY;
-- Add RLS policies here...

Entries (Updated for Data Strategy)

CREATE TABLE public.entries (
  id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
  user_id UUID REFERENCES auth.users(id) ON DELETE CASCADE NOT NULL,

  -- Content
  transcription TEXT NOT NULL,
  audio_url TEXT,

  -- AI Analysis & Metadata
  ai_response TEXT,
  ai_mode TEXT CHECK (ai_mode IN ('listening', 'coaching', 'smart')),
  emotion_tags TEXT[],
  detected_tone TEXT,

  -- Data Strategy & BI Columns
  sentiment_score FLOAT, -- 0.0 (Negative) to 1.0 (Positive)
  energy_score FLOAT,    -- 0.0 (Low) to 1.0 (High Energy)
  tokens_used INT,       -- Total tokens for this entry
  cost_usd FLOAT,        -- Estimated cost for this entry

  -- RAG
  referenced_entry_ids UUID[],

  -- Search
  transcription_tsv tsvector GENERATED ALWAYS AS (to_tsvector('english', transcription)) STORED,

  created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
ALTER TABLE public.entries ENABLE ROW LEVEL SECURITY;
-- Add RLS policies here (auth.uid() = user_id)

Embeddings

CREATE TABLE public.embeddings (
  id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
  entry_id UUID REFERENCES public.entries(id) ON DELETE CASCADE NOT NULL,
  user_id UUID REFERENCES auth.users(id) ON DELETE CASCADE NOT NULL,
  embedding vector(1536) NOT NULL, -- Matches text-embedding-3-small
  created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

-- CRITICAL PERFORMANCE INDEX
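-- ivfflat builds its clusters from the rows present at creation time, so recall
-- is best when the index is created (or rebuilt with REINDEX) after some data exists.
-- Creating it early on an empty table is acceptable to get started.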
CREATE INDEX embeddings_vector_idx ON public.embeddings USING ivfflat (embedding vector_cosine_ops);

Usage Logs (For Unit Economics)

CREATE TABLE public.usage_logs (
  id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
  user_id UUID REFERENCES auth.users(id),
  feature TEXT,       -- 'transcription', 'tone_detect', 'chat_response'
  model_used TEXT,    -- 'whisper', 'gpt-4o', 'gpt-4o-mini'
  input_tokens INT,
  output_tokens INT,
  estimated_cost FLOAT,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
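A row for usage_logs could be assembled as in the sketch below. The `buildUsageLog` helper and the `TokenPricing` shape are hypothetical, and the prices in the example are placeholders rather than real rates. Transcription is typically billed by audio duration rather than tokens, so it would need its own calculation.

```typescript
// Fields of public.usage_logs that the application fills in
// (id, user_id and created_at omitted for brevity).
interface UsageLogRow {
  feature: "transcription" | "tone_detect" | "chat_response";
  model_used: string;
  input_tokens: number;
  output_tokens: number;
  estimated_cost: number; // USD
}

// Price in USD per one million tokens, supplied by the caller.
interface TokenPricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

// Hypothetical helper: estimates cost from token counts and caller-supplied prices.
function buildUsageLog(
  feature: UsageLogRow["feature"],
  model: string,
  inputTokens: number,
  outputTokens: number,
  pricing: TokenPricing,
): UsageLogRow {
  const estimated_cost =
    (inputTokens * pricing.inputPerMillion + outputTokens * pricing.outputPerMillion) / 1e6;
  return {
    feature,
    model_used: model,
    input_tokens: inputTokens,
    output_tokens: outputTokens,
    estimated_cost,
  };
}

// PLACEHOLDER prices for illustration only; look up current rates before use.
const examplePricing: TokenPricing = { inputPerMillion: 1, outputPerMillion: 2 };
const exampleRow = buildUsageLog("chat_response", "gpt-4o-mini", 1000, 500, examplePricing);
console.log(exampleRow.estimated_cost); // 0.002 with the placeholder prices
```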

4.3 Helper Functions (RAG)

-- Function to find similar entries
CREATE OR REPLACE FUNCTION match_entries(
  query_embedding vector(1536),
  match_user_id UUID, -- SECURITY: API must pass auth.uid() here
  match_threshold float DEFAULT 0.7,
  match_count int DEFAULT 5,
  exclude_entry_id UUID DEFAULT NULL
)
RETURNS TABLE (
  entry_id UUID,
  transcription TEXT,
  emotion_tags TEXT[],
  detected_tone TEXT,
  created_at TIMESTAMP WITH TIME ZONE,
  similarity float
)
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN QUERY
  SELECT
    e.id AS entry_id,
    e.transcription,
    e.emotion_tags,
    e.detected_tone,
    e.created_at,
    1 - (emb.embedding <=> query_embedding) AS similarity
  FROM embeddings emb
  JOIN entries e ON emb.entry_id = e.id
  WHERE emb.user_id = match_user_id -- DATA ISOLATION CHECK
    AND (exclude_entry_id IS NULL OR e.id != exclude_entry_id)
    AND 1 - (emb.embedding <=> query_embedding) > match_threshold
  ORDER BY emb.embedding <=> query_embedding
  LIMIT match_count;
END;
$$;
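On the application side, the arguments for this function could be prepared as in the sketch below. The `buildMatchEntriesArgs` helper is hypothetical; its defaults mirror the SQL defaults above, and the user ID is assumed to come from the verified session rather than from the request body.

```typescript
const EMBEDDING_DIMENSIONS = 1536; // matches vector(1536) in the SQL signature

// Named arguments expected by match_entries.
interface MatchEntriesArgs {
  query_embedding: number[];
  match_user_id: string;
  match_threshold: number;
  match_count: number;
  exclude_entry_id: string | null;
}

// Hypothetical helper: validates inputs and applies the same defaults as the SQL function.
function buildMatchEntriesArgs(
  queryEmbedding: number[],
  authenticatedUserId: string,
  options: { threshold?: number; count?: number; excludeEntryId?: string } = {},
): MatchEntriesArgs {
  if (queryEmbedding.length !== EMBEDDING_DIMENSIONS) {
    throw new Error(`Expected ${EMBEDDING_DIMENSIONS} dimensions, got ${queryEmbedding.length}`);
  }
  const count = options.count ?? 5;
  if (!Number.isInteger(count) || count < 1) {
    throw new Error("match_count must be a positive integer");
  }
  return {
    query_embedding: queryEmbedding,
    match_user_id: authenticatedUserId, // must be the verified user's ID
    match_threshold: options.threshold ?? 0.7,
    match_count: count,
    exclude_entry_id: options.excludeEntryId ?? null,
  };
}
```

With supabase-js, the resulting object could then be passed as the second argument to `supabase.rpc("match_entries", args)`.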

5. API Routes Structure

`POST /api/transcribe`

Input: FormData with audio file.
Model: whisper-1.
Logic: Return raw text. Allow client to edit text before saving.

`POST /api/journal` (The Router Gateway)

Security: Verify supabase.auth.getUser() first.
Cost Control: Check daily usage limit for the user before processing.
Logic:
1. Save transcription to DB.
2. Embedding Cache Check: (Optional) Check whether identical text already exists (hash check) so its embedding can be reused.
3. Generate Embedding (text-embedding-3-small).
4. Tone Detection: Call gpt-4o-mini to get tone, sentiment_score.
5. Routing Decision:
  - IF aiMode == 'listening' → Use gpt-4o-mini.
  - IF aiMode == 'coaching' → Use gpt-4o.
  - IF aiMode == 'smart':
    - IF sentiment_score < 0.3 (High Distress) → gpt-4o (Safety/Depth).
    - ELSE → gpt-4o-mini.
6. RAG Retrieval: Call match_entries RPC.
7. Generate Response: Call selected model with history context.
8. Logging: Record costs to usage_logs.
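The routing decision in step 5 can be expressed as a small pure function. This is a minimal sketch; the `selectModel` name and the type aliases are illustrative, while the modes, models, and 0.3 threshold come from the steps above.

```typescript
type AiMode = "listening" | "coaching" | "smart";
type ChatModel = "gpt-4o" | "gpt-4o-mini";

// Below this sentiment score an entry is treated as high distress.
const HIGH_DISTRESS_THRESHOLD = 0.3;

// Picks the response model from the selected mode and the detected sentiment.
function selectModel(aiMode: AiMode, sentimentScore: number): ChatModel {
  if (aiMode === "listening") return "gpt-4o-mini";
  if (aiMode === "coaching") return "gpt-4o";
  // Smart mode: escalate to the stronger model only for high-distress entries.
  return sentimentScore < HIGH_DISTRESS_THRESHOLD ? "gpt-4o" : "gpt-4o-mini";
}

console.log(selectModel("smart", 0.2)); // "gpt-4o"
console.log(selectModel("smart", 0.8)); // "gpt-4o-mini"
```

Keeping the decision free of I/O makes it straightforward to unit test and to adjust if the threshold changes.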

6. Logic & Prompts

6.1 Tone Detection (Low Cost)

Model: gpt-4o-mini

System Prompt:

Analyze the user's journal entry. Return JSON:
{
  "tone": "positive" | "negative" | "neutral" | "seeking_help",
  "emotionTags": ["tag1", "tag2"],
  "sentiment_score": 0.0 to 1.0,
  "energy_score": 0.0 to 1.0
}
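Because model output is not guaranteed to match the requested shape, the response could be validated before it is stored. The sketch below assumes the raw model reply arrives as a string; `parseToneResult` is a hypothetical helper and returning null on any mismatch is one possible design.

```typescript
type Tone = "positive" | "negative" | "neutral" | "seeking_help";

interface ToneResult {
  tone: Tone;
  emotionTags: string[];
  sentiment_score: number;
  energy_score: number;
}

const TONES: readonly string[] = ["positive", "negative", "neutral", "seeking_help"];

// True for finite numbers in the closed range 0.0 to 1.0.
function isUnitScore(value: unknown): value is number {
  return typeof value === "number" && Number.isFinite(value) && value >= 0 && value <= 1;
}

// Hypothetical helper: returns the parsed result, or null if the reply is malformed.
function parseToneResult(raw: string): ToneResult | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const record = data as Record<string, unknown>;
  const tone = record["tone"];
  const tags = record["emotionTags"];
  const sentiment = record["sentiment_score"];
  const energy = record["energy_score"];
  if (typeof tone !== "string" || !TONES.includes(tone)) return null;
  if (!Array.isArray(tags) || !tags.every((t) => typeof t === "string")) return null;
  if (!isUnitScore(sentiment) || !isUnitScore(energy)) return null;
  return {
    tone: tone as Tone,
    emotionTags: tags as string[],
    sentiment_score: sentiment,
    energy_score: energy,
  };
}
```

A null result could fall back to a neutral default or trigger a single retry; the PRD does not specify which.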

6.2 Mode-Aware System Prompts

Context Injection: "User mentioned [Previous Entry Date]: [Summary]. Use this context naturally."
IF Listening Mode (gpt-4o-mini): "Validate. Mirror emotions. Keep under 50 words. No advice."
IF Coaching Mode (gpt-4o): "Validate FIRST. Reference patterns. Ask one gentle question."
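Assembling the system prompt from these pieces might look like the sketch below. `buildSystemPrompt` and the `Memory` shape are hypothetical; Smart Mode is assumed to have been resolved to listening or coaching before this point, since no separate Smart prompt is defined here.

```typescript
// The two modes that have a prompt defined.
type ResponseMode = "listening" | "coaching";

// A past entry retrieved through RAG (shape is an assumption).
interface Memory {
  date: string; // e.g. "2025-11-14"
  summary: string;
}

const MODE_INSTRUCTIONS: Record<ResponseMode, string> = {
  listening: "Validate. Mirror emotions. Keep under 50 words. No advice.",
  coaching: "Validate FIRST. Reference patterns. Ask one gentle question.",
};

// Combines the mode instruction with one context line per retrieved memory.
function buildSystemPrompt(mode: ResponseMode, memories: Memory[]): string {
  const lines = [MODE_INSTRUCTIONS[mode]];
  for (const memory of memories) {
    lines.push(`User mentioned ${memory.date}: ${memory.summary}.`);
  }
  if (memories.length > 0) {
    lines.push("Use this context naturally.");
  }
  return lines.join("\n");
}
```

When no memories are retrieved, the prompt reduces to the bare mode instruction, so first-time users get a response without fabricated history.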

7. Data Strategy & Business Intelligence

7.1 User Sentiment Delta (Impact Tracking)

Metric: Sentiment Delta (Current Entry Score vs. 7-Day Moving Average).
Visualization: "Mood Horizon" chart in Dashboard.
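One way to compute the metric is sketched below. Two details are assumptions, since the definition above leaves them open: the moving average is taken per entry (not per day) over the 7 days before the current entry, and the current entry itself is excluded from the average.

```typescript
interface ScoredEntry {
  createdAt: Date;
  sentimentScore: number; // 0.0 (negative) to 1.0 (positive)
}

const WINDOW_MS = 7 * 24 * 60 * 60 * 1000; // 7 days in milliseconds

// Returns the current score minus the mean score of entries from the preceding
// 7 days, or null when there are no earlier entries in that window.
function sentimentDelta(current: ScoredEntry, history: ScoredEntry[]): number | null {
  const end = current.createdAt.getTime();
  const start = end - WINDOW_MS;
  const windowScores = history
    .filter((entry) => {
      const time = entry.createdAt.getTime();
      return time >= start && time < end;
    })
    .map((entry) => entry.sentimentScore);
  if (windowScores.length === 0) return null;
  const mean = windowScores.reduce((sum, score) => sum + score, 0) / windowScores.length;
  return current.sentimentScore - mean;
}
```

A positive result means the current entry scores above the user's recent average; null signals that there is not yet enough history to plot a delta.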

7.2 Intelligent Model Routing & Cost Control (Optimized)

Goal: Keep Cost Per Session (CpS) < $0.03.
Strategy:
1. Model Hierarchy: Use gpt-4o-mini for >60% of interactions (Tone detect + Listening mode). Use gpt-4o only when deep reasoning is required.
2. Embedding Optimization: Implement simple caching or deduplication for identical text entries.
3. Daily Limits: Enforce a soft limit (e.g., 5 AI-analyzed entries per day) per user to prevent cost runaways during beta.
4. Batch Processing: (Future Phase) Consider delaying non-critical Tone Detection for batch processing if real-time analysis isn't strictly required for the UI update.
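The daily limit in item 3 could be checked with a helper like the one below. This is a sketch under stated assumptions: `checkDailyLimit` is hypothetical, a "day" is taken to be a UTC calendar day, and because the limit is soft, the caller decides whether to block or merely warn.

```typescript
const DAILY_ENTRY_LIMIT = 5; // example value from the strategy above

interface LimitCheck {
  allowed: boolean;
  remaining: number;
}

// Calendar day in UTC, formatted as YYYY-MM-DD (assumption: limits reset at UTC midnight).
function utcDay(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// Counts today's AI-analyzed entries and reports whether another one fits the limit.
function checkDailyLimit(
  analyzedAt: Date[],
  now: Date,
  limit: number = DAILY_ENTRY_LIMIT,
): LimitCheck {
  const today = utcDay(now);
  const usedToday = analyzedAt.filter((date) => utcDay(date) === today).length;
  return {
    allowed: usedToday < limit,
    remaining: Math.max(0, limit - usedToday),
  };
}
```

In `/api/journal` this check would run before any model call, so an over-limit request incurs no API cost.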

8. Success Metrics

Business: Average Margin per User > 70%.
User: Positive Sentiment Delta after 14 days.
Tech: API Errors < 1%.

End of PRD.
