# Oddiya Project - System Prompt
**Version:** v1.3.1 - Streaming + Database Persistence
**Last Updated:** 2025-11-04
**Status:** ✅ Real-time streaming and database persistence implemented
## Critical Rules
1. **CHECK CURRENT STATUS FIRST** - Read `docs/CURRENT_IMPLEMENTATION_STATUS.md` before making changes
2. **ALWAYS EDIT EXISTING FILES** - Never create new files unless absolutely necessary or explicitly requested
3. **READ BEFORE WRITE** - Always use Read tool to check existing files before making changes
4. **PREFER UPDATES** - When adding features, update existing files rather than creating new ones
5. **NO DUPLICATE FILES** - Check for existing implementations before creating new files
6. **NEVER HARDCODE** - Use LLM for ALL travel content, configuration files for settings
**Data & Content:**
- ❌ NO switch/case for travel data
- ❌ NO if/else chains for destinations/activities
- ❌ NO inline strings for content (prompts, messages, activities)
- ❌ NO hardcoded restaurant/activity names in YAML files
- ✅ USE LLM (Gemini) for ALL travel content generation
- ✅ USE external configuration files for prompts only
- ✅ USE database for storing generated plans
- ✅ USE Redis for caching LLM responses
**Secrets & Credentials (CRITICAL SECURITY):**
- ❌ NO API keys hardcoded in source code
- ❌ NO OAuth client IDs/secrets in files
- ❌ NO database passwords in code or scripts
- ❌ NO AWS credentials in scripts or configs
- ❌ NO example secrets used for comparison (even in validation)
- ✅ USE environment variables (.env files, GitHub Secrets, AWS Secrets Manager)
- ✅ USE pattern validation (regex) instead of hardcoded examples
- ✅ LOAD secrets from .env at runtime only
- ✅ NEVER commit .env or secrets to git (.gitignore protects them)
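The "pattern validation instead of hardcoded examples" rule can be sketched as follows. This is a minimal illustration, not project code: the function names are hypothetical, and the "AIza" + 35-character key shape is an assumption about the Google API key format.

```python
import os
import re

# Hypothetical validator: confirms a value *looks like* a Google API key
# without embedding any real (or example) secret for comparison.
# The "AIza" + 35-char shape is an assumption about the key format.
GOOGLE_API_KEY_PATTERN = re.compile(r"^AIza[0-9A-Za-z_\-]{35}$")

def looks_like_google_api_key(value: str) -> bool:
    """Pattern check only -- no hardcoded example secret involved."""
    return bool(GOOGLE_API_KEY_PATTERN.match(value))

def validate_env() -> None:
    """Load the secret from the environment at runtime, then pattern-check it."""
    key = os.environ.get("GOOGLE_API_KEY", "")
    if not looks_like_google_api_key(key):
        raise RuntimeError("GOOGLE_API_KEY missing or malformed (pattern check failed)")
```

The point: validation never needs to know a real key, so no secret ever appears in source or in test fixtures.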
## Project Overview
- **Name:** Oddiya (v1.3.1)
- **Mission:** AI-powered mobile travel planner with real-time streaming
- **Current Phase:** Local Development → MVP Testing
- **Documentation:** See `docs/CURRENT_IMPLEMENTATION_STATUS.md` for latest status
- **Architecture:** Local development with Python FastAPI + Spring Boot + React Native
## Implementation Status (✅ = Complete, ⏳ = In Progress, ⬜ = Planned)
### Completed Features
- ✅ **Real-time Streaming (SSE)** - ChatGPT-style progressive display (2025-11-04)
- ✅ **Redis Caching** - 1-hour TTL, 99% cost savings (2025-11-04)
- ✅ **Database Persistence** - PostgreSQL storage with JPA (2025-11-04)
- ✅ **LLM-Only Architecture** - Google Gemini 2.0 Flash (Free tier)
- ✅ **Mobile App** - React Native 0.75 with SSE client
- ✅ **Plan CRUD** - Create, Read, Update, Delete implemented
### Next Steps
- ⬜ PlanDetail screen (mobile)
- ⬜ Error handling improvements
- ⬜ OAuth authentication completion
- ⬜ Video generation (future)
## Core Technology Stack
### Current Setup (Local Development)
- **Backend:** Spring Boot 3.2 (Java 21) + Python FastAPI 0.104
- **AI Engine:** Google Gemini 2.0 Flash (via direct API, NOT Bedrock)
- **AI Framework:** LangChain + LangGraph for iterative planning
- **Database:** PostgreSQL 17.0 (localhost:5432)
- **Cache:** Redis 7.4 (localhost:6379)
- **Mobile:** React Native 0.75 + Expo + Redux Toolkit
- **Streaming:** Server-Sent Events (SSE) protocol
### Infrastructure (Current)
- **Development:** Local macOS (all services running directly)
- **Database:** PostgreSQL 17.0 with schema-per-service
- **Cache:** Redis 7.4 for LLM response caching (1hr TTL)
- **AI:** Google Gemini API (free tier, no Bedrock)
- **Future:** AWS EKS deployment planned
### External APIs
- **NONE** - Pure LLM-only strategy
- Google Gemini 2.0 Flash provides all travel content dynamically
- NO hardcoded destinations, activities, or restaurants
## Current Architecture (Local Development)
### System Flow
```
Mobile App (React Native 0.75)
  ↓ SSE (Server-Sent Events)
LLM Agent (8000) ← Python FastAPI + LangChain + Gemini
  ↓ Redis cache check (miss → generate via Gemini, then cache)
Redis (6379) ← 1-hour TTL caching
  ↓ complete plan returned to Mobile, then persisted via REST
Plan Service (8083) ← Spring Boot + JPA
  ↓
PostgreSQL (5432) ← Persistent storage
```
### Running Services (Currently Active)
**1. LLM Agent (FastAPI)** - Port 8000 ✅ Running
- **Real-time Streaming:** SSE endpoint `/api/v1/plans/generate/stream`
- **AI Engine:** Google Gemini 2.0 Flash (gemini-2.0-flash-exp)
- **Framework:** LangChain + LangGraph for iterative planning
- **Caching:** Redis with 1-hour TTL (cache key: `plan:{location}:{startDate}:{endDate}:{budget}`)
- **Performance:** First generation ~6s, cached <1s
- **Prompts:** Externalized in YAML files
- **NO hardcoded data** - All content generated by LLM
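The caching step above can be sketched as follows. This is a minimal sketch using the cache-key scheme described: `client` stands in for a `redis.Redis` instance (redis-py `get`/`set` interface), and the function names are illustrative, not the actual source.

```python
import json

def cache_key(location: str, start: str, end: str, budget: str) -> str:
    # Matches the documented scheme: plan:{location}:{startDate}:{endDate}:{budget}
    return f"plan:{location}:{start}:{end}:{budget}"

def get_or_generate(client, location, start, end, budget, generate):
    """Return (plan, was_cache_hit). `generate` is the expensive Gemini call."""
    key = cache_key(location, start, end, budget)
    cached = client.get(key)
    if cached is not None:
        return json.loads(cached), True         # cache HIT (<1s)
    plan = generate()                           # ~6s Gemini call on cache MISS
    client.set(key, json.dumps(plan), ex=3600)  # 1-hour TTL (redis-py SET EX)
    return plan, False
```

Identical requests within the TTL window never reach Gemini, which is where the cost savings on repeated requests come from.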
**2. Plan Service (Spring Boot)** - Port 8083 ✅ Running
- **Database:** JPA + PostgreSQL for persistence
- **Repositories:** TravelPlanRepository, PlanDetailRepository
- **CRUD Operations:** Create, Read, Update, Delete, Complete
- **REST API:** `/api/v1/plans` endpoints
- **User Association:** Plans tied to user_id
- **Schema:** `plan_service.travel_plans`, `plan_service.plan_details`
**3. PostgreSQL** - Port 5432 ✅ Running
- **Version:** 17.0
- **Database:** oddiya
- **Schemas:** auth_service, user_service, plan_service, video_service
- **Connection:** `jdbc:postgresql://localhost:5432/oddiya?currentSchema=plan_service`
**4. Redis** - Port 6379 ✅ Running
- **Version:** 7.4
- **Usage:** LLM response caching (1-hour TTL)
- **Cache Hit Rate:** ~90%+ (significant cost savings)
### Services Not Yet Running (Planned)
- ⬜ API Gateway (8080) - Future integration layer
- ⬜ Auth Service (8081) - OAuth 2.0 implementation in progress
- ⬜ User Service (8082) - Basic user management
- ⬜ Video Service (8084) - Future feature
- ⬜ Video Worker - Future feature
## Database Schema (PostgreSQL 17.0) - Current
### Schema-per-Service Model (Currently Implemented)
```sql
-- plan_service.travel_plans (✅ IMPLEMENTED)
CREATE TABLE plan_service.travel_plans (
    id BIGSERIAL PRIMARY KEY,
    user_id BIGINT NOT NULL,
    title VARCHAR(255) NOT NULL,
    start_date DATE NOT NULL,
    end_date DATE NOT NULL,
    budget_level VARCHAR(20),             -- LOW, MEDIUM, HIGH
    status VARCHAR(20) DEFAULT 'PENDING', -- PENDING, CONFIRMED, COMPLETED, CANCELLED
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
);

-- plan_service.plan_details (✅ IMPLEMENTED)
CREATE TABLE plan_service.plan_details (
    id BIGSERIAL PRIMARY KEY,
    plan_id BIGINT NOT NULL,
    day INTEGER NOT NULL,
    location VARCHAR(255) NOT NULL,
    activity TEXT,
    created_at TIMESTAMP DEFAULT NOW(),
    CONSTRAINT fk_plan FOREIGN KEY (plan_id)
        REFERENCES plan_service.travel_plans(id) ON DELETE CASCADE
);
```
### Future Schemas (Not Yet Implemented)
```sql
-- user_service.users (⬜ PLANNED)
-- auth_service.tokens (⬜ PLANNED)
-- video_service.video_jobs (⬜ PLANNED - Future feature)
```
### JPA Entities & Repositories (Implemented)
```java
// TravelPlan entity with JPA annotations
@Entity
@Table(name = "travel_plans", schema = "plan_service")
public class TravelPlan {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private Long userId;
    private String title;
    private LocalDate startDate;
    private LocalDate endDate;
    private String budgetLevel;
    private String status;
    // ... timestamps, getters, setters
}

// Repository pattern
@Repository
public interface TravelPlanRepository extends JpaRepository<TravelPlan, Long> {
    List<TravelPlan> findByUserIdOrderByCreatedAtDesc(Long userId);
}
```
## Key Implementation Patterns (Current)
### Plan Generation Flow with Streaming (✅ IMPLEMENTED)
**Step 1: Real-time Streaming (Cache Miss)**
```
1. Mobile: POST /api/v1/plans/generate/stream with SSE
2. LLM Agent: Check Redis cache → MISS
3. LLM Agent: Stream via LangGraph + Gemini
- Status updates (10%, 20%, ..., 100%)
- LLM chunks (progressive text generation)
- Completion event with full plan JSON
4. LLM Agent: Cache result in Redis (1hr TTL)
5. Mobile: Display streaming updates in real-time
```
**Step 2: Database Persistence**
```
6. Mobile: Receives complete plan → POST /api/v1/plans (REST)
7. Plan Service: Save to PostgreSQL via JPA
8. Plan Service: Returns saved plan with ID
9. Mobile: Refresh plans list
```
**Cache Hit (Same Request)**
```
1. Mobile: POST /api/v1/plans/generate/stream
2. LLM Agent: Check Redis cache → HIT (<1s)
3. Mobile: Displays "💾 Cached" badge
4. Plan Service: Save to DB (same as above)
```
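The three event types in the streaming flow (status, chunk, completion) ride on standard SSE framing: an `event:` line, a `data:` line, and a blank line per event. A minimal sketch of the framing; the event names and payload fields are assumptions based on the flow above, not the exact source implementation.

```python
import json

def sse_event(event_type: str, payload: dict) -> str:
    """Format one Server-Sent Events frame: 'event:' line, 'data:' line, blank line."""
    return f"event: {event_type}\ndata: {json.dumps(payload)}\n\n"

# Illustrative stream, mirroring the documented flow:
frames = [
    sse_event("status", {"progress": 10, "message": "Planning..."}),   # progress updates
    sse_event("chunk", {"content": "Day 1: ..."}),                     # progressive LLM text
    sse_event("complete", {"plan": {"days": []}, "cached": False}),    # full plan JSON
]
```

On the mobile side, an SSE client dispatches on the `event:` name to update the progress bar, append text, or finalize the plan.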
### Reactive + Blocking Pattern (Spring Boot + JPA)
**Problem:** Mixing Project Reactor (Mono/Flux) with blocking JPA
**Solution:** Use `Mono.fromCallable()` for database operations
```java
// ❌ WRONG - blocking JPA call inside the reactive chain causes timeouts
public Mono<PlanResponse> createPlan(CreatePlanRequest request) {
    return llmAgentClient.generatePlan(llmRequest)
        .map(llmResponse -> {
            // Blocking JPA call on a reactor thread!
            TravelPlan saved = repository.save(plan);
            return toResponse(saved);
        });
}

// ✅ CORRECT - wrap the blocking call in Mono.fromCallable and shift it
// off the event loop with subscribeOn(Schedulers.boundedElastic())
public Mono<PlanResponse> createPlan(CreatePlanRequest request) {
    return llmAgentClient.generatePlan(llmRequest)
        .flatMap(llmResponse -> {
            TravelPlan plan = new TravelPlan();
            // ... set fields from llmResponse
            return Mono.fromCallable(() -> {
                TravelPlan saved = repository.save(plan);
                log.info("✅ Plan saved: id={}", saved.getId());
                return toResponse(saved);
            }).subscribeOn(Schedulers.boundedElastic());
        });
}
```
### LLM-Only Strategy (NO Hardcoded Data)
**All travel content is dynamically generated by Gemini:**
```python
# services/llm-agent/src/services/langgraph_planner.py
async def generate_plan_streaming(
    self,
    location: str,
    start_date: str,
    end_date: str,
    budget: str,
):
    # Load prompt template from YAML (NO hardcoded prompts)
    system_prompt = self.prompt_loader.get_planning_prompt()

    # LangGraph → Gemini generates ALL content:
    # NO fallback to YAML data, NO hardcoded restaurants/activities
    async for chunk in self.langgraph.astream(user_input):
        yield {"type": "chunk", "content": chunk}
```
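The externalized-prompt pattern means `prompts.yaml` holds all wording while code only loads and fills it. An illustrative fragment of what such a file could look like; the keys and wording here are assumptions, not the actual contents of `services/llm-agent/src/config/prompts.yaml`:

```yaml
# Illustrative shape only -- real keys/wording live in src/config/prompts.yaml
planning:
  system: |
    You are a travel planner. Generate a day-by-day itinerary for
    {location} from {start_date} to {end_date} within a {budget} budget.
    Return valid JSON only. Do not invent prices; describe budget tiers.
```

Changing prompt wording then requires no code change or redeploy of logic, only an updated config file.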
## Project Structure Convention
```
oddiya/
├── docs/ # 📚 Documentation
│ ├── CURRENT_IMPLEMENTATION_STATUS.md # ⭐ Start here - current state
│ ├── architecture/ # System design documents
│ ├── development/ # Development guides
│ └── archive/ # Historical implementation docs
├── services/ # 🔧 Backend Services
│ ├── llm-agent/ # ✅ Python FastAPI + LangChain + Gemini
│ │ ├── src/
│ │ │ ├── routes/langgraph_plans.py # SSE streaming endpoint
│ │ │ ├── services/langgraph_planner.py # LangGraph implementation
│ │ │ └── config/prompts.yaml # Externalized prompts
│ │ ├── main.py # FastAPI app
│ │ └── requirements.txt
│ ├── plan-service/ # ✅ Spring Boot + JPA
│ │ ├── src/main/java/com/oddiya/plan/
│ │ │ ├── controller/PlanController.java
│ │ │ ├── service/PlanService.java # CRUD with DB persistence
│ │ │ ├── repository/
│ │ │ │ ├── TravelPlanRepository.java
│ │ │ │ └── PlanDetailRepository.java
│ │ │ └── entity/
│ │ │ ├── TravelPlan.java
│ │ │ └── PlanDetail.java
│ │ └── src/main/resources/application.yml # JPA enabled
│ ├── auth-service/ # ⏳ OAuth 2.0 (in progress)
│ ├── api-gateway/ # ⬜ Future
│ └── user-service/ # ⬜ Future
├── mobile/ # 📱 React Native Mobile App
│ ├── src/
│ │ ├── api/
│ │ │ ├── services.ts # REST API client
│ │ │ └── streaming.ts # SSE streaming client
│ │ ├── screens/
│ │ │ ├── PlansScreen.tsx # Plan list view
│ │ │ └── CreatePlanScreen.tsx # Streaming UI
│ │ └── store/slices/
│ │ └── plansSlice.ts # Redux state
│ ├── package.json
│ └── App.tsx
├── scripts/ # 🛠️ Automation Scripts
├── .env # Environment variables
├── CLAUDE.md # System prompt for Claude Code
├── .cursorrules # System prompt for Cursor
├── README.md # Project overview
└── CHANGELOG_2025-11-04.md # Today's changes
```
## Coding Guidelines
### For Spring Boot Services
- Use Spring Boot 3.2, Java 21
- Package structure: `com.oddiya.{service}.{layer}`
- Layers: controller, service, repository, dto, entity, config
- Use `application.yml` for configuration (not .properties)
- Database connection: load host and credentials from environment variables (localhost now; EC2/EKS private IPs later)
- Error handling: Use `@ControllerAdvice` for global exception handling
- Validation: Use Jakarta Bean Validation (`@Valid`)
- Logging: Use SLF4J with Logback
### For Python Services
- Use Python 3.11+
- FastAPI for LLM Agent
- Pydantic for data validation
- Black for code formatting
- Use environment variables via python-dotenv
- Async/await for I/O operations
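The environment-variable guideline can be sketched with a small stdlib-only helper; in the real service, python-dotenv's `load_dotenv()` would populate `os.environ` from `.env` first. The helper name and the example variable names are illustrative (the variable names match the Environment Variables section below).

```python
import os
from typing import Optional

def require_env(name: str, default: Optional[str] = None) -> str:
    """Read a variable from the environment, failing fast if it is missing."""
    value = os.environ.get(name, default)
    if value is None:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

# Example usage (names from the Environment Variables section):
# api_key = require_env("GOOGLE_API_KEY")
# model = require_env("GEMINI_MODEL", "gemini-2.0-flash-exp")
```

Failing fast at startup is preferable to a cryptic mid-request error when a key is absent.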
### Kubernetes Manifests
- Use Deployments (not StatefulSets) for all services
- Resource limits: Start conservative (CPU: 200m-500m, Memory: 256Mi-512Mi)
- Health checks: liveness and readiness probes for all services
- ConfigMaps for configuration, Secrets for credentials
- Service discovery: Use K8s Service names (e.g., `http://auth-service:8081`)
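The manifest rules above can be sketched as a minimal Deployment fragment. Everything here is a placeholder: the image name and labels are illustrative, and the Spring Boot actuator probe paths are assumed.

```yaml
# Hypothetical plan-service Deployment fragment (illustrative values only)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: plan-service
spec:
  replicas: 1
  selector:
    matchLabels: { app: plan-service }
  template:
    metadata:
      labels: { app: plan-service }
    spec:
      containers:
        - name: plan-service
          image: oddiya/plan-service:latest   # placeholder image
          ports:
            - containerPort: 8083
          resources:
            requests: { cpu: 200m, memory: 256Mi }   # conservative start
            limits:   { cpu: 500m, memory: 512Mi }
          livenessProbe:
            httpGet: { path: /actuator/health/liveness, port: 8083 }
          readinessProbe:
            httpGet: { path: /actuator/health/readiness, port: 8083 }
```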
## Performance Considerations
### ⚠️ Known Bottleneck
- PostgreSQL on t2.micro (1GB RAM) will be extremely slow
- This is an accepted trade-off for learning/cost reasons
- Load tests will expose this bottleneck
### Optimization Strategies
- Redis caching for LLM Agent responses (1hr TTL)
- Connection pooling for database connections (keep pool size low)
- Horizontal Pod Autoscaler (HPA) for stateless services
- SQS for async processing (avoid blocking operations)
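For the "keep pool size low" point, a possible `application.yml` fragment (HikariCP is Spring Boot's default pool; the values are illustrative for a small database instance, not tuned numbers from the project):

```yaml
# Illustrative HikariCP settings for a memory-constrained PostgreSQL instance
spring:
  datasource:
    hikari:
      maximum-pool-size: 5        # small pool to avoid overwhelming 1GB RAM
      connection-timeout: 30000   # ms to wait for a free connection
```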
## Common Commands (Local Development)
### Start Services
```bash
# 1. Start LLM Agent (Python FastAPI)
cd services/llm-agent
source venv/bin/activate
python main.py
# Running on http://localhost:8000
# 2. Start Plan Service (Spring Boot)
cd services/plan-service
./gradlew bootRun
# Running on http://localhost:8083
# 3. Start PostgreSQL and Redis
brew services start postgresql
brew services start redis
# 4. Start Mobile App
cd mobile
npm run ios # iOS Simulator
npm run android # Android Emulator
```
### Health Checks
```bash
# Check services
curl http://localhost:8000/health # LLM Agent
curl http://localhost:8083/actuator/health # Plan Service
redis-cli ping # Redis → PONG
pg_isready # PostgreSQL
# Check running services
ps aux | grep -E "python.*main.py|java.*plan-service"
```
### Build & Test
```bash
# Java (Spring Boot)
cd services/plan-service
./gradlew clean build # Build
./gradlew test # Run tests
# Python (FastAPI)
cd services/llm-agent
pip install -r requirements.txt
pytest # Run tests
# Mobile
cd mobile
npm install
npm test
```
### Database Operations
```bash
# Connect to PostgreSQL (load the password from .env -- never hardcode it)
PGPASSWORD="$DB_PASSWORD" psql -h localhost -U admin -d oddiya
# Check plans
SELECT * FROM plan_service.travel_plans ORDER BY created_at DESC LIMIT 5;
SELECT * FROM plan_service.plan_details WHERE plan_id = 1;
# Check Redis cache
redis-cli
> KEYS plan:*
> GET "plan:Seoul:2025-11-10:2025-11-12:MEDIUM"
```
### Logs
```bash
# LLM Agent logs
tail -f /tmp/llm-agent.log | grep -E "Streaming|Cache"
# Plan Service logs
tail -f /tmp/plan-service.log | grep "PlanService"
# Success indicators:
# [PlanService] ✅ Plan saved to database: id=1
# [LangGraph] Cache HIT for key: plan:Seoul:...
```
## Environment Variables (Current)
### Required Variables
```bash
# Google Gemini (LLM Agent) - FREE TIER
GOOGLE_API_KEY=your_api_key_here
GEMINI_MODEL=gemini-2.0-flash-exp
# PostgreSQL (Local Development)
DB_HOST=localhost
DB_PORT=5432
DB_NAME=oddiya
DB_USER=admin
DB_PASSWORD=your_db_password
# Redis (Local Development)
REDIS_HOST=localhost
REDIS_PORT=6379
# Service Ports (Current Running)
LLM_AGENT_PORT=8000
PLAN_SERVICE_PORT=8083
# Future (OAuth - In Progress)
GOOGLE_CLIENT_ID=***
GOOGLE_CLIENT_SECRET=***
APPLE_CLIENT_ID=***
APPLE_PRIVATE_KEY=***
# Future (AWS - Not Yet Used)
# AWS_REGION=ap-northeast-2
# S3_BUCKET=oddiya-storage
```
### File Locations
- `.env` - Root directory (gitignored)
- `.env.example` - Template for new developers
- `services/llm-agent/.env` - Python service env
- `services/plan-service/src/main/resources/application.yml` - Spring Boot config
## Testing Strategy
### Unit Tests
- Spring Boot: JUnit 5, Mockito
- FastAPI: pytest, pytest-asyncio
### Integration Tests
- Testcontainers for PostgreSQL/Redis
- Mock the Gemini API (under the LLM-only strategy there are no other external APIs to mock)
### Load Tests
- Locust for HTTP load testing
- Test scenarios: Auth flow, Plan generation, Video submission
## Important Notes (Current Implementation)
1. **✅ LLM-Only Strategy:** Google Gemini 2.0 Flash ONLY - NO external APIs (no Kakao, no weather, no exchange rate)
2. **✅ NO Hardcoded Data:** ALL travel content dynamically generated by LLM
3. **✅ Streaming Implementation:** SSE with real-time progress updates (ChatGPT-style)
4. **✅ Database Persistence:** JPA + PostgreSQL for saving generated plans
5. **✅ Redis Caching:** 1-hour TTL, 99% cost savings on repeated requests
6. **✅ Reactive Pattern:** Use `Mono.fromCallable()` for mixing reactive + blocking JPA
7. **⏳ OAuth In Progress:** Google/Apple OAuth implementation ongoing
8. **⬜ Future:** API Gateway, User Service, Video Generation
### Current Bugs Fixed (2025-11-04)
- ✅ Timer bug in CreatePlanScreen (was showing 0.0s)
- ✅ Plans not persisting (database was disabled)
- ✅ AsyncRequestTimeoutException (fixed with `Mono.fromCallable()`)
## When Starting New Chat
1. **FIRST:** Read `docs/CURRENT_IMPLEMENTATION_STATUS.md` for latest state
2. Check this file (`.cursorrules`) for architecture and patterns
3. Review `CLAUDE.md` for project guidelines
4. Check existing service implementations before creating new ones
5. Always prefer updating existing files over creating new ones
## Questions to Ask Before Implementation
- ✅ Does this file already exist? (Read first!)
- ✅ Can I update an existing file instead of creating a new one?
- ✅ Does this follow the LLM-only strategy? (NO hardcoded data!)
- ✅ Have I checked CURRENT_IMPLEMENTATION_STATUS.md?
- ✅ Is this consistent with the current local development setup?
- ✅ Am I using the correct AI model (Gemini, NOT Bedrock)?