🎬 Task 9.6: Testing - Narrative Flow and Demo Timing Validation
Test the complete narrative flow from disaster trigger to plan conclusion, validate timing, and ensure the story is compelling.
📝 Description
Conduct comprehensive narrative testing of the July 2020 scenario to ensure it tells a compelling, coherent story that judges will remember. Time each phase, verify key talking points appear in the output, test the dramatic reveals (HWY 407 threat, mutual aid requirement), and practice explaining the system while it processes. This is your dress rehearsal.
🎯 Acceptance Criteria
🧪 Testing Protocol
Phase 1: Technical Timing Test (30 minutes)
Setup:
# Terminal 1: start backend with logging
cd backend
python app.py
# Terminal 2 (from the project root): start frontend
cd frontend
npm start
# Open browser DevTools
# Network tab: Monitor API calls
# Console tab: Monitor logs
# Performance tab: Ready to record
Test Sequence:
RUN 1: Baseline Timing
1. Open stopwatch (phone or online timer)
2. Click "Simulate July 2020 Fire"
3. Start timer immediately
4. Record time for each milestone:
- [ ] Progress bar appears: ___ ms
- [ ] Progress reaches 20%: ___ s
- [ ] Progress reaches 50%: ___ s
- [ ] Progress reaches 80%: ___ s
- [ ] Progress reaches 100%: ___ s
- [ ] Plan appears: ___ s
- [ ] Map danger zone appears: ___ s
- [ ] Evacuation routes appear: ___ s
- [ ] Facility markers appear: ___ s
- [ ] Total time: ___ s
TARGET: Total time 55-65 seconds
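The stopwatch readings from Run 1 can be checked against the target with a few lines of Python. This is a minimal sketch, not part of the project code: the helper names `check_total_time` and `milestones_in_order` and the sample readings are hypothetical. Only the 55-65 second target comes from the protocol above.

```python
# Hypothetical helpers for checking stopwatch readings; not part of the project code.
TARGET_MIN_S = 55.0  # lower bound of the total-time target from the protocol
TARGET_MAX_S = 65.0  # upper bound of the total-time target from the protocol


def check_total_time(total_s: float) -> str:
    """Classify a measured total time against the 55-65 s target."""
    if total_s < TARGET_MIN_S:
        return "too fast"
    if total_s > TARGET_MAX_S:
        return "too slow"
    return "on target"


def milestones_in_order(times_s: list) -> bool:
    """Return True when recorded milestone times never decrease."""
    return all(a <= b for a, b in zip(times_s, times_s[1:]))


if __name__ == "__main__":
    # Made-up readings (seconds) for the 20%, 50%, 80% and 100% milestones.
    readings = [12.0, 30.5, 48.0, 58.2]
    print(milestones_in_order(readings))
    print(check_total_time(readings[-1]))
```

A decreasing milestone time usually points to a transcription error in the stopwatch log rather than a system fault, so it is worth checking before averaging runs.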
RUN 2: Console Monitoring
1. Watch console logs during processing
2. Note any errors or warnings
3. Check for:
- [ ] WebSocket messages arriving
- [ ] Progress updates smooth
- [ ] No API errors
- [ ] LLM response successful
- [ ] No React warnings
RUN 3: Network Validation
1. Monitor Network tab
2. Verify:
- [ ] POST /api/disaster/trigger succeeds
- [ ] WebSocket connection stable
- [ ] No failed requests
- [ ] Reasonable response sizes
- [ ] No unnecessary requests
RUN 4: Performance Profiling
1. Start Performance recording
2. Trigger disaster
3. Stop after plan displays
4. Analyze:
- [ ] No long tasks (>50ms)
- [ ] No layout thrashing
- [ ] Smooth 60fps animations
- [ ] Memory usage reasonable
RUN 5: Memory Leak Check
1. Take heap snapshot (baseline)
2. Trigger disaster 5 times
3. Take another heap snapshot
4. Compare memory usage
5. Check for detached DOM nodes
Phase 2: Narrative Content Validation (30 minutes)
Content Checklist - Test 5 Times:
Each time you trigger the July 2020 scenario, verify:
EXECUTIVE SUMMARY VALIDATION:
□ Mentions "HWY 407" or "Highway 407"
□ Uses urgent language ("CRITICAL", "IMMEDIATE")
□ Mentions "proactive closure" or "closure"
□ States timeline (2-3 hours)
□ Mentions mutual aid
□ Mentions population (2,000)
□ Total length: 2-3 sentences
□ Tone is professional but urgent
SITUATION OVERVIEW VALIDATION:
□ Describes fire size (40 acres)
□ Mentions location (407/410 interchange)
□ Describes weather conditions
□ Mentions spread rate
□ Mentions population at risk
□ Details infrastructure threat
□ Explains why immediate action needed
TIMELINE PREDICTIONS VALIDATION:
□ Shows HWY 407 threat
□ Timeline: 2-3 hours
□ Confidence: "high"
□ Impact described as "CRITICAL"
□ Shows residential threat
□ Timeline: 3-4 hours
□ Includes weather factors
RESOURCE ALLOCATION VALIDATION:
□ Mentions mutual aid
□ Lists Mississauga Fire
□ Lists Caledon Fire
□ Shows fire apparatus needed
□ Shows evacuation buses needed
□ Mentions highway coordination
COMMUNICATION TEMPLATES VALIDATION:
□ English template exists and is clear
□ Punjabi template exists with proper script
□ Hindi template exists with proper script
□ All templates mention location
□ All templates give clear action
□ All templates mention destination
MAP VISUALIZATIONS VALIDATION:
□ Red danger zone appears
□ Zone is in correct location (407/410)
□ Danger zone fades in smoothly
□ Green evacuation routes display
□ Routes are animated with arrows
□ Safe zone markers present
□ Facility markers appear
□ Schools marked correctly
□ Senior center marked
□ Hospital marked
□ Markers in danger zone highlighted
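Some of the text checks above (executive summary keywords, script used in each template) could be partly automated across the five runs. The sketch below is an assumption-laden illustration: the function names, the regular expressions, and the idea that the plan text is available as plain strings are all hypothetical. It only detects whether characters from the expected Unicode block are present. It cannot confirm that a translation is correct or renders properly, so the manual check still applies.

```python
import re

# Patterns derived from the executive summary checklist (matching is case-insensitive).
# The exact wording the system produces is an assumption; adjust patterns as needed.
SUMMARY_CHECKS = {
    "HWY 407": r"\b(HWY|Highway)\s*407\b",
    "urgent language": r"\b(CRITICAL|IMMEDIATE)",
    "closure": r"\bclosure\b",
    "mutual aid": r"\bmutual aid\b",
    "population": r"\b2,?000\b",
}

# Unicode block ranges for the scripts named in the template checklist.
GURMUKHI = (0x0A00, 0x0A7F)    # script used for Punjabi
DEVANAGARI = (0x0900, 0x097F)  # script used for Hindi


def missing_summary_items(summary: str) -> list:
    """Return the names of checklist items not found in the summary text."""
    return [
        name
        for name, pattern in SUMMARY_CHECKS.items()
        if not re.search(pattern, summary, flags=re.IGNORECASE)
    ]


def uses_script(text: str, block: tuple) -> bool:
    """Return True if any character of text falls inside the Unicode block."""
    low, high = block
    return any(low <= ord(ch) <= high for ch in text)


if __name__ == "__main__":
    # Made-up summary text for illustration only.
    sample = ("CRITICAL: Recommend proactive closure of HWY 407 within 2-3 hours. "
              "Mutual aid required; 2,000 residents at risk.")
    print(missing_summary_items(sample))
```

Subjective items such as tone and sentence count are deliberately left out, since a keyword match cannot judge them.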
Scoring System:
- All checkboxes pass: ✅ Perfect - Demo ready
- 1-2 missing: ⚠️ Good - Minor adjustments
- 3-5 missing: ⚠️ Needs work - Investigate
- 6+ missing: ❌ Critical issues - Debug required
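The scoring thresholds above map directly to a small function, which can help keep ratings consistent across the five runs. This is a sketch only. The function name `demo_readiness` is hypothetical, and the count of missing checkboxes is assumed to be tallied by hand.

```python
def demo_readiness(missing: int) -> str:
    """Map the number of failed checkboxes to the rating in the scoring system."""
    if missing < 0:
        raise ValueError("missing must be zero or greater")
    if missing == 0:
        return "Perfect - Demo ready"
    if missing <= 2:
        return "Good - Minor adjustments"
    if missing <= 5:
        return "Needs work - Investigate"
    return "Critical issues - Debug required"


if __name__ == "__main__":
    for count in (0, 2, 4, 7):
        print(count, demo_readiness(count))
```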
Phase 3: Storytelling Practice (45 minutes)
Practice the Narrative Arc:
Assign roles:
- Demo Driver: Controls computer, clicks buttons
- Narrator: Tells the story while system processes
- Technical Monitor: Watches for issues
- Judge Simulator: Asks questions
Script Practice Runs:
PRACTICE RUN 1: Full Script
1. Narrator follows 05_DEMO_SCRIPT.md exactly
2. Demo Driver clicks at right moments
3. Technical Monitor notes timing
4. Judge Simulator stays quiet
5. Goal: Complete in 5 minutes
Debrief:
- What felt rushed?
- What felt slow?
- Any awkward pauses?
- Adjust script accordingly
PRACTICE RUN 2: With Interruptions
1. Narrator delivers script
2. Judge Simulator interrupts with questions:
- "How does this work?"
- "Is this real data?"
- "What happens if wind changes?"
3. Narrator handles questions smoothly
4. Demo Driver keeps system running
5. Goal: Stay on track despite interruptions
Debrief:
- Were answers confident?
- Did we lose thread of story?
- Did demo keep progressing?
PRACTICE RUN 3: Emphasize Key Points
1. Narrator emphasizes critical moments:
- "HIGHWAY 407 CLOSURE" (loud, clear)
- "2.5 hours" (pause for emphasis)
- "Mutual aid" (show seriousness)
- "60 seconds" (point to timer)
2. Demo Driver highlights on screen:
- Points to danger zone
- Points to HWY 407 threat in timeline
- Points to mutual aid section
3. Goal: Judges remember key facts
Debrief:
- Did emphasis feel natural?
- Were visual cues effective?
- What resonated most?
PRACTICE RUN 4: Fast Version
1. Challenge: Complete in 4 minutes
2. Cut unnecessary words
3. Focus on core value proposition
4. Goal: Confirm we can deliver a shorter version if needed
PRACTICE RUN 5: Confident, Final
1. Treat this as the real demo
2. No mistakes allowed
3. Professional, polished delivery
4. Everyone knows their role
5. Goal: Perfect run
Debrief:
- Ready for judges?
- Any remaining concerns?
- Backup plans clear?
Phase 4: Dramatic Moments Testing (15 minutes)
Test Each "Wow" Moment:
WOW MOMENT 1: Progress Bar Speed
- System processes in 60 seconds
- Test: Does it feel impressively fast?
- Timing: Not too fast (looks fake) or too slow (boring)
- Target: 55-65 seconds feels right
WOW MOMENT 2: Danger Zone Reveal
- Red polygon appears on map
- Test: Is the animation smooth and dramatic?
- Visual: Does it look professional?
- Impact: Does it make judges say "wow"?
WOW MOMENT 3: HWY 407 Call-Out
- Executive summary explicitly recommends closure
- Test: Is it in all-caps? Is it prominent?
- Reading: Can you read it aloud dramatically?
- Impact: Does it prove the value proposition?
WOW MOMENT 4: Timeline Threat
- Timeline shows "2.5 hours until HWY 407"
- Test: Is it prominently displayed?
- Visual: Is the urgency clear?
- Impact: Does it show the time advantage?
WOW MOMENT 5: Mutual Aid Request
- Resource section shows 3 municipalities
- Test: Does it show the scale of response?
- Visual: Are the cards clear?
- Impact: Does it prove this is serious?
WOW MOMENT 6: Multi-Language Alerts
- Templates in English, Punjabi, Hindi
- Test: Do the scripts display correctly?
- Visual: Is the layout professional?
- Impact: Does it show inclusivity?
Each moment should make judges:
1. Lean forward
2. Nod in approval
3. Say "impressive"
4. Ask follow-up questions
Phase 5: Edge Case Testing (20 minutes)
Test Failure Scenarios:
TEST 1: Backend Crash
1. Start demo
2. Kill backend at 50% progress
3. Verify error banner appears
4. Verify UI doesn't freeze
5. Practice recovery speech:
"The live API isn't cooperating, but let me show you
the pre-recorded version that demonstrates the same
capabilities..."
TEST 2: Slow Network
1. Throttle network to "Slow 3G"
2. Trigger disaster
3. Verify progress still updates
4. Note if timing is acceptable
5. Practice explanation:
"The system is processing on a throttled connection,
but you can see it's still completing the analysis..."
TEST 3: LLM Timeout
1. Mock LLM to take 30+ seconds
2. Verify system doesn't hang
3. Verify timeout handling
4. Practice explanation:
"The AI synthesis is taking longer than usual, but
the core analysis is complete..."
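One way to set up Test 3 is to replace the LLM call with a stand-in that sleeps, then wrap it in a timeout. The sketch below assumes the backend can call the model asynchronously, which the task description does not confirm. `slow_llm`, `synthesize_with_timeout`, and the fallback string are hypothetical names, and the delays are shortened so the example runs quickly.

```python
import asyncio


async def slow_llm(delay_s: float) -> str:
    """Stand-in for the LLM call: sleeps instead of contacting a real model."""
    await asyncio.sleep(delay_s)
    return "synthesized plan"


async def synthesize_with_timeout(delay_s: float, timeout_s: float) -> str:
    """Return the LLM result, or a fallback message if the call exceeds timeout_s."""
    try:
        return await asyncio.wait_for(slow_llm(delay_s), timeout=timeout_s)
    except asyncio.TimeoutError:
        # The request is cancelled by wait_for; report a degraded result instead of hanging.
        return "fallback: core analysis only"


if __name__ == "__main__":
    # Short delays for illustration; the test above uses 30+ seconds.
    print(asyncio.run(synthesize_with_timeout(delay_s=0.2, timeout_s=0.05)))
```

For the actual test, the delay would be raised to 30 seconds or more, and the timeout set to whatever value the backend actually uses.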
TEST 4: Missing Map Token
1. Remove Mapbox token
2. Verify map error handling
3. Plan should still display
4. Practice explanation:
"We're having a map rendering issue, but the critical
intelligence is here in the plan..."
TEST 5: WebSocket Disconnect
1. Disable WebSocket mid-processing
2. Verify fallback to polling, or an appropriate error message if no fallback exists
3. Practice recovery
📊 Testing Results Template
Create docs/testing/july_2020_test_results.md:
# July 2020 Scenario Test Results
## Test Date: [DATE]
## Tester: [NAME]
### Timing Results (5 runs)
| Run | Total Time | Progress Time | Plan Display | Status |
|-----|-----------|---------------|--------------|--------|
| 1 | __s | __s | __s | ✅/❌ |
| 2 | __s | __s | __s | ✅/❌ |
| 3 | __s | __s | __s | ✅/❌ |
| 4 | __s | __s | __s | ✅/❌ |
| 5 | __s | __s | __s | ✅/❌ |
**Average:** __s
**Target:** 55-65s
**Result:** PASS / FAIL
### Content Validation
- [ ] HWY 407 mentioned in executive summary (5/5 runs)
- [ ] Timeline shows 2-3 hour threat (5/5 runs)
- [ ] Mutual aid clearly stated (5/5 runs)
- [ ] Map visualizations smooth (5/5 runs)
- [ ] Multi-language templates correct (5/5 runs)
### Narrative Quality (1-10)
- Compelling story: __/10
- Clear value prop: __/10
- Professional tone: __/10
- Judge engagement: __/10
### Issues Found
1. [ISSUE]: [DESCRIPTION]
- Severity: Critical / High / Medium / Low
- Status: Fixed / Open / Workaround
### Recommendations
- [RECOMMENDATION 1]
- [RECOMMENDATION 2]
### Demo Readiness
□ Technical: Ready / Not Ready
□ Content: Ready / Not Ready
□ Narrative: Ready / Not Ready
□ Team: Ready / Not Ready
**OVERALL: READY FOR DEMO / NEEDS WORK**
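The timing table and average in the template can be filled in from recorded runs with a short script. This is a sketch under several assumptions: each run is recorded as a (total, progress, plan display) tuple in seconds, a run passes when its total time falls in the 55-65 second target, and the overall result is PASS when the average total does. The template itself does not define these rules, and all names below are hypothetical.

```python
from statistics import mean

TARGET_RANGE_S = (55.0, 65.0)  # total-time target from the timing test


def in_target(seconds: float) -> bool:
    """Return True if a time falls inside the target range (inclusive)."""
    low, high = TARGET_RANGE_S
    return low <= seconds <= high


def render_timing_rows(runs: list) -> list:
    """Build one markdown table row per run of (total, progress, plan) seconds."""
    rows = []
    for number, (total, progress, plan) in enumerate(runs, start=1):
        status = "✅" if in_target(total) else "❌"
        rows.append(
            f"| {number} | {total:.1f}s | {progress:.1f}s | {plan:.1f}s | {status} |"
        )
    return rows


def summarize(runs: list) -> tuple:
    """Return the average total time and an assumed PASS/FAIL verdict."""
    average = mean([total for total, _, _ in runs])
    return average, ("PASS" if in_target(average) else "FAIL")


if __name__ == "__main__":
    # Made-up measurements for illustration only.
    sample_runs = [(58.0, 55.0, 57.0), (66.0, 63.0, 65.5)]
    print("\n".join(render_timing_rows(sample_runs)))
    print(summarize(sample_runs))
```

Judging the result on the average hides individual outliers, so the per-run status column is kept alongside it.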
📸 Documentation Captures
Record the following:
✅ Final Checklist
Before marking this task complete:
⏱️ Estimated Time
140 minutes (sum of the five phases: 30 + 30 + 45 + 15 + 20)
🔗 Related Documentation
05_DEMO_SCRIPT.md - Complete demo script
06_QUICK_REFERENCE.md - Testing checklist
All previous Epic 9 tasks