cahaseler · Jul 1, 2025
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -0,0 +1,97 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+EMPACT (Environment and Maturity Program Assessment and Control Tool) is a full-stack assessment platform implementing the IP2M METRR evaluation model. It's built with Next.js 15, TypeScript, and supports both web and desktop deployment via Tauri.
+
+## Development Commands
+
+All commands should be run from the `web/` directory unless specified otherwise:
+
+```bash
+# Development
+yarn dev              # Start development server on http://localhost:3000
+
+# Building
+yarn build            # Generate Prisma clients and build Next.js app
+yarn start            # Start production server
+
+# Code Quality
+yarn lint             # Run ESLint
+yarn lint:fix         # Run ESLint with auto-fix
+yarn typecheck        # Run TypeScript type checking
+
+# Testing
+yarn test             # Build and run Playwright tests
+yarn test:ui          # Run tests with UI
+yarn test:debug       # Run tests in debug mode
+
+# Database
+yarn prisma-generate  # Generate Prisma clients for all database types
+
+# Data Import
+yarn import:ip2m      # Import legacy IP2M METRR data from Excel
+                      # See web/scripts/README.md for detailed documentation
+```
+
+## Architecture Overview
+
+The application uses Next.js App Router with a clear separation between frontend routes and API endpoints:
+
+- **Frontend Routes** (`app/(frontend)/`):
+  - Protected routes under `(logged-in)/` require authentication via Clerk
+  - Public routes under `(logged-out)/` for authentication flows
+  - Key pages: Dashboard, Assessments, Collections, Recommendations, Questionnaire
+
+- **API Layer** (`app/api/`):
+  - RESTful endpoints for data operations
+  - Clerk webhook handling for user synchronization
+  - Database operations via Prisma ORM
+
+- **Database Architecture**:
+  - Multi-database support (MSSQL primary, PostgreSQL/SQLite for flexibility)
+  - Prisma schemas in `web/prisma/[db-type]/schema.prisma`
+  - Generated clients in `web/prisma/generated/[db-type]/`
+  - Use `getDatabaseClient()` from `web/lib/db.ts` for database access
+
+- **Component Structure**:
+  - Shared components in `web/components/`
+  - UI components from Shadcn UI library
+  - All components use TypeScript with proper type definitions
+
+## Key Patterns and Conventions
+
+1. **Authentication**:
+   - Clerk handles authentication with SSO support
+   - User sync happens via webhooks to maintain database records
+   - Check authentication state using Clerk's hooks/utilities
+
+2. **Database Access**:
+   - Always use the database client from `lib/db.ts`
+   - Follow existing query patterns in API routes
+   - Use Prisma's type-safe query builders
+
+3. **State Management**:
+   - Server components by default in App Router
+   - Client components marked with 'use client'
+   - Use URL state for filters and pagination
+
+4. **Error Handling**:
+   - API routes return appropriate HTTP status codes
+   - Client-side error boundaries for graceful degradation
+   - Consistent error response format
+
+5. **Testing**:
+   - Playwright for E2E tests
+   - Tests located in `web/tests/`
+   - Follow existing test patterns for new features
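The "URL state for filters and pagination" convention under State Management can be illustrated with the standard `URLSearchParams` API. This is a minimal sketch only: the `ListQuery` type and the `status` and `page` parameter names are hypothetical and are not taken from the repository.

```typescript
// Hypothetical query shape; the parameter names are illustrative assumptions.
interface ListQuery {
  status?: string;
  page: number;
}

// Read filter and pagination state from a URL search string.
function parseListQuery(search: string): ListQuery {
  const params = new URLSearchParams(search);
  const page = Number.parseInt(params.get("page") ?? "1", 10);
  const query: ListQuery = {
    // Fall back to page 1 for missing, non-numeric, or non-positive values.
    page: Number.isInteger(page) && page > 0 ? page : 1,
  };
  const status = params.get("status");
  if (status !== null) {
    query.status = status;
  }
  return query;
}

// Serialize the state back into a search string for links or router pushes.
function toSearchString(query: ListQuery): string {
  const params = new URLSearchParams();
  if (query.status) {
    params.set("status", query.status);
  }
  params.set("page", String(query.page));
  return params.toString();
}

console.log(parseListQuery("?status=Active&page=3"));
console.log(toSearchString({ status: "Active", page: 2 }));
```

Keeping this state in the URL means a server component can read it from the request and a filtered view can be bookmarked or shared.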
+
+## Important Context
+
+- The project implements the IP2M METRR evaluation model for project maturity assessment
+- Three main user roles: System Admin, Collection Manager, Assessment Manager
+- Collections group related assessments together
+- Recent work includes facilitator-controlled stopwatch functionality for timed assessments
+- The application supports both online (web) and offline (Tauri desktop) usage
diff --git a/web/package.json b/web/package.json
@@ -31,7 +31,8 @@
     "test:ui:dev": "cross-env PLAYWRIGHT_SERVER_COMMAND=\"yarn dev\" playwright test --ui",
     "test:ci": "playwright test --reporter=list",
     "test:nobuild:ui": "playwright test --ui",
-    "test:nobuild": "playwright test"
+    "test:nobuild": "playwright test",
+    "import:ip2m": "npx tsx scripts/import-ip2m-data-v2.ts"
   },
   "dependencies": {
     "@clerk/nextjs": "^6.12.6",
@@ -101,6 +102,7 @@
     "@types/node": "20.14.15",
     "@types/react": "19.0.12",
     "@types/react-dom": "19.0.4",
+    "@types/sanitize-html": "^2.16.0",
     "@typescript-eslint/eslint-plugin": "^8.27.0",
     "@typescript-eslint/parser": "^8.27.0",
     "autoprefixer": "^10.4.21",
@@ -117,7 +119,9 @@
     "prettier-plugin-organize-imports": "^4.1.0",
     "prisma-dbml-generator": "0.12.0",
     "prompt-confirm": "2.0.4",
-    "tailwindcss": "3.4.17"
+    "sanitize-html": "^2.17.0",
+    "tailwindcss": "3.4.17",
+    "xlsx": "^0.18.5"
   },
   "pkg": {
     "scripts": ".next/standalone/**/*.*",

diff --git a/web/scripts/README.md b/web/scripts/README.md
@@ -0,0 +1,224 @@
+# IP2M METRR Data Import Utility
+
+This utility imports legacy assessment data from the IP2M METRR system into EMPACT. It was created to migrate historical assessment data while preserving all responses and maintaining data integrity.
+
+## Usage
+
+```bash
+# Dry run (preview what will be imported)
+yarn import:ip2m --file="path/to/excel-file.xlsx" --dryRun
+
+# Actual import
+yarn import:ip2m --file="path/to/excel-file.xlsx"
+```
+
+## Excel File Format Specification
+
+The Excel file must contain exactly two sheets with specific column structures:
+
+### Sheet 1: `assessment`
+
+This sheet contains assessment metadata. Expected columns:
+
+| Column Name | Type | Description | Example |
+|------------|------|-------------|---------|
+| `id` | number | Legacy assessment ID (not used) | 130 |
+| `project_id` | string | Project identifier | "LCCF" |
+| `name` | string | Assessment name | "LCCF EVMS Review" |
+| `type_id` | number | Legacy type ID (not used) | 16 |
+| `type` | string | Legacy type name (not used) | "NSF Project" |
+| `location` | string | Assessment location | "Austin, Texas" |
+| `date` | number | JavaScript timestamp (milliseconds since epoch) | 1729712460000 |
+| `manager` | string | Manager name (not used) | "" |
+| `description` | string | Assessment description (can contain HTML) | "" |
+| `hide_description` | boolean | (not used) | true |
+| `status` | string | Assessment status: "STARTED", "COMPLETED", etc. | "STARTED" |
+| `has_maturity` | boolean | (not used) | false |
+| `has_environment` | boolean | (not used) | true |
+| `maturity_score` | number | Maturity score (if available) | 842 |
+| `current_progress` | string | (not used) | "IN_PROGRESS" |
+| `percent_completed` | number | (not used) | 0 |
+| `maturity_progress` | string | (not used) | "COMPLETED" |
+| `env_progress` | string | (not used) | "COMPLETED" |
+| `is_env_anonymous` | boolean | (not used) | true |
+| `lock_env` | boolean | (not used) | false |
+| `internal_assessment_status` | string | (not used) | "ACTIVE" |
+| `factor_facilitator_answers` | boolean | (not used) | false |
+| `environment_score` | number | Environment score (if available) | 926 |
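Because the `date` column holds milliseconds since the Unix epoch, it can be passed directly to the `Date` constructor. The sketch below assumes a hypothetical `AssessmentRow` type covering a subset of the columns above; the type and helper names are illustrative and are not taken from the import script.

```typescript
// Hypothetical row shape covering a subset of the `assessment` sheet columns.
interface AssessmentRow {
  project_id: string;
  name: string;
  location: string;
  date: number; // milliseconds since epoch
  description: string;
  status: string;
}

// Convert the Excel timestamp into a Date, rejecting values that do not parse.
function parseAssessmentDate(row: AssessmentRow): Date {
  const parsed = new Date(row.date);
  if (Number.isNaN(parsed.getTime())) {
    throw new Error(`Invalid date for assessment "${row.name}": ${row.date}`);
  }
  return parsed;
}

// Values taken from the Example column of the table above.
const exampleRow: AssessmentRow = {
  project_id: "LCCF",
  name: "LCCF EVMS Review",
  location: "Austin, Texas",
  date: 1729712460000,
  description: "",
  status: "STARTED",
};

console.log(parseAssessmentDate(exampleRow).toISOString()); // 2024-10-23T19:41:00.000Z
```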
+
+### Sheet 2: `assessment_user_responses`
+
+This sheet contains all user responses. Expected columns:
+
+| Column Name | Type | Description | Example |
+|------------|------|-------------|---------|
+| `id` | number | Legacy response ID | 18030 |
+| `user_id` | number | Legacy user ID | 381 |
+| `attribute_id` | string | EMPACT attribute ID | "1a" |
+| `level_id` | number | Legacy level ID (not used directly) | 340 |
+| `notes` | string | Response notes (can contain HTML) | "<p>project saying...</p>" |
+| `section_id` | string | Section ID for validation | "1" |
+| `is_facilitator_response` | boolean | (not used) | false |
+| `attributes.name` | string | Attribute name (for reference) | "The contractor organization..." |
+| `attributes.description` | string | Attribute description (for reference) | "<p>The contractor's...</p>" |
+| `levels.short_description` | string | Level short description | "Meets Most" |
+| `levels.long_description` | string | Level long description | "Rating a factor..." |
+| `levels.level_number` | number | Level number (1-5) - CRITICAL for mapping | 4 |
+| `levels.weight` | number | Level weight/score | 58 |
+| `sections.name` | string | Section name | "Culture" |
+| `sections.description` | string | Section description | "The culture category..." |
+
+## Import Process
+
+The import utility performs the following steps:
+
+### 1. Validation Phase
+- Verifies IP2M METRR assessment type exists in EMPACT
+- Validates all attribute IDs from responses exist in EMPACT
+- Maps legacy level numbers to EMPACT level IDs
+- Checks for existing data to prevent duplicates
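The attribute check in the validation phase amounts to a set difference between the attribute IDs referenced in the responses sheet and the IDs known to EMPACT. A minimal sketch, assuming a hypothetical `ResponseRow` shape and a pre-loaded set of known IDs (the real script would read these from the database):

```typescript
// Hypothetical minimal row shape; only attribute_id matters for this check.
interface ResponseRow {
  attribute_id: string;
}

// Return the attribute IDs referenced by responses that are not in knownIds.
function findMissingAttributeIds(
  rows: ResponseRow[],
  knownIds: Set<string>,
): string[] {
  const referenced = Array.from(new Set(rows.map((row) => String(row.attribute_id))));
  return referenced.filter((id) => !knownIds.has(id)).sort();
}

// Illustrative data: "9z" is a made-up ID that EMPACT does not contain.
const knownAttributeIds = new Set(["1a", "1b", "2a"]);
const missingIds = findMissingAttributeIds(
  [{ attribute_id: "1a" }, { attribute_id: "9z" }, { attribute_id: "9z" }],
  knownAttributeIds,
);

if (missingIds.length > 0) {
  // Validation failures are reported before any data is written.
  console.error(`Missing attributes in EMPACT database: ${missingIds.join(", ")}`);
}
```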
+
+### 2. Import Phase (in transaction)
+For each assessment in the Excel file:
+
+1. **Create Assessment Collection**
+   - Name: "Import - [Assessment Name]"
+   - Type: IP2M METRR
+
+2. **Create Assessment**
+   - Project ID, name, location from Excel
+   - Status mapping:
+     - "STARTED" → "Active"
+     - "COMPLETED" → "Final"
+     - "NOT_STARTED" → "Inactive"
+   - Completed date set if status is "COMPLETED"
+   - HTML cleaned from description
+
+3. **Create Assessment Parts**
+   - Creates parts for Environment and Maturity
+   - Status: "Active"
+   - Date from Excel timestamp
+
+4. **Add Assessment Attributes**
+   - Links all 83 IP2M METRR attributes to assessment
+
+5. **Create User Group**
+   - Name: "Imported Participants"
+   - Status: "Active"
+
+6. **Create/Find Users**
+   - Email format: `imported_ip2m_user_{legacy_id}@doe.gov`
+   - Name: "IP2M User {legacy_id}"
+   - Checks for existing users to avoid duplicates
+
+7. **Create Assessment Users**
+   - Role: "Participant"
+   - Linked to user group
+   - Connected to all assessment parts
+   - No permissions granted (placeholder accounts, not real users)
+
+8. **Import Responses**
+   - Maps attribute IDs directly
+   - Maps level numbers to EMPACT level IDs
+   - Cleans HTML from notes
+   - Creates AssessmentUserResponse records
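Step 6 derives one placeholder account per legacy user ID. The sketch below shows that derivation using the email and name formats listed above; the `PlaceholderUser` type and function names are hypothetical, and the actual script also checks the database for existing users, which is omitted here.

```typescript
// Hypothetical shape for a placeholder account created during import.
interface PlaceholderUser {
  email: string;
  name: string;
}

// Build the placeholder identity for one legacy user ID.
function placeholderUserFor(legacyId: number): PlaceholderUser {
  return {
    email: `imported_ip2m_user_${legacyId}@doe.gov`,
    name: `IP2M User ${legacyId}`,
  };
}

// De-duplicate legacy IDs so each placeholder user is produced once.
function placeholderUsers(legacyUserIds: number[]): PlaceholderUser[] {
  const uniqueIds = Array.from(new Set(legacyUserIds));
  return uniqueIds.map(placeholderUserFor);
}

// 381 appears twice, as it would when one user has several responses.
console.log(placeholderUsers([381, 381, 402]));
```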
+
+### 3. Post-Import Notes
+- Score summaries are not imported because the `ScoreSummary` table structure differs from the Prisma schema
+- All operations occur in a transaction (all-or-nothing)
+- 60-second timeout for large imports
+
+## Data Mapping Details
+
+### Status Mapping
+```
+Excel Status    →  EMPACT Status
+STARTED         →  Active
+COMPLETED       →  Final  
+NOT_STARTED     →  Inactive
+ACTIVE          →  Active
+INACTIVE        →  Inactive
+FINAL           →  Final
+ARCHIVED        →  Archived
+(default)       →  Active
+```
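The table above can be expressed as a lookup with a fallback. This is a sketch of the mapping as documented here, not a copy of the import script; the `EmpactStatus` type and `mapStatus` name are assumptions.

```typescript
// Status values as listed in the right-hand column of the table.
type EmpactStatus = "Active" | "Inactive" | "Final" | "Archived";

// A Map avoids accidental hits on Object.prototype keys such as "toString".
const STATUS_MAP = new Map<string, EmpactStatus>([
  ["STARTED", "Active"],
  ["COMPLETED", "Final"],
  ["NOT_STARTED", "Inactive"],
  ["ACTIVE", "Active"],
  ["INACTIVE", "Inactive"],
  ["FINAL", "Final"],
  ["ARCHIVED", "Archived"],
]);

// Unknown statuses fall back to "Active", matching the (default) row.
function mapStatus(excelStatus: string): EmpactStatus {
  return STATUS_MAP.get(excelStatus) ?? "Active";
}

console.log(mapStatus("COMPLETED")); // Final
console.log(mapStatus("SOMETHING_ELSE")); // Active
```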
+
+### Level Mapping
+Levels are mapped using the combination of attribute ID and level number:
+- Key: `{attribute_id}-{level_number}` (e.g., "1a-4")
+- Maps to EMPACT level ID
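A composite string key makes the lookup a single `Map` access per response. The sketch below assumes a hypothetical `EmpactLevel` shape and a made-up level ID; a missing key corresponds to the "Cannot find level mapping" case, where the response is skipped.

```typescript
// Hypothetical shape of a level record loaded from EMPACT.
interface EmpactLevel {
  id: number;
  attributeId: string;
  levelNumber: number;
}

// Build the composite key, e.g. "1a-4".
function levelKey(attributeId: string, levelNumber: number): string {
  return `${attributeId}-${levelNumber}`;
}

// Index EMPACT levels by composite key for constant-time lookup.
function buildLevelMap(levels: EmpactLevel[]): Map<string, number> {
  const map = new Map<string, number>();
  for (const level of levels) {
    map.set(levelKey(level.attributeId, level.levelNumber), level.id);
  }
  return map;
}

// The ID 901 is invented for illustration.
const levelMap = buildLevelMap([{ id: 901, attributeId: "1a", levelNumber: 4 }]);

const found = levelMap.get(levelKey("1a", 4)); // 901
const notFound = levelMap.get(levelKey("1a", 5)); // undefined
if (notFound === undefined) {
  console.error(`Cannot find level mapping for ${levelKey("1a", 5)}; skipping response`);
}
console.log(found);
```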
+
+### HTML Cleaning
+Basic HTML tags are stripped and common HTML entities are converted to plain characters in:
+- Assessment descriptions
+- Response notes
+
+Conversions:
+- `<p>`, `</p>`, etc. → removed
+- `&nbsp;` → space
+- `&rsquo;` → '
+- `&ldquo;` → "
+- `&rdquo;` → "
+- `&amp;` → &
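The conversions above can be approximated with a tag-stripping regular expression followed by a fixed list of entity replacements. This is a simplified stand-in, not the script's implementation: it is not a general HTML sanitizer, and the script may instead rely on the `sanitize-html` dependency added in this PR. The `cleanHtml` name is an assumption.

```typescript
// Entity replacements from the list above, applied in order.
const ENTITY_REPLACEMENTS: Array<[RegExp, string]> = [
  [/&nbsp;/g, " "],
  [/&rsquo;/g, "'"],
  [/&ldquo;/g, '"'],
  [/&rdquo;/g, '"'],
  // Decoded last so an escaped entity such as "&amp;nbsp;" is not decoded twice.
  [/&amp;/g, "&"],
];

// Strip tags, then convert the known entities to plain characters.
function cleanHtml(input: string): string {
  let text = input.replace(/<[^>]*>/g, "");
  for (const [pattern, replacement] of ENTITY_REPLACEMENTS) {
    text = text.replace(pattern, replacement);
  }
  return text;
}

console.log(cleanHtml("<p>project&rsquo;s R&amp;D&nbsp;plan</p>")); // project's R&D plan
```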
+
+## Prerequisites
+
+1. **IP2M METRR Assessment Type** must exist in EMPACT database
+2. **All Attributes** referenced in responses must exist in EMPACT
+3. **All Levels** must exist for each attribute (levels 0-5)
+4. **Database Access** with write permissions
+
+## Error Handling
+
+- **Validation Errors**: The import stops before writing anything if the assessment type or any referenced attribute ID is missing from EMPACT
+- **Duplicate Users**: Reuses existing imported users
+- **Duplicate Assessment Users**: Logs warning but continues
+- **Missing Level Mappings**: Logs error and skips that response
+- **Transaction Rollback**: All changes are reverted if an unhandled error occurs during the import phase
+
+## Output
+
+The import provides detailed progress information:
+```
+📊 Starting IP2M METRR Data Import
+✅ IP2M METRR Assessment type found (ID: 1)
+✅ All 83 attributes validated
+📁 Processing assessment: LCCF EVMS Review
+  ✅ Created assessment (ID: 15)
+  ✅ Created 2 assessment parts
+  ✅ Added 83 attributes to assessment
+  ✅ Created user group for imported participants
+  ✅ Created user: imported_ip2m_user_381@doe.gov
+  ✅ Processed 16 users
+  ✅ Imported 466 responses
+🎉 Import completed successfully!
+```
+
+## Troubleshooting
+
+### "IP2M METRR assessment type not found"
+The IP2M METRR assessment type doesn't exist in the database. This is required base data.
+
+### "Missing attributes in EMPACT database"
+Some attributes in the Excel file don't exist in EMPACT. Check that all IP2M METRR attributes are loaded.
+
+### "Cannot find level mapping"
+The level number for an attribute doesn't exist. Verify levels 0-5 exist for all attributes.
+
+### "The column 'type' does not exist"
+Database schema mismatch. The `ScoreSummary` table structure differs from the Prisma schema.
+
+## Source Code Location
+
+The import utility is located at:
+```
+web/scripts/import-ip2m-data-v2.ts
+```
+
+## Future Improvements
+
+1. **Score Calculation**: Calculate scores from responses rather than using Excel values
+2. **User Mapping**: Option to map legacy users to real EMPACT users
+3. **Multiple Assessments**: Support multiple assessments in one Excel file
+4. **Validation Report**: Generate pre-import validation report
+5. **Export Utility**: Create matching export functionality