Pwooda is an AI-based Android application designed to help people with developmental disabilities in their daily lives. It provides personalized motivation, schedule guidance, behavior improvement tips, and more based on individual data and welfare facility programs. Through voice conversations and AI image generation, it offers rich interactions for users.
- Personalized Content Based on Individual Data: Customized services based on user interests, schedules, goals, etc.
- 11 Service Categories: Schedule, motivation, medication guidance, life skills, social skills, safety, behavior improvement, object explanation, general conversation, image generation, image saving
- Conversation History-Based Intent Analysis: Accurate intent understanding considering previous conversation context
- Continuous Conversations: Remembers the last 10 user questions and 3 AI responses for natural conversation flow
- Speech Recognition: Converts user voice input to text
- TTS (Text-to-Speech): Outputs AI responses in natural voice
- Teenage Girl Tone: Communicates with users in a friendly and cute tone
- Face Recognition-Based User Identification: Detects user faces through camera to provide personalized services
- User Registration: Asks for name during first recognition and registers
- Personal Data Mapping: Links with registered user information
- Object Recognition: Provides descriptions of objects photographed with camera
- Schedule Relevance Analysis: Guides the relationship between photographed objects and user's schedule/programs
- ComfyUI-Based Local Image Generation: Fast and secure local image generation
- Gemini-Based Korean-English Prompt Translation: Converts user's Korean requests to accurate English prompts
- Background-Free Ghibli Style: Creates clean and beautiful images with transparent backgrounds and Ghibli style
- Real-Time Image Generation: High-quality image generation within 15-20 seconds
- Gallery Storage: Saves generated images to Android's default photo album
- Automatic Folder Creation: Automatically saves to
Pictures/Pwoodafolder - Save Intent Recognition: Automatically recognizes save requests like "save the picture", "save to album"
- Language: Kotlin
- UI Framework: Jetpack Compose
- Architecture: MVVM (Model-View-ViewModel)
- Networking: OkHttp
- JSON Processing: JSONObject/JSONArray
- Image Processing: Bitmap, Base64 encoding/decoding
- Gallery Storage: MediaStore API
- Permission Management: Context-based gallery access
- Natural Language Processing: Google Gemini API
- Speech Synthesis: Google Cloud TTS
- Face Recognition: ML Kit Face Detection
- Image Generation: ComfyUI (local server)
- Python: 3.10.18
- PyTorch: 2.7.1
- ComfyUI: 0.3.45
- Hardware Acceleration: MPS (Metal Performance Shaders)
- Model: SDXL (Stable Diffusion XL)
- Image Size: 512x512 (optimized)
- Sampling Steps: 20 (for Ghibli style)
- HTTP Communication: Local ComfyUI server access
- Network Security Config: HTTP communication allowance settings
- IP Address:
192.168.219.122:8000(Mac M4 local server) - Timeout: 120 seconds (for image generation)
- WRITE_EXTERNAL_STORAGE: Gallery write for Android 9 and below
- READ_EXTERNAL_STORAGE: Gallery read for Android 13 and below
- MediaStore API: Android 10+ Scoped Storage support
| No. | Category | Description | Example |
|---|---|---|---|
| 1 | Schedule | Today's schedule, schedule, program guidance | "What's today's plan?", "Tell me the schedule" |
| 2 | Goals/Motivation | Goals, motivation, encouragement, praise | "What's your goal?", "Motivate me" |
| 3 | Medication Guidance | Medication intake, side effects, emergency situations | "When should I take medicine?", "What are the side effects?" |
| 4 | Life Skills | Daily living skills, cooking, cleaning, personal hygiene | "How do I wash hands?", "How do I brush teeth?" |
| 5 | Social Skills | Conversation, greetings, friendships, social situations | "How do I greet?", "How do I talk with friends?" |
| 6 | Safety | Safety, protection, dangerous situations, first aid | "When should I call 119?", "There's a fire" |
| 7 | Behavior Improvement | Behavior improvement, emotional expression, anxiety relief | "I'm angry", "I'm anxious" |
| 8 | Object Explanation | Requests for explanation of objects, items, photos | "What's this?", "Explain this" |
| 9 | Drawing | Drawing requests, image generation requests | "Draw a picture", "Create an image" |
| 10 | Save Drawing | Requests to save generated drawings | "Save the picture", "Save to album" |
| 11 | General Conversation | All other questions | "Hello", "I'm feeling good" |
# When launching the app, the following permissions are requested
- Camera permission
- Microphone permission
- Gallery storage permission (Android 9 and below)1. When a face is detected by camera, voice guidance
"Hello! What's your name? Please say it clearly!"
2. When name is spoken clearly, user registration completes
- If the name is not registered, guidance is repeated
3. After registration, personalized services begin
User: "Tell me today's schedule"
AI: "Nuri! Today you have morning exercise at 9 AM, art activity at 10 AM~ π"
User: "When is that?"
AI: "Remember what I said earlier? Morning exercise at 9 AM, art activity at 10 AM!"
User: "Draw a cute puppy"
AI: "I'll draw a picture for you! Please wait a moment."
β Gemini converts Korean request to English prompt
β ComfyUI generates image with Ghibli style + transparent background
β Completed image displayed after 15-20 seconds
User: "Save the picture"
AI: "I've saved the picture to your gallery!"
β Automatically saved to Pictures/Pwooda folder
β Filename: Pwooda_timestamp.jpg
1. Click camera button
2. Point at object for automatic recognition
3. Object description + guidance on relevance to today's schedule
# Navigate to ComfyUI directory
cd /Users/tony/ComfyUI
# Activate virtual environment
source venv/bin/activate
# Run server with MPS optimization
PYTORCH_ENABLE_MPS_FALLBACK=1 python main.py --listen 0.0.0.0 --port 8000 --use-split-cross-attention# Check server connection
curl http://192.168.219.122:8000
# Check logs
# - Verify MPS acceleration activation
# - Verify SDXL model loading
# - Verify successful image generation logs# Navigate to project directory
cd /Users/tony/AndroidProjects/pwooda
# Gradle build
./gradlew assembleDebug# Install APK on device
adb install -r app/build/outputs/apk/debug/app-debug.apk
# Verify installation
adb shell pm list packages | grep pwoodapwooda/
βββ app/
β βββ src/main/
β β βββ java/com/banya/pwooda/
β β β βββ data/ # Data models
β β β β βββ CustomerData.kt
β β β β βββ ProductData.kt
β β β βββ service/ # Service classes
β β β β βββ FalAIService.kt # ComfyUI image generation
β β β β βββ GoogleCloudTTSService.kt
β β β β βββ CustomerDataService.kt
β β β β βββ PaymentService.kt
β β β βββ viewmodel/
β β β β βββ GeminiViewModel.kt # Main ViewModel
β β β βββ ui/
β β β β βββ screens/
β β β β β βββ MainScreen.kt
β β β β βββ components/
β β β β β βββ CameraComponent.kt
β β β β β βββ SpeechRecognitionComponent.kt
β β β β βββ theme/
β β β βββ MainActivity.kt
β β β βββ SplashActivity.kt
β β βββ assets/
β β β βββ data.json # User data
β β β βββ google_tts_key.json
β β βββ res/
β β βββ xml/
β β β βββ network_security_config.xml
β β βββ values/
β βββ build.gradle.kts
βββ build.gradle.kts
βββ README.md
// Include recent 6 conversation messages in intent analysis
val recentHistory = chatHistory.takeLast(6)
val historyContext = if (recentHistory.isNotEmpty()) {
"Previous conversation:\n" + recentHistory.joinToString("\n") { "${it.role}: ${it.content}" }
} else {
"No previous conversation"
}// Convert Korean request to English image generation prompt
val translationPrompt = """
The user has requested to draw a picture. Please convert the following request to an English image generation prompt.
Requirements:
1. Translate the user's request to English and convert it to keywords suitable for image generation
2. Add "transparent background, no background, isolated" to generate images without background
3. Add "Studio Ghibli style, Hayao Miyazaki, anime, watercolor, soft lighting, magical atmosphere" to generate in Ghibli style
4. Add "detailed, high quality, masterpiece" to generate high-quality images
"""{
"1": {"class_type": "CheckpointLoaderSimple", "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}},
"2": {"class_type": "CLIPTextEncode", "inputs": {"text": "prompt", "clip": ["1", 1]}},
"3": {"class_type": "CLIPTextEncode", "inputs": {"text": "negative_prompt", "clip": ["1", 1]}},
"4": {"class_type": "EmptyLatentImage", "inputs": {"width": 512, "height": 512, "batch_size": 1}},
"5": {"class_type": "KSampler", "inputs": {"seed": 123, "steps": 20, "cfg": 7.0, "sampler_name": "dpmpp_2m", "scheduler": "karras", "denoise": 1.0, "model": ["1", 0], "positive": ["2", 0], "negative": ["3", 0], "latent_image": ["4", 0]}},
"6": {"class_type": "VAEDecode", "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
"7": {"class_type": "SaveImage", "inputs": {"images": ["6", 0], "filename_prefix": "ComfyUI"}}
}- Conversation History Integration: Includes recent 6 conversation messages in intent analysis
- Context-Based Intent Understanding: Accurate intent analysis considering previous conversations
- Continuous Conversations: Accurate recognition of context-dependent questions like "When is that?", "Draw it again"
- Gallery Storage: Saves generated images to Android's default photo album
- MediaStore API: Android 10+ Scoped Storage support
- Save Intent Recognition: Automatic recognition of "save the picture", "save to album"
- Error Handling: Guidance message when no image to save
- Drawing Conversation Recording: Drawing and save requests properly recorded in conversation history
- Image Clearing: Previous images automatically deleted when starting new conversation
- Continuity Assurance: All interactions recorded in conversation history to maintain context
- Korean Request Support: Directly receives and processes user's Korean requests
- Gemini Translation: Converts Korean requests to accurate English prompts
- SDXL Limitation Resolution: Solves SDXL model's Korean understanding limitation
- More Accurate Images: Generates accurate images matching user intent
- Transparent Background: Generates clean images without background
- Ghibli Style: Automatically applies Studio Ghibli style prompts
- High-Quality Optimization: Adds detailed, high quality, masterpiece keywords
- Consistent Style: All images generated in same art style
- Local Image Generation: Fast and secure image generation through ComfyUI server
- SDXL Model: Uses high-quality Stable Diffusion XL model
- MPS Acceleration: Performance optimization with Apple Silicon GPU acceleration
- 512x512 Optimization: Balance between fast generation and quality
- HTTP Communication Allowance: Network Security Config for local ComfyUI server access
- IP Address Setting: Allows access to Mac M4 local server (192.168.219.122:8000)
- AndroidManifest.xml: Security settings applied
- Timeout Extension: Extended image generation timeout to 120 seconds
- Sampling Optimization: Optimized for Ghibli style with 20 steps
- MPS Acceleration: Apple Silicon GPU utilization optimization
- Memory Management: Efficient VRAM usage
- Bad linked input Error: Intermittently occurs due to workflow JSON structure issues
- JSON Decode Error: JSON parsing error during network transmission
- Solution: Server restart or workflow retransmission
- Permission Requests: Automatic handling of gallery storage permissions on Android 10+
- Network Timeout: Timeout occurs when image generation takes too long
- Solution: Most issues resolved with 120-second timeout setting
- Please report bugs through GitHub Issues
- Include detailed reproduction methods with logs
- Suggest new feature ideas through Pull Requests
- Suggestions that improve user experience for people with developmental disabilities are welcome
- Android Studio Arctic Fox or higher
- Kotlin 1.8+
- Java 17
- ComfyUI Python 3.10+
This project is distributed under the MIT License.
- Google Gemini API: Natural language processing and prompt translation
- ComfyUI: Local image generation server
- Stability AI: SDXL model provision
- Apple: MPS acceleration support
Pwooda - AI friend for people with developmental disabilities π«β¨
- Splash Music: intro_music.mp3 plays when app launches, main screen only loads after music ends (skip button supported)
- Gemini-Based Name Recognition: When recognizing names, Gemini LLM generates natural friendly welcome messages and outputs directly to TTS/screen
- Image Generation Animation: During image generation, running.mp4 (or sprite sheet) based animation plays in screen center with transparent background
- Sprite Sheet Animation: Utilizes drawing_sheet.png (5x9, 43 frames) sprite sheet to provide smooth animation during image generation (transparent background supported)
- Completed Image Round Masking: AI-generated completed images are masked in rounded corner boxes (32dp) for clean display
- Duplicate Image Output Prevention: UI improvement to display completed images only once
- Consistent Friendly Tone: All conversations including name recognition, welcome messages, and guidance output in consistent 10-year-old girl friend tone
- When user says "Draw a cute puppy"
- Sprite sheet animation (transparent background) plays in center with "Image generating..." text displayed
- When image generation completes, completed image is displayed once in rounded corner box
- When "save the picture" is requested, it's saved to gallery
- After face recognition, "What's your name?" voice guidance
- When user says name, Gemini extracts name and outputs friendly welcome message like "Tony! Nice to meet you~ I'll call you often from now on!" to screen+TTS
- If name is not registered, guidance is repeated