Skip to content

fauzisho/kmp-llama

Repository files navigation

🎥 KMP-Llama: SmolVLM Camera App

This repository is a simple demo for how to use llama.cpp server and mobile application with SmolVLM 500M to get real-time object detection

KMP-Llama Demo KMP-Llama Demo 2

How to setup on Laptop <> Android

  1. Install llama.cpp
  2. Run llama-server -hf ggml-org/SmolVLM-500M-Instruct-GGUF --host 0.0.0.0 --port 8080
    Note: you may need to add -ngl 99 to enable GPU (if you are using NVidia/AMD/Intel GPU)
    Note (2): You can also try other models here
  3. Run ifconfig | grep "inet"to get the LAN (Wi-Fi) address Example: inet 127.0.0.1 netmask 0xff000000 inet 192.168.0.244 netmask 0xffffff00 broadcast 192.168.0.255
  4. Run KMP App project (eg. Android)
  5. Optionally change the instruction (for example, make it returns JSON)
  6. Click on "Start" and enjoy

How to setup on Local Android

  1. Install Termux from Google Play
  2. pkg update && pkg upgrade
  3. pkg install cmake clang make git wget
  4. git clone https://github.com/ggerganov/llama.cpp.git cd llama.cpp
  5. mkdir build cd build cmake .. cmake - -build . - -config Release
  6. ./bin/llama-server -hf ggml-org/SmolVLM-500M-Instruct-GGUF

Clone and Run

git clone <repository-url>
cd kmp-llama

# Android
./gradlew :composeApp:installDebug

# Desktop
./gradlew :composeApp:run

# iOS
open iosApp/iosApp.xcodeproj

🔌 API Integration

SmolVLM Configuration

{
  "server_url": "http://192.168.0.244:8080",
  "endpoint": "/v1/chat/completions",
  "format": "OpenAI-compatible"
}

Request Format

VisionRequest(
  model = "smolvlm",
  messages = [
    Message(
      role = "user",
      content = [
        Content(type = "text", text = "What do you see?"),
        Content(type = "image_url", 
               imageUrl = ImageUrl("data:image/jpeg;base64,..."))
      ]
    )
  ]
)

📱 Platform Implementation Status

Platform UI Camera API Status
Android ✅ CameraX ✅ Ktor Complete
iOS 🔄 AVFoundation ✅ Ktor UI Ready
Desktop 🔄 Webcam ✅ Ktor UI Ready

🎯 Roadmap

Immediate (v1.1)

  • iOS camera implementation with AVFoundation
  • Desktop webcam integration
  • Image gallery and history
  • Offline model support

Future (v2.0)

  • Multi-model support (GPT-4V, Claude Vision)
  • Voice commands and audio responses
  • Real-time object tracking
  • AR overlay integration
  • Cloud sync and sharing

Future (v3.0)

🤝 Contributing

  1. Fork the repository
  2. Create feature branch: git checkout -b feature/amazing-feature
  3. Commit changes: git commit -m 'Add amazing feature'
  4. Push to branch: git push origin feature/amazing-feature
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Please use this bibtex if you want to cite this repository in your publications:

@misc{kmpllama,
   author = {Sholichin, Fauzi},
   title = {KMP-Llama: SmolVLM Camera App},
   year = {2025},
   publisher = {GitHub},
   journal = {GitHub repository},
   howpublished = {\url{https://github.com/fauzisho/kmp-llama}},
  }

Built with ❤️ using Kotlin Multiplatform

About

Kotlin Multi Platform (KMP) Real-time object detection app

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors