OfflineLLM v1.0.0 — Initial Release
A fully offline, private AI chat app for Android. All LLM inference runs entirely on-device via llama.cpp. No internet permissions, no cloud, no tracking: data never leaves the user's device.
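The no-network claim is enforceable at the OS level: an Android app that never declares the INTERNET permission is denied all socket access. A sketch of what that looks like in a manifest (illustrative only, with a made-up package name — not the app's actual AndroidManifest.xml):

```xml
<!-- Illustrative sketch: an app with zero network access simply never
     declares android.permission.INTERNET. The package name is hypothetical. -->
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.offlinellm">
    <!-- Note what is absent: no INTERNET, no ACCESS_NETWORK_STATE.
         Without INTERNET, the OS rejects every socket the app opens. -->
    <application android:label="OfflineLLM">
        <!-- activities, services, etc. -->
    </application>
</manifest>
```

You can check any APK yourself with aapt dump permissions OfflineLLM.apk, which should list no network-related permissions for this app.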
Features:
- On-device inference with optimized ARM NEON/SVE/i8mm native libraries
- Streaming token-by-token response display
- Import any GGUF model at runtime via file picker
- Multiple conversations with auto-titling and rename
- Chat search and individual message deletion
- Theme selector (System/Light/Dark/AMOLED Black)
- Accent colour picker with 9 colour options
- Configurable system prompts (General, Coder, Creative Writer, Tutor, Custom)
- Temperature, max tokens, and context size controls
- Optional thinking tag stripping for reasoning models
- Encrypted settings via Jetpack Security
- Optional biometric lock
- Chat export/import as JSON
- Built-in help guide for downloading models from HuggingFace
- Zero network permissions — verified in manifest
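On the thinking-tag feature: reasoning models typically wrap their chain of thought in tags like <think>...</think> before the final answer, and stripping them is a single regex pass. A minimal sketch of the idea (the tag name and helper function are illustrative assumptions, not the app's actual code):

```python
import re

# Hypothetical helper: remove <think>...</think> blocks that reasoning
# models emit before their final answer. DOTALL lets the block span
# multiple lines; the trailing \s* also eats whitespace after the tag.
THINK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)

def strip_thinking(text: str) -> str:
    """Return the model response with any thinking blocks removed."""
    return THINK_RE.sub("", text)
```

Responses without tags pass through unchanged, so the option is safe to leave on for non-reasoning models too.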
Recommended models:
- Gemma 3 270M (Q4_K_M) — Fast; works on 4GB RAM devices; included in this APK by default
- Qwen3.5 0.8B (Q4_K_M) — Good balance for 4-6GB RAM
- Gemma 3 1B (Q4_K_M) — Recommended for 6-8GB RAM
- Qwen3.5 4B (Q4_K_M) — Best quality for 8GB+ RAM
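The RAM guidance above can be sanity-checked with rough arithmetic: a Q4_K_M GGUF averages roughly 4.5 bits per weight (an approximation, not an exact property of the format), and the runtime needs additional headroom for the KV cache and the OS. A back-of-envelope estimator:

```python
def approx_model_file_gb(params_billions: float,
                         bits_per_weight: float = 4.5) -> float:
    """Rough on-disk size in GB of a quantized GGUF model.

    bits_per_weight=4.5 is an approximate average for Q4_K_M;
    actual files vary by architecture and tensor mix.
    """
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# A 1B-parameter model at ~4.5 bits/weight is on the order of 0.56 GB,
# which is why it fits comfortably on 6-8GB devices with room for context.
```

This is a file-size estimate only; peak RAM use is higher once the context buffer is allocated, which is why the tiers above leave a few GB of slack.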
Install: Enable Unknown Sources (or "Install unknown apps" on newer Android), then install the APK via a file manager or adb install.
<3 JEGLY