A Pinokio app that provides a local OpenAI-compatible HTTP endpoint backed by Puter AI. Access 500+ AI models through Puter's free API using the standard OpenAI Chat Completions format.
A translation layer between applications expecting an OpenAI-compatible API and Puter AI's backend. Instead of running local models or paying for OpenAI API keys, use Puter's free AI service through a localhost endpoint.
- OpenAI-Compatible Endpoint: POST to `/v1/chat/completions` just like OpenAI
- 500+ Models Available: Access GPT-5, Claude, Gemini, and more through Puter
- Model Aliasing: Spoof model names so apps expecting "gpt-4o" work seamlessly
- Searchable Dropdowns: Quick search-as-you-type for both Puter and spoofed models
- Preset Configurations: Save and load your favorite model combinations
- Auto-Start Workflow: Pinokio automatically installs dependencies and starts the server
- Hot Configuration: Changes take effect immediately without server restart
- Health Monitoring: Built-in connectivity and status checking
- Open Pinokio
- Navigate to the "Discover" tab
- Search for "Puter Local Model Emulator" or paste the repository URL
- Click "Install"
The app will automatically:
- Install Node.js dependencies
- Start the server
- Open the configuration UI
```shell
git clone https://github.com/amondeuz/model-emulator.git
cd puter-local-model-emulator
npm install
npm start
```
The server starts on `http://localhost:11434` by default.
The configuration UI opens automatically when the app starts, or access it at:
http://localhost:11434/config.html
Features:
- Puter Model: Search/select from 500+ available models (test models filtered out)
- Spoofed Model ID: Set the model name your app expects (e.g., "gpt-4o")
- Presets: Save configurations for quick switching between setups
- Status Indicators: See Puter connectivity and emulator state at a glance
Workflow:
- Select a Puter model from the searchable dropdown
- (Optional) Enter a spoofed OpenAI model ID
- Click "Start" to activate the emulator
- Use the endpoint in your applications
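Once the emulator is active, the endpoint can be exercised with a short script. A minimal sketch using only the standard library, assuming the default port 11434 and a spoofed model ID of "gpt-4o"; the helper names are our own, not part of the app:

```python
# Smoke test for the emulator's OpenAI-compatible endpoint.
# Assumes the default port 11434 and an active emulator.
import json
import urllib.request

BASE_URL = "http://localhost:11434/v1"

def build_chat_request(model, prompt):
    """Build an OpenAI Chat Completions request payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(model, prompt, base_url=BASE_URL):
    """POST the payload and return the assistant's reply text."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_request(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(chat("gpt-4o", "Say hi in one word."))
```

If the call succeeds, the alias is working: the request named "gpt-4o", but the reply came from whichever Puter model you configured.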
Use Pinokio's "stop start.json" button on the app's home page. The server runs as a daemon and keeps running even if you navigate away from the Emulator tab; this is intentional, so other apps can continue using the endpoint.
Common Puter models include:
GPT Models:
- `gpt-5-nano` - Fastest, optimized for low latency
- `gpt-5-mini` - Balanced for general tasks
- `gpt-5` - Full GPT-5 with advanced reasoning
- `gpt-5.1` - Latest version
- `gpt-4o` - GPT-4 optimized
Other Providers via Puter:
- Claude, Gemini, Llama, Mistral, and more
- See full list in the UI's searchable dropdown
Point any OpenAI-compatible application to:
http://localhost:11434/v1/chat/completions
Example: curl
```shell
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
Example: Python
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="not-needed"  # Puter handles auth
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # Maps to your configured Puter model
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```
Example: Another Pinokio App
Configure the app with:
- API Base URL: `http://localhost:11434/v1`
- API Key: (any value or leave blank)
- Model: Your configured spoofed model ID
```shell
curl http://localhost:11434/health
```
Returns Puter connectivity status and server health.
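The same check from Python, e.g. for a script that waits for the emulator before using it. A sketch: we only assume `/health` returns JSON, not any particular fields, and `check_health` is our own helper:

```python
# Probe the emulator's /health endpoint; returns the parsed JSON
# body, or None if the server is unreachable.
import json
import urllib.request

def check_health(url="http://localhost:11434/health"):
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return json.load(resp)
    except OSError:  # URLError, timeouts, and refused connections
        return None

if __name__ == "__main__":
    status = check_health()
    print("server up:", status is not None)
```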
```
/puter-local-model-emulator
├── server/
│   ├── index.js            # Express server
│   ├── config.js           # Configuration with hot-reload
│   ├── logger.js           # Logging and diagnostics
│   ├── puter-client.js     # Puter.js integration
│   └── openai-adapter.js   # OpenAI format translation
├── config/
│   ├── default.json        # User configuration
│   ├── models-cache.json   # Cached model list
│   └── saved-configs.json  # Saved presets
├── public/
│   └── config.html         # Configuration UI
├── pinokio.js              # Pinokio app definition (v4.0)
├── install.json            # Dependency installation
├── start.json              # Server startup (daemon)
└── package.json
```
`POST /v1/chat/completions` - OpenAI-compatible chat completions.
Request:
```json
{
  "model": "gpt-4o-mini",
  "messages": [{"role": "user", "content": "Hello"}],
  "temperature": 0.7,
  "max_tokens": 1000
}
```
Response:
```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "gpt-4o-mini",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "Hi!"},
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 5,
    "total_tokens": 15
  }
}
```
`GET /health` - Server health and Puter connectivity check.
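A client can sanity-check responses against the shape documented above. A sketch; the field names come from the example response, and `validate_completion` is our own helper:

```python
# Sketch: verify that a parsed response matches the documented
# chat.completion shape, including the usage-token invariant.
def validate_completion(resp: dict) -> bool:
    try:
        return (
            resp["object"] == "chat.completion"
            and isinstance(resp["choices"], list)
            and "content" in resp["choices"][0]["message"]
            and resp["usage"]["total_tokens"]
                == resp["usage"]["prompt_tokens"] + resp["usage"]["completion_tokens"]
        )
    except (KeyError, IndexError, TypeError):
        return False

sample = {
    "object": "chat.completion",
    "choices": [{"index": 0,
                 "message": {"role": "assistant", "content": "Hi!"},
                 "finish_reason": "stop"}],
    "usage": {"prompt_tokens": 10, "completion_tokens": 5, "total_tokens": 15},
}
print(validate_completion(sample))  # True
```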
Current configuration, presets, models, and emulator state.
Activate the emulator with specified models.
Deactivate the emulator.
Save a configuration preset.
- Text-Only: Chat completions only - no images, audio, or file uploads
- No Streaming: Responses returned complete, not streamed
- Estimated Tokens: Token counts are approximate (4 chars ≈ 1 token)
- No Function Calling: OpenAI tool/function calling not supported
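The 4-characters-per-token heuristic can be mirrored client-side if you need to budget prompts. A sketch; the server's exact rounding may differ:

```python
# Rough token estimate matching the documented heuristic:
# roughly 4 characters per token, rounded up, minimum 1.
def estimate_tokens(text: str) -> int:
    return max(1, -(-len(text) // 4))  # ceiling division

print(estimate_tokens("Hello!"))  # 6 chars -> 2 tokens
print(estimate_tokens("a" * 40))  # 40 chars -> 10 tokens
```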
Server won't start
- Check if port 11434 is in use
- Change port in `config/default.json`
- Verify Node.js 16+ is installed
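A quick way to check whether something is already listening on the port before starting the server (note that 11434 is also Ollama's default port, a common source of conflicts; the helper is our own):

```python
# Check whether a TCP port is already bound on localhost.
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1)
        return s.connect_ex((host, port)) == 0

if __name__ == "__main__":
    print("port 11434 in use:", port_in_use(11434))
```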
Models not loading
- Check internet connection (Puter requires network)
- Verify `PUTER_AUTH_TOKEN` if using authenticated access
- Click "Refresh Models" in the UI
Puter appears offline
- Test connectivity: `curl http://localhost:11434/health`
- Check Puter service status at puter.com
- Try different Puter models
Configuration UI won't open
- Ensure the server is running (check the Pinokio app home)
- Access directly: `http://localhost:11434/config.html`
- Check browser console for errors
Running Tests:
```shell
npm test
```
Adding Backends:
Edit `server/puter-client.js` to integrate alternative AI providers.
Adding Endpoints:
Add routes in `server/index.js` for features like:
- `/v1/embeddings` - Text embeddings
- `/v1/models` - List available models
- `/v1/images/generations` - Image generation
MIT
Feel free to fork and extend for your needs.