An AI LLM-powered Telegram chatbot with switchable backends; supports OpenAI, Anthropic, OpenRouter, and Groq
- Multiple AI backends: OpenAI (GPT-3.5, GPT-4, GPT-4o), Anthropic Claude, OpenRouter models, and Groq (Llama 3)
- Image input/output support for vision and image generation models
- Conversation history with configurable max rounds
- Kiosk Mode for locked-down, dedicated use cases
- Extensible Plugin System (v1.9.0+) for AI-powered customization in kiosk mode
- `BOT_KEY` - Telegram bot token
- `API_KEY` - OpenAI API key
- `ANTHROPIC_API_KEY` - Anthropic API key (for Claude models)
- `OPENROUTER_API_KEY` - OpenRouter API key
- `GROQ_API_KEY` - Groq API key (for Llama models)
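As a sketch of how these variables might be validated at startup, the following helper checks which keys are set. The variable names come from the list above; `check_env` itself is illustrative, not the bot's actual code:

```python
import os

# BOT_KEY is always required; backend keys are only needed for the
# backends you actually use.
REQUIRED = ["BOT_KEY"]
OPTIONAL = ["API_KEY", "ANTHROPIC_API_KEY", "OPENROUTER_API_KEY", "GROQ_API_KEY"]

def check_env(env=None):
    """Return (missing_required, missing_optional) variable names."""
    if env is None:
        env = os.environ
    missing_req = [v for v in REQUIRED if not env.get(v)]
    missing_opt = [v for v in OPTIONAL if not env.get(v)]
    return missing_req, missing_opt
```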
Kiosk mode provides a locked-down instance ideal for educational environments, public terminals, or dedicated single-purpose bots.
Kiosk mode is configured via kiosk.conf file. Copy kiosk.conf.example to kiosk.conf and modify as needed:
[kiosk]
# Enable kiosk mode (true/false)
enabled = true
# The model to use in kiosk mode
model = openrouter:google/gemini-2.0-flash-001
# Path to the system prompt file
prompt_file = kiosk_prompt_example.txt
# Inactivity timeout in seconds (0 = disabled)
inactivity_timeout = 3600

When kiosk mode is enabled:
- Model selection is locked (users cannot change the model)
- System prompt is loaded from file and cannot be modified
- `/maxrounds` changes are blocked
- `/listopenroutermodels` is disabled
- Only `/start`, `/help`, `/clear`, `/status`, and `/format` commands are available
- Unrecognized commands display helpful error messages
- Multi-user chats are still supported with separate conversation histories
- Visual indicator (π) shows kiosk mode is active
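The command restriction above amounts to an allow-list check. A minimal sketch (the command set is from this README; the helper name is illustrative):

```python
# Commands permitted while kiosk mode is enabled.
KIOSK_ALLOWED = {"/start", "/help", "/clear", "/status", "/format"}

def kiosk_command_allowed(message: str) -> bool:
    """True if the message is a permitted kiosk-mode command (arguments allowed)."""
    parts = message.strip().split()
    return bool(parts) and parts[0].lower() in KIOSK_ALLOWED
```

Anything outside the set (e.g. `/maxrounds 5`) would fall through to the helpful error message described above.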
When using image-capable models (e.g., Gemini models with image generation) in kiosk mode, the bot automatically ensures responses include both images and explanatory text:
Automatic Enhancements:
- System Prompt Enhancement: The system prompt is automatically enhanced with instructions to always provide both image and text
- User Prompt Enhancement: When users request images (using keywords like "draw", "diagram", "illustrate"), their prompts are enhanced to explicitly request both components
- Response Validation: If a model returns only an image without text, a fallback description is provided
- Reasoning Field Fallback: Text explanations from the `reasoning` field (used by some models like Gemini) are automatically extracted
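The user-prompt enhancement step above can be sketched as a keyword check. The keyword list and the appended sentence both appear later in this README; `enhance_image_prompt` is an illustrative name, not the bot's actual function:

```python
# Keywords that signal an image request (as documented in this README).
IMAGE_KEYWORDS = {
    "draw", "sketch", "diagram", "illustrate", "visualize", "show me",
    "picture", "image", "graph", "chart", "plot", "create",
    "generate", "make", "design",
}
SUFFIX = " (Please provide both a visual representation AND a text explanation.)"

def enhance_image_prompt(text: str) -> str:
    """Append the dual-output request when the user seems to ask for an image."""
    lowered = text.lower()
    if any(kw in lowered for kw in IMAGE_KEYWORDS):
        return text + SUFFIX
    return text
```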
Benefits for Educational Use:
- Students receive visual aids with clear explanations
- Improves accessibility for all learning styles
- Ensures context is never lost when images are generated
- Supports the Socratic teaching method with visual + verbal guidance
Example Interaction:
Student: "Draw a diagram of the water cycle"
Bot Response:
[Generated image of water cycle]
"This diagram shows the water cycle with four main stages: evaporation
from bodies of water, condensation in clouds, precipitation as rain,
and collection back into water bodies."
- Copy the example config:
cp kiosk.conf.example kiosk.conf
- Edit `kiosk.conf` with your settings:
[kiosk]
enabled = true
model = openrouter:google/gemini-2.0-flash-001
prompt_file = kiosk_prompt_example.txt
inactivity_timeout = 3600
- Create or edit your system prompt file (e.g., `kiosk_prompt_example.txt`)
- Run the bot: `python ai-tgbot.py`

The bot supports chat logging with separate log levels for user and assistant messages. This provides fine-grained control over what gets logged, which is useful for privacy, auditing, and debugging.
Add a [logging] section to your kiosk.conf file:
[logging]
# Separate log levels for user and assistant messages
# Values: off, minimum (text only), extended (text + attachments)
log_user_messages = minimum
log_assistant_messages = extended
# Directory where chat logs are saved
log_directory = ./chat_logs

Log levels:
- `off` - No logging for this role
- `minimum` - Log text messages only
- `extended` - Log text messages and images/attachments
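A minimal sketch of how this section could be parsed with the standard-library `configparser`, including the legacy `log_chats` fallback described later in this document (`parse_logging` is an illustrative name):

```python
import configparser

VALID_LEVELS = {"off", "minimum", "extended"}

def parse_logging(config_text: str):
    """Return (user_level, assistant_level, log_dir) from a [logging] section.

    Falls back to the legacy log_chats value for both roles when the
    per-role keys are absent.
    """
    cfg = configparser.ConfigParser()
    cfg.read_string(config_text)
    sec = cfg["logging"]
    legacy = sec.get("log_chats", "off")
    user = sec.get("log_user_messages", legacy)
    assistant = sec.get("log_assistant_messages", legacy)
    for level in (user, assistant):
        if level not in VALID_LEVELS:
            raise ValueError(f"unknown log level: {level}")
    return user, assistant, sec.get("log_directory", "./chat_logs")
```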
Privacy/Auditing: Log only user messages for audit trails:
log_user_messages = minimum
log_assistant_messages = off

Debugging: Log only assistant responses to troubleshoot model behavior:
log_user_messages = off
log_assistant_messages = extended

Full Logging: Log everything for comprehensive records:
log_user_messages = extended
log_assistant_messages = extended

For backward compatibility, you can still use the legacy `log_chats` setting, which applies the same level to both user and assistant messages:
[logging]
log_chats = minimum

The plugin system enables powerful, AI-driven customization of kiosk mode through a flexible hook architecture. Plugins can transform messages, add features, and integrate external services.
The plugin system provides:
- 10 fine-grained hooks for message transformation at every stage
- AI helper utilities for calling vision models, expanding captions, etc.
- Robust error handling with timeouts and automatic plugin disabling on failures
- Rich context passed to every hook with session data, history, metadata
- Easy development with base class, comprehensive docs, and examples
- Enable plugins in `kiosk.conf`:
[PluginConfig]
enabled = true
timeout = 5.0
max_failures = 3
debug = false
- Create your plugin (`kiosk-custom.py`):
from kiosk_plugin_base import KioskPlugin

class MyPlugin(KioskPlugin):
    def pre_user_text(self, text, context):
        # Transform user input before processing
        return text.strip().lower()

    def post_assistant_text(self, text, context):
        # Modify bot responses before sending
        return text + "\n\nPowered by AI"

    # Implement all 10 hooks (can be pass-through)
    def post_user_text(self, text, context): return text
    def pre_user_images(self, images, text, context): return images
    def post_user_images(self, images, text, context): return images
    def pre_assistant_text(self, text, context): return text
    def pre_assistant_images(self, images, text, context): return images
    def post_assistant_images(self, images, text, context): return images
    def on_session_start(self, context): pass
    def on_message_complete(self, context): pass
- Restart the bot to load the plugin
All hooks receive a context dict with:
- `session_data` - Full session data (use carefully!)
- `chat_id` - Current chat/session ID
- `history` - Conversation history
- `metadata` - Plugin-specific metadata dict (for storing state)
- `ai_helper` - PluginAIHelper instance for AI calls
- `model` - Current model name
- `kiosk_mode` - Always True (plugins only work in kiosk mode)
pre_user_text(text, context) -> str
- Called: Immediately after receiving user message
- Use for: Input validation, profanity filtering, preprocessing
post_user_text(text, context) -> str
- Called: After prompt enhancement, before sending to AI
- Use for: Adding context, modifying prompts
pre_assistant_text(text, context) -> str
- Called: Immediately after receiving AI response
- Use for: Detecting patterns (LaTeX, code blocks), metadata extraction
post_assistant_text(text, context) -> str
- Called: Before sending response to user
- Use for: Formatting, adding disclaimers, replacing placeholders
pre_user_images(images: List[str], text, context) -> List[str]
- Called: After receiving images, before adding to message
- Images are base64-encoded strings
- Use for: AI-powered caption expansion, image validation
post_user_images(images, text, context) -> List[str]
- Called: After adding images to message, before sending to AI
- Use for: Image preprocessing, adding watermarks
pre_assistant_images(images, text, context) -> List[str]
- Called: Immediately after receiving AI-generated images
- Use for: Rendering LaTeX formulas as images, adding visualizations
post_assistant_images(images, text, context) -> List[str]
- Called: Before sending images to user
- Use for: Adding generated images (syntax highlighting), postprocessing
on_session_start(context) -> None
- Called: When a new session is initialized
- Use for: Setup, initialization, welcome logic, analytics
on_message_complete(context) -> None
- Called: After complete user+assistant exchange
- Use for: Logging, analytics, cleanup, state updates
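The text hooks above form a pipeline around each exchange. A toy illustration of the ordering on the user side (hook names as documented; `DemoPlugin` and `run_user_side` are illustrative, not the bot's actual runner):

```python
class DemoPlugin:
    def pre_user_text(self, text, context):
        # Fires immediately after the message arrives.
        return text.strip()

    def post_user_text(self, text, context):
        # Fires after prompt enhancement, just before the AI call.
        return text + " [context added]"

def run_user_side(plugin, text, context):
    """Apply the documented user-side hook order to one message."""
    text = plugin.pre_user_text(text, context)
    # ... prompt enhancement would happen here ...
    text = plugin.post_user_text(text, context)
    return text
```

The assistant side mirrors this: `pre_assistant_text` right after the AI responds, `post_assistant_text` just before the reply is sent.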
Plugins can register custom slash commands that execute arbitrary code. Commands work in both kiosk and regular modes.
get_commands() -> Dict[str, Dict[str, Any]]
- Returns: Dictionary mapping command names to command info
- Called: Once during plugin initialization
def get_commands(self):
    return {
        'generate-worksheets': {
            'description': 'Generate practice worksheets',
            'handler': self.handle_generate_worksheets,
            'available_in_kiosk': True
        }
    }

def handle_generate_worksheets(self, chat_id, context):
    # Send initial status message
    self.send_message(chat_id, "π Generating... Please wait.", context)
    # Use AI to generate content
    ai_helper = context['ai_helper']
    worksheet = ai_helper.quick_call(
        system="Create educational worksheets",
        user="Generate 5 math problems"
    )
    # Create HTML document
    html_content = f"<html><body>{worksheet}</body></html>"
    html_bytes = html_content.encode('utf-8')
    # Send as downloadable file
    self.send_document(
        chat_id,
        html_bytes,
        'worksheet.html',
        'Here is your worksheet!',
        context
    )

Helper methods for commands:
- `send_message(chat_id, text, context)` - Send text messages
- `send_document(chat_id, data, filename, caption, context)` - Send files/documents
Commands automatically appear in /help and are available immediately after plugin loads.
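A sketch of how such registrations could be collected into a dispatch table, honoring the `available_in_kiosk` flag from the `get_commands()` contract above (`build_command_table` and `WorksheetPlugin` are illustrative):

```python
def build_command_table(plugin, kiosk_mode: bool):
    """Map '/command' names to handlers, filtering by kiosk availability."""
    table = {}
    for name, info in plugin.get_commands().items():
        if kiosk_mode and not info.get("available_in_kiosk", False):
            continue  # command hidden while kiosk mode is on
        table["/" + name] = info["handler"]
    return table

class WorksheetPlugin:
    def get_commands(self):
        return {
            "generate-worksheets": {
                "description": "Generate practice worksheets",
                "handler": lambda chat_id, context: "generating",
                "available_in_kiosk": True,
            }
        }
```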
The ai_helper object provides utilities for plugin development:
# Call AI with text and/or images
response = context['ai_helper'].call_ai(
prompt="Describe this image",
model="gpt-4o-mini", # Optional, defaults to gpt-4o-mini
max_tokens=500,
images=["base64_img_data"] # Optional list
)
# Quick system+user message call
response = context['ai_helper'].quick_call(
system="You are a helpful assistant",
user="What is 2+2?",
model="gpt-4o-mini" # Optional
)
# Convert between PIL and base64 (requires Pillow)
pil_image = context['ai_helper'].base64_to_pil(base64_string)
base64_string = context['ai_helper'].pil_to_base64(pil_image, format='PNG')

Example: a caption-expander plugin using the AI helper:

from kiosk_plugin_base import KioskPlugin
class CaptionExpander(KioskPlugin):
    def pre_user_images(self, images, text, context):
        """Auto-expand brief captions using AI vision"""
        if images and len(text.strip()) < 20:
            ai_helper = context['ai_helper']
            description = ai_helper.call_ai(
                prompt="Describe this image briefly (1-2 sentences)",
                model="gpt-4o-mini",
                max_tokens=150,
                images=[images[0]]
            )
            if description:
                # Store for later use
                context['metadata']['ai_caption'] = description
                print(f"[Plugin] Generated caption: {description[:50]}...")
        return images

    # ... implement other hooks as pass-through ...

See `kiosk-custom.py.example` for a full-featured plugin demonstrating:
- AI Vision Caption Expansion - Auto-describe images with brief/no text
- LaTeX Rendering - Detect `$formula$` and render to images
- Syntax Highlighting - Render code blocks as highlighted images
- Profanity Filter - Basic word filtering on user input
- Analytics - Track message counts and usage in metadata
- Custom Commands - `/generate-worksheets` and `/summary` commands that:
  - Send multiple progress messages
  - Use AI to analyze conversation history
  - Generate and send HTML documents
  - Work in both kiosk and regular modes
All features gracefully degrade if dependencies (matplotlib, pygments) are missing.
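The usual pattern for this kind of graceful degradation is to probe optional dependencies at import time and silently skip the feature if they are missing. A sketch (`render_latex_formula` is illustrative; the real rendering logic is omitted):

```python
try:
    import matplotlib  # used for LaTeX rendering
    HAVE_MATPLOTLIB = True
except ImportError:
    HAVE_MATPLOTLIB = False

try:
    import pygments  # used for syntax highlighting
    HAVE_PYGMENTS = True
except ImportError:
    HAVE_PYGMENTS = False

def render_latex_formula(formula):
    """Render a formula to an image, or return None if unavailable."""
    if not HAVE_MATPLOTLIB:
        return None  # feature disabled, bot keeps working
    # ... real matplotlib rendering would go here (omitted in this sketch) ...
    return None
```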
[PluginConfig]
# Enable/disable plugin system
enabled = true
# Maximum hook execution time (seconds)
timeout = 5.0
# Max failures before auto-disable
max_failures = 3
# Debug logging for plugin execution
debug = false

Run the comprehensive test suite:
python test_kiosk_plugin.py

Tests cover:
- Plugin base class structure
- Hook invocation and data transformation
- Error handling and timeout behavior
- AI helper utilities
- Health monitoring
- Context building
- Full pipeline integration
- Timeout Protection - All hooks have 5s default timeout to prevent hanging
- Error Isolation - Plugin errors won't crash the bot; original data passes through
- Health Monitoring - Plugins auto-disable after repeated failures
- No Arbitrary Imports - Plugin file is loaded directly, not via exec()
- Context Immutability - Avoid mutating session_data directly; use metadata dict
- Graceful Degradation - Handle missing dependencies cleanly
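Timeout protection, error isolation, and auto-disable can be combined in one wrapper. A sketch under the assumptions above (5s default timeout, `max_failures` threshold, original data passed through on error); `SafeHookRunner` is illustrative, and note that a thread-based timeout abandons, rather than kills, a hung hook:

```python
from concurrent.futures import ThreadPoolExecutor

class SafeHookRunner:
    def __init__(self, plugin, timeout=5.0, max_failures=3):
        self.plugin = plugin
        self.timeout = timeout
        self.max_failures = max_failures
        self.failures = 0
        self.enabled = True
        self._pool = ThreadPoolExecutor(max_workers=1)

    def call(self, hook_name, data, context):
        """Run a hook; on error or timeout, return the original data."""
        if not self.enabled:
            return data
        future = self._pool.submit(getattr(self.plugin, hook_name), data, context)
        try:
            return future.result(timeout=self.timeout)
        except Exception:  # hook raised, or result() timed out
            self.failures += 1
            if self.failures >= self.max_failures:
                self.enabled = False  # auto-disable the plugin
            return data
```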
Using Metadata for State:
def on_session_start(self, context):
    context['metadata']['message_count'] = 0
    context['metadata']['images_processed'] = 0

def on_message_complete(self, context):
    context['metadata']['message_count'] += 1

Chaining Transformations:
def pre_assistant_text(self, text, context):
    # Detect LaTeX formulas
    context['metadata']['has_latex'] = '$$' in text
    return text

def pre_assistant_images(self, images, text, context):
    # Render formulas if detected
    if context['metadata'].get('has_latex'):
        rendered = self._render_latex_formulas(text)
        images.extend(rendered)
    return images

Conditional Processing:
def post_user_text(self, text, context):
    # Only process for certain users or sessions
    if context['chat_id'] in self.premium_users:
        return self.enhance_premium(text)
    return text

Plugin not loading:
- Check the file is named exactly `kiosk-custom.py`
- Ensure the plugin class inherits from `KioskPlugin`
- Verify all 10 methods are implemented
- Check console for error messages
Plugin disabled automatically:
- Check logs for error messages
- Verify hooks don't exceed timeout
- Ensure no uncaught exceptions
- Increase `max_failures` if needed
AI helper not working:
- Verify `OPENROUTER_API_KEY` is set
- Check network connectivity
- Review debug logs for API errors
- `/help` - Show help message
- `/clear` - Clear conversation context
- `/status` - Show current chatbot status
- `/format <modality> [aspect_ratio] [image_size]` - Control what the model generates (see below)
- `/maxrounds <n>` - Set max conversation rounds
- `/gpt3`, `/gpt4`, `/gpt4o`, `/gpt4omini` - Switch to OpenAI models
- `/claud3opus`, `/claud3haiku` - Switch to Anthropic Claude models
- `/llama38b`, `/llama370b` - Switch to Groq Llama models
- `/openrouter <model>` - Switch to an OpenRouter model
- `/listopenroutermodels` - List available OpenRouter models
- `/start` - Show welcome message
- `/help` - Show kiosk mode help
- `/clear` - Clear conversation context
- `/status` - Show current status (with kiosk mode indicator)
- `/format <modality> [aspect_ratio] [image_size]` - Control what the model generates (see below)
The /format command controls what you ask the model to generate via the OpenRouter API's modalities and image_config parameters. The user always sees everything the model returns (except duplicates).
Modalities (what to request from the model):
- `auto` (default) - Let the model decide
- `text` - Request text-only responses
- `image` - Request image-only responses
- `text+image` - Request both text and image
Aspect Ratios (optional, for Gemini models):
`1:1`, `16:9`, `9:16`, `4:3`, `3:4`
Image Sizes (optional, for Gemini models):
`SD`, `HD`, `4K`
Example Usage:
/format text+image # Request both text and image
/format image 16:9 4K # Request 4K image with 16:9 aspect ratio
/format text+image 1:1 HD # Request both, with 1:1 HD image
/format auto # Let model decide
Use /format without arguments to see current settings.
Use /status to see your current format configuration.
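The argument grammar above can be sketched as a small parser. This is illustrative (`parse_format_args` and the returned dict keys are assumptions, not the bot's actual code); the valid values are those documented above:

```python
MODALITIES = {"auto", "text", "image", "text+image"}
ASPECT_RATIOS = {"1:1", "16:9", "9:16", "4:3", "3:4"}
IMAGE_SIZES = {"SD", "HD", "4K"}

def parse_format_args(args):
    """Parse /format arguments like ['text+image', '1:1', 'HD'] into settings."""
    settings = {"modality": "auto", "aspect_ratio": None, "image_size": None}
    if not args:
        return settings  # no arguments: caller shows current settings
    if args[0] not in MODALITIES:
        raise ValueError(f"unknown modality: {args[0]}")
    settings["modality"] = args[0]
    for extra in args[1:]:
        if extra in ASPECT_RATIOS:
            settings["aspect_ratio"] = extra
        elif extra.upper() in IMAGE_SIZES:
            settings["image_size"] = extra.upper()
        else:
            raise ValueError(f"unrecognized option: {extra}")
    return settings
```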
When using image-capable models (models that can generate images) in kiosk mode, the bot automatically ensures that responses always include both images and explanatory text:
Implementation Details:
- Model Detection: The `model_supports_image_output()` function checks if a model supports image generation by querying the OpenRouter API capabilities cache.
- System Prompt Enhancement: When initializing a session with an image-capable model in kiosk mode, the system prompt is automatically enhanced with explicit instructions:

  **IMPORTANT**: When generating images, always provide BOTH:
  1. A generated image that directly addresses the request
  2. A clear text explanation (1-3 sentences) describing what the image shows
  Never generate only an image without an accompanying text explanation.

- User Prompt Enhancement: When users request images (detected via keywords like "draw", "diagram", "illustrate"), their prompts are automatically enhanced with:

  (Please provide both a visual representation AND a text explanation.)

- Response Parsing:
  - Primary: Extract text from the `content` field
  - If there is no text in `content` but images exist, only the images are shown
  - In kiosk mode with image-capable models, if images exist but no text, a placeholder is provided: "(Image generated without text description)"
  - Note: The `reasoning` field (which contains internal model thinking) is intentionally NOT used
- Image Request Keywords: The following keywords trigger user prompt enhancement: `draw`, `sketch`, `diagram`, `illustrate`, `visualize`, `show me`, `picture`, `image`, `graph`, `chart`, `plot`, `create`, `generate`, `make`, `design`
Testing: Run `python test_image_text_kiosk.py` to validate the implementation.
When using image-capable models like Google Gemini via OpenRouter for image generation, the API returns responses in the following format:
- Text responses: Text content is in the `content` field
- Image responses: Images are returned in a separate array, with the text description in the `content` field
- Internal reasoning: Some models include a `reasoning` field with internal thinking; this is NOT shown to users
The bot handles responses by:
- Extracting text from the `content` field (primary source)
- Extracting images from the dedicated images array
- In kiosk mode, providing a placeholder if images exist without a text description
- Never displaying the `reasoning` field (internal model thinking, not user-facing content)
This ensures users see the actual response content without being confused by internal model reasoning.
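The handling rules above can be sketched as a single extraction function. This is a simplified sketch: the placeholder text is quoted from this README, but the exact shape of the API message (here assumed to carry `content`, `images`, and `reasoning` keys) and the function name are assumptions:

```python
PLACEHOLDER = "(Image generated without text description)"

def extract_response(message: dict, kiosk_mode: bool, image_capable: bool):
    """Return (text, images) to show the user from an API message dict."""
    text = (message.get("content") or "").strip()
    images = message.get("images") or []
    # The `reasoning` field is internal model thinking and is intentionally ignored.
    if not text and images and kiosk_mode and image_capable:
        text = PLACEHOLDER  # kiosk mode never shows an image without text
    return text, images
```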