Project link: https://dhunganapramod9-litgptt-app-ngzujr.streamlit.app/
An advanced, feature-rich Streamlit application for interacting with Large Language Models using LitGPT. This application provides a modern chat interface with streaming responses, conversation history, and extensive customization options.
- Interactive Chat Interface: Modern chat UI with conversation history
- Streaming Responses: Real-time token generation with live updates
- Multiple Model Support: Easy switching between different LLM models
- System Prompts: Customize model behavior with system instructions
- Conversation Management: Save, export, and clear conversations
- Temperature Control: Adjust creativity and randomness (0.0 - 2.0)
- Top-K Sampling: Limit token selection to top K candidates
- Top-P (Nucleus) Sampling: Dynamic token selection based on cumulative probability
- Max Tokens: Control response length (10 - 512 tokens)
- Streaming Toggle: Enable/disable real-time token streaming
- Token Counting: Track total tokens generated
- Performance Metrics: Monitor generation speed (tokens/second)
- Generation Statistics: Per-message and cumulative statistics
- Model Information: Display model parameters and device info
- Export Conversations: Download conversations as JSON
- Clear History: Reset conversation with one click
- Model Reloading: Hot-reload models without restarting
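The performance metrics listed above come down to simple arithmetic over a timer and a token counter. A minimal sketch (the function name is illustrative, not taken from app.py; in practice `start`/`end` would come from `time.perf_counter()`):

```python
def generation_stats(token_count: int, start: float, end: float) -> dict:
    """Compute per-message statistics: token count, elapsed time, speed."""
    elapsed = end - start
    return {
        "tokens": token_count,
        "seconds": round(elapsed, 2),
        "tokens_per_second": round(token_count / elapsed, 2) if elapsed > 0 else 0.0,
    }

stats = generation_stats(token_count=120, start=0.0, end=4.0)
# 120 tokens in 4 seconds -> 30.0 tokens/second
```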
- Python 3.8 or higher
- CUDA-capable GPU (recommended) or CPU
1. Clone or navigate to the project directory:

   ```bash
   cd litchat
   ```

2. Create a virtual environment:

   ```bash
   python -m venv venv
   ```

3. Activate the virtual environment:

   - Windows:

     ```bash
     venv\Scripts\activate
     ```

   - Linux/Mac:

     ```bash
     source venv/bin/activate
     ```

4. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

5. Download a model (choose one):

   ```bash
   litgpt download --repo_id TinyLlama/TinyLlama-1.1B-Chat-v1.0
   # or
   litgpt download --repo_id microsoft/phi-2
   # or
   litgpt download --repo_id meta-llama/Llama-2-7b-chat-hf
   ```

6. Run the application:

   ```bash
   streamlit run app.py
   ```
The application will open in your default web browser at http://localhost:8501
- Select a Model: Choose from the dropdown in the sidebar
- Adjust Parameters: Fine-tune generation settings as needed
- Type Your Message: Enter your prompt in the chat input
- View Response: Watch the AI generate text in real-time (if streaming is enabled)
Use system prompts to guide the model's behavior:
- "You are a helpful coding assistant." - For programming help
- "You are a creative writing assistant." - For creative content
- "You are a scientific researcher." - For technical explanations
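A system prompt is typically just prepended to the text sent to the model. A minimal sketch of how the full prompt might be assembled (the helper name and template are illustrative; the actual chat template used in app.py may differ):

```python
def build_prompt(system_prompt: str, history: list, user_message: str) -> str:
    """Prepend the system prompt, replay prior turns, then add the new message."""
    lines = [f"System: {system_prompt}"] if system_prompt else []
    for msg in history:
        lines.append(f"{msg['role'].capitalize()}: {msg['content']}")
    lines.append(f"User: {user_message}")
    lines.append("Assistant:")  # cue the model to respond
    return "\n".join(lines)

prompt = build_prompt(
    "You are a helpful coding assistant.", [], "How do I reverse a list?"
)
```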
- Temperature (0.0 - 2.0):
  - Lower (0.0-0.5): More deterministic, focused responses
  - Medium (0.6-0.9): Balanced creativity and coherence
  - Higher (1.0-2.0): More creative, diverse responses
- Top-K (0-100):
  - Limits sampling to the K most likely tokens
  - 0 = disabled (use all tokens)
  - Lower values = more focused, higher = more diverse
- Top-P (0.0 - 1.0):
  - Nucleus sampling threshold
  - 0.0 = only the most likely token
  - 1.0 = all tokens considered
  - Works with Top-K for fine-grained control
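The two sampling filters above can be sketched without any ML framework. A toy version over a token-to-probability table (this assumes probabilities are already normalized; real implementations operate on logit tensors):

```python
def filter_tokens(probs: dict, top_k: int = 0, top_p: float = 1.0) -> dict:
    """Keep the top-k most likely tokens, then trim to the smallest set
    whose cumulative probability reaches top_p (nucleus sampling)."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    if top_k > 0:
        ranked = ranked[:top_k]  # top_k == 0 means the filter is disabled
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}  # renormalize

probs = {"the": 0.5, "a": 0.25, "cat": 0.125, "zebra": 0.125}
filtered = filter_tokens(probs, top_k=3, top_p=0.75)
# "the" and "a" already reach the 0.75 cumulative mass, so only those survive
```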
- Click "Export Conversation" in the sidebar
- Click "Download JSON" to save the conversation
- File includes messages, model used, and timestamp
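The exported file described above might look like the following sketch (the field names are guesses based on the description, not the exact schema in app.py):

```python
import json
from datetime import datetime, timezone

def export_conversation(messages: list, model_name: str) -> str:
    """Serialize the chat to JSON: messages, model used, and a timestamp."""
    payload = {
        "model": model_name,
        "exported_at": datetime.now(timezone.utc).isoformat(),
        "messages": messages,
    }
    return json.dumps(payload, indent=2)

data = export_conversation(
    [{"role": "user", "content": "Hello"}],
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
)
```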
- Chat Messages: User and assistant messages with timestamps
- Statistics: Per-message token count, generation time, and speed
- Input Field: Type your messages at the bottom
- Model Selection: Choose and switch between models
- Generation Parameters: Fine-tune all sampling parameters
- System Prompt: Set behavioral instructions
- Model Management: Reload models, view statistics
- Data Management: Export and clear conversations
The application supports any model compatible with LitGPT, including:
- TinyLlama/TinyLlama-1.1B-Chat-v1.0 - Fast, lightweight model
- microsoft/phi-2 - High-quality small model
- meta-llama/Llama-2-7b-chat-hf - Powerful conversational model
- mistralai/Mistral-7B-Instruct-v0.2 - Instruction-tuned model
- HuggingFaceH4/zephyr-7b-beta - Fine-tuned conversational model
To use a different model:
1. Download it:

   ```bash
   litgpt download --repo_id <model_name>
   ```

2. Add it to the model list in app.py
3. Select it from the dropdown
Edit the model_options list in app.py:

```python
model_options = [
    "YourModel/ModelName",
    # Add more models here
]
```

Modify the default values in the sidebar sliders:

```python
temperature = st.slider("Temperature", ..., value=0.7)  # Change default
max_tokens = st.slider("Max Tokens", ..., value=150)  # Change default
```

- Use GPU: Significantly faster generation (automatic if CUDA available)
- Smaller Models: Faster responses, lower memory usage
- Adjust Max Tokens: Lower values = faster generation
- Disable Streaming: Slightly faster for non-interactive use
- Model Caching: Models are cached after first load
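In Streamlit, caching a loaded model is usually done by decorating the loader with @st.cache_resource. The framework-free sketch below shows the same idea with a plain dictionary (names are illustrative, not taken from app.py):

```python
_model_cache: dict = {}

def load_model(name: str, loader) -> object:
    """Return a cached model instance, invoking the loader only on first use."""
    if name not in _model_cache:
        _model_cache[name] = loader(name)
    return _model_cache[name]

calls = []
def fake_loader(name):
    """Stand-in for an expensive model load; records each invocation."""
    calls.append(name)
    return f"model:{name}"

load_model("phi-2", fake_loader)
load_model("phi-2", fake_loader)  # served from cache; loader not called again
```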
Error: Model not found

Solution: Download the model first:

```bash
litgpt download --repo_id <model_name>
```

Error: CUDA out of memory

Solution:
- Use a smaller model
- Reduce max_tokens
- Use CPU instead of GPU
Issue: Slow generation

Solution:
- Ensure the GPU is being used
- Use a smaller model
- Reduce max_tokens
- Disable streaming
This project uses LitGPT, which is licensed under the Apache License 2.0.
Feel free to submit issues, fork the repository, and create pull requests for any improvements.
- LitGPT - Lightning AI's GPT implementation
- Streamlit - The web framework
- PyTorch - Deep learning framework
For issues related to:
- LitGPT: Check the LitGPT documentation
- This Application: Open an issue in the repository