Groq - Evals
Groq Model Evaluator is a powerful web application that allows you to compare and evaluate different Groq language models side by side. Built with Next.js and FastAPI, it provides an intuitive interface for testing model performance, analyzing responses, and making data-driven decisions about which model best suits your needs.
- Side-by-side model comparison
- Automated reasoning about model performance
- Beautiful, responsive UI
- Secure API key management

Yet to come:
- Visual metric representations
- Comprehensive evaluation metrics
- Semantic similarity analysis
- Python 3.8+
- Node.js 18+
- Groq API key (available from the Groq console)
- Clone the repository:
    git clone https://github.com/yourusername/groq-evals.git
    cd groq-evals
- Install backend dependencies:
    cd backend
    python -m venv venv
    source venv/bin/activate # On Windows: .\venv\Scripts\activate
    pip install -r requirements.txt
- Install frontend dependencies:
    cd frontend
    npm install
- Create a .env file in the backend directory:
    AVAILABLE_MODELS=["gemma2-9b-it", "llama-3.1-8b-instant", "mixtral-8x7b-32768"]
    EVALUATION_MODELS=["deepseek-r1-distill-llama-70b"]
- Start the backend server:
    cd backend
    uvicorn app.main:app --reload
- Start the frontend development server:
    cd frontend
    npm run dev
- Open http://localhost:3000 in your browser
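The two .env values in the setup steps are written as JSON-style arrays, so a backend could read them with the standard json module. The following is a minimal sketch of that idea, not the project's actual loader: the helper names parse_model_list and load_models are hypothetical, and it assumes the variables have already been exported into the process environment (for example by a dotenv loader).

```python
import json
import os
from typing import List


def parse_model_list(raw: str) -> List[str]:
    """Parse a JSON array of model names, e.g. '["a", "b"]'."""
    models = json.loads(raw)
    # Reject anything that is not a flat list of strings.
    if not isinstance(models, list) or not all(isinstance(m, str) for m in models):
        raise ValueError(f"expected a JSON array of strings, got: {raw!r}")
    return models


def load_models(var_name: str, default: str = "[]") -> List[str]:
    """Read one model-list variable from the environment (hypothetical helper)."""
    return parse_model_list(os.environ.get(var_name, default))


if __name__ == "__main__":
    print(load_models("AVAILABLE_MODELS"))
    print(load_models("EVALUATION_MODELS"))
```

Validating the shape up front means a malformed entry fails at startup with a clear message instead of surfacing later as a confusing request error.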
- Compare responses from different Groq models side by side
- Automatic evaluation of response quality
- Detailed reasoning for model selection
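As a rough illustration of the side-by-side comparison described above, the sketch below sends one prompt to several models in parallel and collects the replies by model name. It is an assumption about how such a comparison could be structured, not the application's actual code: compare_models and fake_ask are hypothetical names, and fake_ask stands in for a real API call so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def compare_models(
    prompt: str,
    models: List[str],
    ask: Callable[[str, str], str],
) -> Dict[str, str]:
    """Send the same prompt to every model and collect replies by model name."""
    with ThreadPoolExecutor(max_workers=max(1, len(models))) as pool:
        # Submit all requests first so they run concurrently.
        futures = {model: pool.submit(ask, model, prompt) for model in models}
        # Then wait for each result, keeping the input order of the models.
        return {model: future.result() for model, future in futures.items()}


def fake_ask(model: str, prompt: str) -> str:
    """Offline stand-in for a real model call (hypothetical)."""
    return f"[{model}] {prompt}"


if __name__ == "__main__":
    replies = compare_models(
        "Explain recursion in one sentence.",
        ["gemma2-9b-it", "llama-3.1-8b-instant"],
        fake_ask,
    )
    for model, reply in replies.items():
        print(f"{model}: {reply}")
```

In real use, ask would wrap a chat completion request to the Groq API for the given model. Passing it in as a parameter keeps the comparison logic testable without network access or an API key.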
We welcome contributions! Here's how you can help:
- Fork the repository
- Create a feature branch:
    git checkout -b feature/amazing-feature
- Commit your changes:
    git commit -m 'Add amazing feature'
- Push to your branch:
    git push origin feature/amazing-feature
- Open a Pull Request
- Follow the existing code style
- Add comments for complex logic
- Update documentation as needed
- Add tests for new features
- Ensure all tests pass before submitting a PR
This project is licensed under the MIT License.
- Groq for their amazing API
- The open-source community for inspiration and tools
- All contributors who help improve this project
Have questions? Need help? Feel free to:
- Open an issue
- Start a discussion
- Reach out to maintainers
