AI Research Ecosystem for Teaching Experiment
An intelligent educational chatbot system for Copenhagen Business School
ARETE (AI Research Ecosystem for Teaching Experiment) is an educational chatbot platform designed to support students at Copenhagen Business School. The system combines fine-tuned, open-source small language models with Retrieval Augmented Generation (RAG) to provide step-by-step guidance and course-specific assistance across multiple disciplines.
ARETE serves as a friendly and supportive teaching assistant that helps students with:
- Machine Learning concepts and problem-solving
- Supply Chain Management principles and applications
- Internet of Things development and troubleshooting
For questions, feedback, or collaboration opportunities, please reach out:
- Email: kok.digi@cbs.dk
- GitHub: koskath/arete
- LinkedIn: Konstantinos Katharakis
- Institution: Copenhagen Business School - Department of Digitalisation
This repository contains two main components:
The deployment folder contains the production-ready application consisting of:
- Backend API (app.py): FastAPI-based REST API that handles chat requests, streaming responses, and conversation management
- Frontend Application (app/): Next.js React application with TypeScript providing an intuitive chat interface
- RAG Pipeline (rag_pipeline.py): Retrieval Augmented Generation system that searches vector stores for relevant course content
- Model Integration (instruct_model.py): Interfaces with various LLM providers (HuggingFace, Llama Cloud, Codestral) for generating responses
- Vector Stores: ChromaDB-based vector stores for each course (ML, SC, IoT) containing embedded lecture materials
- Course Configuration (load_course_specific.py): Dynamic loading of course-specific system prompts and vector stores
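To make the retrieval step concrete, here is a minimal, standard-library-only sketch of how a course-specific RAG lookup might feed a prompt. It is not the code in rag_pipeline.py or load_course_specific.py: the in-memory `COURSE_DOCS` and `SYSTEM_PROMPTS` dictionaries, the function names, and the bag-of-words cosine similarity are all illustrative assumptions standing in for the ChromaDB vector stores and learned embeddings that the project uses.

```python
import math
import re
from collections import Counter
from typing import List

# Hypothetical stand-ins for the per-course vector stores (assumption:
# the real system stores embedded lecture materials in ChromaDB).
COURSE_DOCS = {
    "ML": [
        "Gradient descent updates model weights by following the negative gradient of the loss.",
        "Overfitting occurs when a model memorises training data and generalises poorly.",
    ],
    "SC": [
        "The bullwhip effect describes demand variability amplifying upstream in a supply chain.",
    ],
    "IoT": [
        "MQTT is a lightweight publish/subscribe protocol commonly used by IoT devices.",
    ],
}

# Hypothetical course-specific system prompts.
SYSTEM_PROMPTS = {
    "ML": "You are a supportive teaching assistant for Machine Learning.",
    "SC": "You are a supportive teaching assistant for Supply Chain Management.",
    "IoT": "You are a supportive teaching assistant for Internet of Things.",
}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(course: str, question: str, k: int = 1) -> List[str]:
    """Return the k documents of the given course most similar to the question."""
    query = Counter(tokenize(question))
    ranked = sorted(
        COURSE_DOCS[course],
        key=lambda doc: cosine(query, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(course: str, question: str, k: int = 1) -> str:
    """Combine the course system prompt, retrieved context, and the question."""
    context = "\n".join(retrieve(course, question, k))
    return f"{SYSTEM_PROMPTS[course]}\n\nContext:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    print(build_prompt("ML", "What is overfitting?"))
```

The point of the sketch is the shape of the flow: pick the store and system prompt by course, rank stored passages against the question, and pass the top passages to the model as context.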
Key Features:
- Streaming chat responses for real-time interaction
- Session-based conversation history management
- Course-specific knowledge retrieval
- Feedback collection system for continuous improvement
- Multi-course support (ML, SC, IoT)
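As an illustration of how streaming and session-based history can fit together, the sketch below keeps a bounded message list per session and records the assistant's reply once the stream has finished. The `SessionStore` class, `stream_reply` function, `echo_model` placeholder, and the message cap are hypothetical names and choices, not the implementation in app.py.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Message = Dict[str, str]


class SessionStore:
    """In-memory conversation history keyed by session ID (illustrative only)."""

    def __init__(self, max_messages: int = 20) -> None:
        self._history: Dict[str, List[Message]] = {}
        self._max = max_messages  # assumed cap to keep prompts bounded

    def append(self, session_id: str, role: str, content: str) -> None:
        messages = self._history.setdefault(session_id, [])
        messages.append({"role": role, "content": content})
        # Drop the oldest messages once the cap is exceeded.
        if len(messages) > self._max:
            del messages[: len(messages) - self._max]

    def get(self, session_id: str) -> List[Message]:
        # Return a copy so callers cannot mutate the stored history.
        return list(self._history.get(session_id, []))


def stream_reply(
    store: SessionStore,
    session_id: str,
    user_message: str,
    generate_tokens: Callable[[List[Message]], Iterable[str]],
) -> Iterator[str]:
    """Yield reply chunks as they arrive, then save the full reply to history."""
    store.append(session_id, "user", user_message)
    chunks: List[str] = []
    for token in generate_tokens(store.get(session_id)):
        chunks.append(token)
        yield token
    store.append(session_id, "assistant", "".join(chunks))


def echo_model(history: List[Message]) -> Iterator[str]:
    """Placeholder for a real model: echoes the last user message word by word."""
    for word in history[-1]["content"].split():
        yield word + " "


if __name__ == "__main__":
    store = SessionStore()
    for chunk in stream_reply(store, "demo-session", "hello world", echo_model):
        print(chunk, end="", flush=True)
    print()
```

In a web framework, a generator like `stream_reply` is typically wrapped in a streaming response so the frontend can render tokens as they arrive; a production deployment would also need persistent storage rather than a process-local dictionary.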
Tech Stack:
- Backend: Python, FastAPI, LangChain, ChromaDB
- Frontend: Next.js, React, TypeScript, Tailwind CSS
- Models: Fine-tuned Mistral, Codestral, Llama Cloud
The workshop folder contains experimental code, research scripts, and development tools for:
- Model Fine-tuning: Scripts for training and fine-tuning language models on course-specific datasets
- Data Processing: Tools for preparing training datasets and processing course materials
- Experimentation: Research notebooks and scripts for testing new approaches and model configurations
- Evaluation: Scripts for assessing model performance and response quality
Note: This folder is used for research and development purposes. The code here may be experimental and not production-ready.
- Python 3.8+
- Node.js 18+
- MySQL database (for conversation logging)
- HuggingFace API token (for model access)
- Clone the repository:
git clone https://github.com/koskath/arete.git
cd arete
- Install Python dependencies:
pip install -r requirements.txt
- Install Node.js dependencies:
cd arete_deployment
npm install
- Set up environment variables:
# Create a .env file with:
HF_TOKEN=your_huggingface_token
ALLOWED_ORIGINS=*
# Add database credentials and other configuration as needed
- Start the FastAPI backend:
cd arete_deployment
python app.py
- Start the Next.js frontend (in a separate terminal):
cd arete_deployment
npm run dev
The application will be available at http://localhost:80
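For readers wondering how the environment variables from the setup steps might be consumed, the following is a small sketch using only the standard library. It is not taken from app.py: the function names, the comma-separated format for `ALLOWED_ORIGINS`, and the fallback to `*` are assumptions made for illustration.

```python
from typing import Dict, List, Mapping


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(env: Mapping[str, str]) -> Dict[str, object]:
    """Validate the settings named in the setup steps.

    Assumptions: HF_TOKEN is mandatory, and ALLOWED_ORIGINS is a
    comma-separated list that falls back to "*" when unset.
    """
    token = env.get("HF_TOKEN", "")
    if not token:
        raise ValueError("HF_TOKEN is required for model access")
    origins: List[str] = [
        origin.strip()
        for origin in env.get("ALLOWED_ORIGINS", "*").split(",")
        if origin.strip()
    ]
    return {"hf_token": token, "allowed_origins": origins}


if __name__ == "__main__":
    example = """
    # Example .env contents
    HF_TOKEN=your_huggingface_token
    ALLOWED_ORIGINS=*
    """
    settings = load_settings(parse_env_text(example))
    print(settings["allowed_origins"])
```

A wildcard origin is convenient for local development; for a deployed instance, listing the frontend's exact origins is generally the safer choice.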
This project is developed at Copenhagen Business School for educational and research purposes.