A self-sustaining marketplace that automates the collection of domain-specific data from the web, transforms it into high-quality training datasets, trains specialized AI models using QLoRA, and monetizes through dataset sales, API queries, and custom training services.
## Features

- Data Collection Pipeline: Automated scraping, extraction, AI-powered formatting, and deduplication
- Marketplace: Browse and purchase curated training datasets
- Model Training: QLoRA-based training on free infrastructure (Google Colab)
- Inference API: Query specialized AI models via a REST API
- Custom Training: Upload your own data and train custom models
- Payment Integration: Stripe for dataset purchases and credit management
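The README does not pin down how the pipeline's deduplication works (the ML stack lists sentence-transformers for semantic similarity), so the helper below is only a minimal stdlib sketch of the idea: exact duplicates are caught by content hash, near-duplicates by a pairwise similarity ratio. The function name and threshold are illustrative, not taken from this project.

```python
import hashlib
from difflib import SequenceMatcher

def dedupe(records: list[str], threshold: float = 0.9) -> list[str]:
    """Drop exact and near-duplicate text records.

    Exact duplicates are detected with a normalized content hash;
    near-duplicates with a character-level similarity ratio. A real
    pipeline could swap the ratio for sentence-transformers embeddings.
    """
    seen_hashes: set[str] = set()
    kept: list[str] = []
    for text in records:
        digest = hashlib.sha256(text.strip().lower().encode()).hexdigest()
        if digest in seen_hashes:
            continue  # exact duplicate of an earlier record
        if any(SequenceMatcher(None, text, k).ratio() >= threshold for k in kept):
            continue  # near-duplicate of a kept record
        seen_hashes.add(digest)
        kept.append(text)
    return kept
```

In an embedding-based variant, the `SequenceMatcher` check would be replaced by cosine similarity over sentence embeddings, which catches paraphrases that character comparison misses.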
## Tech Stack

### Backend

- FastAPI: Modern Python web framework
- SQLAlchemy: ORM for PostgreSQL
- Celery: Distributed task queue
- Redis: Caching and message broker
- PostgreSQL: Database with the pgvector extension
- HuggingFace: Model hosting and inference
- Stripe: Payment processing

### Frontend

- React: UI library
- TypeScript: Type-safe JavaScript
- Vite: Build tool
- Zustand: State management
- Axios: HTTP client

### ML Pipeline

- BeautifulSoup: HTML parsing
- PyPDF2: PDF extraction
- OpenAI/Claude: AI-powered formatting
- sentence-transformers: Semantic similarity
- Transformers: Model training with QLoRA
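For context on the Transformers entry, a typical QLoRA setup combines 4-bit quantization (bitsandbytes) with low-rank adapters (peft). The snippet below is a hedged sketch of that pattern; the base model name and all hyperparameters are illustrative assumptions, not values taken from this project, and running it requires a GPU and a model download.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization -- the "Q" in QLoRA
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",  # illustrative base model, not this project's choice
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Low-rank adapters -- the "LoRA" in QLoRA; r and alpha are example values
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```

Only the adapter weights are trained, which is what makes training feasible on free-tier hardware such as Colab GPUs.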
## Getting Started

### Prerequisites

- Docker and Docker Compose
- Python 3.11+
- Node.js 20+
### Quick Start

```bash
git clone <repository-url>
cd ai-training-platform
cp .env.example .env
# Edit .env with your API keys
docker-compose up -d
```

This will start:

- PostgreSQL (port 5432)
- Redis (port 6379)
- Backend API (port 8000)
- Celery worker
- Frontend (port 3000)

Run the database migrations:

```bash
docker-compose exec backend alembic upgrade head
```

The services are then available at:

- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- API Docs: http://localhost:8000/docs
## Local Development

### Backend

```bash
cd backend

# Create a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements/dev.txt

# Run migrations
alembic upgrade head

# Start the development server
uvicorn api.main:app --reload

# Run tests
pytest

# Run the Celery worker
celery -A workers.celery_app worker --loglevel=info
```

### Frontend

```bash
cd frontend

# Install dependencies
npm install

# Start the development server
npm run dev

# Run tests
npm test

# Build for production
npm run build
```

## Project Structure

```
ai-training-platform/
├── backend/
│   ├── api/                 # FastAPI application
│   │   ├── routers/         # API endpoints
│   │   ├── models/          # Pydantic schemas
│   │   ├── services/        # Business logic
│   │   ├── db/              # Database models and migrations
│   │   ├── core/            # Security utilities
│   │   └── utils/           # Helper functions
│   ├── ml_pipeline/         # Data collection and processing
│   │   ├── scraper/         # Web scraping
│   │   ├── extractor/       # Content extraction
│   │   ├── formatter/       # AI formatting
│   │   ├── trainer/         # Model training
│   │   └── deployer/        # HuggingFace deployment
│   ├── workers/             # Celery tasks
│   ├── tests/               # Test suite
│   └── requirements/        # Python dependencies
├── frontend/
│   ├── src/
│   │   ├── components/      # React components
│   │   ├── pages/           # Page components
│   │   ├── services/        # API clients
│   │   ├── store/           # Zustand stores
│   │   ├── types/           # TypeScript types
│   │   └── utils/           # Utility functions
│   └── public/              # Static assets
├── data/                    # Data storage (gitignored)
├── docker-compose.yml       # Docker services
└── .env.example             # Environment template
```
## API Documentation

Once the backend is running, visit http://localhost:8000/docs for interactive API documentation (Swagger UI).
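As a sketch of what a programmatic call might look like, the snippet below builds a request for a hypothetical inference endpoint. The route `/api/v1/inference`, the payload fields, and the bearer-token auth are all assumptions; check the Swagger UI for the schema your deployment actually exposes.

```python
import json
import urllib.request

API_BASE = "http://localhost:8000"

def build_inference_request(model_id: str, prompt: str, token: str) -> urllib.request.Request:
    """Build a POST request for the (hypothetical) inference endpoint."""
    payload = json.dumps({"model_id": model_id, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/api/v1/inference",  # assumed route; verify against /docs
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )

# To actually send it (requires the stack from `docker-compose up -d`):
# with urllib.request.urlopen(build_inference_request("my-model", "Hello", "API_TOKEN")) as resp:
#     print(json.load(resp))
```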
## Testing

### Backend

```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=api --cov=ml_pipeline

# Run a specific test file
pytest tests/unit/test_auth.py

# Run property-based tests
pytest tests/property/
```

### Frontend

```bash
# Run all tests
npm test

# Run with coverage
npm test -- --coverage

# Run a specific test file
npm test -- src/components/Auth.test.tsx
```

## Deployment

See DEPLOYMENT.md for detailed deployment instructions for:
- AWS Lambda (Serverless)
- AWS ECS (Containers)
- Traditional VPS
## Contributing

- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Support

For questions and support, please open an issue on GitHub.