An interactive web-based platform for learning and experimenting with foundational machine learning algorithms. Inspired by Andrew Ng's Machine Learning course, ML Pathways provides a hands-on environment where users can explore ML problems, interact with an AI agent, generate code, and execute experiments safely.
- 9 ML Problem Types: Linear regression, logistic regression, neural networks, clustering, and more
- Interactive AI Assistant: Chat with GPT-4, Claude, or Gemini for guidance, EDA, and Q&A
- Automated Code Generation: Generate production-ready Python code for ML tasks
- Sandboxed Execution: Safe code execution using E2B Code Interpreter
- Sample Datasets: Curated datasets for each problem type
- Custom Dataset Upload: Bring your own CSV files
- Automated EDA: Instant exploratory data analysis with statistics and insights
- Interactive Visualizations: Charts and graphs using Plotly.js and Recharts
- Experiment Tracking: Save and manage your ML experiments
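As a rough illustration of the kind of summary the Automated EDA feature describes, the sketch below computes per-column statistics for CSV text using only the Python standard library. The function name `summarize_csv` and the shape of the returned dictionary are assumptions made for this example; they are not taken from the project's `src/lib/eda/` utilities.

```python
import csv
import io
import statistics


def summarize_csv(text):
    """Return per-column summary statistics for CSV text with a header row.

    Hypothetical helper for illustration; not the project's EDA code.
    """
    rows = list(csv.DictReader(io.StringIO(text)))
    if not rows:
        return {}
    summary = {}
    for column in rows[0].keys():
        raw = [row[column] for row in rows]
        # Treat empty strings and absent trailing fields as missing values.
        present = [value for value in raw if value not in ("", None)]
        missing = len(raw) - len(present)
        try:
            numbers = [float(value) for value in present]
        except ValueError:
            numbers = None
        if numbers:
            summary[column] = {
                "type": "numeric",
                "missing": missing,
                "min": min(numbers),
                "max": max(numbers),
                "mean": statistics.fmean(numbers),
            }
        else:
            # Non-numeric (or entirely empty) column: report cardinality only.
            summary[column] = {
                "type": "categorical",
                "missing": missing,
                "unique": len(set(present)),
            }
    return summary
```

Calling `summarize_csv` on the contents of an uploaded CSV gives one entry per column, which is roughly the information an EDA panel would need before charting.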
| Layer | Technology |
|---|---|
| Frontend | Next.js 15, React 19, TypeScript |
| UI Components | Shadcn UI, Tailwind CSS |
| Backend | Next.js API Routes |
| Database | Neon Serverless Postgres |
| ORM | Drizzle ORM |
| Authentication | BetterAuth (ready to configure) |
| AI Providers | OpenAI GPT-4, Anthropic Claude, Google Gemini |
| Code Execution | E2B Code Interpreter |
| File Storage | Cloudflare R2 (ready to configure) |
| Charts | Plotly.js, Recharts |
| Monitoring | Sentry (ready to configure) |
| Deployment | Vercel |
- Node.js 18+ and npm
- PostgreSQL database (Neon recommended)
- At least one AI provider API key (OpenAI, Anthropic, or Google)
- E2B API key for code execution (optional but recommended)
- Clone the repository
git clone https://github.com/yourusername/ml-pathways.git
cd ml-pathways
- Install dependencies
npm install
- Set up environment variables
Copy .env.example to .env and fill in your values:
cp .env.example .env
Required environment variables:
# Database
DATABASE_URL=your_neon_postgres_connection_string
# AI Provider (choose one or more)
OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key
GOOGLE_API_KEY=your_google_key
# Set your preferred provider (default: claude)
AI_PROVIDER=claude # or openai, gemini
# Code Execution (optional)
E2B_API_KEY=your_e2b_key
# Authentication (optional)
BETTER_AUTH_SECRET=your_secret_key
BETTER_AUTH_URL=http://localhost:3000
# File Storage (optional)
CLOUDFLARE_R2_ACCOUNT_ID=
CLOUDFLARE_R2_ACCESS_KEY_ID=
CLOUDFLARE_R2_SECRET_ACCESS_KEY=
CLOUDFLARE_R2_BUCKET_NAME=
# Monitoring (optional)
SENTRY_DSN=
- Set up the database
Push the schema to the database:
npm run db:push
- Start the development server
npm run dev
Open http://localhost:3000 in your browser.
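Before starting the server, it can help to confirm that the `.env` values are consistent. The sketch below is a hypothetical pre-flight check, not part of the project. It assumes that `AI_PROVIDER` values map to key names as `openai` to `OPENAI_API_KEY`, `claude` to `ANTHROPIC_API_KEY`, and `gemini` to `GOOGLE_API_KEY`, and that the selected provider needs its own key; the application itself may behave differently.

```python
import os

# Assumed mapping from AI_PROVIDER values to the key each provider needs.
AI_KEYS = {
    "openai": "OPENAI_API_KEY",
    "claude": "ANTHROPIC_API_KEY",
    "gemini": "GOOGLE_API_KEY",
}


def check_env(env):
    """Return a list of human-readable problems found in the given mapping."""
    problems = []
    if not env.get("DATABASE_URL"):
        problems.append("DATABASE_URL is not set")

    configured = [name for name, key in AI_KEYS.items() if env.get(key)]
    if not configured:
        problems.append("no AI provider key is set")

    # The README states that claude is the default provider.
    provider = env.get("AI_PROVIDER", "claude")
    if provider not in AI_KEYS:
        problems.append("AI_PROVIDER=" + provider + " is not a known provider")
    elif configured and provider not in configured:
        problems.append(
            "AI_PROVIDER=" + provider + " but " + AI_KEYS[provider] + " is not set"
        )
    return problems
```

Running `check_env(os.environ)` after exporting the variables returns an empty list when the required values are present. The optional keys (E2B, BetterAuth, R2, Sentry) are deliberately not checked.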
ml-pathways/
├── src/
│   ├── app/                    # Next.js app directory
│   │   ├── api/                # API routes
│   │   │   ├── chat/           # AI chat endpoint
│   │   │   ├── generate-code/  # Code generation endpoint
│   │   │   └── execute/        # Code execution endpoint
│   │   ├── dashboard/          # User dashboard
│   │   ├── problems/           # ML problems listing
│   │   ├── datasets/           # Dataset management
│   │   ├── experiments/        # Experiment tracking
│   │   └── workspace/          # Experiment workspace
│   ├── components/             # React components
│   │   ├── ui/                 # Shadcn UI components
│   │   └── layout/             # Layout components
│   ├── lib/
│   │   ├── ai/                 # AI provider integrations
│   │   ├── eda/                # Data analysis utilities
│   │   ├── constants/          # ML problem definitions
│   │   └── sample-data/        # Sample datasets
│   └── db/
│       ├── schema.ts           # Database schema
│       └── index.ts            # Database client
├── drizzle/                    # Database migrations
├── public/                     # Static assets
└── package.json
- Linear Regression (Single Variable) - Predict housing prices by size
- Linear Regression (Multiple Variables) - Multi-feature price prediction
- Logistic Regression - Binary classification for university admissions
- Regularized Regression - Prevent overfitting with L1/L2 regularization
- Polynomial Regression - Model nonlinear relationships
- Multi-class Classification - Handwritten digit recognition
- K-Means Clustering - Customer segmentation
- Neural Networks - Deep learning for image classification
- Principal Component Analysis (PCA) - Dimensionality reduction
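To give a sense of what the first problem in this list involves, here is a minimal sketch of single-variable linear regression fitted with the closed-form least-squares solution. Gradient descent, as taught in the course, would be an alternative. The function name `fit_line` and the housing numbers are made up for illustration and are not one of the project's sample datasets.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviation of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict y for a new x using the fitted line."""
    return slope * x + intercept


# Illustrative data only: house size in square feet and price in dollars.
sizes = [1000, 1500, 2000, 2500]
prices = [150000, 200000, 250000, 300000]
slope, intercept = fit_line(sizes, prices)
```

With the fitted `slope` and `intercept`, `predict(slope, intercept, 1800)` estimates the price of an 1800 square foot house under this toy model.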
- Choose an ML Problem - Browse problems by difficulty or category
- Select a Dataset - Use sample data or upload your own CSV
- Explore with AI - Chat with the AI assistant about your data
- Automated EDA - Get instant insights and visualizations
- Generate Code - AI creates Python code for your experiment
- Execute Safely - Run code in a sandboxed environment
- Visualize Results - Interactive charts and performance metrics
- Iterate & Learn - Refine your approach with AI guidance
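The chat, code generation, and execution steps of this workflow correspond to the three routes under `src/app/api/`. The sketch below builds request bodies in the shapes this README documents and posts them to a local dev server. It assumes the routes are served at `/api/chat`, `/api/generate-code`, and `/api/execute` and accept POST with a JSON body (inferred from the directory layout, not confirmed), and that `context` and `datasetUrl` may be omitted. All helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3000"  # dev server address from the setup steps


def chat_payload(question, problem_type, context=None):
    """Build a chat request body with a single user message."""
    body = {
        "messages": [{"role": "user", "content": question}],
        "problemType": problem_type,
    }
    if context is not None:
        body["context"] = context
    return body


def generate_code_payload(problem_type, task, columns, row_count):
    """Build a code generation request body describing the dataset."""
    return {
        "problemType": problem_type,
        "task": task,
        "datasetInfo": {"columns": list(columns), "rowCount": row_count},
    }


def execute_payload(code, dataset_url=None):
    """Build an execution request body; datasetUrl is assumed optional."""
    body = {"code": code}
    if dataset_url is not None:
        body["datasetUrl"] = dataset_url
    return body


def post_json(path, body):
    """POST body as JSON to BASE_URL + path and return the decoded response."""
    request = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Example chain (requires a running server, so it is left commented out):
# generated = post_json("/api/generate-code", generate_code_payload(
#     "linear_regression_single",
#     "Train a linear regression model on housing data",
#     ["size", "price"], 100))
# result = post_json("/api/execute", execute_payload(generated["code"]))
```

Keeping the payload builders separate from the network call makes the request shapes easy to inspect without a running server.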
Chat with the AI assistant (route: src/app/api/chat/).
Request:
{
  "messages": [
    { "role": "user", "content": "Explain linear regression" }
  ],
  "problemType": "linear_regression_single",
  "context": "Optional additional context"
}
Response:
{
  "message": "AI response...",
  "provider": "claude"
}
Generate Python code for an ML task (route: src/app/api/generate-code/).
Request:
{
  "problemType": "linear_regression_single",
  "task": "Train a linear regression model on housing data",
  "datasetInfo": {
    "columns": ["size", "price"],
    "rowCount": 100
  }
}
Response:
{
  "code": "import pandas as pd...",
  "explanation": "Code explanation",
  "provider": "claude"
}
Execute Python code in a sandbox (route: src/app/api/execute/).
Request:
{
  "code": "print('Hello, ML!')",
  "datasetUrl": "https://example.com/data.csv"
}
Response:
{
  "status": "success",
  "output": "Hello, ML!",
  "charts": [],
  "logs": []
}
Key tables:
- users - User accounts
- datasets - Uploaded and sample datasets
- experiments - ML experiments
- executions - Code execution records
- chat_messages - Conversation history
- eda_results - Exploratory data analysis results
- sample_datasets - Pre-loaded sample datasets
npm run dev # Start development server
npm run build # Build for production
npm run start # Start production server
npm run lint # Run ESLint
npm run db:generate # Generate migrations
npm run db:push # Push schema to database
npm run db:studio # Open Drizzle Studio
- Add the problem type to the enum in src/db/schema.ts
- Define the problem in src/lib/constants/ml-problems.ts
- Create a sample dataset in src/lib/sample-data/
- Add problem-specific context in src/lib/ai/prompts.ts
- Push your code to GitHub
- Connect your repository to Vercel
- Add environment variables in Vercel dashboard
- Deploy
- Create a Neon account at neon.tech
- Create a new project
- Copy the connection string
- Set it as DATABASE_URL in your environment variables
- Run npm run db:push to create tables
- Create an account at e2b.dev
- Get your API key
- Set it as E2B_API_KEY in your environment variables
- Sandboxed code execution isolates untrusted code from the host
- API rate limiting (ready to configure)
- Input validation on all endpoints
- Secure file upload handling
- Environment variable protection
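As an illustration of what input validation for the execution endpoint might check, the sketch below validates a request body in the documented shape (`code`, `datasetUrl`). It is written in Python for consistency with the other examples here; the project's actual validation would live in its TypeScript route handlers and may differ. The size limit and the https-only rule are assumptions made for this example.

```python
from urllib.parse import urlparse

MAX_CODE_CHARS = 50_000  # hypothetical limit chosen for illustration


def validate_execute_request(body):
    """Return a list of validation errors; an empty list means the body is OK."""
    if not isinstance(body, dict):
        return ["body must be a JSON object"]

    errors = []
    code = body.get("code")
    if not isinstance(code, str) or not code.strip():
        errors.append("code must be a non-empty string")
    elif len(code) > MAX_CODE_CHARS:
        errors.append("code exceeds the maximum allowed length")

    url = body.get("datasetUrl")
    if url is not None:
        parsed = urlparse(url) if isinstance(url, str) else None
        # Reject non-https schemes such as file:// so the sandbox is never
        # pointed at local resources (an assumed policy, not a documented one).
        if parsed is None or parsed.scheme != "https" or not parsed.netloc:
            errors.append("datasetUrl must be an https URL")
    return errors
```

Rejecting malformed bodies before they reach the sandbox keeps error messages clear and avoids spending execution time on requests that cannot succeed.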
- Additional ML algorithms (SVM, Decision Trees, Random Forests)
- Real-time collaboration features
- Community dataset sharing
- Experiment leaderboards
- Advanced visualization options
- Jupyter notebook export
- Model deployment capabilities
- Mobile app version
Contributions are welcome! Please read our contributing guidelines before submitting PRs.
MIT License - see LICENSE file for details
- Inspired by Andrew Ng's Machine Learning course
- Built with Next.js, Shadcn UI, and modern ML tools
- Powered by OpenAI, Anthropic, and Google AI
For issues and feature requests, please use the GitHub issue tracker.
ML Pathways - Learn machine learning by doing, guided by AI.