An intelligent API that classifies natural-language expense descriptions into categories such as Transportation, Food & Drink, and Entertainment, powered by machine learning and FastAPI.
It also supports:
- Batch classification
- OCR classification from images/receipts
- Confidence scoring for each prediction
- Text Classification — Predicts expense categories from written descriptions.
- OCR Support — Extracts and classifies text from images (e.g. receipts).
- ML-Powered — Trained using scikit-learn with TfidfVectorizer and LogisticRegression.
- Batch Input — Classify multiple expenses in a single call.
- FastAPI Backend — Modern async API with auto-generated Swagger docs.
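The OCR feature listed above can be sketched roughly as follows. This is only an illustration, not the project's actual code: it assumes Pillow and pytesseract (both in the install list) plus a locally installed Tesseract binary, and the helper names `extract_text`, `clean_ocr_text`, and `describe_receipt` are hypothetical.

```python
import re


def extract_text(image_path: str) -> str:
    """Run OCR on an image file and return the raw recognized text."""
    # Imported lazily so clean_ocr_text() stays usable on machines where
    # Pillow, pytesseract, or the Tesseract binary are not installed.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(image)


def clean_ocr_text(raw_text: str) -> str:
    """Collapse line breaks and repeated whitespace into single spaces."""
    return re.sub(r"\s+", " ", raw_text).strip()


def describe_receipt(image_path: str) -> str:
    """Hypothetical helper: turn a receipt image into a one-line description."""
    return clean_ocr_text(extract_text(image_path))
```

The cleaned string could then be passed to the same classifier used for typed descriptions. Real receipts are noisy, so some filtering of totals, dates, and addresses would likely be needed before classification.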
git clone https://github.com/your-username/ai-expense-classifier.git
cd ai-expense-classifier
python3 -m venv venv
source venv/bin/activate
pip install fastapi uvicorn scikit-learn pandas joblib pillow pytesseract python-multipart
python app/ml/train_model.py
This script trains a logistic regression classifier on the mock dataset and saves it to app/models/expense_classifier.pkl.
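A minimal sketch of what such a training step might look like, assuming a TF-IDF plus logistic regression pipeline as described in the feature list. The example rows are made up for illustration and stand in for the mock dataset. The model is written to a temporary file rather than to app/models/, and using the highest class probability as the confidence score is an assumption, not something the project confirms.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Made-up rows standing in for the project's mock dataset (assumption).
descriptions = [
    "Uber ride to the office", "Monthly bus pass", "Taxi to the airport",
    "Lunch at a sushi restaurant", "Coffee and a bagel", "Dinner with friends",
    "Movie tickets", "Concert admission", "Streaming subscription",
]
labels = [
    "Transportation", "Transportation", "Transportation",
    "Food & Drink", "Food & Drink", "Food & Drink",
    "Entertainment", "Entertainment", "Entertainment",
]

# TF-IDF features feeding a logistic regression classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(descriptions, labels)

# Persist with joblib; a temp file is used here instead of app/models/.
model_path = Path(tempfile.gettempdir()) / "expense_classifier_sketch.pkl"
joblib.dump(model, model_path)

# Reload and classify a small batch, taking the top class probability
# as the confidence score (assumption).
loaded = joblib.load(model_path)
batch = ["Taxi to the airport", "Dinner with friends"]
results = [
    {
        "description": text,
        "predicted_category": str(loaded.classes_[row.argmax()]),
        "confidence": round(float(row.max()), 2),
    }
    for text, row in zip(batch, loaded.predict_proba(batch))
]
print(results)
```

With only a handful of training rows the confidence values will be low and close together. A realistic dataset would be needed before the scores mean much.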
uvicorn app.main:app --reload
Visit:
http://localhost:8000/docs (Swagger UI)
POST /classify
{
  "description": "Flight to Toronto"
}
Response
{
  "description": "Flight to Toronto",
  "predicted_category": "Travel",
  "confidence": 0.84
}
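For reference, the request above could be sent from Python roughly like this. It is a sketch using only the standard library and assumes the server is running locally on port 8000 as in the uvicorn command. The function names `build_classify_request` and `classify` are hypothetical.

```python
import json
import urllib.request


def build_classify_request(
    description: str, base_url: str = "http://localhost:8000"
) -> urllib.request.Request:
    """Build a POST /classify request carrying a JSON body."""
    payload = json.dumps({"description": description}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/classify",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(description: str, base_url: str = "http://localhost:8000") -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_classify_request(description, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `classify("Flight to Toronto")` should return a dictionary with the `description`, `predicted_category`, and `confidence` fields shown in the response above.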