Skip to content

samuelcampozano/Neurodevelopmental-Disorders-Risk-Calculator

Repository files navigation

🧠 Neurodevelopmental Disorders Risk Calculator

Python FastAPI Machine Learning SQLAlchemy License

AI-powered REST API for early screening of neurodevelopmental disorders using validated clinical questionnaires (SCQ - Social Communication Questionnaire).

πŸ“– Overview

This project implements a comprehensive backend service that leverages machine learning to assess neurodevelopmental disorder risk based on responses to the internationally validated SCQ (Social Communication Questionnaire). The system processes 40 binary responses and provides probabilistic risk assessments with clinical interpretations.

🎯 Built for: Clinical research institutions, healthcare providers, and educational assessment tools.

✨ Key Features

πŸ€– Machine Learning Core

  • Random Forest Classifier trained on validated clinical data
  • Real-time predictions with confidence scoring
  • Risk stratification: Low, Medium, High categories
  • Probability estimates with clinical interpretations

πŸ—οΈ Enterprise Architecture

  • RESTful API built with FastAPI
  • Database persistence with SQLAlchemy ORM
  • Scalable design (SQLite β†’ PostgreSQL ready)
  • Comprehensive API documentation (OpenAPI/Swagger)
  • Health monitoring and system diagnostics

πŸ“Š Data Management

  • Complete evaluation storage (responses, demographics, predictions)
  • Statistical analytics and reporting endpoints
  • Data export capabilities for model retraining
  • GDPR-compliant data handling with consent tracking

πŸ›‘οΈ Production Ready

  • Input validation with Pydantic schemas
  • Error handling and logging
  • Performance optimization
  • Database migrations support (Alembic ready)

πŸ›οΈ Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Frontend      │───▢│   FastAPI        │───▢│   Database      β”‚
β”‚   (React)       β”‚    β”‚   Backend        β”‚    β”‚   (SQLite/      β”‚
β”‚                 β”‚    β”‚                  β”‚    β”‚   PostgreSQL)   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                              β”‚
                              β–Ό
                       β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                       β”‚   ML Model       β”‚
                       β”‚   (Random Forest)β”‚
                       β”‚   (.pkl)         β”‚
                       β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

πŸš€ Quick Start

Prerequisites

  • Python 3.13+
  • Git
  • Virtual environment (recommended)

Installation

# Clone the repository
git clone https://github.com/your_username/Neurodevelopmental-Disorders-Risk-Calculator.git
cd Neurodevelopmental-Disorders-Risk-Calculator

# Create and activate virtual environment
python -m venv venv

# Windows
venv\Scripts\activate

# macOS/Linux
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Run the application
./run.sh
# or
uvicorn app.main:app --reload --port 8000

πŸ” Verify Installation

# Test the API
python test_api.py

# Access API documentation
# http://localhost:8000/docs

πŸ“‘ API Endpoints

Core Endpoints

Method Endpoint Description
GET / API information and status
GET /health System health check
POST /api/v1/predict Get risk prediction only
POST /api/v1/submit Submit evaluation (save + predict)
GET /api/v1/evaluaciones List recent evaluations
GET /api/v1/evaluaciones/{id} Get specific evaluation
GET /api/v1/stats System statistics

πŸ“‹ Usage Examples

Submit Complete Evaluation

curl -X POST "http://localhost:8000/api/v1/submit" \
  -H "Content-Type: application/json" \
  -d '{
    "edad": 8,
    "sexo": "M",
    "respuestas": [true, false, true, ...], // 40 boolean values
    "acepto_consentimiento": true
  }'

Response

{
  "success": true,
  "message": "EvaluaciΓ³n guardada exitosamente",
  "evaluation_id": 1,
  "prediction": {
    "probability": 0.23,
    "risk_level": "Low",
    "confidence": 0.77,
    "interpretation": "Bajo riesgo de trastornos del neurodesarrollo"
  }
}

Get Prediction Only

curl -X POST "http://localhost:8000/api/v1/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "responses": [1, 0, 1, 0, ...] // 40 binary values
  }'

πŸ—„οΈ Database Schema

CREATE TABLE evaluaciones (
    id SERIAL PRIMARY KEY,
    sexo VARCHAR(10),           -- Gender (M/F)
    edad INTEGER,               -- Age
    respuestas BOOLEAN[40],     -- 40 SCQ responses
    riesgo_estimado FLOAT,      -- Predicted risk probability
    fecha TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    acepto_consentimiento BOOLEAN -- Consent flag
);

πŸ”¬ Machine Learning Model

Model Details

  • Algorithm: Random Forest Classifier
  • Training Data: Validated SCQ clinical dataset
  • Features: 40 binary responses from SCQ questionnaire
  • Output: Risk probability (0.0 - 1.0)
  • Performance: Optimized for clinical screening accuracy

Risk Categories

  • Low Risk: 0.0 - 0.33 (Green)
  • Medium Risk: 0.34 - 0.66 (Yellow)
  • High Risk: 0.67 - 1.0 (Red)

πŸ“ Project Structure

Neurodevelopmental-Disorders-Risk-Calculator/
β”œβ”€β”€ app/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ main.py                 # FastAPI application
β”‚   β”œβ”€β”€ database.py             # Database configuration
β”‚   β”œβ”€β”€ models/
β”‚   β”‚   β”œβ”€β”€ __init__.py
β”‚   β”‚   β”œβ”€β”€ predictor.py        # ML model logic
β”‚   β”‚   └── database_models.py  # SQLAlchemy models
β”‚   β”œβ”€β”€ routes/
β”‚   β”‚   β”œβ”€β”€ __init__.py
β”‚   β”‚   β”œβ”€β”€ predict.py          # Prediction endpoints
β”‚   β”‚   └── evaluations.py      # Evaluation endpoints
β”‚   β”œβ”€β”€ schemas/
β”‚   β”‚   β”œβ”€β”€ __init__.py
β”‚   β”‚   └── request.py          # Pydantic models
β”‚   └── utils/
β”‚       β”œβ”€β”€ __init__.py
β”‚       └── helpers.py          # Utility functions
β”œβ”€β”€ data/
β”‚   └── modelo_entrenado.pkl    # Trained ML model
β”œβ”€β”€ tests/
β”‚   └── test_api.py             # API tests
β”œβ”€β”€ requirements.txt            # Dependencies
β”œβ”€β”€ run.sh                      # Run script
β”œβ”€β”€ .gitignore                  # Git ignore rules
└── README.md                   # This file

πŸ§ͺ Testing

# Run comprehensive API tests
python test_api.py

# Expected output: All endpoints tested successfully
# - Root endpoint βœ…
# - Health check βœ…
# - Predictions βœ…
# - Evaluations storage βœ…
# - Statistics βœ…

πŸ“Š Monitoring & Analytics

The system provides built-in analytics:

  • Total evaluations processed
  • Risk distribution (Low/Medium/High)
  • Demographic insights (age, gender)
  • System health monitoring
  • Database performance metrics

πŸš€ Deployment

Development

  • Database: SQLite (included)
  • Server: Uvicorn development server
  • Environment: Local Python environment

Production Ready

  • Database: PostgreSQL (easily configurable)
  • Server: Gunicorn + Uvicorn workers
  • Deployment: Docker, AWS, GCP, Azure compatible
  • Monitoring: Health endpoints for load balancers

πŸ”§ Configuration

Environment Variables

# Database (optional, defaults to SQLite)
DATABASE_URL=postgresql://user:password@localhost/dbname

# API Configuration
API_VERSION=v1
DEBUG=False

# Model Configuration
MODEL_PATH=data/modelo_entrenado.pkl

πŸ“ˆ Performance

  • Response Time: < 100ms average
  • Throughput: 1000+ requests/minute
  • Accuracy: Optimized for clinical screening
  • Scalability: Horizontal scaling ready

🀝 Contributing

This project follows professional ML engineering practices:

  1. Fork the repository
  2. Create feature branch (git checkout -b feature/AmazingFeature)
  3. Commit changes (git commit -m 'Add AmazingFeature')
  4. Push to branch (git push origin feature/AmazingFeature)
  5. Open Pull Request

πŸ“‹ Future Enhancements

  • Authentication & Authorization (JWT tokens)
  • Advanced Analytics Dashboard
  • Model A/B Testing Framework
  • Automated Model Retraining Pipeline
  • Multi-language Support
  • Export to Clinical Formats (HL7 FHIR)
  • Real-time Model Monitoring

πŸ”’ Security & Privacy

  • Data Anonymization: No personal identifiers stored
  • Consent Tracking: GDPR compliance
  • Input Validation: Prevents injection attacks
  • Rate Limiting: DOS protection ready
  • Audit Logging: Complete request tracking

πŸ“š Clinical Background

The Social Communication Questionnaire (SCQ) is a validated screening tool for autism spectrum disorders and related neurodevelopmental conditions. This implementation:

  • Follows clinical best practices
  • Maintains diagnostic accuracy
  • Provides interpretable results
  • Supports research applications

πŸ‘¨β€πŸ’» Author

Samuel Campozano Lopez

  • πŸŽ“ ML Engineer & Software Developer
  • πŸ₯ Healthcare Technology Specialist
  • πŸ”¬ Clinical Data Science Researcher

Built as part of an institutional healthcare technology project and professional ML engineering portfolio.

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • Clinical validation provided by healthcare professionals
  • SCQ questionnaire developed by clinical researchers
  • Open source community for excellent ML tools
  • FastAPI team for the outstanding framework

πŸ“ž Support

For questions, issues, or collaboration opportunities:

  • Issues: Use GitHub Issues for bug reports
  • Discussions: Use GitHub Discussions for questions
  • Contact: samuelco860@gmail.com ⭐ If this project helps your work, please consider giving it a star!