Automated neural network architecture discovery using Differential Evolution. Achieved 69.91% accuracy on UCI Adult dataset, beating random search baseline by +0.88% and discovering a novel efficiency-complexity tradeoff.
- Live Interactive Demo: try it now!
- Full Colab Notebook: run the experiments yourself
- Best Architecture: `[21, 48, 11]` (3-layer hourglass pattern)
- Test Accuracy: 69.91% ± 0.12%
- Improvement over Baseline: +2.67% (67.24% → 69.91%)
- Beat Random Search: +0.88% with same computational budget
- Search Time: 33.6 minutes on Tesla T4 GPU
Through systematic ablation, I discovered that single-trial evaluation outperforms multi-trial averaging at short search horizons:
| Configuration | Accuracy | Time | Result |
|---|---|---|---|
| Single-trial | 70.12% | 1,108s | Optimal |
| Full system (multi-trial) | 69.88% | 2,021s | −0.24%, ~82% more time |
Insight: At short horizons (≤8 generations), population diversity creates more variance than random initialization. Multi-trial averaging adds overhead without reducing overall noise.
Impact: 2x speedup for rapid prototyping without accuracy loss.
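The tradeoff behind this finding can be illustrated with a toy simulation (all numbers below are synthetic, not the project's measurements): averaging k noisy trials shrinks the estimator's standard deviation by roughly √k, but multiplies evaluation cost by k. The project's result is that, at short horizons, that variance reduction is too small to repay its cost.

```python
import numpy as np

rng = np.random.default_rng(0)
true_acc, trial_noise = 0.70, 0.005  # synthetic accuracy and per-trial noise

def estimate(k):
    """Score one architecture as the mean of k noisy training trials."""
    return rng.normal(true_acc, trial_noise, size=k).mean()

stds = {}
for k in (1, 3, 5):
    scores = [estimate(k) for _ in range(2000)]
    stds[k] = float(np.std(scores))
    print(f"k={k}: estimator std={stds[k]:.4f}, evaluation cost={k}x")
```

The standard deviation falls like 1/√k while cost grows linearly in k, which is why multi-trial averaging only pays off when the search runs long enough for that extra precision to matter.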
Visit the Live Demo to explore results interactively.
# Clone repository
git clone https://github.com/omar-camara/nas-differential-evolution.git
cd nas-differential-evolution
# Install dependencies
pip install -r requirements.txt
# Option 1: Run in Colab (recommended)
# Open notebooks/Neural_Architecure_Search_DE.ipynb in Google Colab
# Option 2: Run locally (requires GPU)
python -m notebooks.Neural_Architecure_Search_DE

from src import EnhancedNASEngine
import numpy as np
# Create dummy data
X_train = np.random.randn(1000, 80)
X_test = np.random.randn(200, 80)
y_train = np.random.randint(0, 2, (1000, 1))
y_test = np.random.randint(0, 2, (200, 1))
# Run mini search
nas = EnhancedNASEngine(X_train, X_test, y_train, y_test, budget=60, max_layers=3)
nas.run_differential_evolution(pop_size=5, max_generations=3)
print(f"Best architecture: {nas.best_architecture}")
print(f"Best accuracy: {nas.best_accuracy:.4f}")

| Method | Best Architecture | Accuracy | Evaluations |
|---|---|---|---|
| Differential Evolution | [21, 48, 11] | 69.91% | 144 |
| Random Search | [25, 35, 20] | 69.03% | 144 |
| Advantage | - | +0.88% | Same budget |
Conclusion: Guided evolutionary search outperforms random exploration with identical computational cost.
Systematic component removal to measure impact:
| Configuration | Accuracy | Time | Finding |
|---|---|---|---|
| Single Trial | 70.12% | 1,108s | Best |
| Minimal (No Features) | 69.97% | 1,021s | Also efficient |
| Full System | 69.88% | 2,021s | Baseline |
| No Adaptive DE | 69.93% | 2,129s | Minimal impact |
| No LR Scheduler | 69.96% | 2,134s | Minimal impact |
Key Insight: At short search horizons, simpler evaluation strategies are superior.
Discovered Pattern: Hourglass
Layer 1: 21 neurons (compress)
Layer 2: 48 neurons (expand)
Layer 3: 11 neurons (compress)
This asymmetric design was automatically discovered and outperforms intuitive symmetric patterns.
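A minimal PyTorch sketch of the discovered `[21, 48, 11]` network, assuming 80 input features and 2 output classes as listed below (`build_mlp` is an illustrative helper, not the repository's API):

```python
import torch
import torch.nn as nn

def build_mlp(hidden_sizes, in_features=80, num_classes=2, dropout=0.2):
    """Stack Linear -> ReLU -> Dropout blocks, then a classification head."""
    layers, prev = [], in_features
    for width in hidden_sizes:
        layers += [nn.Linear(prev, width), nn.ReLU(), nn.Dropout(dropout)]
        prev = width
    layers.append(nn.Linear(prev, num_classes))
    return nn.Sequential(*layers)

# The discovered hourglass: compress (21) -> expand (48) -> compress (11)
model = build_mlp([21, 48, 11])
out = model(torch.randn(4, 80))
print(out.shape)  # torch.Size([4, 2])
```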
- Algorithm: Differential Evolution (DE/rand/1)
- Search Space: 1-4 layers, 80 neuron budget
- Population: 8 individuals
- Generations: 8 iterations
- Mutation Factor (F): 0.8 (adaptive)
- Crossover Rate (CR): 0.7
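The DE/rand/1 update with these F and CR settings can be sketched in NumPy (`de_rand_1_step` is an illustrative helper on a real-coded architecture vector; the actual engine lives in `src`):

```python
import numpy as np

rng = np.random.default_rng(0)

def de_rand_1_step(pop, i, F=0.8, CR=0.7):
    """One DE/rand/1 mutation + binomial crossover for individual i.

    pop: (pop_size, dim) array of real-coded layer widths.
    """
    candidates = [j for j in range(len(pop)) if j != i]
    r1, r2, r3 = rng.choice(candidates, size=3, replace=False)
    mutant = pop[r1] + F * (pop[r2] - pop[r3])     # DE/rand/1 mutation
    mask = rng.random(pop.shape[1]) < CR           # binomial crossover
    mask[rng.integers(pop.shape[1])] = True        # guarantee one mutant gene
    return np.where(mask, mutant, pop[i])

pop = rng.uniform(1, 80, size=(8, 3))  # 8 individuals, 3 layer widths each
trial = de_rand_1_step(pop, 0)
print(trial.shape)  # (3,)
```

In the full engine the trial vector replaces individual `i` only if its evaluated accuracy is at least as good (greedy selection).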
- Framework: PyTorch 2.0+ with CUDA
- Optimizer: Adam (lr=0.001, weight_decay=0.001)
- Loss: CrossEntropyLoss
- Regularization: Dropout (0.2), L2 (0.001)
- Early Stopping: Patience=10 epochs
- LR Scheduling: ReduceLROnPlateau
- Batch Size: 256
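One training step with the settings above might look like the following sketch (the two-layer model and random batch are placeholders; the real model comes from the search):

```python
import torch
import torch.nn as nn

# Placeholder model; the search supplies the real architecture.
model = nn.Sequential(nn.Linear(80, 21), nn.ReLU(), nn.Linear(21, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=0.001)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, patience=3)
criterion = nn.CrossEntropyLoss()

x = torch.randn(256, 80)           # one batch of 256 samples
y = torch.randint(0, 2, (256,))

optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # gradient clipping
optimizer.step()
scheduler.step(loss.item())        # in practice, step on validation loss
```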
- Name: UCI Adult Income
- Size: 39,073 training, 9,769 test samples
- Features: 80 (after preprocessing)
- Task: Binary classification (income >$50K)
- Preprocessing: StandardScaler + OneHotEncoder
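The StandardScaler + OneHotEncoder preprocessing can be combined in a single `ColumnTransformer`; the toy frame below stands in for the Adult data and its column names are illustrative:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder

# Toy stand-in for the Adult dataset (columns are illustrative)
df = pd.DataFrame({
    "age": [25, 38, 52],
    "hours_per_week": [40, 50, 60],
    "workclass": ["Private", "State-gov", "Private"],
})

pre = ColumnTransformer([
    ("num", StandardScaler(), ["age", "hours_per_week"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["workclass"]),
])
X = pre.fit_transform(df)
print(X.shape)  # (3, 4): 2 scaled numeric + 2 one-hot columns
```

One-hot encoding of the categorical columns is what expands the raw attributes to the 80 features used in the search.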
- GPU acceleration (10-50x speedup)
- Evaluation caching (avoids redundant training)
- Model checkpointing (saves progress every 5 generations)
- Gradient clipping (max_norm=1.0)
- Adaptive parameters (F adjusts based on success rate)
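Evaluation caching deserves a note because DE frequently revisits architectures. The idea reduces to memoizing the expensive evaluation on the architecture tuple; this sketch uses `functools.lru_cache` with a placeholder scoring function, not the repository's implementation:

```python
from functools import lru_cache

train_count = 0  # counts how many architectures are actually trained

@lru_cache(maxsize=None)
def evaluate(architecture):
    """Stand-in for the expensive train-and-evaluate step."""
    global train_count
    train_count += 1
    return sum(architecture) / 100.0  # placeholder "accuracy"

evaluate((21, 48, 11))
acc = evaluate((21, 48, 11))  # cache hit: no second training run
print(train_count, acc)  # 1 0.8
```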
- Purpose: Prove DE is optimizing, not just lucky
- Setup: Same evaluation budget (144 architectures)
- Result: DE found 0.88% better architecture
- Conclusion: Guided search > random exploration
- Purpose: Measure impact of each component
- Configurations: 5 systematic variations
- Key Finding: Single-trial optimal at short horizons
- Impact: 2x speedup for rapid prototyping
- Method: 5 independent trials per final evaluation
- Metrics: Mean, standard deviation, confidence intervals
- Result: 69.91% Β± 0.12% (reproducible)
nas-differential-evolution/
├── notebooks/
│   └── NAS_Complete_Notebook.ipynb     # Full implementation & experiments
├── deployment/
│   ├── app.py                          # Gradio interactive demo
│   └── requirements.txt                # Demo dependencies
├── results/
│   ├── comprehensive_report.json       # All experimental results
│   ├── search_results.png              # Main visualization
│   ├── ablation_study.png              # Ablation analysis
│   └── search_space_visualization.png  # t-SNE plot
├── docs/
│   └── STUDY_GUIDE.md                  # Comprehensive documentation
├── README.md                           # This file
├── requirements.txt                    # Project dependencies
├── .gitignore                          # Git ignore rules
└── LICENSE                             # MIT License
Theory suggested multi-trial averaging would improve stability. Empirical testing showed it hurt performance at short horizons. Lesson: Always validate assumptions with experiments.
Optimization features should match problem scale. Features that help at long horizons can hurt at short ones. Lesson: Don't optimize prematurely.
The ablation "failure" became the project's most interesting finding. Lesson: Unexpected results often teach more than expected ones.
- Colab Notebook: Fully reproducible experiments with detailed explanations
- Live Demo: Interactive exploration of results
Contributions welcome! Please:
- Fork the repository
- Create a feature branch (`git checkout -b feature/improvement`)
- Commit changes (`git commit -m 'Add improvement'`)
- Push to the branch (`git push origin feature/improvement`)
- Open a Pull Request
This project is licensed under the MIT License - see LICENSE file for details.
- Dataset: UCI Machine Learning Repository - Adult Income dataset
- Framework: PyTorch team for excellent deep learning tools
- Inspiration: Storn & Price (1997) - Differential Evolution algorithm
- Platform: Hugging Face for free hosting
Omar
MS Computer Science, Syracuse University
Graduate Teaching Assistant
- Email: omarcamara000@gmail.com
- LinkedIn: https://www.linkedin.com/in/oc18/
- Hugging Face: @Username273183
- Lines of Code: ~1,500
- Experiments Run: 500+ architecture evaluations
- GPU Hours: ~40 hours on Tesla T4
- Development Time: 2 weeks
- Key Finding: Efficiency-complexity tradeoff in evaluation strategy
- Multi-objective optimization (accuracy + model size)
- Extended search space (skip connections, batch normalization)
- Distributed evaluation across multiple GPUs
- Transfer learning initialization
- Additional datasets and benchmarks
If you use this work, please cite:
@software{omar_nas_2025,
author = {Omar},
title = {Neural Architecture Search with Differential Evolution},
year = {2025},
institution = {Syracuse University},
url = {https://github.com/omar-camara/nas-differential-evolution},
note = {Interactive demo: https://huggingface.co/spaces/Username273183/nas-differential-evolution}
}

If you find this project useful, please consider giving it a star!
Built with ❤️ using PyTorch, Differential Evolution, and a lot of GPU hours
Last updated: December 2025
