HumanML is a human-centered machine learning library designed to simplify the machine learning workflow while providing professional results. The library automates data preprocessing, model selection, training, evaluation, and reporting, making machine learning accessible to users of all skill levels.
- Simplified Workflow: Complete machine learning pipeline in just a few lines of code
- Intelligent Adaptivity: Automatically adapts to your dataset characteristics
- Reinforcement Learning Optimization: Uses RL to find optimal hyperparameters
- Professional Reports: Generates comprehensive PDF reports
- Interactive Visualizations: Easily visualize model performance and insights
- Human-Centered Design: Clear, stepwise progress and intuitive interface
pip install humanml

Or install from source:

pip install -e .

from humanml import HumanML
import pandas as pd
from sklearn.datasets import load_iris
# Load data
data = load_iris()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = pd.Series(data.target)
# Initialize HumanML
model = HumanML()
# Fit model
model.fit(X, y)
# Make predictions
predictions = model.predict(X)
# Visualize results
model.plot()

HumanML follows a 5-step process:
- Data preprocessing and feature engineering
- Model selection and hyperparameter tuning
- Model training and evaluation
- Model explanation and visualization
- Report generation and model export
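
The first three steps above can be approximated with plain scikit-learn. The sketch below is not HumanML's implementation: the two candidate models, their parameter grids, and all variable names are assumptions chosen for illustration. A printed summary stands in for the explanation and reporting steps.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Hypothetical candidate models and grids (not HumanML's actual search space)
candidates = {
    "logistic_regression": (
        LogisticRegression(max_iter=1000),
        {"model__C": [0.1, 1.0, 10.0]},
    ),
    "random_forest": (
        RandomForestClassifier(random_state=42),
        {"model__n_estimators": [50, 100]},
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Step 1: preprocessing lives in the same pipeline as the model
    pipeline = Pipeline([("scale", StandardScaler()), ("model", estimator)])
    # Step 2: hyperparameter tuning with 5-fold cross-validation
    search = GridSearchCV(pipeline, grid, cv=5)
    # Step 3: training, then evaluation on the held-out test split
    search.fit(X_train, y_train)
    results[name] = {
        "cv_score": search.best_score_,
        "test_score": search.score(X_test, y_test),
        "estimator": search.best_estimator_,
    }

# Model selection: keep the candidate with the best cross-validated score
best_name = max(results, key=lambda n: results[n]["cv_score"])
best_model = results[best_name]["estimator"]

# Stand-in for steps 4 and 5: a plain-text summary instead of plots and a PDF
for name, info in results.items():
    print(f"{name}: cv={info['cv_score']:.3f} test={info['test_score']:.3f}")
print(f"Best model: {best_name}")
```

Selecting on the cross-validated score rather than the test score keeps the test split untouched until the final evaluation.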
- fit(X, y): Train models on your data
- predict(X): Make predictions with the best model
- predict_proba(X): Get probability predictions (classification only)
- plot(): Visualize model performance
- get_results(): Get detailed results dictionary
- get_best_model(): Get the best model and its name
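
As a rough illustration of how these methods fit together, the helper below gathers their outputs into one dictionary. It relies only on the method names listed above. The helper name `summarize_fitted_model`, the keys of the returned dictionary, and the way a non-classification model might signal that `predict_proba` is unavailable are all assumptions, not documented HumanML behavior.

```python
def summarize_fitted_model(model, X_sample):
    """Collect outputs of the documented accessor methods into one dict.

    `model` is any fitted object exposing predict, predict_proba,
    get_results, and get_best_model (for example a fitted HumanML instance).
    """
    summary = {
        "predictions": list(model.predict(X_sample)),
        # get_results() is documented as returning a dictionary
        "result_keys": sorted(model.get_results().keys()),
        # Returned as-is: the order of (model, name) is not specified here
        "best_model": model.get_best_model(),
    }
    # predict_proba is classification-only; which exception a regression
    # model would raise is an assumption, so several are caught
    try:
        summary["probabilities"] = model.predict_proba(X_sample)
    except (AttributeError, NotImplementedError, ValueError):
        summary["probabilities"] = None
    return summary
```

With the fitted `model` from the quick-start example, `summarize_fitted_model(model, X.head())` would collect predictions and probabilities for the first five rows alongside the result keys.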
When initializing HumanML, you can customize its behavior:
model = HumanML(
    preference="speed",            # Options: "accuracy", "speed", "interpretability", "balanced"
    output_dir="humanml_output",   # Directory for outputs
    verbose=True,                  # Whether to print detailed information
    random_state=42,               # Random seed for reproducibility
    n_jobs=-1,                     # Number of parallel jobs (-1 for all cores)
    excluded_models=None,          # List of models to exclude
    included_models=None,          # List of models to include (overrides excluded_models)
    hyperparameter_tuning="auto",  # Options: "auto", "grid", "random", "bayesian", "rl", "none"
    cross_validation=5,            # Number of cross-validation folds
    test_size=0.2,                 # Proportion of data for testing
    validation_size=0.1,           # Proportion of training data for validation
    auto_report=True,              # Whether to automatically generate reports
    report_formats=["pdf"],        # Report formats to generate
)

from humanml import HumanML
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
# Load data
data = load_breast_cancer()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = pd.Series(data.target)
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Initialize HumanML
model = HumanML(preference="accuracy")
# Fit model
model.fit(X_train, y_train)
# Make predictions
predictions = model.predict(X_test)
# Evaluate model
from sklearn.metrics import accuracy_score
print(f"Accuracy: {accuracy_score(y_test, predictions)}")
# Visualize results
model.plot("confusion_matrix")
model.plot("roc_curve")
model.plot("feature_importance")

- Redesigned fit method to show only stepwise progress
- Integrated reinforcement learning for auto-parameter tuning
- Enhanced library adaptivity and smartness
- Changed report generation to PDF only
- Improved plot utilities for better visualization
- Added new plot() method for interactive visualization
- Added support for more models
- Improved preprocessing capabilities
- Enhanced report generation
- Initial release
MIT License