bccfilkom/noventis

Noventis

Intelligent Automation for Your Data Analysis

PyPI version Python 3.8+ License: MIT

Website • GitHub • Gmail


🚀 Overview

Noventis is a Python library that automates the routine parts of a data analysis workflow. Built with data scientists and analysts in mind, it provides tools for automated exploratory data analysis, predictive modeling, and data cleaning, all with minimal code.

✨ Key Features

  • πŸ” EDA Auto - Automated exploratory data analysis with comprehensive visualizations and statistical insights
  • 🎯 Predictor - Intelligent ML model selection and training with automated hyperparameter tuning
  • 🧹 Data Cleaner - Smart data preprocessing and cleaning with advanced imputation strategies
  • ⚑ Fast & Efficient - Optimized for performance with large datasets
  • πŸ“Š Rich Visualizations - Beautiful, publication-ready charts and reports
  • πŸ”§ Highly Customizable - Fine-tune every aspect to match your needs

📦 Installation

Quick Installation

pip install noventis

Install from Source

git clone https://github.com/bccfilkom/noventis.git
cd noventis
pip install -e .

Verify Installation

import noventis
print(noventis.__version__)
noventis.print_info()  # Show detailed installation info

🎯 Quick Start

1️⃣ Data Cleaner

Get started with intelligent data preprocessing and cleaning.

import pandas as pd
from noventis.data_cleaner import AutoCleaner

# Load your data
df = pd.read_csv('your_data.csv')

# Automatic data cleaning
cleaner = AutoCleaner()
df_clean = cleaner.fit_transform(df)

# The cleaned data is ready for analysis!
print(df_clean.info())

👉 Read the Data Cleaner Guide

2️⃣ EDA Auto

Automatically generate comprehensive exploratory data analysis reports.

from noventis.eda_auto import EDAuto

# Create EDA report
eda = EDAuto(df_clean)

# Generate comprehensive analysis
eda.generate_report()

# Show specific analyses
eda.show_distributions()
eda.show_correlations()
eda.show_missing_patterns()

👉 Read the EDA Auto Guide

3️⃣ Predictor

Build and train machine learning models with automated optimization.

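from sklearn.model_selection import train_test_split
from noventis.predictor import PredictorAuto

# Prepare data and hold out a test set for evaluation
X = df_clean.drop('target', axis=1)
y = df_clean['target']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Automatic model training on the training split only
predictor = PredictorAuto()
predictor.fit(X_train, y_train, task='classification')

# Make predictions on the held-out split
predictions = predictor.predict(X_test)

# Get model performance
print(predictor.get_metrics())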

Read the Predictor Guide →

4️⃣ Complete Pipeline Example

import pandas as pd
from sklearn.model_selection import train_test_split
from noventis.data_cleaner import AutoCleaner
from noventis.eda_auto import EDAuto
from noventis.predictor import PredictorAuto

# 1. Load data
df = pd.read_csv('your_data.csv')

# 2. Clean data
cleaner = AutoCleaner()
df_clean = cleaner.fit_transform(df)

# 3. Explore data
eda = EDAuto(df_clean)
eda.generate_report()

# 4. Split data and train model
X = df_clean.drop('target', axis=1)
y = df_clean['target']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

predictor = PredictorAuto()
predictor.fit(X_train, y_train, task='classification')

# 5. Evaluate on the held-out split
print(f"Model Accuracy: {predictor.score(X_test, y_test):.2%}")

📚 Core Modules

🧹 Data Cleaner

Intelligent data preprocessing and cleaning with advanced strategies:

  • Missing Data Handling - Multiple imputation strategies (mean, median, KNN, iterative)
  • Outlier Treatment - Statistical and ML-based detection (IQR, Z-score, Isolation Forest)
  • Feature Scaling - Normalization and standardization techniques
  • Encoding - Automatic categorical variable encoding (One-Hot, Label, Target)
  • Data Type Detection - Intelligent type inference and conversion
  • Duplicate Removal - Smart duplicate detection and handling

Learn more →
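To make two of the techniques named above concrete, here is a minimal standard-library sketch of median imputation and IQR-based outlier flagging. It is not the Noventis implementation; the function names (`impute_median`, `iqr_bounds`, `flag_outliers`) and the sample data are hypothetical and only illustrate what these steps compute.

```python
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, k=1.5):
    """Mark each value that falls outside the IQR fences."""
    lower, upper = iqr_bounds(values, k)
    return [v < lower or v > upper for v in values]


# Hypothetical column with one missing value and one extreme value
ages = [23, 25, None, 27, 24, 26, 95]
filled = impute_median(ages)
flags = flag_outliers(filled)
print(filled)
print(flags)
```

Imputing before flagging means the filled-in value takes part in the quartile calculation; whether that ordering is appropriate depends on the dataset.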

πŸ” EDA Auto

Comprehensive exploratory data analysis automation:

  • Statistical Summary - Descriptive statistics for all features
  • Distribution Analysis - Histograms, KDE plots, and normality tests
  • Correlation Analysis - Heatmaps and correlation matrices
  • Missing Data Analysis - Visualization and patterns of missing values
  • Outlier Detection - Automatic identification of anomalies
  • Feature Relationships - Scatter plots and pairwise analysis

Learn more →
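As a rough illustration of what a missing-data summary and a correlation analysis compute, the sketch below uses only the Python standard library. It does not reflect how EDA Auto works internally; `missing_counts`, `pearson`, and the sample records are assumptions made for this example.

```python
import math


def missing_counts(rows):
    """Count None values per column for a list of dict records."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (1 if value is None else 0)
    return counts


def pearson(xs, ys):
    """Pearson correlation over the pairs where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical records; one weight is missing
rows = [
    {"height": 150, "weight": 50},
    {"height": 160, "weight": 60},
    {"height": 170, "weight": None},
    {"height": 180, "weight": 80},
]
heights = [r["height"] for r in rows]
weights = [r["weight"] for r in rows]

print(missing_counts(rows))
print(round(pearson(heights, weights), 3))
```

Dropping incomplete pairs before computing the correlation is one common choice; a report may instead impute first, which can change the result.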

🎯 Predictor

Automated machine learning with intelligent model selection:

  • Auto Model Selection - Automatically selects the best algorithm for your data
  • Hyperparameter Tuning - Optimizes model parameters using advanced search algorithms
  • Feature Engineering - Creates and selects relevant features automatically
  • Cross-Validation - Robust model evaluation with k-fold validation
  • Model Explainability - SHAP values and feature importance analysis
  • Ensemble Methods - Combines multiple models for better performance

Supported Algorithms:

  • Scikit-learn: Random Forest, Gradient Boosting, Logistic Regression, SVM
  • XGBoost: Extreme Gradient Boosting
  • LightGBM: Light Gradient Boosting Machine
  • CatBoost: Categorical Boosting
  • And many more...

Learn more →
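The k-fold cross-validation mentioned above can be sketched in plain Python as follows. This is an illustration of the technique only, not the Predictor's code: the helper names are hypothetical, and a majority-class baseline stands in for a real model.

```python
from collections import Counter


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop


def majority_class(labels):
    """Baseline 'model': always predict the most frequent training label."""
    return Counter(labels).most_common(1)[0][0]


def cross_val_accuracy(labels, k=5):
    """Average accuracy of the majority-class baseline across k folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        prediction = majority_class([labels[i] for i in train_idx])
        correct = sum(labels[i] == prediction for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


# Hypothetical binary labels
labels = [1, 1, 0, 1, 1, 0, 1, 1, 1, 0]
print(cross_val_accuracy(labels, k=5))
```

Each sample appears in exactly one test fold, so the averaged score uses every row for evaluation once while never scoring a row the baseline was fitted on.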


πŸ› οΈ Requirements

System Requirements

  • Python 3.8 or higher
  • 4GB RAM minimum (8GB+ recommended for large datasets)
  • Windows, macOS, or Linux

Core Dependencies

Noventis automatically installs these dependencies:

  • Data Processing: pandas, numpy, scipy
  • Visualization: matplotlib, seaborn
  • Machine Learning: scikit-learn, xgboost, lightgbm, catboost
  • AutoML: optuna, flaml, shap
  • Feature Engineering: category_encoders, statsmodels

See requirements.txt for complete list.
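If you want to confirm that the dependencies listed above are importable in the active environment, a small standard-library check along these lines may help. The helper `missing_modules` is hypothetical, and the module list simply mirrors the packages named above using their import names (for example, scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# Import names for the core dependencies listed above
CORE_MODULES = [
    "pandas", "numpy", "scipy",
    "matplotlib", "seaborn",
    "sklearn", "xgboost", "lightgbm", "catboost",
    "optuna", "flaml", "shap",
    "category_encoders", "statsmodels",
]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(CORE_MODULES)
if missing:
    print("Not importable:", ", ".join(missing))
else:
    print("All core dependencies were found.")
```

`find_spec` only locates a module without importing it, so this check stays fast and will not trigger import-time errors inside the packages themselves.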


🤝 Contributing

We welcome contributions from the community! Here's how you can help:

Ways to Contribute

  1. πŸ› Report Bugs - Found a bug? Open an issue
  2. πŸ’‘ Suggest Features - Have ideas? We'd love to hear them!
  3. πŸ“– Improve Documentation - Help us make the docs better
  4. πŸ”§ Submit Pull Requests - Fix bugs or add features

Development Setup

# Clone the repository
git clone https://github.com/bccfilkom/noventis.git
cd noventis

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in development mode (quotes keep shells such as zsh from expanding the brackets)
pip install -e ".[dev]"

# Run tests
pytest tests/

# Run linting
flake8 noventis/
black noventis/

See CONTRIBUTING.md for detailed guidelines.


👥 Contributors

This project exists thanks to all the people who contribute:

| Contributor           | Role                |
|-----------------------|---------------------|
| Richard               | Product Manager     |
| Fatoni Murfids        | AI Product Manager  |
| Ahmad Nafi Mubarok    | Lead Data Scientist |
| Orie Abyan Maulana    | Lead Data Analyst   |
| Grace Wahyuni         | Data Analyst        |
| Alexander Angelo      | Data Scientist      |
| Rimba Nevada          | Data Scientist      |
| Jason Surya Winata    | Frontend Engineer   |
| Nada Musyaffa Bilhaqi | Product Designer    |

Special Thanks

A huge thank you to the maintainers of our dependencies:

  • pandas, numpy, scikit-learn, and the entire Python scientific computing community
  • XGBoost, LightGBM, and CatBoost teams for excellent gradient boosting libraries
  • Optuna and FLAML teams for amazing AutoML frameworks

📂 Project Structure

The folder structure of the Noventis project:

.
├── 📁 dataset_for_examples/     # Sample datasets for testing
├── 📁 docs/                     # Documentation files
├── 📁 examples/                 # Example notebooks and scripts
├── 📁 noventis/                 # Main library code
│   ├── 📁 __pycache__/
│   ├── 📁 asset/               # Asset files (if any)
│   ├── 📁 core/                # Core functionality
│   ├── 📁 data_cleaner/        # Data cleaning module
│   │   ├── 📄 __init__.py
│   │   ├── 📄 auto.py
│   │   ├── 📄 data_quality.py
│   │   ├── 📄 encoding.py
│   │   ├── 📄 imputing.py
│   │   ├── 📄 orchestrator.py
│   │   ├── 📄 outlier_handling.py
│   │   └── 📄 scaling.py
│   ├── 📁 eda_auto/            # EDA automation module
│   │   ├── 📄 __init__.py
│   │   └── 📄 eda_auto.py
│   ├── 📁 predictor/           # Prediction module
│   │   ├── 📄 __init__.py
│   │   ├── 📄 auto.py
│   │   └── 📄 manual.py
│   └── 📄 __init__.py          # Main package init
├── 📁 noventis.egg-info/       # Package metadata
│   ├── 📄 dependency_links.txt
│   ├── 📄 PKG-INFO
│   ├── 📄 SOURCES.txt
│   └── 📄 top_level.txt
├── 📁 tests/                   # Unit tests
├── 📄 .gitignore               # Git ignore rules
├── 📄 LICENSE                  # MIT License
├── 📄 MANIFEST.in              # Package manifest
├── 📄 pyproject.toml           # Modern Python packaging config
├── 📄 README.md                # This file
├── 📄 requirements.txt         # Production dependencies
├── 📄 requirements-dev.txt     # Development dependencies
└── 📄 setup.py                 # Package setup script

📌 Notes

  • The noventis/ folder contains the main library code
  • The tests/ folder is dedicated to unit testing and integration testing
  • setup.py and pyproject.toml are used for packaging and distribution
  • requirements.txt lists the external dependencies needed for the project

🚀 With this structure, the project is ready for development, testing, and publishing on PyPI or GitHub.


🔧 Troubleshooting

Common Issues

Problem: ModuleNotFoundError: No module named 'noventis'

# Solution: Reinstall the package
pip uninstall noventis
pip install noventis

Problem: Dependency conflicts

# Solution: Create a fresh virtual environment
python -m venv fresh_env
source fresh_env/bin/activate  # On Windows: fresh_env\Scripts\activate
pip install noventis

Problem: Import errors after installation

# Solution: Verify installation
import noventis
print(noventis.__version__)
noventis.print_info()  # Check all dependencies
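
When `import noventis` itself fails, it can be useful to check whether the package is registered in the current environment without importing it. The sketch below uses the standard-library `importlib.metadata` module (Python 3.8+); the helper name `installed_version` is hypothetical, and the distribution name `noventis` is taken from the `pip install noventis` command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if pip has no record of it."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


found = installed_version("noventis")
if found is None:
    print("noventis is not installed in this environment")
else:
    print(f"noventis {found} is installed")
```

If this reports that the package is missing while `pip install noventis` claims success, `pip` and `python` are probably pointing at different environments.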

Getting Help


📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Third-Party Licenses

Noventis uses several open-source libraries. We are grateful to their maintainers:

  • Data Processing: pandas (BSD), numpy (BSD), scipy (BSD)
  • Visualization: matplotlib (PSF), seaborn (BSD)
  • Machine Learning: scikit-learn (BSD), xgboost (Apache 2.0), lightgbm (MIT), catboost (Apache 2.0)
  • AutoML: optuna (MIT), flaml (MIT), shap (MIT)
  • Feature Engineering: category_encoders (BSD), statsmodels (BSD)

All dependencies are licensed under permissive open-source licenses (BSD, MIT, Apache 2.0, PSF).


📚 Citation

If you use Noventis in your research, please cite:

@software{noventis2025,
  author = {Noventis Team},
  title = {Noventis: Intelligent Automation for Data Analysis},
  year = {2025},
  url = {https://github.com/bccfilkom/noventis}
}


Made with ❤️ by Noventis Team

If you find Noventis useful, please consider giving it a ⭐ on GitHub!
