Quick reference guide to all project files and their purposes.
| File | Description |
|---|---|
| `README.md` | Project overview and quick start |
| `QUICKSTART.md` | Detailed quick start guide with examples |
| `WP1_SUMMARY.md` | Complete WP1 implementation summary |
| `docs/WP1_DOCUMENTATION.md` | Comprehensive API reference and documentation |


| File | Description |
|---|---|
| `setup.py` | Package installation configuration |
| `requirements.txt` | Python dependencies |
| `verify_installation.py` | Installation verification script |


| File | Lines | Description |
|---|---|---|
| `__init__.py` | 20 | Package initialization and exports |
| `quality_metrics.py` | 900+ | Core quality metrics and registry system |
| `utils.py` | 450+ | Utility functions for quality analysis |

- 14 quality metric functions
- `QualityMetric` class (metric wrapper)
- `QualityMetricRegistry` class (central registry)
- `get_default_registry()` function
- `get_global_registry()` singleton
- Outlier detection (IQR, Z-score, Modified Z-score)
- Constraint validation (range, enum)
- Data profiling (missing patterns, cardinality, correlation)
- Normalization (min-max, z-score, robust)
- Data drift detection (KS test, Chi-squared)
- DataFrame validation
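
The three normalization schemes listed above differ only in which location and scale statistics they use. The following is a minimal sketch in plain Python, written independently of `utils.py`; the function names are hypothetical and are not InferQ's API, and the handling of constant input (mapping to 0.0) is an assumption.

```python
import statistics

def min_max_normalize(values):
    """Scale values linearly into [0, 1]; constant input maps to 0.0."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [0.0 if span == 0 else (v - lo) / span for v in values]

def z_score_normalize(values):
    """Center on the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [0.0 if std == 0 else (v - mean) / std for v in values]

def robust_normalize(values):
    """Center on the median and divide by the interquartile range (IQR)."""
    median = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return [0.0 if iqr == 0 else (v - median) / iqr for v in values]

# One extreme value (100.0) dominates min-max scaling but barely moves
# the median and IQR used by the robust variant.
data = [1.0, 2.0, 3.0, 4.0, 100.0]
scaled = min_max_normalize(data)
standardized = z_score_normalize(data)
robust = robust_normalize(data)
```

The robust variant is usually the safer choice when the column may contain outliers, because the median and IQR are insensitive to a few extreme values.
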

| File | Lines | Description |
|---|---|---|
| `test_quality_metrics.py` | 600+ | Tests for all quality metrics (14 test classes, 50+ tests) |
| `test_utils.py` | 200+ | Tests for utility functions (5 test classes, 15+ tests) |

- Completeness metrics
- Outlier detection
- Duplicate detection
- Consistency metrics
- Constraint validation
- Distribution metrics
- Timeliness metrics
- Accuracy metrics
- Registry system
- Custom metrics
- Utility functions

| File | Lines | Description |
|---|---|---|
| `basic_usage.py` | 200+ | Basic usage with sample data and common patterns |
| `custom_metrics.py` | 400+ | Advanced examples: custom metrics, registry extension |

- Creating sample datasets
- Computing basic metrics
- Using the registry
- Metrics with configuration
- Category-based computation
- Custom metric functions
- Extending the default registry
- Complex multi-column metrics
- Comprehensive data profiling

- `completeness` - Overall data completeness
- `column_completeness` - Column-specific completeness [config required]
- `outlier_rate` - Statistical outlier detection (IQR/Z-score/Modified Z-score)
- `duplicate_rate` - Duplicate row detection
- `key_uniqueness` - Key column uniqueness [config required]
- `format_consistency` - Format consistency validation [config required]
- `referential_integrity` - Foreign key validation [config required]
- `constraint_violation` - Multi-type constraint checking [config required]
- `type_validity` - Data type validation [config required]
- `distribution_skewness` - Statistical skewness
- `distribution_kurtosis` - Statistical kurtosis
- `freshness` - Timestamp-based data freshness [config required]
- `value_accuracy` - Ground truth comparison [config required]
- `overall_quality` - Weighted quality score
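
To make a few of these metrics concrete, here is a minimal sketch of what completeness, duplicate rate, and a weighted overall score could compute on a small table. It uses plain lists of dicts rather than a DataFrame, and the function names, the treatment of `None` as missing, and the weights are all assumptions for illustration; the definitions InferQ actually uses are in `docs/WP1_DOCUMENTATION.md`.

```python
def completeness(rows):
    """Fraction of cells that are not None across all rows and columns."""
    cells = [value for row in rows for value in row.values()]
    if not cells:
        return 1.0
    return sum(value is not None for value in cells) / len(cells)

def duplicate_rate(rows):
    """Fraction of rows that repeat an earlier row exactly."""
    if not rows:
        return 0.0
    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(sorted(row.items()))  # order-independent row fingerprint
        if key in seen:
            duplicates += 1
        seen.add(key)
    return duplicates / len(rows)

def weighted_quality(scores, weights):
    """Weighted mean of per-metric scores, each already in [0, 1]."""
    total = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total

rows = [
    {"id": 1, "city": "Oslo"},
    {"id": 2, "city": None},
    {"id": 1, "city": "Oslo"},
    {"id": 3, "city": "Bergen"},
]
scores = {
    "completeness": completeness(rows),        # 7 of 8 cells are filled
    "uniqueness": 1.0 - duplicate_rate(rows),  # 1 of 4 rows is a repeat
}
# Hypothetical weights: completeness counts twice as much as uniqueness.
overall = weighted_quality(scores, {"completeness": 2.0, "uniqueness": 1.0})
```
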

```bash
# Verify installation
python verify_installation.py

# Run basic example
python examples/basic_usage.py

# Run advanced example
python examples/custom_metrics.py

# Run all tests
pytest tests/

# Run tests with coverage
pytest tests/ --cov=inferq --cov-report=html

# Run specific test file
pytest tests/test_quality_metrics.py -v

# Run specific test class
pytest tests/test_quality_metrics.py::TestCompletenessMetrics -v
```

- Total Lines of Code: 3,500+
- Number of Metrics: 14
- Number of Categories: 9
- Test Classes: 19
- Test Cases: 65+
- Documentation Pages: 4 major docs
- Example Scripts: 2 comprehensive examples

- `README.md` - Overview
- `QUICKSTART.md` - Quick start
- `examples/basic_usage.py` - Basic example
- `verify_installation.py` - Verify setup

- `docs/WP1_DOCUMENTATION.md` - Complete API reference
- `examples/custom_metrics.py` - Advanced patterns
- `src/inferq/quality_metrics.py` - Source code
- `tests/` - Test examples

- `WP1_SUMMARY.md` - Implementation details
- `src/inferq/` - Source code structure
- `tests/` - Testing patterns
- `setup.py` - Package configuration

```
README.md
├── QUICKSTART.md (detailed guide)
├── WP1_SUMMARY.md (implementation summary)
└── docs/WP1_DOCUMENTATION.md (API reference)

setup.py
└── requirements.txt (dependencies)

src/inferq/
├── __init__.py (exports)
├── quality_metrics.py (core)
└── utils.py (helpers)

examples/
├── basic_usage.py (imports from src/inferq)
└── custom_metrics.py (imports from src/inferq)

tests/
├── test_quality_metrics.py (imports from src/inferq)
└── test_utils.py (imports from src/inferq)
```


...compute basic quality metrics:
- See: `examples/basic_usage.py`, `QUICKSTART.md`
- Use: `get_default_registry()`, `registry.compute()`

...create custom metrics:
- See: `examples/custom_metrics.py`, `docs/WP1_DOCUMENTATION.md`
- Use: `registry.register_function()`, `QualityMetric` class
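
The register-then-compute pattern behind custom metrics can be sketched without the package installed. `MiniRegistry` below is a hypothetical stand-in, not InferQ's `QualityMetricRegistry`; this index does not give the real signature of `registry.register_function()`, so treat the argument order and the keyword-based config here as assumptions and check `docs/WP1_DOCUMENTATION.md` before adapting it.

```python
class MiniRegistry:
    """Hypothetical stand-in for a metric registry; not InferQ's class."""

    def __init__(self):
        self._metrics = {}

    def register_function(self, name, func):
        """Store a metric function under a unique name."""
        if name in self._metrics:
            raise ValueError(f"metric {name!r} is already registered")
        self._metrics[name] = func

    def compute(self, name, data, **config):
        """Look up a metric by name and call it with optional config."""
        return self._metrics[name](data, **config)

def negative_rate(rows, column):
    """Custom metric: fraction of rows whose `column` value is below zero."""
    values = [row[column] for row in rows]
    return sum(v < 0 for v in values) / len(values) if values else 0.0

registry = MiniRegistry()
registry.register_function("negative_rate", negative_rate)

rows = [{"balance": 10}, {"balance": -5}, {"balance": 0}, {"balance": -1}]
rate = registry.compute("negative_rate", rows, column="balance")
```

Keeping metrics as plain functions keyed by name means a custom metric is called the same way as a built-in one, which is the main appeal of a registry design.
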

...validate constraints:
- See: `examples/basic_usage.py` (section 4)
- Use: `registry.compute('constraint_violation', df, constraints=[...])`
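
As a rough illustration of range and enum checking, the sketch below computes the fraction of rows that break at least one constraint. The constraint dictionaries (`column`, `type`, `min`, `max`, `values`) are a hypothetical schema chosen for this example; the format that `constraint_violation` actually expects is elided in the call above and is not reproduced here.

```python
def violates(value, constraint):
    """Return True if `value` breaks a single range or enum constraint."""
    if value is None:
        return False  # assumption: missing values are not counted as violations
    if constraint["type"] == "range":
        return not (constraint["min"] <= value <= constraint["max"])
    if constraint["type"] == "enum":
        return value not in constraint["values"]
    raise ValueError(f"unknown constraint type: {constraint['type']!r}")

def constraint_violation_rate(rows, constraints):
    """Fraction of rows that break at least one constraint."""
    if not rows:
        return 0.0
    bad = sum(
        any(violates(row.get(c["column"]), c) for c in constraints)
        for row in rows
    )
    return bad / len(rows)

constraints = [
    {"column": "age", "type": "range", "min": 0, "max": 120},
    {"column": "status", "type": "enum", "values": {"active", "inactive"}},
]
rows = [
    {"age": 34, "status": "active"},
    {"age": -2, "status": "active"},      # range violation
    {"age": 51, "status": "unknown"},     # enum violation
    {"age": None, "status": "inactive"},  # missing age is skipped
]
rate = constraint_violation_rate(rows, constraints)
```
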

...detect outliers:
- See: `docs/WP1_DOCUMENTATION.md`, `src/inferq/utils.py`
- Use: `registry.compute('outlier_rate', df)` or `detect_outliers_iqr()`
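
For readers unfamiliar with the IQR rule, here is a minimal standalone sketch of it. This is not the body of `detect_outliers_iqr()`; the function names, the multiplier `k=1.5`, and the inclusive quartile method are assumptions, and InferQ's version may differ in any of them.

```python
import statistics

def iqr_outlier_mask(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]

def outlier_rate(values, k=1.5):
    """Fraction of values flagged by the IQR rule."""
    mask = iqr_outlier_mask(values, k)
    return sum(mask) / len(mask)

# Q1 = 11 and Q3 = 12.25 here, so the fences are 9.125 and 14.125
# and only the reading of 95 falls outside them.
readings = [10, 12, 11, 13, 12, 11, 95, 12]
mask = iqr_outlier_mask(readings)
rate = outlier_rate(readings)
```
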

...profile data quality:
- See: `examples/custom_metrics.py` (example 4)
- Use: `registry.compute_all(df, metric_configs)`, `compute_data_profile()`
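
A data profile typically summarizes each column before any metric is interpreted. The sketch below computes two such summaries, missing rate and cardinality, on a list of dicts. It is an illustration only: `profile_columns` is a hypothetical name, and the fields returned by `compute_data_profile()` are not described in this index.

```python
def profile_columns(rows):
    """Per-column missing rate and distinct non-missing value count."""
    columns = sorted({name for row in rows for name in row})
    profile = {}
    for name in columns:
        values = [row.get(name) for row in rows]
        present = [v for v in values if v is not None]
        profile[name] = {
            "missing_rate": 1.0 - len(present) / len(values),
            "cardinality": len(set(present)),
        }
    return profile

rows = [
    {"id": 1, "country": "NO", "score": 0.9},
    {"id": 2, "country": "NO", "score": None},
    {"id": 3, "country": "SE", "score": 0.4},
    {"id": 4, "country": None, "score": None},
]
profile = profile_columns(rows)
```

A column whose cardinality equals the row count (like `id` here) is a candidate key, while a high missing rate flags columns that need a completeness check first.
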

...extend the framework:
- See: `examples/custom_metrics.py`, `docs/WP1_DOCUMENTATION.md`
- Use: Custom metric functions + `registry.register_function()`

...understand the code:
- See: `src/inferq/quality_metrics.py`, `docs/WP1_DOCUMENTATION.md`
- Start with: Docstrings and inline comments

...run tests:
- See: `tests/test_quality_metrics.py`, `tests/test_utils.py`
- Run: `pytest tests/` or `python verify_installation.py`

- API Documentation: `docs/WP1_DOCUMENTATION.md`
- Quick Start: `QUICKSTART.md`
- Examples: `examples/` directory
- Tests: `tests/` directory (reference implementations)
- Implementation Details: `WP1_SUMMARY.md`

Before using InferQ, verify:
- Dependencies installed (`pip install -r requirements.txt`)
- Verification passes (`python verify_installation.py`)
- Basic example works (`python examples/basic_usage.py`)
- Tests pass (optional: `pytest tests/`)

- Current Version: 0.1.0
- Work Package: WP 1 (Complete)
- Status: Production Ready
- Python Requirement: >=3.8

Last Updated: WP 1 Implementation Complete

Next: WP 2 - Multi-Target Quality-aware Discretization (MTQD)