| Version | Supported |
|---|---|
| 1.1.x | ✅ |
| 1.0.x | ✅ |
| < 1.0 | ❌ |

If you discover a security vulnerability in DataCheck, please report it responsibly:
- Do not open a public GitHub issue for security vulnerabilities
- Email security concerns to: security@squrtech.com
- Include a detailed description of the vulnerability
- Provide steps to reproduce if possible

We will acknowledge receipt within 48 hours and provide a detailed response within 7 days.

When using DataCheck in production:
- Use environment variables for database connection strings
- Never commit credentials to version control
- Use IAM roles or managed identities when possible
- Rotate credentials regularly
- For cloud storage, prefer IAM roles over long-lived access keys
- Limit permissions to read-only for validation tasks
- Use temporary credentials where supported
- Review bucket/container permissions regularly
- Review plugin code before loading (plugins execute Python code)
- Only load plugins from trusted sources
- Use code scanning tools (bandit, ruff) on custom plugins
- Consider plugin approval workflows for teams
- Store validation configs in version control
- Use secrets management for credentials
- Run validation in isolated environments
- Review validation results before proceeding with deployments
- Be mindful of PII in validation error messages
- Avoid logging sensitive data values
- Use sampling to reduce data exposure
- Configure Slack notifications carefully (no sensitive data in alerts)
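
Two of the practices above, reading connection strings from environment variables and keeping sensitive values out of logs and alerts, can be sketched as follows. This is a minimal illustration, not part of DataCheck itself: the variable name `DATACHECK_DB_URL` and the helper functions are hypothetical.

```python
import os
import re

# Hypothetical variable name; use whatever your deployment defines.
ENV_VAR = "DATACHECK_DB_URL"


def get_connection_string(env=os.environ):
    """Read the connection string from the environment; fail loudly if unset."""
    url = env.get(ENV_VAR)
    if not url:
        raise RuntimeError(f"{ENV_VAR} is not set")
    return url


def redact_password(url):
    """Mask the password in a URL shaped like scheme://user:password@host/..."""
    return re.sub(r"(://[^:/@]+:)[^@]+(@)", r"\1***\2", url)


def mask_value(value, keep=2):
    """Keep the first `keep` characters of a value and mask the rest."""
    text = str(value)
    return text[:keep] + "*" * max(len(text) - keep, 0)
```

Log `redact_password(url)` rather than the raw connection string, and pass offending values through something like `mask_value` before they reach error messages or Slack alerts.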

DataCheck includes built-in security features:
- SQL Injection Protection: Parameterized queries for all database connectors
- Input Validation: Strict validation of configuration files and parameters
- Sandboxed Execution: Validation runs in process isolation (no remote code execution)
- Minimal Dependencies: Small attack surface with few required dependencies
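
For readers unfamiliar with parameterized queries, the sketch below shows the general technique using Python's standard `sqlite3` module. It illustrates why placeholders block injection; it is not DataCheck's connector code, and the table and function names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])


def count_by_name(conn, name):
    # The ? placeholder sends `name` as data; it is never spliced into the SQL text.
    row = conn.execute(
        "SELECT COUNT(*) FROM users WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


malicious = "alice' OR '1'='1"
print(count_by_name(conn, "alice"))    # 1
print(count_by_name(conn, malicious))  # 0: the whole string is compared as a name
```

Had the query been built with string formatting, the second call would have matched every row.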
Custom plugins execute arbitrary Python code. Only load plugins from trusted sources or after thorough code review. See plugin documentation for safe plugin development practices.
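
One possible shape for a plugin approval workflow is to record the SHA-256 digest of each reviewed plugin file and refuse to load anything whose digest is not on the list. The sketch below assumes plugins are single Python files; `sha256_of` and `is_approved` are hypothetical helpers, not DataCheck APIs.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of the file at `path`."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def is_approved(path, approved_digests):
    """True only if the file's digest was recorded after code review."""
    return sha256_of(path) in approved_digests
```

Any change to a plugin file, even a one-character edit, changes its digest, so the file has to be reviewed and re-approved before it passes the check again.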

DataCheck only reads from databases and never modifies data, so read-only database credentials are sufficient. Use them whenever possible.

DataCheck requires read access to cloud storage. Grant minimal permissions necessary for validation tasks.
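
As an illustration of minimal permissions on AWS, the sketch below builds a read-only S3 policy for a single bucket. It assumes that listing the bucket and reading its objects is all your validation needs; the bucket name is a placeholder, and other providers have their own equivalents.

```python
import json


def read_only_s3_policy(bucket):
    """Build an IAM policy document granting list and read access to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Reading applies to the objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


# "example-validation-data" is a placeholder bucket name.
print(json.dumps(read_only_s3_policy("example-validation-data"), indent=2))
```

The policy grants no write or delete actions, so a leaked credential cannot be used to alter the data being validated.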

DataCheck uses well-maintained open-source libraries:
- pandas (data processing)
- pyarrow (Parquet support)
- boto3 (AWS S3)
- google-cloud-storage (GCS)
- azure-storage-blob (Azure)
- psycopg2 (PostgreSQL)
- mysql-connector-python (MySQL)

We monitor security advisories for all dependencies and update promptly.

- Security patches are released as soon as possible
- Follow releases on GitHub: https://github.com/squrtech/datacheck/releases
- Subscribe to security advisories: https://github.com/squrtech/datacheck/security/advisories
- Update DataCheck regularly: `pip install --upgrade datacheck-cli`
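
To check whether an environment is running a supported release, a small script can compare the installed version against the supported series (1.0.x and 1.1.x at the time of writing). This is a sketch: it assumes the installed distribution is named `datacheck-cli`, as in the pip command above, and the helper names are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Supported (major, minor) series; keep in sync with the supported versions table.
SUPPORTED_SERIES = {(1, 0), (1, 1)}


def is_supported(version_string):
    """Return True if the version belongs to a supported major.minor series."""
    parts = version_string.split(".")
    try:
        major_minor = (int(parts[0]), int(parts[1]))
    except (IndexError, ValueError):
        return False
    return major_minor in SUPPORTED_SERIES


def installed_version(package="datacheck-cli"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None
```

A CI step could call `installed_version()` and fail the build when `is_supported` returns `False`.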