A comprehensive toolkit for processing Data Subject Access Requests (DSARs) under GDPR Article 15 and CCPA.
- 25 Vendor Processors: Pre-built processors for common SaaS platforms
- GDPR-Compliant Redaction: Automatic third-party data redaction per Article 15(4)
- Word Report Generation: Professional DSAR response documents
- Package Compilation: Combine multi-vendor reports with cover letters
- Generic Fallbacks: CSV and JSON processors for unlisted vendors
```bash
# Clone the repository
git clone <repository-url>
cd DSAR

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

Supported vendors:

- Slack
- HubSpot
- Salesforce
- Pipedrive
- Zendesk
- Intercom
- Freshdesk
- Jira
- Asana
- Trello
- Monday.com
- CharlieHR
- BambooHR
- Greenhouse
- Notion
- Google Workspace
- Microsoft 365
- Confluence
- GitHub
- GitLab
- Mailchimp
- Stripe
- Okta
- Generic JSON (fallback for any JSON export)
- Generic CSV (fallback for any CSV export)
The easiest way to use the DSAR Toolkit is through the web interface:
```bash
# Install dependencies (first time only)
pip install -r requirements.txt

# Start the web UI
streamlit run scripts/web_ui.py
```

This opens a browser interface where you can:
- Upload vendor export files
- Enter data subject information
- Select the vendor processor
- Download generated reports
- Compile multi-vendor DSAR packages
Alternatively, run individual processors from the command line:

```bash
cd scripts/communication

# Basic run
python slack_dsar.py ~/exports/slack_export.zip "John Smith" --email john@company.com

# With additional third-party names to redact
python slack_dsar.py export.zip "John Smith" --email john@company.com --redact "External Contact" "Vendor Name"
```

To compile a multi-vendor package:

```bash
cd scripts
python compile_package.py ./output "John Smith" --email john@company.com \
    --request-date "2025-01-15" --company "Your Company" \
    --dpo-name "Privacy Officer" --dpo-email dpo@company.com
```

Project structure:

```
DSAR/
├── scripts/
│ ├── core/
│ │ ├── __init__.py
│ │ ├── redaction.py # RedactionEngine class
│ │ ├── docgen.py # Document generation
│ │ └── utils.py # Utilities
│ ├── communication/
│ │ └── slack_dsar.py
│ ├── crm_sales/
│ │ ├── hubspot_dsar.py
│ │ ├── salesforce_dsar.py
│ │ └── pipedrive_dsar.py
│ ├── support/
│ │ ├── zendesk_dsar.py
│ │ ├── intercom_dsar.py
│ │ └── freshdesk_dsar.py
│ ├── project_mgmt/
│ │ ├── jira_dsar.py
│ │ ├── asana_dsar.py
│ │ ├── trello_dsar.py
│ │ └── monday_dsar.py
│ ├── hr_people/
│ │ ├── charliehr_dsar.py
│ │ ├── bamboohr_dsar.py
│ │ └── greenhouse_dsar.py
│ ├── productivity/
│ │ ├── notion_dsar.py
│ │ ├── google_workspace_dsar.py
│ │ ├── microsoft365_dsar.py
│ │ └── confluence_dsar.py
│ ├── dev_tools/
│ │ ├── github_dsar.py
│ │ └── gitlab_dsar.py
│ ├── marketing/
│ │ └── mailchimp_dsar.py
│ ├── finance/
│ │ └── stripe_dsar.py
│ ├── identity/
│ │ └── okta_dsar.py
│ ├── generic/
│ │ ├── generic_json_dsar.py
│ │ └── generic_csv_dsar.py
│ ├── compile_package.py
│ └── web_ui.py # Streamlit web interface
├── tests/
│ ├── test_redaction.py
│ ├── test_docgen.py
│ └── test_utils.py
├── output/ # Generated reports (gitignored)
│ └── internal/ # Redaction keys (DO NOT SEND)
├── requirements.txt
├── pyproject.toml
└── README.md
```
Each processor finds the data subject in the export by matching name and/or email.
Per GDPR Article 15(4), third-party personal data is replaced with placeholders:
- `[User 1]`, `[User 2]` - Other users
- `[Bot 1]`, `[Bot 2]` - Bot accounts
- `[External 1]` - External contacts
- `[Email 1]`, `[Phone 1]` - Standalone PII
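For illustration only, here is a minimal sketch of how this kind of placeholder substitution can work; the class and method names below are assumptions for the example, not the toolkit's actual `RedactionEngine` API.

```python
import re

class SimpleRedactor:
    """Illustrative stand-in for a redaction engine (not the real API)."""

    def __init__(self, data_subject_name):
        self.data_subject_name = data_subject_name
        self.mapping = {}   # real name -> placeholder
        self.counter = 0

    def placeholder_for(self, name):
        # The data subject's own name is never redacted
        if name == self.data_subject_name:
            return name
        if name not in self.mapping:
            self.counter += 1
            self.mapping[name] = f"[User {self.counter}]"
        return self.mapping[name]

    def redact_text(self, text):
        # Replace every registered third-party name with its placeholder
        for name, placeholder in self.mapping.items():
            text = re.sub(re.escape(name), placeholder, text)
        return text

redactor = SimpleRedactor("John Smith")
for other in ("Jane Doe", "Deploy Bot"):
    redactor.placeholder_for(other)
print(redactor.redact_text("Jane Doe pinged John Smith"))  # "[User 1] pinged John Smith"
```

The real engine additionally distinguishes bots, external contacts, and standalone PII, and persists its name-to-placeholder mapping as the redaction key described below.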
Each processor produces three outputs:

- Word Document: Professional report with profile data and activity records
- JSON Export: Machine-readable data for integration
- Redaction Key: Internal document mapping placeholders to real names (DO NOT send to data subject)
compile_package.py combines reports from multiple vendors into a single ZIP with:
- Cover letter
- All vendor reports
- JSON manifest
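As a rough illustration, the manifest records which vendor reports are included; the field names and timestamp format below are assumptions, not the toolkit's documented schema.

```json
{
  "data_subject": "John Smith",
  "email": "john@company.com",
  "request_date": "2025-01-15",
  "reports": [
    {"vendor": "Slack", "file": "Slack_DSAR_John_Smith_20250115_103000.docx"},
    {"vendor": "HubSpot", "file": "HubSpot_DSAR_John_Smith_20250115_103200.docx"}
  ]
}
```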
All processors accept these standard arguments:
| Argument | Description |
|---|---|
| `export_path` | Path to the vendor export file |
| `data_subject_name` | Name of the data subject |
| `--email`, `-e` | Email of the data subject (recommended) |
| `--redact`, `-r` | Additional names to redact |
| `--output`, `-o` | Output directory (default: `./output`) |
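For example, a typical invocation combining these arguments might look like the following; the Zendesk export file name is illustrative, and the expected export format is whatever that vendor's processor documents.

```bash
cd scripts/support
python zendesk_dsar.py ~/exports/zendesk_tickets.json "John Smith" \
    --email john@company.com \
    --redact "Jane Doe" "Acme Ltd" \
    --output ./output
```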
For each processor run:
- `{Vendor}_DSAR_{Name}_{Timestamp}.docx` - Word report
- `{Vendor}_DSAR_{Name}_{Timestamp}.json` - JSON export
- `internal/{Vendor}_REDACTION_KEY_{Name}_{Timestamp}.json` - Redaction mapping
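A Slack run for "John Smith", for instance, would leave files along these lines in the output directory (the timestamp format shown is illustrative):

```
output/
├── Slack_DSAR_John_Smith_20250115_103000.docx
├── Slack_DSAR_John_Smith_20250115_103000.json
└── internal/
    └── Slack_REDACTION_KEY_John_Smith_20250115_103000.json
```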
If multiple users match the data subject name, the processor will:
- Raise an error listing all matches
- Require the `--email` flag to disambiguate
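In that case, rerun the processor with the data subject's email address, for example (illustrative command; the exact error text varies by processor):

```bash
python slack_dsar.py export.zip "John Smith" --email john.smith@company.com
```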
```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=scripts/core

# Run specific test file
pytest tests/test_redaction.py -v
```
Security notes:

- Redaction Keys: Files in `output/internal/` contain the mapping between placeholders and real names. These are for internal audit only and MUST NOT be sent to the data subject.
- Export Files: Vendor exports may contain sensitive data. Handle according to your data protection policies.
- Output Files: DSAR responses contain personal data. Transmit securely to the data subject.
- Create a new processor in the appropriate category directory
- Implement these functions:
  - `find_data_subject(data, name, email)` - Locate the data subject
  - `extract_users(data)` - Get all users for redaction
  - `extract_profile(data_subject)` - Get profile data
  - `extract_records(data, data_subject_id)` - Get activity records
  - `process(...)` - Main entry point
- Use the generic processors as templates (a minimal skeleton is sketched below)
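Purely as an illustrative sketch of that structure; the function bodies, the assumed "users"/"records" keys in the export, and the hand-off to the core helpers are assumptions, not the toolkit's actual code.

```python
# my_vendor_dsar.py - illustrative skeleton only; real processors in this
# repository may structure these steps differently.
import json

def find_data_subject(data, name, email=None):
    """Locate the data subject among exported users by name and/or email."""
    matches = [
        u for u in data.get("users", [])  # "users" key is an assumption
        if u.get("name") == name or (email and u.get("email") == email)
    ]
    if len(matches) != 1:
        raise ValueError(f"Expected exactly one match for {name}, found {len(matches)}")
    return matches[0]

def extract_users(data):
    """Return every user in the export so third parties can be redacted."""
    return data.get("users", [])

def extract_profile(data_subject):
    """Return the data subject's own profile fields."""
    return {key: data_subject.get(key) for key in ("id", "name", "email")}

def extract_records(data, data_subject_id):
    """Return activity records belonging to the data subject."""
    return [r for r in data.get("records", []) if r.get("user_id") == data_subject_id]

def process(export_path, name, email=None, redact=None, output="./output"):
    """Main entry point: load the export, extract data, then hand off to the
    core redaction and document-generation helpers."""
    with open(export_path, encoding="utf-8") as fh:
        data = json.load(fh)
    subject = find_data_subject(data, name, email)
    profile = extract_profile(subject)
    records = extract_records(data, subject.get("id"))
    # ...redact third parties (scripts/core/redaction.py) and write the
    # Word/JSON reports (scripts/core/docgen.py) here...
    return profile, records
```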
MIT License - see LICENSE for details.
[Contributing guidelines here]