A collection of shell scripts for anonymising and processing JSON data using jq. These tools allow you to replace sensitive data with fake alternatives while preserving the original JSON structure.
- Prerequisites
- Scripts Overview
- Installation
- Core Scripts
- Fake Data Files
- Usage Examples
- Advanced Usage
- Troubleshooting
- bash (version 4.0 or later)
- jq (JSON processor) - Install jq
- grep, wc, tr (standard Unix utilities)
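
The prerequisites above can be checked before running any script. This is a minimal sketch and not part of the scripts themselves; `missing_tools` is a hypothetical helper name chosen for illustration.

```shell
# Print the name of every listed tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing=$(missing_tools bash jq grep wc tr)
if [ -n "$missing" ]; then
  echo "Missing prerequisites:" $missing
else
  # BASH_VERSINFO[0] holds the major version of the bash found on PATH.
  major=$(bash -c 'echo "${BASH_VERSINFO[0]}"')
  [ "$major" -ge 4 ] || echo "bash $major found, but 4.0 or later is needed"
fi
```

No output means every tool was found and bash is recent enough.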
| Script | Purpose | Use Case |
|---|---|---|
| anonymise | Generic property anonymisation | Replace any property values with fake data |
| anonymise_emails | Email-specific anonymisation | Replace email addresses using regex pattern matching |
| reduce | Array reduction | Limit the number of items in JSON arrays |
| filter | Array filtering | Filter arrays based on reference relationships |
- Clone or download the scripts to your desired directory
- Make scripts executable:
  chmod +x anonymise anonymise_emails reduce filter

The anonymise script replaces any property values in JSON with fake data from a text file.
bash anonymise INPUT_FILE OUTPUT_FILE PROPERTY_PATH FAKE_DATA_FILE

- INPUT_FILE: Path to input JSON file
- OUTPUT_FILE: Path to output JSON file (can be the same as the input for in-place editing)
- PROPERTY_PATH: JSON property path to anonymise
- FAKE_DATA_FILE: Text file containing fake replacement values (one per line)
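
How the script chooses a replacement for each value is not described here. The sketch below assumes the simplest strategy: the fake values are read one per line and assigned in order, wrapping around when the list runs out. The sample data and file names are hypothetical.

```shell
# Hypothetical sample data; the real fake-data files live in fakeData/.
workdir=$(mktemp -d)
printf '%s\n' 'Alex' 'Sam' > "$workdir/fake.txt"
printf '%s\n' '{"people": [{"FirstName": "John"}, {"FirstName": "Jane"}, {"FirstName": "Charlie"}]}' > "$workdir/input.json"

# Load the fake values as one string, split it into lines, drop empty lines,
# then give the i-th person the (i mod N)-th fake name.
anonymised=$(jq -c --arg fake "$(cat "$workdir/fake.txt")" '
  ($fake | split("\n") | map(select(length > 0))) as $names
  | .people |= [ range(0; length) as $i
                 | .[$i]
                 | .FirstName = $names[$i % ($names | length)] ]
' "$workdir/input.json")

echo "$anonymised"
rm -rf "$workdir"
```

With two fake names and three people, the third person receives the first fake name again. The JSON structure is otherwise unchanged.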
PROPERTY_PATH can take three forms:

- Nested properties in objects with arrays:
  bash anonymise data.json output.json people.FirstName fakeData/firstNames.txt
  For JSON like:
  {"people": [{"FirstName": "John"}, {"FirstName": "Jane"}]}
- Direct arrays of objects:
  bash anonymise data.json output.json name fakeData/firstNames.txt
  For JSON like:
  [{"name": "John"}, {"name": "Jane"}]
- Simple property arrays:
  bash anonymise data.json output.json names fakeData/firstNames.txt
  For JSON like:
  {"names": ["John", "Jane", "Charlie"]}
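
To see which values each form refers to, the three shapes above can be queried directly with jq. The filters below are hand-written equivalents for illustration; the mapping from PROPERTY_PATH to a jq filter inside the script is an assumption.

```shell
# Shape 1: property inside an array nested under a key (people.FirstName).
nested=$(printf '%s' '{"people": [{"FirstName": "John"}, {"FirstName": "Jane"}]}' \
  | jq -c '[.people[].FirstName]')

# Shape 2: property of objects in a top-level array (name).
direct=$(printf '%s' '[{"name": "John"}, {"name": "Jane"}]' \
  | jq -c '[.[].name]')

# Shape 3: array of plain strings under a key (names).
simple=$(printf '%s' '{"names": ["John", "Jane", "Charlie"]}' \
  | jq -c '[.names[]]')

printf '%s\n' "$nested" "$direct" "$simple"
```

Each command prints the list of values that would be candidates for replacement.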
# Anonymise people first names
bash anonymise input.json output.json people.FirstName fakeData/firstNames.txt
# Anonymise last names in-place
bash anonymise data.json data.json people.LastName fakeData/lastNames.txt
# Anonymise email addresses
bash anonymise input.json output.json emails.EMailAddress fakeData/emails.txt
# Anonymise a flat array structure
bash anonymise flat.json flat.json name fakeData/firstNames.txt

The anonymise_emails script is specifically designed for anonymising email addresses using regex pattern matching.
bash anonymise_emails INPUT_FILE OUTPUT_FILE

- INPUT_FILE: Path to input JSON file
- OUTPUT_FILE: Path to output JSON file
- Uses a regex pattern to find email addresses anywhere in the JSON
- Automatically uses fakeData/emails.txt for replacements
- Provides a detailed summary of the anonymisation
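
The behaviour listed above could be approximated in plain jq as follows. This is a sketch under assumptions: the regex is a simplified email pattern, not necessarily the one the script uses, and the input and replacement pool are hypothetical.

```shell
# Hypothetical input and replacement pool, for illustration only.
input='{"users":[{"email":"john@corp.example","note":"contact jane.doe@corp.example or john@corp.example"}]}'
pool=$(printf '%s\n' 'user1@example.com' 'user2@example.com')

result=$(printf '%s' "$input" | jq -c --arg fake "$pool" '
  # Simplified email pattern (an assumption, not the regex used by the script).
  "[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}" as $re
  | ($fake | split("\n") | map(select(length > 0))) as $pool
  # Distinct addresses found in any string value, longest first so that an
  # address contained in a longer one is not replaced prematurely.
  | ([.. | strings | scan($re)] | unique | sort_by(length) | reverse) as $found
  # Replace each address literally (split/join) with a pool entry.
  | reduce range(0; $found | length) as $i (.;
      (.. | strings) |= (split($found[$i]) | join($pool[$i % ($pool | length)])))
')

echo "$result"
```

Because the same original address always maps to the same pool entry, repeated occurrences stay consistent across the document.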
bash anonymise_emails input.json anonymised_output.json

The reduce script limits the number of items in specified JSON arrays.
bash reduce INPUT_FILE OUTPUT_FILE PROPERTY_PATH LIMIT

- INPUT_FILE: Path to input JSON file
- OUTPUT_FILE: Path to output JSON file
- PROPERTY_PATH: Path to the array property (supports dot notation)
- LIMIT: Maximum number of items to keep (positive integer)
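
The effect of PROPERTY_PATH and LIMIT can be reproduced with a jq slice. The sketch below assumes the dot-notation path is split on `.` and that the first LIMIT items are kept; `limit_array` is a hypothetical helper, not the script itself.

```shell
# limit_array PATH LIMIT: hypothetical helper; reads JSON on stdin.
limit_array() {
  jq -c --arg path "$1" --argjson limit "$2" '
    # Turn "data.people" into the jq path ["data", "people"].
    ($path | split(".")) as $p
    # Keep only the first $limit items of the array at that path.
    | getpath($p) |= .[:$limit]
  '
}

printf '%s' '{"data":{"people":[1,2,3,4,5]}}' | limit_array data.people 2
```

A limit larger than the array length leaves the array unchanged.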
# Keep only first 50 people
bash reduce input.json output.json people 50
# Keep only 10 items from nested property
bash reduce input.json output.json data.people 10
# Reduce in-place
bash reduce large_file.json large_file.json records 100

The filter script filters one array based on its relationships with another array.
bash filter INPUT_FILE OUTPUT_FILE FILTER_PATH REFERENCE_PATH

- INPUT_FILE: Path to input JSON file
- OUTPUT_FILE: Path to output JSON file
- FILTER_PATH: Path to array and property to filter (e.g., addresses.person_id)
- REFERENCE_PATH: Path to reference array and property (e.g., people.id)
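
For the example paths addresses.person_id and people.id, the filtering rule can be sketched in jq as shown below. The paths are hard-coded for clarity and `keep_referenced` is a hypothetical name; the real script accepts the paths as arguments.

```shell
# keep_referenced: hypothetical helper; reads JSON on stdin.
keep_referenced() {
  jq -c '
    # Collect every id present in the reference array.
    [.people[].id] as $ids
    # Keep only addresses whose person_id equals one of those ids.
    | .addresses |= map(select(.person_id as $pid | any($ids[]; . == $pid)))
  '
}

printf '%s' '{"people":[{"id":1},{"id":2}],"addresses":[{"person_id":1,"city":"London"},{"person_id":3,"city":"Paris"},{"person_id":2,"city":"Tokyo"}]}' \
  | keep_referenced
```

The address with person_id 3 is dropped because no person has that id. This is useful after reduce has shortened the people array and left orphaned records elsewhere.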
# Keep only addresses that have matching person IDs in people array
bash filter input.json output.json addresses.person_id people.id

The fakeData/ directory contains replacement data:
- fakeData/firstNames.txt: Contains 40 common first names for anonymisation.
- fakeData/lastNames.txt: Contains 40 common last names for anonymisation.
- fakeData/emails.txt: Contains 40 safe fake email addresses for email anonymisation.
To add your own fake data, create text files with one item per line:
# Example: fakeData/cities.txt
New York
London
Tokyo
Paris
Sydney

Then use it with the anonymise script:
bash anonymise data.json output.json addresses.city fakeData/cities.txt

All scripts support in-place editing by using the same file for input and output:
bash anonymise data.json data.json people.FirstName fakeData/firstNames.txt
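
When jq is called by hand, redirecting its output straight onto the input file truncates that file before jq reads it. A common way around this, and possibly what the scripts do internally (an assumption), is to write to a temporary file and move it into place afterwards. `edit_in_place` is a hypothetical helper.

```shell
# edit_in_place FILE JQ_ARGS...: hypothetical helper.
edit_in_place() {
  file=$1
  shift
  tmp=$(mktemp) || return 1
  if jq "$@" "$file" > "$tmp"; then
    mv "$tmp" "$file"   # replace the original only after jq succeeded
  else
    rm -f "$tmp"        # leave the original untouched on failure
    return 1
  fi
}

demo=$(mktemp)
printf '%s\n' '{"records":[1,2,3]}' > "$demo"
edit_in_place "$demo" -c '.records |= .[:2]'
cat "$demo"
rm -f "$demo"
```

Note that the replaced file takes on the permissions of the temporary file, which may be stricter than the original.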
"Command not found" errors
# Make scripts executable
chmod +x anonymise anonymise_emails reduce filter

"jq: command not found"
Install jq
Test your JSON structure first:
# Check if JSON is valid
jq empty input.json
# Check property exists
jq '.people[0].FirstName' input.json
# Check array structure
jq 'type' input.json

Useful jq commands for debugging manually:
# Test property extraction
jq '.people[].FirstName' input.json
# Test unique values
jq '[.people[].FirstName] | unique' input.json
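
The manual checks above can be combined into one pre-flight step. This is a sketch, not part of the scripts; `preflight` is a hypothetical helper that takes a file and a jq filter written by hand.

```shell
# preflight FILE JQ_FILTER: hypothetical helper, not part of the scripts.
preflight() {
  # Step 1: the file must parse as JSON.
  jq empty "$1" 2>/dev/null || { echo "invalid JSON: $1"; return 1; }
  # Step 2: the filter must run and yield at least one non-null value.
  count=$(jq "[$2 | values] | length" "$1" 2>/dev/null) \
    || { echo "filter failed: $2"; return 1; }
  [ "$count" -gt 0 ] || { echo "no values match: $2"; return 1; }
  echo "$count value(s) match $2"
}

sample=$(mktemp)
printf '%s\n' '{"people":[{"FirstName":"John"},{"FirstName":"Jane"}]}' > "$sample"
preflight "$sample" '.people[].FirstName'
rm -f "$sample"
```

A non-zero exit status signals that the file or the property path should be fixed before anonymising.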