- Create and activate a virtual environment:
python -m venv .venv
source .venv/bin/activate
- Install dependencies:
pip install -r requirements.txt
- Download the dataset:
bash download_script.sh
- Run the project:
python main.py
We provide a specialized script (scripts/apply_distortions.py) to simulate real-world transmission and capture artifacts on the deepfake dataset.
What it does by default:
- Type 1 (Compression): Simulates WhatsApp/Instagram image degradation by re-encoding images to a lower quality JPEG in-memory.
- Type 2 (Moiré Effect): Simulates a digital camera taking a picture of a screen by dynamically generating and blending a subtle, randomized Moiré interference pattern over the image.
- Type 3 (Both): Applies the Moiré effect first (capture simulation), then the compression (transmission simulation).
- Multithreading: The script automatically utilizes all available CPU cores to process images rapidly.
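To make the Moiré step and the parallel processing more concrete, here is a minimal standard-library sketch. It is not the code in scripts/apply_distortions.py. The function names, the frequency and angle ranges, the blend strength, and the grayscale list-of-rows image format are all assumptions chosen for illustration. The pattern is the product of two sinusoidal gratings with slightly different frequency and orientation, which produces the slow interference bands seen when photographing a screen.

```python
import math
import os
import random
from concurrent.futures import ThreadPoolExecutor


def moire_pattern(width, height, rng):
    """Return a height x width grid of values in [0, 1] from two beating gratings."""
    # Hypothetical parameter ranges; the real script may randomize differently.
    freq = rng.uniform(0.4, 0.6)      # radians per pixel
    angle = rng.uniform(0.02, 0.08)   # small rotation of the second grating
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    grid = []
    for y in range(height):
        row = []
        for x in range(width):
            g1 = 0.5 + 0.5 * math.sin(freq * x)
            g2 = 0.5 + 0.5 * math.sin(freq * 1.05 * (x * cos_a + y * sin_a))
            row.append(g1 * g2)  # product of the two gratings, still in [0, 1]
        grid.append(row)
    return grid


def blend(image, pattern, alpha=0.15):
    """Alpha-blend a [0, 1] pattern over an 8-bit grayscale image (list of rows)."""
    return [
        [round((1 - alpha) * px + alpha * 255 * p) for px, p in zip(img_row, pat_row)]
        for img_row, pat_row in zip(image, pattern)
    ]


def distort(args):
    """Apply a seeded, randomized Moiré overlay to one image."""
    image, seed = args
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    return blend(image, moire_pattern(width, height, rng))


def distort_all(images, seed=0):
    """Process images in a thread pool sized to the CPU count (assumed strategy)."""
    jobs = [(img, seed + i) for i, img in enumerate(images)]
    with ThreadPoolExecutor(max_workers=os.cpu_count()) as pool:
        return list(pool.map(distort, jobs))
```

A low `alpha` keeps the overlay subtle, and seeding each image separately makes the randomized pattern reproducible. Note that in standard CPython, threads only give a real speedup when the heavy work happens inside an image library that releases the GIL; the pure-Python loops above would not benefit.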
1. Testing on a Subset (Recommended First Step)
Use the -n flag to limit the number of processed images (e.g., -n 3 for 3 images).
When testing, the script writes to dataset/tmp/ and saves two files per image, suffixed _original and _distorted, so you can visually compare the effect:
python scripts/apply_distortions.py 'dataset/raw/Data Set 2' 2 -n 3
2. Processing the Full Dataset
Run without the -n flag to process the entire directory. The script will save the results to dataset/cleaned/, preserving the exact folder structure and original file names so your data loaders continue to work seamlessly:
python scripts/apply_distortions.py 'dataset/raw/Data Set 2' 2
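The two output layouts can be sketched as a small path-mapping helper. This is an illustration only: `output_paths` and its `limit` parameter are hypothetical names, and the exact placement of the _original and _distorted suffixes, as well as the flat layout of dataset/tmp/, are assumptions rather than confirmed behavior of the script.

```python
from pathlib import Path


def output_paths(src, input_root, limit=None):
    """Map one source image to the path(s) its result would be written to."""
    src, input_root = Path(src), Path(input_root)
    if limit is not None:
        # Test mode (-n given): original and distorted copies side by side.
        tmp = Path("dataset/tmp")
        return [
            tmp / f"{src.stem}_original{src.suffix}",
            tmp / f"{src.stem}_distorted{src.suffix}",
        ]
    # Full run: same relative path and file name under dataset/cleaned/.
    return [Path("dataset/cleaned") / src.relative_to(input_root)]
```

Because the full run keeps each file's path relative to the input directory, a data loader that walks the folder tree only needs its root switched from the raw directory to dataset/cleaned/.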