Add GENIE attacks (model extraction + pruning) with PyG fallback, GCN link predictor, and demo examples #25
📋 Summary
This PR ports GENIE-style model extraction and pruning attacks into the PyGIP codebase and adds a small PyTorch Geometric (PyG) fallback so the project can run smoke demos on machines that do not have a working DGL installation. The goal is to make it straightforward for maintainers to run a quick demonstration of the GENIE attacks inside the PyGIP repo.
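The fallback itself is an optional-dependency guard. A minimal sketch of the idea, assuming a module-level try/except (the actual check in `pygip/datasets/datasets.py` may be structured differently):

```python
# Sketch of a DGL-optional import guard (assumption: the real guard in
# pygip/datasets/datasets.py may differ). When DGL is unavailable, the
# PyG-only loaders are used instead of the DGL dataset classes.
try:
    import dgl  # noqa: F401
    HAS_DGL = True
except ImportError:
    HAS_DGL = False
```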
Files added / changed (high-level)
- `attacks/genie_model_extraction.py` — GENIE-style model extraction attack, adapted to the repo's `BaseAttack` usage.
- `attacks/genie_pruning_attack.py` — pruning attack adapted to the repo's `BaseAttack`.
- `models/gcn_link_predictor.py` — minimal GCN link-prediction model used by the attacks and the demo trainer.
- `pygip/datasets/datasets.py` and `pygip/datasets/__init__.py` — PyG-only fallback dataset loaders: `load_ca_hepth`, `load_c_elegans`, and a `SimpleDataset` wrapper.
- `examples/train_small_predictor.py` — small trainer that saves a demo checkpoint (`examples/watermarked_model_demo.pth`).
- `examples/run_genie_experiments.py` — example script that runs extraction then pruning and prints metrics.

🧪 Related Issues
✅ Checklist
- [x] Documentation updated where needed (`docs/`).
- [x] Changes are on a feature branch (`feat/genie-watermark-ft`), not `main`.

🧠 Additional Context (Important — please read)
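For orientation: the attacks and the demo trainer share the minimal GCN link predictor from `models/gcn_link_predictor.py`. A sketch of the shape such a model typically takes, assuming PyG's `GCNConv` with a dot-product edge decoder (class and method names here are illustrative, not the repo's actual API):

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class GCNLinkPredictor(torch.nn.Module):
    """Two-layer GCN encoder + dot-product edge decoder (sketch only)."""

    def __init__(self, in_channels: int, hidden_channels: int = 64):
        super().__init__()
        self.conv1 = GCNConv(in_channels, hidden_channels)
        self.conv2 = GCNConv(hidden_channels, hidden_channels)

    def encode(self, x, edge_index):
        # Message passing over the observed graph to get node embeddings.
        h = F.relu(self.conv1(x, edge_index))
        return self.conv2(h, edge_index)

    def decode(self, z, edge_label_index):
        # Score a candidate edge (u, v) by the dot product of its endpoints.
        src, dst = edge_label_index
        return (z[src] * z[dst]).sum(dim=-1)
```

The PyG-only loaders (`load_ca_hepth`, `load_c_elegans`) feed graphs into an encoder like this; their exact return types are defined in `pygip/datasets/datasets.py`.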
Quick reproduction steps (exact commands)
From the repository root:
1. Train the small demo teacher (writes `examples/watermarked_model_demo.pth`). This prints the training loss and saves the checkpoint.
2. Run the extraction + pruning demo (`examples/run_genie_experiments.py`), which prints metrics for both attacks.
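Assuming both example scripts run with no required arguments (their exact CLIs live in the files listed above), the two steps would look like:

```bash
# Step 1: train the demo teacher; writes examples/watermarked_model_demo.pth
python examples/train_small_predictor.py

# Step 2: run extraction, then pruning, and print the AUC metrics
python examples/run_genie_experiments.py
```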
What I observed during local testing (so reviewers know what to expect)
- If no checkpoint is provided, extraction AUC is ~0.5 (random) and pruning AUC is also ~0.5 — expected, since the teacher is untrained.
- After training the small demo teacher (`examples/train_small_predictor.py`) and supplying that checkpoint:
  - `surrogate_test_auc` can increase (example observed ≈ 0.70 on the tiny demo teacher).
  - `test_auc` after pruning can be significantly > 0.5 depending on the demo checkpoint (observed ≈ 0.79 during local runs).

The current implementation is a smoke/demo implementation — it is not a full, large-scale reproduction of the GENIE paper experiments (no large hyperparameter sweeps, multiple seeds, or large dataset jobs included).
Important limitations & notes
- If your checkpoint's `state_dict` keys differ from what the demo expects, adapt the `models/gcn_link_predictor.py` loader to match your keys (a hedged sketch follows below).
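A hypothetical remapping sketch; the `encoder.` to `conv1.` prefix rename is an illustration only, so inspect your checkpoint's keys first and adjust the map to whatever the repo loader expects:

```python
import torch

def load_remapped(model: torch.nn.Module, path: str) -> None:
    """Load a checkpoint whose state_dict keys use different prefixes.

    The "encoder." -> "conv1." rename is an example only; print
    state.keys() and adjust `remap` to match your predictor.
    """
    ckpt = torch.load(path, map_location="cpu")
    state = ckpt.get("state_dict", ckpt)  # unwrap if the trainer nested it

    remap = {"encoder.": "conv1."}  # old prefix -> new prefix (example)
    fixed = {}
    for key, tensor in state.items():
        for old, new in remap.items():
            if key.startswith(old):
                key = new + key[len(old):]
        fixed[key] = tensor

    # strict=False tolerates keys the remap does not cover
    model.load_state_dict(fixed, strict=False)
```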