Merged
55 commits
a29bfa1
infra: added features to benchmark, and started building support for …
sdhossain Aug 26, 2025
7cbd7a3
infra: fixed some bugs
sdhossain Aug 27, 2025
638e100
fix: iterated on some PR changes
sdhossain Sep 1, 2025
7d10efb
added jailbreak tuning
sdhossain Aug 21, 2025
9b6989f
refactor: added model config and some utilities for benchmarking
sdhossain Sep 7, 2025
5b1cd9d
stashing changes
sdhossain Sep 9, 2025
cbdac17
trying something out
sdhossain Sep 14, 2025
de5f032
stashing changes
sdhossain Sep 18, 2025
07d6d70
stashing changes
sdhossain Sep 23, 2025
fee12ef
stashing changes
sdhossain Oct 16, 2025
c5c083e
stashing changes
sdhossain Oct 16, 2025
8b427dd
stashing changes
sdhossain Oct 16, 2025
3575e68
stashing changes
sdhossain Oct 19, 2025
a6d330a
stashing changes
sdhossain Oct 19, 2025
fd148e4
stashing changes
sdhossain Oct 19, 2025
25afc03
stashing changes and added T-Vaccine repo
sdhossain Nov 1, 2025
8745feb
stashing changes
sdhossain Dec 30, 2025
81b2c8a
stashing changes
sdhossain Jan 4, 2026
ca5ba71
infra: major refactor + addition of evals and attacks
sdhossain Jan 14, 2026
c2b91d7
fix: add missing return in backdoor_random_paragraph method
tomtseng Jan 14, 2026
59ba3e3
fix: remove duplicate strenum dependency in pyproject.toml
tomtseng Jan 14, 2026
08157eb
docs: remove broken link to non-existent CLAUDE.md
tomtseng Jan 14, 2026
f141722
fix: correct find -mmin flag in cleanup script
tomtseng Jan 14, 2026
9a976a1
fix: correct typo 'alignmed' -> 'aligned' in defense.py
tomtseng Jan 14, 2026
f6ccc86
fix: correct docstring in test_embedding_attack_eval.py
tomtseng Jan 14, 2026
04a7ff8
fix: correct typo "consitent" -> "consistent" in output_schema.py
tomtseng Jan 14, 2026
77c91c0
docs: remove reference to non-existent example_2.sh script
tomtseng Jan 14, 2026
82d471a
docs: pyrightconfig.json -> pyproject.toml
tomtseng Jan 14, 2026
994afe3
fix: correct copy-paste docstrings in jailbreak finetune classes
tomtseng Jan 14, 2026
299fcb6
fix: correct copy-paste docstrings in jailbreak finetune classes
tomtseng Jan 15, 2026
1a24629
fix: correct type hint for to_completions data_point parameter
tomtseng Jan 14, 2026
cad5db9
fix: standardize spelling "defence_config" -> "defense_config"
tomtseng Jan 14, 2026
cd1fa7c
docs: fix typo in USAGE.md (extra backtick)
tomtseng Jan 14, 2026
b9cc1b2
configs: remove duplicate 'plain' from template choices
tomtseng Jan 14, 2026
8a08c84
fix: correct type annotation for data_row in JailbreakBenchEvaluation
tomtseng Jan 14, 2026
5b48c27
chore: remove commented-out delete_output_checkpoint call
tomtseng Jan 14, 2026
68c843c
mmlu_pro: Make private functions public since they're used in PR #44
tomtseng Jan 15, 2026
9f7fce9
Run pre-commit except for basedpyright
tomtseng Jan 15, 2026
bd98dd7
fix: iterated on PR feedback, r1
sdhossain Jan 15, 2026
673e053
docs: Move FAR cluster usage directions to separate file
tomtseng Jan 15, 2026
3495d7d
docs: Table cleanup
tomtseng Jan 15, 2026
1fd792c
analyze_results: Exit with error if file not found
tomtseng Jan 15, 2026
fe31d34
benchmark_grid: Clean-up
tomtseng Jan 16, 2026
f92c372
optuna_single: Be consistent about using hyphens vs underscores in CL…
tomtseng Jan 16, 2026
262688b
scripts/whitebox: Use existing get_repo_root() function
tomtseng Jan 16, 2026
b748d5b
lora_finetune: Fix a type annotation
tomtseng Jan 16, 2026
129db29
utils/models/config.py: Nits
tomtseng Jan 16, 2026
143c896
path_generation: Reorder StudyPaths params to match order in StudyPat…
tomtseng Jan 16, 2026
3220ce0
fix: some doc fixes
sdhossain Jan 22, 2026
a947386
fix: iterated on some PR feedback
sdhossain Jan 22, 2026
3a3532c
fix: iterated on some PR feedback
sdhossain Jan 22, 2026
0209b75
fix: iterated on some PR feedback
sdhossain Jan 22, 2026
9e46fab
fix: iterated on some PR feedback
sdhossain Jan 22, 2026
199e5c0
fix: ruff re-format
sdhossain Jan 22, 2026
60a375f
fix: ruff re-format
sdhossain Jan 22, 2026
3 changes: 0 additions & 3 deletions .gitignore
@@ -178,6 +178,3 @@ cython_debug/

 # Results
 results/
-
-# Data directory for eval datasets
-data/
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -24,4 +24,4 @@ repos:
 language: system
 types: [python]
 stages: [pre-commit]
-pass_filenames: false
+pass_filenames: true