infra: arxiv version #53
Merged
55 commits
a29bfa1 infra: added features to benchmark, and started building support for … (sdhossain)
7cbd7a3 infra: fixed some bugs (sdhossain)
638e100 fix: iterated on some PR changes (sdhossain)
7d10efb added jailbreak tuning (sdhossain)
9b6989f refactor: added model config and some utilities for benchmarking (sdhossain)
5b1cd9d stashing changes (sdhossain)
cbdac17 trying something out (sdhossain)
de5f032 stashing changes (sdhossain)
07d6d70 stashing changes (sdhossain)
fee12ef stashing changes (sdhossain)
c5c083e stashing changes (sdhossain)
8b427dd stashing changes (sdhossain)
3575e68 stashing changes (sdhossain)
a6d330a stashing changes (sdhossain)
fd148e4 stashing changes (sdhossain)
25afc03 stashing changes and added T-Vaccine repo (sdhossain)
8745feb stashing changes (sdhossain)
81b2c8a stashing changes (sdhossain)
ca5ba71 infra: major refactor + addition of evals and attacks (sdhossain)
c2b91d7 fix: add missing return in backdoor_random_paragraph method (tomtseng)
59ba3e3 fix: remove duplicate strenum dependency in pyproject.toml (tomtseng)
08157eb docs: remove broken link to non-existent CLAUDE.md (tomtseng)
f141722 fix: correct find -mmin flag in cleanup script (tomtseng)
9a976a1 fix: correct typo 'alignmed' -> 'aligned' in defense.py (tomtseng)
f6ccc86 fix: correct docstring in test_embedding_attack_eval.py (tomtseng)
04a7ff8 fix: correct typo "consitent" -> "consistent" in output_schema.py (tomtseng)
77c91c0 docs: remove reference to non-existent example_2.sh script (tomtseng)
82d471a docs: pyrightconfig.json -> pyproject.toml (tomtseng)
994afe3 fix: correct copy-paste docstrings in jailbreak finetune classes (tomtseng)
299fcb6 fix: correct copy-paste docstrings in jailbreak finetune classes (tomtseng)
1a24629 fix: correct type hint for to_completions data_point parameter (tomtseng)
cad5db9 fix: standardize spelling "defence_config" -> "defense_config" (tomtseng)
cd1fa7c docs: fix typo in USAGE.md (extra backtick) (tomtseng)
b9cc1b2 configs: remove duplicate 'plain' from template choices (tomtseng)
8a08c84 fix: correct type annotation for data_row in JailbreakBenchEvaluation (tomtseng)
5b48c27 chore: remove commented-out delete_output_checkpoint call (tomtseng)
68c843c mmlu_pro: Make private functions public since they're used in PR #44 (tomtseng)
9f7fce9 Run pre-commit except for basedpyright (tomtseng)
bd98dd7 fix: iterated on PR feedback, r1 (sdhossain)
673e053 docs: Move FAR cluster usage directions to separate file (tomtseng)
3495d7d docs: Table cleanup (tomtseng)
1fd792c analyze_results: Exit with error if file not found (tomtseng)
fe31d34 benchmark_grid: Clean-up (tomtseng)
f92c372 optuna_single: Be consistent about using hyphens vs underscores in CL… (tomtseng)
262688b scripts/whitebox: Use existing get_repo_root() function (tomtseng)
b748d5b lora_finetune: Fix a type annotation (tomtseng)
129db29 utils/models/config.py: Nits (tomtseng)
143c896 path_generation: Reorder StudyPaths params to match order in StudyPat… (tomtseng)
3220ce0 fix: some doc fixes (sdhossain)
a947386 fix: iterated on some PR feedback (sdhossain)
3a3532c fix: iterated on some PR feedback (sdhossain)
0209b75 fix: iterated on some PR feedback (sdhossain)
9e46fab fix: iterated on some PR feedback (sdhossain)
199e5c0 fix: ruff re-format (sdhossain)
60a375f fix: ruff re-format (sdhossain)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -178,6 +178,3 @@ cython_debug/ |
| 178 | 178 | |
| 179 | 179 | # Results |
| 180 | 180 | results/ |
| 181 | | - |
| 182 | | - # Data directory for eval datasets |
| 183 | | - data/ |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -24,4 +24,4 @@ repos: |
| 24 | 24 | language: system |
| 25 | 25 | types: [python] |
| 26 | 26 | stages: [pre-commit] |
| 27 | | - pass_filenames: false |
| | 27 | + pass_filenames: true |
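
For context on the `pass_filenames` change: when it is `true`, pre-commit appends the staged files that match the hook's filters (here `types: [python]`) to the hook's command line; when it is `false`, the hook receives no file arguments. The sketch below shows how a hook script might consume those arguments. It is illustrative only: the script, its function names, and the final-newline check are hypothetical and are not taken from this repository's hook.

```python
from pathlib import Path


def find_missing_final_newline(paths: list[str]) -> list[str]:
    """Return the paths whose non-empty contents do not end with a newline."""
    offenders = []
    for path in paths:
        data = Path(path).read_bytes()
        if data and not data.endswith(b"\n"):
            offenders.append(path)
    return offenders


def main(argv: list[str]) -> int:
    # With pass_filenames: true, argv holds the staged files matching the
    # hook's filters. With pass_filenames: false, argv would be empty and
    # the hook would have to discover files on its own.
    offenders = find_missing_final_newline(argv)
    for path in offenders:
        print(f"{path}: missing final newline")
    # A non-zero return value makes pre-commit report the hook as failed.
    return 1 if offenders else 0


# In a real hook script (hypothetical), the entry point would be:
#     raise SystemExit(main(sys.argv[1:]))
```

With `pass_filenames: true`, a hook like this only inspects the files in the current commit, which is usually faster than scanning the whole tree but means the tool no longer sees unstaged files.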