llm-backdoor

Here is 1 public repository matching this topic...

willsandersrfs / jane-street-dormant

Reverse-engineering hidden backdoor triggers in three 671B-parameter DeepSeek-V3 language models, using activation-space probing, SVD weight analysis, and absorbed MLA SVD, for the Jane Street Dormant LLM Puzzle.

puzzle svd jane-street ml-security mechanistic-interpretability deepseek-v3 sleeper-agents activation-probing llm-backdoor

Updated Apr 2, 2026
Python

