A modular test-driven framework for detecting and analyzing glitch tokens
✅ Glitch Token extraction for an arbitrary model ✅ Modular Test Framework ✅ Fine-tunable ✅ Run your model locally or via API
When using the default tests that come with this repository, only the GlitchTokenDiscovery.py file is needed. Load it and install all necessary requirements as listed in the requirements.txt file. Also make sure Ollama is set up and the model to be tested is downloaded.
When using individual prompts, follow the steps in Chapter "Using your own tests".
When using an alternative model provider, follow the steps in Chapter "ResponseGenerator Interface".
Once everything is set up, run the Python file and wait for the results. Since this process consumes significant hardware resources and can run for a long time, it is recommended to use a tmux session to avoid terminations due to network errors or user absence.
If you want to use your own tests, a set of prompts, system instructions, predicates, and a desired test order have to be defined. The following steps will guide you through the process.
Predicates are used to evaluate the test cases. For a predicate to work properly, the following requirements have to be met:
- Predicates must be valid Python boolean expressions.
- The `result` field is used for the model response text.
- The `token` field is used for the token to be tested.
- Packages involved may be imported first, before evaluation.
Example predicates:
| Task | Expected Result | Predicate |
|---|---|---|
| Repeat the token | The token is contained in the result | `token in result` |
| Spell the token | Each character should be separated by a hyphen | `"-".join(char for char in token) in result` |
| Return the length of a token | The integer representation of the token length should be returned | `str(len(token)) in result` |
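To illustrate how a predicate ties `result` and `token` together, here is a minimal sketch of predicate evaluation (the `evaluate_predicate` helper is hypothetical and not the framework's actual implementation; it only assumes that `result` and `token` are in scope during evaluation, as described above):

```python
# Hypothetical helper: evaluates a predicate string with `result` and
# `token` bound in the local namespace.
def evaluate_predicate(predicate: str, result: str, token: str) -> bool:
    return bool(eval(predicate, {}, {"result": result, "token": token}))

# Example: the "spell the token" test passes when every character is hyphen-separated.
passed = evaluate_predicate(
    '"-".join(char for char in token) in result',
    result="S-o-l-i-d-G-o-l-d-M-a-g-i-k-a-r-p",
    token="SolidGoldMagikarp",
)
print(passed)  # True
```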
With the predicates set up, the prompts can be constructed and stored in a semicolon-separated .CSV file with the following structure:
| Column | Description |
|---|---|
| `token-id` | the index of the prompt, i.e. its desired position in the testing framework, ranging from 0 to n-1 (n being the number of prompts) |
| `systemInstruction` | additional information for the model to be considered before the test |
| `prompt` | the prompt containing the test question |
| `predicate` | as defined in the previous section |
With the .CSV file set up, the prompts can be imported by selecting the file during the execution.
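A prompt file following this structure could look as follows (the concrete prompts and the `{token}` placeholder are illustrative assumptions, not the framework's fixed syntax):

```csv
token-id;systemInstruction;prompt;predicate
0;You are a helpful assistant.;Please repeat the following token exactly: {token};token in result
1;You are a helpful assistant.;Return only the number of characters in this token: {token};str(len(token)) in result
```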
To supply the token vocabulary, a standard tokenizer.json file of the model can be used. In case a tokenizer file is not available, an adequate .CSV representation of the token vocabulary is sufficient. For that, the file needs to have the following structure: `token-id;token`. This file can then either be referenced in the code directly, or chosen when executing the `__main__` method.
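For example, a minimal vocabulary file in this format might look like this (token ids and tokens are illustrative):

```csv
token-id;token
0;<unk>
1;hello
2;SolidGoldMagikarp
```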
If using an alternative API or local model framework, such as DeepSeek, OpenAI, or the Transformers library, the provider can be integrated by implementing the GenerateResult interface and inserting it in the generator field of the GlitchTest function.
```python
def generateResponse(self, model: str, prompt: str, systemInstruction: str) -> str:
    pass
```

Ready-to-use example implementations (DeepSeek, OpenAI) are listed in the Generators package.
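As a sketch, an implementation backed by the OpenAI API could look like the following (the class name and client usage are assumptions; see the Generators package for the actual, ready-to-use versions):

```python
from openai import OpenAI

class OpenAIGenerator:
    """Hypothetical generator that forwards requests to the OpenAI API."""

    def __init__(self, api_key: str):
        self.client = OpenAI(api_key=api_key)

    def generateResponse(self, model: str, prompt: str, systemInstruction: str) -> str:
        # Send the system instruction and the test prompt to the chat endpoint
        response = self.client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": systemInstruction},
                {"role": "user", "content": prompt},
            ],
        )
        return response.choices[0].message.content
```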
The result is a table (.CSV, ";"-separated) in which each discovered glitch token is listed together with its token id. Additionally, the results of all four tests for each token are displayed in the columns on the right. These results can then be evaluated to better understand the origin of the glitch tokens and the patterns they exhibit.
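A result file might therefore look like this (column names and values are illustrative; the actual headers depend on the configured tests):

```csv
token-id;token;test-0;test-1;test-2;test-3
48193;SolidGoldMagikarp;False;False;True;False
```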
Four examples are provided in the Examples folder.
These examples demonstrate different modular aspects of the algorithm: custom test cases, intermediate result saving, and the use of different model providers.