This repository contains the official implementation, data, and experiment materials for the paper “Cognitive Flow: An LLM-Automated Framework for Quantifying Reasoning Distillation.”
For theoretical background, experimental methodology, and complete visual results, please refer to the full paper.
The Cognitive Flow framework converts unstructured CoT text into a structured, quantifiable representation of reasoning style through four main stages:
The reasoning trace between `<think>` and `</think>` tags is extracted and divided into discrete reasoning steps using double-newline delimiters (`\n\n`).
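The extraction and segmentation above can be sketched as follows. This is a minimal illustration, not the repository's code; the actual pipeline lives in the `to_steps.ipynb` notebooks.

```python
import re

def extract_reasoning_steps(completion: str) -> list[str]:
    """Pull the reasoning trace out of a raw completion and split it into steps.

    Sketch of the segmentation stage: take the text between <think> and
    </think>, then treat double newlines as step delimiters.
    """
    match = re.search(r"<think>(.*?)</think>", completion, flags=re.DOTALL)
    if match is None:
        return []
    trace = match.group(1)
    # Double newlines delimit discrete reasoning steps; drop empty fragments.
    return [step.strip() for step in trace.split("\n\n") if step.strip()]

steps = extract_reasoning_steps(
    "<think>Read the problem.\n\nCompute 2 + 2 = 4.\n\nCheck the result.</think>The answer is 4."
)
# steps == ["Read the problem.", "Compute 2 + 2 = 4.", "Check the result."]
```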
A Label Extractor LLM analyzes a large random sample of reasoning steps (≈1000) to define a concise set of cognitive state labels (e.g., Interpretation, Calculation, Verification).
Prompt templates for this stage are available in cognitive_flow_utils/prompt_templates.py.
A Step Classifier LLM assigns one cognitive label to each reasoning step in a few-shot classification setup. The corresponding prompts are also defined in prompt_templates.py.
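As an illustration of the few-shot setup, a classifier prompt might be assembled like this. The wording below is assumed for illustration; the real system prompts are defined in `cognitive_flow_utils/prompt_templates.py`.

```python
def build_classification_prompt(labels, examples, step):
    """Assemble a few-shot prompt for a Step Classifier LLM.

    Illustrative only: the actual prompt wording used by the framework is
    in cognitive_flow_utils/prompt_templates.py and is assumed here.
    """
    lines = ["Classify the reasoning step into one of: " + ", ".join(labels) + "."]
    for example_step, example_label in examples:
        lines.append(f"Step: {example_step}\nLabel: {example_label}")
    lines.append(f"Step: {step}\nLabel:")  # The model completes the final label.
    return "\n\n".join(lines)

prompt = build_classification_prompt(
    ["Interpretation", "Calculation", "Verification"],
    [("Compute 3 * 4 = 12.", "Calculation")],
    "Check the sum again.",
)
```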
Labeled sequences are aggregated into an N×N state transition matrix, capturing the conditional probability of transitions between cognitive states — effectively a “fingerprint” of a model’s reasoning style.
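Aggregating labeled sequences into such a matrix can be sketched as follows (an assumed helper, not the repository's implementation), row-normalising raw transition counts:

```python
import numpy as np

def transition_matrix(label_sequences, labels):
    """Aggregate labeled step sequences into a row-normalised N x N matrix.

    Sketch of the aggregation stage: entry (i, j) holds the empirical
    probability P(next = labels[j] | current = labels[i]).
    """
    index = {label: i for i, label in enumerate(labels)}
    counts = np.zeros((len(labels), len(labels)))
    for sequence in label_sequences:
        for src, dst in zip(sequence, sequence[1:]):
            counts[index[src], index[dst]] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows for states never seen as a source stay all-zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

labels = ["Interpretation", "Calculation", "Verification"]
M = transition_matrix(
    [["Interpretation", "Calculation", "Verification"],
     ["Interpretation", "Calculation", "Calculation"]],
    labels,
)
# M[0, 1] == 1.0: in this toy data, Interpretation always moves to Calculation.
```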
This matrix is used for quantitative comparisons across models through metrics such as Cosine Similarity (CS) and Kullback-Leibler Divergence (KLD).
```
├── cognitive_flow_utils/
│   ├── dataset_utils.py          # Helper functions for handling prompt datasets
│   ├── llm_methods.py            # Core functions for querying LLMs
│   ├── models_and_clients.py     # Definitions and clients for models (DeepSeek, Gemma, etc.)
│   ├── prompt_templates.py       # System prompts for Label Extractor and Step Annotator
│   ├── step_annotation.py        # StepAnnotator class for batch annotation
│   └── ...
│
├── mmlu-elementary-maths/
│   ├── elementary_labels.txt
│   ├── elementary_maths_prompts.csv
│   ├── ..._steps.csv
│   └── to_steps.ipynb
│
├── mmlu-high-school-maths/
│   ├── high_school_labels.txt
│   ├── hs_maths_prompts.csv
│   ├── ..._steps.csv
│   └── to_steps.ipynb
│
├── mmlu-college-maths/
│   ├── college_labels.txt
│   ├── college_maths_prompts.csv
│   ├── ..._steps.csv
│   └── to_steps.ipynb
│
├── get_completions_from_prompts.py     # Generate model reasoning completions
├── annotate_steps_dataset.py           # Label reasoning steps
├── flow_analysis.ipynb                 # Cognitive Flow (matrix/graph) analysis
├── state_distribution_analysis.ipynb   # Cognitive state frequency analysis
├── token_distribution_analysis.ipynb   # Token effort distribution analysis
└── README.md
```
If you use API-served models, set the relevant API keys (e.g., DeepSeek, Groq) as environment variables; they are read in `cognitive_flow_utils/models_and_clients.py`.
The experimental pipeline consists of four main stages:
Run the reasoning generation script:
```shell
python get_completions_from_prompts.py
```

This script produces raw reasoning outputs (CoTs) for a target model and dataset. Parameters such as the dataset path, model, and temperature are configured within the script.
Use the notebooks (`to_steps.ipynb`) in each dataset folder to:
- Extract `<think>` text segments
- Split reasoning into individual steps (`\n\n`)
- Save as `..._steps.csv`
Classify reasoning steps using:

```shell
python annotate_steps_dataset.py
```

This script assigns cognitive labels to each step and can optionally generate a new label set.
With annotated datasets, use the Jupyter notebooks to:
- Build state transition matrices
- Compute CS and KLD between models
- Generate visualizations
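The two comparison metrics can be sketched as follows: cosine similarity over the flattened matrices, and KL divergence after additive smoothing to avoid log(0). The exact smoothing scheme used in the paper is an assumption here.

```python
import numpy as np

def compare_matrices(P, Q, eps=1e-10):
    """Compare two transition matrices with cosine similarity (CS) and
    Kullback-Leibler divergence (KLD).

    Minimal sketch: both metrics operate on the flattened matrices; KLD uses
    additive smoothing and renormalisation (an assumed choice) so that both
    vectors are valid, strictly positive probability distributions.
    """
    p, q = P.ravel().astype(float), Q.ravel().astype(float)
    cs = float(p @ q / (np.linalg.norm(p) * np.linalg.norm(q)))
    # Smooth and renormalise before taking logs.
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    kld = float(np.sum(p * np.log(p / q)))
    return cs, kld

P = np.array([[0.7, 0.3], [0.4, 0.6]])
cs, kld = compare_matrices(P, P)
# Identical matrices give CS = 1.0 and KLD = 0.0
```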
Experiments are based on three subsets of the MMLU benchmark, representing increasing task complexity:
- Elementary Maths
- High School Maths
- College Maths
Each subset includes:
- The original MMLU prompts
- Cognitive label sets generated via the Label Extractor LLM
- Annotated reasoning steps for each evaluated model
All data are available within their corresponding /mmlu-* directories.
The Cognitive Flow framework provides a quantitative lens on reasoning transfer.
Analysis of the DeepSeek-R1 model family shows that:
- High similarity is observed between teacher and student reasoning on medium-complexity tasks.
- Divergence increases significantly on both simple and highly complex tasks.
- Distilled models tend to underperform in “Verification”-related reasoning, neglecting cognitive self-checking.
- Independently trained RL-based models (e.g., Qwen QwQ-32B) display more balanced and adaptable reasoning flows.
These findings suggest that while knowledge distillation (KD) effectively transmits surface reasoning structure, it may not capture deeper, flexible cognitive strategies.