A collaborative research project investigating how the temperature hyperparameter in Large Language Models (LLMs) controls randomness and structure in generated token sequences, using the classic logistic map as a theoretical baseline.
Team Members: Sanjana Kadambe, Jasreen Mehta, and Dhwanil Mori
Advisor: Dr. Neil Johnson, Professor at George Washington University
This research explores whether LLM temperature behaves analogously to the logistic map's r-parameter, investigating if increasing temperature produces a period-doubling route to chaos similar to deterministic dynamical systems.
Does LLM temperature sampling exhibit symbolic dynamics comparable to deterministic chaos theory?
We compare token sequences generated at different temperatures against the well-studied logistic map (r ∈ [3.4, 4.0]) to quantify similarities and differences in chaotic behavior.
- Establish Baseline: Use the logistic map as ground truth for deterministic chaos
- Symbolic Encoding: Convert both logistic trajectories and LLM tokens to a three-symbol alphabet (A/B/D)
- Temperature Sweep: Generate sequences across T ∈ [0.1, 2.0] for multiple LLM families
- Comparative Analysis: Compute and compare four key dynamical metrics
- System: x_{t+1} = r·x_t·(1 − x_t)
- Parameter Range: r ∈ [3.4, 4.0] (150 points, 20 seeds each)
- Symbolic Encoding:
- A: Attractor band [0.48, 0.52]
- B: Above band (> 0.52)
- D: Below band (< 0.48)
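The baseline above can be sketched in a few lines. This is a minimal illustration, not the notebook's code: the function name, the initial condition `x0`, and the transient length are illustrative choices.

```python
def logistic_symbols(r, x0=0.5, n=200, transient=100, band=(0.48, 0.52)):
    """Iterate the logistic map x -> r*x*(1-x) and encode each value as A/B/D.

    A: inside the attractor band, B: above it, D: below it.
    """
    x = x0
    # Discard transient iterations so the orbit settles onto its attractor
    for _ in range(transient):
        x = r * x * (1 - x)
    symbols = []
    for _ in range(n):
        x = r * x * (1 - x)
        if band[0] <= x <= band[1]:
            symbols.append('A')
        elif x > band[1]:
            symbols.append('B')
        else:
            symbols.append('D')
    return ''.join(symbols)

# In the stable period-4 regime (r = 3.5) the symbol string repeats every 4 steps
print(logistic_symbols(3.5)[:12])
```

At r = 3.5 the orbit converges to a stable 4-cycle, so the symbolic sequence is periodic; near r = 4.0 the symbols become aperiodic, which is what the metrics below are designed to detect.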
| Model | Parameters | Status | HuggingFace ID |
|---|---|---|---|
| Alibaba Qwen 1.5B | 1.8B | ✅ Complete | Qwen/Qwen1.5-1.8B |
| Google Gemma 2B | 2.61B | ✅ Complete | google/gemma-2-2b |
- OpenAI GPT-2 Series (124M → 1.5B)
- Qwen2 7B (scaling study)
- Qwen2-VL 32B (multimodal extension)
- Temperature Points: 20 evenly spaced in [0.1, 2.0]
- Sequences per Temperature: 10 diverse prompts
- Sequence Length: 200 tokens
- Total Sequences: 200 per model (20 temps × 10 prompts)
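The experimental grid above is small enough to enumerate directly. A sketch (constant names are illustrative; the actual generation call in the notebook may differ):

```python
import numpy as np

# Experimental grid from the design above
TEMPERATURES = np.linspace(0.1, 2.0, 20)  # 20 evenly spaced temperature points
N_PROMPTS = 10                            # diverse prompts per temperature
SEQ_LENGTH = 200                          # tokens per generated sequence

# One generated sequence per (temperature, prompt) pair. The real run would
# sample each sequence with something like
# model.generate(..., do_sample=True, temperature=t, max_new_tokens=SEQ_LENGTH)
schedule = [(float(t), p) for t in TEMPERATURES for p in range(N_PROMPTS)]
print(len(schedule))  # 200 sequences per model
```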
For each sequence, we compute:
- Minimal Period (k ≤ 16; ∞ = chaotic)
- Entropy Rate (bits/symbol)
- Spectral Gap (mixing rate indicator)
- Symbol Frequencies (A/B/D distribution)
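These metrics can be estimated from a symbol string via an empirical first-order Markov chain. The sketch below is simplified relative to the notebook's implementation (the exact estimators used there are not reproduced in this README):

```python
import numpy as np

ALPHABET = 'ABD'

def minimal_period(s, k_max=16):
    """Smallest k <= k_max with s[i] == s[i+k] for all i; None means aperiodic (chaotic)."""
    for k in range(1, k_max + 1):
        if all(s[i] == s[i + k] for i in range(len(s) - k)):
            return k
    return None

def markov_metrics(s):
    """Entropy rate (bits/symbol) and spectral gap of the empirical transition matrix."""
    idx = {c: i for i, c in enumerate(ALPHABET)}
    counts = np.zeros((3, 3))
    for a, b in zip(s, s[1:]):
        counts[idx[a], idx[b]] += 1
    rows = counts.sum(axis=1, keepdims=True)
    # Unobserved rows fall back to the uniform distribution
    P = np.divide(counts, rows, out=np.full((3, 3), 1 / 3), where=rows > 0)
    # Stationary weights approximated by empirical symbol frequencies
    pi = np.array([s.count(c) for c in ALPHABET], dtype=float)
    pi /= pi.sum()
    logP = np.zeros_like(P)
    np.log2(P, out=logP, where=P > 0)
    entropy_rate = -np.sum(pi[:, None] * P * logP)
    # Spectral gap: 1 minus the second-largest eigenvalue magnitude (mixing rate)
    lam = np.sort(np.abs(np.linalg.eigvals(P)))[::-1]
    return entropy_rate, 1.0 - lam[1]
```

A perfectly periodic string such as `'ABAB…'` yields period 2, entropy rate 0, and spectral gap 0; a well-mixed random string pushes both the entropy rate and the gap up, which is the regime the results table reports for the LLMs.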
| Metric | Gemma 2B | Logistic Map | Δ |
|---|---|---|---|
| Chaotic Fraction | 90.5% | 63.4% | +27.1pp |
| Mean Entropy Rate | 0.788 bits | 0.488 bits | +0.300 |
| Mean Spectral Gap | 0.846 | 0.457 | +0.389 |
| Symbol A Frequency | 1.9% | 7.0% | -5.1pp |
| Symbol B Frequency | 32.4% | 59.7% | -27.3pp |
| Symbol D Frequency | 65.7% | 33.3% | +32.4pp |
- Predominantly Chaotic: LLM outputs are 90%+ aperiodic, lacking the clear period-doubling cascade of deterministic chaos
- Temperature Control: Entropy rate increases from ~0.50 bits (T ≤ 0.5) to ~1.01 bits (T ≥ 1.5)
- Fast Mixing: LLMs exhibit ~85% higher spectral gaps, indicating shorter memory horizons
- Symbol Imbalance: Heavy bias toward D symbols (artifact of modulo-based encoding)
- Fundamental Stochasticity: LLM token streams are stochastic rather than deterministically chaotic
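The modulo-based encoding blamed for the D bias above is not reproduced in this README. A hypothetical sketch of what `token_ids_to_symbols` might look like follows; the bucket assignment is illustrative, and the notebook's actual scheme (whose unequal buckets would explain the D bias) may differ:

```python
def token_ids_to_symbols(token_ids, method='modulo'):
    """Map integer token ids onto the A/B/D alphabet.

    Hypothetical sketch: the project's real bucket boundaries may be unequal,
    which would account for the heavy D bias noted in the findings.
    """
    out = []
    for tid in token_ids:
        # 'hash' scrambles ids before bucketing; 'modulo' buckets them directly
        bucket = (hash(tid) if method == 'hash' else tid) % 3
        out.append('ABD'[bucket])
    return ''.join(out)

print(token_ids_to_symbols([0, 1, 2, 3]))  # modulo method: ABDA
```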
- Python 3.8 or higher
- CUDA-capable GPU recommended (8GB+ VRAM)
- 16GB+ system RAM
```bash
# Clone the repository
git clone https://github.com/yourusername/Data_network_Research_Project.git
cd Data_network_Research_Project

# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

- Core ML/DL: PyTorch ≥2.0.0, Transformers ≥4.35.0, Accelerate ≥0.24.0
- Data Processing: NumPy ≥1.24.0, Pandas ≥2.0.0, SciPy ≥1.11.0
- Visualization: Matplotlib ≥3.7.0, Seaborn ≥0.12.0, NetworkX ≥3.1
- Utilities: tqdm, Jupyter, ipywidgets
See requirements.txt for a complete list.
1. Open Jupyter Notebook:

   ```bash
   jupyter notebook LLM_Temperature_Studies.ipynb
   ```

2. Run Sections in Order:
   - Section 5: Logistic map baseline (if not already computed)
   - Section 6.3: Qwen 1.5B implementation
   - Section 6.2: Gemma 2B implementation (if available)

3. Expected Outputs:
   - CSV files with metrics (`qwen_temperature_results.csv`, etc.)
   - Visualization plots (PNG format)
   - Console progress bars and statistics
```python
# Adjust temperature range
TEMPERATURE_MIN, TEMPERATURE_MAX = 0.5, 1.5
N_TEMPERATURES = 30  # More granular sampling

# Change sequence length
SEQ_LENGTH = 500  # Longer sequences for better statistics

# Modify prompts
N_PROMPTS_PER_TEMP = 20  # More samples per temperature

# Try different encoding methods
symbols = token_ids_to_symbols(token_ids, method='hash')
```

```
Data_network_Research_Project/
├── README.md                              # This file
├── requirements.txt                       # Python dependencies
├── LLM_Temperature_Studies.ipynb          # Main research notebook
├── LLM_Temperature_Study_Presentation.txt # Presentation slides text
├── QWEN_IMPLEMENTATION_SUMMARY.md         # Qwen integration details
├── attractor_sequence_code_files/         # Baseline experiments
│   ├── llm_symbol_maps_explorer_LOGISTIC_MAP.ipynb
│   └── llm_symbol_maps_explorer_band_no_transient(1).ipynb
└── [Generated Files]
    ├── qwen_temperature_results.csv       # Qwen experiment data
    ├── logistic_baseline_results.csv      # Baseline data
    ├── qwen_temperature_results.png       # Qwen visualizations
    └── qwen_vs_logistic_comparison.png    # Comparative plots
```
- CPU with 8GB RAM (float32 inference): ~35 minutes per model
- GPU with 8GB+ VRAM (float16 inference): ~15 minutes per model
- GPU with 16GB+ VRAM: enables larger model experiments (7B+)
| Task | Time |
|---|---|
| Model Loading | 1-3 min (first run) |
| Temperature Sweep | 10-30 min (200 sequences) |
| Visualization | <1 min |
| Total per Model | 15-35 min |
- Run the Qwen experiment and validate results
- Analyze period-doubling behavior patterns in detail
- Perform quantitative comparison with the logistic baseline
- Qwen2 7B: Study parameter scaling effects (1.8B → 7B)
- Qwen2-VL 32B: Multimodal symbolic dynamics
- Statistical significance testing
- Identify universal vs. model-specific behaviors
- Architecture impact analysis
- Map LLM temperature to logistic parameter r
- Develop temperature selection guidelines
- Create practical recommendations for practitioners
- Embedding-based symbol encodings
- Semantic clustering for A/B/D classification
- Prompt sensitivity analysis
- Longer sequence lengths for rare period detection
Understanding temperature's effect on token-level dynamics can:
- Inform prompt engineering best practices
- Guide sampling strategy selection
- Provide theoretical models of LLM creativity vs. coherence trade-offs
- Bridge connections between statistical models and dynamical systems theory
This project builds on:
- Classic chaos theory (logistic map, symbolic dynamics)
- Information theory (entropy rate, Markov chains)
- Spectral analysis (mixing times, eigengap)
- LLM sampling methods (temperature, top-p, top-k)
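For reference, temperature rescales the logits before the softmax; a minimal sketch of the sampling step (standard definition, not the project's code):

```python
import numpy as np

def temperature_probs(logits, T):
    """Softmax over logits / T: small T sharpens toward the argmax, large T flattens."""
    z = np.asarray(logits, dtype=float) / T
    z -= z.max()  # subtract the max for numerical stability
    p = np.exp(z)
    return p / p.sum()

def sample_token(logits, T, rng):
    """Draw one token index from the temperature-scaled distribution."""
    p = temperature_probs(logits, T)
    return rng.choice(len(p), p=p)

rng = np.random.default_rng(0)
logits = [2.0, 1.0, 0.1]
print(temperature_probs(logits, 0.1).round(3))  # nearly all mass on the first token
print(temperature_probs(logits, 2.0).round(3))  # much flatter distribution
```

This is why the sweep treats T like the logistic map's r: a single scalar that moves the system from near-deterministic (low T) to strongly mixing (high T) behavior.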
Contributions are welcome! Areas for contribution:
- Additional LLM model integrations
- Improved symbolic encoding methods
- Statistical analysis enhancements
- Visualization improvements
- Documentation and tutorials
If you use this research in your work, please cite:
```bibtex
@misc{llm_temperature_dynamics,
  title={Symbolic Dynamics of LLM Temperature Sampling},
  author={Kadambe, Sanjana and Mehta, Jasreen and Mori, Dhwanil},
  year={2025},
  publisher={GitHub},
  url={https://github.com/Dhwanil25/Data_network_Research_Project},
  note={Research conducted under the supervision of Dr. Neil Johnson, George Washington University}
}
```

This project is licensed under the MIT License; see the LICENSE file for details.
For questions or collaboration opportunities:
- GitHub Issues: Create an issue
- Email: dhwanilmori03@gmail.com
- Dr. Neil Johnson, Professor at George Washington University, for his invaluable guidance and mentorship throughout this research
- Model Providers: Alibaba Cloud (Qwen), Google (Gemma), OpenAI (GPT)
- HuggingFace: For model hosting and the transformers library
- Open Source Community: PyTorch, NumPy, SciPy, Matplotlib contributors
Status: 🟢 Active Research Project
Version: 1.0