My Master's Dissertation Project: "Generating Dynamic Virtual Environments Using Large Language Models"
llmterraingen is a research-driven project that explores the use of Large Language Models (LLMs) to dynamically generate and extend virtual environments. The project integrates LLMs with Minecraft to generate terrain procedurally, analyse user interactions, and evaluate the effectiveness of AI-assisted world-building.
- LLM-Driven Terrain Generation – Uses Google Gemini to generate structured JSON outputs for creating diverse landscapes.
- Custom Minecraft Mod – Implements unique commands, terrain-altering mechanics, and automated world modifications.
- Quantitative and Qualitative Analysis – Evaluation of AI-generated terrains using statistical comparisons and user feedback.
- High-Performance Data Processing – Optimised scripts for processing large datasets, including block data compression and trace analysis.
- Automated Screenshot Capture & Filtering – Scripts for collecting and curating terrain snapshots for model tuning.
- Multi-Step Experiment Workflow – Includes tools for analysing terrain similarity, user study data, and model outputs.
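As a rough illustration of the "structured JSON outputs" idea above, the sketch below turns a JSON reply into Minecraft `/fill` commands. The schema (`commands`, `block`, `from`, `to`) and the function name `response_to_fill_commands` are assumptions made for this example; they are not taken from the project's actual model output or mod code.

```python
import json

# Hypothetical schema, assumed for illustration only:
# {"commands": [{"block": "stone", "from": [x, y, z], "to": [x, y, z]}, ...]}


def response_to_fill_commands(raw: str) -> list[str]:
    """Parse a JSON reply and build one /fill command per entry."""
    data = json.loads(raw)
    commands = []
    for entry in data.get("commands", []):
        block = entry["block"]
        start, end = entry["from"], entry["to"]
        if len(start) != 3 or len(end) != 3:
            raise ValueError(f"expected 3D coordinates, got {start} and {end}")
        x1, y1, z1 = (int(v) for v in start)
        x2, y2, z2 = (int(v) for v in end)
        # /fill <from> <to> <block> fills the cuboid between the two corners.
        commands.append(f"/fill {x1} {y1} {z1} {x2} {y2} {z2} minecraft:{block}")
    return commands


example = '{"commands": [{"block": "grass_block", "from": [0, 64, 0], "to": [15, 64, 15]}]}'
print(response_to_fill_commands(example))
```

Validating the coordinates before building the command string keeps a malformed model reply from reaching the game as a broken command.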
This GIF demonstrates a statically generated LLM world being dynamically extended when the player reaches the edge of the terrain. When this happens, the Minecraft client requests a new "tile"; the request is translated and sent to Google Gemini, which produces a list of generation commands that continue the terrain coherently. This example used Gemini 2.0 Pro, but we expect that the Gemini 2.0 Flash model would have given at least a 3x speedup (around 15 seconds to generate the new tile).
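The edge-triggered extension described here could be sketched as follows. The tile size, the trigger margin, and both function names are assumptions chosen for the example; the actual mod may detect the terrain edge differently.

```python
TILE_SIZE = 64   # assumed tile width in blocks (hypothetical value)
EDGE_MARGIN = 8  # assumed distance from a tile border that triggers a request


def tile_of(x: int, z: int, tile_size: int = TILE_SIZE) -> tuple[int, int]:
    """Return the (tile_x, tile_z) index containing block position (x, z)."""
    return (x // tile_size, z // tile_size)


def tiles_to_request(x, z, generated, tile_size=TILE_SIZE, margin=EDGE_MARGIN):
    """List neighbouring tiles that are not generated yet and that the player is close to."""
    tx, tz = tile_of(x, z, tile_size)
    local_x, local_z = x % tile_size, z % tile_size
    candidates = []
    if local_x < margin:
        candidates.append((tx - 1, tz))
    if local_x >= tile_size - margin:
        candidates.append((tx + 1, tz))
    if local_z < margin:
        candidates.append((tx, tz - 1))
    if local_z >= tile_size - margin:
        candidates.append((tx, tz + 1))
    return [tile for tile in candidates if tile not in generated]


generated_tiles = {(0, 0)}
print(tiles_to_request(60, 30, generated_tiles))  # player near the +x border
print(tiles_to_request(30, 30, generated_tiles))  # player in the middle of the tile
```

Each returned tile index would then be translated into a prompt for the model, which is the step the paragraph above describes.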
The majority of project files, especially those required to run the modded Minecraft version, take up gigabytes of storage. To keep the repository small, only the most relevant files have been included.
Contains the data from the user study, including:
- Combined short-form responses and longer feedback forms.
- A Jupyter notebook for calculating metrics and generating charts/plots.
Stores the code written for the Minecraft mod, such as custom commands and items.
Contains various scripts used throughout the project lifecycle:
- `block_data_compression.py` – Converts raw block data from a Minecraft world into a list of frequencies for each layer above sea level.
- `JSON_automation.mjs` – Automates the stringification of JSON files (RLE data and processed data) and triggers `block_data_compression.py` when a new raw_data file is detected.
- `model.js` – The main NodeJS server handling communication between Google Gemini and Minecraft.
- `quantitative_analysis.ipynb` – A Jupyter notebook that compares terrains quantitatively.
- `screenshot_automation.ahk` – An AutoHotKey script used for capturing 300 screenshots for model tuning.
- `screenshot_filtering.py` – Moves screenshots with insufficient terrain rendering to another folder.
- `worldgen.ipynb` – A Jupyter notebook used to generate "random" terrains with constraining parameters.
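To make the per-layer frequency idea behind the block data compression script concrete, here is a minimal sketch. The input format (tuples of `x, y, z, block_name`), the function name `layer_frequencies`, and the use of Minecraft's default sea level of y = 63 are assumptions for this example; the real script's input and output formats may differ.

```python
from collections import Counter

SEA_LEVEL = 63  # Minecraft's default sea level; assumed to be the cut-off here


def layer_frequencies(blocks, sea_level=SEA_LEVEL):
    """Count block types per layer for every y strictly above sea level.

    blocks: iterable of (x, y, z, block_name) tuples (hypothetical input format).
    Returns {y: {block_name: count}} with layers in ascending order.
    """
    layers = {}
    for _x, y, _z, name in blocks:
        if y <= sea_level:
            continue  # ignore everything at or below sea level
        layers.setdefault(y, Counter())[name] += 1
    return {y: dict(layers[y]) for y in sorted(layers)}


sample = [
    (0, 62, 0, "water"),
    (0, 64, 0, "grass_block"),
    (1, 64, 0, "grass_block"),
    (0, 65, 0, "oak_log"),
]
print(layer_frequencies(sample))
```

Storing counts per layer discards exact block positions, which is what makes the representation far smaller than the raw block data.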
A folder containing example responses from the model used in terrain generation.
- Code: Licensed under the Apache 2.0 License, allowing modification and redistribution with attribution.
- Report & Documentation: Licensed under the Creative Commons Attribution 4.0 License, permitting sharing and adaptation with proper credit.
For more details, see the respective licence files in this repository.
