My Summer 2023 machine learning research project for training autonomous agents to perform accurate bow-and-arrow shots in Minecraft using Mineflayer bots, computer vision, and reinforcement learning.
Project Type: Summer 2023 Research Project
Status: Paused Development
Key Features: POV Recording, Barrier Detection, Archery Data Collection, DQN/DDPG RL Training
- Project Overview
- System Architecture
- Getting Started
- How It Works
- Training the Models
- Bot Commands and Operations
- Data Collection
- Future Work
- Project Structure
- References
The primary objective of this project is to develop an AI-driven Minecraft bot capable of accurately shooting moving targets with a bow and arrow. The bot learns to predict projectile trajectories and adjust aim parameters (yaw and pitch) in real-time using deep reinforcement learning.
Long-term Vision: Create a system where the bot can:
- Predict target movement patterns
- Calculate predictive lead (lead shots ahead of moving targets)
- Dynamically adjust aim based on distance, height difference, and target velocity
- Continuously improve accuracy through reinforcement learning
Implemented:
- Bot spawning and server connection (Mineflayer)
- First-person POV recording and streaming
- Target player detection and tracking
- Arrow trajectory tracking
- Gravity and physics data collection
- Discrete action DQN training
- Continuous action DDPG training
- YOLOv8-based object detection for barriers
- Reward calculation based on shot accuracy
- Model persistence (save/load training state)
In Development:
- Movement prediction models
- Predictive shot calculation
- Multi-target engagement
- Real-time trajectory optimization
┌─────────────────────────────────────────────────────┐
│              Minecraft Server (1.12)                │
│    ├─ Bot Agent (Mineflayer)                        │
│    └─ Target Player                                 │
└────────────┬──────────────────────────────────────┬─┘
             │                                      │
             ▼                                      ▼
    ┌─────────────────┐                    ┌──────────────────┐
    │   Prismarine    │                    │  Physics Engine  │
    │     Viewer      │                    │  (Arrow/Gravity) │
    │  (POV Stream)   │                    └──────────────────┘
    └────────┬────────┘
             │
    ┌────────▼──────────────────────┐
    │     RL Training Pipeline      │
    │  ├─ State: Screenshot         │
    │  ├─ Action: Yaw/Pitch Change  │
    │  ├─ Reward: Hit Distance      │
    │  └─ Training: DQN/DDPG        │
    └───────────────────────────────┘
Mineflayer:
- JavaScript library for Minecraft bot control
- Handles server connection, chat, movement, and item usage
- Provides entity tracking and world state information
Discrete RL (discrete_rl.py):
- DQN architecture with fully-connected layers
- Discrete action space (fixed yaw/pitch increments)
- Action space: 15 pitch changes × 1 yaw change = 15 actions
- Batch size: 64, Gamma: 0.99, Learning rate: 1e-4
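The discrete network described above can be sketched in PyTorch as follows. This is a minimal illustration, not the exact architecture in discrete_rl.py: the hidden width of 128 and the two-layer depth are assumptions.

```python
import torch
import torch.nn as nn

class DQN(nn.Module):
    """Fully-connected Q-network: state vector in, one Q-value per action out."""
    def __init__(self, state_size=4, action_dim=15, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_size, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim),   # one Q-value per discrete pitch change
        )

    def forward(self, x):
        return self.net(x)

q_net = DQN()
state = torch.randn(1, 4)            # a batched 4-dimensional state
q_values = q_net(state)              # shape (1, 15)
action = q_values.argmax(dim=1)      # greedy action index
```

The greedy `argmax` here would be wrapped in an epsilon-greedy policy during training.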
DDPG RL (ddpg_rl.py):
- Deep Deterministic Policy Gradient for continuous actions
- Separate actor and critic networks using TensorFlow/Keras
- Continuous yaw and pitch adjustments
- Replay buffer with capacity 2000, batch size 64
Computer Vision:
- OpenCV for image processing
- YOLOv8 for barrier/obstacle detection
- POV screenshots captured from Prismarine viewer
- Image normalization and template matching for target tracking
Physics Tracking:
- Gravity calculation from empirical data
- Arrow trajectory tracking in real-time
- Hit detection and reward computation
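Determining the gravity constant empirically amounts to fitting h = ½gt² to measured fall-time/fall-height pairs. A hedged NumPy sketch (the synthetic measurements below are illustrative, not values from gravity-data.txt):

```python
import numpy as np

def estimate_gravity(fall_times, fall_heights):
    """Least-squares fit of h = 0.5 * g * t^2, returning the estimate of g."""
    half_t2 = 0.5 * np.asarray(fall_times) ** 2
    h = np.asarray(fall_heights)
    # Solve h = g * (0.5 * t^2) in the least-squares sense
    g, *_ = np.linalg.lstsq(half_t2.reshape(-1, 1), h, rcond=None)
    return float(g[0])

# Synthetic measurements generated with g = 9.8 blocks/s^2 (illustrative only)
times = np.array([0.5, 1.0, 1.5, 2.0])
heights = 0.5 * 9.8 * times ** 2
g_est = estimate_gravity(times, heights)   # recovers ~9.8
```

With real, noisy measurements the least-squares fit averages out per-sample error rather than recovering g exactly.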
Software Requirements:
- Python 3.8+
- Node.js 14+ (for Mineflayer)
- Minecraft Java Edition 1.12 server
Python Libraries:
numpy
torch / tensorflow
opencv-python
pillow
scikit-learn
matplotlib
seaborn
pandas
ultralytics (YOLO)
Node.js Modules:
mineflayer
minecraft-data
prismarine-viewer
1. Clone the repository:

   git clone <repo-url>
   cd mlmcbot

2. Install Python dependencies:

   pip install -r requirements.txt

3. Install Node.js dependencies:

   npm install mineflayer minecraft-data prismarine-viewer

4. Set up the Minecraft server:
   - Launch a Minecraft 1.12 server
   - Ensure it's accessible at 127.0.0.1:25565 (configurable)
   - Create a flat/prepared testing world

5. Configure paths. Update file paths in the bot scripts to match your system:
   - SAVE_PATH: directory for saving trained models
   - RECORD_PATH: directory for recording RL training data
   - Output paths in bot initialization

6. Start your Minecraft 1.12 server.

7. Launch the bot:

   python BotScripts/mlmcbot.py

8. View the POV stream:
   - The bot will print a localhost URL (e.g., http://localhost:3000)
   - Open it in a web browser to see the bot's first-person view

9. Connect the target player:
   - Join the server with username rl_target
   - Position yourself in the bot's field of view
# Main event loop in mlmcbot.py
def on_physics_tick_handler():
    # Called every tick (~20 ms):
    # - Update arrow positions
    # - Track target location
    # - Calculate hit/miss
    # - Update RL state

Key operations:
- Tracking: Bot maintains real-time position of target player and arrows
- State representation: Captures POV screenshot as state for RL agent
- Action execution: Applies yaw/pitch adjustments to aim
- Reward calculation: Computes distance between arrow and target at impact
Currently implemented:
- Physics-based: Gravity constant determined empirically
- Arrow tracking: Real-time position updates from Minecraft entity data
Future enhancement:
- ML-based prediction: Train LSTM/Transformer on historical trajectories
- Lead calculation: Predict target's future position and adjust shot accordingly
State Space: 4-dimensional vector or 64×64 RGB screenshot
Action Space: {-10°, -7°, -5°, ..., 0°, ..., 7°, 10°} pitch changes (15 discrete actions)
Reward: -distance_to_target (negative; a smaller miss distance yields a higher reward)
Training: Experience replay + target network updates every 24 episodes
State Space: 256×128 RGB screenshot
Action Space: [Δyaw ∈ [-90, 90], Δpitch ∈ [-90, 90]]
Reward: -distance_to_target + bonus for hits
Training: Off-policy with critic guidance, soft target updates (TAU=0.005)
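The soft target update referenced above blends the target network's weights toward the online network's each step, rather than copying them outright. A minimal sketch over plain weight lists, using the TAU=0.005 value from this config:

```python
TAU = 0.005  # soft-update rate from the DDPG configuration above

def soft_update(target_weights, online_weights, tau=TAU):
    """theta_target <- tau * theta_online + (1 - tau) * theta_target."""
    return [tau * w + (1.0 - tau) * t
            for t, w in zip(target_weights, online_weights)]

target = [0.0, 0.0]
online = [1.0, -1.0]
target = soft_update(target, online)   # moves 0.5% of the way toward online
```

In a real implementation the same blend is applied per-tensor across both the actor and critic target networks.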
State: First-person POV screenshot processed to 64×64 or 256×128 RGB
Actions (Discrete):
- Yaw changes: [0] (no horizontal adjustment in base version)
- Pitch changes: 15 discrete values from -10° to +10°
Actions (Continuous - DDPG):
- Yaw adjustment: [-90°, +90°] (full rotation possible)
- Pitch adjustment: [-90°, +90°] (look up/down range)
Reward Function:
reward = -euclidean_distance(arrow_pos, target_pos) + hit_bonus
# Incentivizes minimizing distance to target
# Bonus applied if arrow hits within tolerance

Discrete DQN:

python BotScripts/archer_mcbot.py
# Uses discrete_rl.py for action selection
# Trains policy network incrementally

Continuous DDPG:

python BotScripts/archer_bot_ddpg_rl.py
# Uses ddpg_rl.py for continuous control
# Actor-critic architecture
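The reward function above can be computed directly from positions. In this sketch the tolerance and bonus values are illustrative, not the project's actual constants:

```python
import math

HIT_TOLERANCE = 1.0   # blocks; illustrative hit radius
HIT_BONUS = 10.0      # illustrative bonus for a hit

def shot_reward(arrow_pos, target_pos):
    """reward = -euclidean_distance + bonus when within tolerance."""
    dist = math.dist(arrow_pos, target_pos)
    return -dist + (HIT_BONUS if dist <= HIT_TOLERANCE else 0.0)

near = shot_reward((10, 64, 5), (10, 64, 5.5))   # 0.5 blocks off -> -0.5 + 10 = 9.5
miss = shot_reward((10, 64, 5), (10, 64, 15))    # 10 blocks off  -> -10.0
```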
1. Initialization:
   - Bot spawns target player
   - RL agent initializes policy and target networks
   - Replay buffer created

2. Episode Loop:
   - Bot aims at target (random initialization)
   - Selects action from RL policy
   - Applies yaw/pitch adjustment
   - Records POV screenshot as state
   - Shoots arrow
   - Observes reward (distance to target)
   - Stores transition: (state, action, reward, next_state, done)

3. Learning:
   - Sample minibatch from replay buffer
   - Compute TD-error using target network
   - Update policy network via gradient descent
   - Soft update target network (TAU = 0.001 to 0.005)

4. Evaluation:
   - Track rewards over episodes
   - Save best model weights
   - Plot loss curves and accuracy metrics
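The episode loop and transition storage above can be condensed into a runnable skeleton. Here `DummyEnv` and the random action choice are stand-ins for the Mineflayer bot and the trained policy:

```python
import random
from collections import deque

class DummyEnv:
    """Stand-in for the bot environment: state = signed aim error in degrees."""
    def reset(self):
        self.error = random.uniform(-10.0, 10.0)
        return self.error

    def step(self, action):
        self.error -= action               # action nudges pitch by -1, 0, or +1
        reward = -abs(self.error)          # reward = -distance_to_target
        done = abs(self.error) < 0.5       # "hit" within tolerance
        return self.error, reward, done

buffer = deque(maxlen=10_000)              # replay buffer, capacity as in the config
env = DummyEnv()
for episode in range(5):
    state, done = env.reset(), False
    for _ in range(50):
        action = random.choice([-1, 0, 1])             # placeholder policy
        next_state, reward, done = env.step(action)
        buffer.append((state, action, reward, next_state, done))  # store transition
        state = next_state
        if done:
            break
    # a real learning step would sample a minibatch here, e.g. random.sample(buffer, 64)
```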
DQN:
- Batch size: 64
- Gamma (discount): 0.99
- Learning rate: 1e-4
- Epsilon start: 0.9, end: 0.05, decay: 1000 steps
- Buffer capacity: 10,000
- Target update frequency: Every 24 episodes
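With EPS_START=0.9, EPS_END=0.05, and a 1000-step decay constant, the usual exponential epsilon schedule looks like the following (the exact decay form in discrete_rl.py may differ):

```python
import math

EPS_START, EPS_END, EPS_DECAY = 0.9, 0.05, 1000

def epsilon(step):
    """Exploration rate after `step` environment steps (exponential decay)."""
    return EPS_END + (EPS_START - EPS_END) * math.exp(-step / EPS_DECAY)

# Starts at 0.9, decays toward 0.05: at step 1000 it is 0.05 + 0.85/e ~ 0.363
eps0 = epsilon(0)
eps1000 = epsilon(1000)
```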
DDPG:
- Batch size: 64
- Gamma (discount): 0.99
- Learning rate: 1e-4
- TAU (soft update): 0.005
- Buffer capacity: 2,000
- Actor/Critic networks: Dense layers (configurable)
Models and data are saved to:
C:\Users\btm74\AdventuresInMinecraft-PC\BowRL\
├── policy_net.pth # Trained policy network
├── target_net.pth # Target network
├── memory.pth # Replay buffer
├── ddpg_rl_records/ # DDPG training records
│ ├── states.npy
│ ├── actions.npy
│ ├── rewards.npy
│ ├── next_states.npy
│ └── dones.npy
└── bow_rewards.csv # Training rewards log
To plot training progress:

import pandas as pd
import matplotlib.pyplot as plt

rewards = pd.read_csv('bow_rewards.csv')
plt.plot(rewards['episode'], rewards['reward'])
plt.xlabel('Episode')
plt.ylabel('Cumulative Reward')
plt.show()

The bot responds to in-game chat commands:
!aim # Start aiming at target player
!shoot # Fire arrow at current aim
!train # Begin RL training episode
!record # Record POV video
!detect # Run barrier detection
!gravity_test # Collect gravity data
!range_test # Test shooting range
!stop # Halt current operation
from BotScripts.mlmcbot import mlmcbot
bot = mlmcbot("ArcherBot")
# Manual control
bot.look_at_player(target_x, target_y, target_z)
bot.shoot_arrow()
# RL training
bot.run_rl_training(episodes=100)
# Data collection
bot.collect_gravity_data()
bot.collect_trajectory_data()

- Gravity measurements: Fall time vs. fall height
- Trajectory data: Arrow position, velocity, target distance
- Stored in: gravity-data.txt, trajectory-data.csv
- State: POV screenshots (256×128 or 64×64 RGB)
- Actions: Yaw/pitch adjustments
- Rewards: Distance to target
- Next states: POV after action
- Stored in: Replay buffer (in-memory + checkpoint files)
- Images: POV screenshots with barriers/obstacles
- Annotations: LabelMe JSON format
- Converted to: YOLO format for YOLOv8 training
- Path: Detection_Povs/ and Detection_Povs_Test/
- Hit distance: Minimum distance arrow reached to target
- Accuracy: Percentage of arrows within X blocks of target
- Correlation: NCC (Normalized Cross-Correlation) for template matching
- Exported to: CSV and text files for analysis
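The NCC metric listed above is the zero-mean normalized cross-correlation (what OpenCV's matchTemplate computes in TM_CCOEFF_NORMED mode). A self-contained NumPy equivalent for two equal-sized patches:

```python
import numpy as np

def ncc(patch, template):
    """Zero-mean normalized cross-correlation between equal-shaped grayscale patches.

    Returns a value in [-1, 1]: 1 for a perfect match, -1 for an inverted match.
    """
    a = patch - patch.mean()
    b = template - template.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return float((a * b).sum() / denom) if denom else 0.0

template = np.array([[0.0, 1.0], [1.0, 0.0]])
same = ncc(template, template)          # 1.0, perfect match
inverted = ncc(1.0 - template, template)  # -1.0, inverted patch
```

For full-frame search, the same statistic is evaluated at every window position; in practice cv2.matchTemplate does that sweep efficiently.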
Raw POV Screenshot
↓
Image Normalization
↓
Grayscale Conversion (for correlation)
↓
Resize to 64×64 / 256×128
↓
Normalize to [0, 1] or [-1, 1]
↓
Feed to RL Agent / Detection Model
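The grayscale branch of this pipeline (used for correlation matching) can be sketched in NumPy. The nearest-neighbour resize here is a stand-in for cv2.resize, and the luminance weights are the standard ITU-R BT.601 coefficients; for the RL agent the RGB channels would be kept instead of converted to grayscale:

```python
import numpy as np

def preprocess(pov, size=(64, 64), signed=False):
    """RGB uint8 POV frame -> grayscale, resized, normalized float array."""
    # Luminance conversion (BT.601 weights)
    gray = pov.astype(np.float32) @ np.array([0.299, 0.587, 0.114], np.float32)
    # Nearest-neighbour resize (stand-in for cv2.resize)
    ys = np.linspace(0, gray.shape[0] - 1, size[0]).round().astype(int)
    xs = np.linspace(0, gray.shape[1] - 1, size[1]).round().astype(int)
    small = gray[np.ix_(ys, xs)]
    out = small / 255.0                          # normalize to [0, 1]
    return out * 2.0 - 1.0 if signed else out    # optionally rescale to [-1, 1]

frame = np.random.randint(0, 256, (128, 256, 3), dtype=np.uint8)  # fake POV frame
state = preprocess(frame)   # shape (64, 64), values in [0, 1]
```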
Goal: Enable the bot to predict player movement and calculate predictive leads.
Implementation:

1. Collect trajectory data:
   - Record player positions over time
   - Sample at 10-20 Hz frequency
   - Create dataset of movement patterns

2. Train movement predictor:
   - Architecture: LSTM / Transformer
   - Input: Player position history (last 10 positions)
   - Output: Predicted position at t+Δt
   - Loss: MSE on positional error

3. Calculate lead:

   predicted_pos = movement_model(position_history)
   time_to_arrow_impact = distance / arrow_velocity
   aim_vector = predicted_pos - current_pos
   aim_at(aim_vector)
Expected improvement: 30-50% increase in hit rate for moving targets
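As a runnable stand-in for the lead calculation above, constant-velocity extrapolation can replace the learned movement model: predict the target's velocity from its last two positions and aim where it will be after the arrow's flight time. ARROW_SPEED and the sample positions are illustrative, not measured values:

```python
import numpy as np

ARROW_SPEED = 60.0  # blocks/s at full draw; illustrative, not a measured constant

def lead_point(shooter_pos, target_history, dt):
    """Aim point: extrapolate target velocity over the estimated arrow flight time."""
    target_pos = target_history[-1]
    velocity = (target_history[-1] - target_history[-2]) / dt   # finite difference
    # One-shot approximation: flight time from the *current* distance
    flight_time = np.linalg.norm(target_pos - shooter_pos) / ARROW_SPEED
    return target_pos + velocity * flight_time

shooter = np.array([0.0, 64.0, 0.0])
history = [np.array([30.0, 64.0, 0.0]),
           np.array([30.0, 64.0, 1.0])]      # target moving in +z
aim = lead_point(shooter, history, dt=0.05)  # aim point nudged ahead along +z
```

The one-shot flight-time estimate could be refined by iterating (recompute distance to the predicted point), and the learned LSTM/Transformer would replace the finite-difference velocity estimate.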
Goal: Learn to adjust shot trajectory based on distance and environmental factors.
Approach:

1. Factor extraction:
   - Distance to target
   - Height difference
   - Wind/physics parameters (future, for modded servers)

2. Conditional policy:

   policy(state | distance, height, velocity) → (yaw, pitch)

3. Multi-agent RL:
   - Train separate policies for different distance ranges
   - Ensemble approach for robustness
Expected improvement: 40-60% accuracy on diverse ranges
Goal: Bot continuously adapts to server-specific physics and player behavior.
Features:
- Online learning during gameplay
- Few-shot adaptation to new scenarios
- Transfer learning from related games/tasks
- Curriculum learning (easy targets → moving targets → multiple targets)
Potential enhancements:
- 3D pose estimation of target player
- Obstacle detection and navigation
- Wind/environmental factor prediction
- Multi-modal learning (visual + physical sensors)
mlmcbot/
├── BotScripts/ # Main bot implementations
│ ├── mlmcbot.py # Base bot class with all features
│ ├── archer_mcbot.py # Discrete DQN trainer
│ ├── archer_bot_ddpg_rl.py # DDPG continuous trainer
│ ├── discrete_rl.py # DQN network + training logic
│ ├── ddpg_rl.py # DDPG network + training logic
│ ├── rl.py # Legacy RL utilities
│ ├── rlutil.py # RL helper functions
│ ├── utils.py # General utilities (image, file, math)
│ ├── mcyolov8.py # YOLOv8 object detection wrapper
│ ├── neuralnet.py # Custom neural network definitions
│ ├── discrete_rl_V2.py # Improved discrete RL variant
│ ├── archerV2.py # Alternative archer bot
│ └── prof_nn_def.py # Profiling utilities
├── New_BotScripts/ # Development/experimental code
│ ├── new_archer_bot/ # Refactored bot implementation
│ ├── new_rl.py # Experimental RL approach
│ └── utils.py # Updated utilities
├── README.md # This file
├── LICENSE # Project license
└── .git/ # Version control
Key Directories (External):
BowRL/ # Trained models directory
├── policy_net.pth
├── target_net.pth
├── memory.pth
└── bow_rewards.csv
MineflayerData/ # Collected data
├── gravity-data.txt
├── correlation-data.txt
├── detection_data.txt
└── Detection_Povs/
class mlmcbot:
    def __init__(self, botname, sentient=True, ip=HOST, port=PORT)

    # Control
    def shoot_arrow()
    def aim_at_player()
    def look_at(x, y, z)

    # Training
    def run_rl_training(episodes, trainer='dqn')
    def collect_trajectory_data()
    def collect_gravity_data()

    # Detection
    def detect_barriers()
    def run_barrier_detection()

    # Utilities
    def get_target_position() → (x, y, z)
    def get_arrow_position() → (x, y, z)
    def calculate_distance_to_target() → float

class DQN(nn.Module):
    def __init__(self, state_size=4, action_dim=15)
    def forward(x) → action_logits

# Training functions
def select_action(state) → action_index
def optimize_model() → loss

class ReplayBuffer:
    def add_record(state, action, reward, next_state, done)
    def sample(batch_size) → batch
    def save(folder_name)
    def load(folder_name)

- Check server: Ensure the Minecraft server is running on 127.0.0.1:25565
- Python environment: Verify mineflayer-js bridge is installed
- Firewall: Disable local firewall rules blocking port 25565
- Port conflict: Check if viewer port (3000+) is in use
- Browser cache: Clear cache or use incognito window
- Console errors: Check browser developer console (F12) for WebSocket errors
- Reward scaling: Ensure reward values are reasonable (not too large/small)
- State preprocessing: Verify POV screenshots are being captured correctly
- Action bounds: Confirm yaw/pitch adjustments are within valid ranges
- Learning rate: Try adjusting LR (1e-4 is typical starting point)
- Batch size: Reduce BATCH_SIZE if OOM errors occur
- Image resolution: Use 64×64 instead of 256×128
- Buffer capacity: Reduce BUFFER_CAPACITY if memory limited
- Mineflayer - Minecraft bot framework
- PyTorch - Deep learning framework
- TensorFlow/Keras - Alternative DL framework
- YOLOv8 - Object detection
- Prismarine - 3D world viewer
- Summer 2023 Research Project
- Focus: AI-driven gameplay and trajectory prediction
- Testbed: Minecraft 1.12 with custom world
Contributions are welcome! Areas of focus:
- Implement LSTM-based movement predictor
- Optimize RL training (better hyperparameters)
- Add multi-target support
- Implement curriculum learning
- Create comprehensive documentation
- Add unit tests
- Performance profiling and optimization
See LICENSE for details.
Author: Research Team | Last Updated: January 2026
For questions or issues, please refer to the repository's issue tracker or contact the project maintainers.