
🌐 OverWorld Inference Engine


Overview

Core library for world model inference:

  • Simple API to load models and generate image frames from text, control inputs, and prior frames
  • Encapsulates the frame-generation stack (DiT, autoencoder, text encoder, KV cache)
  • Optimized backends for NVIDIA, AMD, and Apple Silicon, covering both consumer and data-center GPUs
  • Loads base world models and LoRA adapters

Out of scope

Not a full client:

  • No rendering/display of video or images
  • No reading controller/keyboard/mouse input
  • No FAL or other external integrations

A reference client can be found in the LocalWorld repo.

Out-of-scope pieces can go in examples/, which is not part of the world_engine.* package.
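
To make the boundary concrete, here is a minimal sketch of what a client (such as LocalWorld) layers on top of this library, using the API from Quick Start below. The OpenCV display loop and the idle control are illustrative assumptions, not part of world_engine; real input polling and key mapping are the client's job:

import cv2  # display only; not a world_engine dependency
from world_engine import WorldEngine, CtrlInput

engine = WorldEngine("OpenWorldLabs/CoDCtl-Causal-SelfForcing-UniformSigma", device="cuda")
engine.set_prompt("A fun game")

while True:
    ctrl = CtrlInput()  # a real client would poll keyboard/mouse here
    frame = engine.gen_frame(ctrl=ctrl)
    # assumes frames come back as (H, W, 3) uint8 tensors on engine.device
    cv2.imshow("world", cv2.cvtColor(frame.cpu().numpy(), cv2.COLOR_RGB2BGR))
    if cv2.waitKey(1) == 27:  # Esc quits
        break
cv2.destroyAllWindows()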

Quick Start

Setup

# Recommended: create and activate a virtual environment
python3 -m venv .env
source .env/bin/activate
# Install
pip install \
  --index-url https://download.pytorch.org/whl/test/cu128 \
  --extra-index-url https://download.pytorch.org/whl/nightly/cu128 \
  --extra-index-url https://pypi.org/simple \
  --upgrade --ignore-installed \
  "world_engine @ git+https://github.com/Wayfarer-Labs/world_engine.git"
# Specify HuggingFace Token (https://huggingface.co/settings/tokens)
export HF_TOKEN=<your access token>
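
A quick sanity check of the environment (optional; torch.cuda.is_available() just confirms the cu128 build can see a GPU):

python -c "import world_engine"                              # should import cleanly
python -c "import torch; print(torch.cuda.is_available())"   # expect: True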

Run

from world_engine import WorldEngine, CtrlInput

# Create inference engine
engine = WorldEngine("OpenWorldLabs/CoDCtl-Causal-SelfForcing-UniformSigma", device="cuda")

# Specify a prompt
engine.set_prompt("A fun game")

# Optional: force the next frame to be a specific image
img = engine.append_frame(uint8_img)  # uint8_img: (H, W, 3) uint8 tensor

# Generate 3 video frames conditioned on controller inputs
for controller_input in [
    CtrlInput(button={48, 42}, mouse=(0.4, 0.3)),
    CtrlInput(mouse=(0.1, 0.2)),
    CtrlInput(button={95, 32, 105}),
]:
    img = engine.gen_frame(ctrl=controller_input)
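
To persist generated frames, one option is Pillow. This assumes gen_frame returns an (H, W, 3) uint8 tensor on engine.device, mirroring the shape append_frame accepts, so move it to the CPU before converting:

from PIL import Image

Image.fromarray(img.cpu().numpy()).save("frame_000.png")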

Docs

WorldEngine

WorldEngine computes each new frame from past frames, the controls, and the current prompt, then appends it to the sequence so later frames stay aligned with what has already been generated.
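
Conceptually, generation is autoregressive over frames. A sketch of the calling pattern, not of the engine's internals (control_stream is a hypothetical iterable of CtrlInput):

frames = []
for ctrl in control_stream:
    frame = engine.gen_frame(ctrl=ctrl)  # conditioned on prompt + all prior frames
    frames.append(frame)                 # the engine keeps its own history too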

CtrlInput

from dataclasses import dataclass, field
from typing import Set, Tuple

@dataclass
class CtrlInput:
    button: Set[int] = field(default_factory=set)  # pressed button IDs
    mouse: Tuple[float, float] = (0.0, 0.0)  # (x, y) velocity

  • button keycodes are defined by Owl-Control
  • mouse is the raw mouse velocity vector
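
Both fields default to "no input", so a bare CtrlInput() is a valid idle control for a frame. The keycode below is a hypothetical placeholder, not a documented Owl-Control binding:

idle = CtrlInput()                                # no buttons, zero mouse velocity
move = CtrlInput(button={87}, mouse=(0.0, -0.5))  # 87: hypothetical keycode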

Usage

from world_engine import WorldEngine, CtrlInput

Load model to GPU

engine = WorldEngine("OpenWorldLabs/CoDCtl-Causal-SelfForcing-UniformSigma", device="cuda")

Specify a prompt; it is used for every subsequent frame until this function is called again

engine.set_prompt("A fun game")
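
Since the prompt persists, it can be swapped mid-stream to steer later frames; a sketch (the second prompt is illustrative):

for _ in range(10):
    engine.gen_frame(ctrl=CtrlInput())  # frames follow "A fun game"

engine.set_prompt("A rainy night")      # illustrative second prompt
for _ in range(10):
    engine.gen_frame(ctrl=CtrlInput())  # frames now follow the new prompt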

Generate an image conditioned on the current controller input (explicit) and on the history and prompt (implicit)

controller_input = CtrlInput(button={48, 42}, mouse=(0.4, 0.3))
img = engine.gen_frame(ctrl=controller_input)

Instead of generating, set the next frame to a specific image. This is typically done as a seeding step before generating.

import torch

# example: random noise image
uint8_img = torch.randint(0, 256, (512, 512, 3), dtype=torch.uint8)
img = engine.append_frame(uint8_img)  # returns the passed image

Note: the returned img always lives on engine.device
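
To seed from a real image instead of noise, load it as an (H, W, 3) uint8 tensor first. A sketch assuming Pillow and a hypothetical start.png (any resolution constraints are not documented here):

import numpy as np
import torch
from PIL import Image

seed = torch.from_numpy(np.array(Image.open("start.png").convert("RGB")))
img = engine.append_frame(seed)            # seed becomes the latest frame
img = engine.gen_frame(ctrl=CtrlInput())   # generation continues from it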

Examples

See the examples/ directory for runnable samples; it is not part of the world_engine.* package.
