A framework for dynamically converting Transformer-based models into Mixtures of Experts (MoE) architectures.
The model/ directory contains the core implementation for converting standard transformer models (specifically LLaMA-based models) into dynamic Mixture of Experts architectures.
- `__init__.py`: Exports the main classes and sets up the device (CUDA if available, CPU otherwise).
- main.py: Entry point for the application, with a simple interface to run MoE conversion experiments.
- moe.py: Contains the `MixtureOfExperts` class, a general-purpose MoE implementation with top-k routing.
- expert.py: Implements `ExpertLayer`, the individual expert modules used within MoE layers.
- llama_moe.py: Contains `LLaMaMoEBlock`, a modified LLaMA transformer block with MoE replacing the standard FFN.
- run_experiment.py: Orchestrates experiments for converting standard models to MoE, with layer-by-layer testing.
- utils.py: Utility functions for loading models and other helper functions.
- pyproject.toml & poetry.lock: Package dependency management for the project.
The `MixtureOfExperts` class is the core MoE implementation (see the sketch after this list). It:
- Uses a gating/routing mechanism to direct tokens to the most relevant experts
- Supports top-k routing (can select multiple experts per token)
- Maintains the same interface as a standard feed-forward neural network
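Conceptually, such a layer can look like the sketch below. This is an illustrative sketch only; the class, parameter names, and expert architecture are assumptions, not necessarily what moe.py uses.

```python
# Minimal MoE layer with top-k routing -- illustrative sketch, not the repo's actual code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MixtureOfExperts(nn.Module):
    """Drop-in FFN replacement: routes each token to its top-k experts."""

    def __init__(self, hidden_size: int, intermediate_size: int,
                 num_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Router produces one logit per expert for every token.
        self.router = nn.Linear(hidden_size, num_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(hidden_size, intermediate_size),
                nn.SiLU(),
                nn.Linear(intermediate_size, hidden_size),
            )
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, hidden_size) -- same interface as a standard FFN.
        gate_logits = self.router(x)                      # (B, S, num_experts)
        weights, indices = gate_logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)              # normalize over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[..., slot] == e            # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[..., slot][mask].unsqueeze(-1) * expert(x[mask])
        return out
```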
`LLaMaMoEBlock` is a drop-in replacement for standard LLaMA transformer blocks (see the sketch after this list) that:
- Preserves the original attention mechanism
- Replaces the feed-forward network with an MoE layer
- Maintains compatibility with the original model's interface
- Initializes expert weights from the original model's parameters
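The sketch below shows one way such a block can be built, assuming a Hugging Face LlamaDecoderLayer-like layer exposing `self_attn`, `mlp`, `input_layernorm`, and `post_attention_layernorm`; all names and the forward signature are illustrative assumptions, not the repo's actual code.

```python
# Illustrative sketch of wrapping an existing LLaMA decoder layer with an MoE FFN.
import copy

import torch
import torch.nn as nn


class LLaMaMoEBlock(nn.Module):
    """Keeps the original attention path and swaps the FFN for an MoE layer."""

    def __init__(self, original_layer: nn.Module, num_experts: int = 4, top_k: int = 2):
        super().__init__()
        # Preserve the original attention mechanism and norms unchanged.
        self.self_attn = original_layer.self_attn
        self.input_layernorm = original_layer.input_layernorm
        self.post_attention_layernorm = original_layer.post_attention_layernorm
        self.top_k = top_k
        hidden_size = original_layer.mlp.gate_proj.in_features
        self.router = nn.Linear(hidden_size, num_experts, bias=False)
        # Each expert starts as a copy of the original FFN, so the converted
        # block initially reproduces the source model's behaviour.
        self.experts = nn.ModuleList(
            [copy.deepcopy(original_layer.mlp) for _ in range(num_experts)]
        )

    def forward(self, hidden_states, **attn_kwargs):
        # Attention sub-block, identical to the original layer.
        residual = hidden_states
        hidden_states = self.input_layernorm(hidden_states)
        hidden_states = self.self_attn(hidden_states, **attn_kwargs)[0]
        hidden_states = residual + hidden_states

        # MoE sub-block in place of the original feed-forward network.
        residual = hidden_states
        hidden_states = self.post_attention_layernorm(hidden_states)
        weights, indices = self.router(hidden_states).topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)
        moe_out = torch.zeros_like(hidden_states)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[..., slot] == e
                if mask.any():
                    moe_out[mask] += (
                        weights[..., slot][mask].unsqueeze(-1) * expert(hidden_states[mask])
                    )
        return (residual + moe_out,)
```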
The run_experiment.py file provides the following (a sketch of the conversion loop follows the list):
- Automatic conversion of standard models to MoE architectures
- Layer-by-layer testing to ensure model stability
- Options to control how many and which layers to convert
- Fallback mechanisms to revert problematic conversions
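A rough sketch of the layer-by-layer convert, test, and revert loop is shown below; the import path, helper names, and stability test are assumptions for illustration, not the module's actual API.

```python
# Illustrative convert -> test -> revert loop over a LLaMA-style decoder stack.
import torch

from model.llama_moe import LLaMaMoEBlock  # assumed import path


def convert_layers(model, tokenizer, num_experts=4, top_k=2, conversion_frequency=4):
    layers = model.model.layers  # LLaMA-style list of decoder layers
    for i in range(0, len(layers), conversion_frequency):
        original = layers[i]
        layers[i] = LLaMaMoEBlock(
            original, num_experts=num_experts, top_k=top_k
        ).to(device=model.device, dtype=model.dtype)
        if not sanity_check(model, tokenizer):
            # Fallback: revert this layer if the converted model misbehaves.
            layers[i] = original
    return model


@torch.no_grad()
def sanity_check(model, tokenizer, prompt="Hello, world"):
    """Crude stability test: the converted model should still produce finite logits."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    return torch.isfinite(model(**inputs).logits).all().item()
```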
To convert a model to use MoE:
```python
from model.run_experiment import run_moe_conversion

model, tokenizer = run_moe_conversion(
    model_name="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    num_experts=4,            # Number of experts per MoE layer
    top_k=2,                  # How many experts to route each token to
    conversion_frequency=4    # Convert every 4th layer
)
```

The framework automatically handles model loading, conversion, testing, and can optionally save the converted model.
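If you want to persist the result yourself, the standard Hugging Face `save_pretrained` calls should work (the output directory name below is just an example):

```python
# Persist the converted model and tokenizer; "tinyllama-moe" is an arbitrary example path.
model.save_pretrained("tinyllama-moe")
tokenizer.save_pretrained("tinyllama-moe")
```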
- Python 3.10 or higher
- Poetry (dependency management)
- Install Poetry (if not already installed):

  ```bash
  curl -sSL https://install.python-poetry.org | python3 -
  ```

- Clone the repository:

  ```bash
  git clone https://github.com/yourusername/dynamic-moe.git
  cd dynamic-moe
  ```

- Install dependencies:

  ```bash
  cd model
  poetry install
  ```
It is recommended to run this project in a GPU-accelerated environment. See this guide to launch and SSH into a GPU-accelerated EC2 instance: https://docs.google.com/document/d/1Hky7NQRuBwpyDnrvl4j9ksPi_ORabHdHzkXRHhqhxZg/edit?tab=t.0
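Before launching an experiment, it can be worth confirming that PyTorch can see the GPU:

```python
import torch

# Prints True on a correctly configured GPU instance; the framework falls back to CPU otherwise.
print(torch.cuda.is_available())
```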
To run the default conversion experiment:
```bash
cd model
poetry run python -m main
```

This will:
- Load the TinyLlama-1.1B-Chat model
- Convert layers to MoE architecture
- Run tests to ensure stability
- Save the converted model