PruningOfExperts

This repository contains the implementation for the final project of the LLMs class at MVA.

The code is based mainly on the Expert_Sparsity repository, which contains the code for the paper Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models.

Implementation details:

The main goal of our work was to extend the expert-pruning implementation from the paper above so that it runs under severe memory constraints. Specifically, we modified the base implementation to allow pruning of a 4-bit quantized version of the Mixtral-8x7B model (the model used in the original work) and of the DeepSeek MoE 16B Base model. 4-bit quantization significantly changes the architecture and weight representation of the model, including the shapes of the layers' weights, and the current implementation had to account for these changes.
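To illustrate why quantization changes weight shapes, here is a minimal sketch of 4-bit packing in plain Python. It assumes a bitsandbytes-style layout in which two 4-bit codes share one byte and the packed weight is stored as a flat column; this README does not state which quantization library or layout the repository uses, so treat the layout, the nibble order, and the helper names (`pack_4bit`, `unpack_4bit`, `packed_shape`) as illustrative assumptions. The 14336 x 4096 shape is taken from the public Mixtral-8x7B configuration, not from this repository's code.

```python
def pack_4bit(codes):
    """Pack a flat list of 4-bit codes (0..15) into bytes, two codes per byte."""
    if len(codes) % 2:
        codes = codes + [0]  # pad to an even count so every byte holds two codes
    return bytes((hi << 4) | lo for hi, lo in zip(codes[0::2], codes[1::2]))


def unpack_4bit(packed, n):
    """Recover the first n 4-bit codes from a packed byte string."""
    codes = []
    for byte in packed:
        codes.append(byte >> 4)    # high nibble
        codes.append(byte & 0x0F)  # low nibble
    return codes[:n]


def packed_shape(out_features, in_features):
    """Shape of the packed storage under the assumed flat-column layout."""
    n = out_features * in_features
    return ((n + 1) // 2, 1)


codes = [1, 15, 7, 0, 9]
packed = pack_4bit(codes)
restored = unpack_4bit(packed, len(codes))

# A (14336, 4096) expert projection no longer looks like a 2-D matrix once packed.
print(packed_shape(14336, 4096))
```

Code that indexes or slices expert weights by their original two-dimensional shape therefore cannot be applied to the packed tensors directly, which is the kind of change the paragraph above refers to.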

Pruned models:

We used the C4 dataset as calibration data, one of the two datasets considered in the paper. We produced two pruned versions of the Mixtral-8x7B-Instruct-v0.1 model, the first keeping 6 experts per layer and the second keeping 4, as well as one pruned version of the deepseek-moe-16b-base model (keeping 16 of 64 experts per layer):

Mixtral8x7B-4bit-pruned-1: 6 experts per layer, ~18 GB of VRAM
Mixtral8x7B-4bit-pruned-2: 4 experts per layer, ~12 GB of VRAM
deepseek-moe-16b-pruned: 16 experts per layer, ~4 GB of VRAM
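As a rough sanity check on the Mixtral figures above, the following back-of-the-envelope sketch estimates the memory taken by the expert weights alone at 4 bits per parameter. The architecture values (32 layers, hidden size 4096, intermediate size 14336, three projection matrices per expert) are assumptions taken from the public Mixtral-8x7B configuration, and the helper `expert_weight_gb` is hypothetical. The estimate ignores non-expert weights, quantization constants, and activations, so actual usage is expected to be somewhat higher.

```python
def expert_weight_gb(num_layers, experts_per_layer, hidden_size,
                     intermediate_size, bits_per_param=4,
                     matrices_per_expert=3):
    """Approximate size of the expert weights in GB (1 GB = 1e9 bytes)."""
    params_per_expert = matrices_per_expert * hidden_size * intermediate_size
    total_params = num_layers * experts_per_layer * params_per_expert
    return total_params * bits_per_param / 8 / 1e9


# Assumed Mixtral-8x7B dimensions (from the public model config).
MIXTRAL = dict(num_layers=32, hidden_size=4096, intermediate_size=14336)

six_experts_gb = expert_weight_gb(experts_per_layer=6, **MIXTRAL)
four_experts_gb = expert_weight_gb(experts_per_layer=4, **MIXTRAL)

print(f"6 experts per layer: {six_experts_gb:.1f} GB of expert weights")
print(f"4 experts per layer: {four_experts_gb:.1f} GB of expert weights")
```

Under these assumptions the expert weights account for most of the reported footprint, with the remainder coming from attention layers, embeddings, and quantization metadata.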

