Llama2 Transformer Model in Rust

This repository contains a Rust implementation of the Llama2 Transformer model, with a focus on performance and correctness. It covers model creation, tokenization, and the core operations of the transformer's forward pass, such as matrix multiplication and softmax.
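As an illustration of the matrix multiplication used in a forward pass, here is a minimal sketch of a matrix-vector product over a row-major weight buffer. The function name `matmul`, its argument order, and the flat `f32` layout are assumptions made for this example and are not taken from the repository's source.

```rust
/// Computes `out = W * x`, where `w` holds a `rows x cols` matrix in
/// row-major order. Hypothetical helper for illustration only.
fn matmul(out: &mut [f32], x: &[f32], w: &[f32], rows: usize, cols: usize) {
    assert_eq!(x.len(), cols, "input length must equal cols");
    assert_eq!(out.len(), rows, "output length must equal rows");
    assert_eq!(w.len(), rows * cols, "weight length must equal rows * cols");

    // An empty inner dimension yields an all-zero result.
    if cols == 0 {
        out.fill(0.0);
        return;
    }

    // Each output element is the dot product of one weight row with x.
    for (o, row) in out.iter_mut().zip(w.chunks_exact(cols)) {
        *o = row.iter().zip(x).map(|(a, b)| a * b).sum();
    }
}
```

Iterating over `chunks_exact(cols)` keeps each row contiguous in memory, which tends to be cache-friendly and leaves room for the compiler to vectorize the inner dot product.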

Features

  • Fast matrix multiplication with quantized tensors.
  • Efficient softmax implementation.
  • Custom tokenizer compatible with pre-trained models.
  • Memory-efficient operation utilizing memory mapping.
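
The softmax feature listed above can be sketched as follows. This is a generic, numerically stable in-place softmax, not necessarily the implementation in this repository; the function name `softmax` and the in-place `&mut [f32]` signature are assumptions.

```rust
/// Applies softmax in place to a slice of logits.
/// Hypothetical helper for illustration only.
fn softmax(x: &mut [f32]) {
    if x.is_empty() {
        return;
    }

    // Subtract the maximum before exponentiating so large logits do not overflow.
    let max = x.iter().copied().fold(f32::NEG_INFINITY, f32::max);

    let mut sum = 0.0f32;
    for v in x.iter_mut() {
        *v = (*v - max).exp();
        sum += *v;
    }

    // Normalize so the values sum to 1.
    for v in x.iter_mut() {
        *v /= sum;
    }
}
```

Subtracting the maximum does not change the result mathematically, but it keeps every exponent at or below zero, so inputs such as `[1000.0, 1000.0]` produce `0.5` each instead of overflowing to infinity.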

Getting Started

To get started with the Llama2 Transformer in Rust, clone the repository and build the project using Cargo.

Prerequisites

  • A Rust toolchain with Cargo, which is used to build the project.

Installation

  1. Clone the repository:
git clone https://github.com/your-github-username/llama2-rs.git
cd llama2-rs
  2. Build the project:
cargo build --release
  3. Run the model:
./target/release/llama2_rs llama2.bin -n 100 -m "Once upon a time"