dfdx-mamba

Ports a minimal (non-optimized) implementation of Mamba.

In short, Mamba is an alternative to the attention mechanism, with its own trade-offs. Mamba can be used in RNNs that step over one sequence point at a time (instead of needing to observe multiple sequence points at once), but it has to carry over the previous state, so its memory and time requirements are fixed for each sequence point.
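To make the "carry over the previous state" idea concrete, here is a minimal sketch of a recurrent step with a fixed-size state. It is a toy diagonal linear state-space cell written in plain Rust, not this crate's API and not the full Mamba block: in Mamba the parameters are computed from the input at each step (the "selective" part), whereas here they are constants. The names `ToySsmCell`, `init_state`, `step` and `run_sequence` are hypothetical and exist only for illustration.

```rust
/// Toy diagonal linear state-space cell.
/// Assumption: this is an illustration only, NOT the dfdx-mamba API.
struct ToySsmCell {
    a: Vec<f32>, // per-channel state decay
    b: Vec<f32>, // per-channel input weight
    c: Vec<f32>, // per-channel output weight
}

impl ToySsmCell {
    /// Fresh zeroed state; its size does not depend on the sequence length.
    fn init_state(&self) -> Vec<f32> {
        vec![0.0; self.a.len()]
    }

    /// Consumes one sequence point `x`, updates `state` in place,
    /// and returns the output for that point.
    fn step(&self, state: &mut [f32], x: f32) -> f32 {
        let mut y = 0.0;
        for i in 0..state.len() {
            // h_t = a * h_{t-1} + b * x_t
            state[i] = self.a[i] * state[i] + self.b[i] * x;
            // y_t = sum over channels of c * h_t
            y += self.c[i] * state[i];
        }
        y
    }
}

/// Feeds a sequence one point at a time, carrying the state between steps.
fn run_sequence(cell: &ToySsmCell, xs: &[f32]) -> Vec<f32> {
    let mut state = cell.init_state();
    xs.iter().map(|&x| cell.step(&mut state, x)).collect()
}
```

Each call to `step` touches only the current input and the carried state, so the cost per sequence point stays constant no matter how many points came before.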

Cargo.toml
[dependencies.dfdx-mamba]
git = 'https://github.com/swfsql/dfdx-mamba.git'
branch = "main"
## instead of using a branch, you can pin to a specific commit:
# rev = ""
features = ["nightly", "safetensors"]

Note that this depends on a fork of dfdx that has some draft PRs merged into it:

[dependencies.dfdx]
git = 'https://github.com/swfsql/dfdx.git'
rev = "c4a2995"
# branch = "this-main"
default-features = false
features = ["nightly", "safetensors"]
Example

An example that uses this Mamba block for inference is available here (it can also be run in the browser via WebAssembly).

Implementation References
Learn More
S4
Mamba