Hi, I tested the classifier on my dataset, whose input sequences vary in length up to 1024 tokens, and some of the predictions come out as NaN.
The setup I use is:
from transformers import AutoTokenizer
# MambaForSequenceClassification is assumed to be imported from this repository's classifier module.

MODEL = "state-spaces/mamba-130m-hf"
tokenizer = AutoTokenizer.from_pretrained(MODEL, add_eos_token=True, use_fast=True)

def tokenize(text):
    # Pad or truncate every example to exactly 1024 tokens.
    return tokenizer(text, padding="max_length", truncation=True, max_length=1024)

id2label = {0: "NEGATIVE", 1: "POSITIVE"}
label2id = {"NEGATIVE": 0, "POSITIVE": 1}
model = MambaForSequenceClassification.from_pretrained(MODEL, num_labels=2, id2label=id2label, label2id=label2id, use_cache=False)
# train with HF trainer and dataset
...
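One way to narrow this down is to check whether the NaN predictions line up with how much padding each example carries. The snippet below is only a minimal debugging sketch and not part of the setup above: `find_nan_rows` is a hypothetical helper, and the toy values stand in for real model outputs and tokenizer attention masks.

```python
import math

def find_nan_rows(logits, attention_mask):
    """Return (row_index, real_token_count) for each row whose logits contain NaN.

    logits: list of per-example lists of floats (e.g. predictions.tolist()).
    attention_mask: list of per-example lists of 0/1 from the tokenizer.
    """
    bad = []
    for i, (row, mask) in enumerate(zip(logits, attention_mask)):
        if any(math.isnan(x) for x in row):
            # sum(mask) counts the non-padding tokens in this example.
            bad.append((i, sum(mask)))
    return bad

# Toy data (hypothetical values) standing in for real outputs.
logits = [[0.3, -0.1], [float("nan"), float("nan")], [1.2, 0.4]]
attention_mask = [[1, 1, 1, 0], [1, 0, 0, 0], [1, 1, 1, 1]]

nan_rows = find_nan_rows(logits, attention_mask)
print(nan_rows)
```

If the affected rows turn out to be mostly the short, heavily padded ones, that would point at the padding handling rather than at the data itself.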