Hello, I really liked your crossword repository.
I have a question regarding generative QA. I am building an application that takes multiple documents from the user; on a query, it retrieves the relevant documents and answers the question based on them.
I would like to use the "GPT-J-6B" model for it, and this is the approach I plan to follow:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Step 1: Create a document store (in Haystack) and store all documents.
# Step 2: Ask a question. Assume the retriever returns the top 10 relevant documents.
# Step 3: Append these documents together and pass them as the context.

# Step 4: Create the prompt
prompt = \
f"""Answer the question from the context below. And don't try to make up an answer.
If you don't know the answer, then say I don't know.
Context: {context}
Question: {query}
Answer:"""

# Step 5: Create a model and try generating the answers
# (using gpt-neo-125m for now)
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M")

# Step 6: Get the answers
tokens = tokenizer(prompt, return_tensors="pt")
output = model.generate(**tokens,
                        temperature=0.5,
                        min_length=5,
                        max_length=200,
                        early_stopping=True,
                        do_sample=True,
                        num_beams=8,
                        repetition_penalty=2.0,
                        top_k=50)
print(tokenizer.decode(output[0]))
```
But I am getting an error here because of the long length of the prompt. Will you please help me through this?
```
Input length of input_ids is 553, but max_length is set to 200. This can lead to unexpected behavior. You should consider increasing max_new_tokens.
```
And...
```
The expanded size of the tensor (200) must match the existing size (554) at non-singleton dimension 0. Target sizes: [200]. Tensor sizes: [554]
```
So obviously, I need to truncate the text somehow.
Is this the right (and a fast) way to build such a QA system?
Are my parameters wrong?
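Here is the fix I am considering, as a sketch: truncate the tokenized prompt to fit the model's context window, and use `max_new_tokens` (which counts only generated tokens) instead of `max_length` (which counts prompt plus generated tokens). The context and question below are just stand-ins for my real retrieved documents and query:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M")

# Stand-ins for the concatenated retrieved documents and the user query.
context = "The Eiffel Tower is located in Paris, France."
query = "Where is the Eiffel Tower?"
prompt = f"""Answer the question from the context below.
Context: {context}
Question: {query}
Answer:"""

# Reserve room for the answer inside the model's context window
# (2048 positions for gpt-neo-125M), truncating the prompt if needed.
max_new = 200
tokens = tokenizer(prompt,
                   return_tensors="pt",
                   truncation=True,
                   max_length=model.config.max_position_embeddings - max_new)

# max_new_tokens bounds only the generated continuation, so a long prompt
# no longer triggers the "input length > max_length" warning.
output = model.generate(**tokens,
                        max_new_tokens=max_new,
                        do_sample=True,
                        temperature=0.5,
                        top_k=50,
                        repetition_penalty=2.0)

# Decode only the newly generated tokens, not the echoed prompt.
answer = tokenizer.decode(output[0][tokens["input_ids"].shape[1]:],
                          skip_special_tokens=True)
print(answer)
```

If that is roughly right, I would then slide the same truncation budget over the concatenated top-10 documents before building the prompt.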
Please help. Thanks 🙏