Open
Description
Hi, thanks for the quick support for the qwen3-embedding model!
1. In my scenario, each record in my dataset has the following format:
{"query": "ABC", "response": "ABC", "rejected_response": ["DF", "ER", "FG"]}
2. I ran fine-tuning with the following command:
deepspeed --include localhost:6,7 --master_port 60000 --module tevatron.retriever.driver.train \
--deepspeed tevatron/deepspeed/ds_zero3_config.json \
--output_dir retriever-qwen3-emb-ft \
--model_name_or_path "/data/xzy/Qwen/Qwen3-Embedding-0.6B" \
--lora \
--lora_target_modules q_proj,k_proj,v_proj,o_proj,down_proj,up_proj,gate_proj \
--save_steps 50 \
--dataset_path "/data/search.jsonl" \
--query_prefix "Instruct: Given a scientific claim, retrieve documents that support or refute the claim.\nQuery:" \
--passage_prefix "" \
--bf16 \
--pooling last \
--padding_side left \
--normalize \
--temperature 0.01 \
--per_device_train_batch_size 8 \
--gradient_checkpointing \
--train_group_size 21 \
--learning_rate 1e-4 \
--query_max_len 32 \
--passage_max_len 512 \
--num_train_epochs 10 \
--logging_steps 10 \
--overwrite_output_dir \
--gradient_accumulation_steps 1
3. The run fails with the following error:
[rank0]: Traceback (most recent call last):
[rank0]: File "/data/miniconda3/envs/xzy_swift_new/lib/python3.10/runpy.py", line 196, in _run_module_as_main
[rank0]: return _run_code(code, main_globals, None,
[rank0]: File "/data/miniconda3/envs/xzy_swift_new/lib/python3.10/runpy.py", line 86, in _run_code
[rank0]: exec(code, run_globals)
[rank0]: File "/data/xzt/tevatron/src/tevatron/retriever/driver/train.py", line 114, in <module>
[rank0]: main()
[rank0]: File "/data/xzt/tevatron/src/tevatron/retriever/driver/train.py", line 107, in main
[rank0]: trainer.train(resume_from_checkpoint=(last_checkpoint is not None))
[rank0]: File "/data/miniconda3/envs/xzy_swift_new/lib/python3.10/site-packages/transformers/trainer.py", line 2240, in train
[rank0]: return inner_training_loop(
[rank0]: File "/data/miniconda3/envs/xzy_swift_new/lib/python3.10/site-packages/transformers/trainer.py", line 2509, in _inner_training_loop
[rank0]: batch_samples, num_items_in_batch = self.get_batch_samples(epoch_iterator, num_batches, args.device)
[rank0]: File "/data/miniconda3/envs/xzy_swift_new/lib/python3.10/site-packages/transformers/trainer.py", line 5263, in get_batch_samples
[rank0]: batch_samples.append(next(epoch_iterator))
[rank0]: File "/data/miniconda3/envs/xzy_swift_new/lib/python3.10/site-packages/accelerate/data_loader.py", line 566, in __iter__
[rank0]: current_batch = next(dataloader_iter)
[rank0]: File "/data/miniconda3/envs/xzy_swift_new/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 701, in __next__
[rank0]: data = self._next_data()
[rank0]: File "/data/miniconda3/envs/xzy_swift_new/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 757, in _next_data
[rank0]: data = self._dataset_fetcher.fetch(index) # may raise StopIteration
[rank0]: File "/data/miniconda3/envs/xzy_swift_new/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
[rank0]: data = [self.dataset[idx] for idx in possibly_batched_index]
[rank0]: File "/data/miniconda3/envs/xzy_swift_new/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 52, in <listcomp>
[rank0]: data = [self.dataset[idx] for idx in possibly_batched_index]
[rank0]: File "/data/xzy/tevatron/src/tevatron/retriever/dataset.py", line 151, in __getitem__
[rank0]: query_id = group['query_id']
[rank0]: KeyError: 'query_id'
0%| | 0/267080 [00:00<?, ?it/s]
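The KeyError can be reproduced outside the trainer with a minimal sketch like the one below. It assumes only what the traceback shows, namely that `__getitem__` in `dataset.py` reads `group['query_id']` from each record. The helper name `find_missing_keys` is hypothetical and is not part of Tevatron, and any other keys the loader may require are not checked here.

```python
import json


def find_missing_keys(lines, required=("query_id",)):
    """Return (line_number, missing_keys) for each JSONL record lacking a required key.

    `required` defaults to the single key seen in the traceback; it is an
    assumption that other keys may also be needed by the loader.
    """
    problems = []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines in the JSONL file
        record = json.loads(line)
        missing = [key for key in required if key not in record]
        if missing:
            problems.append((lineno, missing))
    return problems


# One record in the format described in item 1.
sample = ['{"query": "ABC", "response": "ABC", "rejected_response": ["DF", "ER", "FG"]}']
print(find_missing_keys(sample))  # [(1, ['query_id'])]
```

To check the real file, the same function can be called on an open file handle, for example `find_missing_keys(open("/data/search.jsonl", encoding="utf-8"))`.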