B3/data_processing at main · raghavlite/B3

README.md
modify_posnegtext_instruction2.py
spectral_clustering4_ibn.py
spectral_clustering4_ibn_classification.py

Clustering

This step requires the N×N teacher-ranked matrix to be available at MODEL_BASE_PATH.

python spectral_clustering4_ibn.py --dataset MSCOCO_i2t --negs hn --nmax 100 --batch_size 32 --K 5

Adding Instructions

This step uses the clustered outputs generated from the previous stage.

python modify_posnegtext_instruction2.py
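The two steps above run in a fixed order, so a small driver can make that explicit. The following is only a sketch: the script names and flags are copied from the commands in this README, while the function names (`clustering_cmd`, `instruction_cmd`, `run_pipeline`) and the `dry_run` switch are hypothetical helpers introduced here for illustration. It assumes it is launched from the data_processing directory.

```python
import shlex
import subprocess
import sys


def clustering_cmd(dataset="MSCOCO_i2t", negs="hn", nmax=100, batch_size=32, k=5):
    """Build the argument list for the clustering step.

    Defaults mirror the example command in this README; the function
    itself is a hypothetical wrapper, not part of the repository.
    """
    return [
        sys.executable, "spectral_clustering4_ibn.py",
        "--dataset", dataset,
        "--negs", negs,
        "--nmax", str(nmax),
        "--batch_size", str(batch_size),
        "--K", str(k),
    ]


def instruction_cmd():
    """Build the argument list for the instruction step (it takes no flags in the README)."""
    return [sys.executable, "modify_posnegtext_instruction2.py"]


def run_pipeline(dry_run=True):
    """Print both commands in order; execute them only when dry_run is False."""
    commands = [clustering_cmd(), instruction_cmd()]
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the pipeline if clustering fails, so the
            # instruction step never runs on missing cluster outputs.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

With `dry_run=True` the driver only prints the commands, which is a cheap way to check the arguments before starting a long clustering run.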

Several clustering files, generated with different parameters, are already provided in B3/MMEB-train2.
For example, a name ending in
*_bs32bi_30.130_qwen2b means:

Teacher model: vlm2vecqwen2b
Cluster size: $K = 32$
Parameters: $p = 30$, $m = 100$
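When choosing among the provided files, it can help to pull these fields out of a name programmatically. The sketch below is inferred from the single example above and is not a documented naming scheme: the regular expression, the function name `parse_cluster_name`, and the `MSCOCO_i2t_` prefix in the usage example are all assumptions. The middle token (`30.130`) is returned as a raw string, because how it maps to $p$ and $m$ is not spelled out here.

```python
import re

# Assumed layout, based on one example: <prefix>_bs<K><letters>_<params>_<teacher tag>
PATTERN = re.compile(
    r"_bs(?P<cluster_size>\d+)[A-Za-z]*_(?P<params>[^_]+)_(?P<teacher_tag>[^_]+)$"
)


def parse_cluster_name(name):
    """Split a clustering file name into its fields, or return None if it does not match."""
    match = PATTERN.search(name)
    if match is None:
        return None
    return {
        "cluster_size": int(match.group("cluster_size")),  # the number after "bs"
        "params": match.group("params"),                   # kept raw, not interpreted
        "teacher_tag": match.group("teacher_tag"),         # e.g. "qwen2b"
    }


if __name__ == "__main__":
    # The "MSCOCO_i2t_" prefix is a made-up example, not a file known to exist.
    print(parse_cluster_name("MSCOCO_i2t_bs32bi_30.130_qwen2b"))
```

Names that do not follow the assumed layout return `None` rather than raising, so the function can be used to filter a directory listing.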
