Thorin215/GRE

🔍 About GRE Suite

GRE Suite is designed to augment VLMs with structured reasoning chains for accurate and interpretable location inference. It consists of three primary components:

  • Dataset (GRE30K)

GRE30K is a geo-localization reasoning dataset designed to enhance the visual reasoning capability of MLLMs. Specifically, GRE30K consists of GRE30K-CoT for cold-start initialization and GRE30K-Judge for reinforcement learning.

  • Model (GRE)

GRE is a reasoning MLLM that employs a multi-stage reasoning strategy to progressively infer scene attributes, local details, and semantic features, thereby narrowing down candidate geographic regions with enhanced precision.

  • Benchmark (GREval-Bench)

GREval-Bench is a geographical reasoning benchmark. It employs a semi-automated pipeline to curate geographically informative images with explicit and implicit location indicators, and provides annotated Chain-of-Thought steps and reference GPS coordinates for comprehensive evaluation of models' geo-localization capabilities.

🛠️ Requirements and Installation

Basic Dependencies:

  • Python >= 3.8
  • PyTorch >= 2.5.0
  • CUDA Version >= 11.8
  • transformers == 4.40.0
  • tokenizers == 0.19.1
git clone https://github.com/Thorin215/GRE.git
cd GRE
conda create -n GRE python=3.10
conda activate GRE
bash environment.sh

🌟 Getting started

Step 1: Download GRE-7b and set model_name_or_path in infer.ipynb to the path of GRE-7b.

Step 2: Refer to the examples in infer.ipynb for detailed instructions on how to use our model for image geo-localization.
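For illustration, the kind of request infer.ipynb sends can be sketched as below, assuming GRE-7b follows the Qwen2.5-VL chat-message format of its base model (the instruction text and function name here are illustrative, not the repo's actual prompt):

```python
def build_geoloc_messages(image_path: str) -> list:
    """Build a chat-template message list asking the model to
    geo-localize an image with step-by-step reasoning (sketch)."""
    instruction = (
        "Analyze the scene attributes, local details, and semantic "
        "features of this image, then infer its GPS coordinates."
    )
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image_path},
                {"type": "text", "text": instruction},
            ],
        }
    ]
```

A message list like this would typically be passed through the processor's apply_chat_template before generation; see infer.ipynb for the exact usage.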

🚀 Main Results

We perform a comparative analysis of GRE on worldwide geo-localization benchmarks, Im2GPS3k and GWS15k. Our method surpasses the previous state-of-the-art model on Im2GPS3k across all metrics, achieving improvements of +0.5%, +4.2%, +3.0%, +1.7%, and +2.5% at the 1km, 25km, 200km, 750km, and 2500km thresholds, respectively.
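The threshold metric used above (fraction of predictions whose great-circle distance to the ground truth falls within each radius) can be sketched as follows; this is a minimal reference implementation, and the function names are ours, not the repo's:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two GPS points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def threshold_accuracy(preds, gts, thresholds=(1, 25, 200, 750, 2500)):
    """Fraction of (lat, lon) predictions within each km threshold."""
    dists = [
        haversine_km(pa, po, ga, go)
        for (pa, po), (ga, go) in zip(preds, gts)
    ]
    return {t: sum(d <= t for d in dists) / len(dists) for t in thresholds}
```

A prediction counted as correct at 25km is automatically correct at every larger threshold, which is why the reported accuracies are non-decreasing across radii.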

We compare our approach on GREval-Bench with previous generalist models, including the InternVL2.5, InternVL3, and Qwen2.5-VL series. We conduct comprehensive evaluations, analyzing accuracy across different distance thresholds and scenarios, while also assessing the quality of each model's reasoning chains.

🗝️ Training & Evaluation

Training

All datasets for training can be found in Dataset preparation.

The training pipeline of our model is structured into three distinct stages.

  • Stage 1: Cold-start Initialization

    • Download Qwen2.5-VL-7B-Instruct.
    • Set model_name_or_path in stage1.sh to the path of Qwen2.5-VL-7B-Instruct.
    • Prepare GRE30K for cold-start initialization.
    • Run bash scripts/train/stage1.sh.
  • Stage 2: RL stage I

    • Set model_name_or_path in stage2.sh to the path of the Stage 1 checkpoint.
    • Prepare the datasets used for Stage 2.
    • Run bash scripts/train/stage2.sh.
  • Stage 3: RL stage II

    • Set model_name_or_path in stage3.sh to the path of the Stage 2 checkpoint.
    • Prepare the datasets used for Stage 3.
    • Run bash scripts/train/stage3.sh.

Evaluation

For model evaluation, please refer to eval.

📰 Coming Soon

🌏 Checkpoints

Model Name | Base Model | # Training Epochs

🖨️ GRE30K

The dataset can be accessed on 🤗dataset.

GRE30K-CoT Data format:

[
    {
        "image": "images/xxx.jpg",
        "conversations": [
            {
                "from": "human",
                "value": "<image>\n{CoT Instruction}?"
            },
            {
                "from": "gpt",
                "value": "..."
            }
        ],
        "gt_lat": {gt_lat},
        "gt_lon": {gt_lon}
    },
    ...
]

GRE30K-Judge Data format:

[
    {
        "image": "images/xxx.jpg",
        "conversations": [
            {
                "from": "human",
                "value": "<image>\n{Judge Instruction}?"
            },
            {
                "from": "gpt",
                "value": "True/False"
            }
        ],
        "predicted_cot": "{predicted_cot}",
        "predicted_answer": "{predicted_answer}",
        "gt_lat": {gt_lat},
        "gt_lon": {gt_lon}
    },
    ...
]

GRE30K-Seed Data format:

[
    {
        "image": "images/xxx.jpg",
        "instructions": "{Seed Instruction}",
        "gt_lat": {gt_lat},
        "gt_lon": {gt_lon}
    },
    ...
]
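The three record formats above can be told apart by their fields. A minimal helper for routing records when loading the dataset (a sketch; the field names are taken from the formats documented above):

```python
def classify_record(rec: dict) -> str:
    """Classify a GRE30K record as 'seed', 'judge', or 'cot' by its keys."""
    if "instructions" in rec:
        # Only GRE30K-Seed records carry a raw instruction string.
        return "seed"
    if "predicted_cot" in rec and "predicted_answer" in rec:
        # GRE30K-Judge records attach a candidate CoT and answer to verify.
        return "judge"
    if "conversations" in rec:
        # GRE30K-CoT records are plain conversation pairs with GT coordinates.
        return "cot"
    raise ValueError("unrecognized GRE30K record")
```

Note that Judge records also contain a conversations field, so the judge-specific keys must be checked before falling through to CoT.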

🕹️ GREval-Bench

GREval-Bench assesses models in two key areas: localization performance and Chain-of-Thought quality.

  • The annotations of the benchmark can be found in 🤗benchmark.

  • The usage of GREval-Bench is detailed in doc.

📑 Citation

If you find GRE Suite useful for your research and applications, please cite using this BibTeX:
