- Download the NuScenes training and validation datasets, and place them together in the `data/nuscenes` directory (see the layout sketch after this list).
- Install the package requirements from `requirements.txt` and run `pip install flash-attn==2.8.0.post2 --no-build-isolation`.
- To make sure you can download all of the tested models, authenticate your machine with Hugging Face using the `huggingface-cli` by running:

```bash
huggingface-cli login
```
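A minimal sketch of the expected data layout, assuming the standard NuScenes extraction structure (the subdirectory names below are the usual NuScenes ones and are not confirmed by this repository):

```bash
mkdir -p data/nuscenes
# After extracting the trainval archives, the directory typically contains:
#   data/nuscenes/samples/        # keyframe sensor data
#   data/nuscenes/sweeps/         # intermediate sensor frames
#   data/nuscenes/maps/           # map files
#   data/nuscenes/v1.0-trainval/  # metadata and annotation tables
```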
Make sure you have the correct CUDA and driver version (>=12.8) available on your system.
We supply two images for every release on Docker Hub: one for running the evaluation of the current implementation and one for running the training loop.
To run the training, make sure to mount the models directory:

```bash
docker run --gpus all -v ./models:/app/models <img-name>
```

To run the eval, make sure to mount the output directory:

```bash
docker run --gpus all -v ./data/output:/app/data/output <img-name>
```

Notebooks for running training and evaluation are in the `notebooks` directory.
To evaluate, please use the official test server.
To evaluate locally, you need to install the language-evaluation packages from https://github.com/bckim92/language-evaluation, as sketched below.
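A minimal install sketch, assuming the steps from that repository's README (check it for the authoritative instructions and its Java/Python prerequisites):

```bash
# Install the package directly from GitHub.
pip install git+https://github.com/bckim92/language-evaluation.git
# One-time download of the bundled evaluation tools (COCO caption metrics).
python -c "import language_evaluation; language_evaluation.download('coco')"
```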
To generate the test dataset from the train dataset, run the following command in your project root:

```bash
python -m src.data.extract_test_dataset
```

Once installed, you can run the evaluation with:

```bash
python evaluation.py --prediction_file <predictions> --test_file <ground_truth> --output_path <output>
```

To ignore missing prediction answers during evaluation, use the `--ignore_missing` flag.
To overwrite an existing output file, use the `--override` flag.
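Putting the flags together, an illustrative invocation (the file names are placeholders, not files shipped with the repository):

```bash
python evaluation.py \
    --prediction_file data/output/predictions.json \
    --test_file data/test.json \
    --output_path results/ \
    --ignore_missing \
    --override
```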
To use the chat evaluation feature, follow these steps:
- **Set Up API Keys**:
  - Add the API key for the provider (`OPENAI_API_KEY` or `GEMINI_API_KEY`) in your `.env` file (see the example after this list).
- **Install Required Dependencies**:
  - Install `dotenv` for environment variable management: `pip install dotenv`
  - Install the SDK for your chosen provider:
    - For OpenAI: `pip install openai`
    - For Google: `pip install google-genai`
- **Specify Provider and Model**:
  - Use the `--chat_provider` flag to specify the LLM provider (`openai` or `google`).
  - Use the `--chat_model` flag to specify the model (see the invocation sketch after this list).
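A minimal `.env` sketch (the key values are placeholders; set whichever provider's key you actually use, and never commit real keys to version control):

```bash
# .env — hypothetical contents
OPENAI_API_KEY=sk-...your-key...
GEMINI_API_KEY=...your-key...
```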
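An illustrative chat-evaluation invocation (the file names are placeholders and `gpt-4o` is an example model name, not a repository default):

```bash
python evaluation.py \
    --prediction_file data/output/predictions.json \
    --test_file data/test.json \
    --output_path results/ \
    --chat_provider openai \
    --chat_model gpt-4o
```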
Alternatively, you can use the Docker image built from `Dockerfile-score`. Make sure to mount your evaluation and gpt-evaluation files, the prediction file, the ground truth file, and an output path for the results:

```bash
docker run --rm -v "$(pwd)":/app <image-name> \
    --prediction_file <predictions> \
    --test_file <ground_truth> \
    --output_path <output>
```

- To pass the API-key secrets at runtime, add `--env-file .env` to the `docker run` command (see the example below).
- Ensure all required files are in the current directory (`$(pwd)`), or adjust the volume path accordingly.
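For example, a complete run with the secrets supplied at runtime (image and file names are placeholders):

```bash
docker run --rm --env-file .env -v "$(pwd)":/app <image-name> \
    --prediction_file predictions.json \
    --test_file ground_truth.json \
    --output_path results/
```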
> **Note**
> The name of the prediction file is used to name the results file, so make sure to name it accordingly.
| Folder | File | accuracy | chatgpt | language/Bleu_1 | language/Bleu_2 | language/Bleu_3 | language/Bleu_4 | language/ROUGE_L | language/CIDEr | match | final_score |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Qwen_Qwen2.5-VL-3B-Instruct | baseline_eval.json | 0.0 | 67.3456904541242 | 0.23925483214648655 | 0.11544553891542789 | 0.057574440433052446 | 0.020769188970051755 | 0.19079430087529606 | 0.006791738547463531 | 32.46449704142012 | 0.3542940542224235 |
| Google_Gemma-3-4b | baseline_eval.json | 0.0 | 64.21501390176088 | 0.20031983735402523 | 0.07057776733950633 | 0.020562285652540205 | 0.0065920170980479755 | 0.15465341169836444 | 0.002624376073528029 | 35.52662721893491 | 0.3432085651226965 |
| OpenGVLab_InternVL3-2B | baseline_eval.json | 0.0 | 68.50231696014829 | 0.19311353330793474 | 0.07443754271197525 | 0.02748667571393649 | 0.009723712529328537 | 0.1677772971700306 | 0.005031769222350847 | 22.62869822485207 | 0.3355647203008346 |