
csiemssen/APP-RAS-Driving-with-Language


🏆 DriveLM Challenge

📄 Presentations and Report

⚒️ Tools

Setup

  • Download the nuScenes training and validation datasets and place them together in the data/nuscenes directory.
  • Install the package requirements from requirements.txt, then run pip install flash-attn==2.8.0.post2 --no-build-isolation.
  • To download all of the tested models, authenticate your machine with Hugging Face by running huggingface-cli login.
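
Before starting a long run, it can help to confirm that the layout described above is in place. The following is a minimal sketch, not part of this repository: the function name `check_setup` is hypothetical, and it only checks for the data/nuscenes directory and requirements.txt mentioned in the setup steps.

```python
from pathlib import Path


def check_setup(project_root: Path) -> list[str]:
    """Return a list of problems found; an empty list means the layout looks fine.

    Hypothetical helper: it only checks the paths named in the setup steps.
    """
    problems = []

    # The setup steps place the nuScenes data under data/nuscenes.
    nuscenes_dir = project_root / "data" / "nuscenes"
    if not nuscenes_dir.is_dir():
        problems.append(f"missing dataset directory: {nuscenes_dir}")
    elif not any(nuscenes_dir.iterdir()):
        problems.append(f"dataset directory is empty: {nuscenes_dir}")

    # requirements.txt is expected in the project root.
    if not (project_root / "requirements.txt").is_file():
        problems.append("requirements.txt not found; run from the project root")

    return problems


if __name__ == "__main__":
    for problem in check_setup(Path.cwd()):
        print(problem)
```

Running the file from the project root prints one line per problem and nothing when both paths are present.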

Docker

Make sure your system has CUDA 12.8 or later and a driver that supports it.

We will supply two images on Docker Hub for every release: one for running the eval of the current implementation and one for running the training loop.

To run the training, make sure to mount the models directory:

docker run --gpus all -v ./models:/app/models <img-name>

To run the eval, make sure to mount the output directory:

docker run --gpus all -v ./data/output:/app/data/output <img-name>

Notebooks

The notebooks directory contains notebooks for running training and evaluation.

  • Training: Open In Colab
  • Evaluation: Open In Colab
  • Test: Open In Colab

Evaluation

To evaluate, please use the official test server.
To evaluate locally, install the language evaluation packages by following the instructions at https://github.com/bckim92/language-evaluation.

To generate the test dataset from the train dataset, run the following command in your project root:

python -m src.data.extract_test_dataset

Once the language evaluation packages are installed, you can run the evaluation with:

python evaluation.py --prediction_file <predictions> --test_file <ground_truth> --output_path <output>

To skip questions that have no predicted answer, use the --ignore_missing flag. To overwrite an existing output file, use the --override flag.
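
The flags described above can be pictured as the following argument parser. This is an illustrative sketch built only from the flag names in this README; the help texts and the function name `build_parser` are assumptions, and the repository's actual evaluation.py may define its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the flags named in this README (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Score a prediction file against a ground-truth file."
    )
    parser.add_argument("--prediction_file", required=True,
                        help="path to the model predictions")
    parser.add_argument("--test_file", required=True,
                        help="path to the ground-truth test file")
    parser.add_argument("--output_path", required=True,
                        help="where the results are written")
    # Boolean switches: absent means False, present means True.
    parser.add_argument("--ignore_missing", action="store_true",
                        help="skip questions that have no predicted answer")
    parser.add_argument("--override", action="store_true",
                        help="overwrite an existing output file")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```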

To use the chat evaluation feature, follow these steps:

  1. Set Up API Keys:

    • Add the API key for the provider (OPENAI_API_KEY or GEMINI_API_KEY) in your .env file.
  2. Install Required Dependencies:

    • Install python-dotenv for environment variable management:
      pip install python-dotenv
    • Install the SDK for your chosen provider:
      • For OpenAI:
        pip install openai
      • For Google:
        pip install google-genai
  3. Specify Provider and Model:

    • Use the --chat_provider flag to specify the LLM provider (openai or google).
    • Use the --chat_model flag to specify the model.
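
The relationship between the provider name and its API key can be sketched as below. The environment variable names and provider names come from the steps above; the function `resolve_api_key` and its error messages are hypothetical and only illustrate the lookup, assuming the .env file has already been loaded into the environment.

```python
import os

# Provider name (as passed to --chat_provider) -> environment variable holding its key.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "google": "GEMINI_API_KEY",
}


def resolve_api_key(provider, env=None):
    """Return the API key for `provider`, or raise with a readable message.

    Hypothetical helper: `env` defaults to os.environ and can be replaced
    with a plain dict for testing.
    """
    env = os.environ if env is None else env
    if provider not in PROVIDER_ENV_VARS:
        known = ", ".join(sorted(PROVIDER_ENV_VARS))
        raise ValueError(f"unknown provider {provider!r}; expected one of: {known}")
    var_name = PROVIDER_ENV_VARS[provider]
    key = env.get(var_name)
    if not key:
        raise RuntimeError(f"{var_name} is not set; add it to your .env file")
    return key
```

Failing early with the name of the missing variable avoids a less obvious authentication error later in the run.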

Alternatively, you can use the Docker image built from Dockerfile-score. Make sure to mount your evaluation and gpt-evaluation files, the prediction file, the ground truth file, and an output path for the results:

docker run --rm -v "$(pwd)":/app <image-name> \
    --prediction_file <predictions> \
    --test_file <ground_truth> \
    --output_path <output>
  • To pass the API keys as secrets at runtime, add --env-file .env to the docker run command.
  • Ensure all required files are in the current directory ($(pwd)), or adjust the volume path accordingly.

Note

The name of the prediction file is used to name the results file, so make sure to name it accordingly.
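
As an illustration of why the prediction file name matters, the sketch below derives a results path from it. The `_results.json` suffix, the function name `results_path`, and the overwrite check are assumptions made for this example; the exact naming scheme used by the evaluation script is not stated here.

```python
from pathlib import Path


def results_path(prediction_file, output_path, override=False):
    """Derive a results file path from the prediction file name.

    Hypothetical scheme: <output_path>/<prediction stem>_results.json.
    """
    stem = Path(prediction_file).stem  # "preds/run1.json" -> "run1"
    target = Path(output_path) / f"{stem}_results.json"
    # Refuse to clobber earlier results unless explicitly asked to.
    if target.exists() and not override:
        raise FileExistsError(f"{target} already exists; pass override=True to replace it")
    return target
```

Under this scheme, two runs whose prediction files share a name would collide, which is why giving each prediction file a distinct, descriptive name is worthwhile.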

Results

| Folder | File | accuracy | chatgpt | language/Bleu_1 | language/Bleu_2 | language/Bleu_3 | language/Bleu_4 | language/ROUGE_L | language/CIDEr | match | final_score |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Qwen_Qwen2.5-VL-3B-Instruct | baseline_eval.json | 0.0 | 67.3456904541242 | 0.23925483214648655 | 0.11544553891542789 | 0.057574440433052446 | 0.020769188970051755 | 0.19079430087529606 | 0.006791738547463531 | 32.46449704142012 | 0.3542940542224235 |
| Google_Gemma-3-4b | baseline_eval.json | 0.0 | 64.21501390176088 | 0.20031983735402523 | 0.07057776733950633 | 0.020562285652540205 | 0.0065920170980479755 | 0.15465341169836444 | 0.002624376073528029 | 35.52662721893491 | 0.3432085651226965 |
| OpenGVLab_InternVL3-2B | baseline_eval.json | 0.0 | 68.50231696014829 | 0.19311353330793474 | 0.07443754271197525 | 0.02748667571393649 | 0.009723712529328537 | 0.1677772971700306 | 0.005031769222350847 | 22.62869822485207 | 0.3355647203008346 |
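
The final_score values in the results above are consistent with a weighted sum of the four metric groups: 0.4 for chatgpt, and 0.2 each for language, match, and accuracy. The sketch below is inferred from the reported numbers and appears to follow the DriveLM challenge weighting; it is not copied from this repository, and the function name and argument layout are assumptions.

```python
def final_score(accuracy, chatgpt, bleu, rouge_l, cider, match):
    """Combine the reported metrics into a single score.

    accuracy : fraction in [0, 1]
    chatgpt  : score on a 0-100 scale
    bleu     : sequence of four values (Bleu_1 .. Bleu_4)
    rouge_l  : ROUGE_L value
    cider    : CIDEr value (divided by 10 before averaging)
    match    : score on a 0-100 scale
    """
    # Language score: mean of (average BLEU, ROUGE_L, CIDEr / 10).
    language = sum(bleu) / 4 / 3 + rouge_l / 3 + cider / 10 / 3
    return (
        0.4 * (chatgpt / 100)
        + 0.2 * language
        + 0.2 * (match / 100)
        + 0.2 * accuracy
    )


if __name__ == "__main__":
    # Values from the Qwen_Qwen2.5-VL-3B-Instruct row.
    score = final_score(
        accuracy=0.0,
        chatgpt=67.3456904541242,
        bleu=[0.23925483214648655, 0.11544553891542789,
              0.057574440433052446, 0.020769188970051755],
        rouge_l=0.19079430087529606,
        cider=0.006791738547463531,
        match=32.46449704142012,
    )
    print(score)
```

Because chatgpt carries twice the weight of any other group and accuracy is 0.0 for all three baselines, the ranking here is driven mostly by the chatgpt and match columns.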
