This repo provides a minimal set of instructions for finetuning OpenVLA-OFT on the BEHAVIOR-1K Challenge dataset.
git clone https://github.com/evansh666/openvla-oft.git
git clone https://github.com/StanfordVL/BEHAVIOR-1K.git
You can also start from the original openvla-oft repo; these finetuning instructions are adapted from its ALOHA finetuning task.
conda deactivate
conda create -n openvla-oft python=3.10 -y
conda activate openvla-oft
cd openvla-oft
pip install -e .
pip install packaging ninja
ninja --version; echo $? # Verify Ninja --> should return exit code "0"
pip install "flash-attn==2.5.5" --no-build-isolation
# Install remaining RLDS dataset dependencies (tensorflow and tensorflow_datasets
# are typically pulled in by `pip install -e .` above; add them here if missing)
pip install tensorflow_hub
pip install apache_beam
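Optionally confirm that all four RLDS dependencies resolve:
python -c "import tensorflow as tf, tensorflow_datasets as tfds, tensorflow_hub, apache_beam; print(tf.__version__, tfds.__version__)"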
# Install the BEHAVIOR env for server deployment
cd ../BEHAVIOR-1K # go to BEHAVIOR-1K directory
# Install bddl
cd bddl
pip install -e .
pip install pymeshlab==2022.2.post4
# Install OmniGibson with eval dependencies
cd ../OmniGibson
pip install .
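Optionally check that OmniGibson imports cleanly before moving on:
python -c "import omnigibson; print('OmniGibson OK')"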
Since OpenVLA-OFT consumes RLDS-formatted datasets, we first need to convert the BEHAVIOR dataset into RLDS format.

- See instructions for converting to RLDS here.

- A sample BEHAVIOR-to-RLDS conversion script is available here; use the following commands to build the RLDS-formatted data:
cd RLDS_builder
cd behavior_dataset/behavior_turn_on_radio
tfds build --data_dir /path/to/save/rlds/dataset
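To verify the build, you can load the dataset back with tfds. The version directory name (1.0.0) below is an assumption; it depends on the VERSION set in your builder:

python - <<'EOF'
import tensorflow_datasets as tfds
# tfds writes the dataset under <data_dir>/<dataset_name>/<version>
builder = tfds.builder_from_directory("/path/to/save/rlds/dataset/behavior_turn_on_radio/1.0.0")
print(builder.info)
episode = next(iter(builder.as_dataset(split="train")))
print(episode.keys())
EOF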
- If you want to build a custom dataset, revise the dataset builder (e.g., behavior_turn_on_radio_dataset_builder.py); a rough sketch of the builder structure follows.
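For orientation, here is a hedged sketch of what such a TFDS builder typically looks like. The observation keys, tensor shapes, and the load_raw_episodes helper are illustrative assumptions; they must match your robot's data and what the openvla-oft dataloader expects:

import numpy as np
import tensorflow_datasets as tfds

class BehaviorTurnOnRadio(tfds.core.GeneratorBasedBuilder):
    """Illustrative RLDS builder sketch; keys and shapes are assumptions."""
    VERSION = tfds.core.Version("1.0.0")

    def _info(self) -> tfds.core.DatasetInfo:
        return self.dataset_info_from_configs(
            features=tfds.features.FeaturesDict({
                "steps": tfds.features.Dataset({
                    "observation": tfds.features.FeaturesDict({
                        "image": tfds.features.Image(shape=(224, 224, 3)),
                        "state": tfds.features.Tensor(shape=(8,), dtype=np.float32),
                    }),
                    "action": tfds.features.Tensor(shape=(7,), dtype=np.float32),
                    "language_instruction": tfds.features.Text(),
                    "is_first": tfds.features.Scalar(dtype=np.bool_),
                    "is_last": tfds.features.Scalar(dtype=np.bool_),
                }),
            })
        )

    def _split_generators(self, dl_manager):
        return {"train": self._generate_examples("/path/to/raw/episodes")}

    def _generate_examples(self, path):
        # load_raw_episodes is a hypothetical helper that yields one list
        # of step dicts per demonstration trajectory.
        for i, episode in enumerate(load_raw_episodes(path)):
            yield str(i), {"steps": episode}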
A few files in OpenVLA-OFT+ need changes to adapt to the new robot:
- Register the dataset (e.g., behavior_turn_on_radio) with the openvla-oft dataloader by adding an entry in each of the following files (a rough sketch of the pattern appears after this list):
  - Add an entry in StateEncoding and ActionEncoding, and add a dataset name mapping, in configs.py (here)
  - Add a data transform in transforms.py (here)
  - Add a data mixture proportion in mixtures.py (here)
  - Set the BEHAVIOR constants, e.g., the desired action chunk size (here)
  - Add the normalization and absolute-action masks in materialize.py (here)
  - Add BEHAVIOR to the three-camera-view selection (here)
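As a rough illustration of the registration pattern (the dictionary keys and enum values below mirror the style of the existing ALOHA entries but are assumptions, not verbatim repo code; copy the exact structure from the neighboring entries in each file):

# configs.py-style entry: observation keys must match the RLDS builder above.
"behavior_turn_on_radio": {
    "image_obs_keys": {"primary": "image", "secondary": None, "wrist": None},
    "depth_obs_keys": {"primary": None, "secondary": None, "wrist": None},
    "state_obs_keys": ["state"],
    "state_encoding": StateEncoding.JOINT,      # extend the enum if your robot needs a new encoding
    "action_encoding": ActionEncoding.JOINT_POS,
},

# transforms.py-style transform: map raw RLDS fields onto the canonical
# keys the dataloader expects.
def behavior_dataset_transform(trajectory):
    trajectory["observation"]["proprio"] = trajectory["observation"]["state"]
    return trajectory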
- Revise the dataset name and settings in finetune.sh. For more detailed parameter selection, please refer to the OpenVLA finetuning instructions.
torchrun --standalone --nnodes 1 --nproc-per-node X vla-scripts/finetune.py \
--vla_path openvla/openvla-7b \
--data_root_dir /PATH/TO/RLDS/DATASETS/DIR/ \
--dataset_name /YOUR/DATASET/NAME \
--run_root_dir /YOUR/CHECKPOINTS/AND/LOG/DIR/ \
--use_l1_regression True \
--use_diffusion False \
--use_film True \
--num_images_in_input 3 \
--use_proprio True \
--batch_size 4 \
--learning_rate 5e-4 \
--num_steps_before_decay 50000 \
--max_steps 100005 \
--use_val_set True \
--val_freq 10000 \
--save_freq 10000 \
--save_latest_checkpoint_only False \
--image_aug True \
--lora_rank 32 \
--wandb_entity "YOUR_WANDB_ENTITY" \
--wandb_project "YOUR_WANDB_PROJECT" \
--run_id_note parallel_dec--25_acts_chunk--continuous_acts--L1_regression--3rd_person_img--left_right_wrist_imgs--proprio_state--film
After finetuning, you can deploy the checkpoint and evaluate it on BEHAVIOR.

- Deploy the finetuned checkpoint:
python vla-scripts/deploy.py \
--pretrained_checkpoint /PATH/TO/FINETUNED/MODEL/CHECKPOINT/DIR/ \
--use_l1_regression True \
--use_film True \
--num_images_in_input 3 \
--use_proprio True \
--center_crop True \
--unnorm_key /NAME/OF/DATASET
# Or directly run after modifying deploy.sh
./scripts/deploy.sh
This starts a policy server listening on 0.0.0.0:8000.
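Before launching the evaluation, you can optionally confirm the server is reachable (this check assumes client and server run on the same machine):

python -c "import socket; socket.create_connection(('localhost', 8000), timeout=5).close(); print('server reachable')"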
- Run the evaluation on BEHAVIOR
# activate your installed behavior environment
# if you haven't installed the behavior environment, check https://github.com/StanfordVL/BEHAVIOR-1K
conda deactivate
conda activate behavior
# run eval script
cd BEHAVIOR-1K/OmniGibson/omnigibson/learning
python eval.py policy=websocket task.name=turning_on_radio