rescaled collision reward based on vehicle speed #251

riccardosavorgnan · 2026-01-11T22:10:11Z

first working version
made it a configurable option

greptile-apps · 2026-01-11T22:24:28Z

Greptile Overview

Greptile Summary

This PR implements speed-based collision reward scaling for the driving simulation environment. When enabled via the new speed_based_collisions_reward configuration option, collision penalties are scaled based on the agent's velocity using a sigmoid function—vehicles moving faster receive larger penalties than slower vehicles.

Core Implementation:

Adds compute_collision_reward() function that calculates vehicle speed from velocity components (vx, vy) and applies sigmoid scaling
The sigmoid is calibrated so that 0 m/s yields ~50% penalty and 10 m/s yields ~95% penalty
New configuration option speed_based_collisions_reward (0=disabled, 1=enabled) threads through Python → C binding → environment struct
Modified c_step() to conditionally apply scaled vs. flat collision rewards

Integration:
The feature is properly integrated across the codebase: configuration parsing (env_config.h), Python interface (drive.py), C binding (binding.c), visualization (visualize.c), and demo code (drive.c). All files correctly propagate the new flag.

Issues Found:

Critical: Integer division bug in speed normalization (line 2018 of drive.h) breaks the sigmoid calibration
Breaking: Deletion of puffer_drive_weights.bin will break existing tests and code references
Unrelated: Config change reducing num_maps from 10000 to 1000 appears unintentional

Confidence Score: 1/5

Not safe to merge - contains critical math bug and breaking changes
The PR has a critical integer division bug that breaks the core feature's speed normalization logic. Additionally, two unrelated changes (weights file deletion and num_maps reduction) will break existing tests and potentially training pipelines. The speed-based collision reward feature itself is well-integrated across the codebase, but the math error means it won't work as intended.
pufferlib/ocean/drive/drive.h (fix integer division), pufferlib/config/ocean/drive.ini (revert num_maps), pufferlib/resources/drive/puffer_drive_weights.bin (restore file or update references)

Important Files Changed

File Analysis

Filename	Score	Overview
pufferlib/ocean/drive/drive.h	1/5	Adds sigmoid function and speed-based collision reward calculation. Critical integer division bug at line 2018 causes incorrect speed normalization.
pufferlib/ocean/drive/drive.c	5/5	Adds speed_based_collisions_reward field initialization in demo() function. No issues found.
pufferlib/ocean/drive/drive.py	5/5	Adds speed_based_collisions_reward parameter to Drive class constructor and passes it through to C environment. No issues found.
pufferlib/ocean/drive/binding.c	5/5	Adds speed_based_collisions_reward assignment from config to environment. No issues found.
pufferlib/ocean/drive/visualize.c	5/5	Adds speed_based_collisions_reward field initialization in eval_gif() function. No issues found.
pufferlib/ocean/env_config.h	5/5	Adds speed_based_collisions_reward field to env_init_config struct and INI parser handler. No issues found.
pufferlib/config/ocean/drive.ini	2/5	Enables speed_based_collisions_reward feature. Includes unrelated num_maps reduction from 10000 to 1000.
pufferlib/resources/drive/puffer_drive_weights.bin	0/5	Binary weights file deleted. This file is referenced by tests and other code, deletion will cause breakage.

Sequence Diagram

sequenceDiagram
    participant Config as drive.ini
    participant Python as drive.py
    participant Binding as binding.c
    participant Env as Drive struct
    participant Step as c_step()
    participant Compute as compute_collision_reward()
    
    Config->>Python: speed_based_collisions_reward=1
    Python->>Binding: Initialize environment
    Binding->>Env: Set speed_based_collisions_reward flag
    
    loop Each simulation step
        Step->>Step: Detect vehicle collision
        alt speed_based_collisions_reward == 1
            Step->>Compute: compute_collision_reward(env, agent_idx)
            Compute->>Compute: Calculate current_speed from vx, vy
            Compute->>Compute: rescaled_speed = current_speed / (10/3)
            Note over Compute: BUG: Integer division 10/3=3
            Compute->>Compute: sigmoid(rescaled_speed)
            Compute->>Step: Return scaled collision_reward
            Step->>Env: Apply scaled reward
        else speed_based_collisions_reward == 0
            Step->>Env: Apply flat collision_reward
        end
    end

greptile-apps

_{4 files reviewed, 4 comments}

_{Edit Code Review Agent Settings | Greptile}

greptile-apps · 2026-01-11T22:24:31Z

pufferlib/ocean/drive/drive.h

+    float collision_reward = env->reward_vehicle_collision;
+
+    // normalize speed such that 0 m/s gets a 50% of the reward and it gets 95% of the reward at 10 m/s
+    float rescaled_speed = current_speed / (10 / 3); // rescale so that 10 becomes 3


[P0] Integer division bug: 10 / 3 evaluates to 3 (integer division), not 3.333... as intended. This breaks the speed normalization—at 10 m/s the rescaled_speed becomes 10 / 3 = 3.0 as expected, but the calculation is performed using integer division. Change to 10.0f / 3.0f for correct floating-point division.

Prompt To Fix With AI

This is a comment left during a code review. Path: pufferlib/ocean/drive/drive.h Line: 2018:2018 Comment: [P0] Integer division bug: `10 / 3` evaluates to `3` (integer division), not `3.333...` as intended. This breaks the speed normalization—at 10 m/s the rescaled_speed becomes `10 / 3 = 3.0` as expected, but the calculation is performed using integer division. Change to `10.0f / 3.0f` for correct floating-point division. How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-01-11T22:24:32Z

pufferlib/ocean/drive/drive.h

    env->entities[agent_idx].steering_angle = 0.0f;
 }

+double sigmoid(float x) { return 1.0 / (1.0 + exp(-x)); }


[P2] Type inconsistency: Function returns double but is used in float contexts. Consider changing return type to float or cast the parameter x to double for consistency. Current usage causes implicit conversions.

Prompt To Fix With AI

This is a comment left during a code review. Path: pufferlib/ocean/drive/drive.h Line: 2006:2006 Comment: [P2] Type inconsistency: Function returns `double` but is used in `float` contexts. Consider changing return type to `float` or cast the parameter `x` to `double` for consistency. Current usage causes implicit conversions. How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-01-11T22:24:33Z

pufferlib/config/ocean/drive.ini

 termination_mode = 1 # 0 - terminate at episode_length, 1 - terminate after all agents have been reset
 map_dir = "resources/drive/binaries/training"
-num_maps = 10000
+num_maps = 1000


[P1] Unrelated change: num_maps reduced from 10000 to 1000. This change appears unrelated to the speed-based collision reward feature described in the PR. If intentional, should be in a separate commit or documented in the PR description.

_{Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!}

Prompt To Fix With AI

This is a comment left during a code review. Path: pufferlib/config/ocean/drive.ini Line: 52:52 Comment: [P1] Unrelated change: `num_maps` reduced from 10000 to 1000. This change appears unrelated to the speed-based collision reward feature described in the PR. If intentional, should be in a separate commit or documented in the PR description. <sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub> How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-01-11T22:24:38Z

Additional Comments (1)

pufferlib/resources/drive/puffer_drive_weights.bin
[P1] Unrelated deletion: This 2.4MB weights file is referenced by multiple files (tests/test_drive_render.py, pufferlib/utils.py, pufferlib/ocean/drive/visualize.c, pufferlib/ocean/drive/drive.c). Deleting it will break tests and code that depend on it. This change appears unrelated to the speed-based collision feature.

Prompt To Fix With AI

This is a comment left during a code review.
Path: pufferlib/resources/drive/puffer_drive_weights.bin
Line: 1:1

Comment:
[P1] Unrelated deletion: This 2.4MB weights file is referenced by multiple files (`tests/test_drive_render.py`, `pufferlib/utils.py`, `pufferlib/ocean/drive/visualize.c`, `pufferlib/ocean/drive/drive.c`). Deleting it will break tests and code that depend on it. This change appears unrelated to the speed-based collision feature.

How can I resolve this? If you propose a fix, please make it concise.

riccardosavorgnan added 3 commits January 7, 2026 22:42

first working version

ed7ca76

made it a configurable option

f92e9bf

Merge branch '3.0' into speed_based_collision

41f5c0b

riccardosavorgnan mentioned this pull request Jan 11, 2026

Collision reward based on speed #233

Open

riccardosavorgnan marked this pull request as ready for review January 11, 2026 22:21

greptile-apps bot reviewed Jan 11, 2026

View reviewed changes

riccardosavorgnan added 2 commits January 12, 2026 11:31

fixes based on greptile suggestions

ff56db9

revert to 10000 maps default

f18d121

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

rescaled collision reward based on vehicle speed #251

rescaled collision reward based on vehicle speed #251

Uh oh!

riccardosavorgnan commented Jan 11, 2026

Uh oh!

greptile-apps bot commented Jan 11, 2026

Uh oh!

greptile-apps bot left a comment

Uh oh!

greptile-apps bot Jan 11, 2026

Uh oh!

greptile-apps bot Jan 11, 2026

Uh oh!

greptile-apps bot Jan 11, 2026

Uh oh!

greptile-apps bot commented Jan 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

rescaled collision reward based on vehicle speed #251

Are you sure you want to change the base?

rescaled collision reward based on vehicle speed #251

Uh oh!

Conversation

riccardosavorgnan commented Jan 11, 2026

Uh oh!

greptile-apps bot commented Jan 11, 2026

Greptile Overview

Greptile Summary

Confidence Score: 1/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Jan 11, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Jan 11, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Jan 11, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot commented Jan 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants