Contributor

Copilot AI commented Aug 18, 2025

This PR improves the handling of empty predictions in utilities.py by modifying format_boxes to return empty DataFrames with correct structure instead of None, and adds type annotations for better code clarity.

Problem

Previously, when processing predictions that contained no detections, developers had to write manual boilerplate code to handle the empty case:

for pred in predictions:
    if len(pred["boxes"]) == 0:
        # Manual handling required
        y_pred = {}
        y_pred["y"] = torch.zeros(4)
        y_pred["labels"] = torch.zeros(1)
        y_pred["scores"] = torch.zeros(1)
    else:
        geom_type = utilities.determine_geometry_type(pred)
        result = utilities.format_geometry(pred, geom_type=geom_type)

This approach was error-prone, inconsistent, and required developers to understand the internal structure of predictions.

Solution

The improved format_geometry function now provides a unified interface that:

  1. Automatically detects geometry type using determine_geometry_type when not explicitly provided
  2. Handles empty predictions gracefully by returning empty DataFrames with correct structure and dtypes
  3. Maintains consistency between empty and non-empty results

# Simplified approach
for pred in predictions:
    result = utilities.format_geometry(pred)
    # Empty predictions handled automatically!
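
As a rough illustration of the empty-prediction behavior described above, here is a minimal sketch. It is not the code from this PR: it uses NumPy arrays in place of PyTorch tensors so it runs without torch, and the function name `format_boxes_sketch`, the column names, and the dtypes are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Assumed column layout and dtypes; the real ones are defined in utilities.py.
BOX_DTYPES = {
    "xmin": "float64",
    "ymin": "float64",
    "xmax": "float64",
    "ymax": "float64",
    "label": "int64",
    "score": "float64",
}


def format_boxes_sketch(prediction: dict, scores: bool = True) -> pd.DataFrame:
    """Hypothetical stand-in for format_boxes that accepts NumPy arrays."""
    # Drop the score column from the schema when scores are not requested.
    dtypes = {k: v for k, v in BOX_DTYPES.items() if scores or k != "score"}
    boxes = np.asarray(prediction["boxes"], dtype="float64").reshape(-1, 4)

    if len(boxes) == 0:
        # Empty case: same columns and dtypes as a populated result, zero rows.
        return pd.DataFrame(columns=list(dtypes)).astype(dtypes)

    df = pd.DataFrame(boxes, columns=["xmin", "ymin", "xmax", "ymax"])
    df["label"] = np.asarray(prediction["labels"])
    if scores:
        df["score"] = np.asarray(prediction["scores"])
    return df.astype(dtypes)


empty = format_boxes_sketch(
    {"boxes": np.zeros((0, 4)), "labels": np.zeros(0), "scores": np.zeros(0)}
)
full = format_boxes_sketch(
    {
        "boxes": np.array([[0.0, 0.0, 10.0, 10.0]]),
        "labels": np.array([0]),
        "scores": np.array([0.9]),
    }
)
```

In this sketch the empty and populated results share one schema, so callers can treat both the same way without checking for None first.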

Key Changes

  • Modified format_boxes: Now returns empty DataFrame with correct structure instead of None for empty predictions
  • Updated format_geometry:
    • Fixed incorrect docstring (was "list of dictionaries", now correctly "dict")
    • Added type annotations: format_geometry(prediction: dict, scores: bool = True, geom_type: str = None) -> pd.DataFrame
    • Removed None handling since format_boxes no longer returns None
  • Updated format_boxes:
    • Added type annotations: format_boxes(prediction: dict, scores: bool = True) -> pd.DataFrame
    • Explicitly documented that input values are expected to be PyTorch tensors
  • Updated tests: Modified existing tests to reflect new behavior (empty DataFrame instead of None)
  • Code cleanup: Removed commented-out code in test files
  • Maintained backwards compatibility: Existing format_geometry function still works as before, just with improved empty prediction handling

Benefits

  • Eliminates boilerplate code for empty prediction handling
  • Provides consistent DataFrame structure for both empty and non-empty predictions
  • Reduces complexity in prediction processing pipelines
  • Improves code maintainability with type annotations and clearer documentation
  • Handles edge cases automatically without manual intervention
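
One way to picture the pipeline benefit: if every image yields a DataFrame with the same columns, per-image results can be concatenated directly. The sketch below is illustrative only; the `frame` helper and the column schema are assumptions, not part of the PR.

```python
import pandas as pd

# Assumed schema, used only for this illustration.
COLUMNS = {
    "xmin": "float64",
    "ymin": "float64",
    "xmax": "float64",
    "ymax": "float64",
    "label": "int64",
    "score": "float64",
}


def frame(rows: list) -> pd.DataFrame:
    """Hypothetical helper: build a typed result frame from a list of rows."""
    return pd.DataFrame(rows, columns=list(COLUMNS)).astype(COLUMNS)


# Three images; the second one has no detections.
per_image = [
    frame([[0, 0, 10, 10, 0, 0.9]]),
    frame([]),
    frame([[5, 5, 20, 20, 0, 0.7], [1, 2, 3, 4, 0, 0.4]]),
]

# No None filtering is needed before concatenating.
combined = pd.concat(per_image, ignore_index=True)
```

Had the empty case returned None, the list would need filtering before `pd.concat`, which is the kind of boilerplate the PR aims to remove.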

The function supports box geometry predictions and provides appropriate error handling for unsupported geometry types (points and polygons).
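
The geometry-type handling mentioned above could look roughly like the following sketch. The helper names, the dictionary keys used for detection, and the error messages are assumptions for illustration; the actual logic lives in determine_geometry_type and format_geometry. The sketch annotates geom_type as `Optional[str]`, which is the explicit way to spell a parameter that defaults to None.

```python
from typing import Optional

# Assumed mapping from prediction keys to geometry types (illustrative only).
KNOWN_KEYS = {"boxes": "box", "points": "point", "polygons": "polygon"}
SUPPORTED = {"box"}


def detect_geometry_type(prediction: dict) -> str:
    """Hypothetical detector: infer the geometry type from the keys present."""
    for key, geom_type in KNOWN_KEYS.items():
        if key in prediction:
            return geom_type
    raise ValueError("Could not determine geometry type from prediction keys")


def resolve_geometry_type(prediction: dict, geom_type: Optional[str] = None) -> str:
    """Use the explicit geom_type if given, otherwise detect it, then validate."""
    if geom_type is None:
        geom_type = detect_geometry_type(prediction)
    if geom_type not in SUPPORTED:
        raise ValueError(f"Unsupported geometry type: {geom_type!r}")
    return geom_type
```

Raising early for point and polygon inputs keeps the failure close to its cause instead of surfacing later as a malformed DataFrame.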


Copilot AI changed the title from "[WIP] Add empty box parser inside utilities.format_geometry" to "Add format_prediction function to handle empty predictions gracefully" on Aug 18, 2025
Copilot AI requested a review from bw4sz August 18, 2025 14:44
Collaborator

bw4sz left a comment

I believe this is ready, but can't evaluate until we fix #1099

Copilot AI and others added 2 commits August 22, 2025 07:23
…ndle empty predictions

Co-authored-by: bw4sz <1208492+bw4sz@users.noreply.github.com>
bw4sz force-pushed the copilot/fix-1101 branch from d8e31ba to 7042cee on August 22, 2025 14:23
bw4sz marked this pull request as ready for review August 22, 2025 17:22
bw4sz requested a review from jveitchmichaelis August 22, 2025 17:22

Collaborator

bw4sz commented Aug 22, 2025

This one is a mystery right now, passes locally. Let's let it run and I'll need someone else to try and confirm if it still fails here.

Member

ethanwhite commented Sep 4, 2025

> This one is a mystery right now, passes locally. Let's let it run and I'll need someone else to try and confirm if it still fails here.

I can confirm that both of these tests are passing locally on Ubuntu 22.04, Python 3.10.12, torch 2.7.1+cu126, and torchvision 0.22.1+cu126. UPDATE: same behavior when pinning the torch and torchvision versions to those used on GH Actions (2.8.0 and 0.23.0).

The first error is interesting in that it shows that there are a bunch of predictions, that some of them are reasonably close to the far edge (can't tell if the fail is on the x or y dimension) but that it's not far enough toward the edge to meet our threshold. So predict_tile() ran, produced predictions, but missed some trees? Weird. We should probably try to store the resulting predictions as part of an action run to see what it shows.

bw4sz added this to the DeepForest 2.1 milestone Sep 4, 2025


def format_prediction(prediction, scores=True):
    """Format a single prediction dictionary into a pandas dataframe, handling empty predictions gracefully.
Collaborator

This seems unnecessary when format_geometry handles this already - it checks for geom type and returns a dataframe.

Contributor Author

You're correct - the format_prediction function was redundant since format_geometry already handles geometry type detection automatically. Removed in commit 0618685.

def format_geometry(predictions, scores=True, geom_type=None):
    """Format a retinanet prediction into a pandas dataframe for a batch of images
    Args:
        predictions: a list of dictionaries with keys 'boxes' and 'labels' coming from a retinanet
Collaborator

This is wrong? Unrelated to the PR, but a good opportunity to add type annotations. format_geometry doesn't take a list, it takes a single dict.

Contributor Author

Fixed the docstring and added type annotations in commit 0618685. The parameter is now correctly documented as prediction: dict instead of "a list of dictionaries".

empty_df = empty_df.astype(dtype_dict)
return empty_df

df = pd.DataFrame(prediction["boxes"].cpu().detach().numpy(),
Collaborator

jveitchmichaelis commented Nov 18, 2025

As we're touching this function, I would be explicit in the docstrings that the input geometries are expected to be tensors.

Contributor Author

Updated the docstring in commit 0618685 to explicitly state: "Values are expected to be PyTorch tensors (e.g., torch.Tensor)".

patch_overlap=0,
)
assert len(results) > 0
#plot_results(results)
Collaborator

Remove these?

Contributor Author

Removed all commented plot_results calls in commit 0618685.

Collaborator

jveitchmichaelis left a comment

Just looking at outstanding reviews in my queue, but this one is quite old so not sure what the plan is.

Contributor

henrykironde commented

@copilot Could you update this PR and rebase it accordingly?

… add type annotations, remove commented code

Co-authored-by: henrykironde <5192965+henrykironde@users.noreply.github.com>
Contributor Author

Copilot AI commented Jan 14, 2026

> @copilot Could you update this PR and rebase it accordingly?

I've addressed all review comments in commit 0618685. Rebasing proved challenging due to complex merge conflicts across 111 commits. The changes are ready for review, and maintainers can rebase during merge if preferred.

Copilot AI changed the title from "Add format_prediction function to handle empty predictions gracefully" to "Improve empty prediction handling in format_geometry with type annotations" on Jan 14, 2026
Copilot AI requested a review from henrykironde January 14, 2026 03:55
5 participants