-
Notifications
You must be signed in to change notification settings - Fork 237
Improve empty prediction handling in format_geometry with type annotations #1102
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@bw4sz 👋 This repository doesn't have Copilot instructions. With Copilot instructions, I can understand the repository better, work faster and produce higher quality PRs. I can generate a .github/copilot-instructions.md file for you automatically. Click here to open a pre-filled issue and assign it to me. I'll write the instructions, and then tag you for review. |
bw4sz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is ready, but can't evaluate until we fix #1099
…ndle empty predictions Co-authored-by: bw4sz <1208492+bw4sz@users.noreply.github.com>
d8e31ba to
7042cee
Compare
I can confirm that both of these tests are passing locally on Ubuntu 22.04, Python 3.10.12, torch 2.7.1+cu126, and torchvision 0.22.1+cu126. UPDATE: same behavior when pinning the torch and torchvision versions to those use on GH Actions (2.8.0 and 0.23.0). The first error is interesting in that it shows that there are a bunch of predictions, that some of them are reasonably close to the far edge (can't tell if the fail is on the x or y dimension) but that it's not far enough toward the edge to meet our threshold. So |
src/deepforest/utilities.py
Outdated
|
|
||
|
|
||
| def format_prediction(prediction, scores=True): | ||
| """Format a single prediction dictionary into a pandas dataframe, handling empty predictions gracefully. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems unnecessary when format_geometry handles this already - it checks for geom type and returns a dataframe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're correct - the format_prediction function was redundant since format_geometry already handles geometry type detection automatically. Removed in commit 0618685.
src/deepforest/utilities.py
Outdated
| def format_geometry(predictions, scores=True, geom_type=None): | ||
| """Format a retinanet prediction into a pandas dataframe for a batch of images | ||
| Args: | ||
| predictions: a list of dictionaries with keys 'boxes' and 'labels' coming from a retinanet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is wrong? Unrelated to the PR, but a good opportunity to add type annotations. format_geometry doesn't take a list, it takes a single dict.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed the docstring and added type annotations in commit 0618685. The parameter is now correctly documented as prediction: dict instead of "a list of dictionaries".
| empty_df = empty_df.astype(dtype_dict) | ||
| return empty_df | ||
|
|
||
| df = pd.DataFrame(prediction["boxes"].cpu().detach().numpy(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we're touching this function, I would be explicit in the docstrings that the input geometries are expected to be tensors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the docstring in commit 0618685 to explicitly state: "Values are expected to be PyTorch tensors (e.g., torch.Tensor)".
tests/test_multiprocessing.py
Outdated
| patch_overlap=0, | ||
| ) | ||
| assert len(results) > 0 | ||
| #plot_results(results) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed all commented plot_results calls in commit 0618685.
jveitchmichaelis
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just looking at outstanding reviews in my queue, but this one is quite old so not sure what the plan is.
|
@copilot Could you update this PR and rebase it accordingly? |
… add type annotations, remove commented code Co-authored-by: henrykironde <5192965+henrykironde@users.noreply.github.com>

This PR improves the handling of empty predictions in
utilities.pyby modifyingformat_boxesto return empty DataFrames with correct structure instead ofNone, and adds type annotations for better code clarity.Problem
Previously, when processing predictions that contained no detections, developers had to write manual boilerplate code to handle the empty case:
This approach was error-prone, inconsistent, and required developers to understand the internal structure of predictions.
Solution
The improved
format_geometryfunction now provides a unified interface that:determine_geometry_typewhen not explicitly providedKey Changes
format_boxes: Now returns empty DataFrame with correct structure instead ofNonefor empty predictionsformat_geometry:format_geometry(prediction: dict, scores: bool = True, geom_type: str = None) -> pd.DataFrameNonehandling sinceformat_boxesno longer returnsNoneformat_boxes:format_boxes(prediction: dict, scores: bool = True) -> pd.DataFrameNone)format_geometryfunction still works as before, just with improved empty prediction handlingBenefits
The function supports box geometry predictions and provides appropriate error handling for unsupported geometry types (points and polygons).
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.