Introduce generic post-processing framework #954
Conversation
✅ Deploy Preview for antenna-preview canceled.
mihow left a comment:
Thanks for thinking about the framework abstractly, @mohamedelabbas1996! This looks like a good start.
Another aspect to consider: How do we want to show the output of the post-processing functions in the UI and track what was applied to an occurrence in the DB?
Right now we show the classification model that was used, the model type, and the date that the prediction was applied. I think we should add a new field for tracking the post-processing step that was applied as well.
Classification
- model
- date
- filter name / filter class or list of post_processing steps
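To make the last option concrete, here is a rough sketch (not code from this PR; the field name and the JSONField choice are assumptions) of how a list of applied post-processing steps could be stored on the Classification model:

```python
# Hypothetical sketch only: field names and types are assumptions, not the
# schema introduced in this PR.
from django.db import models


class Classification(models.Model):
    # ...existing fields such as the algorithm/model used and the prediction date...

    # Ordered list of post-processing step identifiers applied to this
    # classification, e.g. ["small_size_filter", "rank_rollup"].
    post_processing_steps = models.JSONField(default=list, blank=True)
```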
Alternatively!
We could register each post-processing step as an Algorithm. This may fit into our current structure more naturally and with less effort (a Pipeline is already a series of algorithms applied).
It may work for some filters and not others, but most of them are types of algorithms (rank rollups, tracking, etc.).
In the AMI Data Companion we consider the tracking stage as the last algorithm applied.
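A minimal sketch of the "register each step as an Algorithm" alternative, assuming the `AlgorithmTaskType.POST_PROCESSING` value added in this PR; the `Algorithm` field names used here ("key", "name", "task_type") are assumptions:

```python
# Sketch under assumptions: field names may differ in the real Algorithm model.
from ami.ml.models.algorithm import Algorithm, AlgorithmTaskType

rollup_algorithm, _ = Algorithm.objects.get_or_create(
    key="rank-rollup",
    defaults={
        "name": "Rank Rollup",
        "task_type": AlgorithmTaskType.POST_PROCESSING,
    },
)
# The step could then be recorded on the resulting Classification just like any
# other algorithm in the pipeline.
```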
…en creating a terminal classification with the rolled up taxon
Here are some notes from a previous design discussion: Documentation of the workflow implemented in #915
Other notes
ami/main/admin.py (outdated)

```python
def run_class_masking(self, request: HttpRequest, queryset: QuerySet[SourceImageCollection]) -> None:
    jobs = []

    DEFAULT_TAXA_LIST_ID = 5
```
I think we will need to add the management command from here to use the class masking before we have a UI to trigger it. Otherwise we won't be able to specify the right taxa list.
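For reference, a hypothetical management command along these lines (the command name, options, and entry point are all assumptions, not the command from the linked source):

```python
# ami/main/management/commands/run_class_masking.py (hypothetical path)
from django.core.management.base import BaseCommand


class Command(BaseCommand):
    help = "Run class masking on a source image collection with an explicit taxa list."

    def add_arguments(self, parser):
        parser.add_argument("--collection-id", type=int, required=True)
        parser.add_argument("--taxa-list-id", type=int, required=True)

    def handle(self, *args, **options):
        # Placeholder body: look up the collection and taxa list, then enqueue
        # the class-masking post-processing job. The real hook may differ.
        self.stdout.write(
            f"Masking collection {options['collection_id']} "
            f"using taxa list {options['taxa_list_id']}"
        )
```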
…gress, and algorithm binding
…f.logger and progress updates
…g and progress tracking
ami/ml/post_processing/base.py (outdated)

```python
POSTPROCESSING_TASKS: dict[str, type["BasePostProcessingTask"]] = {}


def register_postprocessing_task(task_cls: type["BasePostProcessingTask"]):
```
Are you showing or using the list of available post-processing tasks anywhere? If it's only in the tests for now, that's okay. I'm just curious whether the registry is working. We can display the options in the UI later.
The list of tasks isn't shown anywhere right now; its only use at the moment is through the function

```python
def get_postprocessing_task(name: str) -> type["BasePostProcessingTask"] | None:
    """
    Get a task class by its registry key.
    Returns None if not found.
    """
    return POSTPROCESSING_TASKS.get(name)
```

which retrieves the task class from the registry when needed.
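To illustrate how the registry could be exercised end to end, a hedged sketch follows. It assumes `register_postprocessing_task` works as a class decorator and registers under a task `name` attribute; the attribute and `run` signature are assumptions, not the actual base-class API:

```python
from ami.ml.post_processing.base import (
    BasePostProcessingTask,
    get_postprocessing_task,
    register_postprocessing_task,
)


@register_postprocessing_task
class NoOpTask(BasePostProcessingTask):
    """Hypothetical task used only to demonstrate registration."""

    name = "noop"  # assumed registry key attribute

    def run(self):  # assumed abstract method; the real signature may differ
        pass


# Under these assumptions the lookup round-trips back to the registered class.
task_cls = get_postprocessing_task("noop")
assert task_cls is NoOpTask
```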
Pull Request Overview
This PR introduces a comprehensive post-processing framework for Antenna, providing a standardized way to implement and execute data cleanup and refinement tasks after the main classification pipeline. The framework includes two initial post-processing tasks: Small Size Filter for removing low-information detections and Rank Rollup for improving classification confidence by rolling up uncertain predictions to higher taxonomic ranks.
Key Changes
- Implemented a base post-processing framework with task registration and execution capabilities
- Added two concrete post-processing tasks: Small Size Filter and Rank Rollup
- Integrated post-processing jobs into the existing job infrastructure with admin interface support
Reviewed Changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| ami/ml/post_processing/base.py | Core framework defining BasePostProcessingTask abstract class and task registration system |
| ami/ml/post_processing/small_size_filter.py | Task implementation for filtering out small detections |
| ami/ml/post_processing/rank_rollup.py | Task implementation for rolling up uncertain classifications to higher taxonomic ranks |
| ami/jobs/models.py | Added PostProcessingJob type to job execution framework |
| ami/main/admin.py | Added admin actions to trigger post-processing tasks from SourceImageCollection admin |
| ami/main/models.py | Added applied_to field to Classification model for tracking post-processing relationships |
| ami/ml/models/algorithm.py | Added POST_PROCESSING task type to AlgorithmTaskType enum |
```python
new_score = None
for rank in rollup_order:
    threshold = thresholds.get(rank, 1.0)
    candidates = {t: s for t, s in taxon_scores.items() if t.rank == rank}
```
Copilot AI commented on Oct 15, 2025:
The comparison t.rank == rank assumes t.rank is a string, but it's likely a TaxonRank enum. This should use t.rank.value == rank or compare against the enum value directly.
Suggested change:

```diff
- candidates = {t: s for t, s in taxon_scores.items() if t.rank == rank}
+ candidates = {t: s for t, s in taxon_scores.items() if str(t.rank).upper() == rank.upper()}
```
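One hedged way to make the comparison robust whether `t.rank` is a plain string or an enum member (the helper name is illustrative, not part of the PR):

```python
# Illustrative helper: normalize either an enum member or a string to an
# uppercase string before matching ranks.
def same_rank(taxon_rank, rank: str) -> bool:
    value = getattr(taxon_rank, "value", taxon_rank)  # enum -> its value, str -> itself
    return str(value).upper() == str(rank).upper()


# e.g. candidates = {t: s for t, s in taxon_scores.items() if same_rank(t.rank, rank)}
```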
ami/ml/post_processing/base.py (outdated)

```python
def get_postprocessing_task(name: str) -> type["BasePostProcessingTask"] | None:
    """
```
If the registry isn't working the way you intended, you could change it to import the post-processing tasks using their full path, e.g. (rough pseudocode):

```python
# Rough pseudocode sketch
def get_postprocessing_task(key="size_filter", full_path="yuyanslib.processing.biometrics.Filter"):
    cls = __import__("ami.ml.post_processing.size_filter.SizeFilter")
    # raise "No post-processing task registered with the name 'size_filter' could be loaded"
```
```python
    )
    # job = models.CharField(max_length=255, null=True)

    applied_to = models.ForeignKey(
```
Looks good, thank you
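For readers skimming the thread: the truncated hunk above adds an `applied_to` relation on `Classification`. A hypothetical shape is sketched below; the target model, on_delete behavior, and related_name are guesses for illustration only, not the PR's actual arguments:

```python
# Guessed arguments for illustration only; see the PR diff for the real definition.
applied_to = models.ForeignKey(
    "self",
    on_delete=models.SET_NULL,
    null=True,
    blank=True,
    related_name="derived_classifications",
    help_text="The classification this post-processed classification was derived from.",
)
```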
```python
occurrence = det.occurrence
self.assertIsNotNone(occurrence, f"Detection {det.pk} should belong to an occurrence.")
occurrence.refresh_from_db()
self.assertEqual(
```
Nice, thank you for this test
@mohamedelabbas1996 I think this is the simplest option for the registry that will work well for the current scope. This is what I am doing in the API for the AMI Data Companion: https://github.com/RolnickLab/ami-data-companion/blob/bf0fe16a533a0cc3b94cec7d5da65564c06d99c5/trapdata/api/api.py#L42-L61

```python
from .small_size_filter import SmallSizeFilterTask
# Add more imports as you add tasks

POSTPROCESSING_TASKS = {
    "small_size_filter": SmallSizeFilterTask,
    # "another_task": AnotherTask,
}


def get_postprocessing_task(key: str):
    return POSTPROCESSING_TASKS.get(key)
```

If we want developers to start adding more tasks outside of the post_processing module, we could try the approach that uses the full module path. But honestly the first method is probably the best for now.

```python
import importlib
import logging


def get_postprocessing_task(class_path: str) -> type["BasePostProcessingTask"] | None:
    """
    Get a task class by its full Python path.

    Example:
        task_cls = get_postprocessing_task("ami.ml.post_processing.small_size_filter.SmallSizeFilterTask")

    Returns None if not found or invalid.
    """
    try:
        module_path, class_name = class_path.rsplit(".", 1)
        module = importlib.import_module(module_path)
        task_cls = getattr(module, class_name)
        # Validate it's a post-processing task
        if not issubclass(task_cls, BasePostProcessingTask):
            logging.error(f"{class_path} is not a subclass of BasePostProcessingTask")
            return None
        return task_cls
    except (ValueError, ImportError, AttributeError) as e:
        logging.error(f"Failed to load post-processing task '{class_path}': {e}")
        return None
```

Usage:

```python
job.params = {
    "task": "ami.ml.post_processing.small_size_filter.SmallSizeFilterTask",
    "config": {
        "size_threshold": 0.01,
        "source_image_collection_id": 123,
    },
}
```


Summary
This PR introduces a reusable and unified framework for post-processing in Antenna, providing a consistent pattern to implement and manage post-processing tasks.
List of Changes

Introduced Post-Processing Framework
- Added `BasePostProcessingTask` to define a common structure for all post-processing tasks.

Added two basic post-processing tasks
- Small Size Filter: marks detections with a small relative bounding-box area (compared to the full image) as Not Identifiable, making it easy for noisy or low-information detections to be filtered out (a rough sketch of the check follows this list).
- Rank Rollup: rolls up uncertain predictions to higher taxonomic ranks to improve classification confidence.

New Job Type: `PostProcessingJob`
- Added a `JobType` that executes post-processing tasks.

Trigger tasks from admin page
- Added admin actions to run post-processing tasks on a `SourceImageCollection` from the admin interface.
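A minimal sketch of the relative-area check the Small Size Filter performs; the function and parameter names are illustrative, and the 0.01 default mirrors the `size_threshold` example earlier in the thread rather than a value confirmed in the code:

```python
# Illustrative only: the real task operates on Detection records and marks
# matches as "Not Identifiable" rather than returning a boolean.
def is_too_small(bbox_width: float, bbox_height: float,
                 image_width: float, image_height: float,
                 size_threshold: float = 0.01) -> bool:
    """True when the bounding box covers too small a fraction of the source image."""
    image_area = image_width * image_height
    if image_area <= 0:
        return False
    relative_area = (bbox_width * bbox_height) / image_area
    return relative_area < size_threshold
```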
Related Issues
#957
Detailed Description
This PR lays the foundation for Antenna’s post-processing framework.
It provides a modular, extensible framework for running data cleanup and refinement tasks after the main classification pipeline in order to improve its results.
The post-processing framework allows new tasks to be defined by subclassing `BasePostProcessingTask`.
Initial tasks include the Small Size Filter and Rank Rollup; a hedged sketch of the rollup idea follows below.
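The sketch below illustrates the rollup idea only: rank order, thresholds, and data structures are assumptions chosen for readability, while the real task works on Taxon objects and Classification scores.

```python
from collections import defaultdict

ROLLUP_ORDER = ["species", "genus", "family"]                 # assumed rank order
THRESHOLDS = {"species": 0.8, "genus": 0.6, "family": 0.4}    # assumed thresholds


def roll_up(scores: dict[str, float], parent_of: dict[str, dict[str, str]]):
    """Return (taxon, rank, score) for the first rank whose best score clears its threshold.

    `scores` maps taxon names at the lowest rank to prediction scores;
    `parent_of[rank]` maps a taxon at `rank` to its parent at the next rank up.
    """
    for rank in ROLLUP_ORDER:
        if scores:
            best_taxon, best_score = max(scores.items(), key=lambda kv: kv[1])
            if best_score >= THRESHOLDS.get(rank, 1.0):
                return best_taxon, rank, best_score
        # Not confident enough at this rank: sum scores into the parent taxa
        # and try again one rank higher.
        rolled: dict[str, float] = defaultdict(float)
        for taxon, score in scores.items():
            parent = parent_of.get(rank, {}).get(taxon)
            if parent:
                rolled[parent] += score
        scores = dict(rolled)
    return None  # nothing confident enough even at the highest configured rank
```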
How to Test the Changes
Screenshots
Deployment Notes
Includes several migrations
Checklist