Conversation

@mohamedelabbas1996 (Contributor) commented Sep 18, 2025

Summary

This PR introduces a reusable and unified framework for post-processing in Antenna, providing a consistent pattern to implement and manage post-processing tasks.

List of Changes

  • Introduced Post-Processing Framework

    • Added a new base class BasePostProcessingTask to define a common structure for all post-processing tasks.
  • Added Two Basic Post-Processing Tasks

    • Small Size Filter
      Marks detections whose bounding-box area is small relative to the full image as Not Identifiable, making it easy to filter out noisy or low-information detections.
    • Rank Rollup
      Rolls up uncertain predictions to a higher taxonomic rank when the top prediction's score falls below a per-rank confidence threshold.
  • New Job Type: PostProcessingJob

    • Introduced a new JobType that executes post-processing tasks.
  • Trigger Tasks from the Admin Page

    • Post-processing tasks such as the Small Size Filter can now be triggered directly from the SourceImageCollection admin page.

Related Issues

#957

Detailed Description

This PR lays the foundation for Antenna’s post-processing framework.
It provides a modular, extensible framework for running data cleanup and refinement tasks after the main classification pipeline, improving the quality of pipeline results.

The post-processing framework makes it possible to:

  • Implement new post-processing tasks by simply subclassing BasePostProcessingTask.
  • Execute those tasks as jobs through the existing job infrastructure.
  • Access logging, progress tracking, and error handling.
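The subclass-plus-registry pattern can be sketched in a self-contained form like this. The names mirror the PR (BasePostProcessingTask, POSTPROCESSING_TASKS, register_postprocessing_task), but the attribute and method names (key, run) are assumptions for illustration, not the actual Antenna API:

```python
# Minimal sketch of the pattern this PR introduces: a base class plus a
# registry decorator. The `key` attribute and `run()` signature are assumed.
POSTPROCESSING_TASKS: dict[str, type] = {}


def register_postprocessing_task(task_cls):
    """Register a task class under its `key` attribute (assumed name)."""
    POSTPROCESSING_TASKS[task_cls.key] = task_cls
    return task_cls


class BasePostProcessingTask:
    """Stand-in for the PR's abstract base class."""
    key = "base"

    def run(self):
        raise NotImplementedError


@register_postprocessing_task
class SmallSizeFilterTask(BasePostProcessingTask):
    key = "small_size_filter"

    def run(self):
        # A real task would query detections and relabel the small ones;
        # here we just return a marker string for illustration.
        return "ran small_size_filter"


# Jobs can later look the task up by its registry key:
task_cls = POSTPROCESSING_TASKS.get("small_size_filter")
```

Registering via a decorator keeps task discovery in one dictionary, which is what lets a job resolve a task class from a string stored in its parameters.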

Initial tasks include:

  • Small Size Filter: Flags small detections as non-identifiable.
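The core check behind the Small Size Filter can be illustrated as follows; the threshold value matches the `size_threshold: 0.01` example later in this thread, but the function name and bounding-box format are assumptions:

```python
# Illustrative sketch of the Small Size Filter logic: compute a detection's
# bounding-box area relative to the full image and flag it when below a
# threshold. The default threshold and names are assumptions.
SIZE_THRESHOLD = 0.01  # fraction of total image area


def is_too_small(bbox, image_width, image_height, threshold=SIZE_THRESHOLD):
    """bbox = (x1, y1, x2, y2) in pixels; returns True if the detection
    should be marked Not Identifiable."""
    x1, y1, x2, y2 = bbox
    bbox_area = max(0, x2 - x1) * max(0, y2 - y1)
    relative_area = bbox_area / (image_width * image_height)
    return relative_area < threshold


# A 20x20 box in a 1000x1000 image covers only 0.04% of the frame:
print(is_too_small((0, 0, 20, 20), 1000, 1000))  # → True
```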

How to Test the Changes

  1. Open the Django admin interface.
  2. Go to Source Image Collections.
  3. Select one or more collections.
  4. Run the Small Size Filter admin action and verify that detections below the size threshold are relabeled as Not Identifiable.
  5. Observe logs and job progress under Jobs to confirm successful execution and completion.


Deployment Notes

Includes several migrations.

Checklist

  • I have tested these changes appropriately.
  • I have added and/or modified relevant tests.
  • I updated relevant documentation or comments.
  • I have verified that this PR follows the project's coding standards.
  • Any dependent changes have already been merged to main.

@netlify netlify bot commented Sep 18, 2025

Deploy Preview for antenna-preview canceled.

🔨 Latest commit: 102d0b5
🔍 Latest deploy log: https://app.netlify.com/projects/antenna-preview/deploys/68f089917766690008671dc1

@mihow (Collaborator) left a comment

Thanks for thinking about the framework abstractly, @mohamedelabbas1996! This looks like a good start.

Another aspect to consider: how do we want to show the output of the post-processing functions in the UI and track what was applied to an occurrence in the DB?

Right now we show the classification model that was used, the model type, and the date that the prediction was applied. I think we should add a new field for tracking the post-processing step that was applied as well.

Classification

  • model
  • date
  • filter name / filter class or list of post_processing steps

Alternatively!

We could register each post processing step as an Algorithm. This may fit into our current structure more naturally with less effort (a Pipeline is already a series of algorithms applied).


It may work for some filters and not others, but most of them are types of algorithms (rank rollups, tracking, etc.).

In the AMI Data Companion we consider the tracking stage as the last algorithm applied.

@mohamedelabbas1996 mohamedelabbas1996 self-assigned this Sep 19, 2025
@mohamedelabbas1996 mohamedelabbas1996 linked an issue Sep 23, 2025 that may be closed by this pull request
@mihow mihow mentioned this pull request Oct 13, 2025
4 tasks
@mihow (Collaborator) commented Oct 13, 2025

Here are some notes from a previous design discussion

Documentation of the workflow implemented in #915

  • A TaxaList is chosen to use as the categories we want to see results for (e.g. moths of Oregon)
  • The user selects a set of images, and the results that they want to apply the filter to (e.g. classifications from the global moth model).
  • The mask is applied to the logits of the classification predictions for all detections in those images. The logits cannot be set to zero; they must either be removed or set to a very low number (which is what we do in the example). Then the softmax scores are recalculated so we can see the top-1 prediction for each detection.
  • The previous predictions are updated so that terminal=False (the classification we are masking from the global model)
  • The occurrence determinations are recalculated
  • You can run this from a management command, or run it on a single occurrence from an action in the Django admin, which is the best way to debug it since you are working with one occurrence at a time.
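The masking-and-rescoring step above can be sketched in plain Python. The values and function name are illustrative only; the real workflow operates on the stored logits of each detection's classification:

```python
import math

# Sketch of class masking: replace logits of disallowed classes with a very
# low number (not zero), then recompute softmax and take the top-1 prediction.
VERY_LOW = -1e9  # effectively removes a class without deleting its column


def mask_and_rescore(logits, allowed):
    """logits: list of floats, one per class; allowed: parallel list of bools
    (True for classes in the chosen TaxaList). Returns softmax scores."""
    masked = [logit if ok else VERY_LOW for logit, ok in zip(logits, allowed)]
    m = max(masked)  # numerically stable softmax
    exps = [math.exp(x - m) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]


# The originally top-scoring class (index 0) is outside the taxa list,
# so after masking the top-1 shifts to the best allowed class:
scores = mask_and_rescore([2.0, 1.0, 0.5], [False, True, True])
top1 = scores.index(max(scores))  # → 1
```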

Other notes

  • I think we should probably create an AlgorithmCategoryMap & an Algorithm based on the existing model and the TaxaList filter. Then you can see the history of what's been applied in the prediction history. For example, rather than showing 2 predictions from the Global Moth Classifier with different results, we can create a new algorithm dynamically and call it "Oregon taxa from Global moths". Then we can also skip the step of masking the AlgorithmCategoryMap each time the process is run.
  • In the end, we will have a Pipeline that is pre-selectable for "Oregon moths". Which will show the Detector, Binary classifier, Global classifier, Oregon class mask.
  • Another option is to send the taxalist to the AMI data companion and let the filtering happen there.

def run_class_masking(self, request: HttpRequest, queryset: QuerySet[SourceImageCollection]) -> None:
    jobs = []

DEFAULT_TAXA_LIST_ID = 5
@mihow (Collaborator) commented Oct 14, 2025

I think we will need to add the management command from here to use the class masking before we have a UI to trigger it. Otherwise we won't be able to specify the right taxa list.

https://github.com/RolnickLab/antenna/pull/915/files#diff-c50e8d1a96421d4b5d8dbe5634e99a71bf7cf1fc820349c88875f260630e6af6

https://github.com/RolnickLab/antenna/blob/19b0cecfacee2d3e62ae89f56b4e81990f3cdfff/ami/ml/management/commands/test_class_masking.py

POSTPROCESSING_TASKS: dict[str, type["BasePostProcessingTask"]] = {}


def register_postprocessing_task(task_cls: type["BasePostProcessingTask"]):
A collaborator commented:
Are you showing or using the list of available post processing tasks anywhere? If it's only in the tests for now that's okay. I'm just curious if the registry is working. We can display the options in the UI later.

@mohamedelabbas1996 (Contributor, Author) replied Oct 14, 2025

The list of tasks isn't shown anywhere right now; its only use at the moment is through the function


def get_postprocessing_task(name: str) -> type["BasePostProcessingTask"] | None:
    """
    Get a task class by its registry key.
    Returns None if not found.
    """
    return POSTPROCESSING_TASKS.get(name)

which retrieves the task class from the registry when needed.

@mihow mihow requested a review from Copilot October 15, 2025 03:21
Copilot AI left a comment

Pull Request Overview

This PR introduces a comprehensive post-processing framework for Antenna, providing a standardized way to implement and execute data cleanup and refinement tasks after the main classification pipeline. The framework includes two initial post-processing tasks: Small Size Filter for removing low-information detections and Rank Rollup for improving classification confidence by rolling up uncertain predictions to higher taxonomic ranks.

Key Changes

  • Implemented a base post-processing framework with task registration and execution capabilities
  • Added two concrete post-processing tasks: Small Size Filter and Rank Rollup
  • Integrated post-processing jobs into the existing job infrastructure with admin interface support

Reviewed Changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 5 comments.

Summary per file:

  • ami/ml/post_processing/base.py: Core framework defining the BasePostProcessingTask abstract class and task registration system
  • ami/ml/post_processing/small_size_filter.py: Task implementation for filtering out small detections
  • ami/ml/post_processing/rank_rollup.py: Task implementation for rolling up uncertain classifications to higher taxonomic ranks
  • ami/jobs/models.py: Added PostProcessingJob type to the job execution framework
  • ami/main/admin.py: Added admin actions to trigger post-processing tasks from the SourceImageCollection admin
  • ami/main/models.py: Added applied_to field to the Classification model for tracking post-processing relationships
  • ami/ml/models/algorithm.py: Added POST_PROCESSING task type to the AlgorithmTaskType enum


new_score = None
for rank in rollup_order:
    threshold = thresholds.get(rank, 1.0)
    candidates = {t: s for t, s in taxon_scores.items() if t.rank == rank}
Copilot AI commented Oct 15, 2025

The comparison t.rank == rank assumes t.rank is a string, but it's likely a TaxonRank enum. This should use t.rank.value == rank or compare against the enum value directly.

Suggested change:
-    candidates = {t: s for t, s in taxon_scores.items() if t.rank == rank}
+    candidates = {t: s for t, s in taxon_scores.items() if str(t.rank).upper() == rank.upper()}

@mihow (Collaborator) commented Oct 15, 2025

This is getting super close! All of the Copilot comments look valid, but note that they are about the specific filters rather than the overall framework.

Also I am noticing the determination is not updating automatically. After I run a filter, the occurrence still shows the old determination rather than the updated one (before/after screenshots omitted).

After the new classifications are created in batch, you have to loop through every occurrence that was modified and run update_determination(). There is no batch method for that.
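A minimal, self-contained sketch of that fix; the Occurrence class here is a stub standing in for the real Django model, and the helper name is hypothetical:

```python
# After batch-creating classifications, refresh each modified occurrence
# individually, since no batch equivalent of update_determination() exists.
class Occurrence:
    """Stub stand-in for the real Django model."""

    def __init__(self, pk):
        self.pk = pk
        self.determination_updated = False

    def update_determination(self):
        # The real model recalculates the determination from its classifications.
        self.determination_updated = True


def refresh_determinations(occurrences):
    """Loop over modified occurrences one at a time (no batch method exists)."""
    updated = 0
    for occ in occurrences:
        occ.update_determination()
        updated += 1
    return updated


occurrences = [Occurrence(1), Occurrence(2)]
count = refresh_determinations(occurrences)  # → 2
```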



def get_postprocessing_task(name: str) -> type["BasePostProcessingTask"] | None:
"""
@mihow (Collaborator) commented Oct 15, 2025

If the registry isn't working how you intended, you could change it to import the post processing tasks using their full path.

def get_postprocessing_task(key='size_filter', full_path='yuyanslib.processing.biometrics.Filter'):
    module_path, class_name = full_path.rsplit('.', 1)
    cls = getattr(importlib.import_module(module_path), class_name)
    # raise "No post processing task registered with the name 'size_filter' could be loaded"

)
# job = models.CharField(max_length=255, null=True)

applied_to = models.ForeignKey(
A collaborator commented:
Looks good, thank you

occurrence = det.occurrence
self.assertIsNotNone(occurrence, f"Detection {det.pk} should belong to an occurrence.")
occurrence.refresh_from_db()
self.assertEqual(
A collaborator commented:
Nice, thank you for this test

@mihow (Collaborator) commented Oct 15, 2025

@mohamedelabbas1996 I think this is the simplest option for the registry that will work well for the current scope. This is what I am doing in the API for the AMI data companion https://github.com/RolnickLab/ami-data-companion/blob/bf0fe16a533a0cc3b94cec7d5da65564c06d99c5/trapdata/api/api.py#L42-L61

from .small_size_filter import SmallSizeFilterTask
# Add more imports as you add tasks

POSTPROCESSING_TASKS = {
    "small_size_filter": SmallSizeFilterTask,
    # "another_task": AnotherTask,
}

def get_postprocessing_task(key: str):
    return POSTPROCESSING_TASKS.get(key)

If we want developers to start adding more tasks outside of the post_processing module, we could try the approach that uses the full module path. But honestly the first method is probably the best for now.

def get_postprocessing_task(class_path: str) -> type["BasePostProcessingTask"] | None:
    """
    Get a task class by its full Python path.
    
    Example:
        task_cls = get_postprocessing_task("ami.ml.post_processing.small_size_filter.SmallSizeFilterTask")
    
    Returns None if not found or invalid.
    """
    try:
        module_path, class_name = class_path.rsplit(".", 1)
        module = importlib.import_module(module_path)
        task_cls = getattr(module, class_name)
        
        # Validate it's a post-processing task
        if not issubclass(task_cls, BasePostProcessingTask):
            logging.error(f"{class_path} is not a subclass of BasePostProcessingTask")
            return None
            
        return task_cls
    except (ValueError, ImportError, AttributeError) as e:
        logging.error(f"Failed to load post-processing task '{class_path}': {e}")
        return None

Usage:

job.params = {
    "task": "ami.ml.post_processing.small_size_filter.SmallSizeFilterTask",
    "config": {
        "size_threshold": 0.01,
        "source_image_collection_id": 123
    }
}
@mihow mihow changed the title [Draft] Introduce generic post-processing framework Introduce generic post-processing framework Oct 16, 2025
@mihow mihow marked this pull request as ready for review October 16, 2025 01:05
@mihow mihow merged commit b387478 into main Oct 16, 2025
6 checks passed
@mihow mihow deleted the feat/postprocessing-framework branch October 16, 2025 06:07

Development

Successfully merging this pull request may close these issues:

  • Implement a reusable post-processing framework

3 participants