Dev mainloop integration 1 #949

Open
kaya-david wants to merge 26 commits into poc-mainloop from dev-mainloop-integration-1
Conversation

Collaborator

@kaya-david kaya-david commented Mar 26, 2026


The rendered docs for this PR can be found here.

@kaya-david kaya-david force-pushed the dev-mainloop-integration-1 branch from 9c6b0c4 to 8c36e95 Compare March 26, 2026 06:38
@kaya-david kaya-david marked this pull request as draft March 26, 2026 07:31
@kaya-david kaya-david self-assigned this Mar 26, 2026
@kaya-david kaya-david marked this pull request as ready for review March 30, 2026 08:08
@kaya-david kaya-david requested a review from mhoff March 30, 2026 08:08
Collaborator

@mhoff mhoff left a comment

Hey @kaya-david, I did a first review pass. Happy to discuss.

@kaya-david kaya-david force-pushed the dev-mainloop-integration-1 branch from 9db4ab9 to 04234fd Compare March 31, 2026 09:12
Collaborator

@mhoff mhoff left a comment


I just have one comment for now. I have not analyzed the Kafka or OpenSearch output any further, as @Pablu23 is already working on those.

@kaya-david kaya-david requested a review from Pablu23 April 2, 2026 04:18
@kaya-david kaya-david force-pushed the dev-mainloop-integration-1 branch from af8deb4 to 5dbb963 Compare April 2, 2026 04:32
@kaya-david kaya-david requested a review from mhoff April 2, 2026 04:57
Collaborator

@mhoff mhoff left a comment


Hello @kaya-david, I have added some comments

Comment on lines +359 to +363
    async def shut_down(self):
        """Raises Uvicorn HTTP Server internal stop flag and waits to join"""
        if self.http_server:
            self.http_server.shut_down()
-       return super()._shut_down()
+       await super().shut_down()
Collaborator

Same here

Collaborator Author


As explained above, we no longer introduce separate "clean-up" paths, but instead rely on a single, well-defined tear-down path that implies the instance must not be used afterwards.

 tasks_but_current = [t for t in self._worker_tasks if t is not current_task]

-logger.debug("waiting for termination of %d tasks", len(tasks_but_current))
+logger.debug(f"waiting for termination of {len(tasks_but_current)} tasks")
Collaborator


I am a bit puzzled by this change. I thought using %d would be the proper way to avoid string interpolation if the log level is not activated.

Collaborator Author


Good catch, thanks! I’ve reverted this change.
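The point behind %-style logging can be made concrete with a small, self-contained sketch (the `Expensive` class below is illustrative only): with `%s` plus arguments, the interpolation is deferred until a handler actually emits the record, while an f-string is formatted eagerly regardless of the log level.

```python
import logging

calls = []

class Expensive:
    """Object whose string conversion records that it was formatted."""
    def __str__(self):
        calls.append("formatted")
        return "expensive"

logger = logging.getLogger("demo")
logger.addHandler(logging.NullHandler())
logger.setLevel(logging.INFO)  # DEBUG records are discarded

obj = Expensive()

# %-style: the argument is stored on the record and only formatted
# when a handler actually emits the message -- here, never.
logger.debug("value: %s", obj)
assert calls == []

# f-string: interpolation happens eagerly, before the level check.
logger.debug(f"value: {obj}")
assert calls == ["formatted"]
```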

Collaborator

@Pablu23 Pablu23 left a comment


Thanks for your work! I left a few comments on things I noticed regarding the inputs and outputs.

consumer = await self.get_consumer()

if consumer is not None:
    await consumer.unsubscribe()
Collaborator


I believe unsubscribe is unnecessary here, but it's probably not doing any harm either.

Collaborator Author


Yes, you are right; unsubscribe is only needed for dynamic topic switching during runtime.

In our case, shut_down is not designed for that. It follows RAII-like semantics: calling it implies a full teardown of the instance and all associated resources. After that, a fresh instance is expected to be created via setup (as the counterpart to shut_down).

Continuing to operate on the same instance after a partial cleanup (e.g. via unsubscribe) is explicitly not part of the intended lifecycle.
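The lifecycle described above can be sketched in a few lines. This is a hypothetical illustration, not logprep code: `setup` acquires all resources, `shut_down` is a full teardown, and any use of the instance afterwards is an error by design.

```python
import asyncio

class Connector:
    """Sketch of the setup/shut_down lifecycle: setup() acquires all
    resources, shut_down() releases them completely. After shut_down(),
    the instance must not be reused -- create a fresh one instead."""

    def __init__(self):
        self._consumer = None
        self._closed = False

    async def setup(self):
        self._consumer = object()  # stand-in for a real consumer/client

    async def shut_down(self):
        # Full teardown: release everything, then mark the instance dead.
        self._consumer = None
        self._closed = True

    async def consume(self):
        if self._closed or self._consumer is None:
            raise RuntimeError("instance was shut down; create a new one")
        return "event"

async def main():
    connector = Connector()
    await connector.setup()
    assert await connector.consume() == "event"
    await connector.shut_down()
    try:
        await connector.consume()
    except RuntimeError:
        return "rejected after shut_down"

result = asyncio.run(main())
print(result)
```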

Comment on lines +327 to +343

except BufferError:
    # block program until buffer is empty or timeout is reached
    self._producer.flush(timeout=self.config.flush_timeout)
    logger.debug("Buffer full, flushing")

    try:
        self._producer.produce(
            topic=target,
            value=self._encoder.encode(document),
            on_delivery=partial(self.on_delivery, event),
        )
    except BufferError as err:
        event.state.current_state = EventStateType.FAILED
        event.errors.append(err)
        logger.error("Message delivery failed after retry: %s", err)
        self.metrics.number_of_errors += 1
        return
Collaborator


This is new logic: now we don't try to flush every time, we only flush when we get a BufferError. That is fine if we only want to flush on a full buffer, but I don't think we do. I also don't like nesting try/except like this, though I don't have a better solution for now. Maybe recursively call this same function and increment an optional depth argument, and once it reaches the configured retry count we can error out again.
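The bounded-retry idea suggested above could look roughly like this. Everything here is an illustrative sketch, not logprep code: `FakeProducer` simulates a full queue that clears after two flushes, and `produce_with_retry` carries the depth argument mentioned in the comment.

```python
# Sketch of the suggested bounded retry: retry the produce call after
# flushing, and give up once max depth is reached.

MAX_RETRIES = 3

class FakeProducer:
    """Raises BufferError until flush() has been called twice."""
    def __init__(self):
        self.flushes = 0

    def produce(self, value):
        if self.flushes < 2:
            raise BufferError("queue full")
        return "queued"

    def flush(self):
        self.flushes += 1

def produce_with_retry(producer, value, depth=0):
    try:
        return producer.produce(value)
    except BufferError:
        if depth >= MAX_RETRIES:
            raise  # give up: surface the error to the caller
        producer.flush()  # make room, then retry with incremented depth
        return produce_with_retry(producer, value, depth + 1)

result = produce_with_retry(FakeProducer(), b"payload")
print(result)  # -> queued
```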

Collaborator Author


This applied to the old sync producer where we had to manually coordinate produce/poll/flush. I’ve now migrated the producer to the async AIOProducer, where delivery is handled via awaitable futures and internal batching, so this control flow (and related concerns) no longer applies.

Could you please cherry-pick the relevant parts into your output ticket and take another look there? If needed, feel free to implement your own async variant of store_custom / producer handling.

For now I’d keep the current implementation as is and suggest we clean this up together in a separate PR to properly align on the async approach.
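The "delivery handled via awaitable futures" pattern mentioned above can be shown generically. This is a hedged sketch of the callback-to-future bridge, not confluent-kafka's actual AIOProducer; `CallbackProducer` and `produce_awaitable` are illustrative names.

```python
import asyncio

class CallbackProducer:
    """Stand-in for a sync, callback-based producer (illustrative only)."""
    def produce(self, value, on_delivery):
        on_delivery(None, f"delivered:{value}")  # (err, msg)

async def produce_awaitable(producer, value):
    # Bridge the delivery callback into an asyncio Future so callers
    # can simply `await` the delivery result instead of wiring callbacks.
    loop = asyncio.get_running_loop()
    fut = loop.create_future()

    def on_delivery(err, msg):
        # call_soon_threadsafe makes this safe even if the real
        # producer invokes the callback from a background thread.
        if err is not None:
            loop.call_soon_threadsafe(fut.set_exception, Exception(err))
        else:
            loop.call_soon_threadsafe(fut.set_result, msg)

    producer.produce(value, on_delivery=on_delivery)
    return await fut

result = asyncio.run(produce_awaitable(CallbackProducer(), "e1"))
print(result)  # -> delivered:e1
```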

logger.error("Message delivery failed: %s", err)
self.metrics.number_of_errors += 1
return

Collaborator


This part is also odd: why do we have to handle a Kafka exception twice, first in the try/except of the store_custom function and again here? This callback is run from the context of the store_custom function.

Collaborator Author


see comment above

Comment on lines +413 to +414
if "_producer" in self.__dict__:
await self.flush()
Collaborator


Why do we do this? Shouldn't we just always flush? I mean, shouldn't flush be agnostic to whether there is a producer or not? I also don't like this if; isn't there another way to check whether we have a producer?

Collaborator Author


_producer is a cached property and is only initialized on first access. A shut_down could technically occur before it was ever used (i.e. before the producer exists), which would cause a crash during flush. This check is therefore a precaution.
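The subtlety is that merely reading a `cached_property` creates it as a side effect, so an attribute access during shutdown would initialize the producer just to flush it. Checking `__dict__` only inspects the cache. A minimal sketch (the class and method names are illustrative, not logprep code):

```python
import asyncio
from functools import cached_property

class Output:
    @cached_property
    def _producer(self):
        # Creating the real producer is expensive (and could fail during
        # shutdown), so it only happens on first attribute access.
        return object()

    async def flush_if_initialized(self):
        # Reading self._producer here would *create* the producer as a
        # side effect; checking __dict__ only looks at the cache.
        if "_producer" in self.__dict__:
            return "flushed"
        return "nothing to flush"

out = Output()
first = asyncio.run(out.flush_if_initialized())
print(first)  # -> nothing to flush (producer was never created)

out._producer  # first access populates the cache
second = asyncio.run(out.flush_if_initialized())
print(second)  # -> flushed
```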

Comment on lines +371 to +373
search_context = self.__dict__.get("_search_context")
if search_context is not None:
    await search_context.close()
Collaborator


Suggested change
-search_context = self.__dict__.get("_search_context")
-if search_context is not None:
-    await search_context.close()
+await self._search_context.close()

Collaborator Author

@kaya-david kaya-david Apr 7, 2026


Same as above, I added a guard here as well.

PS: I removed the override decorator - as long as mypy does not complain (which is currently the case), overrides don’t add much value. Also, since we have many overloads and don’t consistently use override elsewhere, I prefer to omit it here for consistency.

…cle across components

- unify component lifecycle by introducing async setup/shut_down across NG components
- remove legacy _shut_down pattern and simplify base Component shutdown logic
- align Connector/Input/Output/Processor lifecycle interfaces
- fix kafka output delivery semantics by setting DELIVERED only via on_delivery callback
- improve kafka error handling (BufferError retry, KafkaException -> FAILED)
- ensure proper resource cleanup (consumer unsubscribe/close, producer flush, opensearch context close)
- improve worker shutdown by cancelling only unfinished tasks

# Conflicts:
#	logprep/ng/connector/opensearch/output.py
- remove docker compose teardown from SIGINT handler to avoid interfering with active OpenSearch requests
- introduce coordinated shutdown via _shutdown_requested flag
- add shutdown checkpoints to abort benchmark flow safely
- ensure compose teardown happens only in controlled finally blocks
- fix intermittent 503 errors during OpenSearch _count caused by concurrent shutdown
@kaya-david kaya-david force-pushed the dev-mainloop-integration-1 branch from b037231 to 99cd7ec Compare April 7, 2026 04:43
… (unsubscribe only needed for dynamic topic switching during runtime)
@kaya-david kaya-david force-pushed the dev-mainloop-integration-1 branch from 5e28118 to 6d8bb81 Compare April 7, 2026 09:16