Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces the capability to run a reproducer test via testing-farm after a package build. The changes are extensive, touching configuration, agent logic, data models, and adding new tools. While the overall approach is sound, I've identified a few critical and high-severity issues that need to be addressed. Specifically, there's a critical logic error in the backport agent's workflow that will cause it to fail, an invalid JSON example in the triage agent's prompt that could lead to parsing errors, and a potential command injection vulnerability in the new testing-farm tool. Please see the detailed comments for suggestions on how to resolve these issues.
0bdb015 to
e2764c8
Compare
agents/build_agent.py
Outdated
There was a problem hiding this comment.
This makes no sense. First, URLs to built artifacts are already returned from the build_package tool, second, there could be multiple RPMs (even hundreds in some cases) - which one should the model choose? The correct answer is all of them, but that's not always possible, they could conflict with each other for example. It would be best to leave this to Testing Farm and pass a Copr build ID instead of a RPM.
agents/triage_agent.py
Outdated
There was a problem hiding this comment.
Why limit this only to backports?
agents/triage_agent.py
Outdated
There was a problem hiding this comment.
I suggest using reproducer test case instead of reproducer, everywhere, not just here. Reproducer test case succeeds when the reproducer fails - the issue didn't manifest.
agents/triage_agent.py
Outdated
There was a problem hiding this comment.
This should be part of the output schema - it seems it already is, so there is no reason for this to be here.
| process = await asyncio.create_subprocess_exec( | ||
| *cmd, | ||
| stdout=asyncio.subprocess.PIPE, | ||
| stderr=asyncio.subprocess.PIPE, | ||
| ) |
There was a problem hiding this comment.
I would rather send a request using aiohttp.
agents/backport_agent.py
Outdated
There was a problem hiding this comment.
Like I said elsewhere, this is confusing. We want the reproducer to fail - if it fails, the fix worked. But only if it didn't fail before applying the fix, if it did, it's a bogus reproducer. I would use the term reproducer test case, and run it twice, first without the new build, expected to fail, and then with the new build, expected to succeed.
agents/backport_agent.py
Outdated
There was a problem hiding this comment.
Isn't this line redundant? All the info is on the line above. Though I think we should come up with a convention how to reference test cases, for example, inspired by pip and similar (I think we even use something like that in packit):
https://gitlab.com/redhat/rhel/tests/expat.git@master#testcase=Security/RHEL-114639-CVE-2025-59375-expat-libexpat-in-Expat-allows
e2764c8 to
6f3550d
Compare
6f3550d to
4ee008e
Compare
No description provided.