@prashantbytesyntax prashantbytesyntax commented Sep 4, 2025

This implementation adds "Smart Skipping Logic" to gProfiler to reduce error rates when profiling short-lived processes, addressing the issues described in Intel gProfiler Issue #996.

Description

Problem Statement
gProfiler currently has a high error rate during profiling because of short-lived processes:

  • Short-lived processes: profilers attempt to profile processes that exit while profiling is in progress.
  • Impact: multiple errors per day from rbspy and py-spy failing on transient processes.
  • Root cause: race conditions with the process lifecycle.

Solution: Smart Skipping Logic
Core Implementation
  • Process age checking: skip processes younger than min_duration seconds.
  • Enhanced error handling: gracefully handle processes that exit during profiling.
  • Applied across the Ruby, Java, and Python profilers.
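
The two mechanisms above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `should_skip_process`, `profile_if_old_enough`, and their parameters are hypothetical names, the process start time is passed in as a plain timestamp instead of being read from the system, and `ProcessLookupError` stands in for whatever "process is gone" exception the real profilers catch. The boundary behavior (a process exactly at the threshold is profiled, and a threshold of 0 disables skipping) is assumed from the test names reported later in this conversation.

```python
import time
from typing import Callable, Optional


def should_skip_process(
    create_time: float, min_duration: float, now: Optional[float] = None
) -> bool:
    """Return True if the process is younger than min_duration seconds."""
    if min_duration <= 0:
        # Assumption: a zero (or negative) threshold disables skipping.
        return False
    if now is None:
        now = time.time()
    age = now - create_time
    # A process exactly at the threshold is old enough to be profiled.
    return age < min_duration


def profile_if_old_enough(
    create_time: float,
    min_duration: float,
    profile: Callable[[], str],
    now: Optional[float] = None,
) -> Optional[str]:
    """Profile a process unless it is too young or exits while being profiled."""
    if should_skip_process(create_time, min_duration, now):
        return None
    try:
        return profile()
    except ProcessLookupError:
        # The process exited mid-profiling: treat it as "nothing to report"
        # instead of surfacing an error.
        return None
```

The age check only narrows the race window; a process can still exit after passing it, which is why the sketch keeps the exception handler as well.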

How Has This Been Tested?

Unit tests, run on x86_64 and arm_64.

```
# Use default 10 second threshold
./gprofiler

# Custom threshold - skip processes younger than 5 seconds
./gprofiler --min-duration 5

# More aggressive - skip processes younger than 30 seconds
./gprofiler --min-duration 30
```
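
For readers unfamiliar with how such a flag is typically wired up, here is a small sketch using Python's standard `argparse`. It is not gProfiler's actual argument parser: `build_parser`, the `gprofiler-sketch` program name, and the integer type are assumptions for illustration. Only the `--min-duration` flag name and the 10 second default come from the examples above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser exposing only the --min-duration option."""
    parser = argparse.ArgumentParser(prog="gprofiler-sketch")
    parser.add_argument(
        "--min-duration",
        dest="min_duration",
        type=int,  # assumption: whole seconds
        default=10,  # default threshold stated in this PR description
        help="skip processes younger than this many seconds",
    )
    return parser


# Equivalent to running with no flags, then with `--min-duration 5`.
default_args = build_parser().parse_args([])
custom_args = build_parser().parse_args(["--min-duration", "5"])
```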

Checklist:

  • I have read the CONTRIBUTING document.
  • I have updated the relevant documentation.
  • I have added tests for new logic.

mlim19 commented Sep 25, 2025

@prashantbytesyntax Please fix the linter issue

dkorlovs previously approved these changes Oct 22, 2025

mlim19 commented Oct 23, 2025

@prashantbytesyntax and @ashokbytebytego, many tests still fail. One possible cause is shown below. Please revise.

CRITICAL: gprofiler.profilers.factory: Couldn't create the Perf profiler, not continuing. Run with --no-perf to disable this profiler
Traceback (most recent call last):
  File "gprofiler/profilers/factory.py", line 54, in get_profilers
TypeError: SystemProfiler.__init__() got an unexpected keyword argument 'min_duration'
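
A hedged sketch of how this kind of TypeError can arise: a factory forwards a new keyword argument to every profiler class, but one class's constructor was not updated to accept it. All class and function names here (`ProfilerBase`, `OldSystemProfiler`, `FixedSystemProfiler`, `create_profiler`) and the `frequency` parameter are hypothetical and do not reflect gProfiler's real class hierarchy. Only the `min_duration` keyword and the failure mode come from the log above.

```python
class ProfilerBase:
    """Toy base class that knows about the new min_duration option."""

    def __init__(self, min_duration: int = 10) -> None:
        self.min_duration = min_duration


class OldSystemProfiler(ProfilerBase):
    """Constructor was not updated, so it rejects min_duration."""

    def __init__(self, frequency: int) -> None:
        super().__init__()
        self.frequency = frequency


class FixedSystemProfiler(ProfilerBase):
    """Constructor accepts min_duration and forwards it to the base class."""

    def __init__(self, frequency: int, min_duration: int = 10) -> None:
        super().__init__(min_duration=min_duration)
        self.frequency = frequency


def create_profiler(profiler_cls, **kwargs):
    """Toy factory: passes the same keyword arguments to any profiler class."""
    return profiler_cls(**kwargs)


common_args = {"frequency": 11, "min_duration": 10}

try:
    create_profiler(OldSystemProfiler, **common_args)
    error_message = ""
except TypeError as exc:
    # The message names the unexpected keyword argument, as in the log above.
    error_message = str(exc)

fixed = create_profiler(FixedSystemProfiler, **common_args)
```

If this is indeed the cause, every profiler class constructed by the factory would need to accept the new keyword, or the factory would need to pass it only to the classes that support it.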

mlim19 commented Oct 28, 2025

@prashantbytesyntax @ashokbytebytego, I still see the linter issue with the PR. Please fix it. Please also rebase your PR onto the latest master, which can probably resolve one test failure.

@ashokbytebytego

@mlim19 I have run lint and the tests locally, and both pass. Please let me know if I missed anything.

source venv311/bin/activate && ./lint.sh --ci  # lint
Skipped 3 files
All done! ✨ 🍰 ✨
73 files would be left unchanged.
Success: no issues found in 73 source files

source venv311/bin/activate && sudo -E $(which python3) -m pytest -v tests/test_short_lived_process_fix.py
/home/achatharajupalli/code/PnterestGprofiler/Intel/gprofiler_arm_64/venv311/lib/python3.11/site-packages/pytest_asyncio/plugin.py:217: PytestDeprecationWarning: The configuration option "asyncio_default_fixture_loop_scope" is unset.
The event loop scope for asynchronous fixtures will default to the fixture caching scope. Future versions of pytest-asyncio will default the loop scope for asynchronous fixtures to function scope. Set the default fixture loop scope explicitly in order to avoid unexpected behavior in the future. Valid fixture loop scopes are: "function", "class", "module", "package", "session"

warnings.warn(PytestDeprecationWarning(_DEFAULT_FIXTURE_LOOP_SCOPE_UNSET))
============================= test session starts ==============================
platform linux -- Python 3.11.14, pytest-8.3.5, pluggy-1.6.0 -- /home/XXX/Intel/gprofiler_arm_64/venv311/bin/python3
cachedir: .pytest_cache
rootdir: /home/XXXXX/Intel/gprofiler_arm_64/tests
configfile: pytest.ini
plugins: asyncio-0.26.0
asyncio: mode=Mode.STRICT, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 15 items

tests/test_short_lived_process_fix.py::TestShortLivedProcessFix::test_custom_min_duration_threshold PASSED [ 6%]
tests/test_short_lived_process_fix.py::TestShortLivedProcessFix::test_error_scenarios PASSED [ 13%]
tests/test_short_lived_process_fix.py::TestShortLivedProcessFix::test_much_older_process_is_not_skipped PASSED [ 20%]
tests/test_short_lived_process_fix.py::TestShortLivedProcessFix::test_older_process_is_not_skipped PASSED [ 26%]
tests/test_short_lived_process_fix.py::TestShortLivedProcessFix::test_process_age_calculation_accuracy PASSED [ 33%]
tests/test_short_lived_process_fix.py::TestShortLivedProcessFix::test_process_at_threshold_is_not_skipped PASSED [ 40%]
tests/test_short_lived_process_fix.py::TestShortLivedProcessFix::test_process_just_under_threshold_is_skipped PASSED [ 46%]
tests/test_short_lived_process_fix.py::TestShortLivedProcessFix::test_very_young_process_is_skipped PASSED [ 53%]
tests/test_short_lived_process_fix.py::TestShortLivedProcessFix::test_young_process_is_skipped PASSED [ 60%]
tests/test_short_lived_process_fix.py::TestShortLivedProcessFix::test_zero_min_duration_disables_skipping PASSED [ 66%]
tests/test_short_lived_process_fix.py::TestErrorReductionScenarios::test_build_script_scenario PASSED [ 73%]
tests/test_short_lived_process_fix.py::TestErrorReductionScenarios::test_container_init_scenario PASSED [ 80%]
tests/test_short_lived_process_fix.py::TestErrorReductionScenarios::test_database_scenario PASSED [ 86%]
tests/test_short_lived_process_fix.py::TestErrorReductionScenarios::test_utility_command_scenario PASSED [ 93%]
tests/test_short_lived_process_fix.py::TestErrorReductionScenarios::test_web_server_scenario PASSED

=============================== warnings summary ===============================
gprofiler/metadata/py_module_version.py:31
/home/achatharajupalli/code/PnterestGprofiler/Intel/gprofiler_arm_64/gprofiler/metadata/py_module_version.py:31: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
import pkg_resources

tests/test_short_lived_process_fix.py:22
/home/achatharajupalli/code/PnterestGprofiler/Intel/gprofiler_arm_64/tests/test_short_lived_process_fix.py:22: PytestCollectionWarning: cannot collect test class 'TestProfilerBase' because it has a __init__ constructor (from: test_short_lived_process_fix.py)
class TestProfilerBase:

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
======================== 15 passed, 2 warnings in 0.02s ========================

@prashantbytesyntax prashantbytesyntax merged commit acea67f into intel:master Oct 31, 2025
78 of 83 checks passed