#586: Investigate erratic and unpredictable CI error regarding synthetic case (NEW) #590

Draft
cwschilly wants to merge 8 commits into develop from
586-investigate-erratic-and-unpredictable-CI-error-regarding-synthetic-case-NEW

@cwschilly
Contributor

Fixes #586

@cwschilly cwschilly force-pushed the 586-investigate-erratic-and-unpredictable-CI-error-regarding-synthetic-case-NEW branch from d2a2b79 to 96a3368 Compare February 7, 2025 21:28
@cwschilly cwschilly marked this pull request as draft February 7, 2025 21:29
@cwschilly cwschilly self-assigned this Feb 7, 2025
@ppebay
Contributor

ppebay commented Feb 8, 2025

Running the corresponding case 100 times (with only 8 iterations and 1 object per transfer):

for run in {1..100}; do python LBAF_app.py -c synthetic-acceptance.yaml; (cat ../../../output/imbalance.txt; echo) >> res; done

on macOS 14.3 with Python 3.9.19 yields:

awk '{ sum += $1 } END { print sum }' res
0

In other words, all 100 runs correctly converged to an imbalance of 0.0.
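
A per-run tally can complement the summed check above, since a sum only shows whether any run deviated, not which ones. The following is a minimal sketch, assuming `res` contains one imbalance value per non-empty line (as the loop above should produce); the function name `summarize_imbalances` and the tolerance `tol` are hypothetical and not part of LBAF:

```python
from pathlib import Path


def summarize_imbalances(lines, tol=1e-12):
    """Return (n_runs, failing), where failing lists (run_index, value) pairs.

    `lines` is any iterable of strings holding one imbalance value per
    non-empty line; blank lines (e.g. those added by `echo`) are skipped.
    """
    values = [float(s) for s in (ln.strip() for ln in lines) if s]
    # Runs are numbered from 1 to match the shell loop counter.
    failing = [(i + 1, v) for i, v in enumerate(values) if abs(v) > tol]
    return len(values), failing


if __name__ == "__main__":
    res = Path("res")  # assumed: the file written by the shell loop
    if res.exists():
        with res.open() as f:
            n_runs, failing = summarize_imbalances(f)
        print(f"{n_runs} runs, {len(failing)} with non-zero imbalance")
        for run, value in failing:
            print(f"  run {run}: imbalance = {value}")
```

Run from the directory containing `res`, this would list the index and value of every run that did not converge, which may help correlate failures with a specific run when the error is intermittent.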

This seems to point to a problem with the CI system itself, and specifically with its Python 3.8 environment, rather than with the code.
@cwschilly

@ppebay
Contributor

ppebay commented Feb 9, 2025

In keeping with the above comment, I created a Python 3.8 environment and performed the same test (100 runs):

for run in {1..100}; do python LBAF_app.py -c synthetic-acceptance.yaml; (cat ../../../output/imbalance.txt; echo) >> res; done

and obtained the same result:

awk '{ sum += $1 } END { print sum }' res
0

i.e., all 100 runs passed the test.

Specifically, the Python version is:

 python --version
Python 3.8.20

@cwschilly can we verify if this is the same version that runs in CI?
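
One way to make that comparison precise is to print the full interpreter details (including the patch level) in both the local environment and a CI step. This is only a sketch using the standard library; the helper name `interpreter_report` is hypothetical and not part of LBAF or its CI configuration:

```python
import platform
import sys


def interpreter_report():
    """Collect interpreter details worth comparing across environments."""
    return {
        "version": platform.python_version(),  # full version, e.g. "3.8.20"
        "implementation": platform.python_implementation(),
        "executable": sys.executable,  # which interpreter actually ran
        "platform": platform.platform(),  # OS and architecture string
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(f"{key}: {value}")
```

Comparing the two outputs side by side would show whether the environments differ only in patch level or also in implementation or platform.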

@cwschilly
Contributor Author

cwschilly commented Feb 10, 2025

@ppebay The CI uses Python 3.8.18. I have been able to recreate the problem locally by running tox with Python 3.8.19, so I don't think it's just a problem with CI.

@cwschilly cwschilly marked this pull request as ready for review February 17, 2025 20:12
@cwschilly cwschilly requested a review from ppebay February 17, 2025 20:12
@cwschilly cwschilly marked this pull request as draft February 18, 2025 20:30
@cwschilly
Contributor Author

After changes to the ordering of ranks, Python 3.8 acceptance passed several times in a row. However, Python 3.11 failed: https://github.com/DARMA-tasking/LB-analysis-framework/actions/runs/13397121375/job/37418861102?pr=590
