-
Notifications
You must be signed in to change notification settings - Fork 87
Improve MHP precision using ancestor locksets #1865
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some general questions:
- Is there a way to enforce configuration settings to be set to certain values in order for the analysis to work (i.e. similar to how analyses can be declared as dependencies)? This would be necessary for the settings revolving around thread ids as described in the pr summary. Checking settings using
GobConfig.get_stringat the start of transfer functions doesn't really seem ideal to me - is there an ideal amount of tests I should write or can I write as many as I feel to be necessary?
src/analyses/creationLockset.ml
Outdated
| match tid_lifted, child_tid_lifted with | ||
| | `Lifted tid, `Lifted child_tid -> | ||
| let descendants = descendants_closure child_ask child_tid in | ||
| let lockset = ask.f Queries.MustLockset in | ||
| let to_contribute = cartesian_prod (TIDs.singleton tid) lockset in | ||
| TIDs.iter (contribute_lock man to_contribute) descendants | ||
| | _ -> (* TODO deal with top or bottom? *) () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does it mean for a thread id to be top or bottom? Not sure how to deal with this here and in some other places
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should usually not happen. If it does, I think the best is to assume that all bets are off.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you suggest emitting a warning or something similar here or can I keep everything the way it is? (except for removing the comments of course)
|
As many as necessary to cover all the functionality and ideally the corner cases of the domains/analyses. |
that's what I needed, nice :D |
I'm going to add some more in the coming days 👍 |
|
The ci builds failed due to an incorrect semi colon. I fixed this in f1efd32, so everything should be working now |
|
90/10 exposes a bug, which is caused by the fact that locksets have a more granular path sensitivity than the implemented creation lockset analyses. This will be fixed soon. |
|
I asked copilot for a review, 50% joking. See what makes sense to you, but don't buy all the stuff it claims. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR implements ancestor lockset analysis to improve May-Happen-in-Parallel (MHP) precision in the Goblint analyzer. The approach tracks locks held by ancestor threads when creating descendant threads, enabling better race detection by identifying threads that are protected by ancestor locks.
Key changes:
- Adds
transitiveDescendantsanalysis to track transitive descendant threads - Implements
creationLocksetandtaintedCreationLocksetanalyses for ancestor lock tracking - Refactors lock domain handling with a new helper function
- Includes comprehensive test suite with 17 test cases covering various scenarios
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
src/analyses/transitiveDescendants.ml |
New flow-insensitive analysis mapping threads to may-sets of descendants |
src/analyses/creationLockset.ml |
Core implementation of ancestor lockset analysis with creation and tainted creation lockset specs |
src/cdomains/lockDomain.ml |
Adds of_addr helper function to convert addresses to must-locks |
src/analyses/mutexGhosts.ml |
Refactored to use new LockDomain.MustLock.of_addr helper |
src/domains/queries.ml |
Adds DescendantThreads and MayCreationLockset query types with supporting infrastructure |
src/goblint_lib.ml |
Registers new TransitiveDescendants and CreationLockset modules |
src/maingoblint.ml |
Adds configuration validation for creation lockset analysis requirements |
tests/regression/90-races_ancestor_locksets/*.c |
Comprehensive test suite with 17 test cases covering race-free and racing scenarios |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
tests/regression/53-races-mhp/12-lockset_inter_threaded_lock_transitive_racefree.c
Show resolved
Hide resolved
|
@dabund24: I am inviting you to the organization so the CI runs automatically for your PRs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not yet completely reviewed it (please re-request my review once you have fixed the bug you found).
I like the degree to which this re-uses things that are already there, and think it will make a nice addition to Goblint (even if we have to contribute the cases where it makes a difference to SV-COMP ourselves, but that's more a problem of how few benchmarks we have there).
While looking over it superficially, I started wondering about something:
You are explicitly constructing this set of descendants here. Is this actually needed?
- Can you get away with reconstructing it dynamically via the
ancestorinformation encoded in the thread ids, e.g., inmay_race? - Or in the case of tainted locksets by storing something along the lines of (whatever is a descendant of
t_1except those that are given explicitly inJhas potentially tainted lockset)?
May be that there is very good reasons that this is not possible (or not possible without massive precision loss), but one should maybe waste a few minutes thinking about it. The more modular the collected information is, the nicer the whole analysis becomes.
Stupid question: Is that not a problem of how CL is constructed that is in principle independent of path-sensitivity and can, e.g., also arise as a result of context-sensitivity? Assuming Would Then, Afterthought: how do you deal with ambiguous creators? I guess giving up when the thread id is no longer unique? The push for a more modular solution was that I implemented something that looks somewhat similar on the surface (#1065) which turned out to cause a slow-down by a factor of 4 (#1120), which we have still not fixed. But maybe we can go with the descendant global invariant for now and then check later if it causes any slowdown we're unwilling to pay on real programs? We can still go for the more involved local solution later if this is the case? |
I haven't thought about that, but you are right, we do have the same problem here. In 1db14cb I added another test, which covers this. However, I think that the fix for the path-sensitivity problem should fix this, too, since in that case, we would again have another creation statement without the mutex locked and add it to the tainted set (or use your approach).
I was assuming initially, that the three cases in the section "Notes on non-unique thread ids" from the PR-summary are the only relevant cases, where ambiguous creators could be a problem, but thinking about it, that is a really bold claim, which I just should not make without knowing a proof for it. Checking if the descendant threads have a unique thread id wouldn't even result in a loss of precision in most cases[1], since the
I'm amazed, that is so much nicer O_O
I am going to implement the other case, too, since I am also somewhat intrigued now, how the two approaches will compare. Thanks a lot for your remarks! [1] it would if we never unlock and never join, but I think that this wouldn't be too tragic |
|
Is there a way of accessing all (must-)ancestors of a thread? Getting all threads in general or all keys of a global analysis would also work, but I don't see any of that to be possible at first glance. If that's the cas, I can add a |
The simpler version is now done (at least that's what I believe) and in sync with the PR-summary. If you want to review it already, you can do so. Otherwise, feel free to ignore the review request until the alternative solution is also implemented. |
I probably won't get around to it until some time next week, but I added it to my TODO (list / stack / multiset). |
I think in general no. However, if you only need the must ancestors of definite thread ids (which I guess is true in your case?), you can reconstruct them as the new create edge is simply appended to the sequence of the parent. Such a function |
sim642
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine by me as well, but @DrMichaelPetter should also have a look before we merge this.
DrMichaelPetter
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
Huge thanks to all of you for your reviews and helpful hints! |

First part of #1805.
Second case is handled in #1913.
To be handled
Non-transitive version
When creating$t_1$ , $t_0$ must hold a lock $l$ . If $l$ is not released before $t_1$ is definitely joined into $t_0$ , $t_1$ is protected by $l$ .
Examples
graph TB; subgraph t1; E["..."]-->F["return;"]; end; subgraph t0; A["lock(l);"]-->B; B["create(t1);"]-->C; C["join(t1);"]-->D["unlock(l);"] end; B-.->E F-.->Cgraph TB; subgraph t1; E["..."]-->F["return;"]; end; subgraph t0; A["lock(l);"]-->B; B["create(t1);"]-->C[return;]; end; B-.->EGeneral version
Let$t_1$ be a must-ancestor of $t_0$ . When creating $t_1$ , $t_0$ must hold a lock $l$ . If $l$ is not released before $t_d$ is definitely joined into $t_0$ , $t_d$ is protected by $l$ .
Example
graph TB; subgraph td; G["..."]-->H["return;"]; end; subgraph t1; E["create(td);"]-->F["return;"]; end; subgraph t0; A["lock(l);"]-->B; B["create(t1);"]-->C; C["join(td);"]-->D["unlock(l);"] end; B-.->E E-.->G H-.->CDependency Analyses
Conditions to satisfy
create(t1)increate(t1)inunlock(l)inAnalysis
MapBotContributions
create(t1):unlock(l):unlockof unknown mutex:Rules for MHP exclusion
Program points$s_1$ with $\mathcal T_1$ , $\mathcal L_1$ and $\mathcal{CL}_1$ and $s_2$ with $\mathcal T_2$ , $\mathcal L_2$ and $\mathcal{CL}_2$ cannot happen in parallel if at least one condition holds: