Skip to content

Conversation

@xFrednet
Copy link
Collaborator

@xFrednet xFrednet commented Jan 9, 2025

The first commit is a copy pasta from #34 and will be removed once it's merged

This PR covers all Py_CHECKWRITE in bytecodes.c and cellobject.c. The bytecode instructions probably still need more work for local references, but that can easily be done in a followup PR

@xFrednet xFrednet marked this pull request as draft January 10, 2025 09:20
@xFrednet xFrednet force-pushed the phase3-bytecode-wb branch 2 times, most recently from d993533 to 95cc74b Compare January 10, 2025 09:41
@xFrednet xFrednet changed the title [Draft] Phase3: Add write barrier to Cell and some byte code [Draft] Phase3: Add write barrier to Cell and some bytecodes Jan 10, 2025
@xFrednet

This comment was marked as outdated.

@xFrednet xFrednet force-pushed the phase3-bytecode-wb branch 2 times, most recently from 66c3073 to b24bd15 Compare January 15, 2025 15:28
Comment on lines +2002 to +2041
bool _Py_RegionChangeReference(PyObject* src, PyObject* old_tgt, PyObject* new_tgt) {
// Only run the write barrier(WB) if the reference targets are different.
// This is important for bridge objects, as there can only be one owning
// reference at a time and adding a new reference first would throw a
// region error due to the WB believing that it would result in a second
// owning reference being created.
//
// Removing the reference before the new one also doesn't work. As it
// could cause a parent region to close prematurely and release a cown.
// It would also require adding the removed reference again, if the
// `_Py_RegionAddReference` fails.
if (old_tgt == new_tgt) {
Copy link
Collaborator Author

@xFrednet xFrednet Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned in earlier this week that we're only allowed to run the WB if the referenced object actually changes. I tried to document the reason here.

While writing, I also noticed that this case isn't handled in the "Pseudo code for write-barrier" in the paper we submitted to PLDI25. It will simply require a new if statement in writeBarrier(src, oldtgt, tgt) like this:

if oldtgt is tgt:
  return

or probably better

if not oldtgt is tgt:
  addReference(src, tgt)
  removeReference(src, oldtgt)

cc: @TobiasWrigstad @mjp41


EDIT: I also created a reminder PR in the paper repo fxpl/pyrona#12

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent @xFrednet!

If I understand correctly, the problem is due to double capturing of a borrowed ref to a bridge object, which is always an error except in the very special case when the double capturing happens by the same field. This program produces the error:

r1 = Region() # borrowed ref to region
r2 = Region() # borrowed ref to region
class Obj: pass
r1.obj = Obj()
r1.obj.f = r2 # o captures r2, but r2 remains a borrowed ref because no move
r1.obj.f = r2 # 2nd attempt at storing r2 in o.f fails (but does not trigger invariant) because r2 is already captured

If we ran REMOVEREFERENCE before ADDREFERENCE this would work, but we don't do that (nor should we).

If my understanding is correct, I am on the fence if we should actually fix it because it is almost certainly an error in the program (although I am sure Python people would like it to work).

If we do want to fix it, I suggest the fix happens as an extra check after add_to_region detects a failure due to r2 not being free. Otherwise, we have one more branch in the barrier for a case that is super rare. This penalises this very unlikely case instead of adding an extra check in the barrier top-level.

@xFrednet xFrednet marked this pull request as ready for review January 15, 2025 15:33
@xFrednet
Copy link
Collaborator Author

This should be ready for review. The first commit includes some changes from #34 to hopefully prevent merge conflicts later

@xFrednet
Copy link
Collaborator Author

I just rebased to get the Py_TPFLAGS_REGION_AWARE flag

@xFrednet xFrednet changed the title [Draft] Phase3: Add write barrier to Cell and some bytecodes [Phase3: Add write barrier to Cell and some bytecodes Jan 15, 2025
@xFrednet xFrednet changed the title [Phase3: Add write barrier to Cell and some bytecodes Phase3: Add write barrier to Cell and some bytecodes Jan 15, 2025
Copy link
Owner

@mjp41 mjp41 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, my main concern is if trigger remove reference before removing the edge from the heap could be a problem. Could we end up with remove reference triggering a desctructor, and then the invariant not holding? Are there worse things that could happen.

return -1;
}

PyCell_SET(op, Py_XNewRef(value));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could there be any issues here that the edge still exists when the region remove reference occurs?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. The invariant will only run when the bytecode is done executing. If a different thread has access to the Cell and gains access to the object between the Py_REGIONCHANGEREFERENCE and PyCell_SET we have a different problem on our hands.

I believe this should be safe, as long as region remove reference only decreases the LRC and doesn't try to verify anything

Copy link
Owner

@mjp41 mjp41 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. One very minor thing.

Comment on lines 143 to 144
Py_REGIONREMOVEREFERENCE(op, op->ob_ref);
Py_CLEAR(op->ob_ref);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Py_REGIONREMOVEREFERENCE(op, op->ob_ref);
Py_CLEAR(op->ob_ref);
Py_CLEAR_OBJECT_FIELD(op, op->ob_ref);

@xFrednet
Copy link
Collaborator Author

I only rebased previously, now your comment should be addressed, and I also fixed a small issue with the macro and where it got it's type from.

Copy link
Owner

@mjp41 mjp41 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@xFrednet xFrednet force-pushed the phase3-bytecode-wb branch from 93af619 to c35b414 Compare April 8, 2025 14:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants