Skip to content

Sentinel#74

Draft
15r10nk wants to merge 8 commits intoalexmojaki:masterfrom
15r10nk:sentinel
Draft

Sentinel#74
15r10nk wants to merge 8 commits intoalexmojaki:masterfrom
15r10nk:sentinel

Conversation

@15r10nk
Copy link
Collaborator

@15r10nk 15r10nk commented Sep 10, 2023

different fixes for the SentinelNodeFinder

currently work in progress.

The question if this are required when we stop supporting python < 3.8

#64 is a new node finder which could be used for the same versions.

@15r10nk 15r10nk force-pushed the sentinel branch 2 times, most recently from bad6239 to 271b0a4 Compare July 18, 2024 06:01
@15r10nk
Copy link
Collaborator Author

15r10nk commented Jul 19, 2024

hi @alexmojaki, I'm currently working on the 3.13 branch but found some issues with the SentinelNode finder which cause failing tests. The test failures are random in the ci because we test a random subset of the cpython source code.

I fixed some of theme but I have now one problem which I don't know to solve.

The last wip commit of this branch contains a small_sample which reproduces this bug.

(dict(((k.lower(), v) for (k, v) in self.itermerged())) == (k.lower for (k, v) in something))

you can analyse it with:

python tests/analyse.py tests/small_samples/471ff7c2daa37eded70f016214f3ebe3915659b25fd0baffa6104bcc10
155fa1.py 0:20

Bildschirmfoto 2024-07-19 07:29:41

I hope you can help fixing it or understand why this bytecode can not be found, in which case we would need some code in test_main.py:TestFiles.check_filename which accepts this bug as known issue.

@15r10nk
Copy link
Collaborator Author

15r10nk commented Jul 19, 2024

This bug can be reproduced with python 3.8.3

@alexmojaki
Copy link
Owner

Can you give a traceback for the error?

@15r10nk
Copy link
Collaborator Author

15r10nk commented Jul 19, 2024

----------------------------------------- Captured stdout call -----------------------------------------
check /home/frank/projects/executing/tests/small_samples/471ff7c2daa37eded70f016214f3ebe3915659b25fd0baf
fa6104bcc10155fa1.py
----------------------------------------- Captured stderr call -----------------------------------------
node without associated Bytecode
in file: /home/frank/projects/executing/tests/small_samples/471ff7c2daa37eded70f016214f3ebe3915659b25fd0
baffa6104bcc10155fa1.py
correct: False
len(values): 0
values: []
deadcode: False

ast node:
lower
Attribute(value=Name(id='k', ctx=Load()), attr='lower', ctx=Load())
parents: [<_ast.GeneratorExp object at 0x7f7e5f952bb0>, <_ast.Compare object at 0x7f7e5f952220>, <_ast.E
xpr object at 0x7f7e5f952460>, <_ast.Module object at 0x7f7e6010f550>]
node range: lineno=2 end_lineno=2 col_offset=60 end_col_offset=67
source code:
   1: 
   2: (dict(((k.lower(), v) for (k, v) in self.itermerged())) == (k.lower for (k, v) in something))

======================================= short test summary info ========================================
FAILED tests/test_main.py::test_small_samples[471ff7c2daa37eded70f016214f3ebe3915659b25fd0baffa6104bcc10
155fa1.py] - AssertionError

The execption is raised inside the test, because it can not find bytecode for the node. It looks for me like the bytecode node mapping fails and this node is missing a bytecode now. But I have no idea what the problem with this code sample could be.

@alexmojaki
Copy link
Owner

I meant a traceback for the exception Expected one value, found 2. I think the problem is that both code objects look similar (name, lineno, variable names) so it fails there before even looking at instructions.

@15r10nk
Copy link
Collaborator Author

15r10nk commented Jul 19, 2024

here is the backtrace where the NotOneValueFound exception is raised:

>       TestFiles().check_filename(full_filename, check_names=True)

tests/test_main.py:749: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
tests/test_main.py:882: in check_filename
    result = list(self.check_code(code, nodes, decorators, check_names=check_names))
tests/test_main.py:1461: in check_code
    for x in self.check_code(inst.argval, nodes, decorators, check_names=check_names):
tests/test_main.py:1237: in check_code
    ex = Source.executing(frame)
executing/executing.py:238: in executing
    node_finder = NodeFinder(frame, stmts, tree, lasti, source)
executing/executing.py:589: in __init__
    matching = list(self.matching_nodes(exprs))
executing/executing.py:658: in matching_nodes
    original_instructions = self.get_original_clean_instructions()
executing/executing.py:647: in get_original_clean_instructions
    for inst in self.compile_instructions()
executing/executing.py:742: in compile_instructions
    code = only(self.find_codes(module_code))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

it = [<code object <genexpr> at 0x7fe1e69bda80, file "/home/frank/projects/executing/tests/small_samples
/471ff7c2daa37eded7...k/projects/executing/tests/small_samples/471ff7c2daa37eded70f016214f3ebe3915659b25
fd0baffa6104bcc10155fa1.py", line 2>]

    def only(it):
        # type: (Iterable[T]) -> T
        if isinstance(it, Sized):
            if len(it) != 1:
>               raise NotOneValueFound('Expected one value, found %s' % len(it),it)
E               executing.executing.NotOneValueFound: Expected one value, found 2

executing/executing.py:71: NotOneValueFound
----------------------------------------- Captured stdout call -----------------------------------------
check /home/frank/projects/executing/tests/small_samples/471ff7c2daa37eded70f016214f3ebe3915659b25fd0baf
fa6104bcc10155fa1.py
mapping failed
Expected one value, found 2
value: <code object <genexpr> at 0x7fe1e69bda80, file "/home/frank/projects/executing/tests/small_sample
s/471ff7c2daa37eded70f016214f3ebe3915659b25fd0baffa6104bcc10155fa1.py", line 2>
value: <code object <genexpr> at 0x7fe1e69bd7c0, file "/home/frank/projects/executing/tests/small_sample
s/471ff7c2daa37eded70f016214f3ebe3915659b25fd0baffa6104bcc10155fa1.py", line 2>
search bytecode Instruction(opname='LOAD_FAST', opcode=124, arg=1, argval='k', argrepr='k', offset=10, s
tarts_line=None, is_jump_target=False)
in file /home/frank/projects/executing/tests/small_samples/471ff7c2daa37eded70f016214f3ebe3915659b25fd0b
affa6104bcc10155fa1.py

You can see the code objects in the screenshot above where I analysed them.
But LOAD_ATTR(lower) and LOAD_METHOD(lower) looks different for me.

@alexmojaki
Copy link
Owner

After making AST modifications with the sentinel and compiling that, it needs to find which code object contains the changes. So it looks for a code object that looks similar to the code object of the original frame. The instructions will necessarily contain some differences so it can't use those to match. You can see the heuristics used in find_codes.

Now that I think about it, it could probably just look for a code object which has the sentinel string in co_consts. But maybe there's a reason that doesn't work. But even if that doesn't always work by itself, it would almost certainly work in combination with the existing strategy.

Otherwise, this is a known limitation that goes back all the way to the beginning. See

str([{tester(x) for x in [1]}, {tester(x) for x in [2]}])
. It's a very specific and rare edge case. In your example it happens because there's two <genexpr> codes on the same line, containing the same variable names (self and something aren't in those frames). Change k to k2 in one of them, or put a newline between them, and the problem should go away. This has probably never happened before in test_sys_modules, and if it's not easy to apply the suggested fix above then it's fine to just skip the file/module where it happens based on its name.

@15r10nk
Copy link
Collaborator Author

15r10nk commented Jul 19, 2024

Thank you for the explanation. I skipped the module tests for now in the 3.13 branch. I will look at it later again.

@alexmojaki
Copy link
Owner

Ah, but compile_instructions is called in two places. One of those places is where "look for a code object which has the sentinel string in co_consts" makes sense. The other place is in your traceback, before the AST is modified:

def get_original_clean_instructions(self):
# type: () -> List[EnhancedInstruction]
result = self.clean_instructions(self.code)
# pypy sometimes (when is not clear)
# inserts JUMP_IF_NOT_DEBUG instructions in bytecode
# If they're not present in our compiled instructions,
# ignore them in the original bytecode
if not any(
inst.opname == "JUMP_IF_NOT_DEBUG"
for inst in self.compile_instructions()
):
result = [
inst for inst in result
if inst.opname != "JUMP_IF_NOT_DEBUG"
]

But it only needs to do that if there are any JUMP_IF_NOT_DEBUG instructions in result, which is probably quite rare. If it checked that before calling compile_instructions then that would usually avoid the error at that line. It would also probably make everything significantly faster in general since compiling is expensive.

@15r10nk
Copy link
Collaborator Author

15r10nk commented Jul 23, 2024

I added the fix that you proposed, but I get now an exception at a different place:

tests/test_main.py:881: in check_filename
    result = list(self.check_code(code, nodes, decorators, check_names=check_names))
tests/test_main.py:1460: in check_code
    for x in self.check_code(inst.argval, nodes, decorators, check_names=check_names):
tests/test_main.py:1236: in check_code
    ex = Source.executing(frame)
executing/executing.py:238: in executing
    node_finder = NodeFinder(frame, stmts, tree, lasti, source)
executing/executing.py:589: in __init__
    matching = list(self.matching_nodes(exprs))
executing/executing.py:674: in matching_nodes
    instructions = self.compile_instructions()
executing/executing.py:740: in compile_instructions
    code = only(self.find_codes(module_code))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

it = [<code object <listcomp> at 0x7f473b5540e0, file "/home/frank/projects/executing/tests/small_sample
s/d8987afe0f74653bc...k/projects/executing/tests/small_samples/d8987afe0f74653bcddfe56e18c616dbd52d934cd
ff78644c5221935b35186f2.py", line 3>]

    def only(it):
        # type: (Iterable[T]) -> T
        if isinstance(it, Sized):
            if len(it) != 1:
>               raise NotOneValueFound('Expected one value, found %s' % len(it),it)
E               executing.executing.NotOneValueFound: Expected one value, found 2

executing/executing.py:71: NotOneValueFound
----------------------------------------- Captured stdout call -----------------------------------------
check /home/frank/projects/executing/tests/small_samples/d8987afe0f74653bcddfe56e18c616dbd52d934cdff7864
4c5221935b35186f2.py
mapping failed
Expected one value, found 2
value: <code object <listcomp> at 0x7f473b5540e0, file "/home/frank/projects/executing/tests/small_sampl
es/d8987afe0f74653bcddfe56e18c616dbd52d934cdff78644c5221935b35186f2.py", line 3>
value: <code object <listcomp> at 0x7f473b554500, file "/home/frank/projects/executing/tests/small_sampl
es/d8987afe0f74653bcddfe56e18c616dbd52d934cdff78644c5221935b35186f2.py", line 3>
search bytecode Instruction(opname='LOAD_GLOBAL', opcode=116, arg=0, argval='len', argrepr='len', offset
=8, starts_line=None, is_jump_target=False)
in file /home/frank/projects/executing/tests/small_samples/d8987afe0f74653bcddfe56e18c616dbd52d934cdff78
644c5221935b35186f2.py

@15r10nk
Copy link
Collaborator Author

15r10nk commented Jul 23, 2024

Now that I think about it, it could probably just look for a code object which has the sentinel string in co_consts. But maybe there's a reason that doesn't work. But even if that doesn't always work by itself, it would almost certainly work in combination with the existing strategy.

I will try this

@alexmojaki alexmojaki mentioned this pull request Aug 10, 2024
@alexmojaki
Copy link
Owner

OK, I tried this for a while, it doesn't really work. I could maybe make it work in your case where the instructions are clearly different, but it doesn't seem worth it. Let's just skip this specific edge case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants