Conversation
|
Almost working, it seems to be mixing up the |
|
Thanks for writing this up Tony! Our project is definitely sensitive to these kinds of changes and they can be somewhat tricky to debug. Please let us know if we can help in any way :) |
|
Python 3.13 is coming. Any updates here? |
|
I think we need to match so it's not sufficient to delete PRECALL; we need to update a bunch of opcodes' sizes. And it's changing even further in CPython 3.13. I'm not familiar with CPython, so
Or, you could stop supporting backwards compatibility and update Hope this helps someone get started on the enhancement! 🫡 |
|
hello, how to deal with code instrument error for Python 3.12.8: Log Thanks! |
|
i modified i am using termux to build it and it is amazing that it has not failed |
|
it successfully installed. changes i did:
python version: python -c "import atheris": i will fix that, the fact that it works is enough for me |
|
Thx @mahdoosh1! It's great to see such engagement from end users. Maybe a little more transparency about where I'm at with debugging would help.
Unfortunately, what is described above is definitely not sufficient to get the fuzzer to actually run: there is also the problem of jump instruction offsets being calculated differently based on the number of CACHE instructions. From https://docs.python.org/3.12/library/dis.html "Changed in version 3.12: The argument of a jump is the offset of the target instruction relative to the instruction that appears immediately after the jump instruction’s CACHE entries."
I have the following code in my draft change to handle this, all within instrument_bytecode.py: def get_cache_offset(i: int, instructions: List[dis.Instruction]) -> int: Then I've created a new attribute in the Instruction class called and in the check_state method on the same class:
I modified str_fuzzing_example.py to include a for loop (which is what currently causes it to fail) to reproduce the breakages we're seeing in our other internal fuzzers. Even with this change the fuzzer still crashes with this error: === Uncaught Python exception: ===
The lack of debug info is...concerning. I am not sure why this is breaking and don't really have the tools to debug, because what I really need is to step through bytecode instruction by instruction, but there is no tooling support to do that internally (I have already reached out to the Python team internally to confirm this). So this is why it's stuck.
I cannot emphasize enough how important it is that Atheris can run on Google's internal infrastructure, if we cannot do that then my fear is the project may ultimately be abandoned due to lack of business value to the company. So I can't just ignore the internal side of things. One thing I haven't investigated enough is how this interacts with Google's internal Python infra and build system. Maybe that is the next step. |
|
@AidenRHall that is unfortunate, it might be possible to rewrite some functions and classes that break in python >= 3.11; i am afraid i can't help, this is a big project (owned by big company google), and i have no experience on such projects, i will help you if i manage to make it work. |
|
The breakage I'm seeing is too general (it breaks if there's a for loop) to paper over with specific refactors - fuzzing instrumentation has to work in the general case. Of course I don't expect you to debug this (or Google's internal python infra), although I appreciate your efforts. Big companies own lots of big projects, and unfortunately this one is small compared to the others - it is unrealistic to expect any additional resources to get allocated to Atheris in the foreseeable future, other than through me evangelizing Atheris internally to find teams who might want to use it for their projects. |
|
i get EDIT: i did not fix this. this is coming from an experimental function. |
|
debug info: |
|
it might be because of LOAD_FAST: |
|
Well done @mahdoosh1! Love to see the engagement here.
I suspect this is due to the change in how jump offsets are computed: https://docs.python.org/3/whatsnew/3.12.html#cpython-bytecode-changes, specifically this part: Remove the LOAD_METHOD instruction. It has been merged into LOAD_ATTR. LOAD_ATTR will now behave like the old LOAD_METHOD instruction if the low bit of its oparg is set. (Contributed by Ken Jin in gh-93429.)
If you want to account for this I added these lines to version_dependant.py: and then replace the calls inside instrument_bytecode.py with these functions.
I hope that unblocks your debugging - it's actually really useful to have someone working on this on the OSS side, please let me know what your next error is if you continue pursuing this work.
I am getting this one now, but I'm stuck on getting better debug info: Part of the problem is the debug metadata isn't getting modified on my end, so I'm pretty sure this error message is misleading.
I am gonna try messing with the modified version of str_fuzzing_example.py that I have locally to see if I can uncover more information about when it does or does not crash: Hope this helps! |
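To illustrate the LOAD_ATTR change quoted above, here is a minimal sketch of how the oparg can be split and rebuilt. It is not the code added to version_dependant.py (that snippet is missing from this thread): the function names and the `py312_layout` parameter are hypothetical. It relies only on the documented 3.12 layout, where the name index is `oparg >> 1` and the low bit selects the method-style load.

```python
from typing import Tuple


def decode_load_attr_arg(oparg: int, py312_layout: bool = True) -> Tuple[int, bool]:
    """Return (index into co_names, method-style flag) for a LOAD_ATTR oparg."""
    if py312_layout:
        # 3.12+: the low bit is the flag, the remaining bits are the name index.
        return oparg >> 1, bool(oparg & 1)
    # Before 3.12 the oparg is the name index and there is no flag.
    return oparg, False


def encode_load_attr_arg(name_index: int, as_method: bool = False,
                         py312_layout: bool = True) -> int:
    """Inverse of decode_load_attr_arg."""
    if py312_layout:
        return (name_index << 1) | int(as_method)
    return name_index


if __name__ == "__main__":
    print(decode_load_attr_arg(5))        # name index 2, method-style load
    print(encode_load_attr_arg(2, True))  # back to 5
```

Any instrumentation that rewrites or looks up `co_names` indices from `instr.arg` would have to go through a conversion of this kind on 3.12 and later.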
|
Okay, now my crash is fixed, thank you for the fix. i will work on yours as well soon |
|
Ah the problem was that a The bytecode in question looks like this: So if we write a for loop that looks like this: It prints because of the lack of a NULL value on the stack, which causes the 2nd calling convention from above to be used, in which case the iterable item is called instead of the actual |
|
i'm confused, where is adjust_arg used? EDIT: i noticed it, it is used when working with instr.arg |
|
it is acting weird on my side, it is giving me Segmentation Fault. |
This adds support for Python 3.12 (so far, the release is months away).
PRECALL and LOAD_METHOD have been removed. So the if-macro that says version >= 3.11 would be invalid for all future releases.
JUMP_IF_TRUE_OR_POP and JUMP_IF_FALSE_OR_POP have been removed.
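Since opcodes come and go between releases, one alternative to open-ended version comparisons is to ask the running interpreter which opcodes it has. The sketch below shows that idea on the Python side only. `opcode_available` is a hypothetical helper, not part of Atheris, and the `< 256` filter is an assumption to exclude the pseudo-instructions that some versions also list in `dis.opmap`.

```python
import dis


def opcode_available(name: str) -> bool:
    """True if `name` is a real (non-pseudo) opcode in the running interpreter."""
    number = dis.opmap.get(name)
    # Assumption: real opcodes fit in one byte; pseudo-instructions are >= 256.
    return number is not None and number < 256


if __name__ == "__main__":
    for name in ("PRECALL", "LOAD_METHOD",
                 "JUMP_IF_TRUE_OR_POP", "JUMP_IF_FALSE_OR_POP"):
        print(name, opcode_available(name))
```

A check like this keeps working when a later release removes or reintroduces an opcode, whereas a `>= 3.11` condition silently covers every future version.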