Apparent bug with for loops #94

@micsthepick

I know only a little about how the fuzzing process works, and almost nothing about how atheris instruments the bytecode, but I've noticed the following discrepancy:

import atheris

with atheris.instrument_imports():
    import sys

@atheris.instrument_func
def Fuzz(data: bytes):
    string = 'thisisalongstringtotestatherisandtomakesurethatithandlesaforloopcorrectly'

    if len(data) < 1:
        return

    fdp = atheris.FuzzedDataProvider(data)

    data_unicode = fdp.ConsumeUnicode(len(data))

    if len(data_unicode) <= 0 or data_unicode[0] != "t":
        return
    elif len(data_unicode) <= 1 or data_unicode[1] != "h":
        return
    ...<repetitive source code elided for brevity>...
    elif len(data_unicode) <= 71 or data_unicode[71] != "l":
        return
    elif len(data_unicode) <= 72 or data_unicode[72] != "y":
        return
    raise ValueError("BOOM!")


if __name__ == '__main__':
    atheris.Setup(sys.argv, Fuzz)
    atheris.Fuzz()

works fine, but when I condense it into a for loop:

import atheris

with atheris.instrument_imports():
    import sys


@atheris.instrument_func
def Fuzz(data: bytes):
    string = 'thisisalongstringtotestatherisandtomakesurethatithandlesaforloopcorrectly'

    if len(data) < 1:
        return

    fdp = atheris.FuzzedDataProvider(data)

    data_unicode = fdp.ConsumeUnicode(len(data))

    for i in range(len(string)):
        if len(data_unicode) <= i or data_unicode[i] != string[i]:
            break
    else:
        raise ValueError("BOOM!")


if __name__ == '__main__':
    atheris.Setup(sys.argv, Fuzz)
    atheris.Fuzz()

Expected behaviour:
Both examples take a comparable amount of time (allowing for the unrolled version probably being somewhat faster) and finish with the complete string as the crashing input.

Observed behaviour:
The for-loop version doesn't finish, and only ever gets a few characters right at a time.

Further notes:
Because the two versions are functionally equivalent, I wouldn't expect the for loop to take so much longer (at this stage it's looking like a heat-death-of-the-universe kind of slow).
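
For what it's worth, the "functionally equivalent" claim can be sanity-checked without atheris at all. The following is only a rough sketch: the helper names (`build_unrolled`, `check_unrolled`, `check_loop`) are made up for illustration, the if/elif chain is generated from the target string rather than typed out, and both checks return `True` where the real harnesses would raise `ValueError("BOOM!")`.

```python
TARGET = 'thisisalongstringtotestatherisandtomakesurethatithandlesaforloopcorrectly'


def build_unrolled(target: str):
    """Generate and compile an if/elif chain like the first listing (hypothetical helper)."""
    lines = ["def check(data_unicode):"]
    for i, ch in enumerate(target):
        keyword = "if" if i == 0 else "elif"
        lines.append(
            f"    {keyword} len(data_unicode) <= {i} or data_unicode[{i}] != {ch!r}:"
        )
        lines.append("        return False")
    # Reaching this point corresponds to `raise ValueError("BOOM!")` in the harness.
    lines.append("    return True")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["check"]


def check_loop(data_unicode: str, target: str = TARGET) -> bool:
    """Same comparison as the second listing, written as a for/else loop."""
    for i in range(len(target)):
        if len(data_unicode) <= i or data_unicode[i] != target[i]:
            break
    else:
        # The loop finished without a break: every character matched.
        return True
    return False


check_unrolled = build_unrolled(TARGET)

# Every prefix of the target, plus a few near misses.
samples = [TARGET[:n] for n in range(len(TARGET) + 1)]
samples += [TARGET + "x", "x" + TARGET, "thjs", ""]

for sample in samples:
    assert check_unrolled(sample) == check_loop(sample), sample
```

Both checks agree on every sample, so any difference in fuzzing time presumably comes from how the two shapes of code are instrumented or explored, not from what they compute.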
