Please add a customized summarization for code files for the read_file tool. #4

@DavidBeste

Description

Is your feature request related to a problem? Please describe.
Currently, the agent reads code files and truncates the output at a certain length. When reading a code file, the agent should get an overview of the functions it could target when generating a fuzzer. Truncation carries the risk that the agent misses potentially vulnerable functions worth fuzzing.

An example can be found in write_to_file_example.txt, the same demo file used for the write_to_file issue, where read_file is executed with the path "libtiff/libtiff/tif_read.c".

Describe the solution you'd like
Please add an option to systematically summarize the contents of code files in one of the following two ways:

  1. Instruct the LLM, as part of the read_file tool, to summarize the function heads (signatures) of the functions in code files.
  2. Use a parser to generate an abstract syntax tree and extract the function heads from it. The implementation would be more complex, but the risk of hallucination would be removed.

Describe alternatives you've considered
An alternative would be to read the files chunkwise over multiple turns, or to remove the truncation completely so that the agent also has the function bodies as context. However, this might result in high token usage and potentially long runs. Therefore, for our initial experiment, I suggest using function heads to see how well this performs.
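For comparison, the chunkwise alternative could be sketched roughly as follows. The function name `read_chunk` and its parameters are hypothetical; the idea is only that each call returns a slice of lines plus the line index at which the next call should resume, so the agent can page through a file over multiple turns.

```python
from typing import Optional, Tuple


def read_chunk(
    text: str, start_line: int = 0, max_lines: int = 200
) -> Tuple[str, Optional[int]]:
    """Return up to max_lines lines starting at start_line.

    The second element is the line index for the next call,
    or None once the end of the text has been reached.
    """
    lines = text.splitlines()
    end = min(start_line + max_lines, len(lines))
    chunk = "\n".join(lines[start_line:end])
    next_start = end if end < len(lines) else None
    return chunk, next_start
```

Every chunk costs one extra turn and its full token count, which is the overhead that motivates trying function heads first.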
