Skip to content

Conversation

@skissue
Copy link
Owner

@skissue skissue commented Nov 24, 2025

Decided to make some miscellaneous improvements while rewriting it to be async.

@skissue
Copy link
Owner Author

skissue commented Nov 24, 2025

@ultronozm, would you like to take a glance?

@ultronozm
Copy link
Contributor

I tried it and it works, but I found it to be a bit brittle -- if the LLM attempts a search that returns "too much", then it eventually locks up my Emacs due to memory issues. I also noticed that it searches in .git subfolders, by default. I don't have any quick suggestions and suspect the same issues occurred for the earlier implementation.

One thing: the case-arg variable is never used, and should be passed to start-process.

@skissue
Copy link
Owner Author

skissue commented Nov 25, 2025

I tried it and it works, but I found it to be a bit brittle -- if the LLM attempts a search that returns "too much", then it eventually locks up my Emacs due to memory issues. I also noticed that it searches in .git subfolders, by default. I don't have any quick suggestions and suspect the same issues occurred for the earlier implementation.

Hm, perhaps we should add a default --max-count parameter to limit the amount of content returned from each file. However, that wouldn't fix the case where the LLM decides to search e.g. my home folder and finds thousands of files with matches. Maybe we should start a shell to pipe into head?

With regards to .git subfolders: this seems to simply be a shortcoming of grep. I am planning to make a similar tool that uses ripgrep instead, which respects .gitignore and such by default.

One thing: the case-arg variable is never used, and should be passed to start-process.

🤦, let me fix that…

* llm-tool-collection.el (grep): Clarify argument documentation for LLMs. Add an
argument to ignore case while searching. Handle error when the LLM gives a
nonexistent path. Rewrite to be asynchronous.
* README.org (grep): Update documentation.
@avishefi
Copy link

Maybe redirect the output to a temporary file so intermediary output is not handled by emacs? Or use async?

@skissue
Copy link
Owner Author

skissue commented Nov 26, 2025

Maybe redirect the output to a temporary file so intermediary output is not handled by emacs?

Hmm, that might help. Right now, all the output is going to a temporary buffer, which forces the memory allocation to Emacs itself. On Linux, most temporary files are created within volatile memory anyway, but leaving it to the OS might be a better approach. Then again, we will have to put the file into a buffer to read the output anyway. I'm not familiar enough with Emacs internals to know if this approach is better overall.

Though, a search would have to return hundreds of thousands of results to cause memory pressure on modern systems, which I reckon would exceed the context window of any model, so perhaps a hard limit on result size would suffice.

Or use async?

This change already makes the grep tool async. I don't see how that would affect memory, unless you're referring to a different async?

@avishefi
Copy link

Output to a buffer suffers from emacs buffering, I'm not sure why it would be better than redirecting it to a temporary file and then reading it.

I was referring to the emacs async package that creates an emacs subprocess, which is not needed when using a temporary file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants