Conversation

@sunli1223

This is for issue #309.

sunli1223 and others added 2 commits May 25, 2025 22:41
- Introduced a new `readBodyStream` function to read and process HTTP response bodies line by line, enhancing the ability to handle streaming data.
- Refactored `RequestAndParseStream` to utilize `readBodyStream`, improving code clarity and maintainability.
- Added a new test file `http_wrapper_reader_test.go` with comprehensive unit tests for parsing JSON bodies from chunked responses, ensuring robust handling of various scenarios including invalid JSON and large data sets.
}

func readBodyStream(resp io.ReadCloser, callback func(data []byte) error) error {
reader := bufio.NewReaderSize(resp, 4*1024) // init with 4KB buffer
@Yeuoly (Contributor)

Actually, the default maximum buffer size of bufio.Scanner is 64*1024, which is big enough. I'm not sure what your specific scenario is, but 4*1024 should not work.
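For reference, bufio.Scanner's default per-token limit is bufio.MaxScanTokenSize (64*1024 bytes), and it can be raised with Scanner.Buffer. A minimal sketch, where the helper name scanLines and the 1MB cap are illustrative, not taken from the PR:

import (
	"bufio"
	"io"
)

func scanLines(body io.Reader, handle func([]byte) error) error {
	scanner := bufio.NewScanner(body)
	// Start with a 64KB buffer but allow tokens to grow up to 1MB
	// (illustrative value) before Scan fails with bufio.ErrTooLong.
	scanner.Buffer(make([]byte, 64*1024), 1024*1024)
	for scanner.Scan() {
		if err := handle(scanner.Bytes()); err != nil {
			return err
		}
	}
	// A line longer than the max surfaces here as bufio.ErrTooLong.
	return scanner.Err()
}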

@sunli1223 (Author)

@Yeuoly
64k is not enough, as many current LLMs support more than 64k of context. If MCP is used, a line will include the entire prompt, the MCP response, and all the bytes of the JSON structure, making it easy to exceed this limit.

reader := bufio.NewReaderSize(resp, 4*1024) // init with 4KB buffer

The 4K here is just an initial buffer to improve performance; there is no maximum buffer limit. Therefore, the maximum byte size of a line ultimately depends on the context size supported by the LLM and the size of the response returned by the tool.
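A minimal sketch of a bufio.Reader-based line loop along these lines; the exact PR implementation may differ, but the key point is that ReadBytes grows its buffer as needed, so there is no fixed upper bound on line length:

import (
	"bufio"
	"io"
)

func readBodyStream(resp io.ReadCloser, callback func(data []byte) error) error {
	defer resp.Close()
	reader := bufio.NewReaderSize(resp, 4*1024) // 4KB initial buffer; it grows as needed
	for {
		// ReadBytes keeps reading until it sees '\n', allocating more memory
		// for long lines, unlike bufio.Scanner's fixed maximum token size.
		line, err := reader.ReadBytes('\n')
		if len(line) > 0 {
			if cbErr := callback(line); cbErr != nil {
				return cbErr
			}
		}
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}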

@Yeuoly (Contributor)

In fact, Dify uses stream mode for all LLM calls; the response is split into chunks, and each chunk should not be larger than 64k.

As for the unlimited buffer, I don't think it's a good design; it opens the door to a DoS attack. For example, I could create a fake OpenAI server that returns 1G of data in a single chunk, which would be terrible. The limit could be a configurable environment variable, but it should not be hardcoded. In any case, the buffer should not be too large; if you use Dify in personal scenarios, then yes, just make it configurable.
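A rough sketch of the configurable cap being suggested here; the environment variable name STREAM_MAX_LINE_SIZE and the 5MB default are assumptions for illustration, not an existing Dify setting:

import (
	"os"
	"strconv"
)

// maxLineSize reads a per-line limit from an assumed environment variable
// and falls back to 5MB when it is unset or invalid.
func maxLineSize() int {
	if v := os.Getenv("STREAM_MAX_LINE_SIZE"); v != "" {
		if n, err := strconv.Atoi(v); err == nil && n > 0 {
			return n
		}
	}
	return 5 * 1024 * 1024
}

The reader loop could then reject any line longer than maxLineSize() instead of buffering it indefinitely, which addresses the fake-server DoS concern while still letting large deployments raise the limit.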

@sunli1223 (Author)

@Yeuoly
In my case, the LLM returns chunks like the one below. When tool calling is used, every chunk includes the prompt_messages, and the prompt_messages are very large:

{
    "data": {
        "model": "deepseek-v3-250324",
        "prompt_messages": [
            {
                "role": "system",
                "content": "system promt(Text in Chinese with a length of 9197 characters.)",
                "name": ""
            },
            {
                "role": "user",
                "content": "user  query",
                "name": ""
            },
            {
                "role": "assistant",
                "content": "",
                "name": ""
            },
            {
                "role": "tool",
                "content": "Tool execution result: [{'type': 'text', 'text': 'too response in Chinese with a length of 29008 characters",
                "name": "hotel_info"
            }
        ],
        "system_fingerprint": "",
        "delta": {
            "index": 1,
            "message": {
                "role": "assistant",
                "content": "###",
                "name": "",
                "tool_calls": []
            },
            "usage": null,
            "finish_reason": null
        }
    },
    "error": ""
}

In Go, a Chinese character generally occupies 3 bytes in UTF-8, so although the text doesn't look very long, multiplying its character count by 3 can exceed 64K.
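A quick check of the byte math with Go's unicode/utf8 package; the two-character string is just an illustration:

package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	s := "你好" // two Chinese characters
	fmt.Println(len(s))                    // 6: UTF-8 bytes, 3 per character
	fmt.Println(utf8.RuneCountInString(s)) // 2: characters (runes)
}

So the 29008-character tool response above alone is roughly 29008 * 3 ≈ 87,000 bytes of UTF-8, already past the 64*1024 = 65,536-byte Scanner default before the system prompt and the JSON framing are counted.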

@Nov1c444 (Contributor)

This is mainly because the Inner model API call passes along all the historical prompt_messages, which causes the response to be excessively long. Under normal circumstances, the default size limit should be sufficient, so this PR should resolve the issue.
langgenius/dify#20391

@sunli1223 (Author)

This is mainly because the Inner model API call passes along all the historical prompt_messages, which causes the response to be excessively long. Under normal circumstances, the default size limit should be sufficient, so this PR should resolve the issue. langgenius/dify#20391

Thank you, I will wait for the next version and test it.

@tushverma

Hey @Nov1c444 - I tried the change from langgenius/dify#20391 in my system, but was still getting the same issue. In my case I am pulling data from my database, which has long strings in one of its columns; those strings are more than 70k characters long.
I added @sunli1223's change and it worked as expected.
