Conversation

@sunli1223

This is for issue #309.

sunli1223 and others added 2 commits May 25, 2025 22:41
- Introduced a new `readBodyStream` function to read and process HTTP response bodies line by line, enhancing the ability to handle streaming data.
- Refactored `RequestAndParseStream` to utilize `readBodyStream`, improving code clarity and maintainability.
- Added a new test file `http_wrapper_reader_test.go` with comprehensive unit tests for parsing JSON bodies from chunked responses, ensuring robust handling of various scenarios including invalid JSON and large data sets.
}

func readBodyStream(resp io.ReadCloser, callback func(data []byte) error) error {
reader := bufio.NewReaderSize(resp, 4*1024) // init with 4KB buffer
@Yeuoly (Contributor)

Actually, the default maximum buffer size of bufio.Scanner is 64*1024, which is big enough. I'm not sure what your specific scenario is, but 4*1024 should not work.
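For reference, bufio.Scanner's default per-token limit is bufio.MaxScanTokenSize (64*1024 bytes), and it can be raised with Scanner.Buffer. A minimal sketch, where the helper name scanLines and the 1MB cap are illustrative, not taken from the PR:

import (
	"bufio"
	"io"
)

func scanLines(body io.Reader, handle func([]byte) error) error {
	scanner := bufio.NewScanner(body)
	// Start with a 64KB buffer but allow tokens to grow up to 1MB
	// (illustrative value) before Scan fails with bufio.ErrTooLong.
	scanner.Buffer(make([]byte, 64*1024), 1024*1024)
	for scanner.Scan() {
		if err := handle(scanner.Bytes()); err != nil {
			return err
		}
	}
	// A line longer than the max surfaces here as bufio.ErrTooLong.
	return scanner.Err()
}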

@sunli1223 (Author)

@Yeuoly
64k is not enough, as many current LLMs support more than 64k of context. If MCP is used, a line will include the entire prompt, the MCP response, and all the bytes of the JSON structure, making it easy to exceed this limit.

reader := bufio.NewReaderSize(resp, 4*1024) // init with 4KB buffer

The 4K here is just an initial buffer to improve performance; there is no maximum buffer limit. Therefore, the maximum byte size of a line ultimately depends on the context size supported by the LLM and the size of the response returned by the tool.
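A minimal sketch of a bufio.Reader-based line loop along these lines; the exact PR implementation may differ, but the key point is that ReadBytes grows its buffer as needed, so there is no fixed upper bound on line length:

import (
	"bufio"
	"io"
)

func readBodyStream(resp io.ReadCloser, callback func(data []byte) error) error {
	defer resp.Close()
	reader := bufio.NewReaderSize(resp, 4*1024) // 4KB initial buffer; it grows as needed
	for {
		// ReadBytes keeps reading until it sees '\n', allocating more memory
		// for long lines, unlike bufio.Scanner's fixed maximum token size.
		line, err := reader.ReadBytes('\n')
		if len(line) > 0 {
			if cbErr := callback(line); cbErr != nil {
				return cbErr
			}
		}
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}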

@Yeuoly (Contributor)

In fact, Dify uses stream mode for all LLM calls; the response is split into chunks, and each chunk should not be larger than 64k.

As for the unlimited buffer, I don't think it's a good design; it opens the door to a DoS attack. For example, I could create a fake OpenAI server that returns 1G of data in a single chunk, which would be terrible. The limit could be a configurable environment variable, but it should not be hardcoded. In any case, the buffer should not be too large; if you use Dify in personal scenarios, then yes, just make it configurable.
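A rough sketch of the configurable cap being suggested here; the environment variable name STREAM_MAX_LINE_SIZE and the 5MB default are assumptions for illustration, not an existing Dify setting:

import (
	"os"
	"strconv"
)

// maxLineSize reads a per-line limit from an assumed environment variable
// and falls back to 5MB when it is unset or invalid.
func maxLineSize() int {
	if v := os.Getenv("STREAM_MAX_LINE_SIZE"); v != "" {
		if n, err := strconv.Atoi(v); err == nil && n > 0 {
			return n
		}
	}
	return 5 * 1024 * 1024
}

The reader loop could then reject any line longer than maxLineSize() instead of buffering it indefinitely, which addresses the fake-server DoS concern while still letting large deployments raise the limit.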

@sunli1223 (Author)

@Yeuoly
In my case, the LLM returns chunks like the one below. When tool calling is used, every chunk includes the prompt_messages, and the prompt_messages are very large:

{
    "data": {
        "model": "deepseek-v3-250324",
        "prompt_messages": [
            {
                "role": "system",
                "content": "system promt(Text in Chinese with a length of 9197 characters.)",
                "name": ""
            },
            {
                "role": "user",
                "content": "user  query",
                "name": ""
            },
            {
                "role": "assistant",
                "content": "",
                "name": ""
            },
            {
                "role": "tool",
                "content": "Tool execution result: [{'type': 'text', 'text': 'too response in Chinese with a length of 29008 characters",
                "name": "hotel_info"
            }
        ],
        "system_fingerprint": "",
        "delta": {
            "index": 1,
            "message": {
                "role": "assistant",
                "content": "###",
                "name": "",
                "tool_calls": []
            },
            "usage": null,
            "finish_reason": null
        }
    },
    "error": ""
}

In Go, a Chinese character generally occupies 3 bytes in UTF-8, so although the text doesn't look very long, multiplying its character count by 3 can exceed 64K.
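A quick check of the byte math with Go's unicode/utf8 package; the two-character string is just an illustration:

package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	s := "你好" // two Chinese characters
	fmt.Println(len(s))                    // 6: UTF-8 bytes, 3 per character
	fmt.Println(utf8.RuneCountInString(s)) // 2: characters (runes)
}

So the 29008-character tool response above alone is roughly 29008 * 3 ≈ 87,000 bytes of UTF-8, already past the 64*1024 = 65,536-byte Scanner default before the system prompt and the JSON framing are counted.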

@Nov1c444 (Contributor)

This is mainly because the Inner model API call passes along all the historical prompt_messages, which causes the response to be excessively long. Under normal circumstances, the default size limit should be sufficient, so this PR should resolve the issue.
langgenius/dify#20391

@sunli1223 (Author)

This is mainly because the Inner model API call passes along all the historical prompt_messages, which causes the response to be excessively long. Under normal circumstances, the default size limit should be sufficient, so this PR should resolve the issue. langgenius/dify#20391

Thank you, I will wait for the next version and test it.

@tushverma

Hey @Nov1c444 - I tried the change from langgenius/dify#20391 in my system, but was still getting the same issue. In my case I am pulling data from my database, which has long strings in one of its columns; those strings are more than 70k characters long.
I added @sunli1223's change and it worked as expected.
