feat: changes to ingestion flow - added docling chunker.#1113
Open
ricofurtado wants to merge 1 commit into main
mpawlow
requested changes
Mar 13, 2026
Collaborator
(2a) [Blocker] flows/ingestion_flow.json is invalid JSON
- Confirmed by Python JSON parser:
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 2925 column 41 (char 255868)
- The outputs array inside the `OpenSearchVectorStoreComponentMultimodalMultiEmbedding` node's template is unclosed.
- Fields from the ingest_data input template were spliced in, corrupting the structure. Langflow will fail to import the flow entirely.
- Root cause?: merge commit 9475c56 introduced a broken merge of the flow JSON
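
A check along these lines could catch the problem before review. It is only a sketch using the standard-library `json` module; the helper name `check_flow_json` is hypothetical and not part of this repository.

```python
import json
from typing import Optional


def check_flow_json(text: str) -> Optional[str]:
    """Return None if text parses as JSON, else a 'line:col: message' string."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        # lineno, colno and msg are attributes of json.JSONDecodeError
        return f"{exc.lineno}:{exc.colno}: {exc.msg}"
    return None


if __name__ == "__main__":
    # Hypothetical miniature of the failure: an array that is never closed
    # before the next object key begins.
    broken = '{"outputs": [{"name": "a"} "ingest_data": {}}'
    print(check_flow_json(broken))
```

Running the same check over flows/ingestion_flow.json in CI (or a pre-commit hook) would flag a broken merge of the flow file before it reaches Langflow.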
(2b) [Major] flows_service.py hardcodes "Split Text" display name
- The PR removes the `SplitText` node from the flow, but flows_service.py still references it by display name "Split Text".
- Calls to `update_ingest_flow_chunk_size()` and `update_ingest_flow_chunk_overlap()` via the Settings API will silently do nothing since the node lookup will return no match.
- Affected lines: 911–916 and 918–927
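
One way to avoid the silent no-op is to make the lookup fail loudly. The sketch below is not the code in flows_service.py; it assumes the exported flow keeps its nodes under `data.nodes` with the display name at `data.node.display_name`, and both `find_node_by_display_name` and `NodeNotFoundError` are hypothetical names.

```python
from typing import Any, Dict


class NodeNotFoundError(LookupError):
    """Raised when a flow has no node with the requested display name."""


def find_node_by_display_name(flow: Dict[str, Any], display_name: str) -> Dict[str, Any]:
    """Return the first node whose display name matches, or raise.

    Assumes the layout flow["data"]["nodes"][i]["data"]["node"]["display_name"].
    """
    for node in flow.get("data", {}).get("nodes", []):
        if node.get("data", {}).get("node", {}).get("display_name") == display_name:
            return node
    # Raising here turns a stale display name into a visible error
    # instead of an update that quietly does nothing.
    raise NodeNotFoundError(f"no node with display name {display_name!r}")
```

With a lookup like this, a Settings API call that targets "Split Text" after the node has been removed would surface as an error rather than appear to succeed.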
(2c) [Major] langflow_file_service.py hardcodes "SplitText-QIKhg"
- The removed node SplitText-QIKhg is still referenced in langflow_file_service.py for per-run tweaks.
- Any per-ingestion chunkSize, chunkOverlap, or separator settings will be sent to a non-existent node and silently dropped by Langflow.
- Affected lines: 292–309
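
A guard before sending the run request could detect tweaks aimed at nodes that no longer exist. This is a sketch under two assumptions: tweaks are a dict keyed by node ID, and the flow lists its nodes under `data.nodes` with an `id` field. The helper name `unknown_tweak_targets` and the chunker ID in the usage example are hypothetical.

```python
from typing import Any, Dict, List


def unknown_tweak_targets(flow: Dict[str, Any], tweaks: Dict[str, Any]) -> List[str]:
    """Return the tweak keys that do not match any node ID in the flow."""
    node_ids = {node.get("id") for node in flow.get("data", {}).get("nodes", [])}
    return sorted(key for key in tweaks if key not in node_ids)


if __name__ == "__main__":
    # Hypothetical flow in which the SplitText node has been replaced.
    flow = {"data": {"nodes": [{"id": "HypotheticalChunker-00000"}]}}
    tweaks = {"SplitText-QIKhg": {"chunk_size": 1000}}
    stale = unknown_tweak_targets(flow, tweaks)
    if stale:
        print(f"tweaks target missing nodes: {stale}")
```

Logging or rejecting the stale keys makes the dropped chunkSize, chunkOverlap, or separator settings visible to the caller.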
(2d) [Major] api/langflow_files.py hardcodes "SplitText-PC36h"
- A second stale `SplitText` node ID (SplitText-PC36h) is referenced in the file upload endpoint for tweaks.
- Same silent-drop behavior as Issue (2c)
- Affected lines: 96–104
(2e) [Major] api/settings.py reads defaults from "SplitText-QIKhg"
- `GET /api/settings` parses the live Langflow flow to populate chunkSize, chunkOverlap, and separator defaults.
- Since `SplitText-QIKhg` is gone, this code will never match and the defaults will always fall back to YAML-config values instead of live flow values.
- Affected lines: 277–290
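
To make the fallback observable, the reader could report where each set of defaults came from. The sketch below is not the code in api/settings.py; it assumes node templates store values as `template[field]["value"]`, that the template fields are named `chunk_size`, `chunk_overlap`, and `separator`, and that `read_chunk_defaults` is a hypothetical helper.

```python
from typing import Any, Dict, Tuple

# Assumed mapping from settings keys to template field names.
FIELD_NAMES = {
    "chunkSize": "chunk_size",
    "chunkOverlap": "chunk_overlap",
    "separator": "separator",
}


def read_chunk_defaults(
    flow: Dict[str, Any], node_id: str, config_defaults: Dict[str, Any]
) -> Tuple[Dict[str, Any], str]:
    """Return (defaults, source), where source is "flow" or "config"."""
    for node in flow.get("data", {}).get("nodes", []):
        if node.get("id") != node_id:
            continue
        template = node.get("data", {}).get("node", {}).get("template", {})
        values = {
            key: template.get(field, {}).get("value", config_defaults[key])
            for key, field in FIELD_NAMES.items()
        }
        return values, "flow"
    # No matching node: every value comes from the YAML-config defaults.
    return dict(config_defaults), "config"
```

Returning the source alongside the values lets the endpoint log, or a test assert, that the live flow was actually consulted.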
Changes in default ingestion flow so it can support docling-based chunking.