Skip to content

fix(#799): batch contact IDs in move-contacts to avoid URI length errors#802

Open
binokaryg wants to merge 4 commits intomainfrom
799-move-uri-too-large
Open

fix(#799): batch contact IDs in move-contacts to avoid URI length errors#802
binokaryg wants to merge 4 commits intomainfrom
799-move-uri-too-large

Conversation

@binokaryg
Copy link
Copy Markdown
Member

@binokaryg binokaryg commented Apr 2, 2026

Description

Split IDs into batches of 100 per request

#799

Code review items

  • Readable: Concise, well named, follows the style guide, documented if necessary.
  • Documented: Configuration and user documentation on cht-docs
  • Tested: Unit and/or integration tests where appropriate
  • Backwards compatible: Works with existing data and configuration. Any breaking changes documented in the release notes.

AI disclosure

This PR was developed with assistance from Claude Code. AI was used for:

  • Investigating the root cause of the 414/400/431 errors
  • Calculating the URL length threshold (~112 contact IDs for an 8KB limit)
  • Implementing the batching fix in fetchReportsByCreator

License

The software is provided under AGPL-3.0. Contributions to this project are accepted under the same license.

@binokaryg binokaryg marked this pull request as ready for review April 3, 2026 05:41
@binokaryg binokaryg linked an issue Apr 3, 2026 that may be closed by this pull request
@binokaryg binokaryg requested a review from sugat009 April 3, 2026 05:50
Copy link
Copy Markdown
Member

@sugat009 sugat009 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ID batching approach is sound for avoiding the URI length issue, but there are a few concerns with the current implementation.

return res.rows.map(row => row.doc);
};

const fetchReportsBatch = async (db, idsBatch, useNouveau, skip) => {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug): The skip parameter from the outer pagination loop in updateReports (index.js:78-92) is passed identically to every sub-batch here. This means on iteration 2 of the outer loop (skip=10000), each of the sub-batch queries independently skips 10,000 rows from its own small result set, likely returning 0 results. The outer loop then terminates early, silently missing reports.

The inner ID-batching loop and the outer report-pagination loop serve different purposes. The inner loop should always fetch all reports for its ID batch. The outer skip should not leak into it.

Also, allResults accumulates across all sub-batches, so the total can exceed BATCH_SIZE, which breaks the outer loop's termination condition (reportDocsBatch.length >= BATCH_SIZE).


const fetchReportsBatch = async (db, idsBatch, useNouveau, skip) => {
if (useNouveau) {
return api().getReportsByCreatedByIds(idsBatch, BATCH_SIZE, skip);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (improvement): The Nouveau endpoint uses POST with the query in the request body (see cht-core/shared-libs/cht-datasource/src/local/libs/nouveau.ts), so there is no URI length constraint on this path. The URI-too-large issue from #799 only affects the CouchDB view path (CHT < 5.0.0).

Nouveau also uses bookmark-based pagination rather than skip, so the batching of contact IDs is unnecessary here. Consider only applying the ID batching to the CouchDB view branch.

const createdByKeys = createdByIds.map(id => [`contact:${id}`]);
return await getFromDbView(db, 'medic-client/reports_by_freetext', createdByKeys, skip);
return allResults;
};
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question (consistency): fetchReportsBySubject (just below) passes all IDs directly to getFromDbView without batching. If the contact has many subject IDs, this has the same URI length vulnerability that this PR is fixing. Should this be batched as well for consistency?

…e POST request and update request body structure
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Getting "414 Request-URI Too Large" when moving contacts on CHT 5.x

2 participants