nickfujita: Fix: Message buffering, connection handling, and concurrent connection support by tdrz · Pull Request #873 · electric-sql/pglite

tdrz · 2026-01-12T08:41:00Z

See #775

github-actions · 2026-01-12T08:47:40Z

github-actions · 2026-01-12T10:01:21Z

Demos: https://github.com/electric-sql/pglite/actions/runs/20914676541/artifacts/5095946838

github-actions · 2026-01-12T10:01:32Z

🚀 Deployed on https://697d09c3e82b834a966e558b--pglite.netlify.app

samwillis · 2026-01-13T09:51:44Z

I've not had a chance yet to look in detail (I'm tied up this morning in a mob session) - but I did ask GPT5.2 for a review of the branch. It looks like it's spotted a few things.

PR 873 Review — `@electric-sql/pglite-socket`

Summary

This PR targets three production issues:

- Large queries causing crashes: PostgreSQL wire-protocol messages split across TCP packets were not being reassembled.
- ECONNRESET crashes: abrupt disconnects could trigger unhandled rejections and crash the server process.
- Single connection limitation: connection-level locking prevented concurrent clients from connecting.

The proposed architecture (per-connection handlers + shared query-level serialization) is the right direction for PGlite’s single-threaded execution model.

What looks good

- Query-level serialization: a shared queue that serializes access to `db.execProtocolRaw(...)` aligns with PGlite’s constraints while allowing multiple sockets to stay connected.
- Connection lifecycle separation: isolating per-socket state (buffers, idle timeout, cleanup) into a handler class is a clean boundary.
- Operational intent is sound: treating ECONNRESET as normal behavior for pooled clients is correct.
- Config additions: `maxConnections` and `idleTimeout` are reasonable operational controls.

Key risks / correctness concerns to double-check

1) TCP fragmentation / buffering must be concurrency-safe

Because Node can emit multiple 'data' events rapidly, ensure the handler’s buffering/drain logic cannot run concurrently in a way that races on shared state (e.g. messageBuffer). A race here can reintroduce message boundary corruption for large queries.

Recommendation: ensure buffer append and buffer-drain are serialized per connection (one drain loop at a time).
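A minimal sketch of what "one drain loop at a time" could look like. `SerializedBuffer`, `frameLength`, and `processMessage` are hypothetical names for illustration and are not taken from the PR. The append is synchronous, and a `draining` flag ensures only one loop consumes the buffer per connection.

```typescript
// Hypothetical per-connection buffer; names are illustrative, not from the PR.
type ProcessMessage = (message: Uint8Array) => Promise<void>

class SerializedBuffer {
  private pending: Uint8Array = new Uint8Array(0)
  private draining = false

  constructor(
    // Returns the byte length of the first complete message, or 0 if unknown yet.
    private readonly frameLength: (buf: Uint8Array) => number,
    private readonly processMessage: ProcessMessage,
  ) {}

  // Intended to be called from the socket 'data' handler.
  async push(chunk: Uint8Array): Promise<void> {
    // Synchronous append: chunks are stored strictly in arrival order.
    const merged = new Uint8Array(this.pending.length + chunk.length)
    merged.set(this.pending, 0)
    merged.set(chunk, this.pending.length)
    this.pending = merged

    // If a drain loop is already running, it will see the new bytes.
    if (this.draining) return
    this.draining = true
    try {
      for (;;) {
        const len = this.frameLength(this.pending)
        if (len === 0 || len > this.pending.length) break
        const message = this.pending.slice(0, len)
        this.pending = this.pending.slice(len)
        await this.processMessage(message)
      }
    } finally {
      this.draining = false
    }
  }
}
```

Because the flag is only cleared in code that runs synchronously after the last length check, a chunk arriving during `processMessage` is appended and then consumed by the loop that is already active.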

2) Queue draining edge cases

The queue manager should guarantee forward progress even if enqueues arrive around the time the processor finishes draining.

Recommendation: verify there is no timing window where processing flips false while there are still items queued (i.e. no “stuck until next enqueue” behavior).
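One way to rule out that timing window, sketched under the assumption of a simple FIFO queue. `SerialQueue` is a hypothetical stand-in and not the PR's queue manager. The key property is that there is no `await` between the final emptiness check and resetting the flag.

```typescript
// Hypothetical queue illustrating the "no stuck items" property.
type Task<T> = () => Promise<T>

class SerialQueue {
  private items: Array<() => Promise<void>> = []
  private processing = false

  enqueue<T>(task: Task<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      // Wrap the task so a failure settles the caller's promise
      // without breaking the drain loop.
      this.items.push(async () => {
        try {
          resolve(await task())
        } catch (err) {
          reject(err)
        }
      })
      void this.drain()
    })
  }

  private async drain(): Promise<void> {
    if (this.processing) return
    this.processing = true
    try {
      // The length is re-checked after every await, so items enqueued
      // while a task was running are handled by this same loop.
      while (this.items.length > 0) {
        const next = this.items.shift()!
        await next()
      }
    } finally {
      // Runs synchronously after the last length check: an enqueue
      // cannot land between "queue is empty" and "processing = false".
      this.processing = false
    }
  }
}
```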

3) Protocol parsing coverage

The parsing logic should handle:

- StartupMessage (no type byte; `[len:int32][protocol:int32][params...]`)
- SSLRequest / CancelRequest / GSSENCRequest variants (also no type byte, but different request codes)
- Regular frontend messages (`[type:byte][len:int32][payload...]`)

Recommendations:

- Treat lengths as unsigned and validate bounds (e.g. reject absurd lengths to avoid memory blow-ups).
- Confirm that SSLRequest/CancelRequest packets are not misclassified as regular messages.
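The framing rules above can be sketched as a small classifier. This is an illustration only: `nextFrame`, the `Frame` type, and the 64 MiB cap are assumptions, not code from the PR. The request codes are the ones defined by the PostgreSQL wire protocol.

```typescript
// Request codes defined by the PostgreSQL frontend/backend protocol.
const SSL_REQUEST = 80877103
const CANCEL_REQUEST = 80877102
const GSSENC_REQUEST = 80877104
// Assumed upper bound; pick a limit that suits the deployment.
const MAX_MESSAGE_BYTES = 64 * 1024 * 1024

type Frame =
  | { kind: 'incomplete' }
  | { kind: 'invalid'; reason: string }
  | { kind: 'startup' | 'ssl' | 'cancel' | 'gssenc' | 'regular'; length: number }

// Classifies the first message in `buf`. `startupDone` is false until the
// connection has completed its StartupMessage exchange.
function nextFrame(buf: Uint8Array, startupDone: boolean): Frame {
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength)

  if (!startupDone) {
    // Untyped messages: [len:int32][code:int32][...], len includes itself.
    if (buf.length < 8) return { kind: 'incomplete' }
    const len = view.getUint32(0) // unsigned, big-endian
    if (len < 8 || len > MAX_MESSAGE_BYTES) {
      return { kind: 'invalid', reason: `bad startup length ${len}` }
    }
    if (buf.length < len) return { kind: 'incomplete' }
    const code = view.getUint32(4)
    if (code === SSL_REQUEST) return { kind: 'ssl', length: len }
    if (code === CANCEL_REQUEST) return { kind: 'cancel', length: len }
    if (code === GSSENC_REQUEST) return { kind: 'gssenc', length: len }
    return { kind: 'startup', length: len }
  }

  // Typed messages: [type:byte][len:int32][payload], len excludes the type byte.
  if (buf.length < 5) return { kind: 'incomplete' }
  const len = view.getUint32(1)
  if (len < 4 || len > MAX_MESSAGE_BYTES) {
    return { kind: 'invalid', reason: `bad message length ${len}` }
  }
  const total = 1 + len
  if (buf.length < total) return { kind: 'incomplete' }
  return { kind: 'regular', length: total }
}
```

Reading the length with `getUint32` keeps it unsigned, so a corrupted length with the high bit set is rejected by the bounds check rather than wrapping negative.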

4) `maxConnections` rejection behavior

If rejecting connections when maxConnections is reached, ensure the server doesn’t write arbitrary plaintext to a socket that expects Postgres protocol frames.

Recommendation: either close the socket cleanly, or send a proper Postgres ErrorResponse (more work, better UX).
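For the second option, a minimal sketch of building an ErrorResponse frame by hand. `buildErrorResponse` is a hypothetical helper, not part of the PR or of PGlite. The field codes (`S`, `V`, `C`, `M`) and SQLSTATE `53300` (`too_many_connections`) come from the PostgreSQL protocol documentation. When to write the bytes relative to the client's StartupMessage is left to the caller.

```typescript
// Hypothetical helper: encodes a PostgreSQL ErrorResponse ('E') message.
// Layout: 'E' | int32 length | (byte fieldCode, C-string value)* | 0
function buildErrorResponse(fields: Record<string, string>): Uint8Array {
  const encoder = new TextEncoder()
  const parts: Uint8Array[] = []
  for (const [code, value] of Object.entries(fields)) {
    const text = encoder.encode(value)
    const part = new Uint8Array(1 + text.length + 1)
    part[0] = code.charCodeAt(0)
    part.set(text, 1) // final byte stays 0 and acts as the C-string terminator
    parts.push(part)
  }
  // Body is all fields plus one terminating zero byte.
  const bodyLength = parts.reduce((n, p) => n + p.length, 0) + 1
  const out = new Uint8Array(1 + 4 + bodyLength)
  out[0] = 0x45 // 'E'
  // The length field counts itself and the body, but not the type byte.
  new DataView(out.buffer).setUint32(1, 4 + bodyLength)
  let offset = 5
  for (const p of parts) {
    out.set(p, offset)
    offset += p.length
  }
  return out
}

const tooManyConnections = buildErrorResponse({
  S: 'FATAL',
  V: 'FATAL',
  C: '53300', // too_many_connections
  M: 'sorry, too many clients already',
})
```

A server could write `tooManyConnections` to the socket and then end it, so that Postgres clients surface a readable error instead of a protocol violation.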

5) Public API / types

If the handler/server options types changed, ensure:

- Existing construction patterns remain valid (backwards compatibility expectations).
- Publicly exported types don’t reference internal/private-only classes (to avoid `.d.ts` issues).

Testing suggestions (beyond current unit coverage)

- Fragmentation regression test: send a single large query (>64KB) that is forced to split across multiple TCP packets and confirm the server processes it correctly.
- Abrupt disconnect test: disconnect mid-query and confirm no unhandled rejections; server continues serving new connections.
- Concurrency test: open N connections (e.g. 20–50), run interleaved queries, confirm serialization and no deadlocks/starvation.
- Idle timeout test: with `idleTimeout` set, verify only idle sockets are closed; active sockets are unaffected.
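For the fragmentation test, a sketch of helpers that build an oversized simple Query message and cut it into packet-sized pieces. `buildQueryMessage` and `splitIntoChunks` are hypothetical test utilities, and the 1460-byte chunk size is an assumption (a common TCP payload size). Writing chunks separately does not guarantee they arrive as separate packets, so a real test may also need small delays between writes.

```typescript
// Hypothetical test helper: encodes a simple Query ('Q') message.
function buildQueryMessage(sql: string): Uint8Array {
  const text = new TextEncoder().encode(sql)
  const out = new Uint8Array(1 + 4 + text.length + 1)
  out[0] = 0x51 // 'Q'
  // Length covers itself, the SQL text, and the trailing zero byte.
  new DataView(out.buffer).setUint32(1, 4 + text.length + 1)
  out.set(text, 5) // last byte stays 0 (C-string terminator)
  return out
}

// Hypothetical test helper: cuts a message into fixed-size pieces.
function splitIntoChunks(data: Uint8Array, chunkSize: number): Uint8Array[] {
  if (chunkSize <= 0) throw new RangeError('chunkSize must be positive')
  const chunks: Uint8Array[] = []
  for (let i = 0; i < data.length; i += chunkSize) {
    chunks.push(data.subarray(i, i + chunkSize))
  }
  return chunks
}

// A query well above 64KB, split into assumed packet-sized chunks.
const largeQuery = buildQueryMessage(`SELECT '${'x'.repeat(70_000)}'`)
const packets = splitIntoChunks(largeQuery, 1460)
```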

Files of interest

- `packages/pglite-socket/src/index.ts` (handler, queue, server implementation)
- `packages/pglite-socket/src/scripts/server.ts` (CLI runner / operational entrypoint)
- `packages/pglite-socket/tests/*.test.ts` (coverage for multiplexing/disconnect/fragmentation)

github-actions · 2026-01-14T11:18:19Z

Demos: https://github.com/electric-sql/pglite/actions/runs/20991908022/artifacts/5125295546

github-actions · 2026-01-14T11:38:43Z

Demos: https://github.com/electric-sql/pglite/actions/runs/20992492046/artifacts/5125499547

github-actions · 2026-01-17T18:58:52Z

Demos: https://github.com/electric-sql/pglite/actions/runs/21099168091/artifacts/5165141122

github-actions · 2026-01-18T10:12:18Z

Demos: https://github.com/electric-sql/pglite/actions/runs/21109890721/artifacts/5168123727

github-actions · 2026-01-19T09:16:33Z

Demos: https://github.com/electric-sql/pglite/actions/runs/21131324828/artifacts/5174607318

github-actions · 2026-01-21T14:36:20Z

Demos: https://github.com/electric-sql/pglite/actions/runs/21213205177/artifacts/5205503391

samwillis

@tdrz sorry for the delay! Looking really good. I was all set to approve, but Opus found a missing await; that's the main thing. There are a few other things it found too:

PR Review: Connection Multiplexer for PGlite Socket Server

Summary

This PR implements a connection multiplexer allowing multiple clients to share a single PGlite instance via a QueryQueueManager that serializes query execution while maintaining transaction isolation per connection. The approach is sound and the implementation handles the core use cases well.

What's Working Well ✅

- Transaction tracking is correctly integrated - The socket server uses `execProtocolRaw`, and I verified that `isInTransaction()` works correctly because transaction state is tracked at the WASM write callback level (`#pglite_write`), which parses `CommandCompleteMessage` for all protocol execution paths.
- Handler ID tracking - Each connection gets a unique ID for transaction attribution, enabling correct isolation.
- Transaction-aware queue processing - When a transaction is active, only queries from the transaction owner are processed, preventing interleaving.
- Cleanup on disconnect - `clearQueueForHandler` rejects pending queries and `clearTransactionIfNeeded` rolls back orphaned transactions.
- Good integration test coverage - Tests cover:
  - Interleaved transaction and query from different clients
  - Transaction owner disconnect/crash scenarios
  - Two independent interleaved transactions

Issues Found

1. 🔴 Critical: Missing `await` on ROLLBACK

async clearTransactionIfNeeded(handlerId: number): Promise<void> {
  if (this.db.isInTransaction() && this.lastHandlerId === handlerId) {
    this.db.exec('ROLLBACK')  // ← Missing await!
    this.lastHandlerId = null
    await this.processQueue()
  }
}

The rollback may not complete before processQueue() is called, potentially causing the next query to execute while the rollback is still in progress. This could lead to undefined behavior or transaction state corruption.

Fix:

await this.db.exec('ROLLBACK')
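A small, self-contained illustration of why the `await` matters. `FakeDb` is a stand-in with no internal query queue and is not PGlite, so this only demonstrates the general ordering hazard of a floating promise, not PGlite's actual behavior.

```typescript
// Stand-in database: NOT PGlite. exec() resolves asynchronously and
// only then updates the transaction flag.
class FakeDb {
  inTransaction = true
  async exec(sql: string): Promise<void> {
    await Promise.resolve() // simulate asynchronous work
    if (sql === 'ROLLBACK') this.inTransaction = false
  }
}

type ProcessQueue = () => Promise<void>

// Mirrors the buggy shape: the rollback promise is left floating.
async function clearWithoutAwait(db: FakeDb, processQueue: ProcessQueue): Promise<void> {
  void db.exec('ROLLBACK')
  await processQueue()
}

// Mirrors the fixed shape: the queue resumes only after the rollback.
async function clearWithAwait(db: FakeDb, processQueue: ProcessQueue): Promise<void> {
  await db.exec('ROLLBACK')
  await processQueue()
}

// Reports whether the queue processor ran while the transaction was still open.
async function observe(clear: typeof clearWithAwait): Promise<boolean> {
  const db = new FakeDb()
  let sawOpenTransaction = false
  await clear(db, async () => {
    sawOpenTransaction = db.inTransaction
  })
  return sawOpenTransaction
}
```

With this stand-in, `observe(clearWithoutAwait)` resolves to `true` (the queue ran inside the stale transaction) and `observe(clearWithAwait)` resolves to `false`.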

2. 🟡 Medium: Potential Indefinite Blocking

When a transaction is active but the transaction owner has no queries in the queue, processQueue() breaks out of the loop:

if (i === -1) {
  query = null
}
if (!query) break

If the transaction owner is slow (e.g., user thinking, network delay), other clients' queries will sit in the queue indefinitely until the owner sends another query or disconnects.

Scenario:

1. Client A sends BEGIN → transaction starts
2. Client B sends SELECT 1 → queued, waiting
3. Client A is idle for 30 seconds...
4. Client B's query is blocked the entire time

Consider adding a warning log when queries are blocked, or a configurable timeout for blocked queries.
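A sketch of how such a check could sit next to the owner-only selection. `QueuedQuery`, `pickNext`, and `maxWaitMs` are hypothetical names and do not reflect the PR's implementation. The function is pure, so the caller decides whether overdue entries are logged or rejected.

```typescript
// Hypothetical queue entry; the real queue items may carry more state.
interface QueuedQuery {
  handlerId: number
  enqueuedAt: number // milliseconds, e.g. from Date.now()
}

interface PickResult {
  index: number // entry to run next, or -1 if nothing is runnable
  overdue: number[] // indices of blocked entries that waited too long
}

// While a transaction is open (ownerId !== null), only the owner's queries
// are runnable. Other entries older than maxWaitMs are reported as overdue.
function pickNext(
  queue: QueuedQuery[],
  ownerId: number | null,
  now: number,
  maxWaitMs: number,
): PickResult {
  const index =
    ownerId === null
      ? queue.length > 0
        ? 0
        : -1
      : queue.findIndex((q) => q.handlerId === ownerId)

  const overdue: number[] = []
  queue.forEach((q, i) => {
    if (i !== index && now - q.enqueuedAt > maxWaitMs) overdue.push(i)
  })
  return { index, overdue }
}
```

In the scenario above, Client B's `SELECT 1` would show up in `overdue` once it has waited longer than `maxWaitMs`, giving the server a hook to warn or fail fast instead of waiting silently.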

Test Coverage Gaps

The integration tests are good, but there are some gaps:

- No unit tests for `QueryQueueManager` - The handler tests mock it, so the actual transaction queue logic isn't unit tested in isolation.
- No test for slow/idle transaction owner - Would be valuable to verify behavior when the transaction owner doesn't send queries for an extended period.
- No explicit queue ordering verification - No test explicitly verifies that transaction owner's queries are prioritized correctly when multiple handlers have queued queries.

Minor Observations

- The `CONNECTION_QUEUE_TIMEOUT` constant (line 5) appears to be exported but unused after the refactor to the new multiplexing approach.
- Consider adding JSDoc documentation to `QueryQueueManager` explaining the transaction isolation strategy.

Verdict

The architecture is solid and the transaction isolation approach is correct. Please fix the missing await on the ROLLBACK before merging. The blocking concern is worth noting but acceptable for an initial implementation - it could be addressed in a follow-up if it becomes an issue in practice.

tdrz · 2026-01-30T19:40:53Z

@samwillis Thanks, good points! Fixed the await.

The rest are all valid points, will address them in the future.

github-actions · 2026-01-30T19:42:57Z

Demos: https://github.com/electric-sql/pglite/actions/runs/21528108005/artifacts/5323374216

nickfujita and others added 4 commits August 28, 2025 23:57

'add bugger to incoming queries to wait for full message across multiple tcp packages'

22db0f5

'Add connection multiplexing and query level queue'

2403cc0

Merge remote-tracking branch 'nickfujita/pglite-socket-fix-large-queries' into tdrz/nickfujita-improvements

285eaaf

changeset

5fe3b08

tdrz mentioned this pull request Jan 12, 2026

[pglite-socket] Fix: Message buffering, connection handling, and concurrent connection support #775

Closed

fixes

981bbc3

tdrz requested a review from samwillis January 12, 2026 10:02

added tests over 64kb

f4b7653

style

d1b8e10

Merge branch 'main' into tdrz/nickfujita-improvements

466e665

tdrz added 5 commits January 18, 2026 10:11

handle transactions in multi-connections

137b9ff

Add test for clearTransactionIfNeeded when client disconnects mid-transaction

98ea096

remove debug

a17a125

fix tests

e8fa6a0

style

644226b

more tests for pglite-socket

6634736

default: do not allow multiple connections

c6e0988

samwillis requested changes Jan 30, 2026

View reviewed changes

await ROLLBACK on clearing transaction

d38c95c

tdrz requested a review from samwillis January 30, 2026 19:41
