VPC mode: stream stays open during post-response memory save, causing long "Thinking" delay in UI #49

@kaleko

Problem

When the AgentCore Runtime is deployed in VPC mode, users experience a delay of roughly 15-20 seconds after the agent finishes generating its response. During that time the UI shows "Thinking..." and keeps the chat input blocked, even though the full response text is already visible on screen.

This also occurs in PUBLIC mode but is much less noticeable due to lower latency on direct internet calls.

Root Cause

The frontend ChatInterface.tsx sets isLoading(false) in a finally block that runs only after await client.invoke(...) fully resolves, i.e., when the HTTP stream closes. The backend keeps the stream open after the last text chunk while it performs post-response work:

  1. Memory save (conversation history persistence)
  2. MCP client teardown (Gateway connection cleanup)

These operations go through VPC endpoints (PrivateLink), which add latency per call. The cumulative effect is a noticeable delay between "response visible" and "stream closed."
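
To make the ordering concrete, here is a minimal sketch that simulates the pattern described above. It is not the project's code: `simulatedAgentStream`, `invokeAndWait`, and the `sleep` delay are hypothetical stand-ins for the backend stream, the `client.invoke(...)` call, and the memory save / MCP teardown.

```typescript
type EventLog = string[];

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical stand-in for the backend: streams text, then does
// post-response work while the stream is still open.
async function* simulatedAgentStream(
  log: EventLog,
  cleanupMs: number,
): AsyncGenerator<string> {
  yield "Hello, ";
  yield "world.";
  log.push("last chunk sent");
  // Stand-in for memory save and MCP client teardown.
  await sleep(cleanupMs);
  log.push("cleanup done, stream closes");
}

// Mirrors the frontend pattern: loading is cleared only in `finally`,
// which runs after the stream has fully closed.
async function invokeAndWait(log: EventLog, cleanupMs: number): Promise<string> {
  let text = "";
  log.push("isLoading = true");
  try {
    for await (const chunk of simulatedAgentStream(log, cleanupMs)) {
      text += chunk;
    }
  } finally {
    log.push("isLoading = false");
  }
  return text;
}
```

In this simulation the log always records "cleanup done, stream closes" before "isLoading = false", so any latency added to the cleanup step is added directly to the time the input stays blocked.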

Expected Behavior

The UI should unblock the chat input as soon as the agent's response text is fully streamed, regardless of backend cleanup work.

Suggested Fix

Either:

  • Backend: Close the HTTP stream immediately after the response is complete, then perform memory save and MCP cleanup asynchronously (fire-and-forget or background task)
  • Frontend: Detect the end of the response content (e.g., a sentinel event or stop_reason) and set isLoading(false) before the stream fully closes
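
The frontend option could look roughly like the sketch below. The event shape is an assumption: the issue does not specify the wire format, so `StreamEvent`, its `done` sentinel, and the `stopReason` field are hypothetical, as is the `consumeStream` helper itself. The idea is to clear the loading flag when the sentinel arrives and keep the `finally` block only as a fallback.

```typescript
// Hypothetical event shape; the real stream format may differ.
type StreamEvent =
  | { type: "text"; text: string }
  | { type: "done"; stopReason: string };

async function consumeStream(
  events: AsyncIterable<StreamEvent>,
  onText: (text: string) => void,
  setIsLoading: (loading: boolean) => void,
): Promise<void> {
  let unblocked = false;
  // Clears the loading flag at most once.
  const unblock = (): void => {
    if (!unblocked) {
      unblocked = true;
      setIsLoading(false);
    }
  };

  setIsLoading(true);
  try {
    for await (const event of events) {
      if (event.type === "text") {
        onText(event.text);
      } else {
        // Sentinel received: the response is complete, so unblock the
        // input now even though the stream may stay open for cleanup.
        unblock();
      }
    }
  } finally {
    // Fallback for streams that end or fail without sending a sentinel.
    unblock();
  }
}
```

With this shape the function still awaits the full stream, so errors raised during backend cleanup remain observable, but the chat input no longer waits for them.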

Reproduction

  1. Deploy FAST with network_mode: VPC in config.yaml
  2. Send any message to the agent
  3. Observe that the response text appears quickly, but "Thinking..." persists for ~15-20 seconds afterward
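
To put a number on step 3 instead of eyeballing it, a small helper like the following could wrap whatever async-iterable stream the client exposes. This is a sketch: `measureCloseGap` and `StreamTiming` are hypothetical names, and it assumes the stream can be consumed with `for await`.

```typescript
interface StreamTiming {
  chunks: number;
  // Milliseconds between the last received chunk and the stream closing.
  lastChunkToCloseMs: number;
}

async function measureCloseGap(
  stream: AsyncIterable<unknown>,
): Promise<StreamTiming> {
  let chunks = 0;
  let lastChunkAt = Date.now();
  for await (const _chunk of stream) {
    chunks += 1;
    lastChunkAt = Date.now();
  }
  // The loop exits only once the stream has closed.
  return { chunks, lastChunkToCloseMs: Date.now() - lastChunkAt };
}
```

Comparing `lastChunkToCloseMs` between a VPC and a PUBLIC deployment would show how much of the delay comes from post-response work rather than from generation.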

Environment

  • Discovered during VPC deployment testing (feat/vpc_deployment branch merged with main)
  • Affects both tool-using and non-tool responses
  • More pronounced in VPC mode due to PrivateLink latency, but the underlying issue exists in PUBLIC mode too
