
fix(repl): prevent multiline prediction hangs#202

Merged
kunchenguid merged 2 commits into main from fix/prediction-hang
Apr 1, 2026

Conversation

@kunchenguid (Owner)

Summary

This change fixes a REPL prediction hang caused by entering multiline mode on incomplete input. Previously, that transition cancelled the visible prediction by calling OnInputChanged(""), which also started a hidden, debounced prediction for the empty input. If that prediction path stalled, it could keep the REPL tied up and interfere with foreground command execution, because the interpreter's execution context was shared globally.

Risk Assessment: 🟡 Medium — tests are strong and the fix is reasonably bounded, but it changes core cancellation/timeout behavior and includes a real timeout-precedence flaw in prediction requests.

Architecture

```mermaid
flowchart TD
subgraph repl_flow["REPL Prediction Flow"]
  direction TB
  multiline_input["Multiline input transition (updated)"]
  prediction_state["PredictionState.Cancel() (updated)"]
  event_provider["EventPredictionProvider (updated)"]
  predict_middleware["repl.predict middleware / prediction agent (unchanged)"]

  multiline_input -->|"uses cancel instead of empty prediction request"| prediction_state
  prediction_state -->|"stops in-flight prediction work"| event_provider
  event_provider -->|"emits repl.predict"| predict_middleware
end

subgraph runtime_flow["Cancellation And Model Runtime"]
  direction TB
  command_exec["REPL command execution (updated)"]
  goroutine_contexts["Interpreter goroutine context store (added)"]
  lite_model["Default lite model config (updated)"]
  openai_provider["OpenAI provider timeout handling (updated)"]

  command_exec -->|"sets foreground command context"| goroutine_contexts
  lite_model -->|"supplies optional timeout"| openai_provider
end

event_provider -->|"sets prediction context with 30s fallback"| goroutine_contexts
predict_middleware -->|"uses gsh.models.lite for LLM predictions"| lite_model
predict_middleware -->|"invokes model requests"| openai_provider
goroutine_contexts -->|"keeps prediction and command cancellation isolated"| openai_provider
```

Key changes made

  • Adds PredictionState.Cancel() and uses it for multiline input, so newline transitions invalidate in-flight predictions without launching a replacement empty prediction.
  • Moves interpreter execution context storage to a per-goroutine map and switches callers to ClearContext(), so prediction work and foreground command execution no longer overwrite each other's cancellation state.
  • Adds timeout protection around prediction and model requests: prediction events get a fallback deadline, OpenAI-compatible models now support a timeout field, and the default gsh.models.lite Ollama model is configured with a timeout of 15000.
  • Updates docs/tutorial/03-command-prediction.md so the command prediction tutorial documents the new timeout guidance.

How was this tested

  • go test ./internal/script/interpreter ./internal/repl/input ./internal/repl/predict
  • go test ./internal/repl/...
  • go test ./cmd/gsh/...
  • go test -run 'TestInterpreter_Context_|TestOpenAIProviderChatCompletion_UsesModelTimeout|TestMultiLineInputDoesNotStartHiddenPrediction' ./internal/script/interpreter ./internal/repl/input
  • golangci-lint fmt
  • golangci-lint run --fix
  • golangci-lint run

Critique Comments

  • 🔴 internal/repl/predict/event_provider.go:73 - This outer WithTimeout(..., 30s) caps every repl.predict run, so a user-configured model timeout above 30s can never take effect. Slow local models will still be cancelled early; derive the limit from model config or make this cap configurable.

@kunchenguid kunchenguid merged commit 7e54d84 into main Apr 1, 2026
1 check passed
@kunchenguid kunchenguid deleted the fix/prediction-hang branch April 1, 2026 18:17
