Replace Fatal calls with retries when zcashd becomes unavailable #550

Open
zmanian wants to merge 1 commit into zcash:master from zmanian:fix-blockingestor-retry

zmanian commented on Feb 11, 2026

Summary

  • BlockIngestor crashed with Log.Fatal on any getbestblockhash RPC failure, meaning a temporary zcashd restart or network glitch would kill lightwalletd immediately
  • Now retries indefinitely with exponential backoff (starting at 8s, capped at 2min), logging the first failure and logging again on recovery
  • Converts three Log.Fatal calls in getBlockFromRPC to returned errors (already retried by BlockIngestor's existing retry logic)
  • Converts Cache.Add failure from Log.Fatal to a logged warning with retry

As discussed in #416, the consistent behavior should be for lightwalletd to retry RPCs indefinitely rather than crashing, since zcashd can restart for many reasons and some monitoring process should handle zcashd availability independently.

Closes #416

Test plan

  • go build ./... passes
  • go vet ./common/... passes
  • go test ./common/... passes (all 11 tests including TestBlockIngestor)
  • Stop zcashd while lightwalletd is running; verify lightwalletd logs a warning and keeps retrying
  • Restart zcashd; verify lightwalletd logs recovery and resumes syncing

Generated with Claude Code

BlockIngestor crashed with Log.Fatal on any getbestblockhash RPC
failure, meaning a temporary zcashd restart or network glitch would
kill lightwalletd. Now retries with exponential backoff (8s, 16s, 32s,
64s, capped at 2 minutes), logging the first failure and re-logging
every 5 minutes during sustained outages to avoid filling the
filesystem. Logs a recovery message when zcashd comes back.

Also converts three Log.Fatal calls in getBlockFromRPC to returned
errors (already retried by BlockIngestor), converts the Cache.Add
Fatal to a retry, and changes the getblock failure log level from
Info to Warn.

Fixes zcash#416

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
zmanian force-pushed the fix-blockingestor-retry branch from 087d30e to 7f81ece on February 11, 2026 23:24
zmanian changed the title from "Replace Fatal calls with retries when zcashd becomes unavailable" to "Replace Fatal calls with retries and implement darkside getaddresstxids" on Feb 11, 2026
zmanian force-pushed the fix-blockingestor-retry branch from 83acb35 to 7f81ece on February 11, 2026 23:51
zmanian changed the title from "Replace Fatal calls with retries and implement darkside getaddresstxids" to "Replace Fatal calls with retries when zcashd becomes unavailable" on Feb 11, 2026
Development

Successfully merging this pull request may close these issues.

"invalid http POST response" and later crash

1 participant