⚠️ This is v0.0, beware of bugs and breaking changes.
lockd is a single-binary coordination plane that combines exclusive leases,
atomic JSON state, searchable document storage, and an at-least-once
queue (with optional envelope encryption) behind one API. It stays CGO-free,
statically linked, and happy as PID 1 in tiny containers or VMs.
Think of lockd as a coordination kernel for workflows and services. It lets you:
- Protect shared work with leases so only one worker advances a checkpoint at a time.
- Persist JSON documents directly on disks, S3/MinIO, or Azure Blob, then query them with selectors or stream them back as whole documents.
- Move work between stages using the built-in queue (enqueue/dequeue/ack/nack/extend) without deploying a separate message bus.
Because everything rides on the same storage abstraction, you can run lockd as a lightweight state/queue service in your own cluster, push it behind mTLS on the edge, or layer it over cloud object stores to create a simple hosted NoSQL-style document store with queues attached. Typical real-world uses include workflow runners, ETL checkpoints, IoT or fleet coordination, long-running ML tasks, or any system that needs “small database + queue + leasing” without managing three separate products.
Typical flow:
- A worker acquires a lease for a key (e.g. `orders`).
- The worker reads the last committed JSON state / checkpoint.
- Work is performed, the JSON state is updated atomically (CAS).
- The lease is released so the next worker can continue.
If the worker crashes or the connection drops, TTL expiry allows the next worker to resume from the last committed state.
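The same flow expressed with the Go SDK looks roughly like the sketch below. It reuses the `client`/`api` calls from the quickstart later in this README; the import paths are placeholders and error handling is trimmed to the essentials.

```go
package main

import (
	"context"
	"log"

	"example.com/lockd/api"    // placeholder import path
	"example.com/lockd/client" // placeholder import path
)

func main() {
	ctx := context.Background()
	cli, err := client.New("https://lockd.example.com")
	if err != nil {
		log.Fatal(err)
	}

	// 1. Acquire an exclusive lease for the key.
	sess, err := cli.Acquire(ctx, api.AcquireRequest{Key: "orders", Owner: "worker-1", TTLSeconds: 30})
	if err != nil {
		log.Fatal(err)
	}
	// 4. Release on exit so the next worker can continue.
	defer sess.Close()

	// 2. Read the last committed checkpoint.
	var checkpoint struct{ Cursor int }
	if err := sess.Load(ctx, &checkpoint); err != nil {
		log.Fatal(err)
	}

	// ... perform the actual work here ...

	// 3. Persist the new state atomically (CAS handled by the session).
	checkpoint.Cursor++
	if err := sess.Save(ctx, checkpoint); err != nil {
		log.Fatal(err)
	}
}
```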
Note from the author
lockd started as an experiment to see how far you could really go with AI today. I’ve spent over two decades working across IT operations and software development, with a strong focus on distributed systems - enough experience to lead a project like lockd, but not the time, cognitive bandwidth, or budget to build something like this on my own.
What’s been achieved here genuinely amazes me. It shows what augmentation really means — how far human expertise can be amplified when combined with capable AI systems. It’s not fair to say lockd was "vibe coded"; you still need solid foundations in distributed computing. But nearly all of the code was written by an AI coding agent, while I’ve acted as the solution architect - steering design decisions, ensuring technical feasibility, and keeping the vision cohesive.
The result is far beyond what I first imagined. Just a few years ago, if someone told you I built lockd in less than two weeks of spare time, you’d probably laugh. Today, you might just believe it.
We’re living in an incredible moment - where building at the speed of twelve parsecs or less is no longer science fiction.
- Exclusive leases per key with configurable TTLs, keep-alives, fencing tokens, and sweeper reaping for expired holders.
- Acquire-for-update helpers wrap acquire → get → update, keeping leases alive while user code runs and releasing them automatically.
- Public reads let dashboards or downstream systems fetch published state without holding leases (`/v1/get?public=1`, the `Client.Get` default behavior).
- Atomic JSON state up to ~100 MB (configurable) with CAS (version + ETag) headers.
- Namespace-aware indexing across S3/MinIO, Azure, or disk backends. Query with RFC 6901 selectors (`/workflow/state="done"`, braces, `>=`, etc.).
- Streaming query results: keys-only (compact JSON) or NDJSON documents with metadata, consumable via SDK/CLI.
- Encryption everywhere when configured: metadata, state blobs, and queue payloads all flow through `internal/storage.Crypto`.
- At-least-once queue built on the same storage: enqueue, stateless/stateful dequeue, ack/nack/extend, DLQ after `max_attempts`.
- Visibility controls (ack deadlines, reschedule, extend) and QRF throttling hooks for overload protection.
- State sidecar per message to hold workflow context; created lazily via `EnsureStateExists`.
- Simple HTTP/JSON API (no gRPC) with optional mTLS. All endpoints support correlation IDs, structured errors, and fencing tokens.
- Go SDK (`client`) with retries, structured `APIError`, acquire-for-update, streaming query helpers, and document helpers (`client.Document`).
- Cobra/Viper CLI that mirrors the SDK (leases, queue operations, `lockd client query`, `lockd client namespace`, etc.) and is covered by unit tests.
- S3 / MinIO / S3-compatible using conditional copy CAS and optional KMS/SSE.
- Azure Blob Storage (Shared Key or SAS) with the same storage port.
- Disk backend optimized for SSD/NVMe, with optional inotify watcher for queue change feed; works over NFS via pure polling.
- In-memory backend for tests and demos.
- Structured logging (`pslog`) with subsystem tagging (`server.lifecycle.core`, `search.index`, `queue.dispatcher`, etc.).
- Watchdogs baked into unit/integration tests to catch hangs instantly.
- `lockd verify store` diagnostics ensure backend credentials + permissions + encryption descriptors are valid before deploying.
- Integration suites (`run-integration-suites.sh`) cover every backend/feature combination; use them before landing cross-cutting changes.
Every structured log carries a sys field (system.subsystem.component). The
strings come directly from loggingutil.WithSubsystem and represent concrete
code paths. The primary production subsystems are:
- `client.sdk` – Go SDK calls (leases, queue APIs, acquire-for-update helper).
- `client.cli`, `cli.root`, `cli.verify` – Cobra/Viper CLI plumbing, namespace helpers, and the `lockd verify` workflows.
- `api.http.server` – TLS listener, mux, and server-level errors. Each handler also emits `api.http.router.<operation>` (for example `api.http.router.acquire`, `api.http.router.queue.enqueue`, `api.http.router.query`).
- `control.lsf.observer` – Host sampling loop that feeds the QRF controller.
- `control.qrf.controller` – Applies throttling/Retry-After headers and exposes demand metrics to HTTP handlers.
- `server.lifecycle.core` – Supervises background loops (sweeper, telemetry, namespace config) and owns process-level lifecycle logging.
- `server.shutdown.controller` – Drives drain-aware shutdown, emits `Shutdown-Imminent`, and monitors the close budget.
- `namespace.config` – Manages per-namespace capabilities (scan vs. index) and backs `/v1/namespace` plus search adapters.
- `queue.dispatcher.core` – Ready cache, change-feed watcher, and consumer demand reconciliation.
- `search.scan` – Selector evaluation + scan adapter (explicit engine or fallback).
- `search.index` – Index manager, memtable flushers, `/v1/index/flush`.
- `storage.pipeline.snappy.pre_encrypt` – Optional compression pass executed before encryption whenever storage compression is enabled.
- `storage.crypto.envelope` – Kryptograf envelope encryption for metadata/state/queue payloads.
- `storage.backend.core` – Storage adapters (S3/MinIO, disk, Azure, mem).
- `observability.telemetry.exporter` – OTLP exporter for traces and metrics.
Additional prefixes show up in specialised contexts:
- `bench.disk.*`, `bench.minio.*` – Benchmark harnesses and perf suites.
- `api.http.router.*` – Every HTTP route (acquire, query, queue ops, etc.).
- `client.cli.*` / `cli.verify` – CLI subcommands and auth workflows.
Lockd’s major workflows are documented as PlantUML sequence diagrams:
- State path — acquire/get/update/release over the lease/state surface.
- Queue path — enqueue, dequeue, and ack/nack/extend across the queue service.
- Search path — `/v1/query` execution and `/v1/index/flush`.
- Acquire – `POST /v1/acquire` → acquire lease (optionally blocking).
- Get state – `POST /v1/get` → stream JSON state with CAS headers. Supply `X-Lease-ID` + `X-Fencing-Token` from the acquire response.
- Update state – `POST /v1/update` → upload new JSON with `X-If-Version` and/or `X-If-State-ETag` to enforce CAS. Include the current `X-Fencing-Token`.
- Remove state (optional) – `POST /v1/remove` → delete the stored JSON blob while holding the lease. Honor the same `X-Lease-ID`, `X-Fencing-Token`, and CAS headers (`X-If-Version`, `X-If-State-ETag`) as updates.
- Release – `POST /v1/release` → release lease with the same fencing token; the sweeper handles timeouts for crashed workers.
AcquireForUpdate combines the normal acquire → get → update → release flow
into a single call that takes a user-supplied function. The handler receives an
AcquireForUpdateContext exposing the current StateSnapshot along with helper
methods (Update, UpdateBytes, Save, Remove, etc.). While the handler runs, the
client keeps the lease alive in the background; once the handler returns, the
helper always releases the lease.
Only the initial acquire/get handshake retries automatically (respecting
client.WithAcquireFailureRetries and related backoff settings). After the
handler begins executing, any error is surfaced immediately so callers can
decide whether to re-run the helper.
The legacy /v1/acquire-for-update streaming endpoint has been removed;
all clients must use the callback helper described above.
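A rough, non-authoritative sketch of what a callback looks like in application code is shown below. Only the names `AcquireForUpdate`, `AcquireForUpdateContext`, `StateSnapshot`, and `Save` come from this section; the exact signatures and the snapshot decoding helper are assumptions, so treat the `client` package godoc as the source of truth.

```go
// Hypothetical sketch: signatures and the snapshot decoder are assumed.
err := cli.AcquireForUpdate(ctx,
	api.AcquireRequest{Key: "orders", Owner: "worker-1", TTLSeconds: 30},
	func(afu *client.AcquireForUpdateContext) error {
		var checkpoint struct{ Cursor int }
		// Decode the current StateSnapshot (accessor/decoder names assumed).
		if err := afu.StateSnapshot().Decode(&checkpoint); err != nil {
			return err
		}
		checkpoint.Cursor++
		// Save persists the new JSON while the helper keeps the lease alive;
		// the lease is always released once this handler returns.
		return afu.Save(checkpoint)
	})
if err != nil {
	// Once the handler has started, errors surface immediately; the caller
	// decides whether to invoke the helper again.
	log.Printf("acquire-for-update: %v", err)
}
```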
- `server.go` – server wiring, storage retry wrapper, sweeper.
- `internal/httpapi` – HTTP handlers for the API surface.
- `internal/storage` – backend interface, retry wrapper, S3/Disk/Memory.
- `client` – public Go SDK.
- `cmd/lockd` – CLI entrypoint (Cobra/Viper).
- `tlsutil` – bundle loading/generation helpers.
- `integration/` – end-to-end tests (mem, disk, NFS, AWS, MinIO, Azure, OTLP, queues).
The HTTP handlers embed swaggo annotations so the OpenAPI description can be produced straight from the code. Run make swagger (or go generate ./swagger) with the swag binary on your PATH to refresh the spec. Generation writes three artifacts to swagger/docs/:
- `swagger.json` and `swagger.yaml` for downstream tooling.
- `swagger.html`, a self-contained Swagger UI page that inlines the JSON spec.
These files live alongside the CLI so the reusable server library stays swaggo-free. Serve the HTML with any static web host—or just open it locally—to explore the API interactively.
lockd picks the storage implementation from the --store flag (or LOCKD_STORE
environment variable) by inspecting the URL scheme:
| Scheme | Example | Backend | Notes |
|---|---|---|---|
| `mem://` or empty | `mem://` | In-memory | Ephemeral; test only. |
| `aws://` | `aws://my-bucket/prefix` | AWS S3 | Provide AWS credentials via `AWS_ACCESS_KEY_ID`/`AWS_SECRET_ACCESS_KEY` (and optional `AWS_SESSION_TOKEN`). Set the region via `--aws-region`, `LOCKD_AWS_REGION`, `AWS_REGION`, or `AWS_DEFAULT_REGION`. |
| `s3://` | `s3://localhost:9000/lockd-data?insecure=1` | S3-compatible (MinIO, Localstack, etc.) | TLS enabled by default. Append `?insecure=1` for HTTP. Supply credentials with `LOCKD_S3_ACCESS_KEY_ID`/`LOCKD_S3_SECRET_ACCESS_KEY` (falls back to `LOCKD_S3_ROOT_USER`/`LOCKD_S3_ROOT_PASSWORD`). |
| `disk://` | `disk:///var/lib/lockd-data` | SSD/NVMe-tailored disk backend | Stores state/meta beneath the provided root; optional retention window. |
| `azure://` | `azure://account/container/prefix` | Azure Blob Storage | Account name in host, container + optional prefix in path. Authentication via account key or SAS token. |
For AWS, the standard credential chain (AWS_ACCESS_KEY_ID /
AWS_SECRET_ACCESS_KEY, profiles, IAM roles, etc.) is used. For other
S3-compatible stores, lockd reads LOCKD_S3_ACCESS_KEY_ID and
LOCKD_S3_SECRET_ACCESS_KEY (falling back to
LOCKD_S3_ROOT_USER/LOCKD_S3_ROOT_PASSWORD). No secret keys are stored in the
lockd config file.
- Uses temp uploads + `CopyObject` with conditional headers for CAS.
- Supports SSE (`aws:kms` / `AES256`) and custom endpoints (MinIO, Localstack).
- Default retry budget: 12 attempts, 500 ms base delay, capped at 15 s.
Configuration (flags or env via LOCKD_ prefix):
| Flag / Env | Description |
|---|---|
| `--store` / `LOCKD_STORE` | `aws://bucket/prefix` or `s3://host:port/bucket` |
| `--aws-region` | AWS region for `aws://` stores (required unless provided via env) |
| `--s3-sse` | `AES256` or `aws:kms` |
| `--s3-kms-key-id` | KMS key for generic `s3://` stores |
| `--aws-kms-key-id` | KMS key for `aws://` stores |
| `--s3-max-part-size` | Multipart upload part size |
Shutdown tuning:
| Flag / Env | Description |
|---|---|
| `--drain-grace` / `LOCKD_DRAIN_GRACE` | How long to keep serving existing lease holders before HTTP shutdown begins (default 10s; set 0s to disable draining). |
| `--shutdown-timeout` / `LOCKD_SHUTDOWN_TIMEOUT` | Overall shutdown budget, split 80/20 between drain and HTTP server teardown (default 10s; set 0s to rely on the orchestrator’s deadline). |
- Streams JSON payloads directly to files beneath the store root, hashing on the fly to produce deterministic ETags.
- Keeps metadata in per-key protobuf documents; state lives under `<namespace>/state/<encoded-key>/data`, keeping every payload inside its namespace directory (for example `default/state/orders/data`).
- Optional retention (`--disk-retention`, `LOCKD_DISK_RETENTION`) prunes keys whose metadata `updated_at_unix` is older than the configured duration. Set to `0` (default) to keep data indefinitely.
- The janitor sweep interval defaults to half the retention window (clamped between 1 minute and 1 hour). Override via `--disk-janitor-interval`.
- Configure with `--store disk:///var/lib/lockd-data`. All files live beneath the specified root; lockd creates `meta/`, `state/`, and `tmp/` directories automatically, and every metadata/state object is stored with its namespace as part of the key (e.g. `default/q/orders/msg/...`).
- URL format: `azure://<account>/<container>/<optional-prefix>` (the account comes from the host component).
- Supply credentials via `--azure-key` / `LOCKD_AZURE_ACCOUNT_KEY` (Shared Key) or `--azure-sas-token` / `LOCKD_AZURE_SAS_TOKEN` (SAS). Standard Azure environment variables such as `AZURE_STORAGE_ACCOUNT` are also honoured if the account is omitted from the URL.
- Default endpoint is `https://<account>.blob.core.windows.net`; override with `--azure-endpoint` or `?endpoint=` on the store URL when using custom domains/emulators.
- Authentication supports either account keys (Shared Key) or SAS tokens. Provide exactly one of the above secrets; the CLI no longer requires `--azure-account` because the account name is embedded in the store URL.
- Example:
  set -a && source .env.azure && set +a
  lockd --store "azure://lockdintegration/container/pipelines" --listen :9341 --disable-mtls

The in-memory backend (`mem://`) is an in-process store used for unit tests; it can also serve for experimentation or a no-footprint ephemeral instance.
When the server receives a shutdown signal it immediately advertises a Shutdown-Imminent header, rejects new acquires with a shutdown_draining API error, and keeps existing lease holders alive until the drain grace expires. By default the 10 s --shutdown-timeout budget is split 80/20: approximately 8 s are dedicated to draining active leases and the remaining 2 s are reserved for http.Server.Shutdown to close idle connections cleanly. Setting --drain-grace 0s (or LOCKD_DRAIN_GRACE=0s) skips the drain window entirely when an orchestrator already enforces a strict deadline.
Clients are drain-aware too. The Go SDK and CLI listen for the Shutdown-Imminent header and, after in-flight work completes, automatically release the lease so another worker can be scheduled. Opt out via client.WithDrainAwareShutdown(false), --drain-aware-shutdown=false, or LOCKD_CLIENT_DRAIN_AWARE=false if your workflow prefers to hold the lease until the next session. KeepAlive and Release continue to succeed during the drain window, so operators can monitor progress (metrics/logs) while state is flushed to storage.
lockd exposes flags mirrored by LOCKD_* environment variables. Example:
# AWS S3 with region-based endpoint
export LOCKD_STORE="aws://my-bucket/prefix"
export LOCKD_AWS_REGION="us-west-2"
export AWS_ACCESS_KEY_ID="AKIA..."
export AWS_SECRET_ACCESS_KEY="..."
lockd \
--listen :9341 \
--store "$LOCKD_STORE" \
--default-namespace workflows \
--json-max 100MB \
--default-ttl 30s \
--max-ttl 30m \
--acquire-block 60s \
--sweeper-interval 5s \
--bundle $HOME/.lockd/server.pem
# MinIO running locally over HTTP
export LOCKD_STORE="s3://localhost:9000/lockd-data?insecure=1"
export LOCKD_S3_ACCESS_KEY_ID="lockddev"
export LOCKD_S3_SECRET_ACCESS_KEY="lockddevpass"
lockd --store "$LOCKD_STORE" --listen :9341 --bundle $HOME/.lockd/server.pem
# Azure Blob Storage (account key)
set -a && source .env.azure && set +a
# .env.azure should export LOCKD_STORE=azure://account/container/prefix and LOCKD_AZURE_ACCOUNT_KEY=...
lockd --store "$LOCKD_STORE" --listen :9341 --disable-mtls
# IPv4-only binding
lockd --listen-proto tcp4 --listen 0.0.0.0:9341 --store "$LOCKD_STORE"
# Prefer stdlib JSON compaction for tiny payloads
lockd --store mem:// --json-util stdlib

The default listen address is `:9341`, chosen from the unassigned IANA space to
avoid clashes with common cloud-native services.
Every key, queue, and workflow lease lives inside a namespace. When callers omit
the field, the server falls back to --default-namespace (defaults to
"default"). Use --default-namespace / LOCKD_DEFAULT_NAMESPACE to set the
cluster-wide default, or supply namespace in API requests to isolate
workloads.
The Go SDK mirrors this behaviour via client.WithDefaultNamespace. CLI
commands expose --namespace / -n and honor LOCKD_CLIENT_NAMESPACE for
lease/state operations plus LOCKD_QUEUE_NAMESPACE for queue helpers. Queue
dequeue commands export LOCKD_QUEUE_NAMESPACE (along with message metadata)
so follow-up ack/nack/extend calls can reuse the value without repeating
--namespace. Set these environment variables in your shell to avoid passing
the flag on every invocation.
Namespaces cascade to storage: metadata and payload objects are prefixed with
<namespace>/... regardless of backend. When adding new features or tests,
ensure we exercise at least one non-default namespace to avoid regressions in
prefix handling.
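A minimal sketch of the SDK side (only the option name comes from this README; the rest mirrors the quickstart):

```go
// Route every SDK call to the "workflows" namespace unless a request overrides it.
cli, err := client.New(
	"https://lockd.example.com",
	client.WithDefaultNamespace("workflows"),
)
if err != nil {
	log.Fatal(err)
}
```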
Metadata attributes & hidden keys
Each key’s metadata protobuf stores lease info plus user-controlled attributes.
The server exposes POST /v1/metadata so callers can mutate attributes without
rewriting the JSON state. Today the flag query_hidden (persisted as the
lockd.query.exclude attribute) hides a key from /v1/query results while
keeping it readable via the lease or public=1 GET helpers. Toggle it by
holding the lease and calling:
curl -sS -X POST "https://host:9341/v1/metadata?key=orders&namespace=default" \
-H "X-Lease-ID: $LEASE_ID" \
-H "X-Fencing-Token: $FENCING" \
-d '{"query_hidden":true}'The same attribute can ride along with state uploads by setting the
X-Lockd-Meta-Query-Hidden header; the Go SDK exposes
client.WithQueryHidden() / client.WithQueryVisible() helpers and
LeaseSession.UpdateMetadata / Client.UpdateMetadata for direct control.
Server-side diagnostics (e.g. lockd-diagnostics/*) are automatically hidden so
queries never leak internal housekeeping blobs. Future metadata attributes will
use the same endpoint and response envelope (api.MetadataUpdateResponse).
POST /v1/query still returns {namespace, keys, cursor} JSON by default, but
you can now request document streams by passing return=documents. In document
mode the server replies with Content-Type: application/x-ndjson, sets the
cursor/index metadata via headers, and streams rows shaped like:
{"ns":"default","key":"orders/123","ver":42,"doc":{"status":"ready"}}
Only published documents are streamed, so the semantics match public=1 GET.
The Go SDK exposes client.WithQueryReturnDocuments() plus a revamped
QueryResponse that provides Mode(), Keys(), and ForEach helpers. In
streaming mode each row exposes row.DocumentReader() (close it when you’re
done) or the convenience row.DocumentInto(...) / row.Document() helpers. If
you only need the identifiers, call Keys() to drain the stream without
materialising any documents. Response headers mirror the cursor and index
sequence (X-Lockd-Query-Cursor, X-Lockd-Query-Index-Seq), and the metadata
map is available both in headers and on the JSON response for the keys-only
mode.
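A hedged sketch of document-mode queries with the SDK is shown below. The option and row helper names come from this section; the `Query` call itself, its request fields, and the row type are assumptions made for illustration.

```go
// Sketch only: the Query signature, request fields, and row type are assumed.
resp, err := cli.Query(ctx, api.QueryRequest{
	Namespace: "default",
	Selector:  `/report/status="staged"`,
}, client.WithQueryReturnDocuments())
if err != nil {
	log.Fatal(err)
}

err = resp.ForEach(func(row *client.QueryRow) error {
	var doc struct {
		Status string `json:"status"`
	}
	// DocumentInto decodes the streamed NDJSON document for this row.
	if err := row.DocumentInto(&doc); err != nil {
		return err
	}
	log.Printf("status=%s", doc.Status)
	return nil
})
if err != nil {
	log.Fatal(err)
}
```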
The queue dispatcher multiplexes storage polling across all connected consumers.
Use these flags (or matching LOCKD_QUEUE_* environment variables) to adjust
its behaviour:
| Flag / Env | Description |
|---|---|
| `--queue-max-consumers` (`LOCKD_QUEUE_MAX_CONSUMERS`) | Caps the number of concurrent dequeue waiters handled by a single server (default 1000). |
| `--queue-poll-interval` (`LOCKD_QUEUE_POLL_INTERVAL`) | Baseline interval between storage scans when no change notifications arrive (default 3s). |
| `--queue-poll-jitter` (`LOCKD_QUEUE_POLL_JITTER`) | Adds randomised delay up to the specified duration to stagger concurrent servers (default 500ms; set 0 to disable). |
| `--disk-queue-watch` (`LOCKD_DISK_QUEUE_WATCH`) | Enables Linux/inotify queue watchers on the disk backend (default true; automatically ignored on unsupported filesystems such as NFS). |
| `--mem-queue-watch` (`LOCKD_MEM_QUEUE_WATCH`) | Enables in-process queue notifications for the in-memory backend (default true; disable to force pure polling). |
devenv/docker-compose.yaml brings up a complete local stack—Jaeger, the OTLP
collector, and a MinIO instance whose data lives under
devenv/volumes/miniostore/. Start the environment with:
cd devenv
docker compose up -d

The MinIO container listens on localhost:9000 (S3 API) / 9001 (console),
with credentials lockddev / lockddevpass. Point lockd at the local bucket via:
export LOCKD_STORE="s3://localhost:9000/lockd-data?insecure=1"
export LOCKD_S3_ACCESS_KEY_ID="lockddev"
export LOCKD_S3_SECRET_ACCESS_KEY="lockddevpass"

Set --otlp-endpoint (or LOCKD_OTLP_ENDPOINT) to export traces and metrics to
the same compose collector. Omit the scheme to default to OTLP/gRPC on
port 4317; use `grpc://` / `grpcs://` to force gRPC, or `http://` / `https://`
(default port 4318) for the HTTP transport. With the compose stack running:
LOCKD_STORE=mem:// \
LOCKD_OTLP_ENDPOINT=localhost \
lockd

All HTTP endpoints are wrapped with otelhttp, storage backends emit child
spans, and structured logs attach trace_id/span_id when a span is active.
Run go run ./devenv/assure to execute an end-to-end probe that verifies the
compose stack. The tool assumes the default compose settings (MinIO on
localhost:9000, OTLP collector on localhost:4317, Jaeger UI/API on
localhost:16686), connects to MinIO, starts an in-process lockd server against
the dev bucket, performs lease/state mutations, confirms the underlying objects
exist, and queries Jaeger’s HTTP API to ensure OTLP traces arrived via the
bundled collector. It’s a zero-config “go run” sanity check once docker compose up is running.
Every request processed by lockd carries a correlation identifier:
- Incoming clients may provide `X-Correlation-Id`; the server accepts printable ASCII values up to 128 characters. Invalid identifiers are ignored and a fresh UUID is generated instead.
- Successful acquire responses include `correlation_id` in the JSON payload and echo `X-Correlation-Id` in the response headers. Follow-up operations must continue sending the same header.
- Spans exported via OTLP include the `lockd.correlation_id` attribute so traces and logs can be tied together.
- Queue enqueue/dequeue responses and the ack/nack/extend APIs echo `correlation_id`, preserving the identifier across retries, DLQ moves, and stateful leases.
- The Go client automatically propagates correlation IDs. Seed a context with `client.WithCorrelationID(ctx, id)` (use `client.GenerateCorrelationID()` to mint one) and all lease/session operations will emit the header (see the sketch below). For generic API clients, wrap an `http.Client` with `client.WithCorrelationHTTPClient` or a custom transport via `client.WithCorrelationTransport` to overwrite `X-Correlation-Id` on every request. `client.CorrelationIDFromResponse` extracts the identifier from HTTP responses when needed.
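A short sketch continuing from the Go quickstart (only the helper names above are guaranteed; the surrounding wiring is illustrative):

```go
// Mint a correlation ID and attach it to the context; every SDK call made with
// this context sends the matching X-Correlation-Id header automatically.
cid := client.GenerateCorrelationID()
ctx := client.WithCorrelationID(context.Background(), cid)

sess, err := cli.Acquire(ctx, api.AcquireRequest{Key: "orders", Owner: "worker-1", TTLSeconds: 30})
if err != nil {
	log.Fatal(err)
}
defer sess.Close()
log.Printf("tie logs and traces together with cid=%s", cid)
```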
Every successful acquire response includes a fencing_token and echoing
X-Fencing-Token on follow-up requests is mandatory. The Go SDK manages the
token automatically when you reuse the same client.Client. For CLI workflows
you can export the token so subsequent commands pick it up:
eval "$(lockd client acquire --server localhost:9341 --owner worker --key orders)"
lockd client keepalive --lease "$LOCKD_CLIENT_LEASE_ID" --key orders
lockd client update --lease "$LOCKD_CLIENT_LEASE_ID" --fencing-token "$LOCKD_CLIENT_FENCING_TOKEN" --key orders payload.json

If the server detects a stale token it returns `403 fencing_mismatch`, ensuring
delayed or replayed requests cannot clobber state after a lease changes hands.
The token is a strictly monotonic counter that advances on every successful
acquire. Compared with relying purely on lease IDs and CAS/version checks, it
adds several key safeguards:
- Lease turnover without state writes – metadata `version` only increments when the JSON blob changes. If lease A expires and lease B acquires but has not updated the state yet, the version remains unchanged. The fencing token has already increased, so any delayed keepalive/update from lease A is rejected immediately.
- Ordering, not just identity – lease IDs are random, so downstream systems cannot tell which one is newer. By carrying the token, workers give databases and queues a simple “greater-than” check: accept writes with the highest token, reject anything older.
- Cache resilience inside the server – the handler caches lease metadata to avoid extra storage reads. A stale request might otherwise slip through by reading the cached entry; the fencing token still forces a mismatch and blocks the outdated lease holder.
- Protection for downstream systems – workers can forward the token to other services (databases, queues) and let them reject stale writers; see the sketch after this list. CAS keeps lockd’s JSON state consistent, while fencing tokens extend that guarantee to anything else the worker touches.
- Operational guardrails – CLI/scripts that stash lease IDs in environment variables gain an extra safety net. If an operator forgets to refresh after a re-acquire, the stale token triggers a clear 403 instead of silently updating the wrong state.
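To make the downstream “greater-than” check concrete, here is a small, self-contained illustration. The `fencedStore` type is hypothetical; lockd only guarantees that the fencing token it hands out is strictly monotonic per key.

```go
package main

import (
	"fmt"
	"sync"
)

// fencedStore is a hypothetical downstream store that accepts only writes
// carrying the newest fencing token it has seen.
type fencedStore struct {
	mu        sync.Mutex
	lastToken uint64
	payload   []byte
}

// Write rejects anything older than the newest token observed so far.
func (s *fencedStore) Write(token uint64, payload []byte) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if token < s.lastToken {
		return fmt.Errorf("stale fencing token %d (newest seen %d)", token, s.lastToken)
	}
	s.lastToken = token
	s.payload = payload
	return nil
}

func main() {
	store := &fencedStore{}
	_ = store.Write(7, []byte(`{"status":"ok"}`))      // accepted: newest token wins
	err := store.Write(6, []byte(`{"status":"late"}`)) // rejected: delayed writer
	fmt.Println(err)
}
```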
Handshake retries for AcquireForUpdate are bounded by DefaultFailureRetries
(currently 5). Each time the initial acquire/get sequence encounters
lease_required, throttling, or a transport error, the helper burns one retry
according to the configured backoff (client.WithAcquireFailureRetries,
client.WithAcquireBackoff). Once the handler starts running, any subsequent
error is returned immediately so the caller can decide whether to invoke the
helper again.
While the handler executes, the helper issues keepalives at half the TTL. A
failed keepalive cancels the handler’s context, surfaces the error, and the
helper releases the lease (treating lease_required as success). Other client
calls (Get, Update, KeepAlive, Release) continue to surface
lease_required immediately so callers can choose their own retry strategy.
For multi-host deployments, build clients with
client.NewWithEndpoints([]string{...}). The SDK rotates through the supplied
base URLs when a request fails, carrying the same bounded retry budget across
endpoints, so failovers remain deterministic.
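A minimal sketch (the constructor name comes from this section; the error return is assumed to mirror `client.New`):

```go
// Rotate across replicas; the bounded retry budget carries over between endpoints.
cli, err := client.NewWithEndpoints([]string{
	"https://lockd-a.example.com",
	"https://lockd-b.example.com",
})
if err != nil {
	log.Fatal(err)
}
```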
lockd can also read a YAML configuration file (loaded via Viper). At start-up
the server looks for $HOME/.lockd/config.yaml; use -c/--config (or
LOCKD_CONFIG) to point at an alternate file:
lockd -c /etc/lockd/config.yaml
# or
LOCKD_CONFIG=/etc/lockd/config.yaml lockd

Generate a template with sensible defaults using the helper command:
lockd config gen # writes $HOME/.lockd/config.yaml
lockd config gen --stdout # print the template instead of writing a file
lockd config gen --out /tmp/lockd.yaml --force

The generated file contains the same keys as the CLI flags (for example
listen-proto, json-max, json-util, payload-spool-mem, disk-retention, disk-janitor-interval, s3-region, s3-disable-tls). When present, the configuration file
is read before environment variables so you can override individual settings via
LOCKD_* exports or command-line flags.
Use lockd --bootstrap /path/to/config-dir to idempotently generate a CA bundle
(ca.pem), server bundle with kryptograf metadata (server.pem +
server.denylist), a starter client certificate (client.pem), and a
config.yaml wired to that material. The flag is safe to run on every start; it
skips files that already exist and only fills in the missing pieces. Container
images built from this repo run with --bootstrap /config by default so
mounting an empty /config volume automatically provisions mTLS + storage
encryption.
Build a minimal image with the provided Containerfile:
nerdctl build -t lockd:latest .
# or with podman...
podman build -t lockd:latest .

The final stage is FROM scratch with /lockd as entrypoint. It exposes two
volumes:
- `/config` – persisted certificates, config, denylist, and kryptograf material
- `/storage` – default `disk:///storage` backend for state/queue data
Run the server:
nerdctl run -p 9341:9341 -v lockd-config:/config -v lockd-data:/storage localhost/lockd:latest

Because the entrypoint appends --bootstrap /config, mounting a fresh config
volume auto-creates CA/server/client bundles plus config.yaml. You can also
invoke admin commands directly:
nerdctl run -ti -v lockd-config:/config localhost/lockd:latest auth new client \
--cn worker-2 --out /config/client-worker-2.pem

Environment variables still override everything (for example supply
LOCKD_STORE, LOCKD_S3_ENDPOINT, etc.) so the same image works for disk,
MinIO, Azure, or AWS deployments.
Lockd ships with three drop-in JSON compactors. Select one with --json-util or
the LOCKD_JSON_UTIL environment variable:
- `lockd` (default) – streaming writer tuned for multi-megabyte payloads.
- `jsonv2` – tokenizer inspired by Go 1.25’s experimental JSONv2 runtime.
- `stdlib` – Go’s stock `encoding/json.Compact` helper for minimal dependencies.
Benchmarks on a 13th Gen Intel Core i7-1355U (Go 1.25.2):
| Implementation | Small (~150B) ns/op (allocs) | Medium (~60KB) ns/op (allocs) | Large (~2MB) ns/op (allocs) |
|---|---|---|---|
| `encoding/json` | 380 (1) | 69,818 (1) | 4,032,017 (1) |
| `lockd` | 1,609 (5) | 98,083 (20) | 2,818,759 (26) |
| `jsonv2` | 1,572 (5) | 92,892 (16) | 3,046,118 (23) |
The default remains lockd, which provides the best throughput on large payloads
— the primary lockd use-case — while jsonv2 or stdlib can be selected for
small, latency-sensitive workloads. Re-run the suite locally with:
go test -bench=BenchmarkCompact -benchmem ./internal/jsonutil

The client SDK now exposes streaming helpers so large JSON blobs no longer need to be buffered in memory. On the same 13th Gen Intel Core i7-1355U host:
| Benchmark | ns/op | MB/s | B/op | allocs/op |
|---|---|---|---|---|
| `BenchmarkClientGetBytes` | 707,158 | 370.72 | 1,218,417 | 131 |
| `BenchmarkClientGetStream` | 219,847 | 1,192.45 | 8,100 | 97 |
| `BenchmarkClientUpdateBytes` | 241,807 | 1,084.16 | 43,330 | 114 |
| `BenchmarkClientUpdateStream` | 402,225 | 651.74 | 9,759 | 122 |
Streaming reads cut allocations by ~150× and uploads by ~4.4×; upload throughput is a touch lower because the payload is generated on the fly, but the stream avoids materialising entire documents in memory.
On the server side, lockd compacts JSON through an in-memory spool that spills
to disk once a threshold is exceeded. By default up to 4 MiB of the request is
kept in RAM. You can tune this via --payload-spool-mem /
LOCKD_PAYLOAD_SPOOL_MEM / payload-spool-mem in the config file to trade
memory for CPU (or vice versa).
Running the MinIO-backed benchmarks with the default threshold:
| Benchmark | ns/op | MB/s | B/op | allocs/op |
|---|---|---|---|---|
| `BenchmarkLockdLargeJSON` | 112,064,222 | 46.78 | 22,309,298 | 6,166 |
| `BenchmarkLockdLargeJSONStream` | 59,486,906 | 88.14 | 22,279,726 | 6,315 |
| `BenchmarkLockdSmallJSON` | 76,891,365 | 0.01 | 1,790,225 | 7,513 |
| `BenchmarkLockdSmallJSONStream` | 18,178,942 | 0.03 | 439,805 | 2,261 |
Large uploads still allocate heavily because the spool buffers the first 4 MiB
before spilling to disk. Lowering the threshold (for example --payload-spool-mem=1MB)
pushes more work onto disk IO, which may improve tail latency on constrained
hosts. Small updates benefit significantly from streaming even with the default
threshold. Choose a value that matches your workload and disk characteristics;
the benchmarks above were gathered via:
set -a && source .env.local && set +a && go test -run=^$ -bench=BenchmarkClientGet -benchmem ./client
set -a && source .env.local && set +a && go test -run=^$ -bench=BenchmarkClientUpdate -benchmem ./client
set -a && source .env.local && set +a && go test -run='^$' -bench='BenchmarkLockd(LargeJSON|SmallJSON)' -benchmem ./integration/minio -tags "integration minio bench"

In addition to coordinating workflow checkpoints, lockd’s lease + atomic JSON model unlocks several other patterns once performance and durability goals are met:
- Feature flag shards – hold per-segment experiment state and atomically roll back under contention without adding a new datastore.
- Session handoff / sticky routing – track live client sessions across stateless edge workers using short leases and protobuf metadata blobs.
- IoT rollout controller – drive firmware or configuration rollouts where each device claims work and reports progress exactly once.
- Distributed cron / windowing – serialize recurring jobs (per key) so retries don’t overlap, while keeping per-run state directly in lockd.
Acquire-for-update is particularly useful for these scenarios because the state
reader holds the lease while a worker inspects the JSON payload. Once it computes
the next cursor it can call Update followed by Release, avoiding any race
window between separate update and release calls.
./run-benchmark-suites.sh provides a single entry point for benchmark runs.
Invoke ./run-benchmark-suites.sh list to see the available suites (disk,
minio, mem/lq) and run ./run-benchmark-suites.sh minio (or all) to
execute them with the correct build tags and per-suite logs in
benchmark-logs/.
With MinIO running locally (for example on localhost:9000) you can compare raw
object-store performance against lockd by running the benchmark suite:
# Large (5 MiB) and small payload benchmarks + concurrency tests
go test -bench . -run '^$' -tags "integration minio bench" ./integration/minio

The harness measures both sequential and concurrent scenarios for large (~5 MiB) and small (~512 B) payloads:
- Raw MinIO `PutObject` throughput (large/small).
- `lockd` acquire/update/release cycles (large/small).
- Raw MinIO concurrent writers on distinct keys (large/small).
- `lockd` concurrent writers on distinct keys (large/small).
Benchmarks assume the same environment variables as the MinIO integration tests
(LOCKD_STORE, MINIO_ROOT_USER, MINIO_ROOT_PASSWORD, etc.). Use
LOCKD_STORE=s3://localhost:9000/lockd-integration?insecure=1 for a default
local setup, or point it at HTTPS by omitting the ?insecure=1 query string.
The disk backend benchmarks pit raw disk.Store throughput against the HTTP API
and include an optional NFS-targeted scenario. Run them with:
set -a && source .env.local && set +a && go test -run=^$ -bench='Benchmark(Disk|LockdDisk)' -benchmem ./integration/disk -tags "integration disk bench"
set -a && source .env.local && set +a && go test -run ^$ -bench BenchmarkLockdDiskLargeJSONNFS -benchmem ./integration/disk -tags "integration disk bench"

Source .env.disk (or export the variables manually) before running; the suite
fails fast if the required paths are missing:
- `LOCKD_DISK_ROOT` – absolute path on SSD/NVMe for local disk benchmarks.
- `LOCKD_NFS_ROOT` – absolute path to an NFS mount (optional but required for the NFS benchmark). If both `/mnt/nfs4-lockd` and `/mnt/nfs-lockd` are unset/unavailable the test fails.
The benchmark/mem/lq package measures dispatcher throughput using the
in-process mem:// backend. Each case launches real servers and clients so the
numbers reflect the full subscribe/ack handshake rather than synthetic mocks:
go test -bench . -tags "bench mem lq" ./benchmark/mem/lqScenarios currently included:
| Name | Description |
|---|---|
| `single_server_prefetch1_100p_100c` | 100 producers / 100 consumers on one server with subscribe prefetch=1 (baseline). |
| `single_server_prefetch4_100p_100c` | Same workload with prefetch batches of four (default tuning). |
| `single_server_subscribe_100p_1c` | High fan-in into a single blocking subscriber (prefetch=16). |
| `single_server_dequeue_guard` | Legacy dequeue path kept as a regression guard. |
| `double_server_prefetch4_100p_100c` | Two servers sharing the same mem store to verify routing/failover performance. |
Only the total messages and enqueue/dequeue rates are printed by default so CI stays readable. Extra instrumentation can be toggled per run:
- `MEM_LQ_BENCH_EXTRA=1` – export latency-derived metrics such as `dequeue_ms/op`, `ack_ms/op`, `messages_per_batch`, and `dequeue_gap_max_ms`.
- `MEM_LQ_BENCH_DEBUG=1` – emit verbose client logs (including stale/idempotent ACKs) when chasing data races.
- `MEM_LQ_BENCH_TRACE=1` – attach the trace logger to both servers; combine with `LOCKD_READY_CACHE_TRACE=1` to trace ready-cache refreshes.
- `MEM_LQ_BENCH_TRACE_GAPS=1` – print per-delivery gaps over 10 ms to stdout.
- `MEM_LQ_BENCH_PRODUCERS`, `MEM_LQ_BENCH_CONSUMERS`, `MEM_LQ_BENCH_MESSAGES`, `MEM_LQ_BENCH_PREFETCH`, and `MEM_LQ_BENCH_WARMUP` – override the baked-in workload sizes without editing the source file.
- `MEM_LQ_BENCH_CPUPROFILE=/tmp/cpu.pprof` – capture a CPU profile for the run.
These knobs let you keep the fast, quiet default for routine regression runs while enabling deep tracing when profiling throughput locally.
For tests or embedded use-cases you can run lockd entirely in-process. The
client/inprocess package starts a private Unix-socket server and returns a
regular client facade:
ctx := context.Background()
cfg := lockd.Config{Store: "mem://", DisableMTLS: true}
inproc, err := inprocess.New(ctx, cfg)
if err != nil { log.Fatal(err) }
defer inproc.Close(ctx)
lease, err := inproc.Acquire(ctx, api.AcquireRequest{Key: "jobs", Owner: "worker", TTLSeconds: 15})
if err != nil { log.Fatal(err) }
defer inproc.Release(ctx, api.ReleaseRequest{Key: "jobs", LeaseID: lease.LeaseID})

Use client.BlockWaitForever (default) to wait indefinitely when acquiring, or client.BlockNoWait to fail immediately if the lease is already held.
Behind the scenes it relies on lockd.StartServer, which launches the server in
a goroutine and returns a shutdown function. You can use the helper directly
when wiring tests around Unix-domain sockets:
cfg := lockd.Config{Store: "mem://", ListenProto: "unix", Listen: "/tmp/lockd.sock", DisableMTLS: true}
srv, stop, err := lockd.StartServer(ctx, cfg)
if err != nil { log.Fatal(err) }
defer stop(context.Background())
cli, err := client.New("unix:///tmp/lockd.sock")
if err != nil { log.Fatal(err) }
sess, err := cli.Acquire(ctx, api.AcquireRequest{Key: "demo", Owner: "worker", TTLSeconds: 20})
if err != nil { log.Fatal(err) }
defer sess.Close()

Health endpoints:
- `/healthz` – liveness probe
- `/readyz` – readiness probe
The repository includes a dedicated testing harness (lockd.NewTestServer) that
starts a fully configured server, returns a shutdown function, and can emit logs
through testing.T. It also accepts a ChaosConfig that injects bounded
latency, drops, or a single forced disconnect via an in-process TCP proxy. This
is used heavily by the Go client’s integration tests to validate
AcquireForUpdate failover logic across multiple endpoints.
lockd verify store validates credentials (list/get/put/delete) and prints a
suggested IAM policy when access fails. Disk backends run a multi-replica
simulation (metadata CAS and payload writes) so locking bugs or stale CAS tokens
fail fast.
# Verify a disk mount before starting the server
LOCKD_STORE=disk:///var/lib/lockd lockd verify store
# Verify Azure Blob credentials (requires LOCKD_AZURE_ACCOUNT_KEY or LOCKD_AZURE_SAS_TOKEN)
LOCKD_STORE=azure://lockdaccount/lockd-container LOCKD_AZURE_ACCOUNT_KEY=... lockd verify store

When --store uses disk://, the same verification runs automatically during
server startup and the process exits if any check fails.
To keep observability output, docs, and dashboards aligned, every structured log emitted by lockd adheres to the following rules:
- `app` is always `lockd`, whether the line originates from the server, CLI, or in-process harness.
- `sys` is mandatory and encodes the origin hierarchy as `system.subsystem.component[.subcomponent]`. Use as many segments as needed to pinpoint the emitting code (for example `storage.crypto.envelope`, `queue.dispatcher.ready_cache.prune`, or `server.shutdown.controller.drain`).
- We no longer emit `svc` or `component` keys; `sys` replaces both. Any additional structured fields (latency, owner, key, cid, etc.) stay as-is.
OpenTelemetry exports follow the same convention: every server-managed span records lockd.sys=<value> alongside lockd.operation, so traces and metrics can be filtered with the same identifiers as logs.
mTLS is enabled by default. lockd looks for a bundle at
$HOME/.lockd/server.pem unless --bundle points elsewhere. Disable with
--disable-mtls / LOCKD_DISABLE_MTLS=1 (testing only).
Bundle format (PEM concatenated):
- CA certificate (trust anchor)
- Server certificate (leaf + chain)
- Server private key
- Optional denylist block (`LOCKD DENYLIST`)
The CA private key lives in ca.pem and should be stored securely. Keep it
offline when possible; only the CA certificate is bundled with each server.
# Create a Certificate Authority bundle (ca.pem)
lockd auth new ca --cn lockd-root
# Issue a server certificate signed by the CA
lockd auth new server --ca-in $HOME/.lockd/ca.pem --hosts "lockd.example.com,127.0.0.1"
# Issue a new client certificate signed by the bundle CA
lockd auth new client --ca-in $HOME/.lockd/ca.pem --cn worker-1
# Revoke previously issued client certificates (by serial number)
lockd auth revoke client <hex-serial> [<hex-serial>...]
# Inspect bundle details (CA, server cert, denylist)
lockd auth inspect server # lists all server*.pem bundles
lockd auth inspect client --in $HOME/.lockd/client-*.pem
# Verify bundles (validity, EKUs, denylist enforcement)
lockd auth verify server # scans all server*.pem in the config dir
lockd auth verify client --server-in $HOME/.lockd/server.pem

The commands default to $HOME/.lockd/, creating the directory with 0700 and
files with 0600 permissions. Use --out/--ca-in/--force to override file
locations. ca.pem contains the trust anchor + private key and is intended to
be stored in a secure location separate from the server runtime bundle.
When lockd auth new server writes to the default location and server.pem
already exists, the CLI automatically picks the next available name
(server02.pem, server03.pem, …) so existing bundles are preserved without
requiring --force.
Client issuance follows the same pattern: the first default bundle is written
to client.pem, then client02.pem, client03.pem, and so on when rerun.
lockd auth verify ensures that the server bundle presents a CA + ServerAuth
certificate (with matching server private key) and that client bundles were
issued by the same CA and are not present on the denylist.
lockd auth revoke client updates the denylist for every server*.pem bundle
in the same directory as the referenced server bundle so multi-replica nodes
block revoked certificates consistently. Pass --propagate=false to limit the
update to just the specified bundle when needed (e.g. staging experiments).
All metadata, lock state JSON blobs, and queue payloads are encrypted at rest by
default using pkt.systems/kryptograf.
When you run lockd auth new ca the CLI generates a kryptograf root key and a
global metadata descriptor and embeds them alongside the CA certificate/key in
ca.pem. Subsequent lockd auth new server invocations propagate the same
material into each server.pem bundle so every replica can reconstruct the
metadata DEK on startup.
At runtime the server mints per-object DEKs (one per lock state blob, per queue
message metadata document, and per queue payload) derived from stable
contexts (e.g. state:<key>, queue-meta:q/<queue>/msg/<id>.pb,
queue-payload:q/<queue>/msg/<id>.bin). Descriptors for these DEKs are stored
alongside the objects themselves so reads remain stateless. Encrypted objects
use deterministic content-types:
- Metadata protobuf: `application/vnd.lockd+protobuf-encrypted`
- JSON state: `application/vnd.lockd+json-encrypted`
- Queue payloads / DLQ binaries: `application/vnd.lockd.octet-stream+encrypted`
Disable encryption (testing only) with --disable-storage-encryption (or
LOCKD_DISABLE_STORAGE_ENCRYPTION=1). Optional Snappy compression is available via
--storage-encryption-snappy (only honoured when encryption remains enabled); when encryption is disabled, the original
content-types (application/x-protobuf, application/json,
application/octet-stream) are restored automatically.
lockd verify store now exercises the decrypt path by reading (or, when the
store is empty, synthesising and deleting) sample metadata/state and queue
objects. Failures surface immediately so misconfigured bundles or mismatched
descriptors are caught during deployment. Because storage encryption is tied to
the bundle, servers must load server.pem even when mTLS is disabled.
cli, err := client.New("https://lockd.example.com")
if err != nil { log.Fatal(err) }
sess, err := cli.Acquire(ctx, api.AcquireRequest{
Key: "orders",
Owner: "worker-1",
TTLSeconds: 30,
BlockSecs: client.BlockWaitForever,
})
if err != nil { log.Fatal(err) }
defer sess.Close()
var payload struct {
Data []byte
Counter int
}
if err := sess.Load(ctx, &payload); err != nil { log.Fatal(err) }
payload.Counter++
if err := sess.Save(ctx, payload); err != nil { log.Fatal(err) }
// Customise timeouts (HTTP requests + close/release window) when constructing the client:
cli, err := client.New(
"https://lockd.example.com",
client.WithHTTPTimeout(30*time.Second),
client.WithCloseTimeout(10*time.Second),
client.WithKeepAliveTimeout(8*time.Second),
)

To connect over a Unix domain socket (useful when the server runs on the same
host), point the client at unix:///path/to/lockd.sock:
cli, err := client.New("unix:///var/run/lockd.sock")
if err != nil { log.Fatal(err) }
// run the server with --disable-mtls (or supply a client bundle)
sess, err := cli.Acquire(ctx, api.AcquireRequest{Key: "orders", Owner: "worker-uds", TTLSeconds: 30})
if err != nil { log.Fatal(err) }
defer sess.Close()

Acquire automatically retries conflicts and transient 5xx/429 responses with
exponential backoff.
lockd client ships alongside the server binary for quick interactions with a
running cluster. Flags mirror the Go SDK defaults and honour LOCKD_CLIENT_*
environment variables.
# Acquire and release leases (exports LOCKD_CLIENT_* env vars)
eval "$(lockd client acquire --server 127.0.0.1:9341 --owner worker-1 --ttl 30s --key orders)"
lockd client keepalive --ttl 45s --key orders
lockd client release --key orders
# State operations / pipe through edit
lockd client get --key orders -o - \
| lockd client edit /status/counter++ \
| lockd client update --key orders
lockd client remove --key orders
# Atomic JSON mutations (mutate using an existing lease)
lockd client set --key orders /progress/step=fetch /progress/count++ time:/progress/updated=NOW
# Rich mutations with brace/quoted syntax (LQL)
lockd client set --key ledger \
'/data{/hello key="mars traveler",/count++}' \
/meta/previous=world \
time:/meta/processed=NOW
# Local JSON helper (no server interaction)
lockd client edit checkpoint.json /progress/step="done" /progress/count=+5
# Query keys or stream documents
lockd client query '/report/status="staged"'
lockd client query --output text --file keys.txt '/report/status="staged"'
lockd client query --documents --file staged.ndjson '/progress/count>=10'
lockd client query --documents --directory ./export '/report/region="emea"'
`lockd client query` parses the same LQL selector syntax as the server and
defaults to a compact JSON array of keys. Pass `--output text` for newline
lists that are easy to pipe into other shell tools. `--documents` switches the
request to `return=documents`, streaming each JSON state as NDJSON (to stdout by
default, or `--file`/`--directory` to store the feed). Directory mode writes one
`<key>.json` file per match, making it trivial to diff or archive results.
Selectors support shorthand comparison syntax, so `/progress/count>=10` is
automatically rewritten into the full `range{...}` clause, and `/status="ok"`
maps to an equality predicate without brace boilerplate. The selector argument
is required; to query “everything” explicitly pass an empty string (e.g.
`lockd client query ""`).
Pass --public to lockd client get when you only need the last published
state and don’t want to hold a lease (the CLI calls /v1/get?public=1 under the
hood). The default mode still enforces lease ownership and fencing tokens so
writers remain serialized.
Every lockd client subcommand accepts an optional --key (-k) flag. When
you omit --key, the command falls back to LOCKD_CLIENT_KEY (typically set
by the most recent acquire). Invoking acquire without --key requests a
server-generated identifier; the resulting value is exported via
LOCKD_CLIENT_KEY so follow-up calls can rely on the environment.
Queue commands ship alongside the standard lease helpers:
# Enqueue a JSON payload (stdin, --data, or --file)
lockd client queue enqueue \
--queue orders \
--content-type application/json \
--data '{"op":"ship","order_id":42}'
# Dequeue and export LOCKD_QUEUE_* environment variables
eval "$(lockd client queue dequeue --queue orders --owner worker-1)"
printf 'payload stored at %s\n' "$LOCKD_QUEUE_PAYLOAD_PATH"
# Use the exported metadata to ack/nack/extend
lockd client queue ack
lockd client queue nack --delay 15s --reason "upstream retry"
lockd client queue extend --extend 45s

`queue dequeue` supports a `--stateful` flag which acquires both the message
and workflow state leases; the exported LOCKD_QUEUE_STATE_* variables align
with the fields consumed by queue ack/nack/extend.
Custom clients must send `/v1/queue/enqueue` as `multipart/form-data` (or `multipart/related`) with two parts named `meta` and `payload`. The `meta` part contains a JSON-encoded `api.EnqueueRequest`; the `payload` part streams the message body and may use any `Content-Type` (e.g. `application/json`). Earlier builds auto-detected JSON, but current releases require the explicit field names, matching the Go SDK and CLI (see the sketch below).
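For custom clients written in Go, the request can be assembled with the standard library alone. The sketch below assumes a `queue` field inside the `meta` JSON (the real `api.EnqueueRequest` field names are not spelled out here); the two part names and the endpoint are the documented contract, and a production client would use the mTLS-aware HTTP client rather than `http.DefaultClient`.

```go
package main

import (
	"bytes"
	"encoding/json"
	"log"
	"mime/multipart"
	"net/http"
	"net/textproto"
)

func main() {
	var body bytes.Buffer
	w := multipart.NewWriter(&body)

	// Part 1: "meta" carries the JSON-encoded enqueue request.
	meta, err := w.CreateFormField("meta")
	if err != nil {
		log.Fatal(err)
	}
	// NOTE: the "queue" field name is an assumption for illustration only.
	if err := json.NewEncoder(meta).Encode(map[string]any{"queue": "orders"}); err != nil {
		log.Fatal(err)
	}

	// Part 2: "payload" streams the message body with its own Content-Type.
	hdr := textproto.MIMEHeader{}
	hdr.Set("Content-Disposition", `form-data; name="payload"`)
	hdr.Set("Content-Type", "application/json")
	payload, err := w.CreatePart(hdr)
	if err != nil {
		log.Fatal(err)
	}
	payload.Write([]byte(`{"op":"ship","order_id":42}`))
	w.Close()

	req, err := http.NewRequest(http.MethodPost, "https://lockd.example.com/v1/queue/enqueue", &body)
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Content-Type", w.FormDataContentType())

	resp, err := http.DefaultClient.Do(req) // real deployments use the mTLS-aware client
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	log.Println("enqueue status:", resp.Status)
}
```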
Payloads are streamed directly to disk. When --payload-out is omitted the CLI
creates a temporary file and exports its location via
LOCKD_QUEUE_PAYLOAD_PATH, making it easy to hand off large bodies to other
tools without buffering in memory.
The CLI auto-discovers client*.pem bundles under $HOME/.lockd/ (or use
--bundle) and performs the same host-agnostic mTLS verification as the SDK.
`set` and `edit` consume the shared LQL mutation DSL. Paths now use JSON Pointer
syntax (RFC 6901) (`/progress/counter++`), so literal dots, spaces, or Unicode in
keys are handled transparently (only `/` and `~` are escaped as `~1`/`~0`). The
mutator also supports brace blocks that fan out to nested fields
(`/data{/hello key="mars traveler",/count++}`), increments (`=+3` / `--`),
removals (`rm:` / `delete:`), and `time:` prefixes for RFC3339 timestamps.
Commas and newlines can be mixed freely, e.g. `lockd client set --key ledger 'meta{/owner="alice",/previous="world"}'`.
Timeout knobs mirror the Go client: --timeout (HTTP dial+request),
--close-timeout (release window), and --keepalive-timeout
(LOCKD_CLIENT_TIMEOUT, LOCKD_CLIENT_CLOSE_TIMEOUT, and
LOCKD_CLIENT_KEEPALIVE_TIMEOUT respectively). Use
--drain-aware-shutdown/LOCKD_CLIENT_DRAIN_AWARE (enabled by default) to
control whether the CLI auto-releases leases when the server signals a
drain phase via the Shutdown-Imminent header.
Use - with --output to stream results to standard output or with file inputs
to read from standard input (e.g. -o -, lockd client update ... -). When the
acquire command runs in text mode it prints shell-compatible export LOCKD_CLIENT_*= assignments, making eval "$(lockd client acquire ...)" a
convenient way to populate environment variables for subsequent commands.
When mTLS is enabled (default) the CLI assumes HTTPS for bare host[:port]
values; when you pass --disable-mtls it assumes HTTP. Supplying an explicit
http://.../https://... URL is always honoured.
Integration suites are selected via build tags. Every run must include the
integration tag plus one (or more) backend tags; optional feature tags narrow
the scope further. The general pattern is:
go test -tags "integration <backend> [feature ...]" ./integration/...For everyday development, ./run-integration-suites.sh wraps these invocations,
sources the required .env.<backend> files, and stores logs under
integration-logs/. Use list to see available suites (mem, disk, nfs, aws,
azure, minio, plus /lq, /query, and /crypto variants), pass specific suites such as
disk disk/lq or nfs nfs/lq, or run the full matrix with all. The helper
honors LOCKD_GO_TEST_TIMEOUT, exports LOCKD_TEST_STORAGE_ENCRYPTION=1 so
disk/nfs/etc. suites run with envelope crypto by default, and exposes
--disable-crypto to flip the env var when targeting legacy buckets.
Current backends:
| Backend | Notes | Examples |
|---|---|---|
| `mem` | Uses the in-memory store; no environment needed. | `go test -tags "integration mem" ./integration/...` (all mem suites) / `go test -tags "integration mem lq" ./integration/...` (queue scenarios only) / `go test -tags "integration mem query" ./integration/...` (query-only scenarios). |
| `disk` | Local disk backend. Requires `.env.disk` (see `integration/disk`). | `set -a && source .env.disk && set +a && go test -tags "integration disk" ./integration/...` / `... -tags "integration disk lq" ...` for queue-only coverage. |
| `nfs` | Disk backend mounted on NFS. Source `.env.nfs` so `LOCKD_NFS_ROOT` is set. | `set -a && source .env.nfs && set +a && go test -tags "integration nfs lq" ./integration/...`. |
| `aws` | Real S3 credentials via `.env`. | `set -a && source .env && set +a && go test -tags "integration aws" ./integration/...`. |
| `minio`, `azure` | S3-compatible / Azure Blob suites. | e.g. `go test -tags "integration minio" ./integration/...` (requires appropriate env). |
The queue-specific feature tag is lq, and query-specific coverage uses the
query tag. A suite built with integration && mem && lq, for example, only
compiles the queue wrappers in integration/mem/lq, while integration && mem && query targets the integration/mem/query tests without running the rest of
the mem suite.
We’ll extend the same layout to the AWS, Azure, and MinIO queue suites next so
go test -tags "integration aws lq" ./integration/... (and similar) will target
their queue scenarios without running unrelated tests.
For the Go client’s AcquireForUpdate failover tests:
go test -tags integration -run 'TestAcquireForUpdateCallback(SingleServer|FailoverMultiServer)' ./client

These harnesses ship with watchdog timers that panic after 5–10 s and print
full goroutine dumps via runtime.Stack, making hangs immediately actionable;
retain these guards whenever tests are updated.
- JavaScript/TypeScript client (Bun/Deno-compatible).
- Python client.
- C# client.
- Client helpers (auto keepalive, JSON patch utilities).
- Metrics/observability and additional diagnostics.
MIT – see LICENSE.
