Lokt ships with a built-in demo that shows what happens when concurrent processes share state without coordination. It builds a "hex wall" — a grid of hexadecimal characters — using parallel worker processes. With locking, the wall is clean. Without locking, it falls apart.
```sh
lokt demo                          # generate the script
./lokt-hexwall-demo.sh             # run with locking (clean)
./lokt-hexwall-demo.sh --no-lock   # run without locking (chaos)
```

You'll see a wall of hex characters printed to your terminal in real time.
In lock mode every row is uniform (row 0 = all 0s, row 1 = all 1s,
etc.). In no-lock mode the rows are a jumble of mismatched characters.
The demo spawns multiple worker processes (8 by default) that compete to
fill a grid, one character at a time. Each character requires exclusive
access to a shared counter file and three shared "work files." In lock
mode, workers use lokt guard to serialize access. In no-lock mode they
all read and write the same files simultaneously, stepping on each other.
The grid is ROWS x COLS characters (default 16x32 = 512 cells). Every
cell requires one lock acquisition. With 8 workers contending for each
acquisition, there are hundreds of contested lock attempts per run.
A critical section is a block of code that must run to completion without interruption from other processes. The demo's critical section has five phases. Here is what happens in each, and why locking matters.
**Phase 1: compute and write intermediates.** The worker reads the
shared counter to learn which cell to fill. It computes three
intermediate values (a, b, c) and writes them to three shared files
(work_a, work_b, work_c).
The values are chosen so that (a + b + c) % 16 equals the row number.
Specifically: a = row % 16, b = col % 16, c = (16 - b) % 16.
Because b + c is always a multiple of 16, the sum reduces to just a,
which is the row number. This identity only holds when all three values
come from the same worker at the same grid position.
With lock: The work files contain our values and only our values.
Without lock: Another worker may overwrite work_b or work_c
between our writes, replacing them with values from a different grid
position.
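Phase 1 can be sketched in portable shell. The variable names and `$STATE_DIR` layout here are illustrative, not the demo's actual code:

```shell
# Sketch of phase 1 (illustrative names; not the demo's actual code).
STATE_DIR=${STATE_DIR:-$(mktemp -d)}
COLS=32
i=42                      # cell index read from the shared counter
row=$(( i / COLS ))       # 42 / 32 = 1
col=$(( i % COLS ))       # 42 % 32 = 10
a=$(( row % 16 ))
b=$(( col % 16 ))
c=$(( (16 - b) % 16 ))
printf '%s' "$a" > "$STATE_DIR/work_a"
printf '%s' "$b" > "$STATE_DIR/work_b"
printf '%s' "$c" > "$STATE_DIR/work_c"
```

For cell 42 this writes a=1, b=10, c=6, and (1 + 10 + 6) % 16 = 1, the row number.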
**Phase 2: poison the counter.** The counter is temporarily set to a random grid position. This models "computation in progress" — the kind of intermediate state that real systems produce during multi-step updates.
With lock: No other worker can read the counter while it holds this garbage value. They are blocked, waiting for the lock.
Without lock: Other workers read the poisoned counter, believe they are at the wrong grid position, and begin computing intermediates for that position. This is where the real damage starts — a single poisoned read cascades into incorrect writes across multiple work files.
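A sketch of the poison step. The demo picks a random position; here the value is derived from the PID so its range is easy to check, and the file names are illustrative:

```shell
# Sketch of phase 2 (illustrative; not the demo's actual code).
STATE_DIR=${STATE_DIR:-$(mktemp -d)}
ROWS=16 COLS=32
poison=$(( $$ % (ROWS * COLS) ))    # some other grid position
printf '%s' "$poison" > "$STATE_DIR/counter"
# With the lock held, no other worker can read the counter until
# phase 5 restores it. Without the lock, any read here is poisoned.
```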
**Phase 3: read back and compute the character.** The worker reads the three work files back.
With lock: It reads its own values from phase 1. The cancellation
b + c ≡ 0 (mod 16) holds. The computed character equals the row number.
Without lock: Some work files have been overwritten by workers that were misdirected in phase 2. The values come from different grid positions, the cancellation breaks, and the character is wrong.
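The read-back and reduction, sketched for the locked case (illustrative names; the phase-1 values for cell 42 are written first so the block stands alone):

```shell
# Sketch of phase 3 (illustrative). With the lock held, the work
# files still hold this worker's own phase-1 values.
STATE_DIR=${STATE_DIR:-$(mktemp -d)}
printf '1'  > "$STATE_DIR/work_a"   # row % 16 for cell 42 (row 1)
printf '10' > "$STATE_DIR/work_b"   # col % 16             (col 10)
printf '6'  > "$STATE_DIR/work_c"   # (16 - 10) % 16
a=$(cat "$STATE_DIR/work_a")
b=$(cat "$STATE_DIR/work_b")
c=$(cat "$STATE_DIR/work_c")
ch=$(printf '%x' $(( (a + b + c) % 16 )))   # "1", the row number in hex
```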
**Phase 4: append to the row buffer.** The computed character is
appended to the row's buffer file. When the buffer reaches COLS
characters, the row is flushed to the output file.
With lock: Each row fills sequentially from left to right with the correct character.
Without lock: Workers write characters for wrong positions (due to counter poisoning), multiple workers may append to the same row concurrently, and the buffer fills with a mix of correct and incorrect characters.
**Phase 5: advance the counter.** The counter is set to the correct next value (i + 1).
With lock: The next worker picks up exactly where we left off.
Without lock: Multiple workers write competing "next" values. Some workers' phase-5 writes are overwritten by other workers' phase-2 poison writes. The counter jumps around unpredictably.
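The five phases together form the critical section. The demo serializes it with lokt guard; a minimal sketch of the same shape, using flock(1) as a stand-in and illustrative file names:

```shell
# Critical-section shape, with flock(1) standing in for `lokt guard`
# (illustrative; not the demo's actual code).
STATE_DIR=$(mktemp -d)
printf '0' > "$STATE_DIR/counter"
(
  flock -x 9                              # acquire; other workers block here
  i=$(cat "$STATE_DIR/counter")           # phase 1: learn our cell
  # ... phases 1-4: compute, poison, read back, append ...
  printf '%s' "$(( i + 1 ))" > "$STATE_DIR/counter"   # phase 5: hand off
) 9> "$STATE_DIR/lock"
```

Everything between acquiring fd 9 and the subshell's exit runs with no other lock holder interleaved.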
A critical section is a sequence of operations that must execute atomically — as if they were a single, indivisible step. If another process interleaves its own operations in the middle, the shared state becomes inconsistent.
Think of a bank teller serving one customer at a time. The teller reads the account balance, computes the new balance, and writes it back. If two tellers serve the same account simultaneously — both read $100, both subtract $50, both write $50 — the bank loses $50. The solution is a lock: only one teller handles the account at a time.
In the demo, the critical section is the five-phase computation. The "bank account" is the set of shared files (counter, work files, row buffers). The "tellers" are the worker processes.
Contention occurs when multiple processes want the same lock at the same time. Only one can proceed; the rest must wait.
Higher contention means more waiting, but it also means the lock is doing its job — preventing the corruption that would occur if everyone proceeded simultaneously.
In the demo:
- 8 workers, 512 cells = high contention. Workers frequently collide.
- 1 worker, 512 cells = zero contention. No races possible.
- 16 workers, 512 cells = even higher contention. More waiting, same correct result.
Real-world contention examples:
- Multiple CI/CD pipelines deploying the same service
- Parallel terraform applies against the same state
- Database transactions updating the same row
The demo proves a simple claim: if a computation has multiple steps over shared state, concurrent access without a lock produces incorrect results.
The lock provides mutual exclusion — the guarantee that only one process executes the critical section at a time. This is not about speed (locking is slower). It is about correctness. The locked demo produces the right answer every time. The unlocked demo produces garbage every time.
Phase 2 is the most destructive step in the no-lock scenario. When the counter is temporarily set to a random value:
- Worker A reads position 42, begins computing, poisons the counter to 317.
- Worker B reads 317, thinks it should fill position 317, and writes intermediates for that position into the shared work files.
- Worker A reads back the work files — gets Worker B's values for position 317 instead of its own values for position 42.
- Worker A writes a wrong character to the buffer of the row containing position 42.
- Worker B also writes a wrong character, to the row containing position 317 (or to an out-of-range position if the grid is smaller than that).
This cascade is self-amplifying. Every poisoned read creates more poisoned writes, which create more poisoned reads. Within a few iterations, the entire grid is corrupted.
The three intermediate values use a deterministic formula:

```
a = row % 16
b = col % 16
c = (16 - b) % 16
```

When all three come from the same grid position:

```
(a + b + c) % 16 = (row + col + 16 - col) % 16 = (row + 16) % 16 = row % 16
```
The col terms cancel. The result depends only on the row number. That
is why every character in a locked row is the same hex digit — it is the
row number mod 16.
When values come from different positions (the no-lock case), the col
terms do not cancel. The result is (row_x + col_y + 16 - col_z) % 16
where x, y, z are different grid positions. This produces essentially
random output.
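The cancellation can be checked exhaustively in shell for the default 16x32 grid:

```shell
# Verify (a + b + c) % 16 == row for every cell of a 16x32 grid.
ok=1
row=0
while [ "$row" -lt 16 ]; do
  col=0
  while [ "$col" -lt 32 ]; do
    a=$(( row % 16 )); b=$(( col % 16 )); c=$(( (16 - b) % 16 ))
    [ $(( (a + b + c) % 16 )) -eq "$row" ] || ok=0
    col=$(( col + 1 ))
  done
  row=$(( row + 1 ))
done
echo "identity holds: $ok"   # prints "identity holds: 1"
```

Swap in a `b` or `c` from a different column and the check fails whenever the two columns differ mod 16.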
When a row buffer reaches COLS characters, the worker needs to flush
it to the output file. But in a concurrent environment, multiple workers
might see the buffer reach that threshold simultaneously. Writing the
same row twice would produce duplicate output.
The demo solves this with mkdir:
```sh
if mkdir "$STATE_DIR/done_$row" 2>/dev/null; then
  # flush the row — we won the race
  :
fi
```

On POSIX systems, mkdir is atomic — exactly one caller succeeds when
multiple processes attempt to create the same directory simultaneously.
The others get an error. This provides a lock-free, race-free way to
claim a one-time action.
This pattern is useful beyond demos. Any time you need exactly-once
semantics in a shell script without external tooling, mkdir is a
reliable primitive.
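A self-contained sketch of the pattern: eight subshells race to claim the same one-time action, and mkdir guarantees exactly one wins (paths here are illustrative):

```shell
# Eight racing processes; mkdir lets exactly one claim the action.
STATE=$(mktemp -d)
for n in 1 2 3 4 5 6 7 8; do
  (
    if mkdir "$STATE/done_0" 2>/dev/null; then
      echo "worker $n won" >> "$STATE/log"   # only the winner writes
    fi
  ) &
done
wait
winners=$(wc -l < "$STATE/log" | tr -d ' ')
echo "$winners"   # always 1
```

Because only the single mkdir winner ever touches the log, no further locking is needed around the write.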
| Flag | Default | Effect |
|---|---|---|
| `--workers N` | 8 | More workers = more contention. Above ~16 you hit diminishing returns as workers mostly wait for each other. |
| `--rows N` | 16 | More rows = taller wall. Each row is independently filled, so this scales linearly. |
| `--cols N` | 32 | More columns = wider rows. Each cell is one lock acquisition, so wider rows increase total work. |
| `--no-lock` | off | Disables lokt guard. Workers run without coordination. |
| `--lock-name NAME` | `demo.hexwall` | Changes the lock name used by lokt guard. |
For a quick sanity check:

```sh
./lokt-hexwall-demo.sh --rows 4 --cols 32
```

For maximum visual impact:

```sh
./lokt-hexwall-demo.sh --rows 16 --cols 64 --workers 16
```

With locking, the output looks like this:

```
0 | 00000000000000000000000000000000
1 | 11111111111111111111111111111111
2 | 22222222222222222222222222222222
3 | 33333333333333333333333333333333
```
Every row is filled with a single repeating character — the row number
in hex. Row 0 is all `0`s, row 10 is all `a`s, row 15 is all `f`s. The
wall is uniform because every character was computed from consistent
intermediate values, protected by the lock.
Without locking, the output looks more like:

```
0 | 0a3f10b280c000f0a92d051e400e0000
1 | 011113a10d931110c141010b16110010
2 | 22822af602b0d3b2822a2b0622202222
3 | 333004333035a33233f333339ab05093
```
Rows contain a mix of characters. Some are correct (by chance), others are wrong. The pattern varies between runs because the corruption depends on scheduling — which workers happen to interleave in which order.
The more workers you run, the more chaotic the output. With 1 worker
and --no-lock, the output is actually correct (no concurrency, no
races). Add a second worker and corruption appears immediately.
The demo models a pattern found across computing:
Databases. A transaction reads a balance, computes interest, writes the new balance. Without isolation (the database equivalent of locking), a concurrent transaction sees the intermediate state and produces a wrong result. This is called a dirty read.
Map-reduce. Multiple reducers accumulate partial results into a shared counter. Without a lock, concurrent increments lose updates. A counter incremented 1000 times by 10 workers might end up at 980 instead of 1000. This is called a lost update.
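The lost-update fix has a direct shell form: serialize each read-modify-write with an exclusive lock. This sketch uses flock(1) as the mechanism (file names illustrative); 4 workers each perform 25 increments:

```shell
# 4 workers x 25 read-modify-write increments, each under flock(1).
DIR=$(mktemp -d)
printf '0' > "$DIR/counter"
for w in 1 2 3 4; do
  (
    j=0
    while [ "$j" -lt 25 ]; do
      (
        flock -x 9                           # serialize the increment
        n=$(cat "$DIR/counter")
        printf '%s' "$(( n + 1 ))" > "$DIR/counter"
      ) 9> "$DIR/lock"
      j=$(( j + 1 ))
    done
  ) &
done
wait
cat "$DIR/counter"   # 100 every time; without the lock, often less
```

Delete the `flock -x 9` line and reruns will intermittently lose increments, exactly the lost-update failure described above.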
CI/CD pipelines. Two pipelines deploy the same service
simultaneously. One writes config, the other writes the binary. The
service starts with mismatched config and binary — the equivalent of
reading work_a from one worker and work_b from another.
Shared memory. Multiple threads update a struct with several fields. Without a lock, one thread reads a struct where some fields were written by thread A and others by thread B. The struct is internally inconsistent — a torn read.
In every case, the fix is the same: serialize access to the shared state. A lock, a mutex, a transaction, a queue — the mechanism varies, but the principle is identical. Only one writer at a time.