zukov

distributed task runner for bun with multi-node compute, inspired by erlang/otp.

usage

import { ZukovRuntime, RestartStrategy } from "zukov";

const runtime = new ZukovRuntime("node-1");
await runtime.start();

// spawn a process
const pid = await runtime.spawn({
  init: () => {
    console.log("process started");
  },
  handleMessage: (msg) => {
    if (msg === "ping") {
      console.log("received ping");
    } else if (typeof msg === "object" && msg !== null) {
      const m = msg as { type: string; data?: unknown };
      if (m.type === "task") {
        console.log("processing task:", m.data);
      }
    }
  },
});

// send messages
runtime.send(pid, "ping");
runtime.send(pid, { type: "task", data: "compute" });

// request-reply pattern
const reply = await runtime.sendAndWait(pid, { type: "request" }, 5000);
console.log("got reply:", reply);

// multi-node: listen for connections
await runtime.listen(8080);

// connect to another node
runtime.addNode({
  id: "node-2",
  address: "localhost",
  port: 8081,
});
await runtime.connectToNode("node-2");

// send to remote process (format: "node-id:local-pid")
const remotePid = "node-2:1";
runtime.send(remotePid, "hello from node-1");

console.log("process count:", runtime.getProcessCount());
console.log("node id:", runtime.getNodeId());

await runtime.stop();

concepts

processes are isolated units of computation. they communicate via message passing. each process has a unique PID and can send messages to any PID, including processes on remote nodes.

multi-node: start a runtime with listen(port) to accept connections. add nodes with addNode({ id, address, port }) then connect with connectToNode(nodeId). processes can send messages across nodes transparently using the node-id:local-pid format.

supervisors manage process lifecycles and restart strategies.

process linking & monitoring

processes can be linked so that when one exits, the other is notified. this is the foundation of fault tolerance.

// link two processes (bidirectional)
runtime.link(pid1, pid2);
runtime.unlink(pid1, pid2);

// monitor a process (unidirectional, returns a ref)
const ref = runtime.monitor(pid1, pid2);
runtime.demonitor(ref, pid1, pid2);

exit messages are delivered as { type: "exit", fromPid, toPid, reason } to linked processes. down messages are delivered as { type: "down", ref, fromPid, toPid, reason } to monitoring processes.

supervisors

supervisors manage child process lifecycles with three restart strategies:

OneForOne — only restart the failed child
OneForAll — restart all children when any one fails
RestForOne — restart the failed child and all children started after it

import { RestartStrategy } from "zukov";

const supervisor = runtime.createSupervisor(RestartStrategy.OneForOne);

await supervisor.startChildren([
  { id: "worker-1", spec: { handleMessage: () => {} }, restart: "permanent" },
  { id: "worker-2", spec: { handleMessage: () => {} }, restart: "transient" },
  { id: "temp-task", spec: { handleMessage: () => {} }, restart: "temporary" },
]);

// restart policies:
// "permanent" — always restart on exit
// "transient" — restart only on abnormal exit (reason !== "normal")
// "temporary" — never restart, remove from supervision

supervisors also enforce restart intensity limits (max 5 restarts per 60s by default).

distributed task scheduling

simple scheduler

// submit tasks
runtime.submitTask({ payload: { compute: "fibonacci", n: 42 } });

// start a worker that processes tasks
runtime.startTaskWorker(async (task) => {
  console.log("processing:", task.payload);
  return { result: 42 };
});

// listen for completions
runtime.onTaskComplete((task) => {
  console.log("task done:", task.result);
});

runtime.onTaskFailed((task) => {
  console.error("task failed:", task.error);
});

worker pool

const pool = runtime.createWorkerPool(
  async (task) => {
    // process task
    return { result: "done" };
  },
  4, // initial worker count
);

await pool.submitTask({ type: "compute", data: "heavy" });
await pool.scaleUp(2);  // add 2 more workers
await pool.scaleDown(1); // remove 1 idle worker

pool.onTaskComplete((result) => {
  console.log("completed:", result.taskId);
});

await pool.stop();

connection reconnection

configure automatic reconnection for dropped connections:

runtime.setReconnectConfig({
  maxRetries: 10,       // -1 for infinite retries
  maxDelay: 30000,      // max backoff delay in ms
  initialDelay: 1000,   // initial backoff delay in ms
});

connections use exponential backoff. when a connection drops, the reconnector schedules retries automatically.

authentication & encryption

shared secret handshake

runtime.setAuthConfig({
  secret: "my-shared-secret",
});

all connections require HMAC-SHA256 authentication before exchanging messages. unauthenticated connections are rejected.

TLS

runtime.setAuthConfig({
  secret: "my-shared-secret",
  tls: {
    cert: fs.readFileSync("cert.pem", "utf-8"),
    key: fs.readFileSync("key.pem", "utf-8"),
    rejectUnauthorized: true,
  },
});

TLS encrypts all traffic. set rejectUnauthorized: false for self-signed certs in development.

api reference

ZukovRuntime

method	description
`start()`	start the runtime
`stop()`	stop the runtime and terminate all processes
`spawn(spec)`	create a new process, returns PID
`send(pid, message)`	fire-and-forget message send
`sendAndWait(pid, message, timeout)`	request-reply pattern
`sendReply(to, correlationId, payload)`	respond to a request
`link(pid1, pid2)`	link two processes
`unlink(pid1, pid2)`	unlink two processes
`monitor(fromPid, toPid)`	monitor a process, returns ref
`demonitor(ref, fromPid, toPid)`	stop monitoring
`createSupervisor(strategy)`	create a supervisor
`listen(port)`	accept incoming connections
`addNode(config)`	register a known remote node
`connectToNode(nodeId)`	connect to a remote node
`disconnectFromNode(nodeId)`	disconnect from a remote node
`setReconnectConfig(config)`	configure reconnection behavior
`setAuthConfig(config)`	configure auth and TLS
`submitTask(spec)`	submit a task to the scheduler
`startTaskWorker(handler)`	start a task processing worker
`onTaskComplete(callback)`	register task completion handler
`onTaskFailed(callback)`	register task failure handler
`createWorkerPool(handler, size)`	create a managed worker pool
`getNodeId()`	get this node's ID
`getProcessCount()`	get number of running processes

see examples/ for more.

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
examples		examples
src		src
tests		tests
.gitignore		.gitignore
README.md		README.md
bun.lock		bun.lock
package.json		package.json
tsconfig.json		tsconfig.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

zukov

usage

concepts

process linking & monitoring

supervisors

distributed task scheduling

simple scheduler

worker pool

connection reconnection

authentication & encryption

shared secret handshake

TLS

api reference

ZukovRuntime

About

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

zukov

usage

concepts

process linking & monitoring

supervisors

distributed task scheduling

simple scheduler

worker pool

connection reconnection

authentication & encryption

shared secret handshake

TLS

api reference

ZukovRuntime

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Contributors

Uh oh!

Languages