feat(v2): init code + basic shred receiver by Sobeston · Pull Request #1204 · Syndica/sig

Sobeston · 2026-02-03T09:29:42Z

Working

- Child process initialisation
- Child process exit handling
  - fmt printing error return traces
  - fmt printing panic traces
- memory sharing
- high level of security
  - regions are shared only as needed, with write perms as needed
  - mseal to stop our shared regions from being modified later by anyone
  - closing all FDs, except for an optional stderr
  - seccomp to ban almost all syscalls
    - write syscalls only allowed on the single provided stderr FD
    - intending that these will be per-service once we have some
- segfault/signal handling
- a basic net service
- a basic shred receiver service

Example output:

$ zig build run -- config/testnet.zig.zon 
config: .{ .cluster = .testnet, .leader_schedule_file = { 115, 99, 104, 101, 100, 117, 108, 101, 46, 116, 120, 116 }, .gossip = .{ .port = 8001 }, .shred_network = .{ .recv_port = 8002 } }
Initialising: .net_pair
Initialised: Region `net_pair` shared with [ shred_receiver_0 (rw), net_0 (rw), ]
Initialising: .leader_schedule
Initialised: Region `leader_schedule` shared with [ shred_receiver_0 (ro), ]
Starting Service `shred_receiver_0`, pid: 975397
Starting Service `net_0`, pid: 975398
(net)binding 0.0.0.0:8002
Waiting for shreds on port 8002
slot: 442890464
erasure_set_index: 160
index: 162
shred_type: .code

slot: 442890464
erasure_set_index: 160
index: 169
shred_type: .data

slot: 442890464
erasure_set_index: 160
index: 173
shred_type: .data

What a minimal service looks like:

const std = @import("std");
const start = @import("start");

comptime {
    _ = start;
}

pub const name = "prng";
pub const panic = start.panic;

pub const ReadWrite = struct {
    prng_state: *std.Random.Xoroshiro128,
};

pub fn main(writer: *std.io.Writer, rw: ReadWrite) !noreturn {
    _ = writer;

    rw.prng_state.seed(123);
    while (true) rw.prng_state.seed(rw.prng_state.next());
}

codecov · 2026-02-03T09:51:52Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
see 17 files with indirect coverage changes

Sobeston · 2026-02-18T17:06:39Z

A couple of things:

- This is failing CI because I added a README.md, causing the doc step to fail. Not sure what's up with that.
- The v2-check step is failing as we're using zig 0.14.1 in CI; we can either update that in this PR or wait for "chore: upgrade zig to 0.15" (#1225).

kprotty · 2026-02-26T22:35:07Z

v2/build.zig

+    inline for (@import("src/services.zon")) |service_name| {
+        const service_mod = b.createModule(.{
+            .target = target,
+            .optimize = optimize,
+            .root_source_file = b.path("src/services").path(b, service_name ++ ".zig"),
+            .single_threaded = true,
+            .omit_frame_pointer = false,
+        });
+        service_mod.addImport("common", common);
+        service_mod.addImport("start", start_service);
+        service_mod.addImport("tracy", tracy);
+
+        const lib_svc = b.addLibrary(.{
+            .name = service_name,
+            .root_module = service_mod,
+            .use_llvm = true,
+        });
+        sig_init.linkLibrary(lib_svc);
+
+        const service_tests = b.addTest(.{ .root_module = service_mod, .name = service_name });
+        const service_tests_run = b.addRunArtifact(service_tests);
+        test_step.dependOn(&service_tests_run.step);
+    }
+
+    const validate_services_list_exe = b.addExecutable(.{
+        .name = "validate_services_list",
+        .root_module = b.createModule(.{
+            .root_source_file = b.path("scripts/validate_services_list.zig"),
+            .target = b.graph.host,
+            .optimize = .Debug,
+            .imports = &.{
+                .{
+                    .name = "services",
+                    .module = b.createModule(.{ .root_source_file = b.path("src/services.zon") }),
+                },
+            },
+        }),
+    });
+    const validate_services_list_run = b.addRunArtifact(validate_services_list_exe);
+    validate_services_list_run.addDirectoryArg(b.path("src/services"));
+    b.getInstallStep().dependOn(&validate_services_list_run.step);
+}


Could inline the service list here instead of using the zon file. That would remove the need for both the zon file and the validate_services_list script.

dnut · 2026-02-26T23:16:31Z

v2/src/common/shred.zig

+const SIZE_OF_MERKLE_ROOT: usize = Hash.SIZE;
+
+/// Analogous to [Shred](https://github.com/anza-xyz/agave/blob/8c5a33a81a0504fd25d0465bed35d153ff84819f/ledger/src/shred.rs#L245)
+pub const Shred = union(ShredType) {


Does this need to be in common? There's a good chance this won't be used outside the shred service. I'd rather keep things scoped in services and only move them into common when it's clearly needed by multiple services.

Keeping services lean and focused on the data structures and algorithms which facilitate their higher level logic is preferable. A specific data structure being used by only a single service is not a good justification for its implementation being defined within that service. For example we would not implement the zksdk inside the zk_elgamal program simply because that is the only place it is used at the moment. Instead the zk_elgamal program defines only its instruction type and execution function which together define its 'higher level logic' (i.e. what is it doing).

Let's consider a hypothetical data structure that defines a domain-specific concept which is only meaningful in the one very narrow context that is fully implemented by a single service. The concept has no relevance in any other context outside that service, and it never will. It's not general purpose code that is likely to be useful in any other context. In that case, it makes no sense for this data structure to exist in a common library. Can we agree on this much? My claim is that Shred follows this pattern.

dnut · 2026-02-26T23:20:57Z

v2/src/common/solana.zig

+test {
+    _ = std.testing.refAllDecls(@This());
+}


Is this going to be needed for every namespace that contains tests? I get that refAllDeclsRecursive isn't perfect, but I also don't like needing to add this to every file.

Also, I don't think you need to put this in a test {} block. This makes the test counts confusing because it adds another unit test for every instance of test {}. It's not a big problem but you could just put it in a comptime block instead to achieve the same thing. refAllDecls is a noop if it's not a test build.

Suggested change

-test {
-    _ = std.testing.refAllDecls(@This());
-}
+comptime {
+    std.testing.refAllDecls(@This());
+}

dnut · 2026-02-26T23:33:49Z

v2/src/common/solana/leader_schedule.zig

+const common = @import("../../common.zig");
+
+const std = @import("std");
+
+const Slot = common.solana.Slot;
+const Pubkey = common.solana.Pubkey;


Suggested change

-const common = @import("../../common.zig");
-
-const std = @import("std");
-
-const Slot = common.solana.Slot;
-const Pubkey = common.solana.Pubkey;
+const std = @import("std");
+
+const solana = @import("../solana.zig");
+
+const Slot = solana.Slot;
+const Pubkey = solana.Pubkey;

The code is more modular if it only reaches out into the nearest parent scope that's necessary to get its dependencies. We shouldn't make every file depend on the overall structure of common if it's not necessary. It'll be easier to refactor the code if things are more tightly scoped, and I don't see a downside to it.

dnut · 2026-02-26T23:37:17Z

v2/src/main.zig

+        defer diag.deinit(allocator);
+
+        break :cfg std.zon.parse.fromSlice(Config, allocator, cfg_str, &diag, .{}) catch |err| {
+            std.debug.print("{f}\n", .{diag});


should this be std.log?

dnut · 2026-02-26T23:55:20Z

v2/src/main.zig

+    const shared_regions: []const services.SharedRegion = &.{
+        .{
+            .region = .{ .net_pair = .{ .port = config.shred_network.recv_port } },
+            .shares = &.{
+                .{ .instance = .{ .service = .shred_receiver }, .rw = true },
+                .{ .instance = .{ .service = .net }, .rw = true },
+            },
+        },
+        .{
+            .region = .{ .leader_schedule = .{ .schedule_string = &reader.interface } },
+            .shares = &.{
+                .{ .instance = .{ .service = .shred_receiver } },
+            },
+        },
+    };


It's not in scope right now but I'm curious on your thoughts. It seems like the types and numbers of expected shares per service could actually be partly inferred from each serviceMain's parameters and validated at comptime against these shares.
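
To make the idea concrete, here is a minimal sketch of what a comptime check could look like. It is not the PR's design: `validateShares`, the `ReadWrite` stand-in, and the choice to match regions to parameters by field name are all assumptions made for illustration.

```zig
const std = @import("std");

// Hypothetical stand-in for a service's parameter struct.
const ReadWrite = struct {
    prng_state: *u64,
};

/// Hypothetical helper: fails compilation unless the number of region names
/// equals the number of fields in `Params` and every name is a field.
fn validateShares(comptime Params: type, comptime region_names: []const []const u8) void {
    comptime {
        const fields = std.meta.fields(Params);
        if (fields.len != region_names.len)
            @compileError("share count does not match " ++ @typeName(Params));
        for (region_names) |region_name| {
            if (!@hasField(Params, region_name))
                @compileError("no parameter for region " ++ region_name);
        }
    }
}

// Evaluated at compile time; a mismatch becomes a build error.
comptime {
    validateShares(ReadWrite, &.{"prng_state"});
}
```

A mismatch between the shares table and the parameter struct would then surface as a compile error rather than at service startup. A real version would presumably also compare pointer constness against the `.rw` flag.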

dnut · 2026-02-27T00:36:31Z

v2/src/services.zig

+    var status: u32 = 0;
+    const exited_pid: i32 = pid: {
+        const ret: usize = linux.waitpid(-1, &status, 0);
+        std.debug.assert(ret != -1);


this is a tautology for usize.

is this more meaningful?

Suggested change

-        std.debug.assert(ret != -1);
+        std.debug.assert(ret != std.math.maxInt(usize));

or should we be checking e(ret) != .SUCCESS?
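
A small sketch of why decoding the error is likely the stronger check, assuming `linux.waitpid` hands back the raw syscall value (as its `usize` return type suggests). Raw Linux syscalls report failure as a negated errno in the range -4095..-1, so `maxInt(usize)` corresponds to errno 1 only. `rawErrno` is a hypothetical helper written for this illustration, not a std API.

```zig
const std = @import("std");

/// Hypothetical helper mirroring the raw Linux syscall convention: failures
/// come back as -errno (-1 through -4095) reinterpreted as usize.
fn rawErrno(ret: usize) ?u12 {
    const signed: isize = @bitCast(ret);
    if (signed < 0 and signed > -4096) {
        const code: u12 = @intCast(-signed);
        return code;
    }
    return null;
}

test "maxInt(usize) matches only one errno value" {
    // -ECHILD (errno 10 on Linux), e.g. when there are no children to wait for.
    const echild: usize = @bitCast(@as(isize, -10));
    // The proposed assert would let this failure through...
    try std.testing.expect(echild != std.math.maxInt(usize));
    // ...while decoding the return value catches it.
    try std.testing.expectEqual(@as(?u12, 10), rawErrno(echild));
    // A plausible pid is not an error.
    try std.testing.expectEqual(@as(?u12, null), rawErrno(975397));
}
```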

dnut · 2026-02-27T00:42:27Z

v2/src/start_service.zig

+    pub var stderr: std.os.linux.fd_t = undefined;
+    pub var exit: *common.Exit = undefined;


is it safe to set these to undefined? what if something tries to use one of these before they are set?
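
One possible way to make early use fail loudly, sketched under the assumption that an optional is acceptable here: `null` marks "not set yet" and the accessor panics instead of reading an undefined value. The names `stderr_fd` and `stderrFd` are hypothetical and not taken from the PR.

```zig
const std = @import("std");

// Hypothetical alternative to `= undefined`: null means "not initialised".
var stderr_fd: ?std.os.linux.fd_t = null;

/// Returns the fd, or panics with a clear message if it was never set.
fn stderrFd() std.os.linux.fd_t {
    return stderr_fd orelse @panic("stderr used before initialisation");
}

test "reads succeed after initialisation" {
    stderr_fd = 2;
    try std.testing.expectEqual(@as(std.os.linux.fd_t, 2), stderrFd());
}
```

The trade-off is an extra branch on every access, which may or may not matter for globals that are set once during startup.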

dnut · 2026-02-27T00:46:44Z

v2/src/common/shred.zig

+    }
+
+    pub fn get(self: *const MerkleProofEntryList, index: usize) ?MerkleProofEntry {
+        if (index > self.len) return null;


pre-existing bug

Suggested change

-        if (index > self.len) return null;
+        if (index >= self.len) return null;
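
A minimal illustration of the off-by-one, using a hypothetical `SmallList` as a stand-in for `MerkleProofEntryList` (whose actual layout is not shown here). With `>` the call `get(len)` would read one slot past the last valid entry.

```zig
const std = @import("std");

// Hypothetical fixed-capacity list, standing in for MerkleProofEntryList.
const SmallList = struct {
    items: [4]u8 = [_]u8{0} ** 4,
    len: usize = 0,

    fn get(self: *const SmallList, index: usize) ?u8 {
        // Valid indices are 0..len-1, so index == len must be rejected too.
        if (index >= self.len) return null;
        return self.items[index];
    }
};

test "index equal to len is out of range" {
    var list: SmallList = .{};
    list.items[0] = 7;
    list.len = 1;
    try std.testing.expectEqual(@as(?u8, 7), list.get(0));
    try std.testing.expectEqual(@as(?u8, null), list.get(1));
}
```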

dnut · 2026-02-27T01:06:20Z

v2/src/common/linux/bpf.zig

+}
+
+const SECCOMP = std.os.linux.SECCOMP;
+const syscalls = std.os.linux.syscalls.X64;


Suggested change

-const syscalls = std.os.linux.syscalls.X64;
+const syscalls = std.os.linux.SYS;

dnut · 2026-02-27T01:10:26Z

v2/src/services.zig

+        if (std.os.linux.syscall3(.close_range, 0, @intCast(stderr - 1), 0) != 0)
+            std.debug.panic("close_range failed\n", .{});
+        if (std.os.linux.syscall3(.close_range, @intCast(stderr + 1), max_fd, 0) != 0)
+            std.debug.panic("close_range failed\n", .{});


not that i would expect stderr to actually reach numbers this high or low, but saturating is equally correct and theoretically safer.

Suggested change

-        if (std.os.linux.syscall3(.close_range, 0, @intCast(stderr - 1), 0) != 0)
-            std.debug.panic("close_range failed\n", .{});
-        if (std.os.linux.syscall3(.close_range, @intCast(stderr + 1), max_fd, 0) != 0)
-            std.debug.panic("close_range failed\n", .{});
+        if (std.os.linux.syscall3(.close_range, 0, @intCast(stderr -| 1), 0) != 0)
+            std.debug.panic("close_range failed\n", .{});
+        if (std.os.linux.syscall3(.close_range, @intCast(stderr +| 1), max_fd, 0) != 0)
+            std.debug.panic("close_range failed\n", .{});
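
For reference, a short sketch of what the saturating operators do, assuming `stderr` has the signed type `std.os.linux.fd_t`. They only change behaviour at the bounds of the type; for a signed fd, `0 -| 1` is still `-1`, so the `@intCast` to an unsigned argument would still need `stderr` to be at least 1.

```zig
const std = @import("std");

test "saturating operators clamp at the type's bounds" {
    const fd_t = std.os.linux.fd_t; // a signed 32-bit integer
    const lowest: fd_t = std.math.minInt(fd_t);
    const highest: fd_t = std.math.maxInt(fd_t);

    // At the extremes, -| and +| clamp instead of overflowing.
    try std.testing.expectEqual(lowest, lowest -| 1);
    try std.testing.expectEqual(highest, highest +| 1);

    // Away from the extremes they behave like ordinary - and +.
    const stderr: fd_t = 2;
    try std.testing.expectEqual(@as(fd_t, 1), stderr -| 1);
    try std.testing.expectEqual(@as(fd_t, 3), stderr +| 1);

    // Zero is not a bound of a signed type, so this still goes negative.
    try std.testing.expectEqual(@as(fd_t, -1), @as(fd_t, 0) -| 1);
}
```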

github-project-automation bot moved this to 🏗 In progress in Sig Feb 3, 2026

github-project-automation bot added this to Sig Feb 3, 2026

Sobeston self-assigned this Feb 3, 2026

Sobeston requested review from InKryption, Rexicon226 and kprotty February 3, 2026 09:30

Sobeston marked this pull request as ready for review February 5, 2026 06:45

Sobeston requested review from dnut, ultd and yewman as code owners February 5, 2026 06:45

dnut requested changes Feb 6, 2026

v2/src/services/logger.zig (outdated)

v2/src/services/net.zig (outdated)

v2/src/common/net.zig

v2/.gitignore (outdated)

v2/build.zig.zon

github-project-automation bot moved this from 🏗 In progress to 👀 In review in Sig Feb 6, 2026

Sobeston force-pushed the sobe/v2 branch from 14abdcf to da933aa Compare February 18, 2026 01:17

Sobeston changed the title ~~feat(v2): init code~~ feat(v2): init code + basic shred receiver Feb 18, 2026

Sobeston force-pushed the sobe/v2 branch 3 times, most recently from f2abeda to eba600b Compare February 18, 2026 04:58

Sobeston requested a review from dnut February 18, 2026 17:04

Sobeston requested a review from jbuckmccready February 18, 2026 17:10

kprotty reviewed Feb 19, 2026

InKryption requested changes Feb 19, 2026

Sobeston requested review from InKryption and kprotty February 20, 2026 19:33

yewman reviewed Feb 23, 2026

v2/src/common/shred.zig

v2/src/common/shred/shred_type.zig (outdated)

Sobeston and others added 4 commits February 27, 2026 01:07

init v2

9cab52c

style

integrate networking tile / ping demo

beac9dc

add segfault + seccomp violation handling

a7041dc

get seccomp violation traces

make service startup generic + clean up main

1099218

refactor remove unused style fix typo main: use gpa

Sobeston added 16 commits February 27, 2026 01:07

remove allocators from shred types

68019e2

tracy: expose no_exit and on_demand

190f68c

add some zones / remove redundant compute

f9891b8

services: use std.logger

c31d314

GPA -> DebugAllocator

7f555a8

shred: cleanup

3c2b8de

fix outdated comment

3ed742e

fix: tracy set threadname before seccomp enabled

f9b5f20

remove snake_case errors

91961e4

fix incorrect Hash comment

797ab9f

fix Pubkey.initRandom

f4ac30e

ring: clean up magic

404a77c

address misc comments + banish boundedarray

275ec94

LeaderSchedule.fromCommand cleanup

9c8bd89

update fmt api + update style

ceecc3c

further specify net socket open

5e150ac

Sobeston force-pushed the sobe/v2 branch from 8f39369 to 5e150ac Compare February 26, 2026 17:08

Sobeston and others added 6 commits February 27, 2026 01:24

copy v2 readme to docusaurus

4321fe0

add style check for v2 / fix checks

28d8af6

remove function that copies out of ring

7479d3b

remove extra shred files

14ce530

v2 CI: use zig build ci step

17c3e40

Move dir iter out of build.zig into a subprocess

9a5c471

Sobeston requested a review from yewman February 26, 2026 21:52

kprotty reviewed Feb 26, 2026

dnut requested changes Feb 27, 2026

This was linked to issues Feb 27, 2026

feat: v2 sig init #1195

Open

feat: networking service #1184

Open

feat(shred): v2 service #1187

Open

feat: v2 IPC #1185

Open

7 participants
