feat+chore: refactor Azure UserRoleAssignments post-processing logic #2396
Conversation
* More aggressive caching of bitmaps following pprof top10 guidance for hotspots
* Break out components of old analysis tools into bespoke `post` package
* Author better async tooling for post-processed relationships in `sink.go`
* Author graph entity tracking via xxhash tooling to allow `sink.go` to filter only changes to the database (see the sketch after this list)
* Remove UserRoleAssignment post-processed edges from legacy `DeleteTransitEdges` function
* Update interfaces, add documentation
* Added usage of `ImmutableDuplex[T uint32 | uint64]` to prevent mutation of cached bitmaps
* Removed old code no longer used
* Added metrics tooling to make global registration of new metrics easier
* Added more trace logging and metric features
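For context on the xxhash tracking bullet above, here is a minimal, hypothetical sketch of collapsing an edge's identity into a single `uint64` key. The function name `edgeKey`, the choice of `github.com/cespare/xxhash/v2`, and the exact byte encoding are illustrative assumptions, not this PR's actual implementation:

```go
package main

import (
	"encoding/binary"
	"fmt"

	"github.com/cespare/xxhash/v2"
)

// edgeKey hashes (start, end, kind) into one uint64 so a run's observed
// edges can be kept in a compact set and diffed against the previous run.
func edgeKey(start, end uint64, kind string) uint64 {
	var buf [16]byte

	binary.BigEndian.PutUint64(buf[:8], start)
	binary.BigEndian.PutUint64(buf[8:], end)

	digest := xxhash.New()
	digest.Write(buf[:])
	digest.WriteString(kind)

	return digest.Sum64()
}

func main() {
	fmt.Println(edgeKey(1, 2, "AZAddMembers"))
}
```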
Howdy! Thank you for opening this pull request 🙇 It looks like your pull request title needs some adjustment.
📝 Walkthrough

This PR refactors the post-processing infrastructure by consolidating stats types and relationship jobs into a new `post` package.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Analysis Client
    participant Sink as FilteredRelationshipSink
    participant Filter as Delta Filter Worker
    participant Insert as Insert Worker
    participant DB as Graph Database
    participant Delta as Delta Tracker

    Client->>Sink: Submit(EnsureRelationshipJob)
    activate Sink
    Sink->>Sink: jobC <- job
    Sink->>Filter: (concurrent) read from jobC
    activate Filter
    Filter->>Delta: HasEdge(start, end, kind)
    Delta-->>Filter: bool (seen/unseen)
    alt Edge not previously tracked
        Filter->>Insert: send to insertC
    else Edge already tracked
        Filter->>Sink: metrics.filtered++
    end
    deactivate Filter
    activate Insert
    Insert->>Insert: batch accumulate relationships
    Insert->>DB: batch.CreateRelationshipByIDs(...)
    DB-->>Insert: success/error
    Insert->>Sink: stats.RelationshipsCreated[kind]++
    deactivate Insert
    Client->>Sink: Done()
    Sink->>Sink: close jobC, wait workers
    Sink->>Delta: Deleted() → missing edge IDs
    Sink->>DB: batch delete missing edges
    DB-->>Sink: deleted count
    Sink->>Sink: stats.RelationshipsDeleted[kind]++
    Sink-->>Client: Stats()
    deactivate Sink
```
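Translating the diagram into client code, the intended call pattern is roughly the sketch below. `NewFilteredRelationshipSink` and the surrounding edge types are assumptions inferred from the diagram; only `Submit`, `Done`, and `Stats` appear in it explicitly:

```go
// Hypothetical client-side flow for the filtered sink described above.
sink := post.NewFilteredRelationshipSink(ctx, db, "AzureUserRoleAssignments") // assumed constructor

// Producers submit candidate edges; the sink's filter workers drop edges
// the delta tracker has already seen, and insert workers batch the rest.
for _, edge := range candidateEdges {
	sink.Submit(ctx, post.EnsureRelationshipJob{
		FromID: edge.FromID,
		ToID:   edge.ToID,
		Kind:   edge.Kind,
	})
}

// Done closes the job channel, waits for the workers, then deletes edges
// the tracker did not observe this run.
sink.Done()

sink.Stats().LogStats()
```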
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~90 minutes
🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed

❌ Failed checks (1 warning, 1 inconclusive)

✅ Passed checks (1 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/go/analysis/post.go (1)
81-85: ⚠️ Potential issue | 🟡 Minor

Pre-existing bug: `LogStats` early-return logic is inverted.

The comment says "Only output stats during debug runs," but the code returns early when debug is accepted — meaning stats are never logged during debug runs, only when debug is not accepted. This is the opposite of the stated intent.

```diff
 func (s PostProcessingStats) LogStats() {
 	// Only output stats during debug runs
-	if level.GlobalAccepts(slog.LevelDebug) {
+	if !level.GlobalAccepts(slog.LevelDebug) {
 		return
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/go/analysis/post.go` around lines 81 - 85, The early-return in PostProcessingStats.LogStats is inverted: it currently returns when debug logging is enabled; change the condition so the method returns unless debug is enabled (e.g., use the negated check on level.GlobalAccepts or swap branches) so that stats are only skipped when debug is not accepted and logged when debug is enabled; update the if in LogStats accordingly.
🧹 Nitpick comments (12)
packages/go/trace/trace.go (2)
34-47: `case any` is equivalent to `default` — prefer `default` for clarity.

In a Go type switch, `case any` matches every type since all types satisfy the `any` (`interface{}`) constraint. Using `default` makes the intent explicit.

♻️ Suggested simplification

```diff
 func combineArgs(args ...any) []any {
 	var all []any

 	for _, arg := range args {
 		switch typedArg := arg.(type) {
 		case []any:
 			all = append(all, typedArg...)
-		case any:
+		default:
 			all = append(all, typedArg)
 		}
 	}

 	return all
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/go/trace/trace.go` around lines 34 - 47, The type switch in combineArgs uses "case any" which is equivalent to default; replace the "case any" arm with a "default" arm to make intent explicit and clearer, keeping the behavior of appending the single typedArg to all unchanged; update the switch in the combineArgs function accordingly.
115-119: Using a Prometheus `Counter` to accumulate elapsed seconds — consider a `Histogram` instead.

`metrics.Counter(...).Add(elapsed.Seconds())` accumulates total seconds, which works but loses individual call distribution data. A `Histogram` is the idiomatic Prometheus instrument for latency tracking, giving you percentiles, count, and sum in one metric. If simplicity is the goal here, the current approach is valid — just flagging for awareness.
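As a reference point only, a histogram-based variant built directly on the Prometheus client library could look like the sketch below; whether the local `metrics` package exposes an equivalent helper is an open question, so the metric name and namespace here are placeholders:

```go
var functionLatency = prometheus.NewHistogramVec(prometheus.HistogramOpts{
	Namespace: "placeholder_namespace", // assumption: would come from traceCtx.Namespace
	Name:      "function_trace_seconds",
	Buckets:   prometheus.DefBuckets,
}, []string{"fn"})

// On trace exit, record the individual call latency instead of adding
// elapsed seconds to a running counter total.
functionLatency.WithLabelValues(function).Observe(elapsed.Seconds())
```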
🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/go/trace/trace.go` around lines 115 - 119, The current code uses metrics.Counter(...).Add(elapsed.Seconds()) inside the hasTraceCtx branch which accumulates total seconds but loses latency distribution; replace this with a Histogram (or Summary) metric to record per-call latency for percentiles and better observability: locate the block guarded by hasTraceCtx and the call to metrics.Counter("function_trace", traceCtx.Namespace, map[string]string{"fn": function}).Add(elapsed.Seconds()) and change it to use the library's Histogram recorder (e.g., metrics.Histogram("function_trace", traceCtx.Namespace, ...) or equivalent) and observe/record elapsed.Seconds() on that histogram, preserving the same labels (fn and traceCtx.Namespace).
packages/go/analysis/ad/post.go (1)

51-51: Returning uninitialized `AtomicPostProcessingStats` on error paths has nil maps and nil mutex.

On error returns (e.g., line 51: `return &post.AtomicPostProcessingStats{}, err`), the struct has nil maps and a nil mutex. If a caller inadvertently calls `Merge` or `AddRelationshipsCreated` on this value without checking the error first, it will panic. Consider using `post.NewAtomicPostProcessingStats()` for defensive initialization, or document that the returned stats must not be used when err is non-nil.

This is a pre-existing pattern carried forward, so not urgent.
Also applies to: 95-95, 137-137, 183-183
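For illustration, a defensive constructor along the lines the comment suggests might look like the sketch below; the field names come from this review, but the exact struct layout and map key type are assumptions:

```go
// NewAtomicPostProcessingStats returns a stats value with its maps and
// mutex initialized, so Merge/AddRelationshipsCreated are safe to call
// even on an error-path return value.
func NewAtomicPostProcessingStats() AtomicPostProcessingStats {
	return AtomicPostProcessingStats{
		RelationshipsCreated: map[graph.Kind]*int32{},
		RelationshipsDeleted: map[graph.Kind]*int32{},
		mutex:                &sync.Mutex{},
	}
}
```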
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/go/analysis/ad/post.go` at line 51, Replace all error-path returns that create a bare &post.AtomicPostProcessingStats{} with a defensive initializer post.NewAtomicPostProcessingStats() so the returned struct has non-nil maps and an initialized mutex; specifically update the returns at the locations producing "&post.AtomicPostProcessingStats{}, err" (also seen at the other occurrences) to "post.NewAtomicPostProcessingStats(), err". This ensures callers can safely call methods like Merge or AddRelationshipsCreated on the returned stats even if err is non-nil.

packages/go/analysis/post/stats.go (1)
79-98: `Merge` aliases counter pointers from `other` — mutations to either side will affect both.

On line 85 (and 93), when a key is absent in `s`, the raw `*int32` pointer from `other` is stored directly. After the merge, `s` and `other` share the same counter, so subsequent `AddRelationshipsCreated`/`Deleted` calls on either side will silently mutate the other. If `other` is always discarded after merge this is safe, but a defensive copy avoids subtle bugs.

♻️ Copy the value instead of sharing the pointer

```diff
 for key, value := range other.RelationshipsCreated {
 	if val, ok := s.RelationshipsCreated[key]; !ok {
-		s.RelationshipsCreated[key] = value
+		copied := *value
+		s.RelationshipsCreated[key] = &copied
 	} else {
 		atomic.AddInt32(val, *value)
 	}
 }

 for key, value := range other.RelationshipsDeleted {
 	if val, ok := s.RelationshipsDeleted[key]; !ok {
-		s.RelationshipsDeleted[key] = value
+		copied := *value
+		s.RelationshipsDeleted[key] = &copied
 	} else {
 		atomic.AddInt32(val, *value)
 	}
 }
```
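The aliasing hazard is easy to reproduce in isolation; this self-contained demo (unrelated to the PR's types) shows two maps silently sharing one counter after a pointer-copying merge:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

func main() {
	src := map[string]*int32{"edges": new(int32)}
	dst := map[string]*int32{}

	// Merge by copying the pointer, as the current Merge does.
	dst["edges"] = src["edges"]

	// Incrementing through dst mutates src as well.
	atomic.AddInt32(dst["edges"], 5)
	fmt.Println(*src["edges"]) // prints 5: both maps share one counter
}
```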
🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/go/analysis/post/stats.go` around lines 79 - 98, The Merge method currently stores raw *int32 pointers from other into s (in AtomicPostProcessingStats.RelationshipsCreated and RelationshipsDeleted), causing shared counters; instead, when a key is absent, allocate a new int32, initialize it with the current value read via atomic.LoadInt32(otherValue) and store that new pointer into s so counters are independent; keep using atomic.AddInt32 for the existing-key path. Target the Merge function and the maps RelationshipsCreated / RelationshipsDeleted and use atomic.LoadInt32 to copy values rather than assigning other's pointer.

packages/go/bhlog/attr/attr.go (1)
51-73: Add doc comments to new exported functions for consistency.

The existing functions in this file (`Error`, `Namespace`, `Scope`, `Function`) all have doc comments describing their purpose and the log field they populate. The six new exported functions (`Operation`, `Enter`, `Exit`, `Elapsed`, `ElapsedSince`, `Measurement`) lack doc comments, breaking the established pattern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/go/bhlog/attr/attr.go` around lines 51 - 73, Add missing doc comments for the six exported functions so they match the existing pattern (describe purpose and the log field they populate); add a short comment above Operation explaining it returns a slog.Attr for the "operation" field, above Enter and Exit explaining they return slog.Attr for the "state" field with values "enter" and "exit", above Elapsed explaining it returns a slog.Attr for the "elapsed" duration field, above ElapsedSince explaining it computes elapsed since the given time and returns the same "elapsed" field, and above Measurement explaining it returns a slog.Attr for the "measurement" uint64 id field; ensure comments are formatted as Go doc comments immediately preceding the functions Operation, Enter, Exit, Elapsed, ElapsedSince, and Measurement.
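For illustration, doc comments following the file's existing pattern might read as below. The function bodies and signatures are assumptions inferred from the field names in the prompt, not the actual code:

```go
// Operation returns a slog.Attr that populates the "operation" log field.
func Operation(name string) slog.Attr {
	return slog.String("operation", name)
}

// Enter returns a slog.Attr that sets the "state" log field to "enter".
func Enter() slog.Attr {
	return slog.String("state", "enter")
}

// ElapsedSince returns a slog.Attr that populates the "elapsed" duration
// field with the time elapsed since the given start time.
func ElapsedSince(start time.Time) slog.Attr {
	return slog.Duration("elapsed", time.Since(start))
}
```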
packages/go/metrics/registry.go (2)

163-164: `Unregister` silently discards the `bool` return value.

`prometheus.Registry.Unregister` returns `bool` indicating whether the collector was found. The wrapper drops this, which may confuse callers who need to know if un-registration succeeded.

Proposed fix

```diff
-func Unregister(collector prometheus.Collector) {
-	Registerer().Unregister(collector)
+func Unregister(collector prometheus.Collector) bool {
+	return Registerer().Unregister(collector)
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/go/metrics/registry.go` around lines 163 - 164, The Unregister wrapper currently discards the bool returned by Registerer().Unregister(collector); change the Unregister function signature to return bool and propagate the underlying return value so callers can observe whether the collector was actually removed. Update the function named Unregister to return the bool from Registerer().Unregister(collector) (references: Unregister, Registerer(), prometheus.Collector) and adjust any call sites/tests that expect the old void signature to handle the returned bool.
151-157: `NewCounter`/`NewGauge` bypass the dedup cache — risk of duplicate-registration panic.

These helpers register directly via `promauto.With()` without consulting the `counters`/`gauges` maps. If the same metric opts are ever passed to both the cached (`Counter()`/`Gauge()`) and uncached (`NewCounter()`/`NewGauge()`) paths, `promauto` will panic on the duplicate. Consider either routing these through the cache or documenting the mutual-exclusivity contract.
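Routing the uncached helper through the cached path could be as simple as the sketch below, assuming `NewCounter` and `Counter` share the `(name, namespace, constLabels)` signature seen elsewhere in this review:

```go
// NewCounter delegates to the cached Counter so repeated registration of
// identical opts returns the existing collector instead of panicking.
func NewCounter(name, namespace string, constLabels map[string]string) prometheus.Counter {
	return Counter(name, namespace, constLabels)
}
```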
🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/go/metrics/registry.go` around lines 151 - 157, NewCounter and NewGauge currently register metrics directly via promauto.With(Registerer()) and bypass the deduplication maps (counters/gauges), risking prometheus duplicate-registration panics if the same opts are used elsewhere; change these helpers to route through the existing cached factory functions (call Counter(opts) and Gauge(opts) respectively) so metric registration is unified via the counters/gauges maps and Registerer(), or if truly separate semantics are required, implement the same lookup/insert logic against the counters/gauges maps used by Counter()/Gauge() and/or document the mutual-exclusivity contract clearly.

packages/go/analysis/post_operation.go (1)
42-69: Refactored writer with conditional property cloning looks correct.

The base `relProp` is created once and reused for jobs without custom properties, with `Clone()` used when `RelProperties` is non-empty. This avoids unnecessary allocations. The trace instrumentation is properly deferred.

One minor readability nit: `relProp` (singular, the template) vs `relProps` (plural, the per-job instance) differ by a single `s` and are easy to confuse. Consider e.g. `baseRelProps`/`relProps` or `relPropTemplate`/`relProps`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/go/analysis/post_operation.go` around lines 42 - 69, The variable names relProp and relProps in NewPostRelationshipOperation are easy to confuse; rename the template variable (currently relProp returned from NewPropertiesWithLastSeen) to something clearer like baseRelProps or relPropTemplate and keep the per-job instance as relProps so intent is obvious; update all references including the Clone() call, Set() calls when copying nextJob.RelProperties, and the CreateRelationshipByIDs call to use the new template name to improve readability without changing logic.

packages/go/analysis/post/sink.go (1)
182-190: `workerID` is captured but never used.

The `workerID` parameter in the goroutine closure (Line 186) is passed through but `deltaFilterWorker` doesn't accept or use it. This is dead code from the FIXME heuristic. Consider removing it or adding it to trace/log output for debugging concurrent filter workers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/go/analysis/post/sink.go` around lines 182 - 190, The loop captures and passes workerID into the goroutine closure but deltaFilterWorker doesn't use it, so remove the unused workerID capture: change the goroutine to use go func() { defer filterWG.Done(); s.deltaFilterWorker(ctx, filterC, insertC) }() and eliminate the workerID parameter in the closure (and any dead assignment); alternatively, if you want an identifier for tracing, add an int parameter to deltaFilterWorker (e.g., workerID) and propagate it from the loop, update its signature and any call sites, and emit a trace/log using that ID inside deltaFilterWorker; reference symbols: runtime.NumCPU, workerID, filterWG, filterC, insertC, s.deltaFilterWorker.

packages/go/analysis/delta/tracker.go (1)
98-115: Consider always returning the encoder to the pool.

If `sync.Pool` returns nil or a wrong type, the newly created encoder isn't put back, which defeats pooling for that call.

♻️ Suggested pooling fix

```diff
 func (s *KeyEncoderPool) EdgeKey(start, end uint64, kind graph.Kind) uint64 {
-	var (
-		raw             = s.encoders.Get()
-		encoder, typeOK = raw.(*KeyEncoder)
-	)
-
-	if !typeOK {
-		encoder = NewKeyEncoder()
-	}
-
-	key := encoder.EdgeKey(start, end, kind)
-
-	if typeOK {
-		s.encoders.Put(raw)
-	}
-
-	return key
+	raw := s.encoders.Get()
+	encoder, ok := raw.(*KeyEncoder)
+	if !ok || encoder == nil {
+		encoder = NewKeyEncoder()
+	}
+
+	key := encoder.EdgeKey(start, end, kind)
+	s.encoders.Put(encoder)
+	return key
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/go/analysis/delta/tracker.go` around lines 98 - 115, The EdgeKey method in KeyEncoderPool may fail to return a freshly created encoder to s.encoders when sync.Pool.Get() yields nil or a wrong type; update KeyEncoderPool.EdgeKey so that regardless of whether raw was a *KeyEncoder or NewKeyEncoder() was used, the encoder instance is Put back into s.encoders after computing key (use the same encoder variable), and ensure you Put(raw) only when raw was the original pooled object (typeOK) or otherwise Put the newly created encoder to preserve pooling behavior; reference KeyEncoderPool.EdgeKey, s.encoders, raw, encoder, typeOK, KeyEncoder, and NewKeyEncoder.

packages/go/analysis/azure/post.go (2)
880-924: Duplicated `isRoleAssignable` property lookup and error handling.

The `isRoleAssignable` property is fetched from the same `tenantGroup` node twice (lines 880 and 903) with identical error handling. The only difference between the two blocks is calling `UsersWithRole` vs `ServicePrincipalsWithRole`. This can be consolidated into a single property read.

Proposed refactor

```diff
-	if isRoleAssignable, err := tenantGroup.Properties.Get(azure.IsAssignableToRole.String()).Bool(); err != nil {
-		if graph.IsErrPropertyNotFound(err) {
-			slog.WarnContext(
-				ctx,
-				"Node is missing property",
-				slog.Uint64("node_id", tenantGroup.ID.Uint64()),
-				slog.String("property", azure.IsAssignableToRole.String()),
-			)
-		} else {
-			return err
-		}
-	} else if !isRoleAssignable {
-		roleAssignments.UsersWithRole(AddMemberGroupNotRoleAssignableTargetRoles()...).Each(func(nextID uint64) bool {
-			nextJob := post.EnsureRelationshipJob{
-				FromID: graph.ID(nextID),
-				ToID:   tenantGroupID,
-				Kind:   azure.AddMembers,
-			}
-
-			return sink.Submit(ctx, nextJob)
-		})
-	}
-
-	if isRoleAssignable, err := tenantGroup.Properties.Get(azure.IsAssignableToRole.String()).Bool(); err != nil {
-		if graph.IsErrPropertyNotFound(err) {
-			slog.WarnContext(
-				ctx,
-				"Node is missing property",
-				slog.Uint64("node_id", tenantGroup.ID.Uint64()),
-				slog.String("property", azure.IsAssignableToRole.String()),
-			)
-		} else {
-			return err
-		}
-	} else if !isRoleAssignable {
-		roleAssignments.ServicePrincipalsWithRole(AddMemberGroupNotRoleAssignableTargetRoles()...).Each(func(nextID uint64) bool {
-			nextJob := post.EnsureRelationshipJob{
-				FromID: graph.ID(nextID),
-				ToID:   tenantGroupID,
-				Kind:   azure.AddMembers,
-			}
-
-			return sink.Submit(ctx, nextJob)
-		})
-	}
+	if isRoleAssignable, err := tenantGroup.Properties.Get(azure.IsAssignableToRole.String()).Bool(); err != nil {
+		if graph.IsErrPropertyNotFound(err) {
+			slog.WarnContext(
+				ctx,
+				"Node is missing property",
+				slog.Uint64("node_id", tenantGroup.ID.Uint64()),
+				slog.String("property", azure.IsAssignableToRole.String()),
+			)
+		} else {
+			return err
+		}
+	} else if !isRoleAssignable {
+		submitAddMember := func(nextID uint64) bool {
+			nextJob := post.EnsureRelationshipJob{
+				FromID: graph.ID(nextID),
+				ToID:   tenantGroupID,
+				Kind:   azure.AddMembers,
+			}
+			return sink.Submit(ctx, nextJob)
+		}
+
+		roleAssignments.UsersWithRole(AddMemberGroupNotRoleAssignableTargetRoles()...).Each(submitAddMember)
+		roleAssignments.ServicePrincipalsWithRole(AddMemberGroupNotRoleAssignableTargetRoles()...).Each(submitAddMember)
+	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/go/analysis/azure/post.go` around lines 880 - 924, Read azure.IsAssignableToRole once and handle its error in a single block: call tenantGroup.Properties.Get(azure.IsAssignableToRole.String()).Bool() into isRoleAssignable, check graph.IsErrPropertyNotFound and return on other errors, then if !isRoleAssignable run both roleAssignments.UsersWithRole(...).Each(...) and roleAssignments.ServicePrincipalsWithRole(...).Each(...) to submit post.EnsureRelationshipJob via sink.Submit; remove the duplicated property-read/error-handling block so only one check for isRoleAssignable exists.
938-969: Ignored error return from `postAzureGlobalAdmins`.

Line 957 discards the `error` return from `postAzureGlobalAdmins`. Although the current implementation always returns `nil`, silently ignoring a returned error is fragile — if the function body is later updated to surface a real error, this call site will swallow it.

Consider handling it consistently with the other calls in this block (e.g., `postAzureResetPassword` on line 954 and `postAzureAddMembers` on line 961).

Proposed fix

```diff
-	postAzureGlobalAdmins(ctx, sink, roleAssignments, tenant)
+	if err := postAzureGlobalAdmins(ctx, sink, roleAssignments, tenant); err != nil {
+		return &post.AtomicPostProcessingStats{}, err
+	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/go/analysis/azure/post.go` around lines 938 - 969, The call to postAzureGlobalAdmins is currently ignoring its error return; update the call in UserRoleAssignments to capture and handle the error consistently with the surrounding calls (e.g., check the returned error from postAzureGlobalAdmins and either return it like postAzureResetPassword does or log it like postAzureAddMembers does), so replace the bare call to postAzureGlobalAdmins(ctx, sink, roleAssignments, tenant) with error handling that uses the returned error value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cmd/api/src/analysis/ad/post.go`:
- Around line 16-30: Remove the corrupted duplicated license block that was
introduced by the bad find-and-replace (the block containing the garbled tokens
like "post.CreatePostRelationshipJob", e.g., "Copyright
2post.CreatePostRelationshipJob23" and
"Apache-2.post.CreatePostRelationshipJob"); keep only the original correct
license header at the top of cmd/api/src/analysis/ad/post.go and delete the
entire garbled lines (the duplicate block containing
"post.CreatePostRelationshipJob") so the file contains a single valid Apache-2.0
license header.
In `@packages/go/analysis/azure/role.go`:
- Around line 103-108: The code uses the nonexistent type
cardinality.ImmutableDuplex[uint64]; update the RoleAssignments struct fields
(TenantPrincipals? keep as-is, and replace users, usersWithAnyRole,
usersWithoutRoles, servicePrincipals) to use cardinality.Duplex[uint64] instead
of ImmutableDuplex[uint64], and update any methods that currently return
cardinality.ImmutableDuplex[uint64] (the RoleAssignments getters) to return
cardinality.Duplex[uint64]; if you need immutability semantics, call Clone() on
the Duplex before returning to emulate an immutable view rather than relying on
a non-existent type.
In `@packages/go/analysis/hybrid/hybrid.go`:
- Line 52: The error return constructs an empty AtomicPostProcessingStats with
nil maps/mutex which can panic later; replace the literal
&post.AtomicPostProcessingStats{} with the proper constructor (e.g.,
post.NewAtomicPostProcessingStats()) wherever you return stats on error in
hybrid.go so the RelationshipsCreated, RelationshipsDeleted and mutex are
initialized before returning the error.
In `@packages/go/analysis/post/sink.go`:
- Around line 93-102: The stats and metric increments are executed even when
batch.CreateRelationshipByIDs fails; change the flow so
s.stats.AddRelationshipsCreated(nextJob.Kind, 1) and
postOperationsVec.With(...).Add(1) only run when err == nil. Specifically, in
the block around batch.CreateRelationshipByIDs(nextJob.FromID, nextJob.ToID,
nextJob.Kind, relProps), check the returned err and, on error, log via
slog.Error as you already do and return/continue, otherwise call
s.stats.AddRelationshipsCreated(...) and postOperationsVec.With(...).Add(1) so
they are only counted on successful creation.
- Around line 216-219: The goroutine started in FilteredRelationshipSink.start
currently calls s.worker(ctx) and discards any returned error (errors from
deleteMissingEdges inside worker are lost); update the implementation so the
worker's error is captured (e.g., assign to a new field like s.workerErr or send
it on an internal err channel), ensure s.worker still decrements the wait group,
and then surface that error from Done() (or at minimum log it from the
goroutine) so callers can observe failures from deleteMissingEdges; update the
Done method to return the stored error (or read from the err channel) and
document the behavior in comments next to FilteredRelationshipSink.worker and
FilteredRelationshipSink.Done.
- Around line 137-160: deleteMissingEdges currently deletes by ID without
updating per-kind deletion stats; fix by retrieving kind info for each deleted
relationship before or during deletion and updating s.stats accordingly. Change
the deletion flow in deleteMissingEdges to first resolve the deleted IDs to
their Relationship.Kind (e.g., via a lookup like s.db.GetRelationshipsByIDs or
adding kind to edgeTracker.Deleted() return), aggregate counts per kind, call
s.stats.AddRelationshipsDeleted(kind, count) for each kind, then proceed with
s.db.BatchOperation/batch.DeleteRelationship as before; reference
functions/fields: deleteMissingEdges, s.edgeTracker.Deleted,
s.db.BatchOperation, batch.DeleteRelationship, and
s.stats.AddRelationshipsDeleted.
In `@packages/go/analysis/post/stats.go`:
- Around line 100-104: The early-return condition in
AtomicPostProcessingStats.LogStats is inverted causing no logs to be emitted in
debug; change the guard to return only when debug is NOT enabled by negating the
call (use if !level.GlobalAccepts(slog.LevelDebug) { return }) so LogStats
proceeds when debug is active.
In `@packages/go/metrics/registry.go`:
- Around line 71-89: CounterVec currently builds the cache key with
metricKey(name, namespace, constLabels) which ignores variableLabelNames so
different variableLabelNames collide; update registry.CounterVec to incorporate
variableLabelNames into the cache key (e.g., serialize/join variableLabelNames
into a deterministic string and append to metricKey or add a new
metricKeyWithVars helper) before looking up/setting s.counterVecs[key] so each
unique (name, namespace, constLabels, variableLabelNames) maps to its own
*prometheus.CounterVec; keep using s.lock and s.counterVecs and ensure the
serialized form is deterministic (preserve order) to avoid accidental
collisions.
- Around line 35-47: The metricKey function is non-deterministic and can produce
collisions because it iterates map[string]string without ordering and omits
separators; update metricKey to build a deterministic key by collecting and
sorting the label keys (use the sort package), then append namespace and name
with explicit separators (e.g., ":" or "|") and append each label as key + "=" +
value with separators between pairs; ensure you handle nil/empty labels
consistently so the same label set always yields the same string.
In `@packages/go/trace/trace.go`:
- Around line 84-128: In Function (trace.go) the startArgs are being merged into
commonArgs and then passed again to slog.Log, causing duplicate fields; remove
startArgs from the initial combineArgs call that builds commonArgs (leave only
attr.Scope, attr.Function and the traceCtx additions), and keep startArgs only
in the slog.Log calls (Enter and Exit) where you currently pass
startArgs/exitArgs; update the combineArgs invocations around slog.Log to
include commonArgs plus startArgs (for Enter) and commonArgs plus
attr.Elapsed(...) plus exitArgs (for Exit) so startArgs are emitted exactly
once.
---
Outside diff comments:
In `@packages/go/analysis/post.go`:
- Around line 81-85: The early-return in PostProcessingStats.LogStats is
inverted: it currently returns when debug logging is enabled; change the
condition so the method returns unless debug is enabled (e.g., use the negated
check on level.GlobalAccepts or swap branches) so that stats are only skipped
when debug is not accepted and logged when debug is enabled; update the if in
LogStats accordingly.
---
Nitpick comments:
In `@packages/go/analysis/ad/post.go`:
- Line 51: Replace all error-path returns that create a bare
&post.AtomicPostProcessingStats{} with a defensive initializer
post.NewAtomicPostProcessingStats() so the returned struct has non-nil maps and
an initialized mutex; specifically update the returns at the locations producing
"&post.AtomicPostProcessingStats{}, err" (also seen at the other occurrences) to
"post.NewAtomicPostProcessingStats(), err". This ensures callers can safely call
methods like Merge or AddRelationshipsCreated on the returned stats even if err
is non-nil.
In `@packages/go/analysis/azure/post.go`:
- Around line 880-924: Read azure.IsAssignableToRole once and handle its error
in a single block: call
tenantGroup.Properties.Get(azure.IsAssignableToRole.String()).Bool() into
isRoleAssignable, check graph.IsErrPropertyNotFound and return on other errors,
then if !isRoleAssignable run both roleAssignments.UsersWithRole(...).Each(...)
and roleAssignments.ServicePrincipalsWithRole(...).Each(...) to submit
post.EnsureRelationshipJob via sink.Submit; remove the duplicated
property-read/error-handling block so only one check for isRoleAssignable
exists.
- Around line 938-969: The call to postAzureGlobalAdmins is currently ignoring
its error return; update the call in UserRoleAssignments to capture and handle
the error consistently with the surrounding calls (e.g., check the returned
error from postAzureGlobalAdmins and either return it like
postAzureResetPassword does or log it like postAzureAddMembers does), so replace
the bare call to postAzureGlobalAdmins(ctx, sink, roleAssignments, tenant) with
error handling that uses the returned error value.
In `@packages/go/analysis/delta/tracker.go`:
- Around line 98-115: The EdgeKey method in KeyEncoderPool may fail to return a
freshly created encoder to s.encoders when sync.Pool.Get() yields nil or a wrong
type; update KeyEncoderPool.EdgeKey so that regardless of whether raw was a
*KeyEncoder or NewKeyEncoder() was used, the encoder instance is Put back into
s.encoders after computing key (use the same encoder variable), and ensure you
Put(raw) only when raw was the original pooled object (typeOK) or otherwise Put
the newly created encoder to preserve pooling behavior; reference
KeyEncoderPool.EdgeKey, s.encoders, raw, encoder, typeOK, KeyEncoder, and
NewKeyEncoder.
In `@packages/go/analysis/post_operation.go`:
- Around line 42-69: The variable names relProp and relProps in
NewPostRelationshipOperation are easy to confuse; rename the template variable
(currently relProp returned from NewPropertiesWithLastSeen) to something clearer
like baseRelProps or relPropTemplate and keep the per-job instance as relProps
so intent is obvious; update all references including the Clone() call, Set()
calls when copying nextJob.RelProperties, and the CreateRelationshipByIDs call
to use the new template name to improve readability without changing logic.
In `@packages/go/analysis/post/sink.go`:
- Around line 182-190: The loop captures and passes workerID into the goroutine
closure but deltaFilterWorker doesn't use it, so remove the unused workerID
capture: change the goroutine to use go func() { defer filterWG.Done();
s.deltaFilterWorker(ctx, filterC, insertC) }() and eliminate the workerID
parameter in the closure (and any dead assignment); alternatively, if you want
an identifier for tracing, add an int parameter to deltaFilterWorker (e.g.,
workerID) and propagate it from the loop, update its signature and any call
sites, and emit a trace/log using that ID inside deltaFilterWorker; reference
symbols: runtime.NumCPU, workerID, filterWG, filterC, insertC,
s.deltaFilterWorker.
In `@packages/go/analysis/post/stats.go`:
- Around line 79-98: The Merge method currently stores raw *int32 pointers from
other into s (in AtomicPostProcessingStats.RelationshipsCreated and
RelationshipsDeleted), causing shared counters; instead, when a key is absent,
allocate a new int32, initialize it with the current value read via
atomic.LoadInt32(otherValue) and store that new pointer into s so counters are
independent; keep using atomic.AddInt32 for the existing-key path. Target the
Merge function and the maps RelationshipsCreated / RelationshipsDeleted and use
atomic.LoadInt32 to copy values rather than assigning other’s pointer.
In `@packages/go/bhlog/attr/attr.go`:
- Around line 51-73: Add missing doc comments for the six exported functions so
they match the existing pattern (describe purpose and the log field they
populate); add a short comment above Operation explaining it returns a slog.Attr
for the "operation" field, above Enter and Exit explaining they return slog.Attr
for the "state" field with values "enter" and "exit", above Elapsed explaining
it returns a slog.Attr for the "elapsed" duration field, above ElapsedSince
explaining it computes elapsed since the given time and returns the same
"elapsed" field, and above Measurement explaining it returns a slog.Attr for the
"measurement" uint64 id field; ensure comments are formatted as Go doc comments
immediately preceding the functions Operation, Enter, Exit, Elapsed,
ElapsedSince, and Measurement.
In `@packages/go/metrics/registry.go`:
- Around line 163-164: The Unregister wrapper currently discards the bool
returned by Registerer().Unregister(collector); change the Unregister function
signature to return bool and propagate the underlying return value so callers
can observe whether the collector was actually removed. Update the function
named Unregister to return the bool from Registerer().Unregister(collector)
(references: Unregister, Registerer(), prometheus.Collector) and adjust any call
sites/tests that expect the old void signature to handle the returned bool.
- Around line 151-157: NewCounter and NewGauge currently register metrics
directly via promauto.With(Registerer()) and bypass the deduplication maps
(counters/gauges), risking prometheus duplicate-registration panics if the same
opts are used elsewhere; change these helpers to route through the existing
cached factory functions (call Counter(opts) and Gauge(opts) respectively) so
metric registration is unified via the counters/gauges maps and Registerer(), or
if truly separate semantics are required, implement the same lookup/insert logic
against the counters/gauges maps used by Counter()/Gauge() and/or document the
mutual-exclusivity contract clearly.
In `@packages/go/trace/trace.go`:
- Around line 34-47: The type switch in combineArgs uses "case any" which is
equivalent to default; replace the "case any" arm with a "default" arm to make
intent explicit and clearer, keeping the behavior of appending the single
typedArg to all unchanged; update the switch in the combineArgs function
accordingly.
- Around line 115-119: The current code uses
metrics.Counter(...).Add(elapsed.Seconds()) inside the hasTraceCtx branch which
accumulates total seconds but loses latency distribution; replace this with a
Histogram (or Summary) metric to record per-call latency for percentiles and
better observability: locate the block guarded by hasTraceCtx and the call to
metrics.Counter("function_trace", traceCtx.Namespace, map[string]string{"fn":
function}).Add(elapsed.Seconds()) and change it to use the library's Histogram
recorder (e.g., metrics.Histogram("function_trace", traceCtx.Namespace, ...) or
equivalent) and observe/record elapsed.Seconds() on that histogram, preserving
the same labels (fn and traceCtx.Namespace).
```go
// Copyright 2post.CreatePostRelationshipJob23 Specter Ops, Inc.
//
// Licensed under the Apache License, Version 2.post.CreatePostRelationshipJob
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.post.CreatePostRelationshipJob
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
// SPDX-License-Identifier: Apache-2.post.CreatePostRelationshipJob
```
Corrupted duplicate license header — bad search-and-replace artifact.
Lines 16–30 contain a second, garbled license block where 0 was replaced with post.CreatePostRelationshipJob (e.g., "Copyright 2post.CreatePostRelationshipJob23", "Apache-2.post.CreatePostRelationshipJob"). This is clearly a find-and-replace accident during the refactor. Remove the entire duplicate block.
Proposed fix
```diff
 // SPDX-License-Identifier: Apache-2.0
-// Copyright 2post.CreatePostRelationshipJob23 Specter Ops, Inc.
-//
-// Licensed under the Apache License, Version 2.post.CreatePostRelationshipJob
-// you may not use this file except in compliance with the License.
-// You may obtain a copy of the License at
-//
-// http://www.apache.org/licenses/LICENSE-2.post.CreatePostRelationshipJob
-//
-// Unless required by applicable law or agreed to in writing, software
-// distributed under the License is distributed on an "AS IS" BASIS,
-// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-// See the License for the specific language governing permissions and
-// limitations under the License.
-//
-// SPDX-License-Identifier: Apache-2.post.CreatePostRelationshipJob
 package ad
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```suggestion
package ad
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cmd/api/src/analysis/ad/post.go` around lines 16 - 30, Remove the corrupted
duplicated license block that was introduced by the bad find-and-replace (the
block containing the garbled tokens like "post.CreatePostRelationshipJob", e.g.,
"Copyright 2post.CreatePostRelationshipJob23" and
"Apache-2.post.CreatePostRelationshipJob"); keep only the original correct
license header at the top of cmd/api/src/analysis/ad/post.go and delete the
entire garbled lines (the duplicate block containing
"post.CreatePostRelationshipJob") so the file contains a single valid Apache-2.0
license header.
```go
type RoleAssignments struct {
	Principals        graph.NodeKindSet
	TenantPrincipals  graph.NodeKindSet
	users             cardinality.ImmutableDuplex[uint64]
	usersWithAnyRole  cardinality.ImmutableDuplex[uint64]
	usersWithoutRoles cardinality.ImmutableDuplex[uint64]
	servicePrincipals cardinality.ImmutableDuplex[uint64]
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Search for ImmutableDuplex usage and definition
echo "=== Searching for ImmutableDuplex usage ==="
rg -n "ImmutableDuplex" -g '*.go'
echo ""
echo "=== Checking go.mod for dawgs version ==="
rg -n "specterops/dawgs" go.mod
echo ""
echo "=== Checking the actual file structure ==="
head -150 packages/go/analysis/azure/role.go | tail -60
```

Repository: SpecterOps/BloodHound

Length of output: 2932

🏁 Script executed:

```bash
#!/bin/bash
# Check imports in the role.go file
echo "=== Imports in role.go ==="
head -30 packages/go/analysis/azure/role.go | grep -A 20 "^import"
echo ""
echo "=== Search for ImmutableDuplex definition ==="
rg -n "type ImmutableDuplex" -g '*.go'
echo ""
echo "=== Search for cardinality types available ==="
rg -n "type.*Duplex" -g '*.go' | head -20
echo ""
echo "=== Check if there's a cardinality package in repo ==="
find . -type d -name cardinality 2>/dev/null | head -10
```

Repository: SpecterOps/BloodHound

Length of output: 589
🌐 Web query:
specterops/dawgs v0.4.8 ImmutableDuplex cardinality
💡 Result:
In SpecterOps/DAWGS v0.4.8 (tagged Feb 10, 2026), an ImmutableDuplex is still a cardinality.Duplex[...] provider, so its cardinality is simply the number of unique values contained in the duplex (e.g., for the bitmap-backed implementations used throughout DAWGS, it’s the bitmap’s set-bit count). [1]
Practically, you obtain it by calling the duplex’s cardinality/count accessor (the same “how many IDs are in this set” notion used by helpers like ThreadSafeKindBitmap.Cardinality(...) and other Duplex-returning APIs). [2]
Sources:
[1] DAWGS v0.4.8 tag list (shows v0.4.8 and date) https://github.com/SpecterOps/DAWGS/tags
[2] DAWGS graph package docs showing cardinality.Duplex[...] usage + ThreadSafeKindBitmap.Cardinality(...) https://pkg.go.dev/github.com/specterops/dawgs/graph
🌐 Web query:
site:pkg.go.dev github.com/specterops/dawgs cardinality ImmutableDuplex
💡 Result:
In github.com/specterops/dawgs, the cardinality package does not define a type named ImmutableDuplex (at least up through v0.4.9, published Feb 18, 2026). Instead, it exposes:
`type Provider[T] interface { Add(...); Or(...); Clear(); Cardinality() uint64 }`
`type Duplex[T] interface { Provider[T]; Xor/And/AndNot/Remove/Slice/Contains/Each/CheckedAdd/Clone }`
So if you’re looking for “immutable duplex” semantics, the intended mechanism is typically to Clone() a Duplex (or CloneProvider for a Provider) before mutating, rather than using a distinct ImmutableDuplex type. [1][2]
Sources:
[1] cardinality package docs (interfaces, constructors, CloneProvider, Duplex.Clone) — https://pkg.go.dev/github.com/specterops/dawgs/cardinality
[2] cardinality version list (v0.1.0 → v0.4.9; no ImmutableDuplex shown) — https://pkg.go.dev/github.com/specterops/dawgs/cardinality?tab=versions
Build failure: cardinality.ImmutableDuplex is undefined and does not exist in specterops/dawgs.
The cardinality package in specterops/dawgs v0.4.8 (and through v0.4.9) does not export an ImmutableDuplex type. Available types are Duplex[T] and Provider[T] interfaces. Replace cardinality.ImmutableDuplex[uint64] with cardinality.Duplex[uint64] in the struct fields (lines 105–108) and method return types (lines 130–142), or adopt the immutability pattern using Clone() if needed.
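A minimal sketch of that Clone-based pattern, with an illustrative getter name:

```go
// Users returns a defensive copy so callers cannot mutate the cached bitmap.
func (s RoleAssignments) Users() cardinality.Duplex[uint64] {
	return s.users.Clone()
}
```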
🧰 Tools
🪛 GitHub Actions: Run Go Integration Tests
[error] 105-105: undefined: cardinality.ImmutableDuplex
🪛 GitHub Actions: Run Go Unit Tests
[error] 105-105: undefined: cardinality.ImmutableDuplex
🪛 golangci-lint (2.5.0)
[major] 105-105: : # github.com/specterops/bloodhound/packages/go/analysis/azure
packages/go/analysis/azure/role.go:105:44: undefined: cardinality.ImmutableDuplex
packages/go/analysis/azure/role.go:106:44: undefined: cardinality.ImmutableDuplex
packages/go/analysis/azure/role.go:107:44: undefined: cardinality.ImmutableDuplex
packages/go/analysis/azure/role.go:108:44: undefined: cardinality.ImmutableDuplex
packages/go/analysis/azure/role.go:130:58: undefined: cardinality.ImmutableDuplex
packages/go/analysis/azure/role.go:134:46: undefined: cardinality.ImmutableDuplex
packages/go/analysis/azure/role.go:138:57: undefined: cardinality.ImmutableDuplex
packages/go/analysis/azure/role.go:142:58: undefined: cardinality.ImmutableDuplex
(typecheck)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/go/analysis/azure/role.go` around lines 103 - 108, The code uses the
nonexistent type cardinality.ImmutableDuplex[uint64]; update the RoleAssignments
struct fields (TenantPrincipals? keep as-is, and replace users,
usersWithAnyRole, usersWithoutRoles, servicePrincipals) to use
cardinality.Duplex[uint64] instead of ImmutableDuplex[uint64], and update any
methods that currently return cardinality.ImmutableDuplex[uint64] (the
RoleAssignments getters) to return cardinality.Duplex[uint64]; if you need
immutability semantics, call Clone() on the Duplex before returning to emulate
an immutable view rather than relying on a non-existent type.
```diff
 	tenants, err := azure.FetchTenants(ctx, db)
 	if err != nil {
-		return &analysis.AtomicPostProcessingStats{}, fmt.Errorf("fetching Entra tenants: %w", err)
+		return &post.AtomicPostProcessingStats{}, fmt.Errorf("fetching Entra tenants: %w", err)
```
Returning uninitialized AtomicPostProcessingStats — nil maps and nil mutex.
&post.AtomicPostProcessingStats{} leaves RelationshipsCreated, RelationshipsDeleted, and mutex as nil. If a caller ever accesses the returned stats on an error path (e.g., to log or merge), this will panic. Use the constructor to be consistent with every other error return in this PR (e.g., cmd/api/src/analysis/ad/post.go line 61).
Proposed fix
```diff
-	return &post.AtomicPostProcessingStats{}, fmt.Errorf("fetching Entra tenants: %w", err)
+	stats := post.NewAtomicPostProcessingStats()
+	return &stats, fmt.Errorf("fetching Entra tenants: %w", err)
```

📝 Committable suggestion
+ return &stats, fmt.Errorf("fetching Entra tenants: %w", err)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```suggestion
		stats := post.NewAtomicPostProcessingStats()
		return &stats, fmt.Errorf("fetching Entra tenants: %w", err)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/go/analysis/hybrid/hybrid.go` at line 52, The error return
constructs an empty AtomicPostProcessingStats with nil maps/mutex which can
panic later; replace the literal &post.AtomicPostProcessingStats{} with the
proper constructor (e.g., post.NewAtomicPostProcessingStats()) wherever you
return stats on error in hybrid.go so the RelationshipsCreated,
RelationshipsDeleted and mutex are initialized before returning the error.
```go
if err := batch.CreateRelationshipByIDs(nextJob.FromID, nextJob.ToID, nextJob.Kind, relProps); err != nil {
	slog.Error("Create Relationship Error", slog.String("err", err.Error()))
}

s.stats.AddRelationshipsCreated(nextJob.Kind, 1)

postOperationsVec.With(map[string]string{
	"kind":      nextJob.Kind.String(),
	"operation": "edge_insert",
}).Add(1)
```
Relationships counted as created even on insertion failure.
Line 97 (s.stats.AddRelationshipsCreated) and Line 99 (metric increment) execute regardless of whether batch.CreateRelationshipByIDs on Line 93 succeeded. On error, the stat and metric still record a successful insert.
Proposed fix: skip stats/metrics on error
```diff
 	if err := batch.CreateRelationshipByIDs(nextJob.FromID, nextJob.ToID, nextJob.Kind, relProps); err != nil {
 		slog.Error("Create Relationship Error", slog.String("err", err.Error()))
-	}
-
-	s.stats.AddRelationshipsCreated(nextJob.Kind, 1)
-
-	postOperationsVec.With(map[string]string{
-		"kind":      nextJob.Kind.String(),
-		"operation": "edge_insert",
-	}).Add(1)
+	} else {
+		s.stats.AddRelationshipsCreated(nextJob.Kind, 1)
+
+		postOperationsVec.With(map[string]string{
+			"kind":      nextJob.Kind.String(),
+			"operation": "edge_insert",
+		}).Add(1)
+	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/go/analysis/post/sink.go` around lines 93 - 102, The stats and
metric increments are executed even when batch.CreateRelationshipByIDs fails;
change the flow so s.stats.AddRelationshipsCreated(nextJob.Kind, 1) and
postOperationsVec.With(...).Add(1) only run when err == nil. Specifically, in
the block around batch.CreateRelationshipByIDs(nextJob.FromID, nextJob.ToID,
nextJob.Kind, relProps), check the returned err and, on error, log via
slog.Error as you already do and return/continue, otherwise call
s.stats.AddRelationshipsCreated(...) and postOperationsVec.With(...).Add(1) so
they are only counted on successful creation.
```go
func (s *FilteredRelationshipSink) deleteMissingEdges(ctx context.Context) error {
	deletedEdges := s.edgeTracker.Deleted()

	defer trace.Method(ctx, "FilteredRelationshipSink", "deleteMissingEdges", slog.Int("num_edges", len(deletedEdges)))()

	if err := s.db.BatchOperation(ctx, func(batch graph.Batch) error {
		for _, deletedEdge := range deletedEdges {
			if err := batch.DeleteRelationship(graph.ID(deletedEdge)); err != nil {
				return err
			}
		}

		return nil
	}); err != nil {
		return err
	}

	postOperationsVec.With(map[string]string{
		"kind":      "all",
		"operation": "edge_delete",
	}).Add(float64(len(deletedEdges)))

	return nil
}
```
🧩 Analysis chain
🏁 Script executed:

```bash
rg -n 'RelationshipsDeleted' --type=go -C3
```

Repository: SpecterOps/BloodHound

Length of output: 5523

🏁 Script executed:

```bash
fd 'stats.go' --type f
```

Repository: SpecterOps/BloodHound

Length of output: 144

🏁 Script executed:

```bash
cat -n packages/go/analysis/post/stats.go | head -100
```

Repository: SpecterOps/BloodHound

Length of output: 3383

🏁 Script executed:

```bash
cat -n packages/go/analysis/post/sink.go | head -180
```

Repository: SpecterOps/BloodHound

Length of output: 7353

🏁 Script executed:

```bash
rg -n 'AddRelationshipsDeleted' --type=go packages/go/analysis/post/
```

Repository: SpecterOps/BloodHound

Length of output: 199

🏁 Script executed:

```bash
rg -n 'func (s \*FilteredRelationshipSink)' --type=go packages/go/analysis/post/sink.go | head -20
```

Repository: SpecterOps/BloodHound

Length of output: 47

🏁 Script executed:

```bash
rg -n 'func.*Deleted\(' --type=go packages/go/analysis/delta/
```

Repository: SpecterOps/BloodHound

Length of output: 145

🏁 Script executed:

```bash
fd 'tracker.go' --path='*/delta/*' --type f
```

Repository: SpecterOps/BloodHound

Length of output: 280

🏁 Script executed:

```bash
rg -n 'type Tracker struct' --type=go -A 10 packages/go/analysis/delta/
```

Repository: SpecterOps/BloodHound

Length of output: 858

🏁 Script executed:

```bash
cat -n packages/go/analysis/delta/tracker.go | sed -n '130,160p'
```

Repository: SpecterOps/BloodHound

Length of output: 1177

🏁 Script executed:

```bash
rg -n 'func (s \*Tracker) HasEdge\|func (s \*Tracker) Track' --type=go packages/go/analysis/delta/tracker.go -A 5
```

Repository: SpecterOps/BloodHound

Length of output: 47

🏁 Script executed:

```bash
rg -n 'type trackedEntity' --type=go -A 3 packages/go/analysis/delta/
```

Repository: SpecterOps/BloodHound

Length of output: 286

🏁 Script executed:

```bash
cat -n packages/go/analysis/delta/tracker.go | sed -n '1,130p' | tail -40
```

Repository: SpecterOps/BloodHound

Length of output: 1253

🏁 Script executed:

```bash
rg -n 'func (s \*Tracker) HasEdge' --type=go -A 15 packages/go/analysis/delta/tracker.go
```

Repository: SpecterOps/BloodHound

Length of output: 47

🏁 Script executed:

```bash
cat -n packages/go/analysis/delta/tracker.go | sed -n '157,180p'
```

Repository: SpecterOps/BloodHound

Length of output: 777

🏁 Script executed:

```bash
rg -n 'AddEdge' --type=go -A 10 packages/go/analysis/delta/tracker.go | head -30
```

Repository: SpecterOps/BloodHound

Length of output: 47
deleteMissingEdges — missing per-kind deletion stats updates.
Currently, deleted edges are counted as a single Prometheus metric with "kind": "all" (line 154-157), while AtomicPostProcessingStats.AddRelationshipsDeleted() method exists for per-kind tracking. However, s.stats.AddRelationshipsDeleted() is never called here, unlike insertions which call s.stats.AddRelationshipsCreated() per-kind (line 97). This creates an inconsistency: deletions bypass the per-kind stats map entirely, so Stats().RelationshipsDeleted will remain empty while Stats().RelationshipsCreated is populated.
The challenge is that s.edgeTracker.Deleted() returns only relationship IDs without Kind information. Consider whether the tracker should preserve Kind data for deleted edges, or adjust the stats API to handle bulk deletions without per-kind granularity.
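One hypothetical direction, sketched below: have the tracker retain the kind next to each tracked edge ID so deletions can be attributed per kind. The `trackedEdge` type and `DeletedEdges` method are invented names for illustration:

```go
// trackedEdge pairs a relationship ID with its kind so that deletions can
// feed per-kind stats, mirroring how insertions are counted.
type trackedEdge struct {
	ID   graph.ID
	Kind graph.Kind
}

func (s *FilteredRelationshipSink) deleteMissingEdges(ctx context.Context) error {
	deletedEdges := s.edgeTracker.DeletedEdges() // hypothetical: returns []trackedEdge

	return s.db.BatchOperation(ctx, func(batch graph.Batch) error {
		for _, edge := range deletedEdges {
			if err := batch.DeleteRelationship(edge.ID); err != nil {
				return err
			}

			s.stats.AddRelationshipsDeleted(edge.Kind, 1)
		}

		return nil
	})
}
```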
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/go/analysis/post/sink.go` around lines 137 - 160, deleteMissingEdges
currently deletes by ID without updating per-kind deletion stats; fix by
retrieving kind info for each deleted relationship before or during deletion and
updating s.stats accordingly. Change the deletion flow in deleteMissingEdges to
first resolve the deleted IDs to their Relationship.Kind (e.g., via a lookup
like s.db.GetRelationshipsByIDs or adding kind to edgeTracker.Deleted() return),
aggregate counts per kind, call s.stats.AddRelationshipsDeleted(kind, count) for
each kind, then proceed with s.db.BatchOperation/batch.DeleteRelationship as
before; reference functions/fields: deleteMissingEdges, s.edgeTracker.Deleted,
s.db.BatchOperation, batch.DeleteRelationship, and
s.stats.AddRelationshipsDeleted.
```go
func (s *FilteredRelationshipSink) start(ctx context.Context) {
	s.wg.Add(1)
	go s.worker(ctx)
}
```
Error from worker is silently discarded.
s.worker(ctx) returns an error (from deleteMissingEdges), but the goroutine on Line 218 ignores it. If edge deletion fails, the caller of Done() has no way to know. Consider storing the error on the struct and returning it from Done(), or at minimum logging it.
Sketch: capture and surface the worker error
```diff
 type FilteredRelationshipSink struct {
 	operationName string
 	db            graph.Database
 	edgeTracker   *delta.Tracker
 	jobC          chan EnsureRelationshipJob
 	stats         AtomicPostProcessingStats
 	wg            sync.WaitGroup
+	workerErr     error
 }

 // start begins execution of the sink's main worker loop.
 func (s *FilteredRelationshipSink) start(ctx context.Context) {
 	s.wg.Add(1)
-	go s.worker(ctx)
+	go func() {
+		if err := s.worker(ctx); err != nil {
+			slog.Error("FilteredRelationshipSink worker error", attr.Error(err))
+			s.workerErr = err
+		}
+	}()
 }

 // Done signals the end of processing and waits for all workers to complete.
-func (s *FilteredRelationshipSink) Done() {
+func (s *FilteredRelationshipSink) Done() error {
 	close(s.jobC)
 	s.wg.Wait()
+	return s.workerErr
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/go/analysis/post/sink.go` around lines 216 - 219, The goroutine
started in FilteredRelationshipSink.start currently calls s.worker(ctx) and
discards any returned error (errors from deleteMissingEdges inside worker are
lost); update the implementation so the worker's error is captured (e.g., assign
to a new field like s.workerErr or send it on an internal err channel), ensure
s.worker still decrements the wait group, and then surface that error from
Done() (or at minimum log it from the goroutine) so callers can observe failures
from deleteMissingEdges; update the Done method to return the stored error (or
read from the err channel) and document the behavior in comments next to
FilteredRelationshipSink.worker and FilteredRelationshipSink.Done.
```go
func (s *AtomicPostProcessingStats) LogStats() {
	// Only output stats during debug runs
	if level.GlobalAccepts(slog.LevelDebug) {
		return
	}
```
Bug: inverted condition — LogStats never logs when debug is enabled.
The comment says "Only output stats during debug runs," but the guard returns when GlobalAccepts(slog.LevelDebug) is true, which is exactly when you want to log. The condition is inverted.
🐛 Fix: negate the condition
```diff
 func (s *AtomicPostProcessingStats) LogStats() {
 	// Only output stats during debug runs
-	if level.GlobalAccepts(slog.LevelDebug) {
+	if !level.GlobalAccepts(slog.LevelDebug) {
 		return
 	}
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```suggestion
func (s *AtomicPostProcessingStats) LogStats() {
	// Only output stats during debug runs
	if !level.GlobalAccepts(slog.LevelDebug) {
		return
	}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/go/analysis/post/stats.go` around lines 100 - 104, The early-return
condition in AtomicPostProcessingStats.LogStats is inverted causing no logs to
be emitted in debug; change the guard to return only when debug is NOT enabled
by negating the call (use if !level.GlobalAccepts(slog.LevelDebug) { return })
so LogStats proceeds when debug is active.
func metricKey(name, namespace string, labels map[string]string) string {
	builder := strings.Builder{}

	builder.WriteString(namespace)
	builder.WriteString(name)

	for key, value := range labels {
		builder.WriteString(key)
		builder.WriteString(value)
	}

	return builder.String()
}
metricKey is non-deterministic — Go map iteration order causes cache misses and panics.
When labels has more than one entry, range labels iterates in a random order. Two calls with the same label set can produce different keys, causing a cache miss. The second call then attempts to re-register the same Prometheus metric, and promauto panics on duplicate registration.
Additionally, there are no separators between namespace, name, and label key-value pairs, so distinct inputs can collide (e.g., namespace "ab" + name "c" = namespace "a" + name "bc").
Proposed fix — sort label keys and add separators
func metricKey(name, namespace string, labels map[string]string) string {
builder := strings.Builder{}
builder.WriteString(namespace)
+ builder.WriteByte('/')
builder.WriteString(name)
- for key, value := range labels {
+ // Sort keys for deterministic iteration
+ keys := make([]string, 0, len(labels))
+ for key := range labels {
+ keys = append(keys, key)
+ }
+ sort.Strings(keys)
+
+ for _, key := range keys {
+ builder.WriteByte('/')
builder.WriteString(key)
- builder.WriteString(value)
+ builder.WriteByte('=')
+ builder.WriteString(labels[key])
}
return builder.String()
}

This requires adding "sort" to the imports.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/go/metrics/registry.go` around lines 35 - 47, The metricKey function
is non-deterministic and can produce collisions because it iterates
map[string]string without ordering and omits separators; update metricKey to
build a deterministic key by collecting and sorting the label keys (use the sort
package), then append namespace and name with explicit separators (e.g., ":" or
"|") and append each label as key + "=" + value with separators between pairs;
ensure you handle nil/empty labels consistently so the same label set always
yields the same string.
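A small determinism check for the fixed metricKey, assuming the sorted, separator-delimited implementation proposed above; the test itself is illustrative.

func TestMetricKeyDeterministic(t *testing.T) {
	labels := map[string]string{"a": "1", "b": "2", "c": "3"}
	first := metricKey("name", "ns", labels)

	// Go randomizes map iteration order, so repeated calls exercise different
	// orders; the sorted implementation must produce the same key every time.
	for i := 0; i < 100; i++ {
		if got := metricKey("name", "ns", labels); got != first {
			t.Fatalf("metricKey is not deterministic: %q != %q", got, first)
		}
	}
}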
func (s *registry) CounterVec(name, namespace string, constLabels map[string]string, variableLabelNames []string) *prometheus.CounterVec {
	s.lock.Lock()
	defer s.lock.Unlock()

	key := metricKey(name, namespace, constLabels)

	if counterVec, hasCounter := s.counterVecs[key]; hasCounter {
		return counterVec
	} else {
		newCounterVec := promauto.With(s.prometheusRegistry).NewCounterVec(prometheus.CounterOpts{
			Name:        name,
			Namespace:   namespace,
			ConstLabels: constLabels,
		}, variableLabelNames)

		s.counterVecs[key] = newCounterVec
		return newCounterVec
	}
}
CounterVec cache key ignores variableLabelNames.
Two CounterVec calls with identical name/namespace/constLabels but different variableLabelNames will share a cache key. The first registration wins, and the second silently returns the wrong *prometheus.CounterVec. Include variableLabelNames in the key.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/go/metrics/registry.go` around lines 71 - 89, CounterVec currently
builds the cache key with metricKey(name, namespace, constLabels) which ignores
variableLabelNames so different variableLabelNames collide; update
registry.CounterVec to incorporate variableLabelNames into the cache key (e.g.,
serialize/join variableLabelNames into a deterministic string and append to
metricKey or add a new metricKeyWithVars helper) before looking up/setting
s.counterVecs[key] so each unique (name, namespace, constLabels,
variableLabelNames) maps to its own *prometheus.CounterVec; keep using s.lock
and s.counterVecs and ensure the serialized form is deterministic (preserve
order) to avoid accidental collisions.
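One possible shape for folding variableLabelNames into the key, building on the separator-delimited metricKey sketched above; metricKeyWithVars is the hypothetical helper name from the prompt, not existing code.

func metricKeyWithVars(name, namespace string, constLabels map[string]string, variableLabelNames []string) string {
	builder := strings.Builder{}
	builder.WriteString(metricKey(name, namespace, constLabels))

	// Variable label names are positional in Prometheus, so the order given at
	// construction is already deterministic and can be appended as-is.
	for _, labelName := range variableLabelNames {
		builder.WriteString("/v=")
		builder.WriteString(labelName)
	}

	return builder.String()
}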
func Function(ctx context.Context, function string, startArgs ...any) func(args ...any) {
	var (
		level                 = slog.LevelInfo
		then                  = time.Now()
		traceCtx, hasTraceCtx = fromContext(ctx)
		commonArgs            []any
	)

	commonArgs = combineArgs([]any{
		attr.Scope("process"),
		attr.Function(function),
	}, startArgs)

	if hasTraceCtx {
		level = traceCtx.Level

		commonArgs = combineArgs(commonArgs, []any{
			attr.Namespace(traceCtx.Namespace),
			attr.Measurement(traceCtx.ID),
		})
	}

	slog.Log(ctx, level, "Function Trace", combineArgs(
		commonArgs,
		attr.Enter(),
		startArgs,
	)...)

	return func(exitArgs ...any) {
		elapsed := time.Since(then)

		if hasTraceCtx {
			metrics.Counter("function_trace", traceCtx.Namespace, map[string]string{
				"fn": function,
			}).Add(elapsed.Seconds())
		}

		slog.Log(ctx, level, "Function Trace", combineArgs(
			commonArgs,
			attr.Exit(),
			startArgs,
			attr.Elapsed(elapsed),
			exitArgs,
		)...)
	}
startArgs is included twice in log output — appears both in commonArgs and again directly.
At line 92–95, commonArgs is built by combining scope, function, and startArgs. Then at lines 106–110 (Enter log) and 121–127 (Exit log), startArgs is passed again alongside commonArgs, causing all start arguments to be emitted twice per log entry.
🐛 Proposed fix: remove the duplicate `startArgs`
slog.Log(ctx, level, "Function Trace", combineArgs(
commonArgs,
attr.Enter(),
- startArgs,
)...)
return func(exitArgs ...any) {
elapsed := time.Since(then)
if hasTraceCtx {
metrics.Counter("function_trace", traceCtx.Namespace, map[string]string{
"fn": function,
}).Add(elapsed.Seconds())
}
slog.Log(ctx, level, "Function Trace", combineArgs(
commonArgs,
attr.Exit(),
- startArgs,
attr.Elapsed(elapsed),
exitArgs,
)...)
}

🧰 Tools
🪛 golangci-lint (2.5.0)
[major] 105-105: : # github.com/specterops/bloodhound/packages/go/analysis/azure
packages/go/analysis/azure/role.go:105:44: undefined: cardinality.ImmutableDuplex
packages/go/analysis/azure/role.go:106:44: undefined: cardinality.ImmutableDuplex
packages/go/analysis/azure/role.go:107:44: undefined: cardinality.ImmutableDuplex
packages/go/analysis/azure/role.go:108:44: undefined: cardinality.ImmutableDuplex
packages/go/analysis/azure/role.go:130:58: undefined: cardinality.ImmutableDuplex
packages/go/analysis/azure/role.go:134:46: undefined: cardinality.ImmutableDuplex
packages/go/analysis/azure/role.go:138:57: undefined: cardinality.ImmutableDuplex
packages/go/analysis/azure/role.go:142:58: undefined: cardinality.ImmutableDuplex
(typecheck)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/go/trace/trace.go` around lines 84 - 128, In Function (trace.go) the
startArgs are being merged into commonArgs and then passed again to slog.Log,
causing duplicate fields; remove startArgs from the initial combineArgs call
that builds commonArgs (leave only attr.Scope, attr.Function and the traceCtx
additions), and keep startArgs only in the slog.Log calls (Enter and Exit) where
you currently pass startArgs/exitArgs; update the combineArgs invocations around
slog.Log to include commonArgs plus startArgs (for Enter) and commonArgs plus
attr.Elapsed(...) plus exitArgs (for Exit) so startArgs are emitted exactly
once.
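A hedged usage sketch of the trace.Function helper once the duplicate startArgs are removed; expensiveAnalysis and the slog attributes are illustrative, not code from this PR.

func expensiveAnalysis(ctx context.Context) {
	// The returned closure emits the Exit log and records elapsed time, so the
	// enter/exit pair brackets the function body via defer.
	exit := trace.Function(ctx, "expensiveAnalysis", slog.String("phase", "post"))
	defer exit(slog.String("result", "ok"))

	// ... post-processing work ...
}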
Description
Describe your changes in detail
Motivation and Context
Resolves <TICKET_OR_ISSUE_NUMBER>
Why is this change required? What problem does it solve?
How Has This Been Tested?
Please describe in detail how you tested your changes.
Include details of your testing environment, and the tests you ran to
see how your change affects other areas of the code, etc.
Screenshots (optional):
Types of changes
Checklist:
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Refactor