Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughCentralizes environment and schema-relationship retrieval behind new filtered DB methods, adds ID-based Kind/SourceKind getters, changes SourceKind.Name to string with ToKind(), updates mocks and migrations to upsert missing kinds, and normalizes environment_id strings to uppercase in node conversion. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@cmd/api/src/database/graphschema.go`:
- Around line 748-762: GetSchemaRelationshipFindingsByEnvironmentId may return a
nil slice when no rows are found; modify it so it always returns a non-nil empty
slice instead of nil. After the query (before returning), ensure findings is
initialized to an empty slice when length is zero (e.g., set findings =
[]model.SchemaRelationshipFinding{}), so callers serializing the result get []
not null; keep references to the existing variables and use the same Raw(...)
call and CheckError handling as currently implemented.
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cmd/api/src/database/graphschema.go (1)
598-641:⚠️ Potential issue | 🟠 MajorUpdate
buildSQLFilterto properly handle qualified column names likese.id.The current implementation passes the entire qualified name (e.g.,
"se.id") topgsql.Identifier(), which treats it as a single identifier rather than splitting it into table alias and column. The pattern exists incmd/api/src/model/filter.gowhereBuildSQLFilteraccepts atableAliasparameter and usespgsql.AsCompoundIdentifier(tableAlias, columnName)when needed.Since
GetEnvironmentByKindsandGetEnvironmentByIdboth pass qualified filters like"se.environment_kind_id"and"se.id"tobuildSQLFilter, the function should either:
- Accept a
tableAliasparameter and usepgsql.AsCompoundIdentifier- Parse and split qualified names on the dot separator before passing to
pgsql.IdentifierWithout proper handling, the generated SQL may be semantically incorrect (e.g.,
"se.id"instead of"se"."id").
wes-mil
left a comment
There was a problem hiding this comment.
The schema changes look good to me
mistahj67
left a comment
There was a problem hiding this comment.
Not blocking, but left a few comments to ponder 🤔
There was a problem hiding this comment.
Only comment is that you could return the ID from genscript_upsert_source_kind and then you could skip having to SELECT but tomato tomato
EDIT: Approving specifically the sql generator changes, please get other approvals for the rest of the changes 😄
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cmd/api/src/database/upsert_schema_environment.go (1)
117-133:⚠️ Potential issue | 🟠 MajorFix environment lookup:
GetEnvironmentByIdexpects schema environment ID, notenvironment_kind_id.Line 122 passes
EnvironmentKindIdtoGetEnvironmentById, which filters onse.id. That will usually miss the existing environment and, after the new unique constraint, will cause duplicate insert errors. Query byenvironment_kind_idinstead (and update the comment to match the new constraint).🐛 Suggested fix
func (s *BloodhoundDB) replaceSchemaEnvironment(ctx context.Context, graphSchema model.SchemaEnvironment) (int32, error) { - if existing, err := s.GetEnvironmentById(ctx, graphSchema.EnvironmentKindId); err != nil && !errors.Is(err, ErrNotFound) { - return 0, fmt.Errorf("error retrieving schema environment: %w", err) - } else if !errors.Is(err, ErrNotFound) { - // Environment exists - delete it first - if err := s.DeleteEnvironment(ctx, existing.ID); err != nil { - return 0, fmt.Errorf("error deleting schema environment %d: %w", existing.ID, err) - } - } + filters := model.Filters{ + "se.environment_kind_id": []model.Filter{{Operator: model.Equals, Value: fmt.Sprintf("%d", graphSchema.EnvironmentKindId)}}, + } + if existing, err := s.GetEnvironmentsFiltered(ctx, filters); err != nil { + return 0, fmt.Errorf("error retrieving schema environment: %w", err) + } else if len(existing) > 0 { + // Environment exists - delete it first + if err := s.DeleteEnvironment(ctx, existing[0].ID); err != nil { + return 0, fmt.Errorf("error deleting schema environment %d: %w", existing[0].ID, err) + } + }
🤖 Fix all issues with AI agents
In `@cmd/api/src/database/migration/migrations/v8.7.0.sql`:
- Around line 18-30: The migration adds a UNIQUE constraint on
schema_environments.environment_kind_id which will fail if duplicate
environment_kind_id values already exist; modify the migration (around the ALTER
TABLE ... ADD CONSTRAINT schema_environments_environment_kind_id_key) to first
check for duplicates by querying schema_environments with GROUP BY
environment_kind_id HAVING COUNT(*) > 1, and if any rows are returned either (a)
perform a cleanup strategy (delete or consolidate duplicates) or (b) RAISE
EXCEPTION with a clear message listing the offending environment_kind_id values
so the upgrade fails fast and informs operators; ensure this pre-check runs
before attempting to add the constraint and reference schema_environments and
schema_environments_environment_kind_id_key in the added check/exception.
In `@cmd/api/src/database/upsert_schema_finding.go`:
- Around line 39-47: The code incorrectly calls GetEnvironmentById(ctx,
environmentKindId) with an environment *kind* ID; change this to look up by
environment kind instead (either call an existing
GetEnvironmentByKindID/GetEnvironmentByEnvironmentKindID function or add one
that queries se.environment_kind_id = environmentKindId) so the lookup uses
environment_kind_id not se.id, and update the surrounding comment to state the
uniqueness is on environment_kind_id (not environment id); keep references to
validateAndTranslateSourceKind, GetEnvironmentById, and environmentKindId to
locate and replace the call.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cmd/api/src/database/upsert_schema_environment.go (1)
29-62:⚠️ Potential issue | 🟠 MajorMulti-step upsert is not wrapped in a transaction — partial failures leave inconsistent state.
UpsertSchemaEnvironmentWithPrincipalKindsperforms multiple dependent write operations (delete old environment → create new environment → delete old principal kinds → create new principal kinds) without a transaction. If any intermediate step fails, the database is left in a half-updated state (e.g., old environment deleted but new one not created, or environment created but principal kinds missing).Consider wrapping the write operations in
s.Transaction(ctx, ...)to ensure atomicity.
🤖 Fix all issues with AI agents
In `@cmd/api/src/database/graphschema_integration_test.go`:
- Around line 2491-2492: The test calls GetEnvironmentByEnvironmentKindId with
environment.ID (the row PK) but the function expects the environment_kind_id;
change the call to pass environment.EnvironmentKindId (the field on the
environment struct) instead of environment.ID so the query filters by the
intended foreign-key value when invoking GetEnvironmentByEnvironmentKindId in
graphschema_integration_test.go.
🧹 Nitpick comments (3)
cmd/api/src/database/graphschema_integration_test.go (1)
1392-1470: Test naming and arg field are now misleading after the method rename.
TestGetSchemaEnvironmentByIdand theargs.environmentIdfield no longer describe what's being tested — the method now looks up byenvironment_kind_id, not the row ID. Consider renaming the test toTestGetSchemaEnvironmentByEnvironmentKindIdand the arg toenvironmentKindIdto keep the test self-documenting.cmd/api/src/database/upsert_schema_finding.go (1)
39-42: Source kind validated but result discarded — consider documenting the intent.
validateAndTranslateSourceKindis called for its side effect (registering the source kind if absent), but the returned ID is discarded. A brief inline comment explaining why this call exists would help future readers understand the intent, since the source kind ID isn't used in the finding creation.cmd/api/src/database/graphschema.go (1)
987-1012: Remove commented-out dead code.This block of commented-out code (a
Kindstruct andGetKindByNamemethod) with a TODO note should be cleaned up. It adds noise and the TODO suggests it's already been implemented elsewhere.
| BEGIN | ||
| SELECT id INTO retreived_environment_kind_id FROM kind WHERE name = v_environment_kind_name; | ||
| IF retreived_environment_kind_id IS NULL THEN | ||
| RAISE EXCEPTION 'couldn''t find matching kind_id'; |
There was a problem hiding this comment.
I think if the exception is removed, the environment kinds must also be declared as node kinds before being created as an environment.
Description
This PR contains odds and ends bits to support the wire up of datapipe to the opengraph schema definitions that both teams have been chuggin on.
Tenant, whenAZTenantis needed. @wes-mil Please look over my change toaz_graph_schema.sqlGetEnvironments*methods onOpenGraphSchemainterface to use the same shared func with configurable sql filters.GetSchemaRelationshipFinding*methods onOpenGraphSchemainterface to use a shared func with configurable sql filters.convertors.goi capitalize theenvironment_idproperty on any generic-ingested node. this will keep environment_id's consistent with the uppercase convention we currently maintain for DomainSIDs and TenantIDsMotivation and Context
BED-6903
OpenGraph findings
How Has This Been Tested?
Screenshots (optional):
Types of changes
Checklist:
Summary by CodeRabbit
New Features
Improvements
Tests