289 changes: 289 additions & 0 deletions rfc/bh-rfc-4.md
@@ -0,0 +1,289 @@
---
bh-rfc: 4
title: OpenGraph Extension Fundamental Requirements
authors: |
[Alyx Holms](aholms@specterops.io)
status: DRAFT
created: 2025-12-15
audiences: |
- BloodHound OpenGraph Extension Authors
- BHE Team
---

# OpenGraph Extension Fundamental Requirements
Contributor

As I read through and was contemplating the various sections and how they were organized, I was realizing: I think it might be helpful to separate what are actual requirements (namespacing, validation, modularity, etc), from some of our design considerations (kinds vs schemas, table relations, sample JSON, etc).


## 1. Overview

This RFC introduces **OpenGraph Extensions**, a framework for modularly extending BloodHound’s data models. It mandates that all extension-defined attributes (e.g. node kinds, edge kinds, properties) use a unique namespace prefix to prevent conflicts and ensure traceability.
Contributor

This section is really short, and is very focused on the namespace requirement. I think it would be good to expand on this section. Happy to help articulate more here.


## 2. Motivation & Goals

### 2.1 Extensibility
OpenGraph Extensions enable BloodHound to support modular, community-driven data model extensions. Without a standardized approach to namespacing, multiple extensions could define conflicting attribute names, leading to data integrity issues and ambiguous ownership of schema elements.
Contributor

Again this is more focused on the namespacing, and I think there's a lot more to highlight about the modular and extensible nature.


### 2.2 Hybrid Paths
Extensions must be able to define hybrid paths despite the namespacing requirement. They do so by referencing kinds defined by other extensions when declaring a hybrid path.

### 2.3 Intentional Interactions vs Unintentional Collisions
BloodHound needs to support intentional interactions (such as hybrid paths) implicitly through references, and must avoid unintentional collisions through aggressive namespacing for any extension defined identifiers (node kinds, edge kinds, properties, etc)
Contributor

⚠️ Potential issue | 🟡 Minor

Fix grammar: compound adjective and abbreviation period.

Line 28 should read extension-defined (with hyphen) and end properties, etc.) (with period before closing paren).

-BloodHound needs to support intentional interactions (such as hybrid paths) implicitly through references, and must avoid unintentional collisions through aggressive namespacing for any extension defined identifiers (node kinds, edge kinds, properties, etc)
+BloodHound needs to support intentional interactions (such as hybrid paths) implicitly through references, and must avoid unintentional collisions through aggressive namespacing for any extension-defined identifiers (node kinds, edge kinds, properties, etc.)
🧰 Tools
🪛 LanguageTool

[grammar] ~28-~28: Use a hyphen to join words.
Context: ...aggressive namespacing for any extension defined identifiers (node kinds, edge ki...

(QB_NEW_EN_HYPHEN)


[style] ~28-~28: In American English, abbreviations like “etc.” require a period.
Context: ...rs (node kinds, edge kinds, properties, etc) - Avoid Attribute Collisions - Pr...

(ETC_PERIOD)

🤖 Prompt for AI Agents
In rfc/bh-rfc-4.md around line 28, fix the grammar by changing "extension
defined identifiers" to the hyphenated compound adjective "extension-defined
identifiers" and ensure the parenthetical ends with a period before the closing
parenthesis so the phrase reads "properties, etc.)." Replace the text on that
line accordingly.


- **Avoid Attribute Collisions** - Prevent multiple extensions from defining the same attribute.
- **Clarity of Ownership** - Trace attributes back to their defining extension via prefixes.

## 3. Considerations

### 3.1 Impact on Existing Systems

SharpHound and AzureHound will initially bypass namespace validation to avoid breaking changes. A future migration tool will:

- Add namespaces retroactively (e.g. `AD_` for SharpHound, `EAD_` for Azure/EntraHound).
- Stop bundling these extensions while keeping them easy to install.
Contributor

I don't think this is accurate. While AD and AZ will be migrated to the OG/extension platform, they will come pre-installed on any build.


### 3.2 Security & Compliance

- **No Direct Risks** - Namespacing is a structural constraint, not a security feature.
- **Data Integrity** - Prefixes guarantee attribute uniqueness, so each schema element has a single source of truth.

### 3.3 Drawbacks & Alternatives

#### 3.3.1 Increased Verbosity

- **Drawback** - Increased verbosity in attribute names (e.g. `GH_User` vs. `User`).
- **Alternative** - Global registry for attribute names (rejected due to centralization).
Contributor

More specifically, I think we rejected due to the complexity involved with orchestrating and maintaining such a centrally organized system.


#### 3.3.2 Multiple Extensions Covering the Same Technology

- **Drawback** - Multiple extensions cannot simply cover the same technology using the same types (e.g. GitHub).
- **Mitigation**:
    - Extensions should be focused on their own domain.
    - Namespacing prevents conflicts. For hybrid paths between technologies, references will be allowed to types in other namespaces.
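
The reference mechanism in the mitigation above could be checked roughly as follows. This is a minimal sketch, not BloodHound's implementation: the `Registry` mapping, `find_owner`, and `unresolved_references` are hypothetical names, and the `AZ_User` and `OKTA_User` kinds are illustrative only.

```python
from typing import Dict, List, Optional, Set

# Hypothetical registry: namespace -> kinds defined by that installed extension.
Registry = Dict[str, Set[str]]


def find_owner(kind: str, registry: Registry) -> Optional[str]:
    """Return the namespace whose extension defines `kind`, or None."""
    for namespace, kinds in registry.items():
        if kind in kinds:
            return namespace
    return None


def unresolved_references(kinds: List[str], registry: Registry) -> List[str]:
    """List referenced kinds that no installed extension defines."""
    return [kind for kind in kinds if find_owner(kind, registry) is None]


registry = {"GH": {"GH_User"}, "AZ": {"AZ_User"}}
# A hypothetical hybrid edge declared by the GH extension that ends on an AZ kind.
print(unresolved_references(["GH_User", "AZ_User"], registry))    # []
print(unresolved_references(["GH_User", "OKTA_User"], registry))  # ['OKTA_User']
```

Under this sketch a reference is valid whenever some installed extension owns the referenced kind, regardless of which namespace that is.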
Comment on lines +58 to +59
Contributor

⚠️ Potential issue | 🟡 Minor

Adjust list indentation to match Markdown convention.

Lines 58–59 use 4 spaces but should use 2 spaces for nested bullets under the mitigation list.

 - **Mitigation**:
-    - Extensions should be focused on their own domain.
-    - Namespacing prevents conflicts. For hybrid paths between technologies, references will be allowed to types in other namespaces.
+  - Extensions should be focused on their own domain.
+  - Namespacing prevents conflicts. For hybrid paths between technologies, references will be allowed to types in other namespaces.
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

58-58: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


59-59: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

🤖 Prompt for AI Agents
In rfc/bh-rfc-4.md around lines 58 to 59, the nested list items under the
mitigation list use 4-space indentation instead of the 2-space indentation
preferred for Markdown lists; update those two lines to use 2 spaces for the
nested bullets so they render consistently as sub-items under the parent bullet.

Comment on lines +56 to +59
Contributor

Something that I don't think we've captured as a drawback is that we cannot support cross-extension augmentation, which may make it either:

  1. Difficult to get additional info added to an existing extension, or
  2. Authors will create additional nodes and edges, potentially adding bloat and duplicative objects to the graph


## 4. Namespace Declaration

Extensions must declare a namespace in their manifest:

```json
{
  "schema": {
    "name": "github_hound",
    "version": "1.0.0",
    "namespace": "GH"
  },
  "node_kinds": [
    {
      "symbol": "GH_User",
      "representation": "GitHub User",
      "icon": "user",
      "color": "#00FF00"
    }
  ],
  "environments": [
    {
      "environmentKind": "GH_Organization",
      "sourceKind": "GHBase",
Contributor

⚠️ Potential issue | 🟠 Major

Clarify whether sourceKind is restricted to core kinds or can be extension-defined.

The RFC has unresolved ambiguity around the treatment of sourceKind:

  • Section 5 (line 94) lists GHBase as exempt from prefixing, citing it as "a source kind" not defined by the extension. However, this doesn't explicitly state whether sourceKind itself is restricted to core kinds.

  • Section 10 (line 200) states that sourceKind should be created if it doesn't exist, implying extensions can define their own source kinds—yet no guidance is given on whether extension-defined source kinds must follow the naming convention (e.g., GH_Base vs. GHBase).

  • The manifest examples (lines 83, 190) show sourceKind as GHBase (unprefixed), while environmentKind and principalKinds are consistently prefixed (GH_Organization, GH_User).

Recommendation: Explicitly document in Section 5 whether sourceKind is exclusively core-owned (and thus exempt from prefixing) or whether extensions may define source kinds (in which case the naming rule should apply). Align the rule with the examples and the validation logic in Section 10.

      "principalKinds": ["GH_User"]
Comment on lines +61 to +84
Contributor

We should either consider reusing a snippet from the example JSON schema OR at bare minimum ensure that the keys are consistent within this document. Looks like we're using different casing and names.

    }
  ]
}
```

## 5. Attribute Prefixing Rules

- **Format** - `namespace` + `_` + `<type>` (e.g., `GH_User`)
- **Required** - All extension-defined attributes (e.g., `GH_User`, `GH_MemberOf`).
- **Exempt** - References to attributes not defined by the extension (e.g., `"GHBase"` is a source kind).
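
The format rule above can be sketched as two small helpers. The function names `qualify` and `has_prefix` are hypothetical and not part of the RFC; the sketch only encodes the `namespace` + `_` + `<type>` format stated in this section.

```python
def qualify(namespace: str, type_name: str) -> str:
    """Build a prefixed identifier: namespace + "_" + type."""
    return f"{namespace}_{type_name}"


def has_prefix(namespace: str, identifier: str) -> bool:
    """True when identifier starts with "<namespace>_" and has a non-empty type part."""
    prefix = f"{namespace}_"
    return identifier.startswith(prefix) and len(identifier) > len(prefix)


print(qualify("GH", "User"))         # GH_User
print(has_prefix("GH", "GH_User"))   # True
print(has_prefix("GH", "GHBase"))    # False: no "GH_" separator
```

Note that `GHBase` does not satisfy the check, which is consistent with it being listed as an exempt reference rather than an extension-defined attribute.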
Comment on lines +90 to +94
Contributor

Shouldn't this just be a subsection of #4. Namespacing?

Contributor

We should be clear on what exactly doesn't require it. So source_kind, but does environment_kind require it?


## 6. Validation

Reject extensions if:

1. Namespace conflicts with an existing extension.
Contributor

We should be clear that we're specifically referring to the namespace key in particular, which is what we rely on to establish/align to the namespace.

2. Any attribute definition (not reference) lacks the required prefix.
Contributor

Realistically, if an attribute is defined with a namespace key, then the reference will also necessarily require it for the reference to be valid. Maybe there's a better way to clarify this?


**Example of Invalid Schema**:

```json
{
  "namespace": "GH",
  ...
  "node_kinds": [
    { "symbol": "User" }
  ]
}
```

The above schema is invalid because `User` should be prefixed with the extension's declared namespace (e.g. `GH_User`).
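
The two rejection rules could be checked along these lines. This is a sketch under assumptions: `validate_manifest` and `installed_namespaces` are hypothetical names, the manifest layout (top-level `namespace`, `symbol` keys) follows the invalid example in this section rather than any settled schema, and only node kinds are checked.

```python
from typing import List, Set


def validate_manifest(manifest: dict, installed_namespaces: Set[str]) -> List[str]:
    """Return a list of rejection reasons; an empty list means the manifest passes."""
    errors: List[str] = []
    namespace = manifest.get("namespace", "")
    if not namespace:
        return ["manifest does not declare a namespace"]

    # Rule 1: the declared namespace must not collide with an installed extension.
    if namespace in installed_namespaces:
        errors.append(f"namespace {namespace!r} conflicts with an existing extension")

    # Rule 2: every definition (not reference) must carry the "<namespace>_" prefix.
    prefix = f"{namespace}_"
    for kind in manifest.get("node_kinds", []):
        symbol = kind.get("symbol", "")
        if not symbol.startswith(prefix):
            errors.append(f"node kind {symbol!r} lacks the required prefix {prefix!r}")
    return errors


invalid = {"namespace": "GH", "node_kinds": [{"symbol": "User"}]}
print(validate_manifest(invalid, set()))
# ["node kind 'User' lacks the required prefix 'GH_'"]
```

Returning all reasons at once, instead of stopping at the first, is a choice made for this sketch so an extension author can fix every problem in one pass.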

## 7. Handling Customization
Contributor

this section is vague. what kind of customization are we talking about? node icons and colors?

Contributor Author

@superlinkx superlinkx Dec 17, 2025

It's somewhat intentionally vague (this RFC is meant to cover the underlying philosophy of extensions, including future additions we haven't explored yet). But I can add the icon/color examples in, that will help give better guidance than the way this ended up

Contributor

We could also clarify that we are trying to facilitate and promote the user's ability to choose and manage within their own environment. It's not just about including customization options within the schema declaration, but also once that schema is imported.

I think it would be good to call out that extensions define defaults to be used as a baseline, but users are allowed to further customize within their own instance, and user customization should not be overridden.


Extensions may include default customization for their attributes in the manifest. These will be stored with the extension but will not overwrite existing user-customized definitions. Examples of customization include node icons and colors, which are currently defined in the `custom_node_kinds` table. Extensions can declare icons and colors, but they will only be written to the actual `custom_node_kinds` table if they do not already exist.
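
The "write only if absent" behavior described above can be sketched as follows. An in-memory dictionary stands in for the `custom_node_kinds` table here, and `apply_default_customizations` is a hypothetical name; the actual storage and write path are not specified by this RFC.

```python
from typing import Dict

Style = Dict[str, str]  # e.g. {"icon": "user", "color": "#00FF00"}


def apply_default_customizations(
    custom_node_kinds: Dict[str, Style], defaults: Dict[str, Style]
) -> Dict[str, Style]:
    """Merge extension defaults without overwriting existing user customization."""
    merged = dict(custom_node_kinds)
    for kind_name, style in defaults.items():
        # setdefault only writes when the kind has no entry yet.
        merged.setdefault(kind_name, style)
    return merged


existing = {"GH_User": {"icon": "user-secret", "color": "#123456"}}
defaults = {
    "GH_User": {"icon": "user", "color": "#00FF00"},
    "GH_Organization": {"icon": "building", "color": "#0000FF"},
}
merged = apply_default_customizations(existing, defaults)
print(merged["GH_User"]["icon"])           # user-secret (user choice kept)
print(merged["GH_Organization"]["color"])  # #0000FF (default applied)
```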

## 8. Kinds Table Handling
Contributor

I think we would benefit from clarifying the distinction between the kinds system and the schema system we're implementing, as well as specifying how they should relate and interact.

Contributor

Actually, as I read further, I think it would be wise to combine #8. Kinds Table Handling and #9. Environments and Principal Kinds all under a broader Kinds and Schemas section.


Extensions should use junction tables when creating relationships with tables owned by DAWGS (e.g., kinds table). Modifying these tables directly is discouraged for performance and reliability reasons.

```mermaid
erDiagram
    schema_extensions ||--o{ schema_node_kinds : "extension"
    schema_extensions ||--o{ schema_relationship_kinds : "extension"
    schema_extensions ||--o{ schema_environments : "extension"
Comment on lines +127 to +129
Contributor

Should these be shown as "extension_id FK" like we show with the other relationships? Are they not FK'd?


    schema_node_kinds ||--|| kinds : "kind_id FK"
    schema_relationship_kinds ||--|| kinds : "kind_id FK"
    schema_environments ||--|| kinds : "environment_kind_id FK"
    schema_environments ||--|| source_kinds : "source_kind_id FK"
    schema_environments ||--o{ schema_environments_principal_kinds : "environment_id FK"
    schema_environments_principal_kinds ||--|| kinds : "principal_kind FK"

    schema_extensions {
        int id
        text name
        text display_name
        text version
        text namespace
    }

    schema_node_kinds {
        int id
        int extension_id
        int kind_id
        text display_name
        text description
        bool is_display_kind
        text icon
        text color
    }

    schema_relationship_kinds {
        int id
        int extension_id
        int kind_id
        text description
        bool is_traversable
    }

    schema_environments {
        int id
        int extension_id
        int environment_kind_id
        int source_kind_id
    }

    schema_environments_principal_kinds {
        int id
        int environment_id
        int principal_kind
    }

    source_kinds {
        int id
        text name
    }
```

## 9. Environments and Principal Kinds

Environments define the security boundary of a user's model (e.g., Domain in Active Directory, Tenant in Azure). Principal kinds are nodes that count toward exposure/impact scores (e.g., User, Computer).

**Example Environment Schema**:

```json
{
  "environments": [
    {
      "environmentKind": "GH_Organization",
      "sourceKind": "GHBase",
      "principalKinds": ["GH_User"]
Contributor

Are principalKinds (or principal_kinds) optional? Can that field be omitted if there are none? Or is the field required with an empty array?

    }
  ]
}
```

## 10. Validation Rules for Environments

1. Ensure the specified `environmentKind` exists.
2. Ensure the specified `sourceKind` exists (create if it doesn't, reactivate if it does).
3. Ensure all `principalKinds` exist.
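
The three rules above might be applied as in the sketch below. Everything here is an assumption made for illustration: `validate_environment` is a hypothetical name, `known_kinds` is a plain set of kind names, `source_kinds` maps a source kind name to an "active" flag, a missing `principalKinds` field is treated as an empty list, and the source kind is only created or reactivated once the other two checks pass.

```python
from typing import Dict, List, Set


def validate_environment(
    env: dict, known_kinds: Set[str], source_kinds: Dict[str, bool]
) -> List[str]:
    """Check one environment entry; returns a list of problems (empty when valid)."""
    errors: List[str] = []

    # Rule 1: the environment kind must already exist.
    if env["environmentKind"] not in known_kinds:
        errors.append(f"unknown environmentKind {env['environmentKind']!r}")

    # Rule 3: every principal kind must already exist.
    for principal in env.get("principalKinds", []):
        if principal not in known_kinds:
            errors.append(f"unknown principalKind {principal!r}")

    # Rule 2: create the source kind if missing, reactivate it if present.
    # Done last so a rejected environment leaves source_kinds untouched.
    if not errors:
        source_kinds[env["sourceKind"]] = True
    return errors


known = {"GH_Organization", "GH_User"}
sources = {"GHBase": False}  # present but inactive
env = {
    "environmentKind": "GH_Organization",
    "sourceKind": "GHBase",
    "principalKinds": ["GH_User"],
}
print(validate_environment(env, known, sources))  # []
print(sources)                                    # {'GHBase': True}
```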
Comment on lines +204 to +206
Contributor

⚠️ Potential issue | 🟡 Minor

Reduce repetitive sentence structure in validation rules.

Lines 199–201 have three consecutive sentences starting with "Ensure". Vary the structure for clarity.

 1. Ensure the specified `environmentKind` exists.
-2. Ensure the specified `sourceKind` exists (create if it doesn't, reactivate if it does).
-3. Ensure all `principalKinds` exist.
+2. The specified `sourceKind` must exist (or be created and reactivated as needed).
+3. All `principalKinds` must be defined.
🧰 Tools
🪛 LanguageTool

[style] ~201-~201: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... it doesn't, reactivate if it does). 3. Ensure all principalKinds exist.

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🤖 Prompt for AI Agents
rfc/bh-rfc-4.md around lines 199 to 201: the three validation rules each begin
with the word "Ensure", producing repetitive sentence structure; rewrite these
three lines to vary sentence starters and improve clarity while preserving
meaning (for example: "Verify the specified `environmentKind` exists.", "If the
specified `sourceKind` does not exist, create it; if it exists but is inactive,
reactivate it.", "Confirm that all `principalKinds` are present."), keeping the
same order and intent.

Comment on lines +204 to +206
Contributor

Need consistency in these keys' cases (and check for spelling, too).


## 11. Example Upload Artifacts
### Extension JSON example

```json
{
  "schema": {
    "name": "AzureHound",
    "version": "1.0.0",
    "namespace": "AZ"
  },
  "node_kinds": [
    {
      "name": "AZ_Tenant",
      "display_name": "Azure Tenant",
Comment on lines +220 to +221
Contributor

I think I saw that we settled on "name" (instead of "symbol") and "display_name" (instead of "name" or "representation"). Is that correct?

      "description": "An Azure tenant environment",
      "is_display_kind": true,
      "icon": "cloud",
      "color": "0xFF00FF"
Contributor

Research folks noticed that in some places we specify "#FF0000" and others we use "0xFF0000". I'm pretty sure that we use "#FF0000" everywhere, correct? If so, we should make it consistent in this example.

    },
    {
      "name": "AZ_Device",
      "display_name": "Azure Device",
      "description": "An Azure device",
      "is_display_kind": true,
      "icon": "desktop",
      "color": "0x0000FF"
    },
    {
      "name": "AZ_User",
      "display_name": "Azure User",
      "description": "An Azure user account",
      "is_display_kind": true,
      "icon": "user",
      "color": "0x00FF00"
    },
    {
      "name": "AZ_Group",
      "display_name": "Azure Group",
      "description": "An Azure security or distribution group",
      "is_display_kind": true,
      "icon": "group",
      "color": "0xFF0000"
    }
  ],
  "relationship_kinds": [
    {
      "name": "AZ_MemberOf",
      "description": "User or computer is a member of a group",
      "is_traversable": false
    },
    {
      "name": "AZ_HasSession",
      "description": "User has an active session on a device",
      "is_traversable": true
    }
  ],
  "environments": [
    {
      "environment_kind": "AZ_Tenant",
      "source_kind": "AZBase",
      "principal_kinds": [
        "AZ_User",
        "AZ_Group",
        "AZ_Device"
Comment on lines +267 to +270
Contributor

Here's something we haven't thought about. Do we need to explicitly declare the environment as a "principal" as well, or will analysis already consider it as such? I.e. does it need to be self-referred in this list?

      ]
    }
  ],
  "relationship_findings": [
    {
      "name": "AZ_GenericAll",
      "display_name": "Generic All Access (Azure)",
      "environment_kind": "AZ_Tenant",
      "relationship_kind": "AZ_HasSession",
      "remediation": {
        "short_description": "Principal has excessive permissions on target",
        "long_description": "This finding indicates that a principal has been granted GenericAll permissions on a target object, providing complete control including read, write, delete, and permission modification rights.",
        "short_remediation": "Remove the GenericAll permission from the principal's access control list",
        "long_remediation": "# Remove Generic All Permissions\n\n## Steps to Remediate:\n\n1. **Identify the affected object**\n2. **Remove excessive permissions**\n3. **Grant minimal required permissions**"
      }
    }
  ]
}
```