Skip to content

Refactor the HeaderChain into BlockTree#324

Merged
rustaceanrob merged 2 commits intomasterfrom
chain-4-1
Apr 11, 2025
Merged

Refactor the HeaderChain into BlockTree#324
rustaceanrob merged 2 commits intomasterfrom
chain-4-1

Conversation

@rustaceanrob
Copy link
Copy Markdown
Collaborator

@rustaceanrob rustaceanrob commented Apr 4, 2025

Partially works towards #323

Indexing the chain of headers by height and evaluating new headers in a batch is convenient at first, but that approach has lead to poorly implemented audits for difficulty adjustments, finding headers by their hash, and poor interoperability with storage APIs. Of course, basic sanity checks can be ran on a headers message to assert all the headers are connected, pass their own proof of work, etc. This PR does a couple of things, but one of them is to evaluate each header individually for it's relationship to other headers in the chain and if it follows the difficulty adjustment. This approach is extended to the storage backend, where the trait now either 1. accepts a new header 2. handles a reorganization by taking the newly accepted blocks and reorged blocks.

These changes are made far easier with a more robust representation of the header chain. Right now only the chain of most work is stored in memory, and reorganizations only occur if a node provides a message with more work. This representation does not take advantage of the inherent properties of the header chain. If the data structure holds a directed graph of headers, it is trivial to navigate down the graph from the tip of the chain. By storing block data with its associated hash, the directed nature of the blockchain is preserved and accesses to data by hash are O(1). The BlockTree is the implementation of this philosophy, storing both the active chain tip as well as candidate forks, and a hashmap of the block hash to the blocks header and other metadata. Accessing data by height is still valuable, but a chain may have multiple headers at a single height. In the BlockTree only the heights that are considered to be in the chain of most work are stored. When a fork accumulates more work than the current chain, the heights are updated accordingly. This is the most complicated function of the BlockTree, switch_to_fork. All other methods should be straightforward.

This approach differs from the previous, where header chains of equal work are rejected. Due to that compromise, a few tests are removed temporarily. No integration tests are effected, however, which is a decently strong indication the implementation works as expected for most conditions

There will be a lot of follow up to this in terms of the Chain structure. Because each block hash now contains extra metadata, all of the logic within both the CFHeaderChain and FilterChain can be consolidated into the BlockTree

@rustaceanrob rustaceanrob force-pushed the chain-4-1 branch 6 times, most recently from 725b066 to e895611 Compare April 7, 2025 09:37
@rustaceanrob rustaceanrob marked this pull request as ready for review April 7, 2025 09:47
@rustaceanrob
Copy link
Copy Markdown
Collaborator Author

rustaceanrob commented Apr 7, 2025

More details in case future documentation covers why things are done this way:

#324 introduces a "tree" structure. The model itself is a directed acyclic graph, but the resulting graph would look like a tree with potentially multiple leaves but only a single terminating root. Thinking in terms of a graph, each header and associated metadata are a "node", and block hashes are edges. To traverse the graph, you may only follow the previous hash of whatever node you are at. In practice this can be stored as a HashMap, where each "edge" is a key, the block hash, and each "node" is the header, height, accumulated PoW, and other metadata. To determine the next edge to visit, one would just check the previous hash of the header. There is no natural order in a hashmap, so a "tip" structure is used to determine what edge to begin with for a particular chain. The chain with the most work is stored as the active tip. The tip also stores the expected work for the next header in the chain. Properties:

  • Blocks are accessed in O(1) by hash
  • As long as a header connects to some header in the HashMap, a new fork can be created and evaluated trivially
  • The height and hash of the active tip are O(1)
  • If a header connects to the tip, the update is an insert and assigning a new tip, which is computationally cheap as well

It is still nice to have to be able to index by height for: difficulty adjustments, GetCFHeaders, GetCFilters.

  • Because one height can map to potentially many hashes, it is simpler to store only the hash which is considered to be in the most-work chain. When extending the active chain, this is trivial
  • When switching to a new fork, the hashes at each height must be updated as follows, assuming difficult adjustments are audited:
    • If there was no previous hash at the height, the hash from the new chain may be added.
    • If there was a previous hash at this height, the hash in the new chain is compared with the old:
      • If they differ, the hash from the new chain is replaced, and the old hash is reported as reorged
      • If they are the same, stop. This is where the two chains agree
  • Iterators over the graph are trivially implemented, as long as they go from the tip downwards
  • Because each "node" in the graph stores additional data besides the block header, future fields for the filter hash/header, and if the filter was indeed checked may be added to each node in the graph.
  • Now that each header is evaluated individually as they are added to the chain, the database can make use of these changes, and implement caching soon.

Copy link
Copy Markdown
Collaborator

@nyonson nyonson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 6ce550f

self.0 as u64 % params.as_ref().difficulty_adjustment_interval() == 0
}

pub(crate) fn to_u32(self) -> u32 {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I fully grasp the subtle API tradeoffs yet, but any reason to not use a standard From impl here like:

impl From<Height> for u32 {
    fn from(height: Height) -> Self {
        height.0
    }
}

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm not mistaken this wouldn't enable the impl Into<Height> message signature because it is not clear how to go from a u32 to a Height as far as the compiler is concerned

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't impl Into<Height> handled by the existing From<u32> impl?

impl From<u32> for Height {
    fn from(value: u32) -> Self {
        Height(value)
    }
}

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I didn't read what snippet this was above. I like forcing the caller to know what they are converting to here. "into" could apply to both 32 and 64 bits at the callsite

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so just so I understand, you want to avoid a scenario where the caller by accident converts the Height into a u64? I am not sure I follow though, does impl From<Height> for u32 also work with u64? I assumed rust would still require that to be explicit?

Copy link
Copy Markdown
Collaborator Author

@rustaceanrob rustaceanrob Apr 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I extrapolated the suggestion into also implementing for From<Height> for u64, as there is a method to do that with to_u64. I would think it strange to have a to_u64 but no to_u32 because you are supposed to use into() to get your u32. I think you should be able to convert to either one, but the call should be explicit as to which one

I think after this PR there can be some discussion of either:

  1. promoting this type across the crate and making it public
  2. substituting this type with absolute::Height from bitcoin and using an extension trait to do useful things to Height.

If 1. is the conclusion, then we can consider what exactly the API should look like to public users

Copy link
Copy Markdown
Collaborator

@nyonson nyonson Apr 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh whoops, missed the other u64 method. I guess having both implemented would force the caller to be explicit, like with turbofish: let value = height.into::<u32>();, which kinda accomplishes the same goal for being explicit with the functions? Not a big deal, was just curios on the api choice.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I want to do a bit of research on absolute::Height or see if we should iron this one out for public use. For this PR it does the job

Height(height)
}

fn from_u64_checked(height: u64) -> Option<Self> {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to my other API question, can this use a TryFrom<u64> for Height?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, maybe a bit contrived, but since it would be mapped with try_into().ok() and it resembles the checked API I went with an optional.

I discussed with valuedmammal from the BDK team, and it might make sense to actually use the absolute::Height from bitcoin, but we would like have to add an extension trait to do the nice things this Height does. I am punting on it for a future PR where we've thought through what exactly makes sense here.

canonical_hashes: BTreeMap::new(),
headers: HashMap::with_capacity(20_000),
active_tip: tip,
candidate_forks: Vec::with_capacity(2),
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is 2 just kinda a feel good number here? Just wondering if it was influenced by something.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, somewhat of a guess because I think the default might be something like 8 or 16 which is more than I would expect to see. Maybe a bit pedantic of me here

} else {
self.active_tip.next_work_required
};
if let Some(work) = next_work {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts on being defensive and validating the actual pow here? I see it happening in Chain's sanity check which also makes sense, just feels a little naked not seeing it here.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess there would be no harm in it but it would introduce a new variant on the rejected reasoning. I think it is okay for the HeaderBatch to still have some basic role in sanity checking just to get the simple checks out of the way. The function body is already pretty large.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is kind of inline with my pub(crate) concerns. pub(crate) pokes a hole in the visibility red tape so that the item can be called from anywhere, but in this case, it probably should only be called by Chain since it is relying on the validation at that level.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you mean to restrict this to "mod" visibility? I think that makes sense if that's what you mean. Otherwise not sure if I follow

Copy link
Copy Markdown
Collaborator

@nyonson nyonson Apr 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, the two-plane visibility rules of rust (and how pub(crate) overrides both) is confusing. I think this is a little extra confusing because we have the chain module with a child module also called chain.

But at a very high level, this is how I view the hierarchy:

         +--> chain ---> graph (holding the BlockTree type)
         |
         |
 node ---+
         |
         |
         +--> network/peers

In my mind, the graph module should just default to private and have normal public items which only the parent chain module would have access too. And the chain module can be the interface for node.

In order to avoid some big visibility refactor though, pub(chain) (I think technically the parent chain module or the sibling would work...but haven't tested that) is perhaps kind of equivalent in the current layout?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like there is a greater discussion to be had about visibility. For now I'm not sure its really a bleeding problem, but if we reach some sort of protocol on what to scope visibility at given a type's role, then we can apply that across the crate. As an aside I plan on changing the name of chain.rs to something more like manager.rs, which is ultimately what that file will amount to

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably makes it easier during refactoring, but the heavy use of pub(crate) makes it difficult to parse what exactly is the interface of the module. Given the module itself has pub(crate) visibility, might be nice to take a pass at this later and clean up what needs to be exposed to a minimum. Since this is wrapped by the chain module, feels like that is where more thought could be put into the api.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I lean to the explicit pub(crate) because it makes it very clear this is not something one would have to worry about breaking. Just another crate preference that is up to interpretation I think

@ValuedMammal
Copy link
Copy Markdown

Concept ACK

Copy link
Copy Markdown
Collaborator

@nyonson nyonson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 7f26f3e

@rustaceanrob
Copy link
Copy Markdown
Collaborator Author

rustaceanrob commented Apr 11, 2025

Found a bug. In chain.rs at line 331 there is handling for the rejection case in which we don't know the previous hash to the new header. It may not be stored in memory within the BlockTree yet, but we should also check the database for the header at that hash. If it exists, we can start from that header instead.

This patch was doing that, but it was only re-initializing the BlockTree at that particular prev hash. Instead, we also need to load all the headers we have stored after that hash as well. Otherwise, we can easily follow a new fork, but the fork won't be reported to the user because the old chain isn't fully in the BlockTree.

The fix is simple, we just load in all the headers we can after that prev hash we found. Failures to do this load are reported but don't immediately kill the node. I think that should be up to the user, especially since the database just successfully fetched the header at the height we were looking for.

--- a/src/chain/chain.rs
+++ b/src/chain/chain.rs
@@ -335,7 +335,15 @@ impl<H: HeaderStore> Chain<H> {
-                                self.header_chain.accept_header(header);
+                                drop(db);
+                                if let Err(e) = self.load_headers().await {
+                                    crate::log!(self.dialog,
+                                        "Failure when attempting to fetch previous headers while syncing"
+                                    );
+                                    self.dialog.send_warning(Warning::FailedPersistence {
+                                        warning: format!("Persistence failure: {e}"),
+                                    });
+                                }

Note that this branch of logic was almost tested by stop_reorg_start_on_orphan, but the event BlocksDisconnected never actually got emitted. With this diff applied, the test now asserts that the orphan block is reported as disconncted

@rustaceanrob
Copy link
Copy Markdown
Collaborator Author

Going to self-ACK because I have reviewed the rest thoroughly and there is a lot to be done before next release

@rustaceanrob rustaceanrob merged commit ae3afde into master Apr 11, 2025
12 checks passed
@rustaceanrob rustaceanrob deleted the chain-4-1 branch April 11, 2025 11:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants