
Commit 005ed44

feat: use positions based on a forest with 63 rows
When we compute the position of a node, any node not in row 0 has a position that depends on how many leaves there are. This happens because row 0 is allocated to the nearest power of two that can fit that many leaves. Therefore, in a forest with 6 leaves, the bottom row goes from 0 through 7, and row 1 goes from 8 through 11 (the size of each row halves as you move up). If you add three extra UTXOs, growing the forest to 9 leaves, adding the 9th will require allocating 16 row-0 leaves; row 1 therefore goes from 16 through 23, and so on.

If leaves always stayed at the bottom, that would be fine. Nothing at the bottom ever needs to care about this, because there is no row below it whose growth could shift its positions. However, leaves **do** move up during deletions. For that reason, whenever the forest grows, all targets that are not at the bottom need to be updated.

Now imagine that we want to keep a leaf map from `leaf_hash` to position within the forest: this works fine, and we know where a node must go when deleting by calling [`parent`] with its current position and `num_leaves`. But now imagine the forest has to grow: we need to go through the map and update all non-row-0 leaves. This could potentially involve going through millions of UTXOs and updating them one by one. Note that we can find the next position; it is not super efficient, but it works (see [`crate::proof::Proof::maybe_remap`] for more details). Doing this for every UTXO that is not at the bottom is too expensive, though: even if it happens exponentially less frequently, when it does happen it would take an absurd amount of time and could potentially stall the Utreexo network for hours.

For that reason, we communicate positions as if the forest always had the maximum number of rows it can possibly have, which is 63 (i.e. up to 2^63 leaves). Therefore, those positions never need to be remapped. Internally, we still use the dynamic size, and use the new [`util::translate`] function to translate between the two.
1 parent fbc46c2 commit 005ed44
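The row arithmetic described in the commit message can be sketched in a few lines. `tree_rows` and `start_position_at_row` below are minimal stand-ins for the crate's helpers of the same names, written here only to reproduce the 6-leaf vs 9-leaf example; they are not the crate's actual code.

```rust
// Illustrative stand-ins for the crate's helpers (not the actual code).

/// Rows needed for `leaves` leaves: row 0 is padded up to the next power of
/// two, so this is ceil(log2(leaves)).
fn tree_rows(leaves: u64) -> u8 {
    if leaves <= 1 {
        return 0;
    }
    (64 - (leaves - 1).leading_zeros()) as u8
}

/// First position of `row` in a forest with `forest_rows` rows; each row is
/// half the size of the one below it.
fn start_position_at_row(row: u8, forest_rows: u8) -> u64 {
    ((2u128 << forest_rows) - (2u128 << (forest_rows - row))) as u64
}

fn main() {
    // 6 leaves -> 3 rows: row 0 spans 0..=7, so row 1 starts at 8.
    assert_eq!(tree_rows(6), 3);
    assert_eq!(start_position_at_row(1, 3), 8);

    // 9 leaves -> 4 rows: row 1 now starts at 16. Every position above
    // row 0 just shifted, which is the problem this commit solves.
    assert_eq!(tree_rows(9), 4);
    assert_eq!(start_position_at_row(1, 4), 16);
}
```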

File tree

5 files changed, +112 -7 lines


src/lib.rs — 26 additions & 0 deletions

@@ -28,6 +28,32 @@
 
 extern crate alloc;
 
+/// This is the maximum size the forest is ever allowed to have; this caps how big `num_leaves` can
+/// be (we use a [`u64`]) and is also used by the [`util::translate`] logic.
+///
+/// # Calculations
+///
+/// If you think: "but... is 63 enough space?" Well... assuming there's around 999,000 WUs
+/// available on each block (let's account for header and coinbase), a non-segwit transaction's
+/// size is:
+/// `4 (version) + 1 (vin count) + 41 (input) + 5 (vout for many outputs) + 10N + 4 (locktime)`
+///
+/// `N` is how many outputs we have (we are considering outputs with amount and a zero-sized
+/// script), so for 999,000 WUs we can fit:
+/// - `55 + 10N <= 999,000`
+/// - `N ~= 90k` outputs (a little over)
+///
+/// Since `2^63 = 9,223,372,036,854,775,808`, if you divide this by 90,000 we get
+/// 102,481,911,520,608 blocks. It would take us 3,249,680 years to mine that many blocks.
+///
+/// For the poor soul in 3,249,682 A.D., who needs to fix this hard-fork, here's what you gotta do:
+/// - Change the `leaf_data` type to u128, or q128 if Quantum Bits are the fashionable standard.
+/// - Change `MAX_FOREST_ROWS` to 128 or higher in `lib.rs`
+/// - Modify [`util::start_position_at_row`] to avoid overflows.
+///
+/// That should save you the trouble.
+pub(crate) const MAX_FOREST_ROWS: u8 = 63;
+
 #[cfg(not(feature = "std"))]
 /// Re-exports `alloc` basics plus HashMap/HashSet and IO traits.
 pub mod prelude {
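The doc comment's capacity estimate can be re-run mechanically. This sketch only reproduces the arithmetic quoted above (999,000 budget, 55 fixed bytes, 10 bytes per output, ~90,000 outputs per block); the year figure depends on block-interval assumptions and is left alone.

```rust
fn main() {
    // 55 + 10N <= 999,000  =>  N = 99,894 outputs, which the comment
    // conservatively rounds down to ~90k.
    let max_outputs = (999_000u64 - 55) / 10;
    assert_eq!(max_outputs, 99_894);

    // 2^63 leaf positions, consumed at 90,000 outputs per block, gives the
    // block count quoted in the comment (integer division, i.e. floored).
    let blocks = (1u64 << 63) / 90_000;
    assert_eq!(blocks, 102_481_911_520_608);
}
```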

src/mem_forest/mod.rs — 9 additions & 1 deletion

@@ -49,6 +49,8 @@ use super::util::right_child;
 use super::util::root_position;
 use super::util::tree_rows;
 use crate::prelude::*;
+use crate::util::translate;
+use crate::MAX_FOREST_ROWS;
 
 #[derive(Debug, Clone, Copy, PartialEq, Eq)]
 enum NodeType {

@@ -329,7 +331,13 @@ impl<Hash: AccumulatorHash> MemForest<Hash> {
             .map(|pos| self.get_hash(*pos).unwrap())
             .collect::<Vec<_>>();
 
-        Ok(Proof::new_with_hash(positions, proof))
+        let tree_rows = tree_rows(self.leaves);
+        let translated_targets = positions
+            .into_iter()
+            .map(|pos| translate(pos, tree_rows, MAX_FOREST_ROWS))
+            .collect();
+
+        Ok(Proof::new_with_hash(translated_targets, proof))
     }
 
     /// Returns a reference to the roots in this MemForest.
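The new code path in `prove` maps every in-forest position through `translate` before it goes into the proof. The sketch below shows that mapping end to end; `detect_row`, `start_position_at_row`, and `translate` are throwaway re-implementations for illustration (the real ones live in `src/util/mod.rs`).

```rust
/// Row of `pos`: count the leading set bits under the forest's top bit.
fn detect_row(pos: u64, forest_rows: u8) -> u8 {
    let mut marker = 1u64 << forest_rows;
    let mut row = 0u8;
    while pos & marker != 0 {
        marker >>= 1;
        row += 1;
    }
    row
}

fn start_position_at_row(row: u8, forest_rows: u8) -> u64 {
    ((2u128 << forest_rows) - (2u128 << (forest_rows - row))) as u64
}

/// Same-offset translation between row layouts; row 0 never moves.
fn translate(pos: u64, from_rows: u8, to_rows: u8) -> u64 {
    let row = detect_row(pos, from_rows);
    if row == 0 {
        return pos;
    }
    let offset = pos - start_position_at_row(row, from_rows);
    offset + start_position_at_row(row, to_rows)
}

fn main() {
    // Dynamic positions in a 3-row forest, as `prove` would see them:
    // 0 is in row 0, 8 starts row 1, 12 starts row 2.
    let positions = vec![0u64, 8, 12];
    let wire: Vec<u64> = positions
        .into_iter()
        .map(|pos| translate(pos, 3, 63))
        .collect();

    // Row-0 targets are unchanged; higher rows land at their 63-row starts.
    assert_eq!(wire, vec![0, 1u64 << 63, 13_835_058_055_282_163_712]);
}
```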

src/pollard/mod.rs — 9 additions & 1 deletion

@@ -63,6 +63,8 @@ use super::util::right_child;
 use super::util::root_position;
 use super::util::tree_rows;
 use crate::prelude::*;
+use crate::util::translate;
+use crate::MAX_FOREST_ROWS;
 
 #[derive(Default, Clone)]
 /// A node in the Pollard tree

@@ -703,9 +705,15 @@ impl<Hash: AccumulatorHash> Pollard<Hash> {
             proof_hashes.push(hash);
         }
 
+        let tree_rows = tree_rows(self.leaves);
+        let translated_targets = target_positions
+            .into_iter()
+            .map(|pos| translate(pos, tree_rows, MAX_FOREST_ROWS))
+            .collect();
+
         Ok(Proof::<Hash> {
             hashes: proof_hashes,
-            targets: target_positions,
+            targets: translated_targets,
         })
     }
src/proof/mod.rs — 17 additions & 4 deletions

@@ -74,6 +74,8 @@ use super::util::get_proof_positions;
 use super::util::read_u64;
 use super::util::tree_rows;
 use crate::prelude::*;
+use crate::util::translate;
+use crate::MAX_FOREST_ROWS;
 
 #[derive(Clone, Debug, Eq, PartialEq)]
 #[cfg_attr(feature = "with-serde", derive(Serialize, Deserialize))]

@@ -446,11 +448,16 @@ impl<Hash: AccumulatorHash> Proof<Hash> {
             Vec::<(Hash, Hash)>::with_capacity(util::num_roots(num_leaves));
 
         // the positions that should be passed as a proof
-        let proof_positions = get_proof_positions(&self.targets, num_leaves, total_rows);
+        let translated: Vec<_> = self
+            .targets
+            .iter()
+            .copied()
+            .map(|pos| translate(pos, MAX_FOREST_ROWS, total_rows))
+            .collect();
+        let proof_positions = get_proof_positions(&translated, num_leaves, total_rows);
 
         // As we calculate nodes upwards, it accumulates here
-        let mut nodes: Vec<_> = self
-            .targets
+        let mut nodes: Vec<_> = translated
             .iter()
             .copied()
             .zip(del_hashes.to_owned())

@@ -527,7 +534,13 @@ impl<Hash: AccumulatorHash> Proof<Hash> {
         let mut calculated_root_hashes = Vec::<Hash>::with_capacity(util::num_roots(num_leaves));
 
         // the positions that should be passed as a proof
-        let proof_positions = get_proof_positions(&self.targets, num_leaves, total_rows);
+        let translated: Vec<_> = self
+            .targets
+            .iter()
+            .copied()
+            .map(|pos| translate(pos, MAX_FOREST_ROWS, total_rows))
+            .collect();
+        let proof_positions = get_proof_positions(&translated, num_leaves, total_rows);
 
         // As we calculate nodes upwards, it accumulates here
         let mut nodes: Vec<_> = self
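Note the direction of the arguments: the accumulators call `translate(pos, tree_rows, MAX_FOREST_ROWS)` when producing a proof, while `Proof` calls `translate(pos, MAX_FOREST_ROWS, total_rows)` when consuming one. For any valid position the two are exact inverses, which this sketch checks with throwaway re-implementations of the helpers (illustration only, not the crate's code).

```rust
fn detect_row(pos: u64, forest_rows: u8) -> u8 {
    // A position's row is the number of leading set bits below the forest's
    // top marker bit.
    let mut marker = 1u64 << forest_rows;
    let mut row = 0u8;
    while pos & marker != 0 {
        marker >>= 1;
        row += 1;
    }
    row
}

fn start_position_at_row(row: u8, forest_rows: u8) -> u64 {
    ((2u128 << forest_rows) - (2u128 << (forest_rows - row))) as u64
}

fn translate(pos: u64, from_rows: u8, to_rows: u8) -> u64 {
    let row = detect_row(pos, from_rows);
    if row == 0 {
        return pos;
    }
    let offset = pos - start_position_at_row(row, from_rows);
    offset + start_position_at_row(row, to_rows)
}

fn main() {
    // Prover side: every position of a 3-row forest, sent as if 63 rows.
    // Verifier side: mapped back to the 3-row forest before use.
    for pos in 0..14u64 {
        let wire = translate(pos, 3, 63);
        assert_eq!(translate(wire, 63, 3), pos);
    }
}
```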

src/util/mod.rs — 51 additions & 1 deletion

@@ -30,6 +30,50 @@ pub fn remove_bit(val: u64, bit: u64) -> u64 {
 
     (upper >> 1) | lower
 }
+
+/// Translates targets from a forest with `from_rows` to a forest with `to_rows`.
+///
+/// When we compute the position of a node, any node not in row 0 has a position that depends
+/// on how many leaves there are. This happens because row 0 is allocated to the nearest power of
+/// two that can fit that many leaves. Therefore, in a forest with 6 leaves, the bottom row goes
+/// from 0 through 7, and row 1 goes from 8 through 11 (the size of each row halves as you move
+/// up). If you add three extra UTXOs, growing the forest to 9 leaves, adding the 9th will require
+/// allocating 16 row-0 leaves; row 1 therefore goes from 16 through 23, and so on.
+///
+/// If leaves always stayed at the bottom, that's fine. Nothing at the bottom ever needs to care
+/// about this, because there is no row below it whose growth would shift its positions. However,
+/// leaves **do** move up during deletions. For that reason, whenever the forest grows, all targets
+/// that are not at the bottom need to be updated.
+///
+/// Now imagine that we want to keep a leaf map from `leaf_hash` to position within the forest:
+/// this works fine, and we know where a node must go when deleting by calling [`parent`] with its
+/// current position and `num_leaves`. But now imagine the forest has to grow: we need to go
+/// through the map and update all non-row-0 leaves. This could potentially involve going through
+/// millions of UTXOs and updating them one by one. Note that we can find the next position; it is
+/// not super efficient, but it works (see [`crate::proof::Proof::maybe_remap`] for more details).
+/// But doing this for every UTXO that is not at the bottom is too expensive. Even though it
+/// happens exponentially less frequently, when it does happen, it is going to take an absurd
+/// amount of time and could potentially stall the Utreexo network for hours.
+///
+/// For that reason, we communicate positions as if the forest always had the maximum number of
+/// rows it can possibly have, which is 63. Therefore, those positions never need to be remapped.
+/// Internally, we still use the dynamic size, and use this function to translate between the two.
+///
+/// # Implementation
+///
+/// This function simply computes how far away from the start of the row this leaf is, then uses
+/// that to offset the same amount in the new structure.
+pub fn translate(pos: u64, from_rows: u8, to_rows: u8) -> u64 {
+    let row = detect_row(pos, from_rows);
+    if row == 0 {
+        return pos;
+    }
+
+    let offset = pos - start_position_at_row(row, from_rows);
+    offset + start_position_at_row(row, to_rows)
+}
 
 pub fn calc_next_pos(position: u64, del_pos: u64, forest_rows: u8) -> Result<u64, String> {
     let del_row = detect_row(del_pos, forest_rows);
     let pos_row = detect_row(position, forest_rows);

@@ -93,7 +137,7 @@ pub fn start_position_at_row(row: u8, forest_rows: u8) -> u64 {
     // 2 << forest_rows is 2 more than the max position
     // to get the correct offset for a given row,
     // subtract (2 << `row complement of forest_rows`) from (2 << forest_rows)
-    (2 << forest_rows) - (2 << (forest_rows - row)) as u64
+    ((2_u128 << forest_rows) - (2_u128 << (forest_rows - row))) as u64
 }

@@ -359,6 +403,7 @@ mod tests {
     use super::roots_to_destroy;
     use crate::node_hash::BitcoinNodeHash;
     use crate::util::children;
+    use crate::util::start_position_at_row;
     use crate::util::tree_rows;
 
     #[test]

@@ -501,4 +546,9 @@ mod tests {
         let res = super::calc_next_pos(1, 9, 3);
         assert_eq!(Ok(9), res);
     }
+
+    #[test]
+    fn test_start_position_at_row() {
+        assert_eq!(start_position_at_row(1, 12), 4096);
+    }
 }
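The one-line change to `start_position_at_row` matters precisely at `forest_rows = 63`: there `2 << forest_rows` is 2^64, one past `u64::MAX`, so the old expression could not be evaluated in `u64` (and its trailing `as u64` bound only to the second term). Widening to `u128` lets the intermediate overflow-free subtraction happen first; the final difference always fits back in a `u64`. A minimal check:

```rust
fn start_position_at_row(row: u8, forest_rows: u8) -> u64 {
    // Do the subtraction in u128: with forest_rows = 63 the minuend is 2^64,
    // which does not fit in u64, but the difference always does.
    ((2u128 << forest_rows) - (2u128 << (forest_rows - row))) as u64
}

fn main() {
    // Row 1 of a 63-row forest starts exactly at 2^63: representable in u64
    // even though the intermediate 2 << 63 is not.
    assert_eq!(start_position_at_row(1, 63), 1u64 << 63);

    // The small case covered by the new unit test above.
    assert_eq!(start_position_at_row(1, 12), 4096);
}
```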
