
Commit 95384ad

feat: use positions based on a forest with 63 rows
When we compute the position of a node, any node not in row 0 has a position that depends on how many leaves there are. This happens because row 0's size is allocated to the nearest power of two that can fit that many leaves. Therefore, in a forest with 6 leaves, the bottom row goes from 0 through 7, and row 1 from 8 through 11 (the size of each row halves as you move up). If you add three extra UTXOs, growing the forest to nine leaves, adding the 9th will require allocating 16 row-0 leaves; row 1 therefore goes from 16 through 23, and so on.

If leaves always stayed at the bottom, that would be fine. Nothing at the bottom ever needs to care about this, because there's no row below it to grow and shift its positions. However, leaves **do** move up during deletions. For that reason, whenever the forest grows, all targets that aren't at the bottom need to be updated.

Now imagine that we want to keep a leaf map that maps leaf_hash -> position within the forest: this works fine, as we know where a node must go when deleting, by calling [`parent`] with its current position and `num_leaves`. But now imagine the forest has to grow: we need to go through the map and update all non-row-0 leaves. This could potentially involve going through millions of UTXOs and updating them one by one. Note that we can find the next position (it's not super efficient, but it works; see [`crate::proof::Proof::maybe_remap`] for more details), but doing this for every UTXO that isn't at the bottom is too expensive. Even though it happens exponentially less frequently, when it happens it's going to take an absurd amount of time and potentially stall the Utreexo network for hours.

For that reason, we communicate positions as if the forest always had the maximum number of rows it can possibly have, which is 63. Those positions never need to be remapped. Internally, we still use the dynamic size, and use this function to translate between the two.
1 parent 6b14fd0 commit 95384ad
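The row shift the commit message describes can be sketched directly. This is a minimal, self-contained reimplementation of `tree_rows` and `start_position_at_row` from their definitions in this diff (not calls into the crate), showing how row 1's start moves when the forest grows from 6 to 9 leaves:

```rust
// Sketch: how row-start positions shift as the forest grows. `tree_rows` and
// `start_position_at_row` are reimplemented here for illustration.
fn tree_rows(leaves: u64) -> u8 {
    // smallest number of rows whose bottom row can hold `leaves` leaves
    if leaves <= 1 {
        return 0;
    }
    (64 - (leaves - 1).leading_zeros()) as u8
}

fn start_position_at_row(row: u8, forest_rows: u8) -> u64 {
    // widen to u128 so that `forest_rows = 63` does not overflow the shift
    ((2_u128 << forest_rows) - (2_u128 << (forest_rows - row))) as u64
}

fn main() {
    // 6 leaves -> 3 rows: the bottom row spans 0..=7, so row 1 starts at 8.
    assert_eq!(tree_rows(6), 3);
    assert_eq!(start_position_at_row(1, 3), 8);

    // Growing to 9 leaves -> 4 rows: row 0 now spans 0..=15, row 1 starts at 16.
    assert_eq!(tree_rows(9), 4);
    assert_eq!(start_position_at_row(1, 4), 16);
}
```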

File tree

5 files changed

+113
-7
lines changed


src/lib.rs

Lines changed: 28 additions & 0 deletions

@@ -26,6 +26,34 @@
 extern crate alloc;
 
+/// This is the maximum size the forest is ever allowed to have; it caps how big `num_leaves` can
+/// be (we use a [`u64`]) and is also used by the [`util::translate`] logic.
+///
+/// # Calculations
+///
+/// If you think "but... is 63 enough space?": well, assuming there are around 999,000 WUs
+/// available in each block (let's account for the header and coinbase), a non-segwit
+/// transaction's size is:
+///     4 (version) + 1 (vin count) + 41 (input) + 5 (vout count for a large number of outputs)
+///     + 10N + 4 (locktime)
+///
+/// N is how many outputs we have (we are considering outputs with an amount and a zero-sized
+/// script); for 999,000 WU we can fit:
+///     55 + 10N <= 999,000
+///     N ~= 90k outputs (a little over)
+///
+/// 2^63 = 9,223,372,036,854,775,808
+/// Dividing this by 90,000 we get 102,481,911,520,608 blocks;
+/// it would take 3,249,680 years to mine that many blocks...
+///
+/// For the poor soul in 3,249,682 A.D. who needs to fix this hard-fork, here's what you gotta do:
+/// - Change the `leaf_data` type to a u128 (or a q128, if Quantum Bits are the fashionable standard)
+/// - Change `MAX_FOREST_ROWS` to 128 or higher in `lib.rs`
+/// - Modify [`start_position_at_row`] to avoid overflows
+///
+/// That should save you the trouble.
+pub(crate) const MAX_FOREST_ROWS: u8 = 63;
 
 #[cfg(not(feature = "std"))]
 /// Re-exports `alloc` basics plus HashMap/HashSet and IO traits.
 pub mod prelude {
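The doc comment's capacity arithmetic can be sanity-checked. A minimal sketch, assuming the figures from the comment (a ~999,000 WU block budget, a 55-WU fixed transaction overhead, and 10 WU per zero-script output):

```rust
// Sanity-check of the arithmetic in the MAX_FOREST_ROWS doc comment.
fn main() {
    // 55 + 10N <= 999,000  =>  N <= 99,894 outputs per block,
    // so the comment's "~90k" figure is a conservative round-down.
    let max_outputs = (999_000u128 - 55) / 10;
    assert_eq!(max_outputs, 99_894);

    // Blocks needed to create 2^63 leaves at 90,000 outputs per block,
    // matching the figure quoted in the comment:
    let blocks = (1u128 << 63) / 90_000;
    assert_eq!(blocks, 102_481_911_520_608);
}
```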

src/mem_forest/mod.rs

Lines changed: 9 additions & 1 deletion

@@ -47,6 +47,8 @@ use super::util::right_child;
 use super::util::root_position;
 use super::util::tree_rows;
 use crate::prelude::*;
+use crate::util::translate;
+use crate::MAX_FOREST_ROWS;
 
 #[derive(Debug, Clone, Copy, PartialEq, Eq)]
 enum NodeType {
@@ -327,7 +329,13 @@ impl<Hash: AccumulatorHash> MemForest<Hash> {
             .map(|pos| self.get_hash(*pos).unwrap())
             .collect::<Vec<_>>();
 
-        Ok(Proof::new_with_hash(positions, proof))
+        let tree_rows = tree_rows(self.leaves);
+        let translated_targets = positions
+            .into_iter()
+            .map(|pos| translate(pos, tree_rows, MAX_FOREST_ROWS))
+            .collect();
+
+        Ok(Proof::new_with_hash(translated_targets, proof))
     }
 
     /// Returns a reference to the roots in this MemForest.
src/pollard/mod.rs

Lines changed: 9 additions & 1 deletion

@@ -61,6 +61,8 @@ use super::util::right_child;
 use super::util::root_position;
 use super::util::tree_rows;
 use crate::prelude::*;
+use crate::util::translate;
+use crate::MAX_FOREST_ROWS;
 
 #[derive(Default, Clone)]
 /// A node in the Pollard tree
@@ -701,9 +703,15 @@ impl<Hash: AccumulatorHash> Pollard<Hash> {
             proof_hashes.push(hash);
         }
 
+        let tree_rows = tree_rows(self.leaves);
+        let translated_targets = target_positions
+            .into_iter()
+            .map(|pos| translate(pos, tree_rows, MAX_FOREST_ROWS))
+            .collect();
+
         Ok(Proof::<Hash> {
             hashes: proof_hashes,
-            targets: target_positions,
+            targets: translated_targets,
         })
     }
src/proof/mod.rs

Lines changed: 16 additions & 4 deletions

@@ -72,6 +72,7 @@ use super::util::get_proof_positions;
 use super::util::read_u64;
 use super::util::tree_rows;
 use crate::prelude::*;
+use crate::util::translate;
 
 #[derive(Clone, Debug, Eq, PartialEq)]
 #[cfg_attr(feature = "with-serde", derive(Serialize, Deserialize))]
@@ -444,11 +445,16 @@ impl<Hash: AccumulatorHash> Proof<Hash> {
             Vec::<(Hash, Hash)>::with_capacity(util::num_roots(num_leaves));
 
         // the positions that should be passed as a proof
-        let proof_positions = get_proof_positions(&self.targets, num_leaves, total_rows);
+        let translated: Vec<_> = self
+            .targets
+            .iter()
+            .copied()
+            .map(|pos| translate(pos, 63, total_rows))
+            .collect();
+        let proof_positions = get_proof_positions(&translated, num_leaves, total_rows);
 
         // As we calculate nodes upwards, it accumulates here
-        let mut nodes: Vec<_> = self
-            .targets
+        let mut nodes: Vec<_> = translated
             .iter()
             .copied()
             .zip(del_hashes.to_owned())
@@ -525,7 +531,13 @@ impl<Hash: AccumulatorHash> Proof<Hash> {
         let mut calculated_root_hashes = Vec::<Hash>::with_capacity(util::num_roots(num_leaves));
 
         // the positions that should be passed as a proof
-        let proof_positions = get_proof_positions(&self.targets, num_leaves, total_rows);
+        let translated: Vec<_> = self
+            .targets
+            .iter()
+            .copied()
+            .map(|pos| translate(pos, 63, total_rows))
+            .collect();
+        let proof_positions = get_proof_positions(&translated, num_leaves, total_rows);
 
         // As we calculate nodes upwards, it accumulates here
         let mut nodes: Vec<_> = self
src/util/mod.rs

Lines changed: 51 additions & 1 deletion

@@ -28,6 +28,50 @@ pub fn remove_bit(val: u64, bit: u64) -> u64 {
 
     (upper >> 1) | lower
 }
+
+/// Translates targets from a forest with `from_rows` to a forest with `to_rows`.
+///
+/// When we compute the position of a node, any node not in row 0 has a position that depends
+/// on how many leaves there are. This happens because row 0 is allocated to the nearest power of
+/// two that can fit that many leaves. Therefore, in a forest with 6 leaves, the bottom row goes
+/// from 0 through 7, and row 1 goes from 8 through 11 (the size of each row halves as you
+/// move up). If you add three extra UTXOs, growing the forest to 9 leaves, adding the 9th
+/// will require allocating 16 row-0 leaves; row 1 therefore goes from 16 through 23, and so on.
+///
+/// If leaves always stayed at the bottom, that would be fine. Nothing at the bottom ever needs
+/// to care about this, because there is no row below it whose growth would shift its positions.
+/// However, leaves **do** move up during deletions. For that reason, whenever the forest grows,
+/// all targets that are not at the bottom need to be updated.
+///
+/// Now, imagine that we want to keep a leaf map from `leaf_hash` to position within the forest:
+/// this works fine, and we know where a node must go when deleting, by calling [`parent`] with
+/// its current position and `num_leaves`. But now imagine the forest has to grow: we need to go
+/// through the map and update all non-row-0 leaves. This could potentially involve going through
+/// millions of UTXOs and updating them one by one. Note that we can find the next position; it is
+/// not super efficient, but it works (see [`crate::proof::Proof::maybe_remap`] for more details).
+/// But doing this for every UTXO that is not at the bottom is too expensive. Even though it
+/// happens exponentially less frequently, when it happens, it is going to take an absurd amount
+/// of time and potentially stall the Utreexo network for hours.
+///
+/// For that reason, we communicate positions as if the forest always had the maximum number of
+/// rows it can possibly have, which is 63. Therefore, those positions never need to be remapped.
+/// Internally, we still use the dynamic size, and use this function to translate between the two.
+///
+/// # Implementation
+///
+/// This function simply computes how far away from the start of the row this leaf is, then uses
+/// that to offset the same amount in the new structure.
+pub fn translate(pos: u64, from_rows: u8, to_rows: u8) -> u64 {
+    let row = detect_row(pos, from_rows);
+    if row == 0 {
+        return pos;
+    }
+
+    let offset = pos - start_position_at_row(row, from_rows);
+    offset + start_position_at_row(row, to_rows)
+}
+
 pub fn calc_next_pos(position: u64, del_pos: u64, forest_rows: u8) -> Result<u64, String> {
     let del_row = detect_row(del_pos, forest_rows);
     let pos_row = detect_row(position, forest_rows);
@@ -91,7 +135,7 @@ pub fn start_position_at_row(row: u8, forest_rows: u8) -> u64 {
     // 2 << forest_rows is 2 more than the max position
     // to get the correct offset for a given row,
     // subtract (2 << `row complement of forest_rows`) from (2 << forest_rows)
-    (2 << forest_rows) - (2 << (forest_rows - row)) as u64
+    ((2_u128 << forest_rows) - (2_u128 << (forest_rows - row))) as u64
 }
 
 pub fn is_left_niece(position: u64) -> bool {
@@ -357,6 +401,7 @@ mod tests {
     use super::roots_to_destroy;
     use crate::node_hash::BitcoinNodeHash;
     use crate::util::children;
+    use crate::util::start_position_at_row;
     use crate::util::tree_rows;
 
     #[test]
@@ -499,4 +544,9 @@ mod tests {
         let res = super::calc_next_pos(1, 9, 3);
         assert_eq!(Ok(9), res);
     }
+
+    #[test]
+    fn test_start_position_at_row() {
+        assert_eq!(start_position_at_row(1, 12), 4096);
+    }
 }
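Here is the new helper exercised on concrete positions. This is a self-contained sketch: `detect_row` is reimplemented from its usual utreexo definition (it is not shown in this diff), and the other two functions mirror the versions above. Note that `start_position_at_row(1, 63)` equals 2^63, which is exactly why the function now computes in `u128` before casting back to `u64`:

```rust
// Count the leading 1-bits of `pos` within a `forest_rows`-row forest to find
// its row (reimplemented here; not part of this diff).
fn detect_row(pos: u64, forest_rows: u8) -> u8 {
    let mut marker = 1u64 << forest_rows;
    let mut row = 0;
    while pos & marker != 0 {
        marker >>= 1;
        row += 1;
    }
    row
}

fn start_position_at_row(row: u8, forest_rows: u8) -> u64 {
    // u128 arithmetic avoids the u64 overflow of `2 << 63`
    ((2_u128 << forest_rows) - (2_u128 << (forest_rows - row))) as u64
}

fn translate(pos: u64, from_rows: u8, to_rows: u8) -> u64 {
    let row = detect_row(pos, from_rows);
    if row == 0 {
        return pos;
    }
    let offset = pos - start_position_at_row(row, from_rows);
    offset + start_position_at_row(row, to_rows)
}

fn main() {
    // Row-0 positions are never remapped.
    assert_eq!(translate(5, 3, 63), 5);
    // Position 9 is the second node of row 1 in a 3-row forest (row 1 spans
    // 8..=11). In a 63-row forest, row 1 starts at 2^63, so 9 maps to 2^63 + 1.
    assert_eq!(translate(9, 3, 63), (1u64 << 63) + 1);
    // The value checked by the new unit test: row 1 of a 12-row forest starts at 4096.
    assert_eq!(start_position_at_row(1, 12), 4096);
}
```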
