# Archival node

An archival node keeps the full blockchain history and serves historical queries via liteserver. Setting one up involves two steps: importing existing archives into epoch-based storage, then starting the node in archival mode.

## Table of contents

- [System requirements](#system-requirements)
- [Source data](#source-data)
- [Archive import](#archive-import)
  - [Parameters](#parameters)
  - [What the import does](#what-the-import-does)
  - [Output structure](#output-structure)
- [Archival mode config](#archival-mode-config)
  - [Simple setup](#simple-setup)
  - [Distributed setup](#distributed-setup)
  - [Behavior](#behavior)
- [Cloning an existing archival node](#cloning-an-existing-archival-node)
- [Helm integration](#helm-integration)
## System requirements

| Requirement | Minimum |
|-------------|---------|
| Disk | 20 TB (block archives + cells database) |
| RAM | 32 GB |
| Import time | Several days for the full blockchain history |
## Source data

TON block archives are stored as pairs of `.pack` files per archive group (masterchain + shard blocks):

```
archive.00000.pack                      # masterchain blocks 0-99
archive.00000.0:8000000000000000.pack   # workchain 0 shard blocks for the same range
archive.00100.pack                      # masterchain blocks 100-199
archive.00100.0:8000000000000000.pack   # workchain 0 shard blocks
...
```

You also need:

- The masterchain zerostate `.boc` file
- The workchain zerostate `.boc` file(s), one per workchain
- The global config JSON (contains the zerostate hashes and the hard fork list)
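
Since a full import can run for days, it is worth checking up front that every masterchain package has its shard counterpart. A minimal sketch of such a pre-flight check, using a temp directory with fake files as a stand-in for the real archives directory:

```shell
# Illustrative sanity check: every masterchain package should have a
# matching workchain 0 shard package. The temp dir and fake files stand
# in for a real archives directory.
ARCHIVES=$(mktemp -d)
touch "$ARCHIVES/archive.00000.pack" \
      "$ARCHIVES/archive.00000.0:8000000000000000.pack" \
      "$ARCHIVES/archive.00100.pack"          # shard file deliberately missing

missing=""
for mc in "$ARCHIVES"/archive.*.pack; do
  case "$mc" in *:*) continue ;; esac        # skip shard packages themselves
  id=$(basename "$mc" .pack | cut -d. -f2)   # e.g. 00100
  if [ ! -f "$ARCHIVES/archive.$id.0:8000000000000000.pack" ]; then
    echo "missing shard package for archive $id"
    missing="$missing$id "
  fi
done
```

Running this over a real archives directory (with the `touch` setup removed) prints one line per masterchain package that lacks a shard file.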

## Archive import

The `archive_import` tool converts raw `.pack` files into the epoch-based storage used by the archival node.

```bash
RUST_LOG=info archive_import \
  --archives-path /path/to/archives \
  --epochs-path /data/epochs \
  --node-db-path /data/node_db \
  --mc-zerostate /path/to/mc_zerostate.boc \
  --wc-zerostate /path/to/wc0_zerostate.boc \
  --global-config /path/to/global-config.json
```

### Parameters

| Parameter | Required | Description |
|-----------|----------|-------------|
| `--archives-path` | yes | Directory containing source `.pack` files |
| `--epochs-path` | yes | Directory where epoch subdirectories will be created |
| `--node-db-path` | yes | Path to the node database |
| `--mc-zerostate` | yes | Path to the masterchain zerostate `.boc` file |
| `--wc-zerostate` | yes | Path to a workchain zerostate `.boc` file (repeat for each workchain) |
| `--global-config` | yes | Path to the global config JSON |
| `--epoch-size` | no | MC blocks per epoch; default `10000000` (must be a multiple of `20000`) |
| `--copy` | no | Copy `.pack` files instead of moving them |
### What the import does

1. Validates the zerostate hashes against the global config
2. Scans the archives directory and groups `.pack` files by archive ID
3. For each group: deserializes blocks, validates proofs, and imports the packages into epoch storage
4. Populates the node database: block handles, prev/next block links, the block index, and shard states

The import is resumable: if interrupted, re-run it with the same parameters and it continues from the last fully imported group.
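
Because re-running with the same parameters is safe, a crash-tolerant import can be wrapped in a plain retry loop. A sketch, where `run_import` is a stand-in for the full `archive_import` command line above (here it always fails so the loop's bookkeeping is visible):

```shell
# Retry the import until it succeeds; this is safe because archive_import
# resumes from the last fully imported group. run_import is a stand-in
# for the real archive_import invocation.
run_import() { false; }   # replace the body with the archive_import call

attempt=0
until run_import; do
  attempt=$((attempt + 1))
  if [ "$attempt" -ge 3 ]; then
    echo "giving up after $attempt attempts" >&2
    break
  fi
  echo "import interrupted, retrying (attempt $attempt)" >&2
  sleep 1
done
```

In production you would raise the attempt cap and the sleep interval; the point is only that no state cleanup is needed between attempts.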

### Output structure

```
/data/node_db/
  db/                  # main RocksDB (block handles, indexes, state keys)
  archive_states/      # shard state cell storage

/data/epochs/
  epoch_0/             # archive packages for MC blocks 0..epoch_size-1
    archive_db/        # RocksDB with package metadata
    archive/packages/  # .pack files
  epoch_1/
  ...
```
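
After an import you can eyeball the result by counting packages per epoch. A toy sketch over the layout above, with a temp directory standing in for `/data/epochs`:

```shell
# Report the number of .pack files in each epoch directory.
# The temp dir with a fake layout stands in for /data/epochs.
EPOCHS=$(mktemp -d)
mkdir -p "$EPOCHS/epoch_0/archive/packages" "$EPOCHS/epoch_1/archive/packages"
touch "$EPOCHS/epoch_0/archive/packages/archive.00000.pack"

for e in "$EPOCHS"/epoch_*; do
  count=$(ls "$e/archive/packages" | wc -l | tr -d ' ')
  echo "$(basename "$e"): $count package(s)"
done
```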

---

## Archival mode config

Add the `archival_mode` section to the node config to enable epoch-based archival storage. Set `internal_db_path` to point to the database created by the import.

### Simple setup

All epochs live in one directory. The node auto-discovers existing epochs on startup and creates new ones in the same location.

```json
{
  "internal_db_path": "/data/node_db",
  "archival_mode": {
    "epoch_size": 10000000,
    "new_epochs_path": "/data/epochs",
    "existing_epochs": []
  }
}
```

The `epoch_size` must match the value used during import.

### Distributed setup

Keep imported epochs on separate (slower) storage and new epochs on fast storage. List the imported epoch directories explicitly in `existing_epochs`.

```json
{
  "internal_db_path": "/data/node_db",
  "archival_mode": {
    "epoch_size": 10000000,
    "new_epochs_path": "/fast_ssd/new_epochs",
    "existing_epochs": [
      { "path": "/nfs/imported/epoch_0" },
      { "path": "/nfs/imported/epoch_1" },
      { "path": "/fast_ssd/imported/epoch_2" }
    ]
  }
}
```

> **Note:** The last imported epoch is likely incomplete, since its range covers blocks that are still being produced. It will continue to receive new blocks during sync, so place it on fast storage alongside `new_epochs_path`.

### Behavior

When `archival_mode` is set:

- Archive GC is disabled, so all historical data is preserved
- Shard states are stored in a separate cell database (`archive_states/`)
- New blocks arriving via sync are appended to the latest epoch

> **See also:** For a simpler setup that keeps full history without epoch-based storage, see the [archival node section in node-config.md](node-config.md#archival-node). That approach disables GC and works without the import step, but requires the node to sync the entire history from scratch.

---

## Cloning an existing archival node

Instead of importing from scratch, you can copy the data from a running archival node:

1. Stop the source node
2. Copy the epoch directories and the node database (`rsync` or similar)
3. On the new machine, generate a fresh node config with new ADNL keys
4. Set `internal_db_path` and `archival_mode` to point at the copied data
5. Start the new node

This works because the database contains only blockchain data (blocks, states, indexes). Node identity (ADNL keys, validator keys) lives in the config file and secrets vault, not in the database.
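
The copy steps above can be sketched as a local copy; temp directories stand in for the source host's data and the new machine, and in practice you would run `rsync` over SSH after stopping the source node:

```shell
# Illustrative clone: copy epoch directories and the node database from a
# stopped source node to a fresh machine. Temp dirs stand in for the
# source data and the destination; paths are assumptions.
SRC=$(mktemp -d)   # stand-in for the source node's data root
DST=$(mktemp -d)   # stand-in for the new machine's data root
mkdir -p "$SRC/epochs/epoch_0" "$SRC/node_db/db"

# Step 2: copy epochs and the node database (rsync -a over SSH in practice)
cp -a "$SRC/epochs"  "$DST/epochs"
cp -a "$SRC/node_db" "$DST/node_db"

# Steps 3-4 then happen in the node config on the new machine: fresh ADNL
# keys, internal_db_path and archival_mode pointing at the copied data.
```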

> **Important:** Do not copy the node config file: it contains the ADNL private keys of the source node. Always generate fresh keys for the new node.

---

## Helm integration

The archive import runs outside of Kubernetes as a one-time migration step. After the import, configure the Helm chart to start the node in archival mode:

1. Mount the epoch storage and the node database into the pod using `extraVolumes` and `extraVolumeMounts`
2. Set `archival_mode` in the node config (`nodeConfigs`) with paths matching the mount points
3. Size `storage.db.size` for the node database (the epoch data lives on external volumes)
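
A hypothetical values fragment tying the three steps together. Only `extraVolumes`, `extraVolumeMounts`, `nodeConfigs`, and `storage.db.size` are named by the chart docs above; the volume name, claim name, and the exact layout under `nodeConfigs` are assumptions:

```yaml
# Illustrative only: everything below the top-level keys named in the
# steps above is an assumption about the chart's schema.
extraVolumes:
  - name: epochs
    persistentVolumeClaim:
      claimName: archival-epochs
extraVolumeMounts:
  - name: epochs
    mountPath: /data/epochs
nodeConfigs:
  archival_mode:
    epoch_size: 10000000
    new_epochs_path: /data/epochs
    existing_epochs: []
```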

> **See also:** [node-config.md](node-config.md#archival-node) covers the GC-based approach to keeping full history, which does not require the import step but uses more disk on the primary volume.