HDDS-14773. [Docs] Add userguide for capacity distribution #365
Open
priyeshkaratha wants to merge 6 commits into apache:master from priyeshkaratha:HDDS-14773
6846cec HDDS-14773. [Docs] Add userguide for capacity distribution (priyeshkaratha)
5148e90 fixing ci errors (priyeshkaratha)
e7092e7 addressing comments (priyeshkaratha)
7a071be addressing review comments (priyeshkaratha)
8b4b0e5 changing refered image (priyeshkaratha)
1368f1a addressing review comments (priyeshkaratha)
135 changes: 135 additions & 0 deletions
...guide/03-operations/09-observability/02-recon/03-recon-capacity-distribution.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,135 @@ | ||
| --- | ||
| sidebar_label: Cluster Capacity User Guide | ||
| --- | ||
| # Cluster Capacity User Guide | ||
|
|
||
| This page is the central place for understanding storage distribution across the Ozone cluster. | ||
| It moves from a high-level physical view to logical service usage, and down to individual node diagnostics. | ||
| Use this guide to understand exactly where your storage capacity is going. | ||
|
|
||
| ## Dashboard Layout Overview | ||
|
|
||
| The Cluster Capacity page is organized logically from top to bottom, increasing in granularity: | ||
|
|
||
| 1. Cluster Summary: The total physical disk view. | ||
| 2. Service Summary: The logical state of Ozone data (Open, Committed, Pending Deletion). | ||
| 3. Pending Deletion & Datanode Insights: Deep dives into the data deletion lifecycle and per-node storage usage. | ||
|
|
||
| --- | ||
|
|
||
| ## Cluster (Physical Capacity) | ||
|
|
||
| The **Cluster** widget provides a high-level summary of the total physical storage managed by Ozone Datanodes. It helps you distinguish between space used by Ozone and space taken by other processes on the underlying hardware. | ||
|
|
||
|  | ||
|
|
||
| ### Metric Definitions | ||
|
|
||
| > Note: All metric values presented here are obtained from a representative sample cluster and are for reference purposes only. | ||
|
|
||
| - **Total Capacity (2.2 TB)** | ||
| The combined capacity of all configured storage directories across all live Datanodes in the cluster. | ||
|
||
| It shows how much usable space is available after subtracting the reserved space from the file system capacity. | ||
|
|
||
| - **Ozone Used Space (437.3 GB)** | ||
| Physical space currently occupied by replicated Ozone blocks. | ||
| > Note: This accounts for the replication factor (e.g., a 100 GB key with 3x replication uses 300 GB of physical space). | ||
|
|
||
| - **Other Used Space (482.5 GB)** | ||
| Space occupied by Ozone-related files that are not actual data stored by Ozone, such as logs, configuration files, and RocksDB files. | ||
|
|
||
| - **Container Pre-allocated (0 B)** | ||
| Space reserved for open containers that have been allocated to clients but not yet written to, ensuring capacity is available when needed. This reserved space decreases as data is written to the containers, and any remaining unused portion is released when the containers are closed. | ||
|
|
||
| - **Remaining Space (1.3 TB)** | ||
| The actual amount of unused physical disk space available for new Ozone data or other files. | ||
|
|
||
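
To make the arithmetic behind these metrics concrete, the short Python sketch below reproduces the replication example from the note (a 100 GB key at 3x) and adds up the sample widget values. It is an illustration only and not part of the guide under review: treating the four components as summing to Total Capacity, and the use of binary units (1 TB = 1024 GB), are assumptions rather than documented Recon behavior.

```python
# Illustrative arithmetic only; the figures are the sample values on this page.
GB_PER_TB = 1024  # assumption: the UI reports binary units


def physical_usage_gb(logical_gb, replication_factor):
    """Physical space consumed by a key stored with simple N-way replication."""
    return logical_gb * replication_factor


# The example from the note: a 100 GB key with 3x replication.
replicated_gb = physical_usage_gb(100, 3)

# Sample widget values, expressed in GB.
ozone_used_gb = 437.3
other_used_gb = 482.5
pre_allocated_gb = 0.0
remaining_gb = 1.3 * GB_PER_TB

# Assumption: the four components together account for Total Capacity.
accounted_tb = (
    ozone_used_gb + other_used_gb + pre_allocated_gb + remaining_gb
) / GB_PER_TB

print(f"replicated: {replicated_gb} GB")
print(f"accounted capacity: {accounted_tb:.1f} TB")
```

With the sample figures, the components come to roughly the 2.2 TB shown as Total Capacity; on a real cluster, rounding in the UI can make the sum differ slightly.
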
| --- | ||
|
|
||
| ## Service (Logical Capacity) | ||
|
|
||
| The **Service** widget transitions from the physical view to the logical view. It breaks down the **Ozone Used Space** based on the state of the keys within the Ozone architecture. | ||
|
|
||
|  | ||
|
|
||
| ### Ozone Used Space Breakdown | ||
|
|
||
| - **Total (437.3 GB)** | ||
| The sum of all Ozone data currently tracked in the system across all states. This matches the physical Ozone Used Space. | ||
|
|
||
| - **Open Keys (2.9 GB)** | ||
| Data in keys that are currently being written to by clients or have not yet been committed to the system. This data is temporary. | ||
|
|
||
| - **Committed Keys (429.5 GB)** | ||
| Finalized and immutable data that is successfully stored and accessible by users. | ||
|
|
||
| - **Pending Deletion (3.8 GB)** | ||
| Data from keys that have been logically deleted by a user but have not yet been physically scrubbed from the Datanodes. This is the combined total size of data pending deletion across OM, SCM, and Datanodes. This space will eventually be reclaimed. | ||
|
|
||
| > 💡 **Administrator Tip:** | ||
| > A high and persistent **Pending Deletion** value might indicate that the automated deletion process is lagging. The next section explains how to investigate that lifecycle. | ||
|
|
||
| --- | ||
|
|
||
| ## Pending Deletion Lifecycle | ||
|
|
||
| This widget provides transparency into the multi-stage process of data deletion in Ozone. It tracks how deletion requests flow from the Ozone Manager through the Storage Container Manager to the final removal of blocks on Datanodes. | ||
|
|
||
|  | ||
|
|
||
| ### Tracking the Stages | ||
|
|
||
| - **Ozone Manager (OM) (0 B)** | ||
| Keys or directories deleted by the client but whose underlying blocks have not yet been processed by SCM. | ||
|
|
||
| - **Storage Container Manager (SCM) (3.8 GB)** | ||
| Blocks that SCM has identified as ready for deletion and is instructing Datanodes to remove. | ||
|
|
||
| - **Datanodes (0 B)** | ||
| Blocks that are queued on the individual Datanodes waiting for physical disk deletion. | ||
|
|
||
| > 💡 **Diagnostic Tip:** | ||
| > If SCM shows 1 TB pending deletion but the Datanodes stage shows 0 B, SCM may be having trouble communicating deletion commands to the nodes. | ||
|
|
||
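
The diagnostic reasoning above can be sketched in Python: sum the three stages to get the combined Pending Deletion figure, then report which stage holds the most data. The function name and the choice of GB as the unit are hypothetical; the inputs are the sample values shown on this page, not output from a real Recon API.

```python
def pending_deletion_summary(om_gb, scm_gb, datanodes_gb):
    """Return (combined total, name of the stage holding the most data).

    The stage name is None when nothing is pending deletion.
    """
    stages = {"OM": om_gb, "SCM": scm_gb, "Datanodes": datanodes_gb}
    total = sum(stages.values())
    largest = max(stages, key=stages.get) if total > 0 else None
    return total, largest


# Sample values from the widget: OM 0 B, SCM 3.8 GB, Datanodes 0 B.
total_gb, stuck_stage = pending_deletion_summary(0.0, 3.8, 0.0)
print(f"pending deletion: {total_gb} GB, largest backlog at: {stuck_stage}")
```

A backlog concentrated at SCM with nothing queued on the Datanodes corresponds to the situation described in the tip, where deletion commands may not be reaching the nodes.
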
| --- | ||
|
|
||
| ## Datanode Insights | ||
|
|
||
| The **Datanodes** section moves from the cluster level to individual node performance. This is crucial for identifying imbalances, failing disks, or nodes that are filling up faster than others. | ||
|
|
||
|  | ||
|
|
||
| ### Using the Datanode Inspector | ||
|
|
||
| - **Download Insights** | ||
| Download a snapshot report of storage distribution across all Datanodes in CSV format. | ||
|
||
|
|
||
| - **Node Selector** | ||
| Use the searchable dropdown to pick a specific Datanode. It displays both the hostname and the UUID for precise identification (e.g., `ozone-datanode-1.ozone_defa`). | ||
|
|
||
| Once a node is selected, the specific storage charts appear. | ||
|
|
||
| ### Individual Node Metric Definitions | ||
|
|
||
| - **Used Space (Node Level) (12 MB)** | ||
| Storage distribution on the selected node. | ||
|
|
||
| - **Pending Deletion (0 B)** | ||
| Space occupied by blocks on this node that are queued for immediate physical deletion. | ||
|
|
||
| - **Ozone Used (12 MB)** | ||
| Physical space used by active, replicated Ozone blocks on this specific node. | ||
|
|
||
| - **Free Space (Node Level) (202.2 GB)** | ||
| Remaining capacity on the selected node. | ||
|
|
||
| - **Unused (202.2 GB)** | ||
| Total unused physical disk space available on this specific node. | ||
|
|
||
| - **Ozone Pre-allocated (0 B)** | ||
| Space reserved specifically for Ozone open containers on this node. | ||
|
|
||
| > 💡 **Diagnostic Tip:** | ||
| > Compare multiple nodes using the Node Selector. Large discrepancies in **Ozone Used** space can indicate data balancing issues in your cluster. | ||
|
|
||
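
One way to turn the comparison in the tip into a number is to divide the largest per-node **Ozone Used** value by the average across nodes. The Python sketch below does this for made-up data: the node names, the values, and the 1.5 threshold are all hypothetical, and the structure of the downloadable CSV is not assumed here.

```python
def ozone_used_spread(used_by_node):
    """Return max(Ozone Used) / mean(Ozone Used); 1.0 means perfectly even."""
    values = list(used_by_node.values())
    if not values:
        raise ValueError("need at least one node")
    mean = sum(values) / len(values)
    return max(values) / mean if mean > 0 else 1.0


# Hypothetical per-node Ozone Used values in GB.
sample = {"dn-1": 12.0, "dn-2": 12.0, "dn-3": 36.0}

SPREAD_THRESHOLD = 1.5  # hypothetical cut-off for "worth a closer look"
spread = ozone_used_spread(sample)
needs_review = spread > SPREAD_THRESHOLD
print(f"spread={spread:.2f} needs_review={needs_review}")
```

A ratio well above 1.0 only suggests that usage is uneven; node capacity differences and recent node additions can produce the same picture, so the threshold should be tuned per cluster.
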
| --- | ||
Binary file added (BIN, +50.3 KB): ...inistrator-guide/03-operations/09-observability/02-recon/data_node_insights.png
Binary file added (BIN, +38.2 KB): ...-administrator-guide/03-operations/09-observability/02-recon/ozone_capacity.png
Binary file added (BIN, +25.5 KB): ...dministrator-guide/03-operations/09-observability/02-recon/pending_deletion.png
Binary file added (BIN, +33.2 KB): ...dministrator-guide/03-operations/09-observability/02-recon/service_capacity.png