HDDS-14730. Update Recon container sync to use container IDs #9842
devmadhuu merged 13 commits into apache:master
Conversation
devmadhuu left a comment
Thanks @jasonosullivan34 for the patch. A few comments inline with the code; please check.
Also, please add to the PR description how the patch was tested. It would also be good to add the following tests:
- A unit test for ContainerStateMap.getContainerIDs(state, start, count) verifying pagination and state filtering
- A unit or integration test for syncWithSCMContainerInfo() covering both the "container missing from Recon, add it" path and the "container already present, skip it" path
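To make the suggested pagination test concrete, here is a minimal sketch of the contract being asked for. `PaginationSketch` and its `getContainerIDs` helper are hypothetical stand-ins (the real `ContainerStateMap` lives in hadoop-hdds/server-scm and filters by lifecycle state); this only models the "sorted IDs, resume from start, cap at count" behavior a unit test would pin down.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical model of ContainerStateMap.getContainerIDs(state, start, count)
// pagination, ignoring state filtering for brevity.
public class PaginationSketch {

  // Return up to 'count' IDs that are >= 'start' from a sorted ID set.
  static List<Long> getContainerIDs(NavigableSet<Long> ids, long start, int count) {
    List<Long> page = new ArrayList<>();
    for (long id : ids.tailSet(start, true)) {
      if (page.size() >= count) {
        break;
      }
      page.add(id);
    }
    return page;
  }

  public static void main(String[] args) {
    NavigableSet<Long> closed = new TreeSet<>();
    for (long i = 1; i <= 10; i++) {
      closed.add(i);
    }
    List<Long> first = getContainerIDs(closed, 1, 4);
    // A caller resumes the next page from lastSeenId + 1.
    List<Long> second = getContainerIDs(closed, first.get(first.size() - 1) + 1, 4);
    System.out.println(first + " " + second);
  }
}
```

A real test would additionally assert that IDs in other lifecycle states are excluded from every page.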
devmadhuu left a comment
Thanks @jasonosullivan34 for improving the patch; however, a few points still need to be handled. Please check.
```java
    return true;
  }

  private long getContainerCountPerCall(long totalContainerCount) {
```
Earlier, CONTAINER_METADATA_SIZE was defined as 1MB to estimate a ContainerInfo object. A ContainerID proto is ~8–12 bytes. With an IPC max of 128MB, the old code limited batches to floor(128MB / 1MB) = 128 containers per call. The new code does the same, 128 IDs per call, when it could safely fetch floor(128MB / 12 bytes) ≈ 11 million IDs per call. This means the change may actually increase the number of RPCs instead of reducing them. So I think we should test with a large set of container IDs, especially for a cluster with 4–5 million CLOSED containers, and record the SCM latency and impact, because that was the whole objective behind raising this JIRA. Based on the measured impact, we should think of a better way to limit the load on SCM while also keeping the RPC message length under the default 128MB.
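A back-of-envelope sketch of the batch arithmetic in this thread. All figures are assumptions taken from the discussion: a 128MB IPC limit, a ~1MB per-ContainerInfo estimate in the old code, and ~12 bytes per ContainerID proto on the wire.

```java
// Rough batch-sizing math for the container sync RPCs discussed above.
public class BatchMath {

  // Assumed default Hadoop IPC maximum message size (128 MiB).
  static final long IPC_MAX_BYTES = 128L * 1024 * 1024;

  // Max items per RPC under the IPC limit, given a per-item size estimate.
  static long batchSize(long perItemBytes) {
    return IPC_MAX_BYTES / perItemBytes;
  }

  // RPC calls needed to page through 'total' items at 'batch' items per call.
  static long rpcCalls(long total, long batch) {
    return (total + batch - 1) / batch;
  }

  public static void main(String[] args) {
    System.out.println(batchSize(1024 * 1024)); // old 1MB estimate: 128 per call
    System.out.println(batchSize(12));          // 12-byte IDs: ~11.2 million fit
    System.out.println(rpcCalls(4_000_000, 128));     // old scheme on 4M containers
    System.out.println(rpcCalls(4_000_000, 500_000)); // proposed 500K cap: 8 calls
  }
}
```

The gap between 128 and ~11 million per call is the core of the concern above: the batch limit, not the wire format, dominates the RPC count.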
I propose limiting the number of container IDs we fetch back in a single message to 500K. This equates to 16MB, which is well within the RPC message size limit and should be safe for SCM to handle.
I also want to introduce a config so the number of container IDs fetched can be reduced if we need to fine-tune for memory constraints.
The "~16MB heap" is computed as 500K × ~32 bytes per JVM ContainerID object, but the wire size is 500K × 12 bytes = 6MB. See my comment in the other thread.
devmadhuu left a comment
@jasonosullivan34 thanks for improving the patch. Just a few nits; otherwise it LGTM.
```java
 * Maximum number of ContainerIDs to fetch from SCM per RPC call
 * during container sync. Each ContainerID is approximately 12 bytes
 * on the wire. Reduce this value on memory-constrained Recon nodes.
 * Default: 500,000 (~16MB heap per batch, 8 calls for a 4M container cluster)
```
The "~16MB heap" is computed as 500K × ~32 bytes per JVM ContainerID object, but the wire size is 500K × 12 bytes = 6MB. These are different quantities, and the comment doesn't distinguish them. Operators tuning this config for RPC limits care about wire size; operators tuning for memory care about heap size. Please clarify both, e.g. "~6MB on the wire, ~16MB JVM heap per batch". Also, based on wire size, could the default number of container IDs per RPC call be higher?
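A small sketch of the wire-size vs heap-size distinction raised here, using the thread's assumed figures (~12 bytes per ContainerID on the wire, ~32 bytes per ContainerID object on the JVM heap); both per-item sizes are estimates, not measured values.

```java
// Wire vs heap footprint for a batch of container IDs, per the figures above.
public class SizeEstimates {

  static final long WIRE_BYTES_PER_ID = 12;  // assumed proto wire size
  static final long HEAP_BYTES_PER_ID = 32;  // assumed JVM object size
  static final long RPC_LIMIT_BYTES = 128L * 1024 * 1024;

  static long wireBytes(long ids) { return ids * WIRE_BYTES_PER_ID; }
  static long heapBytes(long ids) { return ids * HEAP_BYTES_PER_ID; }

  public static void main(String[] args) {
    long batch = 500_000;
    System.out.println(wireBytes(batch));  // 6,000,000 B, ~6MB on the wire
    System.out.println(heapBytes(batch));  // 16,000,000 B, ~16MB JVM heap
    // IDs that would fit under the RPC limit by wire size alone:
    System.out.println(RPC_LIMIT_BYTES / WIRE_BYTES_PER_ID);
  }
}
```

The last figure shows why, by wire size alone, the default batch could be much larger than 500K; heap pressure on Recon and SCM is the real constraint.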
… to use ContainerManager.getContainerStateCount(state)
devmadhuu left a comment
Thanks @jasonosullivan34 for improving the patch. LGTM +1
What changes were proposed in this pull request?
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-14730
How was this patch tested?
- Unit tests
- Manual testing