Open
Labels: enhancement (New feature or request)
Description
The Velox backend provides a two-level file cache (`AsyncDataCache` and `SsdCache`), which we have enabled in PR, using a dedicated `MMapAllocator` initialized with a configured capacity. This memory is not counted as execution memory or storage memory, and it is not managed by Spark's `UnifiedMemoryManager`. In this ticket, we would like to fill this gap with the following design:
- Add a `NativeStorageMemory` segment in vanilla `StorageMemory`. We will have a configuration `spark.memory.native.storageFraction` to define its size. We then use the size `offheap.memory * spark.memory.storageFraction * spark.memory.native.storageFraction` to initialize `AsyncDataCache`.
- Add a configuration `spark.memory.storage.preferSpillNative` to decide whether to spill the RDD cache or the native file cache first when storage memory has to shrink. For example, when queries mostly run against the same data sources, we prefer to keep the native file cache.
- Introduce `NativeMemoryStore` to provide interfaces similar to vanilla `MemoryStore` and to call `AsyncDataCache::shrink` when eviction is needed.
- Introduce `NativeStorageMemoryAllocator`, a memory allocator used for creating `AsyncDataCache`. It is wrapped with a `ReservationListener` to track memory usage in the native cache.
- `VeloxBackend` initialization will be done without the cache created. We will call `VeloxBackend::setAsyncDatacache` when the memory pools are initialized.
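
To make the sizing formula and the spill preference concrete, here is a minimal Java sketch. It is only an illustration under assumptions: the class `NativeStorageSizing`, its methods, and the `Victim` enum are hypothetical names that do not exist in Spark, Gluten, or Velox. It also assumes that `spark.memory.storage.preferSpillNative=true` means the native file cache is shrunk before the RDD cache, which the proposal does not state explicitly.

```java
// Hypothetical sketch: none of these names are Spark, Gluten, or Velox APIs.
final class NativeStorageSizing {
    private NativeStorageSizing() {}

    /** Which store is asked to release memory. */
    enum Victim { RDD_CACHE, NATIVE_FILE_CACHE }

    /**
     * Capacity for the native file cache, following the formula in the proposal:
     * offheap.memory * spark.memory.storageFraction * spark.memory.native.storageFraction.
     */
    static long nativeStorageBytes(long offHeapBytes,
                                   double storageFraction,
                                   double nativeStorageFraction) {
        if (offHeapBytes < 0) {
            throw new IllegalArgumentException("offHeapBytes must be >= 0");
        }
        checkFraction("storageFraction", storageFraction);
        checkFraction("nativeStorageFraction", nativeStorageFraction);
        // Truncate toward zero so the cache never exceeds its share.
        return (long) (offHeapBytes * storageFraction * nativeStorageFraction);
    }

    /**
     * Order in which the stores are asked to shrink.
     * Assumption: preferSpillNative == true shrinks the native file cache first.
     */
    static Victim[] evictionOrder(boolean preferSpillNative) {
        return preferSpillNative
            ? new Victim[] {Victim.NATIVE_FILE_CACHE, Victim.RDD_CACHE}
            : new Victim[] {Victim.RDD_CACHE, Victim.NATIVE_FILE_CACHE};
    }

    private static void checkFraction(String name, double value) {
        if (!(value >= 0.0 && value <= 1.0)) {
            throw new IllegalArgumentException(name + " must be in [0, 1]");
        }
    }
}
```

For example, with 8 GiB of off-heap memory, a storage fraction of 0.5, and a native storage fraction of 0.5, `nativeStorageBytes` yields 2 GiB for `AsyncDataCache`. With `preferSpillNative` set to `false`, the RDD cache would be evicted first, which matches the case where queries repeatedly read the same data sources.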
