Description
Backend
VL (Velox)
Bug description
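There appears to be a memory leak in the TableScan operator: when the task finishes, the TableScan memory pool still holds reserved memory, and releasing the Velox memory manager fails (see the logs below).
The Velox team responded in this issue: Velox Issue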
Sample query
WITH
cdl_summary AS (
SELECT
user_id,
CASE
WHEN <TRACKING_LABEL_COL> = 'store_front' THEN 'storefront'
ELSE <TRACKING_LABEL_COL>
END AS source,
SUM(CASE WHEN name IN (<IMPRESSION_EVENTS>)
AND datestr BETWEEN :START_7D AND :END_DATE THEN 1 ELSE 0 END) AS impression_count_7d,
SUM(CASE WHEN name IN (<IMPRESSION_EVENTS>)
AND datestr BETWEEN :START_14D AND :END_DATE THEN 1 ELSE 0 END) AS impression_count_14d,
SUM(CASE WHEN name IN (<IMPRESSION_EVENTS>)
AND datestr BETWEEN :START_28D AND :END_DATE THEN 1 ELSE 0 END) AS impression_count_28d,
SUM(CASE WHEN name IN (<IMPRESSION_EVENTS>)
AND datestr BETWEEN :START_56D AND :END_DATE THEN 1 ELSE 0 END) AS impression_count_56d,
SUM(CASE WHEN name IN (<IMPRESSION_EVENTS>)
AND datestr BETWEEN :START_112D AND :END_DATE THEN 1 ELSE 0 END) AS impression_count_112d,
SUM(CASE WHEN name IN (<CLICK_EVENTS>)
AND datestr BETWEEN :START_7D AND :END_DATE THEN 1 ELSE 0 END) AS click_count_7d,
SUM(CASE WHEN name IN (<CLICK_EVENTS>)
AND datestr BETWEEN :START_14D AND :END_DATE THEN 1 ELSE 0 END) AS click_count_14d,
SUM(CASE WHEN name IN (<CLICK_EVENTS>)
AND datestr BETWEEN :START_28D AND :END_DATE THEN 1 ELSE 0 END) AS click_count_28d,
SUM(CASE WHEN name IN (<CLICK_EVENTS>)
AND datestr BETWEEN :START_56D AND :END_DATE THEN 1 ELSE 0 END) AS click_count_56d,
SUM(CASE WHEN name IN (<CLICK_EVENTS>)
AND datestr BETWEEN :START_112D AND :END_DATE THEN 1 ELSE 0 END) AS click_count_112d,
SUM(CASE WHEN name IN (<ORDER_EVENTS>)
AND datestr BETWEEN :START_7D AND :END_DATE THEN 1 ELSE 0 END) AS order_count_7d,
SUM(CASE WHEN name IN (<ORDER_EVENTS>)
AND datestr BETWEEN :START_14D AND :END_DATE THEN 1 ELSE 0 END) AS order_count_14d,
SUM(CASE WHEN name IN (<ORDER_EVENTS>)
AND datestr BETWEEN :START_28D AND :END_DATE THEN 1 ELSE 0 END) AS order_count_28d,
SUM(CASE WHEN name IN (<ORDER_EVENTS>)
AND datestr BETWEEN :START_56D AND :END_DATE THEN 1 ELSE 0 END) AS order_count_56d,
SUM(CASE WHEN name IN (<ORDER_EVENTS>)
AND datestr BETWEEN :START_112D AND :END_DATE THEN 1 ELSE 0 END) AS order_count_112d,
AVG(CASE
WHEN name = 'marketplace_scrolled'
AND datestr BETWEEN :START_7D AND :END_DATE THEN 0
WHEN name IN ('feed_item_card_scrolled','feed_item_dish_card_scrolled')
AND datestr BETWEEN :START_7D AND :END_DATE
THEN feed.display_item_position
ELSE NULL
END) AS avg_scroll_depth_7d,
MAX(CASE WHEN name IN (<INTERACTION_EVENTS>)
AND datestr BETWEEN :START_112D AND :END_DATE
THEN epoch_ms ELSE NULL END) AS latest_interaction_time
FROM <SCHEMA_CDL>.<TABLE_CDL> cdl
WHERE TRUE
AND datestr BETWEEN :START_112D AND :END_DATE
AND name IN (<IMPRESSION_EVENTS>, <CLICK_EVENTS>, <ORDER_EVENTS>)
AND COALESCE(<SESSION_ID_COL>, user_id) IS NOT NULL
AND is_first_event = TRUE
AND user_id IS NOT NULL AND user_id <> ''
AND <FEED_CONTEXT_COL> IN ('home','vertical','allstores','all_stores')
GROUP BY 1,2
),
xlb_summary AS (
SELECT
user_id,
CASE
WHEN <TRACKING_LABEL_COL> = 'store_front' THEN 'storefront'
ELSE <TRACKING_LABEL_COL>
END AS source,
MAX(CASE WHEN name IN (<INTERACTION_EVENTS>)
AND datestr BETWEEN :START_112D AND :END_DATE
THEN epoch_ms ELSE NULL END) AS latest_interaction_time
FROM <SCHEMA_XLB>.<TABLE_XLB> xlb
WHERE TRUE
GROUP BY 1,2
),
data_summary AS (
SELECT
COALESCE(cdl.user_id, xlb.user_id) AS user_id,
COALESCE(cdl.source, xlb.source) AS source,
COALESCE(cdl.impression_count_7d, 0)
+ COALESCE(xlb.impression_count_7d, 0) AS impression_count_7d,
GREATEST(cdl.latest_interaction_time, xlb.latest_interaction_time)
AS latest_interaction_time
FROM cdl_summary cdl
FULL OUTER JOIN xlb_summary xlb
ON cdl.user_id = xlb.user_id
AND cdl.source = xlb.source
),
summary_agg AS (
SELECT
user_id,
SUM(impression_count_7d) AS total_impression_7d,
SUM(order_count_112d) AS total_order_112d
FROM data_summary
GROUP BY 1
)
INSERT OVERWRITE TABLE <SCHEMA_TARGET>.<TABLE_TARGET>
PARTITION (datestr)
SELECT
CONCAT(d.user_id, '|', d.source) AS uuid,
d.user_id,
d.source,
d.impression_count_7d,
agg.total_impression_7d,
d.latest_interaction_time,
:END_DATE AS datestr
FROM data_summary d
JOIN summary_agg agg
ON d.user_id = agg.user_id
WHERE d.source IS NOT NULL;
Gluten version
No response
Spark version
None
Spark configurations
NA
System information
NA
Relevant logs
E20250427 14:32:18.128706 176 VeloxMemoryManager.cc:401] Failed to release Velox memory manager after 43350ms as there are still outstanding memory resources.
E20250427 14:32:18.128832 176 MemoryPool.cpp:435] [MEM] Memory leak (Used memory): Memory Pool[op.0.0.0.TableScan LEAF root[root] parent[node.0] MALLOC track-usage thread-safe]<unlimited max capacity unlimited capacity used 276.00KB available 748.00KB reservation [used 276.00KB, reserved 1.00MB, min 0B] counters [allocs 114189, frees 114184, reserves 0, releases 1, collisions 0])>
E20250427 14:32:18.128959 176 Exceptions.h:66] Line: cpp/velox/memory/VeloxMemoryManager.cc:102, Function:removePool, Expression: pool->reservedBytes() == 0 (1048576 vs. 0), Source: RUNTIME, ErrorCode: INVALID_STATE
terminate called after throwing an instance of 'facebook::velox::VeloxRuntimeError'
what(): Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: (1048576 vs. 0)
Retriable: False
Expression: pool->reservedBytes() == 0
Function: removePool
File: cpp/velox/memory/VeloxMemoryManager.cc
Line: 102
Stack trace:
# 0 _ZN8facebook5velox7process10StackTraceC1Ei
# 1 _ZN8facebook5velox14VeloxExceptionC1EPKcmS3_St17basic_string_viewIcSt11char_traitsIcEES7_S7_S7_bNS1_4TypeES7_
# 2 _ZN8facebook5velox6detail14veloxCheckFailINS0_17VeloxRuntimeErrorERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEvRKNS1_18VeloxCheckFailArgsET0_
# 3 _ZN6gluten20ListenableArbitrator10removePoolEPN8facebook5velox6memory10MemoryPoolE
# 4 _ZN8facebook5velox6memory13MemoryManager8dropPoolEPNS1_10MemoryPoolE
# 5 _ZN8facebook5velox6memory14MemoryPoolImplD2Ev
# 6 _ZNSt16_Sp_counted_baseILN9__gnu_cxx12_Lock_policyE2EE24_M_release_last_use_coldEv
# 7 _ZN6gluten18VeloxMemoryManagerD1Ev
# 8 _ZN6gluten18VeloxMemoryManagerD0Ev
# 9 _ZN6gluten13MemoryManager7releaseEPS0_
# 10 Java_org_apache_gluten_memory_NativeMemoryManagerJniWrapper_release
# 11 0x00007fc63ca15074
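To make the numbers in the leak line easier to read, here is a minimal Python sketch that extracts the pool name, the count of allocations never freed, and the used and reserved sizes from the `MemoryPool.cpp` message above. The helper names (`to_bytes`, `summarize_leak`) are hypothetical and not part of Gluten or Velox. The 1024-based unit conversion is an assumption, though it agrees with the `1048576 vs. 0` value in the `removePool` check for the `reserved 1.00MB` field.

```python
import re

# Leak message copied from the MemoryPool.cpp log line above (timestamp prefix dropped).
LEAK_LINE = (
    "Memory Pool[op.0.0.0.TableScan LEAF root[root] parent[node.0] "
    "MALLOC track-usage thread-safe]<unlimited max capacity unlimited capacity "
    "used 276.00KB available 748.00KB reservation "
    "[used 276.00KB, reserved 1.00MB, min 0B] "
    "counters [allocs 114189, frees 114184, reserves 0, releases 1, collisions 0])>"
)

# Assumption: sizes in the log use 1024-based units.
UNIT_BYTES = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def to_bytes(text):
    """Convert a size string such as '276.00KB' to a byte count."""
    match = re.fullmatch(r"([\d.]+)(B|KB|MB|GB)", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(float(match.group(1)) * UNIT_BYTES[match.group(2)])


def summarize_leak(line):
    """Pull the pool name, outstanding allocations, and reservation out of a leak line."""
    pool = re.search(r"Memory Pool\[(\S+)", line).group(1)
    counters = {k: int(v) for k, v in re.findall(r"(allocs|frees) (\d+)", line)}
    reservation = re.search(
        r"reservation \[used ([\d.]+[A-Z]+), reserved ([\d.]+[A-Z]+)", line
    )
    return {
        "pool": pool,
        "outstanding_allocs": counters["allocs"] - counters["frees"],
        "used_bytes": to_bytes(reservation.group(1)),
        "reserved_bytes": to_bytes(reservation.group(2)),
    }


if __name__ == "__main__":
    for key, value in summarize_leak(LEAK_LINE).items():
        print(f"{key}: {value}")
```

For this log, the sketch reports 5 allocations without a matching free in the `op.0.0.0.TableScan` pool and a 1048576-byte reservation, which is the value that fails the `pool->reservedBytes() == 0` check in `removePool`.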