Skip to content

cherry-pick zstd fix + add guc for off changes#158

Merged
visill merged 14 commits intoMDB_6_25_STABLE_YEZZEYfrom
zstd_guc
Apr 30, 2025
Merged

cherry-pick zstd fix + add guc for off changes#158
visill merged 14 commits intoMDB_6_25_STABLE_YEZZEYfrom
zstd_guc

Conversation

@visill
Copy link
Contributor

@visill visill commented Mar 13, 2025

Here are some reminders before you submit the pull request

  • Add tests for the change
  • Document changes
  • Communicate in the mailing list if needed
  • Pass make installcheck
  • Review a PR in return to support the community

Dennis Kovalenko and others added 7 commits March 13, 2025 11:23
gpdb uses zstd to compress workfiles, but it can't control memory allocation within zstd.
Meanwhile, zstd allocates as much memory as it needs which results in ambiguous error
message in a case of memory exhaustion and can make OOM killer stop gpdb processes with
force. This patch forces zstd to use our custom allocator which can keep track of
allocated memory. Since this zstd feature is experimental and its api can change, we
also had to link zstd statically.
@visill visill requested a review from Smyatkin-Maxim March 25, 2025 14:22
Copy link
Contributor

@Smyatkin-Maxim Smyatkin-Maxim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a leak that needs to be fixed.
IMO The perfect solution would be to make MemoryContext a member of BufFile. Then create context when we create BufFile and drop the context when we close BufFile.

The problem is that we need to pass this context somehow to custommem, BUT I see that there is a member named opaque in ZSTD_customMem. I didn't find any documentation on how to use it, but I believe you can do something like customMem.opaque = (void *)file_buf->memory_context; and use this context from customAlloc and customFree.

This way you get rid of global context (which can't be even made static), and get rid of the leak.

Copy link
Contributor

@Smyatkin-Maxim Smyatkin-Maxim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two critical issue here with memory allocations, details are in comments.

Copy link
Contributor

@Smyatkin-Maxim Smyatkin-Maxim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now, feel free to squash and merge

@visill visill merged commit af6ef71 into MDB_6_25_STABLE_YEZZEY Apr 30, 2025
9 of 15 checks passed
visill added a commit that referenced this pull request Jul 2, 2025
* ADBDEV-3774 Use custom allocators for zstd (#566)

gpdb uses zstd to compress workfiles, but it can't control memory allocation within zstd.
Meanwhile, zstd allocates as much memory as it needs which results in ambiguous error
message in a case of memory exhaustion and can make OOM killer stop gpdb processes with
force. This patch forces zstd to use our custom allocator which can keep track of
allocated memory. Since this zstd feature is experimental and its api can change, we
also had to link zstd statically.

* Add GUC

---------

Co-authored-by: Dennis Kovalenko <kds@arenadata.io>
visill added a commit that referenced this pull request Jul 2, 2025
* ADBDEV-3774 Use custom allocators for zstd (#566)

gpdb uses zstd to compress workfiles, but it can't control memory allocation within zstd.
Meanwhile, zstd allocates as much memory as it needs which results in ambiguous error
message in a case of memory exhaustion and can make OOM killer stop gpdb processes with
force. This patch forces zstd to use our custom allocator which can keep track of
allocated memory. Since this zstd feature is experimental and its api can change, we
also had to link zstd statically.

* Add GUC

---------

Co-authored-by: Dennis Kovalenko <kds@arenadata.io>
reshke pushed a commit that referenced this pull request Jul 2, 2025
* ADBDEV-3774 Use custom allocators for zstd (#566)

gpdb uses zstd to compress workfiles, but it can't control memory allocation within zstd.
Meanwhile, zstd allocates as much memory as it needs which results in ambiguous error
message in a case of memory exhaustion and can make OOM killer stop gpdb processes with
force. This patch forces zstd to use our custom allocator which can keep track of
allocated memory. Since this zstd feature is experimental and its api can change, we
also had to link zstd statically.

* Add GUC

---------

Co-authored-by: Dennis Kovalenko <kds@arenadata.io>
reshke pushed a commit that referenced this pull request Jul 2, 2025
* ADBDEV-3774 Use custom allocators for zstd (#566)

gpdb uses zstd to compress workfiles, but it can't control memory allocation within zstd.
Meanwhile, zstd allocates as much memory as it needs which results in ambiguous error
message in a case of memory exhaustion and can make OOM killer stop gpdb processes with
force. This patch forces zstd to use our custom allocator which can keep track of
allocated memory. Since this zstd feature is experimental and its api can change, we
also had to link zstd statically.

* Add GUC

---------

Co-authored-by: Dennis Kovalenko <kds@arenadata.io>
@reshke reshke deleted the zstd_guc branch November 5, 2025 20:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants