Fix in-place tablespace crashes: session lock leak and segment SIGSEGV#1628
Fix in-place tablespace crashes: session lock leak and segment SIGSEGV#1628yjhjstz merged 3 commits intoapache:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Fixes Greenplum in-place tablespace edge cases that could crash backends/segments or emit misleading logs, and addresses a utility-mode session lock leak that caused TAP recovery test failures.
Changes:
- Release movedb session-level lock at commit time for utility-mode backends to prevent lock leakage/assertion failure at process exit.
- Prevent QE SIGSEGV by treating a NULL
CreateTableSpaceStmt->locationas empty string for in-place tablespaces. - Suppress expected
readlink()EINVALnoise when dropping in-place tablespaces (wherepg_tblspc/<oid>is a directory, not a symlink).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/backend/commands/tablespace.c | Guard against NULL stmt->location for in-place tablespaces; avoid logging expected readlink(EINVAL) for in-place drop paths. |
| src/backend/access/transam/xact.c | Release movedb session lock on commit in GP_ROLE_UTILITY to prevent session lock leaks in standalone/utility backends. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
0039c9b to
65670b6
Compare
reshke
left a comment
There was a problem hiding this comment.
Ohhh.. yes, this is clear oversight introduced in recent ported changes.
Somehow this worked in my by-hand hand (I also thought CI was reliable here)
|
Linking for cross-reference #1585 |
|
Lets rebase & merge |
65670b6 to
929f77d
Compare
|
Another rebase after 11c5515 |
ccd5d42 to
65670b6
Compare
When CREATE TABLESPACE ... LOCATION '' is dispatched from QD to QE, the serialization converts the empty string to NULL. On the segment, pstrdup(stmt->location) crashes with SIGSEGV because stmt->location is NULL. Add a NULL guard to treat NULL the same as empty string, preserving in-place tablespace semantics. Fixes apache#1627
movedb() acquires a session-level AccessExclusiveLock on the database
via MoveDbSessionLockAcquire(). The release in CommitTransaction()
only checked for GP_ROLE_DISPATCH and IS_SINGLENODE(), missing
GP_ROLE_UTILITY. This caused the lock to leak in standalone backends
(e.g. TAP tests), triggering a proc.c assertion failure at exit:
FailedAssertion("SHMQueueEmpty(&(MyProc->myProcLocks[i]))")
Add GP_ROLE_UTILITY to the release condition.
Also fix a spurious "could not read symbolic link" log message when
dropping in-place tablespaces: readlink() on a directory returns
EINVAL, which is expected and can be safely skipped.
Fixes apache#1626
Two fixes:
- Use TestLib::slurp_file instead of PostgreSQL::Test::Utils::slurp_file
in adjust_conf(). PostgresNode.pm only imports TestLib, not
PostgreSQL::Test::Utils, so the latter is undefined and causes
t/101_restore_point_and_startup_pause to fail.
- Replace cp with install -m 644 in enable_archiving() archive_command.
coreutils 8.32 (Rocky 8, Ubuntu 22.04) uses copy_file_range() in cp
which crashes on Docker overlayfs. install does not use
copy_file_range(), avoiding the crash.
Also add ic-recovery test suite to rocky8 and deb CI pipelines.
65670b6 to
c09c1a8
Compare
Summary
ALTER DATABASE SET TABLESPACEfor utility-mode backends, which caused assertion failure in TAP test033_replay_tsp_drops.plCREATE TABLESPACE ... LOCATION ''Root Cause
Session lock leak (Issue #1626):
movedb()acquires a session-levelAccessExclusiveLockviaMoveDbSessionLockAcquire(). The release inCommitTransaction()only checkedGp_role == GP_ROLE_DISPATCH || IS_SINGLENODE(), missingGP_ROLE_UTILITY. Standalone backends (TAP tests, pg_basebackup recovery) leaked the lock, triggeringproc.c:1067assertion at exit.Segment SIGSEGV (Issue #1627): When
CREATE TABLESPACE ... LOCATION ''is dispatched to segments, serialization converts the empty string to NULL.pstrdup(NULL)crashes the segment process.Fix
Gp_role == GP_ROLE_UTILITYto the session lock release condition inCommitTransaction()stmt->locationinCreateTableSpace()readlink()EINVAL error for in-place tablespace directories (they are directories, not symlinks)Test Plan
src/test/recovery/t/033_replay_tsp_drops.plpasses (previously crashed with assertion failure)CREATE TABLESPACE ... LOCATION ''on MPP cluster does not crash segmentsFixes #1626
Fixes #1627