fix(migrations): Skip pg_dumpall restrict/unrestrict tokens in drift check #112777
…check

PostgreSQL 16+ emits \restrict and \unrestrict directives with random per-session tokens in pg_dumpall output. These tokens differ between any two dumps and are unrelated to the database schema, causing the migration drift detection to always report false positives.

Filter out these lines in the norm() function alongside the existing filters for comments and django_migrations tables.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
        continue
    if last == "\n" and line == "\n":
        continue
    else:
Bug: The script will crash when parsing a \connect statement that includes a role, due to an incorrect assumption about the number of arguments.
Severity: HIGH
Suggested Fix
Modify the `\connect` parsing logic to handle a variable number of arguments. For example, use `parts = line.split()` and then check `len(parts)` to extract the database name, which is always the second element, while ignoring any subsequent parts such as the role.
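A minimal sketch of the tolerant parsing suggested above. The helper name `connect_target` is hypothetical and is not taken from `tools/migrations/compare.py`; it also assumes plain whitespace-separated arguments with no quoting.

```python
from typing import Optional


def connect_target(line: str) -> Optional[str]:
    # Hypothetical helper: return the database name from a "\connect" line,
    # or None if the line is not a usable "\connect" statement.
    parts = line.split()
    if len(parts) < 2 or parts[0] != "\\connect":
        return None
    # The database name is the second element; anything after it
    # (for example a role) is ignored instead of breaking tuple unpacking.
    return parts[1]
```

Indexing into `parts` after a length check avoids the `ValueError` that two-name tuple unpacking raises when a third token is present.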
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: tools/migrations/compare.py#L48
Potential issue: The parsing logic for `\connect` statements in
`tools/migrations/compare.py` incorrectly assumes that `line.split()` will always yield
two values. However, `pg_dumpall` can produce `\connect` statements with a role, such as
`\connect mydatabase myrole`, which results in three values. This will cause the script
to raise a `ValueError: too many values to unpack` when it encounters such a line,
crashing the migration drift CI job. This is a common output format in environments
where databases have specific owners.
Not part of this PR or this change.
Maybe just for posterity: the `\restrict` command in Postgres seems to have been added to mitigate a security issue, so for the sake of checking migrations it is irrelevant. I did a little research because I wanted to make sure we're not losing information by removing it.
Can we revert this? I think my fix here is the proper fix. #112731
Of course! That's a way better fix than mine; I'll revert my change.
PR reverted: f6ed917
…check (#112777)

PostgreSQL 16+ added `\restrict` / `\unrestrict` directives to `pg_dumpall` output. These carry random per-session tokens that guard against psql meta-command injection when a dump is restored; they are not part of the database schema.

The migration drift CI job runs `pg_dumpall` twice (once after sequential migrations, once after squashed migrations) and compares the output. Since the tokens are regenerated on every dump, the comparison always reports a false-positive diff, causing the `migrations-drift` workflow to fail on every PR that touches migrations, in both sentry and getsentry.

This filters out `\restrict` and `\unrestrict` lines in the `norm()` function, alongside the existing filters for comments and `django_migrations` tables. The `quick_drift_compare.py` script delegates to the same compare module, so both code paths are fixed.

Example of the spurious diff this eliminates:

```
-\restrict chSRsrfaJ6ZjA7ck8dSaajbhnPiwOqTaC6ldxf7bN2kbn4VJZk4ODoz8l0XL2pk
+\restrict dXbwY4tgrWq0glKas9UkhhnydrGduwSqaYnQMvQj53fxjnB36DA9XvXg3TMMG9U
```

Co-authored-by: Claude Opus 4 <noreply@anthropic.com>
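The filtering described above can be sketched roughly as follows. This is an illustration only, not the code in `tools/migrations/compare.py`: the function name `strip_session_tokens` and its list-of-lines interface are assumptions, and the real `norm()` also handles comments and `django_migrations` tables.

```python
from typing import Iterable, List

# psql meta-commands whose argument is a random token regenerated per dump.
_TOKEN_DIRECTIVES = ("\\restrict", "\\unrestrict")


def strip_session_tokens(lines: Iterable[str]) -> List[str]:
    # Hypothetical helper: drop \restrict / \unrestrict lines so that two
    # dumps of the same schema compare equal.
    kept = []
    for line in lines:
        words = line.split(None, 1)
        # Match on the first word only, so unrelated lines that merely
        # contain the text "\restrict" later on are kept.
        if words and words[0] in _TOKEN_DIRECTIVES:
            continue
        kept.append(line)
    return kept
```

With this in place, two dumps that differ only in their `\restrict` / `\unrestrict` tokens normalize to the same list of lines, so a line-by-line comparison no longer reports a diff for them.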