Improve SQLite3 WAL #9

Merged
Schamper merged 28 commits into fox-it:main from PimSanders:improve-wal
Dec 9, 2025

@PimSanders
Contributor

This PR:

A WAL file and/or WAL checkpoint can be opened on initialization:

sqlite3.SQLite3(sqlite_db, sqlite_wal, wal_checkpoint=2)

Previous WAL checkpoints might contain interesting data that has been modified or removed, for example:

Checkpoints found: 4
Salt1: 1945264100
[<Row table=test id=1 name='testing' value=1337>,
 <Row table=test id=2 name='omg' value=7331>,
 <Row table=test id=3 name='AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA' value=4100>,
 <Row table=test id=4 name='BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB' value=4100>,
 <Row table=test id=5 name='negative' value=-11644473429>,
 <Row table=test id=6 name='after checkpoint' value=42>,
 <Row table=test id=8 name='after checkpoint' value=44>,
 <Row table=test id=9 name='wow' value=1234>,
 <Row table=test id=10 name='second checkpoint' value=100>,
 <Row table=test id=11 name='second checkpoint' value=101>]
Salt1: 1945264099
[<Row table=test id=1 name='testing' value=1337>,
 <Row table=test id=2 name='omg' value=7331>,
 <Row table=test id=3 name='AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA' value=4100>,
 <Row table=test id=4 name='BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB' value=4100>,
 <Row table=test id=5 name='negative' value=-11644473429>,
 <Row table=test id=6 name='after checkpoint' value=42>,
 <Row table=test id=7 name='after checkpoint' value=43>,
 <Row table=test id=8 name='after checkpoint' value=44>,
 <Row table=test id=9 name='after checkpoint' value=45>,
 <Row table=test id=10 name='second checkpoint' value=100>,
 <Row table=test id=11 name='second checkpoint' value=101>]
Salt1: 1945264098
[<Row table=test id=1 name='testing' value=1337>,
 <Row table=test id=2 name='omg' value=7331>,
 <Row table=test id=3 name='AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA' value=4100>,
 <Row table=test id=4 name='BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB' value=4100>,
 <Row table=test id=5 name='negative' value=-11644473429>,
 <Row table=test id=6 name='after checkpoint' value=42>,
 <Row table=test id=7 name='after checkpoint' value=43>,
 <Row table=test id=8 name='after checkpoint' value=44>,
 <Row table=test id=9 name='after checkpoint' value=45>]
Salt1: 1945264097
[<Row table=test id=1 name='testing' value=1337>,
 <Row table=test id=2 name='omg' value=7331>,
 <Row table=test id=3 name='AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA' value=4100>,
 <Row table=test id=4 name='BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB' value=4100>,
 <Row table=test id=5 name='negative' value=-11644473429>]
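
For context on the Salt1 values above: the WAL header and every frame header carry a salt-1 field, and SQLite increments it each time the WAL is reset after a checkpoint, so frames left over from earlier generations keep their older salts. The following is a rough sketch of reading those fields with only the standard library and the documented WAL layout. It is not the code from this PR: `summarize_wal` and `demo` are hypothetical names, and the sketch does not validate checksums.

```python
import sqlite3
import struct
import tempfile
from collections import Counter
from pathlib import Path

# Both headers consist of big-endian 32-bit fields (SQLite WAL file format).
# WAL header: magic, version, page size, checkpoint seq, salt1, salt2, checksum1, checksum2
WAL_HEADER = struct.Struct(">8I")
# Frame header: page number, db size after commit, salt1, salt2, checksum1, checksum2
FRAME_HEADER = struct.Struct(">6I")


def summarize_wal(wal_bytes: bytes) -> dict:
    """Hypothetical helper: header fields plus a frame count per salt1 value."""
    magic, version, page_size, checkpoint_seq, salt1, _salt2, _c1, _c2 = (
        WAL_HEADER.unpack_from(wal_bytes, 0)
    )
    frame_size = FRAME_HEADER.size + page_size
    frames_per_salt1 = Counter()
    offset = WAL_HEADER.size
    while offset + frame_size <= len(wal_bytes):
        # Index 2 of the frame header is salt1. Checksums are NOT verified here.
        frame_salt1 = FRAME_HEADER.unpack_from(wal_bytes, offset)[2]
        frames_per_salt1[frame_salt1] += 1
        offset += frame_size
    return {
        "magic": magic,
        "version": version,
        "page_size": page_size,
        "checkpoint_seq": checkpoint_seq,
        "salt1": salt1,
        "frames_per_salt1": dict(frames_per_salt1),
    }


def demo() -> dict:
    """Create a throwaway WAL-mode database and summarize its WAL file."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = Path(tmp) / "demo.db"
        # Autocommit mode, so every statement is committed to the WAL immediately.
        conn = sqlite3.connect(db_path, isolation_level=None)
        try:
            conn.execute("PRAGMA journal_mode=WAL")
            conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, name TEXT, value INTEGER)")
            conn.execute("INSERT INTO test (name, value) VALUES ('testing', 1337)")
            # Read while the connection is open; closing it may remove the WAL file.
            wal_bytes = Path(f"{db_path}-wal").read_bytes()
        finally:
            conn.close()
    return summarize_wal(wal_bytes)
```

In a fresh WAL every frame shares the header's salt1; several distinct salt1 values in `frames_per_salt1` would point to leftover frames from earlier generations, which is the data the checkpoint listing above surfaces.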

If no checkpoint is specified, the reader uses the latest valid data from the WAL file; if that is not available, it falls back to the database file. If no WAL file is provided, it checks whether DB.sqlite-wal or DB.db-wal exists.
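
The sidecar lookup described above could look roughly like the sketch below. It assumes SQLite's own naming convention, where the WAL file is the database filename with `-wal` appended; `find_wal` and `demo` are hypothetical names for illustration, not the functions added in this PR.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def find_wal(db_path: Path) -> Path | None:
    """Hypothetical helper: return the '<name>-wal' sibling of db_path, if it exists."""
    candidate = db_path.with_name(db_path.name + "-wal")
    return candidate if candidate.is_file() else None


def demo() -> tuple[str | None, str | None]:
    """Look up the WAL for one database that has a sidecar file and one that does not."""
    with tempfile.TemporaryDirectory() as tmp:
        with_wal = Path(tmp) / "DB.sqlite"
        without_wal = Path(tmp) / "other.db"
        # Empty placeholder files are enough to exercise the lookup.
        for path in (with_wal, without_wal, Path(tmp) / "DB.sqlite-wal"):
            path.write_bytes(b"")
        found = find_wal(with_wal)
        missing = find_wal(without_wal)
        return (found.name if found else None, missing.name if missing else None)
```

When `find_wal` returns `None`, a caller following the rule above would read from the database file alone.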

There are three things I'm not sure about and I would like your opinion on:

  1. Checkpoints are currently ordered in reverse: checkpoint 0 is the latest and checkpoint 3 is the oldest. Is this the right call, or should the order be inverted to follow the checkpoint_sequence counter in the header?
  2. The Python script that generates the test data is currently stored in tests/_data, which means it is saved in LFS. I don't think it belongs in LFS. Should the script stay in _data with an LFS exception, or should it be stored somewhere else?
  3. The test could be shorter, but I think it is best to test everything every time, right?

This commit:
 - Adds a feature to open databases from various WAL checkpoints.
 - Allows a database to be initialized with a Path OR BinaryIO.
 - Improves the reader by also parsing the WAL file.
 - Adds a Python script to generate test data.
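
To illustrate the "Path OR BinaryIO" point, here is one common way to normalize the two input types. This is only a sketch under my own assumptions, not the implementation in this PR: `open_source` and `has_sqlite_magic` are hypothetical names, and the only format fact used is the 16-byte magic string at the start of every SQLite database file.

```python
from __future__ import annotations

import io
from pathlib import Path
from typing import BinaryIO

# Every SQLite database file starts with this 16-byte string.
SQLITE_MAGIC = b"SQLite format 3\x00"


def open_source(source: Path | BinaryIO) -> tuple[BinaryIO, bool]:
    """Hypothetical helper: return (file object, owns_handle).

    A Path is opened here, so the caller owns the handle and must close it.
    An existing stream is passed through untouched and stays open.
    """
    if isinstance(source, Path):
        return source.open("rb"), True
    return source, False


def has_sqlite_magic(source: Path | BinaryIO) -> bool:
    """Check the header magic for either kind of input."""
    fh, owns_handle = open_source(source)
    try:
        fh.seek(0)
        return fh.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC
    finally:
        # Only close handles that were opened by open_source.
        if owns_handle:
            fh.close()
```

Tracking ownership keeps a stream supplied by the caller (for example an `io.BytesIO`) open after the check, while files opened from a path are closed again.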
@codecov

codecov bot commented Nov 19, 2025

Codecov Report

❌ Patch coverage is 86.30137% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.91%. Comparing base (14741f8) to head (e12b54b).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| dissect/database/sqlite3/wal.py | 86.60% | 15 Missing ⚠️ |
| dissect/database/sqlite3/sqlite3.py | 85.29% | 5 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main       #9      +/-   ##
==========================================
+ Coverage   79.71%   81.91%   +2.20%     
==========================================
  Files          30       31       +1     
  Lines        2282     2334      +52     
==========================================
+ Hits         1819     1912      +93     
+ Misses        463      422      -41     

| Flag | Coverage Δ |
|---|---|
| unittests | 81.91% <86.30%> (+2.20%) ⬆️ |

Member

@Schamper Schamper left a comment

Can you move all WAL-related classes to a file wal.py and drop the WAL prefix from the class names?

PimSanders and others added 2 commits November 26, 2025 07:54
Co-authored-by: Erik Schamper <1254028+Schamper@users.noreply.github.com>
PimSanders and others added 2 commits November 26, 2025 12:53
Co-authored-by: Erik Schamper <1254028+Schamper@users.noreply.github.com>
PimSanders and others added 5 commits November 26, 2025 14:40
Co-authored-by: Erik Schamper <1254028+Schamper@users.noreply.github.com>
Co-authored-by: Erik Schamper <1254028+Schamper@users.noreply.github.com>
@JSCU-CNI
Contributor

Really appreciate the effort you put into this feature @PimSanders.

PimSanders and others added 6 commits December 1, 2025 19:21
PimSanders and others added 3 commits December 4, 2025 11:03
Co-authored-by: Erik Schamper <1254028+Schamper@users.noreply.github.com>
Member

@Schamper Schamper left a comment

Is there anything you still wanted to look at yourself? Otherwise, it looks good to me to merge once these last comments are addressed.

@Schamper
Member

Schamper commented Dec 9, 2025

Also please fix the linter.

@PimSanders
Contributor Author

I'm all good, thanks for the reviews!

@Schamper Schamper merged commit 33bc02a into fox-it:main Dec 9, 2025
23 checks passed

Development

Successfully merging this pull request may close these issues.

Implement opening DB's from various WAL checkpoints

3 participants