Feature/ext2 critical fixes #167

Galfurian · 2026-01-22T14:14:45Z

No description provided.

Test Files Added: - t_ext2_audit_write_boundary.c: Issue #1 buffer overflow tests * Test unaligned writes spanning multiple blocks * Test exact block boundary writes * Test multiple partial writes (stress) - t_ext2_audit_overflow.c: Issue #2 integer overflow tests * Test large offset handling * Test uint32_t boundary conditions * Test mixed boundary conditions - t_ext2_audit_read_failure.c: Issue #3 silent failure tests * Test read after write * Test read across block boundaries * Test partial reads consistency * Test EOF behavior - t_ext2_audit_mount_cache.c: Issue #4 NULL check tests * Test filesystem operational after mount * Test cache under load * Test cache on reads * Test cache lifecycle All tests compile successfully but should fail with current code.

Changed right = fs->block_size to right = fs->block_size - 1 Prevents overflow when writing to block boundaries. Test: t_ext2_audit_write_boundary

Recalculate allocated block count per iteration in ext2_write_inode_data() since ext2_write_inode_block() dynamically allocates blocks. Add conditional read logic: only read for partial writes on existing blocks, zero new blocks, skip reads for full overwrites. Convert test files to syslog.

- Add ext2_write_bgdt_for_group() to write only the BGDT block containing the modified group descriptor instead of writing all BGDT blocks - Update ext2_allocate_inode(), ext2_allocate_block(), ext2_free_block(), and ext2_free_inode() to use the optimized function - This significantly reduces I/O operations during inode/block allocation and deallocation, especially for operations like FHS directory creation that allocate many inodes in sequence

…alls - Add ext2_write_bgdt_for_group() to write only affected BGDT block per operation instead of all BGDT blocks. Reduces I/O overhead significantly. - Keep ext2_write_superblock() calls in alloc/free functions for data consistency - Create fs/sync.c with sys_sync(), sys_syncfs(), and sys_sync_file_range() syscalls - Add syscall declarations to mentos/inc/system/syscall.h - Remove unnecessary sys_sync() calls from FHS initialization The BGDT optimization remains the primary improvement: - Each allocation/deallocation now writes 1 BGDT block instead of N blocks - For systems with multiple block groups, this dramatically reduces I/O - Superblock writes ensure consistent metadata tracking Syscalls are functional stubs. Full implementation would iterate through all mounted filesystems and call their respective sync methods.

The pr_warning was firing 17 times during FHS initialization when scheduler_get_current_process() returned NULL. This is expected behavior during kernel boot before any user tasks are spawned. Commenting out the warning eliminates excessive logging that was creating I/O overhead. The code correctly handles the NULL task case by assuming uid/gid of 0.

- Reduce compiler matrix to GCC 12–14 and Clang 17–19 - Upgrade checkout to v4 and set Release builds - Use 16 parallelism for faster CI - Increase test timeout and align build flags - Install gcc-9 in test job to ensure tool availability

procfs_unlink() returned -EEXIST when the file was found, then proceeded to use the pointer only when NULL, passing an invalid address to list_head_empty(). This caused -Warray-bounds via list_head_validate() in CI. Now returns -ENOENT when the file is missing and safely proceeds when present.

- Create scripts/run-qemu-test wrapper with live output monitoring - Show test.log content in real-time during CI runs - Add 10-minute timeout to prevent indefinite hangs - Display serial.log and test results even on failure (if: always()) - Separate cdrom_test.iso build from test execution for clarity This resolves the issue where CI appeared frozen because QEMU output was redirected to files with no progress indication.

- Remove quotes from file paths (match CMake syntax exactly) - Clear old logs before starting to ensure fresh run - Monitor both serial.log (boot) and test.log (tests) - Show serial.log snippets to diagnose boot issues - Display both log sizes in heartbeat messages This helps identify whether QEMU is booting, hanging, or if the test output is going to the wrong place.

Add timeouts and pause instructions to rtc_update_datetime() busy-wait loops. In Release builds with optimizations, the compiler may optimize away repeated hardware reads in tight loops, causing infinite hangs. The pause instruction hints the CPU this is a spin-wait loop and the timeout prevents indefinite blocking if RTC hardware misbehaves. This fixes CI hangs after 'Install RTC' when building with -O2/-O3.

Replace complex polling logic with simple tail -f that streams both serial.log and test.log in real-time with prefixes. This is cleaner and shows everything as it happens without selective filtering or heartbeat messages.

- Use tail -F (capital F) to retry if files don't exist yet - Use sed -u for unbuffered output to prevent CI buffering - Remove explicit tail PID tracking (subshells handle cleanup) This ensures output appears immediately in CI rather than being buffered until QEMU exits.

Add 100k iteration timeouts to ps2_write_command() and ps2_read_data() busy-wait loops. In Release builds with optimizations, these infinite loops can hang if the PS/2 controller doesn't respond as expected. The timeout prevents indefinite blocking and logs a warning if the controller is unresponsive (e.g., in headless QEMU without PS/2). This fixes CI hangs after 'Setting up PS/2 driver...'.

PS/2 init can timeout in headless QEMU, leading to panic. Now we: - Warn and continue when PS/2 is unavailable - Skip keyboard initialization when PS/2 is missing This prevents cascading panics (e.g., COW faults) when PS/2 hardware is absent in CI.

Add filesystem to cdrom_test.iso DEPENDS so rootfs.img is always built before the test ISO. This covers cases where qemu-test is run without an explicit filesystem build (e.g., CI).

…te and inode reading - Discovered root cause: Second read of inode table block (sector 40) returns all zeros - Root inode (ino 2) shows mode=0 instead of valid directory mode - Bus master status register shows 0xFF after init process creation attempt - DMA buffer shows no data transfer occurring on second read of same block - Added detailed logging to trace ATA DMA controller state - Added DMA reset (command and status register clear) at start of each read - Issue persists - device entering error state or wait loop exiting prematurely This is a critical blocker for filesystem access after boot. Requires further investigation of either: 1. QEMU ATA DMA emulation accuracy 2. DMA wait loop condition logic 3. Device error handling and recovery

- DMA IRQ emulation in QEMU doesn't work reliably; timeouts on every read - PIO fallback works perfectly and boots filesystem correctly - Disable DMA entirely to avoid pointless timeouts - System now boots cleanly with EXT2 mounting via PIO reads

Galfurian added 26 commits January 21, 2026 17:01

fix(ext2): buffer overflow on write boundary (line 1873)

a8434f2

Changed right = fs->block_size to right = fs->block_size - 1 Prevents overflow when writing to block boundaries. Test: t_ext2_audit_write_boundary

fix(ext2): bounds check for integer overflow (line 1876)

208436f

fix(ext2): propagate read errors instead of silent continue (line 1809)

160cd7f

fix(ext2): add NULL check for kmem_cache_create (line 3790)

748b929

refactor(tests): convert ext2 audit tests to use syslog

83b2a4e

fix(fs): add /proc to FHS standard directories

00afd67

refactor(ci): simplify test script to stream all output

738f988

Replace complex polling logic with simple tail -f that streams both serial.log and test.log in real-time with prefixes. This is cleaner and shows everything as it happens without selective filtering or heartbeat messages.

ci(test): build filesystem whenever cdrom_test.iso is made

ee49944

Add filesystem to cdrom_test.iso DEPENDS so rootfs.img is always built before the test ISO. This covers cases where qemu-test is run without an explicit filesystem build (e.g., CI).

refactor(workflow): dump image status after build

a2cfee0

fix(ps2): harden controller init timeouts

20383a7

refactor(log): set log levels to notice

38e55c0

Galfurian merged commit 305bf01 into develop Jan 23, 2026
8 checks passed

Galfurian deleted the feature/ext2-critical-fixes branch January 23, 2026 10:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature/ext2 critical fixes #167

Feature/ext2 critical fixes #167

Uh oh!

Galfurian commented Jan 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Feature/ext2 critical fixes #167

Feature/ext2 critical fixes #167

Uh oh!

Conversation

Galfurian commented Jan 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants