Improve support for safe symlink extraction #763
AndrewFasano wants to merge 7 commits into onekey-sec:main
We can rewrite symlinks to ensure they are always relative and remain within the extraction directory.
Explicitly use extract_root in output path instead of ./ to avoid issues with symlinks within directories.
0e6e026 to 1b98609
@AndrewFasano from within the unblob directory, you can do pre-commit can be installed with It will modify the code and you can create fixups for existing commits depending on the part it touches.
Ruff seems to be quite unhappy with my use of
I think there is some confusion on how
Maybe there was some confusion in some handlers about the argument order or names, but these tests show the intention, and they should keep working after modifications.
FileSystem.create_symlink(src, dst) is supposed to be following the unix command line order and naming Except these are named differently in the So
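For reference, Python's own `os.symlink(src, dst)` uses the same order as `ln -s TARGET LINK_NAME`: the first argument is what the link points to, the second is where the link is created. A minimal sketch of that convention follows; the helper name `demo_symlink_order` is hypothetical and is not part of unblob.

```python
import os
import tempfile


def demo_symlink_order() -> str:
    """Create a link like `ln -s "target file" symlink` and return its stored target."""
    with tempfile.TemporaryDirectory() as workdir:
        link_path = os.path.join(workdir, "symlink")
        # os.symlink(src, dst): src is what the link points to,
        # dst is the path at which the link itself is created.
        os.symlink("target file", link_path)
        # readlink returns the stored target text; the target need not exist.
        return os.readlink(link_path)
```

The returned value is the first argument passed to `os.symlink`, which is the same property the existing `os.readlink(output_path) == "target file"` assertion in the tests relies on.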
Thanks for the info @e3krisztian. These names are definitely confusing, and now that I have cleaned up my changes, I can see that you're right: the order was fine before. Sorry about that! There might still be value in revising the change from bbe18c6 to add the I think the other changes are still relevant though!
@AndrewFasano this is a valued input and contribution for You might also be correct with some hidden argument swapping somewhere still. The below symlinks are intentionally marked as problematic in the current code (
# 3) Symlink with extra parent directories that would still be valid
ln -s ../../../bin/busybox "$WORKDIR/sbin/symlink_extra_up_to_busybox"
# 7) Circular symlink (A -> B, B -> A)
ln -s symlink_circular_b "$WORKDIR/bin/symlink_circular_a"
ln -s symlink_circular_a "$WORKDIR/bin/symlink_circular_b"
Cherry-picking these commits over I have tried to reproduce the problems locally based on the description given with the tests below applied to
diff --git a/tests/test_file_utils.py b/tests/test_file_utils.py
index 9a20da74..1fc94138 100644
--- a/tests/test_file_utils.py
+++ b/tests/test_file_utils.py
@@ -25,7 +25,7 @@ from unblob.file_utils import (
     round_down,
     round_up,
 )
-from unblob.report import PathTraversalProblem
+from unblob.report import LinkExtractionProblem, PathTraversalProblem


 @pytest.mark.parametrize(
@@ -503,6 +503,30 @@ class TestFileSystem:
         assert os.readlink(output_path) == "target file"
         assert sandbox.problems == []

+    def test_create_symlink_target_inside_sandbox(self, sandbox: FileSystem):
+        # ./sbin/shell -> ../bin/sh
+        sandbox.mkdir(Path("bin"))
+        sandbox.write_bytes(Path("bin/sh"), b"posix shell")
+        sandbox.mkdir(Path("sbin"))
+        sandbox.create_symlink(Path("../bin/sh"), Path("sbin/shell"))
+
+        output_path = sandbox.root / "sbin/shell"
+        assert output_path.read_bytes() == b"posix shell"
+        assert output_path.exists()
+        assert os.readlink(output_path) == "../bin/sh"
+        assert sandbox.problems == []
+
+    def test_create_symlink_target_outside_sandbox(self, sandbox: FileSystem):
+        # /shell -> ../bin/sh
+        sandbox.mkdir(Path("bin"))
+        sandbox.write_bytes(Path("bin/sh"), b"posix shell")
+        sandbox.create_symlink(Path("../bin/sh"), Path("/shell"))
+
+        assert any(p for p in sandbox.problems if isinstance(p, LinkExtractionProblem))
+        output_path = sandbox.root / "shell"
+        assert not output_path.exists()
+        assert not output_path.is_symlink()
+
     def test_create_symlink_absolute_paths(self, sandbox: FileSystem):
         sandbox.write_bytes(Path("target file"), b"test content")
         sandbox.create_symlink(Path("/target file"), Path("/symlink"))
@e3krisztian I can try updating those commits to work on main and try digging up a filesystem where I was seeing the issue that was supposed to address. At a minimum bbe18c6 would need to be updated to swap src/dst since this PR has them swapped.
This PR aims to fix a few issues around symlink extraction discussed in #761. I don't think this PR is perfect, but I hope it's an improvement over the current state of things.
Now in #768
954c1cd rewrites the logic to sanitize symlinks so that they are relative and kept within the extraction directory. This is done using the os module instead of Pathlib, as Pathlib.resolve would fail if a symlink target was missing (which doesn't prevent us from safely converting it to a relative link). With this change I no longer see false positives around MaliciousSymlinks; instead, symlinks are created safely within the extraction directory. If a relative symlink originally tried accessing a directory above its own root (e.g., ./bin/sh -> ../../../../../bin/bash), we update the link so it remains within the extraction directory.
This may have just been an artifact of swapping src/dst?
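A rough sketch of this kind of purely lexical sanitization is below. It is not the code from 954c1cd: the function name `sanitize_symlink_target` and its signature are made up for illustration, and it assumes POSIX-style archive paths, with `link_path` given relative to the extraction root. It treats the extraction root as `/`, so `posixpath.normpath` drops any `..` that would climb above it, and it never touches the filesystem, so a missing target is not a problem.

```python
import posixpath


def sanitize_symlink_target(link_path: str, target: str) -> str:
    """Return a relative target that stays inside the extraction root.

    link_path: location of the symlink, relative to the extraction root.
    target: link target as stored in the archive (absolute or relative).
    """
    # Pretend the extraction root is "/" so that all paths are rooted.
    link_dir = posixpath.dirname(posixpath.normpath("/" + link_path.lstrip("/")))
    if posixpath.isabs(target):
        # Absolute targets are reinterpreted relative to the extraction root.
        resolved = posixpath.normpath(target)
    else:
        # On a rooted path, normpath discards ".." components above "/".
        resolved = posixpath.normpath(posixpath.join(link_dir, target))
    # Express the clamped target relative to the directory holding the link.
    return posixpath.relpath(resolved, start=link_dir)
```

With this sketch, `./bin/sh -> ../../../../../bin/bash` becomes `bash`, and `/var/log -> /tmp` becomes `../tmp`. Because it is lexical only, it does not account for directory components that are themselves symlinks; a real implementation would need to handle that separately.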
bbe18c6 and 76c29fe change how the .dst field of a symlink is calculated in file_utils and in _safe_tarfile. Previously it was made by combining the extraction root with the symlink destination. This would lose critical information about the path of the symlink source. For example, a symlink at ./sbin/shell -> ../bin/sh is safely within the extraction directory, while a symlink at /shell -> ../bin/sh is trying to go up too high.
Now in #770
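To illustrate why the location of the link matters, here is a small sketch that evaluates a relative target from the link's own directory. The helper `climbs_above_root` is hypothetical, handles only relative targets, and is not taken from unblob.

```python
import posixpath


def climbs_above_root(link_path: str, target: str) -> bool:
    """True if a relative target, taken from the link's directory, leaves the root.

    link_path is interpreted relative to the extraction root; a leading "/"
    is treated as the root itself. Absolute targets are out of scope here.
    """
    link_dir = posixpath.dirname(link_path.lstrip("/"))
    # Without a leading "/", normpath keeps any ".." that cannot be cancelled.
    combined = posixpath.normpath(posixpath.join(link_dir, target))
    return combined == ".." or combined.startswith("../")
```

The same target `../bin/sh` is fine for a link at `sbin/shell` and a problem for a link at `/shell`, which is the information that is lost if only the extraction root and the target are combined.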
fc60755 fixes a bug where tarfile absolute symlinks would be improperly dropped, which I observed with a system that had /var/log -> /tmp. 0e6e026 fixes a bug with relative symlinks that I observed on a system that had /var/tmp -> ../../tmp.
56617a3 and 2ce66d6 are trying to fix a mix-up between symlink source and destination when calling create_symlink. I fixed the CPIO extractor, but it looks like things may be backwards in other extractors as well.
To test these changes, I created a CPIO archive with the following script:
If I extract this with the head of unblob (d0f3086) and run
find ../test/test_archive.cpio_extract/ -type f,l -exec ls -al {} \;
I get:
After applying the changes in this PR I get the following result with two additional (and expected) files extracted: symlink_circular_b and symlink_extra_up_to_busybox. All the files are still contained within the extraction directory.
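One way to double-check the containment claim on an extracted tree is a small scan like the sketch below. It is not part of this PR; `find_escaping_symlinks` is a hypothetical helper. It reads each link's stored target without following it, so circular and dangling links are handled, but it does not account for intermediate directories that are themselves symlinks.

```python
import os


def find_escaping_symlinks(root: str) -> list[str]:
    """Return symlinks under root whose targets point outside root."""
    root = os.path.realpath(root)
    escaping = []
    # os.walk does not descend into symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            # Resolve the stored target lexically against the link's directory.
            target = os.readlink(path)
            resolved = os.path.normpath(os.path.join(dirpath, target))
            if os.path.commonpath([root, resolved]) != root:
                escaping.append(path)
    return sorted(escaping)
```

An empty result for the extraction directory would be consistent with the observation above that all files remain contained.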