Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report extraction problems from handlers/extractors #633

Merged
merged 9 commits into from
Aug 11, 2023
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -7,3 +7,4 @@ build/
*.so
.idea
.coverage*
/.venv/
18 changes: 16 additions & 2 deletions docs/development.md
Original file line number Diff line number Diff line change
Expand Up @@ -515,7 +515,7 @@ class Extractor(abc.ABC):
return []

@abc.abstractmethod
def extract(self, inpath: Path, outdir: Path):
def extract(self, inpath: Path, outdir: Path) -> Optional[ExtractResult]:
"""Extract the carved out chunk. Raises ExtractError on failure."""
```

Expand All @@ -526,6 +526,15 @@ Two methods are exposed by this class:
- `extract()`: you must override this function. This is where you'll perform the
extraction of `inpath` content into `outdir` extraction directory

!!! Recommendation

Although it is possible to implement `extract()` with path manipulations,
checks for path traversals, and performing io by using Python libraries
(`os`, `pathlib.Path`), but it turns out somewhat tedious.
Instead we recommend to remove boilerplate and use a helper class `FileSystem` from
[unblob/file_utils.py](https://github.com/onekey-sec/unblob/blob/main/unblob/file_utils.py)
which ensures that all file objects are created under its root.

### DirectoryExtractor class

The `DirectoryExtractor` interface is defined in
Expand All @@ -538,7 +547,7 @@ class DirectoryExtractor(abc.ABC):
return []

@abc.abstractmethod
def extract(self, paths: List[Path], outdir: Path):
def extract(self, paths: List[Path], outdir: Path) -> Optional[ExtractResult]:
"""Extract from a multi file path list.

Raises ExtractError on failure.
Expand All @@ -552,6 +561,11 @@ Two methods are exposed by this class:
- `extract()`: you must override this function. This is where you'll perform the
extraction of `paths` files into `outdir` extraction directory

!!! Recommendation

Similarly to `Extractor`, it is recommended to use the `FileSystem` helper class to
implement `extract`.

### Example Extractor

Extractors are quite complex beasts, so rather than trying to come up with a
Expand Down
24 changes: 1 addition & 23 deletions tests/handlers/filesystem/test_romfs.py
Original file line number Diff line number Diff line change
@@ -1,9 +1,7 @@
from pathlib import Path

import pytest

from unblob.file_utils import File
from unblob.handlers.filesystem.romfs import get_string, is_safe_path, valid_checksum
from unblob.handlers.filesystem.romfs import get_string, valid_checksum


@pytest.mark.parametrize(
Expand Down Expand Up @@ -44,23 +42,3 @@ def test_get_string(content, expected):
)
def test_valid_checksum(content, valid):
assert valid_checksum(content) == valid


@pytest.mark.parametrize(
"basedir, path, expected",
[
("/lib/out", "/lib/out/file", True),
("/lib/out", "file", True),
("/lib/out", "dir/file", True),
("/lib/out", "some/dir/file", True),
("/lib/out", "some/dir/../file", True),
("/lib/out", "some/dir/../../file", True),
("/lib/out", "some/dir/../../../file", False),
("/lib/out", "some/dir/../../../", False),
("/lib/out", "some/dir/../../..", False),
("/lib/out", "../file", False),
("/lib/out", "/lib/out/../file", False),
],
)
def test_is_safe_path(basedir, path, expected):
assert is_safe_path(Path(basedir), Path(path)) is expected
Loading