Skip to content

Prevent path traversal in Directory.read()#362

Open
assinscreedFC wants to merge 1 commit into
aerkalov:masterfrom
assinscreedFC:fix/359-path-traversal
Open

Prevent path traversal in Directory.read()#362
assinscreedFC wants to merge 1 commit into
aerkalov:masterfrom
assinscreedFC:fix/359-path-traversal

Conversation

@assinscreedFC

Copy link
Copy Markdown

Summary

Fixes #359 (path traversal / CWE-22).

Directory.read() joined the source directory with a subname taken from EPUB OPF manifest href attributes (attacker-controlled, and URL-decoded via unquote) without any sanitisation:

open(os.path.join(self.directory_path, subname), "rb")

A crafted href such as ../../etc/passwd (or its percent-encoded form %2e%2e%2f...) could escape the intended directory and read arbitrary files from disk.

Fix

Resolve both the base directory and the requested target with os.path.realpath (which also collapses .. and follows symlinks), then reject any target that does not stay within the base directory before opening it.

def read(self, subname):
    directory_path = os.path.realpath(self.directory_path)
    path = os.path.realpath(os.path.join(directory_path, subname))
    if path != directory_path and not path.startswith(directory_path + os.sep):
        raise OSError(...)
    with open(path, "rb") as fp:
        return fp.read()

Legitimate relative hrefs (e.g. Text/chapter1.xhtml) are unaffected.

Tests

Added TestDirectory in tests/test_utils.py:

  • test_read_within_directory — a normal in-directory read still works.
  • test_read_rejects_path_traversal — a ../../ href raises.

Verification

  • pytest tests/test_utils.py — passed; full suite: 26 passed.
  • ruff check and ruff format --check — clean.

Directory.read() joined the source directory with a subname taken from
EPUB OPF manifest hrefs (attacker-controlled and URL-decoded) without
any sanitisation, so a crafted value such as '../../etc/passwd' (or its
percent-encoded form) could escape the directory and read arbitrary
files (CWE-22).

Resolve both the base directory and the target with os.path.realpath and
reject any target that falls outside the base directory before opening
it. Add tests for a normal read and a rejected traversal.

Fixes aerkalov#359
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Security: Path traversal in Directory.read() via crafted epub manifest (CWE-22)

1 participant