Skip to content

TST: Surface worker thread exceptions in download_test_pdfs#3766

Open
jonasboos wants to merge 1 commit into
py-pdf:mainfrom
jonasboos:fix-3728-download-test-pdfs-exceptions
Open

TST: Surface worker thread exceptions in download_test_pdfs#3766
jonasboos wants to merge 1 commit into
py-pdf:mainfrom
jonasboos:fix-3728-download-test-pdfs-exceptions

Conversation

@jonasboos
Copy link
Copy Markdown
Contributor

Summary

Previously, download_test_pdfs() called concurrent.futures.wait(futures) but never iterated the returned futures or called .result() on them. Any exception raised in a worker thread was silently captured, causing confusing downstream FileNotFoundErrors when tests tried to open missing files.

Replace wait() with as_completed() + .result() so that download failures are surfaced immediately at the source.

Fix

  • Replace concurrent.futures.wait(futures) with concurrent.futures.as_completed(futures) and call .result() on each future to re-raise any worker thread exceptions.

Testing

  • The change is in test infrastructure code, so the existing test suite exercises it.
  • To manually verify: temporarily break a URL in example_files.yaml and confirm that download_test_pdfs() now raises instead of silently swallowing the error.

Closes #3728

Copilot AI review requested due to automatic review settings May 10, 2026 19:31
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the test infrastructure helper download_test_pdfs() to ensure exceptions raised in worker threads during concurrent PDF downloads are not silently swallowed, making download failures visible at their source instead of later as misleading FileNotFoundErrors.

Changes:

  • Replace concurrent.futures.wait(futures) with iteration over concurrent.futures.as_completed(futures).
  • Call .result() on each completed future to re-raise worker-thread exceptions in the main thread.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/__init__.py
Comment on lines 134 to +138
executor.submit(get_data_from_url, url=pdf["url"], name=pdf["local_filename"])
for pdf in pdfs
]
concurrent.futures.wait(futures)
for future in concurrent.futures.as_completed(futures):
future.result() # re-raises any exception from the worker thread
Previously, `download_test_pdfs()` called `concurrent.futures.wait(futures)`
but never iterated the returned futures or called `.result()` on them. Any
exception raised in a worker thread was silently captured, causing confusing
downstream FileNotFoundError when tests tried to open missing files.

Replace `wait()` with `as_completed()` + `.result()` so that download failures
are surfaced immediately at the source.

Also fix a race condition this exposed: multiple threads could race on
`cache_dir.mkdir()`, raising `FileExistsError` when the directory was created
by another thread between the `exists()` check and the `mkdir()` call. Use
`mkdir(exist_ok=True)` instead of the check-then-create pattern.

Closes py-pdf#3728
@jonasboos jonasboos force-pushed the fix-3728-download-test-pdfs-exceptions branch from 8fca0d2 to 94e0853 Compare May 10, 2026 19:35
@codecov
Copy link
Copy Markdown

codecov Bot commented May 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.62%. Comparing base (4c64630) to head (94e0853).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3766   +/-   ##
=======================================
  Coverage   97.62%   97.62%           
=======================================
  Files          55       55           
  Lines       10221    10221           
  Branches     1876     1876           
=======================================
  Hits         9978     9978           
  Misses        138      138           
  Partials      105      105           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Collaborator

@stefan6419846 stefan6419846 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. I have left some comment for a change which seems to be unrelated and not required here before we can continue with this.

Comment thread tests/__init__.py
cache_dir = Path(__file__).parent / "pdf_cache"
if not cache_dir.exists():
cache_dir.mkdir()
cache_dir.mkdir(exist_ok=True)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which benefits does this have?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

download_test_pdfs silently swallows thread exceptions, masking flaky CI failures

3 participants