[FIX] Unblock urllib3 2.x via google-auth bump (clear urllib3 Dependabot alerts) #2042
jaseemjaskp wants to merge 1 commit into
…ndabot alerts)
urllib3 was held at 1.26.20 across root/backend/workers/connectors because google-auth==2.20.0 (pinned in unstract-connectors) requires urllib3<2.0. google-auth>=2.22.0 drops that cap, letting urllib3 resolve to 2.7.0 and clearing the urllib3 high-severity alerts (header leakage on redirects, decompression-bomb bypass, unbounded decompression chain).
- unstract/connectors/pyproject.toml: google-auth==2.20.0 -> >=2.22.0
- root/backend/workers/connectors uv.lock: urllib3 1.26.20 -> 2.7.0, google-auth 2.20.0 -> 2.53.0 (cachetools dropped; no longer a hard dep of google-auth 2.53)
Verified: all 4 locks pass 'uv lock --check'; connectors env smoke test imports urllib3 2.7.0, google.auth, google.cloud.bigquery, pymysql cleanly.
Deferred (NOT in this PR): protobuf 5.x is still capped <5 by additional google/grpc libraries (beyond google-cloud-bigquery) and is a major-version jump; protobuf stays on the 4.25.x LTS line for now.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@unstract/connectors/pyproject.toml`:
- Line 25: Update the inline comment on the dependency entry
"google-auth>=2.22.0" to remove the incorrect claim that it “drops the urllib3<2
cap” (google-auth==2.22.0 still pins urllib3<2.0); instead note that upgrading
to a google-auth version that truly allows urllib3>=2 is required to lift that
cap. Also audit any direct use of urllib3.util.Retry in your codebase: replace
deprecated parameters such as method_whitelist with allowed_methods and update
any references to renamed DEFAULT_* constants to their urllib3 2.0 equivalents
so Retry construction is compatible with urllib3 2.x.
⛔ Files ignored due to path filters (4)
- `backend/uv.lock` is excluded by `!**/*.lock`
- `unstract/connectors/uv.lock` is excluded by `!**/*.lock`
- `uv.lock` is excluded by `!**/*.lock`
- `workers/uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (1)
unstract/connectors/pyproject.toml
      "singleton-decorator~=1.0.0",
      # For GCSHelper
    - "google-auth==2.20.0",
    + "google-auth>=2.22.0",  # >=2.22 drops the urllib3<2 cap
🧩 Analysis chain
🌐 Web query:
google-auth 2.22.0 urllib3 constraint removal changelog
💡 Result:
In google-auth version 2.22.0, the urllib3 dependency was explicitly constrained to < 2.0 [1]. This constraint was added in earlier versions to avoid breaking changes introduced by the urllib3 2.0 release [2]. The removal of this constraint did not occur in version 2.22.0. Instead, it was addressed in subsequent releases. A later attempt to update the dependency to require urllib3 >= 2.0.5 in pull request #1389 caused dependency conflicts for other packages [3]. Consequently, support for urllib3 < 2.0 versions was restored in google-auth version 2.23.2 [4].
Citations:
- 1: https://pypi.org/project/google-auth/2.22.0/
- 2: chore: pin urllib3 < 2.0 to avoid breaking changes googleapis/google-auth-library-python#1283
- 3: fix: update urllib3 to >= 2.0.5 googleapis/google-auth-library-python#1389
- 4: chore(main): release 2.23.2 googleapis/google-auth-library-python#1391
🌐 Web query:
google-auth breaking changes 2.20 to 2.53
💡 Result:
From google-auth library version 2.20.0 (2023-06-12) up through 2.53.0 (2026-05-15), the authoritative changelog shows mostly bugfixes/features; I did not find any documented “breaking changes” sections specifically spanning that entire range.
What I can state with evidence:
- The google-auth 2.20.0 changelog entry exists (2023-06-12) and lists changes under bug fixes/features, including adding public API load_credentials_from_dict (#1326) and expiry handling for compute_engine.IDTokenCredentials and impersonated_credentials.IDTokenCredentials (#1327/#1330) [1].
- A later release in this range, google-auth v2.53.0, lists bug fixes/features (e.g., allowlisting agents-nonprod trust domains for agent identity and fail-fast on invalid/non-workload certificate configs) and does not mark itself as a breaking-change release in the release notes [2][3].
If you meant a specific type of “breaking change” (API signature change, removed symbol, dependency constraint change, auth behavior change, etc.), tell me which surface you care about and (ideally) your current version and target version (e.g., “2.20 → 2.53”). I can then enumerate only the relevant deltas from the changelog/compare views for those exact releases.
Citations:
- 1: https://raw.githubusercontent.com/googleapis/google-auth-library-python/master/CHANGELOG.md
- 2: https://github.com/googleapis/google-cloud-python/releases/tag/google-auth-v2.53.0
- 3: https://pypi.org/project/google-auth/
🌐 Web query:
urllib3 2.0 breaking changes Retry API compatibility
💡 Result:
In urllib3 2.0, the Retry API had a few compatibility-breaking changes, mostly around renamed/removed Retry configuration fields and default constants.
1) Removed/renamed Retry option names (most common break)
- Deprecated options in 1.x were removed in 2.0:
  - Retry.method_whitelist (and Retry.DEFAULT_METHOD_WHITELIST) -> replaced by Retry.allowed_methods / Retry.DEFAULT_ALLOWED_METHODS
  - Retry.DEFAULT_REDIRECT_HEADERS_BLACKLIST -> replaced by Retry.DEFAULT_REMOVE_HEADERS_ON_REDIRECT
  - Retry(method_whitelist=...) -> replaced by Retry(allowed_methods=...)
  This is explicitly called out as deprecated options being removed “Starting in urllib3 v2.0”. [1][2]
2) Changed/removed class constants related to backoff
- Retry.BACKOFF_MAX was renamed to Retry.DEFAULT_BACKOFF_MAX. [1][3]
- Retry.MAX_BACKOFF was deprecated and removed in v2.0 in favor of Retry.DEFAULT_MAX_BACKOFF. [1][4]
3) Retry constructor gained new backoff-related parameters
- urllib3 2.0 added:
  - a configurable Retry(backoff_max=...)
  - optional Retry(backoff_jitter=...)
  If you were subclassing or introspecting constructor parameters, these signature changes can be relevant. [1][3]
What “compatibility” usually looks like for most users
- The urllib3 2.0 migration guide states that urllib3 “maintain[s] functional API compatibility for most users” and that many changes are to defaults/internal implementation details; you generally only see trouble when you use deprecated/removed Retry option names or constants. [5]
If you want, paste your code that constructs Retry (or your dependency tree / error message) and I can map it to the correct urllib3 2.x Retry arguments.
Citations:
- 1: https://urllib3.readthedocs.io/en/2.0.0/changelog.html
- 2: https://urllib3.readthedocs.io/en/2.0.2/changelog.html
- 3: https://github.com/urllib3/urllib3/releases/tag/2.0.0
- 4: https://urllib3.readthedocs.io/en/2.0.4/changelog.html
- 5: https://urllib3.readthedocs.io/en/2.0.2/v2-migration-guide.html
🏁 Script executed:
#!/bin/bash
# Description: Verify google-auth and urllib3 API usage across the codebase
echo "=== Checking google-auth API usage ==="
rg -n --type=py "from google\.oauth2 import|from google\.cloud import" -A 3
echo -e "\n=== Checking urllib3 direct imports ==="
rg -n --type=py "from urllib3|import urllib3" -A 2
echo -e "\n=== Checking if any code pins urllib3 or google-auth versions elsewhere ==="
fd -e toml -e txt -e lock -e cfg | xargs rg -l "urllib3|google-auth"
Repository: Zipstack/unstract
Length of output: 2855
Fix the google-auth → urllib3 “cap removal” rationale in pyproject
- The inline comment on `google-auth>=2.22.0` (“drops the `urllib3<2` cap”) is incorrect for `google-auth==2.22.0`; upstream packaging for 2.22.0 still pins `urllib3` to `<2.0`, so `>=2.22.0` alone doesn’t guarantee urllib3 2.x.
- Because the project uses `urllib3.util.Retry` directly, re-check the retry construction for urllib3 2.0 removed/renamed options/constants (e.g., `method_whitelist` → `allowed_methods`, related `DEFAULT_*` renames).
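The Retry audit suggested above can be partly automated. The following is a minimal sketch, not part of this PR: it scans Python source text for the three names that the quoted search result lists as removed in urllib3 2.0 and reports the replacement for each. The function name `find_removed_retry_names` and the lookup table are hypothetical helpers introduced for illustration, and the table covers only those three names.

```python
import re

# Names removed in urllib3 2.0 and their replacements, taken from the
# search result quoted in this review. Assumption: extend the table if
# the codebase uses other removed names.
REMOVED_RETRY_NAMES = {
    "method_whitelist": "allowed_methods",
    "DEFAULT_METHOD_WHITELIST": "DEFAULT_ALLOWED_METHODS",
    "DEFAULT_REDIRECT_HEADERS_BLACKLIST": "DEFAULT_REMOVE_HEADERS_ON_REDIRECT",
}

# One case-sensitive pattern that matches any of the names as a whole word.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in REMOVED_RETRY_NAMES) + r")\b"
)


def find_removed_retry_names(source):
    """Return a list of (line number, old name, replacement) for each hit."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            old = match.group(1)
            hits.append((lineno, old, REMOVED_RETRY_NAMES[old]))
    return hits
```

Running this over each file that imports `urllib3` would give a quick list of call sites to update. It is a textual check only, so it cannot see names built dynamically or passed through `**kwargs`.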
| Filename | Overview |
|---|---|
| unstract/connectors/pyproject.toml | Single constraint bump: google-auth==2.20.0 → >=2.22.0 with an inline comment; follows existing repo >= style and is the root cause fix. |
| unstract/connectors/uv.lock | urllib3 1.26.20→2.7.0, google-auth 2.20.0→2.53.0, cachetools entry removed; cryptography was already present transitively so no new native dep is introduced. |
| uv.lock | Root workspace lock updated with the same three-package delta as the connectors lock; changes are consistent. |
| backend/uv.lock | Backend workspace lock updated with the same three-package delta; consistent with root and connectors locks. |
| workers/uv.lock | Workers workspace lock updated with the same three-package delta; consistent across all four workspaces. |
Reviews (1): Last reviewed commit: "[FIX] Unblock urllib3 2.x by bumping goo..." | Re-trigger Greptile
jaseemjaskp left a comment
Automated review (PR Review Toolkit). This is a clean, internally-consistent dependency bump; the 4 lock files match each other and the pyproject change, and platform-service/prompt-service/sdk1 were correctly left untouched (they already resolve urllib3 2.7.0). One line carries two minor, non-blocking points below — distinct from the floor-version question @coderabbitai already raised.
      "singleton-decorator~=1.0.0",
      # For GCSHelper
    - "google-auth==2.20.0",
    + "google-auth>=2.22.0",  # >=2.22 drops the urllib3<2 cap
Dependency hygiene — two minor points (separate from the floor-version question already raised):
1. Unbounded upper bound breaks this block's pinning convention. Every other dependency here is exact-pinned (google-cloud-secret-manager==2.16.1, gcsfs==2024.10.0) or capped (singleton-decorator~=1.0.0). google-auth>=2.22.0 is the only unbounded-upper specifier. The lock pins 2.53.0 so today's builds are reproducible, but a future uv lock regen will silently pull whatever google-auth is newest, with no guard against a breaking major.
Suggest: match the siblings with "google-auth==2.53.0", or at minimum cap it: "google-auth>=2.22.0,<3".
2. Comment is slightly misleading + names an unverifiable version. "drops the urllib3<2 cap" reads as if google-auth still depends on urllib3 with a relaxed range. Per the lock diff, google-auth 2.53.0 removes urllib3 from its dependencies entirely (2.20.0 listed urllib3/six/rsa/cachetools; 2.53.0 lists only cryptography + pyasn1-modules). Also 2.22 is neither the old (2.20.0) nor new (2.53.0) endpoint and isn't substantiated here, so it's prime comment-rot.
Suggest:
# google-auth >= 2.53 no longer depends on urllib3 at all (2.20 did) — that's what unblocks urllib3 2.x
"google-auth>=2.22.0",

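To make point 1 concrete, here is a rough sketch of what each specifier admits. It uses plain tuple comparison on numeric versions rather than a real resolver; an actual check would more likely use `packaging.specifiers.SpecifierSet`. The helper names are hypothetical, and `3.0.0` stands in for a future major release that does not exist in this PR.

```python
def parse_version(text, width=3):
    """Turn '2.53.0' into (2, 53, 0); short versions are zero-padded.

    Assumption: only plain numeric releases are handled (no rc/dev tags).
    """
    parts = [int(part) for part in text.split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def allowed(version, floor, ceiling=None):
    """True if floor <= version, and version < ceiling when one is given."""
    v = parse_version(version)
    if v < parse_version(floor):
        return False
    return ceiling is None or v < parse_version(ceiling)


if __name__ == "__main__":
    for candidate in ("2.20.0", "2.53.0", "3.0.0"):
        print(
            candidate,
            "unbounded:", allowed(candidate, "2.22.0"),
            "capped:", allowed(candidate, "2.22.0", "3"),
        )
```

With only a floor, any later major version is accepted on the next lock regeneration; adding the `<3` ceiling is what rejects it.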

What
Unblocks urllib3 1.26.20 → 2.7.0 across root / backend / workers / connectors by lifting the constraint that pinned it to the 1.x line.
- `unstract/connectors/pyproject.toml`: `google-auth==2.20.0` → `google-auth>=2.22.0`
- `uv.lock`, `backend/uv.lock`, `workers/uv.lock`, `unstract/connectors/uv.lock`: urllib3 → 2.7.0, google-auth → 2.53.0 (cachetools dropped; no longer a hard dep of google-auth 2.53)
Why
urllib3 was held at 1.26.20 everywhere, and its CVEs (sensitive-header leakage across redirects, decompression-bomb bypass, unbounded decompression chain) have no 1.26.x patch; the fixes exist only in 2.x. The resolver was capped by a single pin: `google-auth==2.20.0`, which declares `urllib3<2.0`. `google-auth>=2.22.0` removed that cap, so urllib3 resolves to 2.7.0 and the alerts clear. It was the only `urllib3<2` constraint in the graph (verified by forcing `urllib3>=2.7.0` and reading the resolver conflict).
How
- … `urllib3>=2.7.0` into the resolver until it named `google-auth==2.20.0`.
- … `google-cloud-bigquery` pinned, since the remaining protobuf cap is separate; see below).
- `>=` matches existing repo style (`boto3~=`, `httpx>=`, `croniter>=`, …).
Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)
This is the highest-risk PR in the Dependabot series and should get real integration testing before merge. It changes the HTTP layer (urllib3 1.x → 2.x) and the GCP auth library (google-auth 2.20 → 2.53), both used across the cloud connectors (GCS, BigQuery, GCP-auth) and anything doing HTTP via requests/botocore/etc.
Mitigating evidence:
- All 4 locks pass `uv lock --check`.
- `urllib3` 2.7.0, `google.auth`, `google.cloud.bigquery`, `pymysql` all import cleanly.
👉 Please exercise the cloud connectors (GCS / BigQuery / GCP auth) and any outbound-HTTP paths in CI/staging. A full local run isn't possible (backend env blocked by the unrelated django-celery-beat 2.5.0 wheel quirk); connectors-level imports are verified but live auth/transfer flows are not.
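For reference, an import-level smoke test of the kind described above could look roughly like the sketch below. This is an illustration, not the script actually used for this PR; `smoke_import` is a hypothetical helper, and the module list is copied from the description. It only proves that the modules import, not that live auth or transfer flows work.

```python
import importlib


def smoke_import(module_names):
    """Import each module; map its name to a version string or an error."""
    results = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # any import-time failure counts as a miss
            results[name] = f"FAILED: {type(exc).__name__}: {exc}"
        else:
            # Not every module exposes __version__, so fall back to a marker.
            results[name] = getattr(module, "__version__", "imported")
    return results


if __name__ == "__main__":
    # Module list taken from the PR description.
    modules = ["urllib3", "google.auth", "google.cloud.bigquery", "pymysql"]
    for name, status in smoke_import(modules).items():
        print(f"{name}: {status}")
```

Run inside the connectors environment, any `FAILED` line points at a package that did not survive the upgrade at import time.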
Database Migrations
None.
Env Config
None.
Relevant Docs
urllib3 2.0 migration guide · google-auth changelog
Related Issues or PRs
Final PR of the Dependabot remediation series (#2038 frontend, #2039 python-transitive, #2040 Django, #2041 Authlib/PyMySQL).
Deferred — protobuf 5.x
protobuf is still capped `<5` by additional google/grpc libraries (beyond `google-cloud-bigquery`, which this PR leaves pinned), and 4 → 5 is a major-version jump. It stays on the 4.25.x LTS line (latest patch), which carries the relevant backported fixes. Moving to 5.x warrants its own PR with a coordinated google-cloud / grpcio upgrade + testing.
Dependencies Versions
urllib3 1.26.20 → 2.7.0 · google-auth 2.20.0 → 2.53.0
Notes on Testing
- `uv lock --check` on all 4 workspaces ✓
Screenshots
N/A — dependency change.
Checklist
I have read and understood the Contribution Guidelines.