fix: smoother progress for xet downloads #4059

Open
tobocop2 wants to merge 1 commit into huggingface:main from tobocop2:fix/xet-tqdm-granularity

@tobocop2 tobocop2 commented Apr 6, 2026

Fixes #4058.

Problem

Xet downloads barely report progress. For a 2.5 GB file the bar updates only about 9 times in total, for example 0% → 1.7% → 5.6% → 11% → 25% → 84% → 100%. The jump from 25% to 84% covers roughly 1.5 GB with no update at all, so the download looks completely stuck.

The cause: xet_get() and download_bucket_files() use a 1-arg callback. xet-core supports a 2-arg signature that gives fine-grained network-level updates instead.
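
To illustrate the distinction, here is a minimal sketch of how a library could choose between coarse and detailed reporting based on the callback's arity. It is not xet-core's actual code (xet-core is written in Rust); `positional_arity`, `dispatch_progress`, and the shape of `detail` are hypothetical names invented for this example.

```python
import inspect
from typing import Any, Callable


def positional_arity(fn: Callable[..., Any]) -> int:
    """Count the positional parameters a callable accepts."""
    positional_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    )
    params = inspect.signature(fn).parameters.values()
    return sum(1 for p in params if p.kind in positional_kinds)


def dispatch_progress(callback: Callable[..., Any], increment: int, detail: Any) -> None:
    """Hypothetical dispatcher: a 2-arg callback gets the detailed update,
    a 1-arg callback only gets a byte increment."""
    if positional_arity(callback) >= 2:
        # Detailed mode: (aggregate update, per-item updates).
        callback(detail, [])
    else:
        # Coarse mode: a single number of bytes.
        callback(increment)
```

Under this assumption, switching a callback from one parameter to two is enough to opt in to the detailed updates, which is the behaviour this PR relies on.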

Fix

make_xet_progress_callback(progress_bar, file_size) returns a 2-arg callback so xet-core gives us useful progress. Transfer bytes are scaled to file size so the bar tracks 0-100% correctly (xet dedup can make transfer size differ from file size).

Wired into both xet_get() and download_bucket_files(). No public API changes.
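
The scaling idea can be sketched as follows. This is an illustration under assumptions, not the code in this PR: the attribute names `total_transfer_bytes` and `total_transfer_bytes_completed`, the `FakeTotalUpdate` and `FakeBar` stand-ins, the name `make_scaled_callback_sketch`, and the fallback for an unknown file size are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class FakeTotalUpdate:
    # Hypothetical stand-in for the first callback argument.
    total_transfer_bytes: int
    total_transfer_bytes_completed: int


class FakeBar:
    """Minimal stand-in for a tqdm-like bar: tracks a running count."""

    def __init__(self) -> None:
        self.n = 0.0

    def update(self, n: float) -> None:
        self.n += n


def make_scaled_callback_sketch(
    bar: Any, file_size: Optional[int]
) -> Callable[[Any, List[Any]], None]:
    reported = 0.0  # bytes already pushed to the bar, in file-size units

    def callback(total_update: Any, item_updates: List[Any]) -> None:
        nonlocal reported
        transfer_total = total_update.total_transfer_bytes
        completed = total_update.total_transfer_bytes_completed
        if transfer_total <= 0:
            return  # nothing meaningful to report yet
        if file_size:
            # Scale transfer progress to the file size and cap at 100%.
            target = min(completed / transfer_total, 1.0) * file_size
        else:
            # Assumed fallback for unknown size: report raw transfer bytes.
            target = float(completed)
        delta = target - reported
        if delta > 0:  # ignore zero increments
            bar.update(delta)
            reported = target

    return callback
```

Tracking the cumulative `reported` value instead of forwarding increments directly means the bar can never exceed `file_size`, even when dedup makes the transfer total differ from the file size.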

Demo

Qwen3 4B (2.5 GB, xet-stored) with a custom tqdm_class callback. Builds on #4056 which fixes custom tqdm_class in non-TTY environments.

Before (bar appears stuck)

[Image: old coarse progress]

After (smooth progress)

[Image: new fine-grained progress]

Note

Both downloads finish in roughly the same time (~80s). The difference is purely in how progress is reported.

Multi-file: snapshot_download Qwen3-8B (5 xet shards, 16.4 GB)

Before (bar barely moves)

[Image: old snapshot progress]

After (smooth per-file progress)

[Image: new snapshot progress]

Related


Note

Low Risk
Low risk: changes are limited to progress reporting callbacks for Xet downloads, with new tests covering scaling/capping and edge cases. Main risk is minor regressions in progress bar updates or callback compatibility with hf_xet.download_files.

Overview
Improves progress reporting for Xet-powered downloads.

Adds make_xet_progress_callback to provide the 2-argument progress callback expected by xet-core for fine-grained network updates, while scaling transfer bytes to the expected file size (and capping at 100%) so tqdm bars advance smoothly and finish correctly.

Wires the new callback into both xet_get() and HfApi.download_bucket_files(), and adds focused tests validating the callback signature, scaling behavior (including dedup/transfer-total mismatch), and edge cases like zero increments and unknown file sizes.

Reviewed by Cursor Bugbot for commit 6a8f8a3.

@tobocop2 tobocop2 force-pushed the fix/xet-tqdm-granularity branch from e124bb4 to dcc4347 Compare April 6, 2026 17:06
@tobocop2 tobocop2 marked this pull request as draft April 6, 2026 17:10
@tobocop2 tobocop2 force-pushed the fix/xet-tqdm-granularity branch 2 times, most recently from 84bde27 to 205c369 Compare April 6, 2026 17:15
@tobocop2 tobocop2 marked this pull request as ready for review April 6, 2026 17:17
@tobocop2 tobocop2 force-pushed the fix/xet-tqdm-granularity branch from 205c369 to a58216f Compare April 6, 2026 17:20
@tobocop2 tobocop2 changed the title from "fix: use fine-grained xet-core callback for smoother tqdm progress" to "fix: smoother progress bars for xet downloads" Apr 6, 2026
@tobocop2 tobocop2 changed the title from "fix: smoother progress bars for xet downloads" to "fix: smoother progress for xet downloads" Apr 6, 2026
@tobocop2 tobocop2 force-pushed the fix/xet-tqdm-granularity branch 4 times, most recently from 26d9fca to 9a69c7a Compare April 7, 2026 02:01
Switch xet_get() from a 1-arg to 2-arg callback so xet-core reports
progress frequently instead of barely at all.

Fixes huggingface#4058
@tobocop2 tobocop2 force-pushed the fix/xet-tqdm-granularity branch from 9a69c7a to 6a8f8a3 Compare April 7, 2026 04:54
Development

Successfully merging this pull request may close these issues.

bug: xet downloads barely report progress (bar appears stuck on large files)