Add scripts+workflow to build and upload tarballs from artifacts by ScottTodd · Pull Request #4448 · ROCm/TheRock

ScottTodd · 2026-04-09T22:48:44Z

Motivation

We'd like to produce tarballs as part of multi-arch release pipelines. For context, see:

This will also enable building JAX packages as part of CI pipelines, see:

[CI] Build and test JAX Python packages as part of ci.yml #3878

Technical Details

This downloads artifacts from a workflow run (current workflow run when included as part of CI/CD workflows, or a prior workflow for testing or repackaging) and then uploads them to an artifacts bucket (e.g. therock-dev-artifacts). Release workflows (to be added) can then choose to copy these tarballs to a tarballs bucket (e.g. therock-dev-tarball).

Important

The workflow is not yet integrated into any workflows via workflow_call. It is only run manually via workflow_dispatch.

Tarball files use substantial storage (2GB+ per tarball), so I'd like to only include this for release builds and opt-in for PRs that want to build JAX -- at least until KPACK_SPLIT_ARTIFACTS is flipped and we can produce a single "multiarch" tarball instead of separate tarballs per family.

Behavior with and without `KPACK_SPLIT_ARTIFACTS`

In this initial implementation:

| Condition | Behavior |
| --- | --- |
| `KPACK_SPLIT_ARTIFACTS` disabled | Creates a single tarball per GPU family |
| `KPACK_SPLIT_ARTIFACTS` enabled | Creates a single tarball per GPU target and a "multiarch" tarball with all GPU targets |

We may later want to also produce tarballs without including test artifacts, produce larger groups independent of the current families like "all Radeon GPU targets", etc. All of that is just changes to the filtering and repackaging.
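As a rough illustration of the grouping described above, here is a minimal sketch of how the set of tarballs could be planned in each mode. The function name `plan_tarballs`, its parameters, and the `FAMILY_TARGETS` mapping are hypothetical and not taken from `build_tarballs.py`; the file name pattern is inferred from the log output in the Test Result section, and the `gfx110X-all` expansion matches the one reported later in this thread.

```python
from typing import Mapping, Sequence

# Hypothetical family -> target mapping, for illustration only.
FAMILY_TARGETS = {
    "gfx110X-all": ["gfx1100", "gfx1101", "gfx1102", "gfx1103"],
    "gfx1151": ["gfx1151"],
}


def plan_tarballs(
    families: Sequence[str],
    split_artifacts: bool,
    family_targets: Mapping[str, Sequence[str]],
    platform: str,
    version: str,
) -> list[str]:
    """Return the tarball file names that would be produced (a sketch)."""

    def name(group: str) -> str:
        return f"therock-dist-{platform}-{group}-{version}.tar.gz"

    if not split_artifacts:
        # KPACK_SPLIT_ARTIFACTS disabled: one tarball per GPU family.
        return [name(family) for family in families]

    # KPACK_SPLIT_ARTIFACTS enabled: one tarball per GPU target,
    # plus one "multiarch" tarball covering all targets.
    targets: list[str] = []
    for family in families:
        for target in family_targets.get(family, [family]):
            if target not in targets:
                targets.append(target)
    return [name(target) for target in targets] + [name("multiarch")]
```

Later changes such as "no test artifacts" or "all Radeon GPU targets" would presumably only alter how the groups are chosen here, not how each group is packaged.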

Downloading and extracting

This implementation runs a loop around:

# --stage=all: artifacts from all stages (foundation, math-libs, etc.) and all components (lib, doc, test, etc.)
# --amdgpu-families: filter to a single family
# --flatten: extract and flatten into the "dist" directory in one command
# --download-cache-dir: reuse generic artifacts downloaded by prior calls
python build_tools/artifact_manager.py fetch \
    --stage=all \
    --amdgpu-families=${families_str} \
    --output-dir=${output_dir} \
    --flatten \
    --download-cache-dir=${download_cache_dir}

This has the advantage of being easy to reproduce outside of the script and reusing cached downloaded artifacts for local debugging and CI efficiency. We also considered fetching and not flattening, then using artifacts.py::ArtifactCatalog to repackage as build_python_packages.py does (using py_packaging.py), but this is simpler.
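For readers who want to see the loop shape, here is a minimal sketch of driving that command once per family from Python. The flags are the ones shown above; the helper names `build_fetch_command` and `fetch_all`, the directory layout under `work_dir`, and the injectable `run` parameter are assumptions for illustration and may not match the actual script.

```python
import subprocess
import sys
from pathlib import Path


def build_fetch_command(families_str, output_dir, download_cache_dir):
    """Build the artifact_manager.py fetch command for one family."""
    return [
        sys.executable,
        "build_tools/artifact_manager.py",
        "fetch",
        "--stage=all",
        f"--amdgpu-families={families_str}",
        f"--output-dir={output_dir}",
        "--flatten",
        f"--download-cache-dir={download_cache_dir}",
    ]


def fetch_all(families, work_dir, run=subprocess.run):
    """Fetch and flatten artifacts for each family, sharing one cache."""
    # Hypothetical layout: one output directory per family, one shared cache.
    cache_dir = Path(work_dir) / "download-cache"
    output_dirs = []
    for family in families:
        output_dir = Path(work_dir) / family
        run(build_fetch_command(family, output_dir, cache_dir), check=True)
        output_dirs.append(output_dir)
    return output_dirs
```

Because every call points at the same cache directory, generic artifacts downloaded for the first family can be reused for the rest.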

Compression

This implementation produces .tar.gz, matching existing tarball releases. Compression would be faster and more efficient using .tar.zst. I ran some benchmarks on my Windows dev machine:

Benchmark results:

| Method | Time (s) | Size (MB) | Ratio | Notes |
| --- | --- | --- | --- | --- |
| tar-cfz | 21.0 | 419.4 | 29.5% | current default |
| gz-1 | 12.2 | 449.8 | 31.6% | |
| gz-3 | 15.2 | 440.5 | 31.0% | |
| gz-6 | 26.4 | 420.9 | 29.6% | |
| gz-9 | 67.9 | 420.2 | 29.5% | |
| zst-1 | 3.3 | 420.2 | 29.5% | matches gz-6 ratio, 6x faster |
| zst-3 | 4.4 | 360.5 | 25.3% | sweet spot |
| zst-6 | 8.0 | 343.9 | 24.2% | |
| zst-9 | 10.0 | 317.9 | 22.3% | |
| zst-19 | 197.9 | 199.4 | 14.0% | |
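A benchmark of this kind can be approximated with the standard library alone. The sketch below times `.tar.gz` creation at several gzip levels using `tarfile`; it is not the script used for the numbers above, the helper names are hypothetical, and it assumes "Ratio" means compressed size divided by uncompressed size. The zstd rows are left out because they would need a zstd library that this sketch does not assume is installed.

```python
import tarfile
import tempfile
import time
from pathlib import Path


def dir_size(path):
    """Total size in bytes of all regular files under path."""
    return sum(p.stat().st_size for p in Path(path).rglob("*") if p.is_file())


def benchmark_gzip_levels(src_dir, levels=(1, 3, 6, 9)):
    """Compress src_dir at each gzip level; return time, size, and ratio."""
    raw_bytes = dir_size(src_dir)  # assumes src_dir is not empty
    results = []
    with tempfile.TemporaryDirectory() as tmp:
        for level in levels:
            out = Path(tmp) / f"gz-{level}.tar.gz"
            start = time.perf_counter()
            with tarfile.open(out, "w:gz", compresslevel=level) as tar:
                tar.add(src_dir, arcname=".")
            elapsed = time.perf_counter() - start
            size = out.stat().st_size
            results.append(
                {
                    "method": f"gz-{level}",
                    "seconds": elapsed,
                    "bytes": size,
                    "ratio": size / raw_bytes,
                }
            )
    return results
```

Timings from a sketch like this depend heavily on the machine and the input, so they are only comparable within a single run.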

I wrapped compression in a ProcessPoolExecutor, since parallel compression makes efficient use of CPU cores. Sample benchmarks show the speedup (so the cores are not oversubscribed):

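Benchmark results:

| Workers | Wall (s) | Avg/job | Speedup | Efficiency |
| --- | --- | --- | --- | --- |
| 1 | 244.2 | 24.4 | 1.0x | 103% |
| 2 | 128.3 | 25.6 | 2.0x | 98% |
| 4 | 79.4 | 26.6 | 3.2x | 79% |
| 6 | 54.4 | 27.2 | 4.6x | 77% |
| 8 | 54.0 | 27.6 | 4.7x | 58% |
| 10 | 28.8 | 28.6 | 8.8x | 88% |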
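The pattern being benchmarked looks roughly like the sketch below: each tarball is an independent job submitted to a process pool. The names `compress_dir` and `compress_all` and the `executor_cls` parameter are hypothetical and only illustrate the approach; the real script may structure this differently.

```python
import tarfile
from concurrent.futures import ProcessPoolExecutor


def compress_dir(src_dir, tarball):
    """Write src_dir into a .tar.gz archive and return the archive path."""
    with tarfile.open(tarball, "w:gz") as tar:
        tar.add(src_dir, arcname=".")
    return tarball


def compress_all(jobs, max_workers, executor_cls=ProcessPoolExecutor):
    """Compress every (src_dir, tarball) pair in parallel.

    jobs is a list of (src_dir, tarball) string pairs. Results are
    returned in the same order as the jobs were given.
    """
    with executor_cls(max_workers=max_workers) as pool:
        futures = [pool.submit(compress_dir, src, dst) for src, dst in jobs]
        # result() re-raises any exception from the worker process.
        return [future.result() for future in futures]
```

When this runs as a script, the call to `compress_all` should sit under an `if __name__ == "__main__":` guard, since process pools on Windows start workers by re-importing the main module. Making the executor class a parameter also lets tests substitute a `ThreadPoolExecutor`.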

Test Plan

- New unit tests for some logic
- Tested locally with artifacts from prior workflow runs, with and without KPACK_SPLIT_ARTIFACTS: artifacts were downloaded, packaged into the expected tarballs, and "uploaded" to a staging directory
- Triggered the new workflow on my fork and checked that the workflow succeeds (except for upload, which is missing credentials)

Test Result

Without KPACK_SPLIT_ARTIFACTS: https://github.com/ScottTodd/TheRock/actions/runs/24205988455/job/70661826987

Building tarballs for 2 families: gfx1151, gfx110X-all
  Platform: linux
  Version: 7.13.0.dev0+83ae8235312791cd7302e3f50c9935887d62b5a3
  Output: /home/runner/work/TheRock/TheRock/tarballs
...
Done. Tarballs in /home/runner/work/TheRock/TheRock/tarballs:
  therock-dist-linux-gfx110X-all-7.13.0.dev0+83ae8235312791cd7302e3f50c9935887d62b5a3.tar.gz (2711.5 MB)
  therock-dist-linux-gfx1151-7.13.0.dev0+83ae8235312791cd7302e3f50c9935887d62b5a3.tar.gz (2820.1 MB)
...
[INFO] Uploading to s3://therock-ci-artifacts-external/ScottTodd-TheRock/24205988455-linux/tarballs

With KPACK_SPLIT_ARTIFACTS: https://github.com/ScottTodd/TheRock/actions/runs/24217435275/job/70701188683

Building tarballs for 2 families: gfx1151, gfx1100
  Platform: linux
  Version: 7.13.0.dev0+83ae8235312791cd7302e3f50c9935887d62b5a3
  Output: /home/runner/work/TheRock/TheRock/tarballs
...
Done. Tarballs in /home/runner/work/TheRock/TheRock/tarballs:
  therock-dist-linux-gfx1100-7.13.0.dev0+83ae8235312791cd7302e3f50c9935887d62b5a3.tar.gz (2891.5 MB)
  therock-dist-linux-gfx1151-7.13.0.dev0+83ae8235312791cd7302e3f50c9935887d62b5a3.tar.gz (2907.4 MB)
  therock-dist-linux-multiarch-7.13.0.dev0+83ae8235312791cd7302e3f50c9935887d62b5a3.tar.gz (3085.2 MB)
...
[INFO] Uploading to s3://therock-ci-artifacts-external/ScottTodd-TheRock/24217435275-linux/tarballs

Submission Checklist

Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

ScottTodd · 2026-04-09T22:53:26Z

Here's my worklog with more notes, prototypes, etc., fyi: https://github.com/ScottTodd/claude-rocm-workspace/blob/main/tasks/active/multi-arch-releases.md#workstream-2a-build-multi-arch-tarballs

erman-gurses

Looks good overall, added one concern - will do one more pass tomorrow.

build_tools/github_actions/upload_tarballs.py

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The upload path includes the platform ({run_id}-{platform}/tarballs/), so the script needs to know the target platform rather than auto-detecting from the current system. This matters when building Windows tarballs on a Linux runner.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
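To make the platform point concrete, here is a minimal sketch of building the upload location from an explicit platform argument instead of the host OS. The function name `tarball_upload_uri`, its parameters, and the set of accepted platforms are assumptions; the URI layout is inferred from the upload log lines in the Test Result section and is not taken from `upload_tarballs.py`.

```python
# Hypothetical set of supported platforms, for illustration only.
SUPPORTED_PLATFORMS = ("linux", "windows")


def tarball_upload_uri(bucket, repo_prefix, run_id, platform):
    """Return the S3 URI that tarballs for this run and platform go to."""
    if platform not in SUPPORTED_PLATFORMS:
        raise ValueError(f"unsupported platform: {platform!r}")
    # The platform comes from the caller, not from the machine running
    # the script, so a Linux runner can upload Windows tarballs.
    return f"s3://{bucket}/{repo_prefix}/{run_id}-{platform}/tarballs"
```

For example, `tarball_upload_uri("therock-ci-artifacts-external", "ScottTodd-TheRock", "24205988455", "linux")` reproduces the destination shown in the first test log.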

ScottTodd · 2026-04-10T17:45:21Z

build_tools/build_tarballs.py

+        f"--amdgpu-families={families_str}",
+        "--expand-family-to-targets",


@marbre this from #4449 is working as expected now. It expands gfx110X-all to gfx1100, gfx1101, gfx1102, and gfx1103:

https://github.com/ScottTodd/TheRock/actions/runs/24255576558/job/70826158778

python build_tools/build_tarballs.py \
    --run-id="24187929660" \
    --run-github-repo="ROCm/TheRock" \
    --dist-amdgpu-families="gfx110X-all;gfx1151" \
...
++ Downloading prim_test_gfx1100.tar.zst
++ Downloading prim_test_gfx1101.tar.zst
++ Downloading prim_test_gfx1102.tar.zst
++ Downloading prim_test_gfx1103.tar.zst

erman-gurses

LGTM!

ScottTodd requested a review from marbre April 9, 2026 22:48

github-project-automation bot added this to TheRock Triage Apr 9, 2026

github-project-automation bot moved this to TODO in TheRock Triage Apr 9, 2026

ScottTodd requested a review from erman-gurses April 10, 2026 01:26

erman-gurses reviewed Apr 10, 2026

ScottTodd and others added 4 commits April 10, 2026 10:19

Add scripts+workflow to build and upload tarballs from artifacts

031fe60

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Move run-github-repo next to run-id

efe368e

Add --expand-family-to-targets arg to artifact_manager.py call

a763196

ScottTodd force-pushed the multi-arch-build-tarballs branch from 54e29a9 to a763196 Compare April 10, 2026 17:28

ScottTodd marked this pull request as ready for review April 10, 2026 17:43

ScottTodd requested a review from erman-gurses April 10, 2026 17:43

ScottTodd commented Apr 10, 2026

erman-gurses approved these changes Apr 11, 2026

ScottTodd wants to merge 4 commits into ROCm:main from ScottTodd:multi-arch-build-tarballs
