Skip to content

feat(cudf): Add CudfGroupId operator for GPU-accelerated GROUPING SETS#17238

Open
shrshi wants to merge 5 commits intofacebookincubator:mainfrom
shrshi:cudf-groupid-operator
Open

feat(cudf): Add CudfGroupId operator for GPU-accelerated GROUPING SETS#17238
shrshi wants to merge 5 commits intofacebookincubator:mainfrom
shrshi:cudf-groupid-operator

Conversation

@shrshi
Copy link
Copy Markdown
Collaborator

@shrshi shrshi commented Apr 18, 2026

Summary

  • Implement CudfGroupId operator to replace the CPU GroupId operator on GPU for SQL GROUPING SETS, CUBE, and ROLLUP operations

Implementation Details

The CudfGroupId operator:

  • Takes a single input batch and produces N output batches (one per grouping set), matching CPU behavior
  • Creates all-null columns for grouping keys not in the current grouping set.
  • Creates constant group_id column (BIGINT) for each grouping set

Testing

Added GroupIdTest.cpp with tests matching core Velox patterns in AggregationTest.cpp:

Implement CudfGroupId operator to replace the CPU GroupId operator on GPU
for SQL GROUPING SETS, CUBE, and ROLLUP operations.

- Add CudfGroupId class inheriting from CudfOperatorBase
- Cycle through grouping sets one at a time (matching CPU behavior)
- Create all-null columns for keys not in current grouping set
- Create constant group_id column for each grouping set
- Optimize column ownership with usage counting (move vs copy)
- Register GroupIdAdapter in OperatorAdapters
- Add comprehensive tests matching core Velox test patterns
@netlify
Copy link
Copy Markdown

netlify Bot commented Apr 18, 2026

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 0c4e87c
🔍 Latest deploy log https://app.netlify.com/projects/meta-velox/deploys/69fa4fb92f807c00082a86b1

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 18, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 18, 2026

Build Impact Analysis

Full build recommended. Files outside the dependency graph changed:

  • velox/experimental/cudf/exec/CMakeLists.txt
  • velox/experimental/cudf/exec/CudfGroupId.cpp
  • velox/experimental/cudf/exec/CudfGroupId.h
  • velox/experimental/cudf/exec/OperatorAdapters.cpp
  • velox/experimental/cudf/tests/CMakeLists.txt
  • velox/experimental/cudf/tests/GroupIdTest.cpp

These directories are not fully covered by the dependency graph. A full build is the safest option.

cmake --build _build/release

Slow path • Graph generated from PR branch

Comment thread velox/experimental/cudf/exec/CudfGroupId.cpp
Comment thread velox/experimental/cudf/exec/CudfGroupId.cpp Outdated
Comment thread velox/experimental/cudf/tests/GroupIdTest.cpp
shrshi added 2 commits April 21, 2026 02:35
Add test that exercises the column copy vs move logic when the same
input column appears in multiple grouping sets.
Only move columns on the last grouping set since inputColumns_ is reused
across multiple getOutput() calls. This is simpler and avoids potential
issues when the same input column maps to multiple output positions.

Add test for same input column mapped to multiple outputs in a grouping set.
@shrshi shrshi requested a review from jinchengchenghh April 21, 2026 17:41
shrshi added 2 commits May 5, 2026 19:51
Fix a bug where the same input column mapped to multiple output positions
(e.g., aliased keys) would be moved twice on the last grouping set,
leaving a null unique_ptr and crashing on table construction. Track
per-call reference counts so columns are moved only on their final use.

Also precompute cudf data types for grouping key columns in the
constructor instead of recomputing them on every doGetOutput() call.
Copy link
Copy Markdown
Collaborator

@simoneves simoneves left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not familiar with this code, so style review only (sorry!)

size_t groupingSetIndex_{0};

/// Total number of grouping sets.
size_t numGroupingSets_;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: init to zero (at least for consistency with the others, even though it's set in the constructor)

size_t numGroupingSets_;

/// Number of grouping key columns in output.
size_t numGroupingKeys_;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(ditto)

auto tempMr = get_temp_mr();
auto numRows = inputSize_;

const auto& mapping = groupingKeyMappings_[groupingSetIndex_];
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: CHECK on index before use.

auto inputIdx = aggregationInputs_[i];
auto outputIdx = numGroupingKeys_ + i;
if (isLastGroupingSet && --remainingUses[inputIdx] == 0) {
outputColumns[outputIdx] = std::move(inputColumns_[inputIdx]);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: CHECK on indices before use (4x)

0,
"GroupIdNode didn't map grouping key {} to input channel",
groupingKey);
auto inputChannel = outputToInputGroupingKeyMapping.at(outputChannel);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: CHECK result of at()

"GroupIdNode didn't map grouping key {} to input channel",
groupingKey);
auto inputChannel = outputToInputGroupingKeyMapping.at(outputChannel);
mappings[outputChannel] = inputChannel;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: CHECK index before use.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants