Skip to content

perf(encoding): Use processFixedWidthRun in FBW bulkScan for filter/hook support#638

Closed
HuamengJiang wants to merge 1 commit intofacebookincubator:mainfrom
HuamengJiang:export-D99218756
Closed

perf(encoding): Use processFixedWidthRun in FBW bulkScan for filter/hook support#638
HuamengJiang wants to merge 1 commit intofacebookincubator:mainfrom
HuamengJiang:export-D99218756

Conversation

@HuamengJiang
Copy link
Copy Markdown

@HuamengJiang HuamengJiang commented Apr 4, 2026

Summary:
Replaces the simple addNumValues call in bulkScan's post-decode phase with processFixedWidthRun, which handles scatter (null gaps), filter evaluation, and hook forwarding. This allows the FBW fast path to be used with filters and hooks by moving the filter/hook gate from compile-time to runtime (useFastPath).

bulkScan now correctly:

  • Scatters values to their output positions when nulls are present
  • Evaluates filters and records passing row numbers in filterHits
  • Forwards values to hooks instead of the reader buffer

Also syncs the legacy FixedBitWidthEncoding with the non-legacy version: adds bulkScan and the no-filter/no-hook fast path (Component 1 gate). The legacy encoding uses the stricter compile-time gate (!kHasFilter && !kHasHook) since the filter+nulls fast path hasn't been validated for legacy contexts.

Reviewed By: xiaoxmeng

Differential Revision: D99218756

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Meta Open Source bot. label Apr 4, 2026
@meta-codesync
Copy link
Copy Markdown

meta-codesync Bot commented Apr 4, 2026

@HuamengJiang has exported this pull request. If you are a Meta employee, you can view the originating Diff in D99218756.

@meta-codesync meta-codesync Bot changed the title perf(encoding): Enable FBW fast path for unsigned integral types perf(encoding): Enable FBW fast path for unsigned integral types (#638) Apr 11, 2026
HuamengJiang pushed a commit to HuamengJiang/nimble-1 that referenced this pull request Apr 11, 2026
…ebookincubator#638)

Summary:

Add kSameSizeIntegral check to FixedBitWidthEncoding::readWithVisitor
so the fast bulk-scan path is used when physicalType and OutputType have
the same size (e.g., uint32_t → int32_t for dictionary indices).

Also fixed a bug with scattering and added unit tests.

Differential Revision: D99218756
HuamengJiang pushed a commit to HuamengJiang/nimble-1 that referenced this pull request Apr 11, 2026
…ebookincubator#638)

Summary:
Pull Request resolved: facebookincubator#638

Add kSameSizeIntegral check to FixedBitWidthEncoding::readWithVisitor
so the fast bulk-scan path is used when physicalType and OutputType have
the same size (e.g., uint32_t → int32_t for dictionary indices).

Also fixed a bug with scattering and added unit tests.

Differential Revision: D99218756
@meta-codesync meta-codesync Bot changed the title perf(encoding): Enable FBW fast path for unsigned integral types (#638) perf(encoding): Enable FBW fast path for more scenarios (#638) Apr 22, 2026
HuamengJiang pushed a commit to HuamengJiang/nimble-1 that referenced this pull request Apr 22, 2026
…bator#638)

Summary:

Additional scenarios enabled for FBW fast path
1) unsigned type
Add kSameSizeIntegral check to FixedBitWidthEncoding::readWithVisitor
so the fast bulk-scan path is used when physicalType and OutputType have
the same size (e.g., uint32_t → int32_t for dictionary indices).
2) allow entering fast path for filtering and value hook, since the gain is 3x-5x, outweighing most tradeoffs. (Memory is a follow up to take). This is done through calling processFixedWidthRun.

Also fixed a bug with scattering: in bulkScan's post-decode path. When kScatter=true (dense reads with nulls), the decoded values need to be "scattered" from their packed non-null positions
to their correct output positions, leaving gaps where nulls appear. For example, if rows [0,1,2,3,4] have nulls at positions 1 and 3, the 3 decoded values must be placed at output positions [0,2,4], not [0,1,2].

Differential Revision: D99218756
HuamengJiang pushed a commit to HuamengJiang/nimble-1 that referenced this pull request Apr 23, 2026
…bator#638)

Summary:

Additional scenarios enabled for FBW fast path
1) unsigned type
Add kSameSizeIntegral check to FixedBitWidthEncoding::readWithVisitor
so the fast bulk-scan path is used when physicalType and OutputType have
the same size (e.g., uint32_t → int32_t for dictionary indices).
2) allow entering fast path for filtering and value hook, since the gain is 3x-5x, outweighing most tradeoffs. (Memory is a follow up to take). This is done through calling processFixedWidthRun.

Also fixed a bug with scattering: in bulkScan's post-decode path. When kScatter=true (dense reads with nulls), the decoded values need to be "scattered" from their packed non-null positions
to their correct output positions, leaving gaps where nulls appear. For example, if rows [0,1,2,3,4] have nulls at positions 1 and 3, the 3 decoded values must be placed at output positions [0,2,4], not [0,1,2].

Differential Revision: D99218756
HuamengJiang pushed a commit to HuamengJiang/nimble-1 that referenced this pull request Apr 23, 2026
…bator#638)

Summary:
Pull Request resolved: facebookincubator#638

Additional scenarios enabled for FBW fast path
1) unsigned type
Add kSameSizeIntegral check to FixedBitWidthEncoding::readWithVisitor
so the fast bulk-scan path is used when physicalType and OutputType have
the same size (e.g., uint32_t → int32_t for dictionary indices).
2) allow entering fast path for filtering and value hook, since the gain is 3x-5x, outweighing most tradeoffs. (Memory is a follow up to take). This is done through calling processFixedWidthRun.

Also fixed a bug with scattering: in bulkScan's post-decode path. When kScatter=true (dense reads with nulls), the decoded values need to be "scattered" from their packed non-null positions
to their correct output positions, leaving gaps where nulls appear. For example, if rows [0,1,2,3,4] have nulls at positions 1 and 3, the 3 decoded values must be placed at output positions [0,2,4], not [0,1,2].

Differential Revision: D99218756
…ook support

Summary:
Replaces the simple addNumValues call in bulkScan's post-decode phase with processFixedWidthRun, which handles scatter (null gaps), filter evaluation, and hook forwarding. This allows the FBW fast path to be used with filters and hooks by moving the filter/hook gate from compile-time to runtime (useFastPath).

bulkScan now correctly:
- Scatters values to their output positions when nulls are present
- Evaluates filters and records passing row numbers in filterHits
- Forwards values to hooks instead of the reader buffer

Also syncs the legacy FixedBitWidthEncoding with the non-legacy version: adds bulkScan and the no-filter/no-hook fast path (Component 1 gate). The legacy encoding uses the stricter compile-time gate (!kHasFilter && !kHasHook) since the filter+nulls fast path hasn't been validated for legacy contexts.

Reviewed By: xiaoxmeng

Differential Revision: D99218756
@meta-codesync meta-codesync Bot changed the title perf(encoding): Enable FBW fast path for more scenarios (#638) perf(encoding): Use processFixedWidthRun in FBW bulkScan for filter/hook support Apr 28, 2026
@meta-codesync meta-codesync Bot closed this in 3871dec Apr 28, 2026
@meta-codesync
Copy link
Copy Markdown

meta-codesync Bot commented Apr 28, 2026

This pull request has been merged in 3871dec.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Meta Open Source bot. fb-exported Merged meta-exported

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant