perf(encoding): Use processFixedWidthRun in FBW bulkScan for filter/hook support #638
Closed
HuamengJiang wants to merge 1 commit into facebookincubator:main from
@HuamengJiang has exported this pull request. If you are a Meta employee, you can view the originating Diff in D99218756.
HuamengJiang pushed a commit to HuamengJiang/nimble-1 that referenced this pull request on Apr 11, 2026
…ebookincubator#638)
Summary: Add a kSameSizeIntegral check to FixedBitWidthEncoding::readWithVisitor so that the fast bulk-scan path is used when physicalType and OutputType have the same size (e.g., uint32_t → int32_t for dictionary indices). Also fixed a bug with scattering and added unit tests.
Differential Revision: D99218756
2a5dbe5 to 8a03cc2
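As a rough illustration of the same-size check described in the commit summary above, the sketch below defines a compile-time trait. The name `kSameSizeIntegralSketch` and its exact definition are assumptions made for illustration; the real `kSameSizeIntegral` in Nimble may be defined differently.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for the check named in the summary (an assumption,
// not Nimble's actual definition): true when both types are integral and
// occupy the same number of bytes, so decoded values can be written straight
// into the output buffer without widening or narrowing.
template <typename PhysicalType, typename OutputType>
inline constexpr bool kSameSizeIntegralSketch =
    std::is_integral_v<PhysicalType> && std::is_integral_v<OutputType> &&
    sizeof(PhysicalType) == sizeof(OutputType);

// The dictionary-index case from the summary: uint32_t decoded into int32_t.
static_assert(kSameSizeIntegralSketch<std::uint32_t, std::int32_t>);
// Different widths would not qualify for the fast path under this sketch.
static_assert(!kSameSizeIntegralSketch<std::uint32_t, std::int64_t>);
// Non-integral output types are excluded as well.
static_assert(!kSameSizeIntegralSketch<std::uint32_t, float>);
```

Under this assumption, a reader could branch with `if constexpr` on the trait to pick the bulk-scan path at compile time.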
HuamengJiang pushed a commit to HuamengJiang/nimble-1 that referenced this pull request on Apr 11, 2026
…ebookincubator#638)
Summary: Pull Request resolved: facebookincubator#638
Add a kSameSizeIntegral check to FixedBitWidthEncoding::readWithVisitor so that the fast bulk-scan path is used when physicalType and OutputType have the same size (e.g., uint32_t → int32_t for dictionary indices). Also fixed a bug with scattering and added unit tests.
Differential Revision: D99218756
8a03cc2 to 74d53c1
HuamengJiang pushed a commit to HuamengJiang/nimble-1 that referenced this pull request on Apr 22, 2026
…bator#638)
Summary: Additional scenarios enabled for the FBW fast path:
1) Unsigned types: add a kSameSizeIntegral check to FixedBitWidthEncoding::readWithVisitor so that the fast bulk-scan path is used when physicalType and OutputType have the same size (e.g., uint32_t → int32_t for dictionary indices).
2) Filters and value hooks: allow entering the fast path for filtering and value hooks, since the 3x-5x gain outweighs most tradeoffs (memory is left as a follow-up). This is done by calling processFixedWidthRun.
Also fixed a scattering bug in bulkScan's post-decode path. When kScatter=true (dense reads with nulls), the decoded values need to be "scattered" from their packed non-null positions to their correct output positions, leaving gaps where nulls appear. For example, if rows [0,1,2,3,4] have nulls at positions 1 and 3, the 3 decoded values must be placed at output positions [0,2,4], not [0,1,2].
Differential Revision: D99218756
74d53c1 to 450dce7
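The scatter step described in the commit message above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the PR: `scatterInPlace` is a hypothetical helper, the null mask is modeled as a `std::vector<bool>`, and the decoded values are assumed to sit packed at the front of a buffer that is already sized for all output rows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not Nimble's API). `values` holds `numNonNull` decoded
// values packed at indices [0, numNonNull); isNull[row] tells whether output
// row `row` is null. Each packed value is moved to its output row in place.
// The loop walks backwards so a packed value is never overwritten before it
// has been moved. Slots for null rows are left with unspecified contents.
template <typename T>
void scatterInPlace(
    std::vector<T>& values,
    const std::vector<bool>& isNull,
    std::size_t numNonNull) {
  assert(values.size() >= isNull.size());
  std::size_t src = numNonNull;
  for (std::size_t row = isNull.size(); row-- > 0;) {
    if (!isNull[row]) {
      assert(src > 0);
      values[row] = values[--src];
    }
  }
  // Every packed value must have found an output row.
  assert(src == 0);
}
```

With the example from the commit message (five rows, nulls at positions 1 and 3), three packed values end up at output positions 0, 2, and 4.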
HuamengJiang pushed a commit to HuamengJiang/nimble-1 that referenced this pull request on Apr 23, 2026
450dce7 to 8341fc7
HuamengJiang pushed a commit to HuamengJiang/nimble-1 that referenced this pull request on Apr 23, 2026
…bator#638)
Summary: Pull Request resolved: facebookincubator#638
Additional scenarios enabled for the FBW fast path:
1) Unsigned types: add a kSameSizeIntegral check to FixedBitWidthEncoding::readWithVisitor so that the fast bulk-scan path is used when physicalType and OutputType have the same size (e.g., uint32_t → int32_t for dictionary indices).
2) Filters and value hooks: allow entering the fast path for filtering and value hooks, since the 3x-5x gain outweighs most tradeoffs (memory is left as a follow-up). This is done by calling processFixedWidthRun.
Also fixed a scattering bug in bulkScan's post-decode path. When kScatter=true (dense reads with nulls), the decoded values need to be "scattered" from their packed non-null positions to their correct output positions, leaving gaps where nulls appear. For example, if rows [0,1,2,3,4] have nulls at positions 1 and 3, the 3 decoded values must be placed at output positions [0,2,4], not [0,1,2].
Differential Revision: D99218756
8341fc7 to a3cdb0b
…ook support
Summary: Replaces the simple addNumValues call in bulkScan's post-decode phase with processFixedWidthRun, which handles scatter (null gaps), filter evaluation, and hook forwarding. This allows the FBW fast path to be used with filters and hooks by moving the filter/hook gate from compile-time to runtime (useFastPath).
bulkScan now correctly:
- Scatters values to their output positions when nulls are present
- Evaluates filters and records passing row numbers in filterHits
- Forwards values to hooks instead of the reader buffer
Also syncs the legacy FixedBitWidthEncoding with the non-legacy version: adds bulkScan and the no-filter/no-hook fast path (Component 1 gate). The legacy encoding uses the stricter compile-time gate (!kHasFilter && !kHasHook) since the filter+nulls fast path hasn't been validated for legacy contexts.
Reviewed By: xiaoxmeng
Differential Revision: D99218756
a3cdb0b to 83180ea
This pull request has been merged in 3871dec.
Summary:
Replaces the simple addNumValues call in bulkScan's post-decode phase with processFixedWidthRun, which handles scatter (null gaps), filter evaluation, and hook forwarding. This allows the FBW fast path to be used with filters and hooks by moving the filter/hook gate from compile-time to runtime (useFastPath).
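bulkScan now correctly:
- Scatters values to their output positions when nulls are present
- Evaluates filters and records passing row numbers in filterHits
- Forwards values to hooks instead of the reader buffer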
Also syncs the legacy FixedBitWidthEncoding with the non-legacy version: adds bulkScan and the no-filter/no-hook fast path (Component 1 gate). The legacy encoding uses the stricter compile-time gate (!kHasFilter && !kHasHook) since the filter+nulls fast path hasn't been validated for legacy contexts.
Reviewed By: xiaoxmeng
Differential Revision: D99218756
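To make the post-decode behavior in the summary more concrete, here is a simplified sketch of a run processor that decides at runtime whether a filter or hook is present. It is not the real processFixedWidthRun: `RunSinkSketch`, `processRunSketch`, the use of `std::function`, and the ordering (filter first, then hook or buffer) are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical sink (not Nimble's types). An empty std::function means
// "no filter" or "no hook", so the decision is made at runtime rather
// than through template parameters.
template <typename T>
struct RunSinkSketch {
  std::function<bool(T)> filter;              // optional value filter
  std::function<void(std::int32_t, T)> hook;  // optional value hook
  std::vector<T> values;                      // reader buffer
  std::vector<std::int32_t> filterHits;       // row numbers that passed
};

// Processes one decoded run: decoded[i] is the value for row rows[i].
template <typename T>
void processRunSketch(
    const std::vector<T>& decoded,
    const std::vector<std::int32_t>& rows,
    RunSinkSketch<T>& sink) {
  assert(decoded.size() == rows.size());
  for (std::size_t i = 0; i < decoded.size(); ++i) {
    if (sink.filter) {
      if (!sink.filter(decoded[i])) {
        continue;  // value rejected by the filter
      }
      sink.filterHits.push_back(rows[i]);
    }
    if (sink.hook) {
      // Values go to the hook instead of the reader buffer.
      sink.hook(rows[i], decoded[i]);
    } else {
      sink.values.push_back(decoded[i]);
    }
  }
}
```

A caller with neither a filter nor a hook gets plain buffer appends, which roughly corresponds to the addNumValues behavior that the summary says was replaced.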