test(encoding): add further tests for varint encoding to the varint encoding to make it robust (#580)

Open
srsuryadev wants to merge 5 commits into main from export-D96665765

@srsuryadev srsuryadev commented Mar 18, 2026

Summary:

Add further tests to the varint encoding to make it robust

Reviewed By: xiaoxmeng

Differential Revision: D96665765
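
As an illustration of the kind of robustness test this summary describes, here is a minimal round-trip sketch over boundary values. The helpers `encodeVarint`, `decodeVarint`, `boundaryValues`, and `roundTrips` are hypothetical names written for this example; they are not the library's API or the tests added in this PR, and they assume a standard LEB128-style layout (7 payload bits per byte, high bit as continuation flag).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper: appends `value` as a LEB128-style varint.
inline void encodeVarint(uint64_t value, std::vector<uint8_t>& out) {
  while (value >= 0x80) {
    out.push_back(static_cast<uint8_t>((value & 0x7f) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<uint8_t>(value));
}

// Hypothetical helper: decodes one varint at `pos` and advances `pos`.
inline uint64_t decodeVarint(const std::vector<uint8_t>& in, size_t& pos) {
  uint64_t value = 0;
  for (int shift = 0; shift < 64; shift += 7) {
    assert(pos < in.size());
    const uint8_t byte = in[pos++];
    value |= static_cast<uint64_t>(byte & 0x7f) << shift;
    if ((byte & 0x80) == 0) {
      break;
    }
  }
  return value;
}

// Values on both sides of every 7-bit boundary, where the encoded
// length changes, plus the two extremes of the 64-bit range.
inline std::vector<uint64_t> boundaryValues() {
  std::vector<uint64_t> values{0, std::numeric_limits<uint64_t>::max()};
  for (int bits = 7; bits < 64; bits += 7) {
    const uint64_t edge = uint64_t{1} << bits;
    values.push_back(edge - 1);
    values.push_back(edge);
  }
  return values;
}

// Encodes all values back to back, decodes them, and checks that every
// value survives and that the decoder consumed exactly the bytes written.
inline bool roundTrips(const std::vector<uint64_t>& values) {
  std::vector<uint8_t> buffer;
  for (uint64_t v : values) {
    encodeVarint(v, buffer);
  }
  size_t pos = 0;
  for (uint64_t v : values) {
    if (decodeVarint(buffer, pos) != v) {
      return false;
    }
  }
  return pos == buffer.size();
}
```

Boundary values are worth targeting because off-by-one errors in the byte-count logic tend to show up exactly where a value crosses from N to N+1 encoded bytes.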

…Width, MainlyConstant for faster iteration for SST workload

Summary: Add v2 encoding scaffolding for Varint, RLE, FixedBitWidth, and MainlyConstant for faster iteration and perf tuning

Differential Revision: D96684714

Summary:
Add `decodeSingleByteRun` fast path to `bulkVarintDecode32` and
`bulkVarintDecode64` that processes leading runs of single-byte varints
(values 0-127) using 8-byte word reads before falling through to the
BMI2 switch-based decoder. For each 8-byte word where no continuation
bits are set (`(word & 0x8080808080808080) == 0`), all 8 varints are
decoded with simple shifts, avoiding the `_pext_u64` and 64-case switch
overhead.

This is placed in the caller functions rather than inside
`bulkVarintDecodeBmi2` to preserve the BMI2 function's code layout and
icache behavior for mixed-width data.
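
The fast path described above can be sketched roughly as follows. This is a simplified illustration, not the code in this diff: the name `decodeSingleByteRunSketch`, its signature, and the constant `kContinuationMask` are assumptions made for the example. It loads each word with `std::memcpy` and widens bytes by reading them in memory order, which on a little-endian machine matches extracting byte `i` with `(word >> (8 * i)) & 0xff`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// One continuation (high) bit per byte of an 8-byte word.
constexpr uint64_t kContinuationMask = 0x8080808080808080ULL;

// Decodes a leading run of single-byte varints, 8 at a time.
// Returns how many values were written to `out`; the caller is expected
// to continue with the general decoder at `pos + returned count`.
template <typename T>
size_t decodeSingleByteRunSketch(
    const uint8_t* pos, const uint8_t* end, size_t count, T* out) {
  assert(pos <= end);
  size_t decoded = 0;
  while (count - decoded >= 8 && static_cast<size_t>(end - pos) >= 8) {
    uint64_t word;
    std::memcpy(&word, pos, sizeof(word));
    // Parentheses are required: == binds tighter than & in C++.
    if ((word & kContinuationMask) != 0) {
      break; // At least one multi-byte varint; leave it to the slow path.
    }
    // All 8 bytes are complete varints in the range 0-127.
    for (int i = 0; i < 8; ++i) {
      out[decoded + i] = pos[i];
    }
    pos += 8;
    decoded += 8;
  }
  return decoded;
}
```

Because every value in a clean word occupies exactly one byte, the number of bytes consumed equals the number of values returned, so the caller needs no separate byte counter for this stage.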

Benchmark results (1M elements, mode/opt):

| Scenario              | Before    | After     | Speedup       |
|-----------------------|-----------|-----------|---------------|
| 1-byte (32-bit)       | 465us     | 260us     | 1.79x         |
| 5-byte (32-bit)       | slower    | 1.22ms    | fixed         |
| 3-byte (32-bit)       | 1.04ms    | 864us     | 1.20x         |
| 4-byte (32-bit)       | 1.50ms    | 1.04ms    | 1.44x         |
| 64-bit 1-byte         | 294us     | 232us     | 1.27x         |
| batch1024             | 1.96us    | 1.20us    | 1.63x         |
| Uniform/2-byte/8-byte | unchanged | unchanged | no regression |

Also enhances the varint benchmark with fixed byte-width benchmarks
(1-5 byte for 32-bit, 1/4/8 byte for 64-bit), skip benchmarks, and
batch size benchmarks.

Differential Revision: D96617939
… single-byte varints

Summary:
Manually loop-unroll `decodeSingleByteRun` with a 3-tier approach:
1. 32-element (4-word) unrolled loop with combined high-bit check
   `(w0 | w1 | w2 | w3) & kHighBits` to minimize branch overhead
2. 8-element (1-word) loop for smaller runs
3. Single-element trailing loop to pick up individual single-byte
   varints before multi-byte values

Also extracts the byte-expansion logic into a reusable `expandWord()`
helper for clarity.
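
The three tiers listed above might look roughly like this in scalar form. It is a sketch under assumptions, not the code from this diff: `decodeSingleByteRunUnrolled`, `loadWord`, and the pointer-based signature of `expandWord` are chosen for illustration, and only the names `expandWord` and `kHighBits` come from the summary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr uint64_t kHighBits = 0x8080808080808080ULL;

// Hypothetical helper: alignment-safe 8-byte load.
inline uint64_t loadWord(const uint8_t* p) {
  uint64_t w;
  std::memcpy(&w, p, sizeof(w));
  return w;
}

// Widens 8 single-byte varints into 8 output elements.
// (Signature assumed for this sketch.)
template <typename T>
inline void expandWord(const uint8_t* src, T* dst) {
  for (int i = 0; i < 8; ++i) {
    dst[i] = src[i];
  }
}

template <typename T>
size_t decodeSingleByteRunUnrolled(
    const uint8_t* pos, const uint8_t* end, size_t count, T* out) {
  assert(pos <= end);
  size_t n = 0;
  auto avail = [&] { return static_cast<size_t>(end - pos); };

  // Tier 1: 32 elements per iteration, one combined high-bit check.
  while (count - n >= 32 && avail() >= 32) {
    const uint64_t w0 = loadWord(pos);
    const uint64_t w1 = loadWord(pos + 8);
    const uint64_t w2 = loadWord(pos + 16);
    const uint64_t w3 = loadWord(pos + 24);
    if (((w0 | w1 | w2 | w3) & kHighBits) != 0) {
      break;
    }
    for (int k = 0; k < 4; ++k) {
      expandWord(pos + 8 * k, out + n + 8 * k);
    }
    pos += 32;
    n += 32;
  }

  // Tier 2: 8 elements per iteration for shorter runs.
  while (count - n >= 8 && avail() >= 8) {
    if ((loadWord(pos) & kHighBits) != 0) {
      break;
    }
    expandWord(pos, out + n);
    pos += 8;
    n += 8;
  }

  // Tier 3: remaining single-byte varints before the first multi-byte one.
  while (n < count && pos < end && (*pos & 0x80) == 0) {
    out[n++] = *pos++;
  }
  return n;
}
```

When tier 1 bails out, tiers 2 and 3 still consume any clean prefix of the rejected 32-byte group, so the general decoder starts exactly at the first multi-byte varint.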

Differential Revision: D96619597
…deSingleByteRun

Summary:
Replace scalar byte expansion and reinterpret_cast-based uint64_t loads in
decodeSingleByteRun with xsimd-based SIMD operations:

- Use xsimd::batch<uint8_t>::load_unaligned for a single wide load (32 bytes
  on AVX2) + vptest to check all high bits at once, replacing 4 separate
  uint64_t loads + OR chain.
- Use xsimd::batch<T> construction and store_unaligned for byte-to-element
  widening (compiles to vpmovzxbd on AVX2, vmovl on NEON).
- Replace reinterpret_cast<const uint64_t*> with std::memcpy in the 8-byte
  loop to avoid strict-aliasing/alignment issues.

Differential Revision: D96628007
@meta-cla meta-cla Bot added the CLA Signed label Mar 18, 2026
@meta-codesync meta-codesync Bot commented Mar 18, 2026

@srsuryadev has exported this pull request. If you are a Meta employee, you can view the originating Diff in D96665765.

srsuryadev added a commit that referenced this pull request Mar 19, 2026
…ncoding to make it robust (#580)

Summary:
Pull Request resolved: #580

Add further tests to the varint encoding to make it robust

Reviewed By: xiaoxmeng

Differential Revision: D96665765
@meta-codesync meta-codesync Bot changed the title from "test(encoding): add further tests for varint encoding to the varint encoding to make it robust" to "test(encoding): add further tests for varint encoding to the varint encoding to make it robust (#580)" Mar 19, 2026
srsuryadev added a commit that referenced this pull request Mar 19, 2026
srsuryadev added a commit that referenced this pull request Mar 19, 2026
srsuryadev added a commit that referenced this pull request Mar 20, 2026
…ncoding to make it robust (#580)

Summary:
Pull Request resolved: #580

Add further tests to the varint encoding to make it robust

Reviewed By: xiaoxmeng

Differential Revision: D96665765
Labels

CLA Signed (managed by the Meta Open Source bot), fb-exported, meta-exported
