Add buffer.readvector2/3/4 and buffer.writevector2/3/4 methods #198

Open
rbxphogen wants to merge 4 commits into luau-lang:master from rbxphogen:rfc-buffer-readvector234-writevector234

Conversation

@rbxphogen

@rbxphogen rbxphogen commented Apr 30, 2026

@rbxphogen rbxphogen marked this pull request as ready for review April 30, 2026 21:17
@ishtar112

ishtar112 commented May 1, 2026

Would it be problematic for the proposed alternatives of buffer.readvector and buffer.writevector to take in a width parameter [instead of inferring intent based on LUA_VECTOR_SIZE]? Presumably, if width is 2, vec.x and vec.y are captured/returned; if width is 3, vec.z is captured/returned as well; if width is 4, vec.w is captured/returned as well, and should probably throw an error if LUA_VECTOR_SIZE is not 4.

As you mentioned, the alternative would help reduce API surface area and consume less fastcall slots (assuming these would be fastcalls, which they should be) in exchange for some performance (but certainly not so much that the feature becomes useless). This change in particular -- to take in a width parameter -- preserves user intent without sacrificing memory usage precision.
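
For illustration, a minimal sketch of the width-parameter idea, emulated with the existing buffer.readf32 and buffer.writef32 calls. The local helpers `writevector` and `readvector` are hypothetical names, not library functions, and width 4 is left out here because it depends on how the VM was built.

```luau
-- Hypothetical helpers emulating the suggested width parameter; not library functions.
local function writevector(buf, offset, vec, width)
	assert(width == 2 or width == 3, "width must be 2 or 3 in this sketch")
	buffer.writef32(buf, offset, vec.x)
	buffer.writef32(buf, offset + 4, vec.y)
	if width == 3 then
		buffer.writef32(buf, offset + 8, vec.z)
	end
end

local function readvector(buf, offset, width)
	assert(width == 2 or width == 3, "width must be 2 or 3 in this sketch")
	local x = buffer.readf32(buf, offset)
	local y = buffer.readf32(buf, offset + 4)
	-- Components that were not stored come back as zero.
	local z = if width == 3 then buffer.readf32(buf, offset + 8) else 0
	return vector.create(x, y, z)
end

local buf = buffer.create(20)
writevector(buf, 0, vector.create(1, 2, 3), 3) -- occupies bytes 0..11
writevector(buf, 12, vector.create(4, 5, 6), 2) -- occupies bytes 12..19, z is dropped
local a = readvector(buf, 0, 3)
local b = readvector(buf, 12, 2)
```

With this shape the caller states how many bytes are consumed (8 or 12), so the memory footprint stays explicit even though there is only one function pair.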

@rbxphogen
Author

> Would it be problematic for the proposed alternatives of buffer.readvector and buffer.writevector to take in a width parameter [instead of inferring intent based on LUA_VECTOR_SIZE]? Presumably, if width is 2, vec.x and vec.y are captured/returned; if width is 3, vec.z is captured/returned as well; if width is 4, vec.w is captured/returned as well, and should probably throw an error if LUA_VECTOR_SIZE is not 4.
>
> As you mentioned, the alternative would help reduce API surface area and consume less fastcall slots (assuming these would be fastcalls, which they should be) in exchange for some performance (but certainly not so much that the feature becomes useless). This change in particular -- to take in a width parameter -- preserves user intent without sacrificing memory usage precision.

@ishtar112 the only reason I hesitate to do that is that width would only have 2-3 valid values (well, 1 might also be valid if you fall back to f32 in that case) while accepting the whole valid range of a number. The methods would have to document (and/or warn) when widths that are too big are used, and specify what the fallback behavior (if any) is.

That said, I'm not strongly opposed, since it could make even-wider-vector-types easier to implement in the future.

@gaymeowing
Contributor

I feel like if this is going to be an RFC, it might as well be widened to include a method for reading/writing booleans to buffers using a u8. As that's in the same vein as this, although there's no major simd improvements to be made there.
But both will give benefits in the interpreter, and also reduce boilerplate for those who want to use these primitives with buffers.

@rbxphogen
Author

> I feel like if this is going to be an RFC, it might as well be widened to include a method for reading/writing booleans to buffers using a u8. As that's in the same vein as this, although there's no major simd improvements to be made there. But both will give benefits in the interpreter, and also reduce boilerplate for those who want to use these primitives with buffers.

I think that makes more sense as a separate RFC – since 7 of the 8 bits would be unused when writing a boolean as a u8, packing might become a point of discussion.

This proposal uses all available bits and provides additional SIMD opportunities; in my opinion, that warrants discussing its merits individually.

@gaymeowing
Contributor

gaymeowing commented May 7, 2026

It would be nice if a number writing buffer library method like buffer.writeu8 could be passed to buffer.readvector2/3/4 and buffer.writevector2/3/4. As vectors are nice to use for anything that needs multiple numbers because they can fit into a TValue; like I use vectors for ranges, vector2s, colors, etc. Which most of the time could be serialized better than using f32s (colors could use u8 for example).

@rbxphogen
Author

rbxphogen commented May 8, 2026

> It would be nice if a number writing buffer library method like buffer.writeu8 could be passed to buffer.readvector2/3/4 and buffer.writevector2/3/4. As vectors are nice to use for anything that needs multiple numbers because they can fit into a TValue; like I use vectors for ranges, vector2s, colors, etc. Which most of the time could be serialized better than using f32s (colors could use u8 for example).

If your color values range [0, 255] you could do something like

local packed = vec.x + bit32.lshift(vec.y, 8) + bit32.lshift(vec.z, 16) + bit32.lshift(vec.w, 24)
buffer.writeu32(buf, 0, packed)

Adding a buffer.writevector4u8 library method as shorthand would allow you to skip 3 bit32.lshift calls, so it does seem useful, but overflow/underflow behavior would need to be specified, and the performance win is probably more meager by comparison – I would consider this out-of-scope.

If the maintainers agree to adding buffer.writevector2/3/4 we can propose more vector-like operations as a followup
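
As a rough sketch of the u32 packing approach described above, here is a round trip through a buffer. The helpers `packcolor` and `unpackcolor` are hypothetical, every component is assumed to be an integer in [0, 255], and alpha is passed separately so the example does not depend on a 4-wide vector build.

```luau
-- Hypothetical helpers; assumes every component is an integer in [0, 255].
local function packcolor(rgb, alpha)
	return bit32.bor(
		rgb.x,
		bit32.lshift(rgb.y, 8),
		bit32.lshift(rgb.z, 16),
		bit32.lshift(alpha, 24)
	)
end

local function unpackcolor(packed)
	-- bit32.extract(n, field, width) pulls `width` bits starting at bit `field`.
	local r = bit32.extract(packed, 0, 8)
	local g = bit32.extract(packed, 8, 8)
	local b = bit32.extract(packed, 16, 8)
	local a = bit32.extract(packed, 24, 8)
	return vector.create(r, g, b), a
end

local buf = buffer.create(4)
buffer.writeu32(buf, 0, packcolor(vector.create(255, 128, 0), 64))
local color, alpha = unpackcolor(buffer.readu32(buf, 0))
```

Nothing here clamps the inputs, so a component outside [0, 255] would spill into its neighbour's bits; that is the overflow question a dedicated library method would have to answer.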

@YarikSuperpro

Which most of the time could be serialized better than using f32s (colors could use u8 for example).
I feel like if this is going to be an RFC, it might as well be widened to include a method for reading/writing booleans to For

U32 is generally the better fit since bit32 operates on 32-bit integer ranges directly.

local packed = bit32.bor(0b01,0b10) -- 0b11

and then:

buffer.writeu32(buff, offset, packed)

So boolean packing already composes cleanly with existing primitives using:

  • bit32.band
  • bit32.bor
  • bit32.lshift
  • bit32.btest

Although I do think adding U24 could make sense for cases like RGB colors since it maps exactly to 3 bytes.
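
A small sketch of how boolean packing might compose from the bit32 functions listed above. The helper names `packflags` and `hasflag` are hypothetical, and the layout (flag i stored in bit i - 1) is just one possible convention.

```luau
-- Hypothetical helpers: pack up to 32 booleans into one u32.
-- Flag number i (one-based) is stored in bit i - 1.
local function packflags(flags)
	local packed = 0
	for i, flag in flags do
		if flag then
			packed = bit32.bor(packed, bit32.lshift(1, i - 1))
		end
	end
	return packed
end

local function hasflag(packed, i)
	return bit32.btest(packed, bit32.lshift(1, i - 1))
end

local buf = buffer.create(4)
buffer.writeu32(buf, 0, packflags({ true, false, true }))
local packed = buffer.readu32(buf, 0)
```

Reading a flag back is then `hasflag(packed, 3)`, with no new buffer methods involved.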

@rbxphogen
Author

rbxphogen commented May 9, 2026

> Although I do think adding U24 could make sense for cases like RGB colors since it maps exactly to 3 bytes.

I believe this can be done via writebits

buffer.writebits(buf, offset * 8, 24, packed)
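
To make that concrete, a sketch of storing tightly packed 24-bit colors, assuming a Luau version that ships buffer.readbits and buffer.writebits (both take a bit offset and a bit count). The helpers `writergb` and `readrgb` and the slot-based indexing are hypothetical.

```luau
-- Hypothetical helpers: each color occupies 24 bits, slots are zero-based.
local function writergb(buf, slot, packed)
	buffer.writebits(buf, slot * 24, 24, packed)
end

local function readrgb(buf, slot)
	return buffer.readbits(buf, slot * 24, 24)
end

local colors = buffer.create(6) -- room for exactly two 3-byte colors
writergb(colors, 0, 0x0080FF)
writergb(colors, 1, 0x123456)
local first = readrgb(colors, 0)
local second = readrgb(colors, 1)
```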

- If `LUA_VECTOR_SIZE` is 4, the `w` component of the resulting vector is also zero.
- equivalent to `vector.create(buffer.readf32(buf, offset), buffer.readf32(buf, offset + 4), buffer.readf32(buf, offset + 8))`

When `LUA_VECTOR_SIZE` is defined to be `4`, two additional methods are defined:

@dyslexicsteak dyslexicsteak May 9, 2026

I think it's best not to have an API that changes that drastically based on VM build options. Maybe the vector4 methods could be replaced with "native" width methods, as in readvector and writevector that deal in vectors of "native" width by reading or writing however many values that requires, i.e. 3 or 4.

This way, if the width is 4, you still have the option to access 3 or 2 values, and similarly, if the width is 3, you have the option to access just 2 values. Otherwise, if you don't care, you can just access the amount required to fill the vector, making all functions available regardless of configuration.

The only issue I foresee with this approach is that the Luau vector library doesn't expose a way to get the environment's vector width. This is an issue for cursors, which won't know how far to advance if you just write a native width vector.

Author

As a comparison point, the vector.create method only exposes a 4-argument version if 4-element vectors are enabled – so the proposed text is consistent with that precedent.

I considered a pared-down readvector/writevector method set, but what ultimately convinced me against it was portability & predictability:

buffer.writevector3(buf, 0, vec) -- always writes 12 bytes
-- buffer.writevector4(buf, 0, vec) -- if your VM doesn't support 4-element vectors, this is an error

vs.

buffer.writevector(buf, 0, vec) -- writes either 12 or 16 bytes

You pointed out

> the Luau vector library doesn't expose a way to get the environment's vector width. This is an issue for cursors, which won't know how far to advance if you just write a native width vector.

But if vectors are serialized and reloaded across environments, the predictability problem becomes even stickier – i.e. it wouldn't be enough to query the vector-size of the recipient-VM, you'd need to query the vector-size of the sender-VM.
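
To illustrate the predictability argument, a sketch of a cursor that advances by a fixed 12 bytes per vector. It emulates the proposed vector3 methods through the f32 equivalence given in the proposal text; `pushvector3`, `popvector3`, and the cursor table shape are hypothetical.

```luau
-- Hypothetical cursor helpers built from three f32 accesses per vector.
local VECTOR3_SIZE = 12 -- three f32 components, independent of LUA_VECTOR_SIZE

local function pushvector3(cursor, vec)
	buffer.writef32(cursor.buf, cursor.pos, vec.x)
	buffer.writef32(cursor.buf, cursor.pos + 4, vec.y)
	buffer.writef32(cursor.buf, cursor.pos + 8, vec.z)
	cursor.pos += VECTOR3_SIZE
end

local function popvector3(cursor)
	local pos = cursor.pos
	cursor.pos += VECTOR3_SIZE
	return vector.create(
		buffer.readf32(cursor.buf, pos),
		buffer.readf32(cursor.buf, pos + 4),
		buffer.readf32(cursor.buf, pos + 8)
	)
end

local writer = { buf = buffer.create(24), pos = 0 }
pushvector3(writer, vector.create(1, 2, 3))
pushvector3(writer, vector.create(-1, 0.5, 8))

local reader = { buf = writer.buf, pos = 0 }
local first = popvector3(reader)
local second = popvector3(reader)
```

Because the stride is a constant, the reader never needs to know which vector width the writing VM was compiled with.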

@dyslexicsteak dyslexicsteak May 11, 2026

I understand the vector.create precedent but the thing is that it is not an error to call a function with more arguments than its arity, while it is an error to call nil (in normal configurations).

It's pretty "dirty" in my opinion to use the presence of the readvector4 and writevector4 functions as a test to determine environment vector width, and I think it is in the scope of this RFC to add a vector.width value or function to query the environment, returning the information as a width number or even as a boolean isvector4 or such. This should also solve any serdes problems you foresee.

Author

I'm not opposed to adding a vector.width, I can include it in this proposal – I am not a fan of making readvector/writevector behave differently depending on that value, though, because it's a hidden decision point.

Querying vector.width helps you if the same environment is writing/reading to/from buffers, but if the writing environment differs from the reading environment, writers would need to additionally write the vector-width in order for readers to know whether they can use readvector as-is or handle it as a special case
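
A sketch of what "additionally write the vector-width" could look like in practice: each record starts with a u8 component count, followed by that many f32 values. The wire format and the helpers `writetagged` and `readtagged` are hypothetical, and plain arrays stand in for vectors so the example does not depend on the VM's vector width.

```luau
-- Hypothetical wire format: one u8 component count, then that many f32 values.
local function writetagged(buf, offset, components)
	buffer.writeu8(buf, offset, #components)
	for i, value in components do
		buffer.writef32(buf, offset + 1 + (i - 1) * 4, value)
	end
	return offset + 1 + #components * 4 -- offset just past this record
end

local function readtagged(buf, offset)
	local count = buffer.readu8(buf, offset)
	local components = table.create(count)
	for i = 1, count do
		components[i] = buffer.readf32(buf, offset + 1 + (i - 1) * 4)
	end
	return components, offset + 1 + count * 4
end

local buf = buffer.create(32)
local pos = writetagged(buf, 0, { 1, 2, 3 }) -- 13-byte record
pos = writetagged(buf, pos, { 4, 5, 6, 7 }) -- 17-byte record

local a, afterfirst = readtagged(buf, 0)
local b, aftersecond = readtagged(buf, afterfirst)
```

The reader has to branch on the count for every record, which is the extra cost that a fixed-size method avoids.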


I don't see it as a hidden decision point, as the entire vector library already behaves in that way. I think forcing branches in all reader and writer code for something known at VM compile time and making the API less streamlined and consistent is a net negative. The ability to read and write without meta checks is pretty important, as this is a throughput-oriented API.

For readers with untrusted data, branching is unavoidable because you need to know what you're looking at and what tool you have, but I think if we can make it simpler and faster to read and write in the general case where you're dealing with trusted data it would be a lot better.

@Cooldude2606

I like the suggestion provided by @ishtar112 of providing a width argument. This can then be made more ergonomic by having it default to vector.width, which should address the concerns of @dyslexicsteak.

buffer.writevector(buffer, offset, vector, width=vector.width)
buffer.readvector(buffer, offset, width=vector.width): vector

This allows the handling of unknown vector sizes from external sources, while providing a suitable default when working within the same VM.

Collaborator

@vegorov-rbx vegorov-rbx left a comment

We will not be accepting a proposal exposing vector.width from the define configuration variable.
It exposes an implementation detail that is made for some embedders, but should not be a part of the library API.

@dyslexicsteak

> We will not be accepting a proposal exposing vector.width from the define configuration variable.
> It exposes an implementation detail that is made for some embedders, but should not be a part of the library API.

What about it leaking in other ways, such as via the presence of a function or by a catchable error from a function with a width argument?

@vegorov-rbx
Collaborator

The whole mistake of introducing LUA_VECTOR_SIZE will have bad consequences for years into the future, so we can only minimize the effects (in some way, we are years in the future from that mistake and now dealing with it here).

Users are free to find roundabout ways of detecting it (you can check a read of w today), but the main configuration of Luau is LUA_VECTOR_SIZE 3 and people setting it to 4 are the ones to take on supporting the strangeness around it (they already have to overwrite the type definitions).

@dyslexicsteak

> the main configuration of Luau is LUA_VECTOR_SIZE 3 and people setting it to 4 are the ones to take on supporting the strangeness around it (they already have to overwrite the type definitions).

I guess that settles it then. I think an API that doesn't accommodate a vector width of 4 is fine.

@rbxphogen
Author

> We will not be accepting a proposal exposing vector.width from the define configuration variable. It exposes an implementation detail that is made for some embedders, but should not be a part of the library API.

@vegorov-rbx if I remove vector.width does the rest of the proposal look reasonable?

This reverts commit aa1fa7d.

7 participants