-
Notifications
You must be signed in to change notification settings - Fork 93
Add buffer.readvector2/3/4 and buffer.writevector2/3/4 methods
#198
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
rbxphogen
wants to merge
4
commits into
luau-lang:master
from
rbxphogen:rfc-buffer-readvector234-writevector234
+87
−0
Closed
Changes from all commits
Commits
Show all changes
4 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,87 @@ | ||
| # `buffer.writevector*` and `buffer.readvector*` | ||
|
|
||
| **Status**: Open | ||
|
|
||
| ## Summary | ||
|
|
||
| This proposal suggests adding new methods to write & read `vector`s to/from `buffer`s. | ||
|
|
||
| ## Motivation | ||
|
|
||
| The native `vector` type can leverage simd to speed up element-wise operations, making it popular for math. | ||
|
|
||
| The `buffer` library provides methods to read and write numeric data types, but in order to store a `vector` to a buffer, code must unpack each element, writing them individually: | ||
| ```luau | ||
| buffer.writef32(theBuffer, offset + 0, theVector.x) | ||
| buffer.writef32(theBuffer, offset + 4, theVector.y) | ||
| buffer.writef32(theBuffer, offset + 8, theVector.z) | ||
| ``` | ||
| Similarly, in order to retrieve vectors from the buffer, code must read each individual float, and then construct a vector from them: | ||
| ```luau | ||
| local theVector = vector.create( | ||
| buffer.readf32(theBuffer, offset + 0), | ||
| buffer.readf32(theBuffer, offset + 4), | ||
| buffer.readf32(theBuffer, offset + 8) | ||
| ) | ||
| ``` | ||
| When `LUA_VECTOR_SIZE` is 4, these patterns extend to a fourth component (`w`). | ||
|
|
||
| Each `writef32` or `readf32` performs an individual `memcpy`, and temporarily converts the 32-bit float to a `number` (64-bit double) – one bulk `memcpy` would be more efficient and amenable to simd. | ||
|
|
||
| ## Design | ||
|
|
||
| Adding the following four new methods would fill this performance gap. | ||
|
|
||
| ```luau | ||
| buffer.writevector2(buf : buffer, offset : number, vec : vector) : () | ||
| buffer.readvector2(buf : buffer, offset : number) : vector | ||
|
|
||
| buffer.writevector3(buf : buffer, offset : number, vec : vector) : () | ||
| buffer.readvector3(buf : buffer, offset : number) : vector | ||
| ``` | ||
|
|
||
| Like all buffer read/write operations, byte order is little-endian. An error is thrown if the read or write would exceed the buffer's bounds. | ||
|
|
||
| `buffer.writevector2(buf : buffer, offset : number, vec : vector) : ()` | ||
| - Writes `vec.x` and `vec.y` as two contiguous 32-bit floats into `buf`, starting at `offset`. | ||
| - equivalent to `buffer.writef32(buf, offset, vec.x); buffer.writef32(buf, offset + 4, vec.y)` | ||
|
|
||
| `buffer.readvector2(buf : buffer, offset : number) : vector` | ||
| - Constructs a new `vector`, whose `x` and `y` components are determined by reading two contiguous 32-bit floats from `buf` starting at `offset`. | ||
| - The resulting vector's `z` component is zero. | ||
| - If `LUA_VECTOR_SIZE` is 4, the `w` component of the resulting vector is also zero. | ||
| - equivalent to `vector.create(buffer.readf32(buf, offset), buffer.readf32(buf, offset + 4))` | ||
|
|
||
| `buffer.writevector3(buf : buffer, offset : number, vec : vector) : ()` | ||
| - Writes `vec.x`, `vec.y`, and `vec.z` as three contiguous 32-bit floats into `buf`, starting at `offset`. | ||
| - equivalent to `buffer.writef32(buf, offset, vec.x); buffer.writef32(buf, offset + 4, vec.y); buffer.writef32(buf, offset + 8, vec.z)` | ||
|
|
||
| `buffer.readvector3(buf : buffer, offset : number) : vector` | ||
| - Constructs a new `vector`, whose `x`, `y`, and `z` components are determined by reading three contiguous 32-bit floats from `buf` starting at `offset`. | ||
| - If `LUA_VECTOR_SIZE` is 4, the `w` component of the resulting vector is also zero. | ||
| - equivalent to `vector.create(buffer.readf32(buf, offset), buffer.readf32(buf, offset + 4), buffer.readf32(buf, offset + 8))` | ||
|
|
||
| When `LUA_VECTOR_SIZE` is defined to be `4`, two additional methods are defined: | ||
|
|
||
| ```luau | ||
| buffer.writevector4(buf : buffer, offset : number, vec : vector) : () | ||
| buffer.readvector4(buf : buffer, offset : number) : vector | ||
| ``` | ||
|
|
||
| `buffer.writevector4(buf : buffer, offset : number, vec : vector) : ()` | ||
| - Writes `vec.x`, `vec.y`, `vec.z`, and `vec.w` as four contiguous 32-bit floats into `buf`, starting at `offset`. | ||
| - equivalent to `buffer.writef32(buf, offset, vec.x); buffer.writef32(buf, offset + 4, vec.y); buffer.writef32(buf, offset + 8, vec.z); buffer.writef32(buf, offset + 12, vec.w)` | ||
|
|
||
| `buffer.readvector4(buf : buffer, offset : number) : vector` | ||
| - Constructs a new `vector`, whose `x`, `y`, `z`, and `w` components are determined by reading four contiguous 32-bit floats from `buf` starting at `offset`. | ||
| - equivalent to `vector.create(buffer.readf32(buf, offset), buffer.readf32(buf, offset + 4), buffer.readf32(buf, offset + 8), buffer.readf32(buf, offset + 12))` | ||
|
|
||
| ## Drawbacks | ||
|
|
||
| This proposal does not add any brand-new functionality. It increases the API surface of `buffer` by 4 or 6 methods, which could create a maintenance burden. Perhaps improvements to code generation would obviate the need for dedicated vector methods. | ||
|
|
||
| ## Alternatives | ||
|
|
||
| Given there is only one `vector` type, we considered proposing just two methods: `readvector`/`writevector`, that read/write 3 or 4 elements depending on `LUA_VECTOR_SIZE`. But given the existence of 2-element constructors, partial-construction might be popular. | ||
|
|
||
| Another alternative is to expose simd operations on `buffer` itself – this might still be a useful extension for non-floating-point operations, but it would result in many more methods. | ||
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's best not to have an API that changes that drastically based on VM build options. Maybe the
vector4methods could be replaced with "native" width methods, as inreadvectorandwritevectorthat deal invectors of "native" width by reading or writing however many values that requires, i.e. 3 or 4.This way, if the width is 4, you still have the option to access 3 or 2 values, and similarly, if the width is 3, you have the option to access just 2 values. Otherwise, if you don't care, you can just access the amount required to fill the vector, making all functions available regardless of configuration.
The only issue I foresee with this approach is that the Luau
vectorlibrary doesn't expose a way to get the environment's vector width. This is an issue for cursors, which won't know how far to advance if you just write a native width vector.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a comparison point, the
vector.createmethod only exposes a 4-argument version if 4-element vectors are enabled – so the proposed text is consistent with that precedent.I considered a pared-down
readvector/writevectormethod set, but what ultimately convinced me against it was portability & predictability:vs.
You pointed out
But if vectors are serialized and reloaded across environments, the predictability problem becomes even stickier – i.e. it wouldn't be enough to query the vector-size of the recipient-VM, you'd need to query the vector-size of the sender-VM.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the
vector.createprecedent but the thing is that it is not an error to call a function with more arguments than its arity, while it is an error to callnil(in normal configurations).It's pretty "dirty" in my opinion to use the presence of the
readvector4andwritevector4functions as a test to determine environment vector width, and I think it is in the scope of this RFC to add avector.widthvalue or function to query the environment, returning the information as a width number or even as a booleanisvector4or such. This should also solve any serdes problems you foresee.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not opposed to adding a
vector.width, I can include it in this proposal – I am not a fan of makingreadvector/writevectorbehave differently depending on that value, though, because it's a hidden decision point.Querying
vector.widthhelps you if the same environment is writing/reading to/from buffers, but if the writing environment differs from the reading environment, writers would need to additionally write the vector-width in order for readers to know whether they can usereadvectoras-is or handle it as a special caseThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see it as a hidden decision point, as the entire vector library already behaves in that way. I think forcing branches in all reader and writer code for something known at VM compile time and making the API less streamlined and consistent is a net negative. The ability to read and write without meta checks is pretty important, as this is a throughput-oriented API.
For readers with untrusted data, branching is unavoidable because you need to know what you're looking at and what tool you have, but I think if we can make it simpler and faster to read and write in the general case where you're dealing with trusted data it would be a lot better.