escape data strings according to WebAssembly spec , Fixes #37#947
Open
RayanVSS wants to merge 2 commits intoOCamlPro:mainfrom
Open
escape data strings according to WebAssembly spec , Fixes #37#947RayanVSS wants to merge 2 commits intoOCamlPro:mainfrom
RayanVSS wants to merge 2 commits intoOCamlPro:mainfrom
Conversation
redianthus
reviewed
Mar 22, 2026
| @@ -0,0 +1,8 @@ | |||
| test data special chars round-trip: | |||
| $ owi fmt data_special_chars.wat > /tmp/owi_test_output.wat | |||
Member
There was a problem hiding this comment.
Putting the file in /tmp is likely going to cause various issues. Doing ... > ./owi_test_output.wat is fine (dune takes care of having the files in the right place).
redianthus
reviewed
Mar 22, 2026
| (memory 1) | ||
| (data (memory 0) (offset i32.const 0) "hello\n\t\u{0d}\"\'\\world") | ||
| ) | ||
| $ rm /tmp/owi_test_output.wat |
Member
There was a problem hiding this comment.
no need to do this if you put it in the current directory
redianthus
reviewed
Mar 22, 2026
| | '\\' -> string fmt "\\\\" | ||
| | c -> | ||
| let ci = Char.code c in | ||
| if 0x20 <= ci && ci < 0x7f then char fmt c else pp_hex_char fmt c |
Member
There was a problem hiding this comment.
Can be rewritten as:
| '\x20' .. '\x7e' as c -> char fmt c
| c -> pp_hex_char fmt c
redianthus
reviewed
Mar 22, 2026
| in | ||
| let pp_unicode_char fmt = function | ||
| | (0x09 | 0x0a) as c -> pp_char fmt (Char.chr c) | ||
| | uc when 0x20 <= uc && uc < 0x7f -> pp_char fmt (Char.chr uc) |
Member
There was a problem hiding this comment.
Same here, you could use a character interval for the pattern (and it's likely not necessary to use integers for this function, using char should be enough)
Author
|
Thanks for the review, I've made the suggested changes. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fix for Issue #37: WebAssembly Data Pretty Printing
The Initial Problem
Issue #37 (#37) reported a problem in the pretty-printing of the
datainstruction in WebAssembly Text format.The original code used OCaml's
%Stag to display the initialization string:pf fmt {|(data%a %a %S)|} pp_id_opt d.id Mode.pp d.mode d.initWhy Was This a Problem?
OCaml's
%Stag escapes characters according to OCaml conventions, not according to WebAssembly Text Format conventions. This caused issuesConcrete Example
With the old code, a string containing special characters like
\n,\r,\t, etc. was not escaped correctly, making it impossible to re-read the file.Modifications in
src/ir/text.mlI added two new pretty-printing functions that comply with the WebAssembly Text Format specification:
New Test Files in
test/fmt/To validate the fix, I added three new test files:
data_special_chars.wat: Test with special characters (\n,\t,\r,",',\)data_bytes.wat: Test with raw bytes and non-printable charactersdata_roundtrip.t: Round-trip test to verify format stabilityprint.t: Added test cases for the two files aboveThese tests are automatically executed with
dune runtest test/fmt.Added Functions
1.
pp_name_inner- Correct EscapingThis function implements WebAssembly escaping rules:
\n,\r,\t,\',\",\\are explicitly escaped\xx\u{xx}notation for tab (0x09) and newline (0x0a)2.
pp_name- Wrapper with QuotesSimple wrapper that surrounds the result of
pp_name_innerwith double quotes.3. Modification to
Data.ppChange:
%S→pp_nameto use our custom escaping function.Specification
The solution follows the WebAssembly Text Format specification:
The implemented escaping rules exactly match the spec.
Inspiration
PR #391 (which attempted to solve this problem) referenced a PR in the official WebAssembly repo:
this implementation aligns with these changes.
Non-regression Tests
All existing tests pass:
Stability Property (Idempotence)
The round-trip test verifies the idempotence property:
Case Analysis
I manually tested these cases:
\n\n\r\u{0d}\t\t"\"'\'\\\\xxReferences
Modified and Created Files
Modified Files
src/ir/text.ml: Addedpp_name_innerandpp_namefunctions, modifiedData.pptest/fmt/dune: Added new test files to dependenciestest/fmt/print.t: Added tests for special characters and raw bytesNew Test Files
test/fmt/data_special_chars.wat: Test with special characterstest/fmt/data_bytes.wat: Test with raw bytestest/fmt/data_roundtrip.t: Round-trip test