
fix: [SNOW-3066557] read SQL files as UTF-8 regardless of system locale#2982

Draft
sfc-gh-olorek wants to merge 1 commit into main from proactive/SNOW-3066557-utf8-sql-read


Conversation

@sfc-gh-olorek
Contributor

Pre-review checklist

  • I've confirmed that instructions included in README.md are still correct after my changes in the codebase.
  • I've added or updated automated unit tests to verify correctness of my new code.
  • I've added or updated integration tests to verify correctness of my new code.
  • I've confirmed that my changes are working by executing CLI's commands manually on macOS.
  • I've confirmed that my changes are working by executing CLI's commands manually on Windows.
  • I've confirmed that my changes are up-to-date with the target branch.
  • I've described my changes in the release notes.
  • I've described my changes in the section below.
  • I've described my changes in the documentation.

Changes description

Fixes #2759 / SNOW-3066557.

snow sql -f <file> (and the !source <file> include directive) opened SQL files without specifying an encoding, so Python fell back to the platform default text encoding. On Japanese Windows that resolves to cp932: any UTF-8 SQL file containing a non-ASCII character — even a single -- コメント comment — crashes the command with UnicodeDecodeError before the first statement ever reaches Snowflake.

This change passes encoding="utf-8" explicitly at both call sites in src/snowflake/cli/_plugins/sql/statement_reader.py:

  • files_reader, the top-level entry point for snow sql -f
  • ParsedStatement.from_file, the loader used for !source <file> includes
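In spirit, the change at each call site is just supplying the encoding argument when the file is opened. A minimal sketch (the helper name `read_sql_file` is illustrative, not the actual statement_reader.py code):

```python
import tempfile
from pathlib import Path

def read_sql_file(path: Path) -> str:
    # Explicit encoding: without it, open() falls back to the platform
    # default (locale.getpreferredencoding), which resolves to cp932 on
    # Japanese Windows and rejects many UTF-8 byte sequences.
    with path.open("r", encoding="utf-8") as f:
        return f.read()

# Demo: a UTF-8 SQL file with a Japanese comment reads back intact.
sql_file = Path(tempfile.mkdtemp()) / "query.sql"
sql_file.write_text("-- コメント\nSELECT 1;\n", encoding="utf-8")
content = read_sql_file(sql_file)
```

With the explicit argument, the result no longer depends on the machine's locale.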

The CLI already writes files as UTF-8 elsewhere (SecurePath.write_text), so this aligns the read path with the write path and removes the dependency on the platform locale.

Tests

Two regression tests were added to tests/sql/test_statement_reader.py:

  • test_read_utf8_file_on_non_utf8_locale
  • test_source_utf8_file_on_non_utf8_locale

Both write Japanese UTF-8 bytes to a .sql file, monkeypatch pathlib.Path.open to default to cp932 (faithfully simulating the customer environment — just monkeypatching locale.getpreferredencoding is not enough on modern Python), and assert the reader succeeds. Verified that both tests fail on main and pass with this fix. Full tests/sql/test_statement_reader.py suite: 52/52 passing.
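The test approach can be illustrated with a standalone sketch (not the actual pytest fixtures from tests/sql/test_statement_reader.py): patch `pathlib.Path.open` so that text-mode opens without an explicit encoding decode as cp932, reproduce the crash, then show that passing `encoding="utf-8"` avoids it.

```python
import pathlib
import tempfile

# Simulate Japanese Windows: any text-mode open that omits an encoding
# decodes as cp932.  (Patching locale.getpreferredencoding alone is not
# enough on modern Python, hence patching Path.open itself.)
real_open = pathlib.Path.open

def cp932_open(self, mode="r", buffering=-1, encoding=None, *args, **kwargs):
    if "b" not in mode and encoding is None:
        encoding = "cp932"
    return real_open(self, mode, buffering, encoding, *args, **kwargs)

pathlib.Path.open = cp932_open
try:
    sql_file = pathlib.Path(tempfile.mkdtemp()) / "query.sql"
    sql_file.write_bytes("-- コメント\nSELECT 1;\n".encode("utf-8"))

    # Before the fix: no encoding argument, so the cp932 default applies
    # and the UTF-8 bytes fail to decode.
    try:
        sql_file.open().read()
        crashed = False
    except UnicodeDecodeError:
        crashed = True

    # After the fix: encoding="utf-8" is passed explicitly, so the
    # simulated locale default never comes into play.
    fixed = sql_file.open(encoding="utf-8").read()
finally:
    pathlib.Path.open = real_open
```

The real tests use pytest's monkeypatch instead of the manual try/finally, but the mechanism is the same.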

`snow sql -f <file>` and the `!source <file>` directive relied on
Python's default text encoding when opening SQL files, which on
Japanese Windows resolves to cp932.  Any UTF-8 file containing a
non-ASCII character (including comments like `-- コメント`) would
crash with UnicodeDecodeError before a single statement was sent to
Snowflake.

Explicitly pass `encoding="utf-8"` at both call sites
(`files_reader` and `ParsedStatement.from_file`) so the reader
always decodes UTF-8, matching the encoding the CLI already uses
when writing files.

Adds two regression tests that simulate a non-UTF-8 default encoding
by monkeypatching `pathlib.Path.open` and verify both the
`files_reader` (top-level `-f` path) and the `!source` include path
read the file successfully.

Fixes #2759

Development

Successfully merging this pull request may close these issues.

SNOW-3066557: UnicodeDecodeError when executing SQL files with UTF-8 encoding on Japanese Windows
