Force MySQL to use correct name and collation #6694

Open
BlackDex wants to merge 1 commit into dani-garcia:main from BlackDex:force-mysql-names-collate

BlackDex (Collaborator) commented Jan 9, 2026

This should force all MySQL/MariaDB connections to use the `utf8mb4` character set and the `utf8mb4_unicode_ci` collation. It will not fix existing tables that have the wrong charset or collation, but it should prevent new databases and future additions from using the wrong ones.

Also made sure the init statements are applied during migrations, which previously did not happen.

Should prevent or resolve issues like #6611 in the future.
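
For illustration, here is a minimal sketch of the kind of init statement such a change would run on each new connection. It assumes the statement is a plain `SET NAMES ... COLLATE ...`; the helper name `build_init_statement` and its validation rules are hypothetical and not taken from the PR's diff.

```python
import re

# Charset and collation names cannot be bound as query parameters,
# so this sketch only accepts plain identifiers.
_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def build_init_statement(charset: str = "utf8mb4",
                         collation: str = "utf8mb4_unicode_ci") -> str:
    """Build the SET NAMES statement to run on every new connection.

    Hypothetical helper for illustration only.
    """
    for name in (charset, collation):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    # Collation names normally start with their charset's name;
    # treat anything else as a likely typo.
    if not collation.startswith(charset + "_"):
        raise ValueError(f"{collation!r} does not belong to charset {charset!r}")
    return f"SET NAMES {charset} COLLATE {collation}"


print(build_init_statement())
```

The same string would have to be sent on the connection used for migrations as well as on pooled connections, which is the gap the description mentions.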

Signed-off-by: BlackDex <black.dex@gmail.com>

BlackDex requested a review from dani-garcia on January 9, 2026 19:22

stefan0xC (Contributor) commented Jan 9, 2026

Would this not introduce the issue to everyone who does not use utf8mb4_unicode_ci as their default collation? 🤔

As far as I can tell, MariaDB uses utf8mb4_uca1400_ai_ci by default, unless you are using Debian, which patched it to utf8mb4_general_ci. MySQL uses utf8mb4_0900_ai_ci as the default collation for utf8mb4. (So we can't use either of those two as the default if we want to stay compatible with both, even if we migrated existing tables as well.)

I am not using MariaDB for Vaultwarden, so I'm not sure how much of that is caused by upgrading without running scripts like mariadb-upgrade, or whether it might be MariaDB's fault (they switched the default collation from utf8mb4_general_ci to utf8mb4_uca1400_ai_ci in 11.4.2, cf. MariaDB/mariadb-docker#591). Either way, I am wondering whether this is really something we can address like that.

BlackDex (Collaborator, Author) commented Jan 9, 2026

I think it only sets the connection's preferred option and shouldn't override the tables. Conversion could happen, though, and maybe that could cause an issue?

stefan0xC (Contributor) commented Jan 9, 2026

New users probably won't be affected, but I think this would break things for everyone who has existing tables with a different collation (as soon as we add any new tables, or for users who upgrade from an older version without this change). If I understand it correctly, this would ignore the database's existing default collation and change the charset and collation of newly added tables to utf8mb4 and utf8mb4_unicode_ci. That could, and probably would, lead to more of the reported issues, only in that case we would be responsible for the mess, not users who switch from bare metal to a container or who have upgraded their MariaDB version.

BlackDex (Collaborator, Author) commented Jan 9, 2026

Hmm, conversion should happen, but I'm not sure what that could cause.

We could adjust all migrations which create tables to have the correct settings.

stefan0xC (Contributor) commented Jan 9, 2026

> We could adjust all migrations which create tables to have the correct settings.

But who decides what the correct settings are? I think that is what the default collation is for, is it not? And if I wanted support for a newer Unicode standard, I'd set it to a newer collation (which we can't do ourselves, because MariaDB and MySQL have diverged for long enough that the newest common utf8mb4 collation seems to be utf8mb4_unicode_520_ci).

To me, this is not something we can or should try to solve in code; it has to be addressed by whoever decides to use MariaDB or MySQL as their database. We can probably just clarify the issue and improve our documentation so that people running into this better understand what to do (like migrating to Postgres or SQLite, or fixing the collation issue when it comes up).
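
As a rough idea of what "fixing the collation issue" could look like on the operator's side, the sketch below generates `ALTER TABLE ... CONVERT TO CHARACTER SET` statements for a list of tables. The helper names and the table names `users` and `sso_users` are assumptions for illustration, the target collation is the operator's choice, and the charset and collation arguments are assumed to be trusted input. Anyone trying this should take a backup first.

```python
def quote_identifier(name: str) -> str:
    """Quote a table name with backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def build_convert_statements(tables, charset="utf8mb4",
                             collation="utf8mb4_unicode_ci"):
    """Return the statements that convert each table to one charset/collation.

    Hypothetical helper; it only builds strings and runs nothing.
    """
    # Foreign key checks are switched off so that both sides of a
    # constraint can be converted one table at a time.
    statements = ["SET FOREIGN_KEY_CHECKS = 0"]
    for table in tables:
        statements.append(
            f"ALTER TABLE {quote_identifier(table)} "
            f"CONVERT TO CHARACTER SET {charset} COLLATE {collation}"
        )
    statements.append("SET FOREIGN_KEY_CHECKS = 1")
    return statements


# Table names are made up for the example.
for statement in build_convert_statements(["users", "sso_users"]):
    print(statement + ";")
```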

BlackDex (Collaborator, Author) commented Jan 9, 2026

As a different option, we could add a check to the diagnostics page that reports non-recommended settings.

stefan0xC (Contributor) commented Jan 9, 2026

Yeah, but I think the diagnostics page will not be available if Vaultwarden refuses to start due to migration errors. Maybe we can check on startup whether the charset and collation of existing tables match the default collation?

BlackDex (Collaborator, Author) commented Jan 9, 2026

Maybe during startup we can do a simple table check?
Just a one-time check that reports any issue with a link to the wiki.
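
A startup check along these lines could be sketched as follows. The two queries use the standard `information_schema.SCHEMATA` and `information_schema.TABLES` views; the function names, the wording of the warning, and the sample rows are assumptions for illustration and do not reflect any existing Vaultwarden code.

```python
# Queries a startup check could run against the current database.
DEFAULT_COLLATION_QUERY = (
    "SELECT DEFAULT_COLLATION_NAME FROM information_schema.SCHEMATA "
    "WHERE SCHEMA_NAME = DATABASE()"
)
TABLE_COLLATION_QUERY = (
    "SELECT TABLE_NAME, TABLE_COLLATION FROM information_schema.TABLES "
    "WHERE TABLE_SCHEMA = DATABASE() AND TABLE_TYPE = 'BASE TABLE'"
)


def find_mismatched_tables(default_collation, table_rows):
    """Return (table, collation) pairs whose collation differs from the default."""
    return sorted(
        (name, collation)
        for name, collation in table_rows
        if collation is not None and collation != default_collation
    )


def startup_warning(default_collation, table_rows):
    """Return a one-time warning message, or None when everything matches."""
    mismatches = find_mismatched_tables(default_collation, table_rows)
    if not mismatches:
        return None
    details = ", ".join(f"{name} ({collation})" for name, collation in mismatches)
    return (
        f"Database default collation is {default_collation}, "
        f"but these tables differ: {details}. See the wiki for how to fix this."
    )


# Made-up rows standing in for the result of TABLE_COLLATION_QUERY.
rows = [("users", "utf8mb4_general_ci"), ("sso_users", "utf8mb4_uca1400_ai_ci")]
print(startup_warning("utf8mb4_uca1400_ai_ci", rows))
```

Keeping the comparison separate from the queries means the check only logs a warning and never blocks startup by itself.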

dani-garcia (Owner) commented

Leaving this approved in case we decide to go this way, but not merging yet. Let me know if we decide to add only the startup checks, and we can just close this then.

stefan0xC (Contributor) left a comment

Just tested it using mariadb:10, mariadb:11 and mariadb:12, upgrading from v1.28.0 to your PR (1.35.1-9a320171), and also checked what happens if you update from 1.28.0 with MariaDB 10 to 1.35.1 with MariaDB 11. Everything seems to work without any issue, so I think this should actually be fine. (I did not manage to reproduce the error others have been facing without your PR either, but at least setting the connection collation like this should not really hurt.)

lifeofguenter commented

If Vaultwarden does not enforce a collation, then migrations should not fail either.

I don't think it's valid for the collation to be user-defined, as it's an implementation detail.

Simplest example: case-sensitive vs. case-insensitive.

At a minimum, the supported/recommended settings should be documented.

That being said, SET NAMES at a minimum has been fairly standard practice. Supporting a dynamic range of collations is not really trivial, and I'd be surprised if there was any great value in it.

BlackDex (Collaborator, Author) commented

> If Vaultwarden does not enforce a collation, then migrations should not fail either.

The problem is that if a database product, Linux distro, or Docker packager decides a different default is better and changes it, that will affect new tables and even new columns. The user doesn't have to know about it or do anything for this to change.
That will of course cause issues if columns or tables that need to be linked via foreign keys end up using different collations or charsets as a result.
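
To make the foreign key problem concrete, here is a small sketch that flags constraints whose two sides use a different charset or collation. The query joins the standard `information_schema.KEY_COLUMN_USAGE` and `information_schema.COLUMNS` views; the `FkColumnPair` type, the function name, the referenced table name `users`, and the collations in the sample row are assumptions for illustration.

```python
from typing import NamedTuple, Optional

# Lists every foreign key column next to the column it references.
FK_COLUMN_QUERY = """
SELECT k.CONSTRAINT_NAME,
       CONCAT(k.TABLE_NAME, '.', k.COLUMN_NAME),
       c.CHARACTER_SET_NAME, c.COLLATION_NAME,
       CONCAT(k.REFERENCED_TABLE_NAME, '.', k.REFERENCED_COLUMN_NAME),
       r.CHARACTER_SET_NAME, r.COLLATION_NAME
FROM information_schema.KEY_COLUMN_USAGE k
JOIN information_schema.COLUMNS c
  ON c.TABLE_SCHEMA = k.TABLE_SCHEMA
 AND c.TABLE_NAME = k.TABLE_NAME
 AND c.COLUMN_NAME = k.COLUMN_NAME
JOIN information_schema.COLUMNS r
  ON r.TABLE_SCHEMA = k.REFERENCED_TABLE_SCHEMA
 AND r.TABLE_NAME = k.REFERENCED_TABLE_NAME
 AND r.COLUMN_NAME = k.REFERENCED_COLUMN_NAME
WHERE k.TABLE_SCHEMA = DATABASE()
  AND k.REFERENCED_TABLE_NAME IS NOT NULL
"""


class FkColumnPair(NamedTuple):
    """One row of FK_COLUMN_QUERY (hypothetical type for this sketch)."""
    constraint: str
    column: str
    charset: Optional[str]
    collation: Optional[str]
    referenced_column: str
    referenced_charset: Optional[str]
    referenced_collation: Optional[str]


def incompatible_pairs(pairs):
    """Return the pairs whose charset or collation differs between both sides."""
    # Non-string columns report NULL on both sides and therefore compare equal.
    return [
        p for p in pairs
        if (p.charset, p.collation) != (p.referenced_charset, p.referenced_collation)
    ]


# Made-up row modelled on the constraint named in the linked issue.
sample = [
    FkColumnPair("sso_users_ibfk_1",
                 "sso_users.user_uuid", "utf8mb4", "utf8mb4_uca1400_ai_ci",
                 "users.uuid", "utf8mb4", "utf8mb4_general_ci"),
]
for pair in incompatible_pairs(sample):
    print(f"{pair.constraint}: {pair.column} ({pair.collation}) vs "
          f"{pair.referenced_column} ({pair.referenced_collation})")
```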

From my PHP days I remember needing to set the specifics I wanted; otherwise it would use the server's defaults, which might or might not be under your control.

BlackDex (Collaborator, Author) commented

I'll do some more testing, by the way, so keep this open for now.

Development

Successfully merging this pull request may close these issues.

Crash loop: Referencing column 'user_uuid' and referenced column 'uuid' in foreign key constraint 'sso_users_ibfk_1' are incompatible.

4 participants