'utf8' charset is now deprecated

Posted: Sun 28 Aug 2022 17:10
by robert84
Hi

I have noticed that when UseUnicode is enabled, MyDAC sends this command to server right after starting its connection:

Code:

SET NAMES 'utf8'

Note that as of MySQL 8.0, 'utf8' is an alias for utf8mb3, which is deprecated and expected to be removed in a future release. The MySQL developers recommend that applications use utf8mb4 instead.

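utf8mb4 is a much better alternative: unlike its predecessor, which stores at most three bytes per character and so cannot represent characters outside the Basic Multilingual Plane, it covers the full Unicode range and is standards-compliant. It is also required if you want to use some of the newer collations (such as utf8mb4_0900_ai_ci).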
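
To illustrate the practical difference, here is a minimal Python sketch. It is not MyDAC or MySQL code, and the helper name fits_in_utf8mb3 is made up for illustration. It assumes only that utf8mb3 is limited to three bytes per character, i.e. to code points up to U+FFFF, while utf8mb4 allows four:

```python
def fits_in_utf8mb3(text: str) -> bool:
    """Return True if every character lies in the Basic Multilingual Plane.

    Code points up to U+FFFF need at most three bytes in UTF-8, which is
    the limit assumed here for MySQL's utf8/utf8mb3 charset.
    """
    return all(ord(ch) <= 0xFFFF for ch in text)


if __name__ == "__main__":
    samples = ["caf\u00e9", "\u65e5\u672c\u8a9e", "\U0001F600"]
    for sample in samples:
        # Longest UTF-8 encoding of any single character in the sample.
        widest = max(len(ch.encode("utf-8")) for ch in sample)
        print(repr(sample), "max bytes per char:", widest,
              "fits in utf8mb3:", fits_in_utf8mb3(sample))
```

Accented Latin letters and CJK text pass the check, but an emoji such as U+1F600 needs four bytes and therefore needs utf8mb4.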

It's fairly simple for the application developer to override MyDAC's SET NAMES statement by sending a new one afterwards, e.g.

Code:

MyConnection1.ExecSQL('SET NAMES utf8mb4');

However, changing the connection charset behind MyDAC's back seems somewhat dangerous, so I'm hesitant to use this in production. Has anyone tried this? (I've tested it in a small application and couldn't find any issues.)

It would be great if the MyDAC developers could confirm whether utf8mb4 can be used or, even better, update the SET NAMES string in MyDAC.

Re: 'utf8' charset is now deprecated

Posted: Mon 29 Aug 2022 15:09
by davidmarcus
I think utf8mb4 works with MyDAC. See viewtopic.php?f=7&t=31291

Re: 'utf8' charset is now deprecated

Posted: Mon 29 Aug 2022 16:26
by robert84
davidmarcus wrote: Mon 29 Aug 2022 15:09 I think utf8mb4 works with MyDAC. See viewtopic.php?f=7&t=31291

Thanks David. That post discusses something different. Setting Charset to 'utf8mb4' would make MyDAC simply request utf8mb4 from the server and not do any conversions locally. This doesn't work in my case, as I'm still stuck with Delphi 7 and hence the VCL expects windows-1252 / latin1.

My concern is with UseUnicode = True mode, which makes MyDAC convert between charsets on the client side. As described here, when UseUnicode is enabled the Charset property is ignored (and 'utf8' is used behind the scenes, which is the issue at hand).

Re: 'utf8' charset is now deprecated

Posted: Mon 29 Aug 2022 16:34
by davidmarcus
My Delphi app is using Unicode. I have Options.UseUnicode = True and Options.CharSet = 'utf8mb4'. I think my app is working. I did test it back then. But, if your app is using Latin 1, then I don't know what happens.

Re: 'utf8' charset is now deprecated

Posted: Mon 29 Aug 2022 16:43
by robert84
davidmarcus wrote: Mon 29 Aug 2022 16:34 My Delphi app is using Unicode. I have Options.UseUnicode = True and Options.CharSet = 'utf8mb4'. I think my app is working. I did test it back then. But, if your app is using Latin 1, then I don't know what happens.

Hi David

As I said in my previous post, when you enable UseUnicode the Charset field is ignored (this behaviour is documented here). So you aren't really using utf8mb4 but utf8/utf8mb3 (the non-standard version which has now been deprecated). You can easily verify this by enabling the general_log in MySQL server and inspecting the commands sent by MyDAC.

If you really want to force the charset to utf8mb4, you have to issue a SET NAMES command, e.g.

Code:

MyConnection1.ExecSQL('SET NAMES utf8mb4');

but this is potentially dangerous and could even lead to data corruption, as you're changing the charset behind MyDAC's back, and MyDAC could well be assuming that utf8/utf8mb3 is still in place.

This is why I'm asking for advice from MyDAC devs as they're the ones who know if this would be a problem (it likely would) and have the ability to fix it.

Re: 'utf8' charset is now deprecated

Posted: Mon 29 Aug 2022 16:46
by davidmarcus
I do

Code:

set names utf8mb4 collate utf8mb4_unicode_520_ci

Re: 'utf8' charset is now deprecated

Posted: Thu 15 Sep 2022 07:31
by pavelpd
Hi there,
Thanks for your interest in our product!

Thanks also for raising the point that the Charset property is ignored when the UseUnicode option is enabled.

Please note that if you set the Charset property to 'utf8mb4' and also enable UseUnicode (UseUnicode=True), the Charset property is not ignored and a "SET NAMES 'utf8mb4'" statement is sent to the server. In all other cases with UseUnicode=True, the Charset property is ignored and a "SET NAMES 'utf8'" statement is sent to the server.

We will correct this inaccuracy in the documentation in the near future.
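
The rule described above can be written as a small decision function. This is only a Python model of the behaviour stated in this post, not MyDAC source code. The function name, the None result for the UseUnicode=False case (which the post does not cover), and the exact, case-sensitive string comparison are all assumptions made for illustration:

```python
from typing import Optional


def set_names_sent(use_unicode: bool, charset: str) -> Optional[str]:
    """Model which SET NAMES statement is sent, per the rule in this post.

    Hypothetical helper: it mirrors the description only and says nothing
    about how MyDAC is implemented internally.
    """
    if not use_unicode:
        # The post only describes the UseUnicode=True cases.
        return None
    if charset == "utf8mb4":
        # The one case where Charset is honoured together with UseUnicode.
        return "SET NAMES 'utf8mb4'"
    # Every other Charset value is ignored when UseUnicode is enabled.
    return "SET NAMES 'utf8'"


if __name__ == "__main__":
    for value in ("utf8mb4", "latin1", ""):
        print(repr(value), "->", set_names_sent(True, value))
```

To confirm what a real connection sends, enabling general_log on the MySQL server, as suggested earlier in this thread, remains the reliable check.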

Re: 'utf8' charset is now deprecated

Posted: Thu 15 Sep 2022 20:56
by robert84
pavelpd wrote: Thu 15 Sep 2022 07:31 Please note that if you set the Charset property to 'utf8mb4' and share the enabled UseUnicode=true parameter, then the Charset property will not be ignored and a request like "SET NAMES 'utf8mb4'" will be sent to the server, in all other cases when using UseUnicode=true, the Charset property is ignored and a "SET NAMES 'utf8'" request is sent to the server.

We will correct this inaccuracy in the documentation in the near future.

Great. Finally I will be able to use the utf8mb4_0900_ai_ci collation.

Thanks for the tip, this was quite useful.

Re: 'utf8' charset is now deprecated

Posted: Mon 26 Sep 2022 14:48
by pavelpd
Hi,

You're always welcome!

Please feel free to contact us if you have any further questions about our products!