'utf8' charset is now deprecated

Discussion of open issues, suggestions and bugs regarding MyDAC (Data Access Components for MySQL) for Delphi, C++Builder, Lazarus (and FPC)

Post by robert84 » Sun 28 Aug 2022 17:10

Hi

I have noticed that when UseUnicode is enabled, MyDAC sends this command to the server right after opening its connection:

Code:

SET NAMES 'utf8'

Note that as of MySQL 8.0, this charset identifier (an alias for utf8mb3) is deprecated and is expected to be removed in a future release. The MySQL developers recommend that applications use utf8mb4 instead.

utf8mb4 is a much better alternative: unlike its predecessor it is standards-compliant, covering the full Unicode range rather than only the characters that fit in three bytes. It is also required if you want to use some of the newer collations (such as utf8mb4_0900_ai_ci).

It's fairly simple for the application developer to override MyDAC's SET NAMES statement by sending a new one afterwards, e.g.

Code:

MyConnection1.ExecSQL('SET NAMES utf8mb4');

However, messing with the connection charset settings behind MyDAC's back seems somewhat dangerous, so I'm hesitant to use this in production environments. Has anyone tried this? (I've tested it in a small application and couldn't find any issues.)

It would be great if the MyDAC developers could confirm whether utf8mb4 can be used or, even better, update the SET NAMES string in MyDAC.

Post by davidmarcus » Mon 29 Aug 2022 15:09

I think utf8mb4 works with MyDAC. See viewtopic.php?f=7&t=31291

Post by robert84 » Mon 29 Aug 2022 16:26

davidmarcus wrote: Mon 29 Aug 2022 15:09
I think utf8mb4 works with MyDAC. See viewtopic.php?f=7&t=31291

Thanks David. That post discusses something different. Setting Charset to 'utf8mb4' would make MyDAC simply request utf8mb4 from the server and not do any conversions locally. This doesn't work in my case, as I'm still stuck with Delphi 7 and hence the VCL expects windows-1252 / latin1.

My concern is with UseUnicode = True mode, which makes MyDAC convert between charsets on the client side. As described here, when UseUnicode is enabled the Charset property is ignored (and 'utf8' is used behind the scenes, which is the issue at hand here).

Post by davidmarcus » Mon 29 Aug 2022 16:34

My Delphi app is using Unicode. I have Options.UseUnicode = True and Options.CharSet = 'utf8mb4'. I think my app is working. I did test it back then. But, if your app is using Latin 1, then I don't know what happens.

Post by robert84 » Mon 29 Aug 2022 16:43

davidmarcus wrote: Mon 29 Aug 2022 16:34
My Delphi app is using Unicode. I have Options.UseUnicode = True and Options.CharSet = 'utf8mb4'. I think my app is working. I did test it back then. But, if your app is using Latin 1, then I don't know what happens.

Hi David

As I said in my previous post, when you enable UseUnicode the Charset field is ignored (this behaviour is documented here). So you aren't really using utf8mb4 but utf8/utf8mb3 (the non-standard version which has now been deprecated). You can easily verify this by enabling the general_log in MySQL server and inspecting the commands sent by MyDAC.
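
One way to see which charset a session actually ended up with is to ask the server for its session variables. The sketch below is only an illustration: the unit and function names (CharsetCheck, SessionCharsets) are hypothetical, and it assumes an already-connected TMyConnection and MyDAC's MyAccess unit. Note that this is a different check from inspecting general_log: the log shows the statements MyDAC sent, while this shows the resulting session state.

```delphi
unit CharsetCheck;

interface

uses
  MyAccess;

// Hypothetical helper: returns the client, connection and results
// charsets of the current session, separated by ' / '.
function SessionCharsets(AConnection: TMyConnection): string;

implementation

function SessionCharsets(AConnection: TMyConnection): string;
var
  Q: TMyQuery;
begin
  Q := TMyQuery.Create(nil);
  try
    Q.Connection := AConnection;
    // These three session variables are the ones SET NAMES changes.
    Q.SQL.Text :=
      'SELECT @@character_set_client, @@character_set_connection, ' +
      '@@character_set_results';
    Q.Open;
    Result := Q.Fields[0].AsString + ' / ' +
              Q.Fields[1].AsString + ' / ' +
              Q.Fields[2].AsString;
  finally
    Q.Free;
  end;
end;

end.
```

Calling SessionCharsets(MyConnection1) right after connecting, and again after any manual SET NAMES, shows whether the setting changed. Depending on the server version, the legacy charset may be reported as utf8 or utf8mb3.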

If you really want to force charset to utf8mb4 you have to issue a SET NAMES command, e.g.

Code:

MyConnection1.ExecSQL('SET NAMES utf8mb4');

but this is potentially dangerous and could even lead to data corruption as you're changing charset behind MyDAC's back, and MyDAC could well be making the assumption that utf8/utf8mb3 is still in place.

This is why I'm asking for advice from MyDAC devs as they're the ones who know if this would be a problem (it likely would) and have the ability to fix it.

Post by davidmarcus » Mon 29 Aug 2022 16:46

I do

Code:

set names utf8mb4 collate utf8mb4_unicode_520_ci

Post by pavelpd (Devart Team) » Thu 15 Sep 2022 07:31

Hi there,
Thanks for your interest in our product!

Thanks also for raising the point that the Charset property is ignored when the UseUnicode option is enabled.

Please note that if you set the Charset property to 'utf8mb4' and also enable UseUnicode (UseUnicode=True), the Charset property is not ignored and a request like "SET NAMES 'utf8mb4'" is sent to the server. In all other cases with UseUnicode=True, the Charset property is ignored and a "SET NAMES 'utf8'" request is sent to the server.

We will correct this inaccuracy in the documentation in the near future.
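
The combination described above might be set up as in the following sketch. The unit and procedure names (ConnectUtf8mb4, ConnectWithUtf8mb4) are hypothetical, and the sketch assumes that server, credentials and database are already configured on the connection and that it is not yet open. Whether changing these options on a live connection takes effect is not covered in this thread.

```delphi
unit ConnectUtf8mb4;

interface

uses
  MyAccess;

// Hypothetical helper: enables Unicode mode with utf8mb4 and connects.
procedure ConnectWithUtf8mb4(AConnection: TMyConnection);

implementation

procedure ConnectWithUtf8mb4(AConnection: TMyConnection);
begin
  // Per the reply above, this pairing is the one case where Charset
  // is honoured while UseUnicode is enabled.
  AConnection.Options.UseUnicode := True;
  AConnection.Options.Charset := 'utf8mb4';
  // Open the connection only after both options are set.
  AConnection.Connected := True;
end;

end.
```

To confirm the result, the server's general_log should show SET NAMES 'utf8mb4' rather than SET NAMES 'utf8' at the start of the session.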

Post by robert84 » Thu 15 Sep 2022 20:56

pavelpd wrote: Thu 15 Sep 2022 07:31
Please note that if you set the Charset property to 'utf8mb4' and also enable UseUnicode (UseUnicode=True), the Charset property is not ignored and a request like "SET NAMES 'utf8mb4'" is sent to the server. In all other cases with UseUnicode=True, the Charset property is ignored and a "SET NAMES 'utf8'" request is sent to the server.

We will correct this inaccuracy in the documentation in the near future.

Great. Finally it will be possible for me to use the utf8mb4_0900_ai_ci collation.

Thanks for the tip, this was quite useful.

Post by pavelpd (Devart Team) » Mon 26 Sep 2022 14:48

Hi,

You're always welcome!

Please feel free to contact us if you have any further questions about our products!
