Unicode Issue: utf8mb4_unicode_ci and utf8mb4

Discussion of open issues, suggestions and bugs regarding ADO.NET provider for MySQL
Miart
Posts: 1
Joined: Fri 04 Aug 2017 16:04

Post by Miart » Fri 04 Aug 2017 16:11

Hi, I seem to have a Unicode issue with the dotConnect MySQL connector and wondered if anyone had a solution or could shed any light on it.

My MySQL database is set to charset utf8mb4 and collation utf8mb4_unicode_ci. I have verified that this is working as expected: all tables and fields report the right values and the my.ini is set up according to the docs, so as far as I can see the database side is fine.

When I use the Oracle GPL Connector/Net DLLs, my application connects correctly and reports that it is using [collation_connection: utf8mb4_unicode_ci] and [character_set_connection: utf8mb4]. I can save 4-byte Unicode and can run queries (SELECT ..LIKE ‘%bob’) with no problems on columns that are explicitly set as utf8mb4_unicode_ci, i.e. the result of a function call in a view that returns a varchar column as COLLATION utf8mb4_unicode_ci.

By "reporting", I mean the output of “SHOW VARIABLES WHERE Variable_name LIKE 'character\\_set\\_%' OR Variable_name LIKE 'collation%'”, run from the same GUI connection.

When I use the dotConnect MySQL connector I get a different story:

If I do not add anything to the connection string, then everything reports back fine, but when I save Unicode to the database it gets converted, i.e. the Unicode characters are lost and show up as ?? in the GUI.

If I set ‘Unicode=True;’ in the connection string, then the connection reports [collation_connection: utf8_general_ci] and [character_set_connection: utf8], i.e. I no longer get full 4-byte Unicode, just the 3-byte version, and if I try to save any 4-byte Unicode, it crashes with the usual error.
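For readers unfamiliar with the 3-byte versus 4-byte distinction: MySQL's utf8 character set stores at most 3 bytes per character, so characters outside the Basic Multilingual Plane (emoji, for example) only fit in utf8mb4. The following is a minimal Python sketch, not part of the original post, for finding such characters in a string before saving it. The helper name four_byte_chars is hypothetical.

```python
def four_byte_chars(text):
    """Return the characters in `text` that need 4 bytes in UTF-8.

    These are the characters that MySQL's 3-byte utf8 charset cannot
    store and that require utf8mb4.
    """
    return [ch for ch in text if len(ch.encode("utf-8")) == 4]


# "bob" and "\u00e9" fit in 3 bytes or fewer; U+1F600 needs 4 bytes.
sample = "bob \u00e9 \U0001F600"
print([hex(ord(ch)) for ch in four_byte_chars(sample)])  # ['0x1f600']
```

If the list is empty, the text should be storable over a utf8 (3-byte) connection; if not, the connection needs to be utf8mb4.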

If I set ‘CharSet=utf8mb4;’ in the connection string, then the database reports [collation_connection: utf8mb4_general_ci] and [character_set_connection: utf8mb4]. The big difference here is ‘utf8mb4_general_ci’ rather than ‘utf8mb4_unicode_ci’. This means that I can save and load Unicode correctly, but my (SELECT ..LIKE ‘%bob’) is failing with a mixed collation error.
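One thing that might be worth testing here, offered as an assumption rather than a confirmed fix: MySQL has a SET NAMES ... COLLATE ... statement that sets the connection character set and collation_connection together, so running it right after opening the connection may bring the session to utf8mb4_unicode_ci. Whether dotConnect preserves or overrides this setting is not established in this thread. The Python sketch below only builds the statement text; the helper name build_set_names is hypothetical, and the check relies on MySQL collation names starting with their character set name.

```python
import re

# Only allow plain identifiers so the names can be embedded safely.
_IDENT = re.compile(r"^[A-Za-z0-9_]+$")


def build_set_names(charset, collation):
    """Return a SET NAMES statement for the given charset and collation."""
    for name in (charset, collation):
        if not _IDENT.match(name):
            raise ValueError(f"unexpected characters in {name!r}")
    # A collation such as utf8_general_ci does not belong to utf8mb4.
    if not collation.startswith(charset + "_"):
        raise ValueError(f"{collation} does not belong to {charset}")
    return f"SET NAMES {charset} COLLATE {collation}"


print(build_set_names("utf8mb4", "utf8mb4_unicode_ci"))
# SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci
```

After executing the resulting statement on the connection, the SHOW VARIABLES query above can be rerun to see whether collation_connection now reports utf8mb4_unicode_ci.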

I’m at a bit of a loss now, but it just seems a little odd that the Oracle connector is working fine, but the dotConnect one is not working in the same way. Is there anything else I should set (RTFM maybe)?

Thanks in advance to anyone who can shed any light on this.

Stuart

Pinturiccio
Devart Team
Posts: 2420
Joined: Wed 02 Nov 2011 09:44

Re: Unicode Issue: utf8mb4_unicode_ci and utf8mb4

Post by Pinturiccio » Wed 09 Aug 2017 14:55

We could not reproduce the issue. Your query

“SHOW VARIABLES WHERE Variable_name LIKE 'character\\_set\\_%' OR Variable_name LIKE 'collation%'”

returns utf8mb4_general_ci as the collation_connection value. However, MySQL Connector/Net returns the same value for this parameter.
Miart wrote: but my (SELECT ..LIKE ‘%bob’) is failing with a mixed collation error.
Please provide the full query that causes the issue. Please also describe a case in which MySQL Connector/Net works correctly and dotConnect for MySQL doesn't, and tell us which version of MySQL Connector/Net you use.

If possible, create a small test project and send it to us.
