Unicode Issue: utf8mb4_unicode_ci and utf8mb4

Postby Miart » Fri 04 Aug 2017 16:11

Hi, I seem to have a Unicode issue with the dotConnect MySQL connector and wondered if anyone has a solution or could shed any light on it.

My MySQL database is set to charset utf8mb4 and collation utf8mb4_unicode_ci. I have verified that this is working: all tables and fields report the right values and my.ini is set up according to the docs, so as far as I can see the database is configured correctly.

When I use the Oracle GPL Connector/Net DLLs, my application connects correctly and reports that it is using [collation_connection: utf8mb4_unicode_ci] and [character_set_connection: utf8mb4]. I can save 4-byte Unicode and can run queries (SELECT ..LIKE ‘%bob’) with no problems on columns that are explicitly set to utf8mb4_unicode_ci, e.g. the result of a function call in a view that returns a varchar column with COLLATION utf8mb4_unicode_ci.

When I say reporting back, I am running “SHOW VARIABLES WHERE Variable_name LIKE 'character\\_set\\_%' OR Variable_name LIKE 'collation%'” from the same GUI connection.

When I use the dotConnect MySQL connector I get a different story:

If I do not add anything to the connection string, then everything reports back fine, but when I save Unicode to the database it gets converted, i.e. the Unicode characters are lost and show up as ?? in the GUI.

If I set ‘Unicode=True;’ in the connection string, then the connection reports [collation_connection: utf8_general_ci] and [character_set_connection: utf8], i.e. I am no longer getting the full 4-byte Unicode, just the 3-byte version, and if I try to save any 4-byte Unicode, it crashes with the usual error.

If I set ‘CharSet=utf8mb4;’ in the connection string, then I end up with the database reporting [collation_connection: utf8mb4_general_ci] and [character_set_connection: utf8mb4]. The big difference here is ‘utf8mb4_general_ci’ rather than ‘utf8mb4_unicode_ci’. This means that I can save and load Unicode correctly, but my (SELECT ..LIKE ‘%bob’) is failing with a mixed collation error.
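To illustrate the 3-byte versus 4-byte distinction in the cases above: MySQL's utf8 character set stores at most 3 bytes per character, while utf8mb4 stores up to 4, so only characters whose UTF-8 encoding is 4 bytes long are affected. The following is a minimal sketch in Python, not part of the application described here; the helper name needs_utf8mb4 and the sample strings are made up for illustration.

```python
def needs_utf8mb4(text: str) -> bool:
    """Return True if any character in `text` takes 4 bytes in UTF-8.

    Such characters do not fit in MySQL's 3-byte utf8 character set
    and need utf8mb4. (Hypothetical helper, for illustration only.)
    """
    return any(len(ch.encode("utf-8")) == 4 for ch in text)


if __name__ == "__main__":
    # Sample values chosen for illustration: ASCII, a 2-byte character,
    # a 3-byte character, and a 4-byte character (an emoji).
    samples = ["bob", "caf\u00e9", "\u20ac", "\U0001F600"]
    for s in samples:
        # Print the string, its UTF-8 length in bytes, and the check result.
        print(repr(s), len(s.encode("utf-8")), needs_utf8mb4(s))
```

A check like this can help confirm whether a given test string actually exercises the 4-byte path before comparing the two connectors.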

I’m at a bit of a loss now, but it just seems a little odd that the Oracle connector is working fine, but the dotConnect one is not working in the same way. Is there anything else I should set (RTFM maybe)?

Thanks in advance to anyone who can shed any light on this.

Stuart
Miart
 
Posts: 1
Joined: Fri 04 Aug 2017 16:04

Re: Unicode Issue: utf8mb4_unicode_ci and utf8mb4

Postby Pinturiccio » Wed 09 Aug 2017 14:55

We could not reproduce the issue. Your query

“SHOW VARIABLES WHERE Variable_name LIKE 'character\\_set\\_%' OR Variable_name LIKE 'collation%'”

returns utf8mb4_general_ci as the collation_connection value. However, MySQL Connector/Net returns the same value for this parameter.

Miart wrote:but my (SELECT ..LIKE ‘%bob’) is failing with a mixed collation error.

Please provide the full query that causes the issue. Please also describe a case where MySQL Connector/Net works correctly and dotConnect for MySQL doesn't, and tell us the version of MySQL Connector/Net that you use.

If possible, please create a small test project and send it to us.
Pinturiccio
Devart Team
 
Posts: 1982
Joined: Wed 02 Nov 2011 09:44

