Page 1 of 1

Illegal mix of collations for operation 'UNION' bug

Posted: Thu 01 Aug 2019 06:37
by Dmytro
Hi,
we are using dotConnector for mysql 8.13.1402 and found the bug over there.

If to change the DB collation to the utf8-mb4 it fails on requests with unions. For example, to reproduce the request with union you need to create the LINQ with some classes constructor in select and return there some lists and array.

Example of such request is here:

Code: Select all

SELECT 
UnionAll1.customer_id AS C1,
UnionAll1.customer_id1 AS C2,
UnionAll1.customer_name AS C3,
UnionAll1.C1 AS C4,
UnionAll1.customer_id2 AS C5,
UnionAll1.customer_terminology_key AS C6,
UnionAll1.customer_terminology_value AS C7,
UnionAll1.C2 AS C8
FROM  (SELECT 
    CASE WHEN Extent2.customer_id IS NULL THEN CAST(NULL AS SIGNED) ELSE 1 END AS C1,
    Extent1.customer_id,
    Extent1.customer_id AS customer_id1,
    Extent1.customer_name,
    Extent2.customer_id AS customer_id2,
    Extent2.customer_terminology_key,
    Extent2.customer_terminology_value,
    CAST(NULL AS SIGNED) AS C2
    FROM  customer AS Extent1
    LEFT OUTER JOIN customer_terminology AS Extent2 ON Extent1.customer_id = Extent2.customer_id
UNION ALL
    SELECT 
    2 AS C1,
    Extent3.customer_id,
    Extent3.customer_id AS customer_id1,
    Extent3.customer_name,
    CAST(NULL AS SIGNED) AS C2,
    CAST(NULL AS CHAR) AS C3,
    CAST(NULL AS CHAR) AS C4,
    Extent4.user_id
    FROM  customer AS Extent3
    INNER JOIN user AS Extent4 ON (Extent3.customer_id = Extent4.customer_id) AND (Extent4.role_id IN (5,6))) AS UnionAll1
ORDER BY UnionAll1.customer_id1 ASC, UnionAll1.C1 ASC
It looks like the problem is not in union itself, but in CAST statement. It looks like the the CAST statememnt in combination with UNION and charset utf8-mb5 cause the exception : Illegal mix of collations for operation 'UNION'.

How can we solve this issue? How to avoid the cast or the unions in the request? Oer maybe have you another solution?

How fasr can it be fixes? We really need it for our next release and it will be very soon.

Re: Illegal mix of collations for operation 'UNION' bug

Posted: Thu 01 Aug 2019 14:45
by Shalex
Try setting "Unicode=true;" in your connection string.

If this doesn't help, please upload a small test project with the corresponding DDL/DML script to some file exchange server and provide the download link via contact form.

Re: Illegal mix of collations for operation 'UNION' bug

Posted: Wed 02 Oct 2019 12:14
by Bengb
I ran into this recently as well. In my case the relevant columns were utf8mb4_unicode_ci, but I found out that the session was utf8mb4_general_ci. You can see this from the collation_connection variable:

Code: Select all

SHOW VARIABLES LIKE '%collat%';
I found that this came from the connection string including "Character Set=utf8mb4". This caused a "SET NAMES UTF8MB4;" query to be run on every connection, which causes MySQL to take the default collation for this character set, which is utf8mb4_general_ci.

Updating the server's character set and collation to match the data (setting character-set-server and collation-server in the server config) and then using "Character Set=Auto" caused the connection to have the correct collation, and the issue was fixed. Manually running something like "SET NAMES UTF8MB4 COLLATE utf8mb4_unicode_ci" should fix it, too.

Re: Illegal mix of collations for operation 'UNION' bug

Posted: Fri 04 Oct 2019 18:25
by Shalex
Bengb wrote: Wed 02 Oct 2019 12:14 I ran into this recently as well. In my case the relevant columns were utf8mb4_unicode_ci, but I found out that the session was utf8mb4_general_ci. You can see this from the collation_connection variable:

Code: Select all

SHOW VARIABLES LIKE '%collat%';
I found that this came from the connection string including "Character Set=utf8mb4". This caused a "SET NAMES UTF8MB4;" query to be run on every connection, which causes MySQL to take the default collation for this character set, which is utf8mb4_general_ci.

Updating the server's character set and collation to match the data (setting character-set-server and collation-server in the server config) and then using "Character Set=Auto" caused the connection to have the correct collation, and the issue was fixed. Manually running something like "SET NAMES UTF8MB4 COLLATE utf8mb4_unicode_ci" should fix it, too.
The described behavior is correct.