Changing from latin1 to utf8mb4
-
- Posts: 50
- Joined: Tue 25 Jan 2005 11:22
- Location: Somerville, MA
- Contact:
Changing from latin1 to utf8mb4
I would appreciate some guidance on how to work with utf8mb4 tables.
My Delphi app communicates with a MySQL database (version 5.5.40-MariaDB-cll-lve) on a Web server. Currently, the string fields in the MySQL tables are char or varchar using the latin1 character set. I have TMyConnection.Options.CharSet set to 'latin1'.
For insert, replace, update, and delete statements, I've been using TMyScript by constructing the SQL myself, assigning it to TMyScript.SQL.Text, then doing TMyScript.Execute. I escape and quote the values myself, i.e., put single quotes around strings, prefix quotes with a backslash, etc. I end up with a string (UnicodeString in XE7) that I assign to TMyScript.SQL.Text.
For select statements, I use TMyQuery, construct the SQL myself, and assign it to TMyQuery.SQL.Text. After calling TMyQuery.Execute, I retrieve the string values via TMyQuery.FieldByName('TheFieldName').AsString.
I'm planning to convert the MySQL tables so most of the varchar fields will be utf8mb4. Some of the varchar fields and all the char fields will be ascii. What do I need to change on the Delphi side?
I'm using MyDAC 8.4.13, Delphi XE7 Professional, Windows Vista Home Premium.
My Delphi app communicates with a MySQL database (version 5.5.40-MariaDB-cll-lve) on a Web server. Currently, the string fields in the MySQL tables are char or varchar using the latin1 character set. I have TMyConnection.Options.CharSet set to 'latin1'.
For insert, replace, update, and delete statements, I've been using TMyScript by constructing the SQL myself, assigning it to TMyScript.SQL.Text, then doing TMyScript.Execute. I escape and quote the values myself, i.e., put single quotes around strings, prefix quotes with a backslash, etc. I end up with a string (UnicodeString in XE7) that I assign to TMyScript.SQL.Text.
For select statements, I use TMyQuery, construct the SQL myself, and assign it to TMyQuery.SQL.Text. After calling TMyQuery.Execute, I retrieve the string values via TMyQuery.FieldByName('TheFieldName').AsString.
I'm planning to convert the MySQL tables so most of the varchar fields will be utf8mb4. Some of the varchar fields and all the char fields will be ascii. What do I need to change on the Delphi side?
I'm using MyDAC 8.4.13, Delphi XE7 Professional, Windows Vista Home Premium.
Re: Changing from latin1 to utf8mb4
In your case, you can set the TMyConnection.Options.CharSet property to "utf8mb4" or set the TMyConnection.Options.UseUnicode property to "True" .
-
- Posts: 50
- Joined: Tue 25 Jan 2005 11:22
- Location: Somerville, MA
- Contact:
Re: Changing from latin1 to utf8mb4
Thank you. Simple as could be.ViktorV wrote:In your case, you can set the TMyConnection.Options.CharSet property to "utf8mb4" or set the TMyConnection.Options.UseUnicode property to "True".
When would someone need to set the UseUnicode property rather than just setting CharSet? Is it for use with TMyTable?
Re: Changing from latin1 to utf8mb4
You can find out how to use these properties in the MyDAC help: http://www.devart.com/mydac/docs/usingn ... acters.htm Both these properties affect the behavior of TMyTable.
-
- Posts: 50
- Joined: Tue 25 Jan 2005 11:22
- Location: Somerville, MA
- Contact:
Re: Changing from latin1 to utf8mb4
I'm trying to use utf8mb4 for CharSet, but MySQL seems to be receiving
Latin 1. I'm using MyDAC 8.6.20 and MySQL 5.5.47. I create a table as
follows:
I then run this app:
I get the error
able to use characters not in Latin 1. What am I doing wrong?
Latin 1. I'm using MyDAC 8.6.20 and MySQL 5.5.47. I create a table as
follows:
Code: Select all
create database if not exists TestDatabase;
use TestDatabase;
drop table if exists TestTable;
create table TestTable (
ID integer unsigned not null auto_increment,
Name varchar(50) not null,
primary key ( ID ),
index Name ( Name ) )
default character set utf8mb4 collate utf8mb4_unicode_ci pack_keys=1 engine=MyISAM;
show warnings;
Code: Select all
{$apptype console}
program TestMyDAC;
uses
Winapi.Windows, System.SysUtils, MyAccess, MyScript;
var
Connection: TMyConnection;
Updater: TMyScript;
begin
try
Connection := TMyConnection.Create( nil );
Connection.Options.Direct := true;
Connection.Options.Compress := true;
Connection.Options.CharSet := 'utf8mb4';
Connection.Server := '127.0.0.1';
Connection.Username := 'root';
Connection.Password := '';
Connection.Database := 'testdatabase';
Connection.Connected := true;
Updater := TMyScript.Create( nil );
Updater.Connection := Connection;
Updater.SQL.Text := 'insert into TestTable set Name = ''ÀÁÂÃÄÅÇÈÉ''';
Updater.Execute;
Connection.Connected := false;
Updater.Free;
Connection.Free;
except
on E: Exception do
writeln( E.Message );
end;
end.
This test only tries to insert a Latin 1 string, but I also want to be#HY000Incorrect string value: '\xC0\xC1\xC2\xC3\xC4\xC5...' for column 'Name' at row 1
able to use characters not in Latin 1. What am I doing wrong?
Re: Changing from latin1 to utf8mb4
To solve the issue, please replace the line with and check if the issue is reproduced.
Code: Select all
Connection.Options.CharSet := 'utf8mb4';
Code: Select all
Connection.Options.UseUnicode := True;
-
- Posts: 50
- Joined: Tue 25 Jan 2005 11:22
- Location: Somerville, MA
- Contact:
Re: Changing from latin1 to utf8mb4
That works better. I tried some Japanese in the Basic Multilingual Plane, and it worked. But, when I tried a Japanese character not in the Basic Multilingual Plane, I got
Do you know if it should work for such characters?#HY000Incorrect string value: '\xF0\xA0\x80\x81' for column 'Name' at row 1
-
- Posts: 50
- Joined: Tue 25 Jan 2005 11:22
- Location: Somerville, MA
- Contact:
Re: Changing from latin1 to utf8mb4
MySQL Workbench seems to have no problem with utf8mb4. See http://www.davidmarcus.com/MySQL_Workbench_utf8mb4.png. I still get an error with MyDAC with Options.UseUnicode := true. Setting UseUnicode to true and also setting Options.CharSet to 'utf8mb4' gives the same "Incorrect string value" error.
-
- Posts: 50
- Joined: Tue 25 Jan 2005 11:22
- Location: Somerville, MA
- Contact:
Re: Changing from latin1 to utf8mb4
I also tried using PHP and the mysqli extension. It has no problem with the four byte character.
Also, how should I set collation_connection to utf8mb4_unicode_ci (which is not the default collation)?
Also, how should I set collation_connection to utf8mb4_unicode_ci (which is not the default collation)?
Re: Changing from latin1 to utf8mb4
Thank you for the information. The problem investigation is in progress. We will inform you when we have any results.
Re: Changing from latin1 to utf8mb4
oops. I see I was responding to something really old.
Jeff
Jeff
Last edited by JEFFB on Tue 19 Jan 2016 20:35, edited 1 time in total.
-
- Posts: 50
- Joined: Tue 25 Jan 2005 11:22
- Location: Somerville, MA
- Contact:
Re: Changing from latin1 to utf8mb4
Jeff,
It works fine with MySQL Workbench. You don't need to change the server and database variables. You only need to do Unfortunately, MySQL Workbench defaults to utf8 instead of utf8mb4, so you have to change it every time you start a session.
For MyDAC, did you try a four byte character?
David
It works fine with MySQL Workbench. You don't need to change the server and database variables. You only need to do
Code: Select all
set names 'utf8mb4' collate 'utf8mb4_unicode_ci'
For MyDAC, did you try a four byte character?
David
-
- Posts: 50
- Joined: Tue 25 Jan 2005 11:22
- Location: Somerville, MA
- Contact:
Re: Changing from latin1 to utf8mb4
Viktor,
This seems to work (replace the asterisk with a four-byte character; the forum won't let me post it):
This seems to work (replace the asterisk with a four-byte character; the forum won't let me post it):
Code: Select all
{$apptype console}
program TestMyDAC;
uses
Winapi.Windows, System.SysUtils, MyAccess, MyScript;
var
Connection: TMyConnection;
Updater: TMyScript;
begin
try
Connection := TMyConnection.Create( nil );
Connection.Options.Direct := true;
Connection.Options.Compress := true;
Connection.Options.UseUnicode := true;
Connection.Options.CharSet := 'utf8mb4';
Connection.Server := '127.0.0.1';
Connection.Username := 'root';
Connection.Password := '';
Connection.Database := 'testdatabase';
Connection.Connected := true;
Updater := TMyScript.Create( nil );
Updater.Connection := Connection;
Updater.SQL.Text := 'set names ''utf8mb4'' collate ''utf8mb4_unicode_ci''';
Updater.Execute;
Updater.SQL.Text := 'insert into TestTable set Name = ''*''';
Updater.Execute;
Connection.Connected := false;
Updater.Free;
Connection.Free;
except
on E: Exception do
writeln( E.Message );
end;
end.
Re: Changing from latin1 to utf8mb4
You can save characters in utf8mb4 encoding correctly during execution of the provided code. But when reading them, you might get an error: imcomplete multibyte char.
-
- Posts: 50
- Joined: Tue 25 Jan 2005 11:22
- Location: Somerville, MA
- Contact:
Re: Changing from latin1 to utf8mb4
OK. I'll wait for an official fix.