utf8 not working for me

PlutoPlanet
Posts: 10
Joined: Tue 28 Aug 2007 08:08

utf8 not working for me

Post by PlutoPlanet » Tue 28 Aug 2007 08:47

Hello World!

My goal is to read and write UTF-8 strings in a MySQL database from both Delphi 2006 and PHP.

Database: Ver 14.12 Distrib 5.0.27, for Win32 (ia32)

Code: Select all

DROP DATABASE IF EXISTS Testing;
CREATE DATABASE Testing;
USE Testing;

CREATE TABLE Testing (
	TestingId INTEGER NOT NULL AUTO_INCREMENT,
	TestingValue VARCHAR(255),
	PRIMARY KEY ( TestingId )
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

Then I stored some UTF-8 text using the following PHP script:

Code: Select all

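<?php
mysql_pconnect( 'localhost', 'test', 'test' );
mysql_select_db( 'Testing' );

// Insert the submitted value; escape it so quotes cannot break the statement.
if ( !empty( $_REQUEST[ "TestingValue" ] ) )
{
	$value = mysql_real_escape_string( $_REQUEST[ "TestingValue" ] );
	mysql_query( "INSERT INTO Testing ( TestingValue ) VALUES ( '" . $value . "' )" );
}

?>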

I used the string "Motörhead" which was correctly stored into mysql:

Code: Select all

mysql> select * from Testing;
+-----------+--------------+
| TestingId | TestingValue |
+-----------+--------------+
|         1 | Mot├╢rhead   |
+-----------+--------------+
1 row in set (0.00 sec)

Now I opened my Delphi 2006 for Win32 project, added a TMyConnection and set MyConnection.Options.UseUnicode := true

version: MyDac 4.40.0.22

I also added a TMyTable, a TDataSource and a TntDBEdit to the form and linked everything together, and... the TntDBEdit showed the multibyte string from the database: "Motörhead"

I also tried setting MyConnection.Options.Charset := 'utf8', but with no success.

What did I do wrong?

Thanks,
Herwig

Antaeus
Posts: 2098
Joined: Tue 14 Feb 2006 10:14

Post by Antaeus » Wed 29 Aug 2007 08:24

Please perform the following two tests:

1. Open your query with the UseUnicode = True option, and save the field value to a file:
   TWideStringField(MyTable1.FieldByName('TestingValue')).Value
2. Open your query with the UseUnicode = False and Charset = 'utf8' options, and save the field value to a file:
   MyTable1.FieldByName('TestingValue').AsString

Post the saved values to the forum as hexadecimal codes.

vga
Posts: 58
Joined: Sat 08 Jul 2006 12:04

GBK Error

Post by vga » Wed 29 Aug 2007 12:45

MyConnection.Charset: GBK
MyConnection.UseUnicode: True


MyCommand1.SQL.Text :=
'UPDATE `mydac_loaded` ' +
'set `Str`=:str';
MyCommand1.ParamByName('str').AsString := '';
MyCommand1.Execute;

An error occurred on execute:
#HY000Incorrect string value: '\xEE\xA1\xA3' for column 'Str' at row 1

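The three bytes quoted in the error message form one well-formed UTF-8 sequence, so it can help to see which code point the server was asked to convert. A minimal Python sketch follows; the variable names are illustrative, and Python is used here only as a byte calculator, not as part of MyDAC.

```python
import unicodedata

# Bytes copied from the error message: '\xEE\xA1\xA3'
rejected = bytes.fromhex("ee a1 a3")

# Decode them as UTF-8 and inspect the resulting code point.
char = rejected.decode("utf-8")
print(len(char), f"U+{ord(char):04X}", unicodedata.category(char))
# prints: 1 U+E863 Co   (Co = private use)
```

A code point in the Private Use Area has no standard meaning, which may be why the literal did not survive in the post above and why the server could not map it to the column's character set. That reading is an assumption, not something the thread confirms.
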
Antaeus
Posts: 2098
Joined: Tue 14 Feb 2006 10:14

Post by Antaeus » Thu 30 Aug 2007 07:31

Most likely you need to set the language for non-Unicode programs correctly. See Control Panel -> Regional and Language Options -> Advanced.

An alternative way to resolve this problem is to assign the parameter value using the AsWideString property.

PlutoPlanet
Posts: 10
Joined: Tue 28 Aug 2007 08:08

Post by PlutoPlanet » Thu 30 Aug 2007 12:12

first result:
4d 6f 74 c3 b6 72 68 65 61 64
second result:
4d 6f 74 c3 83 c2 b6 72 68 65 61 64

The first result seems quite OK to me, but I don't understand why it is not displayed correctly in the Tnt components.

Here is exactly what I did:

Code: Select all

procedure TForm1.SaveString(ws: WideString; filename: string);
var
  sl: TStringList;
begin
  sl := TStringList.Create;
  sl.Add(ws);
  sl.SaveToFile( ExtractFilePath( Application.ExeName ) + filename );
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  s: WideString;
begin
  Self.MyConnection1.Options.Charset := '';
  Self.MyConnection1.Options.UseUnicode := true;

  Self.MyConnection1.Open;
  Self.MyTable1.Open;

  s := TWideStringField( Self.MyTable1.FieldByName( 'TestingValue' )).Value;
  Self.SaveString( s, 'numbaone.bin' );

  // result is:
  // Motrhead
  // 4d 6f 74 c3 b6 72 68 65 61 64 0d 0a                                               Motörhead..

  Self.MyTable1.Close;
  Self.MyConnection1.Close;
end;

procedure TForm1.Button2Click(Sender: TObject);
var
  s: WideString;
begin
  Self.MyConnection1.Options.Charset := 'utf8';
  Self.MyConnection1.Options.UseUnicode := false;

  Self.MyConnection1.Open;
  Self.MyTable1.Open;

  s := MyTable1.FieldByName('TestingValue').AsString;
  Self.SaveString( s, 'numbatwo.bin' );

  // result is:
  // Motrhead
  // 4d 6f 74 c3 83 c2 b6 72 68 65 61 64 0d 0a                                         MotÃ.¶rhead..

  Self.MyTable1.Close;
  Self.MyConnection1.Close;
end;

The second version seems to be the UTF-8 encoding of the already UTF-8-encoded string.
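The relationship between the two dumps can be checked directly. The Python sketch below takes the two hex strings posted above (without the trailing 0d 0a), removes one encoding layer from the second, and compares the outcome with the first. `hexdump` is a small helper written for this example, and treating the extra layer as Latin-1 is an assumption.

```python
def hexdump(data: bytes) -> str:
    """Format bytes the same way as the dumps in this thread."""
    return " ".join(f"{b:02x}" for b in data)

first = bytes.fromhex("4d 6f 74 c3 b6 72 68 65 61 64")         # UseUnicode = True
second = bytes.fromhex("4d 6f 74 c3 83 c2 b6 72 68 65 61 64")  # Charset = 'utf8'

# Undo one layer: decode as UTF-8, then map each character back to one byte.
peeled = second.decode("utf-8").encode("latin-1")

print(hexdump(peeled))
print(peeled == first)                            # second = first, encoded once more
print(first.decode("utf-8") == "Mot\u00f6rhead")  # first = plain UTF-8 for the text
```

Both comparisons print True, which is consistent with the value having been encoded as UTF-8 twice somewhere between the PHP script and the file.
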

Antaeus
Posts: 2098
Joined: Tue 14 Feb 2006 10:14

Post by Antaeus » Thu 30 Aug 2007 15:51

The way you save WideString data is not correct, because using TStringList in the SaveString method implicitly converts WideString values to string ones. Please replace the code in your SaveString procedure with the following, and save the values again:

Code: Select all

var
  fs: TFileStream;
begin
  fs := TFileStream.Create(filename, fmCreate);
  fs.Write(ws, length(ws)*2);
  fs.Free;
end;

PlutoPlanet
Posts: 10
Joined: Tue 28 Aug 2007 08:08

Post by PlutoPlanet » Thu 30 Aug 2007 17:49

Code: Select all

procedure TForm1.SaveString(ws: WideString; filename: string);
var 
  fs: TFileStream; 
begin 
  fs := TFileStream.Create(filename, fmCreate); 
  fs.Write(ws, length(ws)*2); 
  fs.Free; 
end; 

//var
//  sl: TStringList; 
//begin 
//  sl := TStringList.Create; 
//  sl.Add(ws);
//  sl.SaveToFile( ExtractFilePath( Application.ExeName ) + filename ); 
//end;

procedure TForm1.Button1Click(Sender: TObject); 
var 
  s: WideString; 
begin 
  Self.MyConnection1.Options.Charset := ''; 
  Self.MyConnection1.Options.UseUnicode := true; 

  Self.MyConnection1.Open; 
  Self.MyTable1.Open; 

  s := TWideStringField( Self.MyTable1.FieldByName( 'TestingValue' )).Value; 
  Self.SaveString( s, 'numbaone.bin' ); 

  // result is:
  // 54 c8 14 00 b8 f5 12 00 f7 b9 53 00 30 f9 12 00 23 ba 53 00                       TÈ..¸õ..÷¹S.0ù..#ºS.

  Self.MyTable1.Close; 
  Self.MyConnection1.Close; 
end; 

procedure TForm1.Button2Click(Sender: TObject); 
var 
  s: WideString; 
begin 
  Self.MyConnection1.Options.Charset := 'utf8'; 
  Self.MyConnection1.Options.UseUnicode := false; 

  Self.MyConnection1.Open; 
  Self.MyTable1.Open; 

  s := MyTable1.FieldByName('TestingValue').AsString; 
  Self.SaveString( s, 'numbatwo.bin' ); 

  // result is: 
  // dc 95 15 00 b8 f5 12 00 ef ba 53 00 30 f9 12 00 23 bb 53 00 b8 f5 12 00           Ü...¸õ..ïºS.0ù..#»S.¸õ..


  Self.MyTable1.Close; 
  Self.MyConnection1.Close; 
end;

Antaeus
Posts: 2098
Joined: Tue 14 Feb 2006 10:14

Post by Antaeus » Fri 31 Aug 2007 06:29

The way you save WideString data is not correct, because using TStringList in the SaveString method implicitly converts WideString values to string ones. Please replace the code in your SaveString procedure with the following, and save the values again:

Code: Select all

var
  fs: TFileStream;
begin
  fs := TFileStream.Create(filename, fmCreate);
  fs.Write(ws, length(ws)*2);
  fs.Free;
end;

PlutoPlanet
Posts: 10
Joined: Tue 28 Aug 2007 08:08

Post by PlutoPlanet » Fri 31 Aug 2007 17:35

that is exactly what i did :)
the code is not the same as the one before (see comments).
However: the results are:

UseUnicode:=true
54 c8 14 00 b8 f5 12 00 f7 b9 53 00 30 f9 12 00 23 ba 53 00

and

UseUnicode:=false
charset=utf8
dc 95 15 00 b8 f5 12 00 ef ba 53 00 30 f9 12 00 23 bb 53 00 b8 f5 12 00

Both look somewhat strange to me.

thanks
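Neither dump resembles UTF-16 text: a WideString holding "Motörhead" would start with 4d 00 6f 00. One possible reading, offered as an assumption, is that `fs.Write(ws, ...)` copied memory starting at the variable itself. TStream.Write takes an untyped buffer, so passing `ws` rather than `ws[1]` or `PWideChar(ws)^` hands over the location of the string reference instead of the characters. The Python sketch below only reinterprets the first dump as 32-bit little-endian words and prints the UTF-16LE bytes one would expect instead.

```python
import struct

# First dump posted above (UseUnicode := true), 20 bytes.
dump = bytes.fromhex("54 c8 14 00 b8 f5 12 00 f7 b9 53 00 30 f9 12 00 23 ba 53 00")

# What a 9-character WideString "Motörhead" would look like on disk.
expected = "Mot\u00f6rhead".encode("utf-16-le")

# Reinterpret the dump as five unsigned 32-bit little-endian integers.
words = struct.unpack("<5I", dump)

print([f"0x{w:08x}" for w in words])
print(" ".join(f"{b:02x}" for b in expected))
print(dump[:4] == expected[:4])
```

The words printed look like addresses rather than characters, so these two files probably say little about what MyDAC actually returned.
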

Willo
Posts: 34
Joined: Thu 24 Aug 2006 18:29

think my problem is related...

Post by Willo » Sat 01 Sep 2007 15:57

I'm using D7 with MyDAC 3.55 (I read the version in the WhatsNew file), along with MySQL 5.x and Navicat as my DB manager.

In Navicat I can write Ñ and ñ in my records, but I get an error when I try to do the same in my Delphi app.

I read this thread and noticed that my components do not have the properties you mentioned, so... is there a way for me to solve this?

TIA

PlutoPlanet
Posts: 10
Joined: Tue 28 Aug 2007 08:08

Post by PlutoPlanet » Sun 02 Sep 2007 15:38

Probably there are no such properties on v.3.x. I am using v.4.40. Maybe you need to upgrade.

However, what error happens? Is your ñ just stored wrong? Or does your delphi app throw an error?

BTW: In the meantime I modified the TntDBEdit. It now applies Utf8Decode when reading and Utf8Encode when writing. My application now works (somehow) with TntDBEdit, but as soon as I try to do other things, like reporting, I still have the same problems...

Willo
Posts: 34
Joined: Thu 24 Aug 2006 18:29

Post by Willo » Sun 02 Sep 2007 18:25

I get an error from MySQL as soon as I try to insert/update a record.

Updating MyDAC is not an option right now. And from your answer, I think updating is not the solution either.

Antaeus
Posts: 2098
Joined: Tue 14 Feb 2006 10:14

Post by Antaeus » Mon 03 Sep 2007 13:36

On the computer with the German localization, assign the same string constant to a WideString variable, and then convert it to UTF-8. Show us both values in hexadecimal view.

PlutoPlanet
Posts: 10
Joined: Tue 28 Aug 2007 08:08

here we are:

Post by PlutoPlanet » Sat 08 Sep 2007 09:38

Here is what I did:
1. I put TntEdit and a TButton on my TForm.
2. Typed the string "Motörhead" into the TntEdit
3. Pressed button.

Here's the code:

Code: Select all

procedure TForm1.SaveString(ws: WideString; filename: string);
var 
  fs: TFileStream; 
begin 
  fs := TFileStream.Create(filename, fmCreate); 
  fs.Write(ws, length(ws)*2);
  fs.Free; 
end; 

procedure TForm1.Button4Click(Sender: TObject);
var
  ws: WideString;
begin
  ws := Self.TnTEdit1.Text;
  Self.SaveString(ws,'widestring.bin');
  // result:
  // 9c 69 15 00 b8 f5 12 00 03 c9 53 00 30 f9 12 00 4b c9 53 00
  // .i..¸õ...ÉS.0ù..KÉS.

  Self.SaveString(Utf8Encode( ws ), 'utfstring.bin' );
  // result:
  // 1c 71 15 00 b8 f5 12 00 28 c9 53 00 30 f9 12 00 4b c9 53 00 b8 f5 12 00
  // .q..¸õ..(ÉS.0ù..KÉS.¸õ..
end;

Antaeus
Posts: 2098
Joined: Tue 14 Feb 2006 10:14

Post by Antaeus » Tue 11 Sep 2007 08:11

Try to change the connection charset to utf8 in your PHP script. Just add the following command immediately after connect:

Code: Select all

mysql_query('SET NAMES utf8');
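
Why the connection charset matters can be modelled in a few lines. The Python sketch below assumes the PHP connection was using a single-byte default such as latin1, which the thread does not state. `stored_bytes` is a hypothetical helper that mimics the server converting incoming text from the connection charset to the utf8 column, and Python's `latin-1` codec stands in for MySQL's latin1.

```python
def stored_bytes(client_bytes: bytes, connection_charset: str) -> bytes:
    """Hypothetical model: the server decodes what the client sent using the
    connection charset, then re-encodes it as UTF-8 for a utf8 column."""
    return client_bytes.decode(connection_charset).encode("utf-8")

# The browser/PHP side sends the text as UTF-8 bytes in both cases.
sent = "Mot\u00f6rhead".encode("utf-8")

for charset in ("latin-1", "utf-8"):
    result = stored_bytes(sent, charset)
    print(charset, " ".join(f"{b:02x}" for b in result))
```

Under that assumption, the first line reproduces the 12-byte dump reported earlier in the thread, and the second shows the 10 bytes a utf8 connection would store.
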
