Page 1 of 1
Malformed string exception with Firebird 2.5 UTF8 database
Posted: Sun 21 Nov 2010 20:07
by tsteinmaurer
Hello,
I'm using Delphi 2010 and IBDAC 3.10.0.14.
I read data from a Firebird 2.5 database with character set NONE, which also can contain characters like e.g. 'Ø'. This data then should be inserted into a UTF8 Firebird 2.5 database in a BLOB SUB_TYPE 1 column. For the destination connection, I use the following code in my data access layer to instantiate the connection:
Code: Select all
FIB_Connection := TIBCConnection.Create(nil);
FIB_Connection.ClientLibrary := 'fbclient.dll';
FIB_Connection.Options.Charset := 'UTF8';
FIB_Connection.Options.UseUnicode := True;
FIB_Connection.Options.EnableMemos := True;
So I guess, pretty much all I can do for "enabling" unicode stuff at connection level. But, anytime I'm trying to insert data with including e.g. 'Ø', I get a "Malformed string" exception.
Any ideas?
Thanks,
Thomas
Posted: Tue 23 Nov 2010 11:49
by AndreyZ
Hello,
I cannot reproduce the problem. I have created two connections with the following options:
Code: Select all
// database with character set UTF8
IBCConnection1.ClientLibrary := 'fbclient.dll';
IBCConnection1.Options.Charset := 'UTF8';
IBCConnection1.Options.UseUnicode := True;
IBCConnection1.Options.EnableMemos := True;
// database with character set NONE
IBCConnection2.ClientLibrary := 'fbclient.dll';
IBCConnection2.Options.Charset := '';
IBCConnection2.Options.UseUnicode := False;
IBCConnection2.Options.EnableMemos := True;
and tried to copy data from one database to another in the following way:
Code: Select all
IBCQuery1.Append;
IBCQuery1.FieldByName('id').AsInteger := IBCQuery2.FieldByName('id').AsInteger;
IBCQuery1.FieldByName('blb').AsString := IBCQuery2.FieldByName('blb').AsString;
IBCQuery1.Post;
where IBCQuery1.Connection=IBCConnection1, IBCQuery2.Connection=IBCConnection2, and the blb field has the BLOB SUB_TYPE 1 database type.
The copy process was successful. Please try reproducing the problem using these options and code.
Posted: Tue 23 Nov 2010 12:44
by tsteinmaurer
Hello,
I'm using a parametrized INSERT INTO query and access the BLOB paramters with ParamByName(...).AsMemo := ... Does this make a difference?
In your test, does the blob field in database 2 hold special characters like 'Ø'?
Thanks,
Thomas
Posted: Tue 23 Nov 2010 12:51
by tsteinmaurer
Btw, why is UNICODE_FSS used instead of the provided UTF8 character set when UseUnicode is set to True?
Thanks,
Thomas
Posted: Tue 23 Nov 2010 15:02
by AndreyZ
I'm using a parametrized INSERT INTO query and access the BLOB paramters with ParamByName(...).AsMemo := ... Does this make a difference?
You should use the following code:
Code: Select all
IBCQuery1.ParamByName('id').AsInteger := IBCQuery2.FieldByName('id').AsInteger;
IBCQuery1.ParamByName('blb').AsMemoRef.IsUnicode := true;
IBCQuery1.ParamByName('blb').AsMemoRef.AsString := IBCQuery2.FieldByName('blb').AsString;
IBCQuery1.Execute;
Posted: Tue 23 Nov 2010 15:03
by AndreyZ
Btw, why is UNICODE_FSS used instead of the provided UTF8 character set when UseUnicode is set to True?
Please read this topic:
http://www.devart.com/forums/viewtopic.php?t=19480
Posted: Tue 23 Nov 2010 17:13
by tsteinmaurer
Hello,
thanks. With the workaround, it seems to work, but I wonder what other hacks do I need to be sure that my Firebird based application is fully Unicode-enabled? Because the mentioned "Working with Unicode character data" article doesn't even mentioned your proposed solution.
Imagine, my product connects to databases where I don't have the control over whether the user connects with UTF8/UNICODE_FSS or not. Your parameter hack isn't re-usable in that scenario.
You really should look into using UTF8 instead of UNICODE_FSS, because UNICODE_FSS is not only deprecated, but also has bugs in handling unicode data correctly.
To be honest, using D2010 with IBDAC, I thought, by using standard datatypes like String etc. I'm Unicode-enabled out-of-the box. Including BLOBs SUB_TYPE 1. Don't get me wrong, but the Unicode-support doesn't work as advertised. Hopefully things are getting better with forthcoming releases, but this also forces me to renew my subscription for that. Unicode-support for a Firebird D2010 application was my number on reason for moving to IBDAC ...
Thanks anyway,
Thomas
Posted: Wed 24 Nov 2010 13:35
by AndreyZ
The workaround I gave you is needed only for this particular case when you are trying to copy data from a non-unicode database to a unicode database. In other cases you should set the TIBCConnection.Options.UseUnicode property to True to have full unicode support.
We are investigating the possibility of using UTF-8 instead of the UNICODE_FSS character set.
Posted: Thu 17 Mar 2011 15:31
by ralfiii
AndreyZ wrote:
You should use the following code:
Code: Select all
IBCQuery1.ParamByName('blb').AsMemoRef.IsUnicode := true;
IBCQuery1.ParamByName('blb').AsMemo := xxx;
This article solved my current problem.
However, for a UTF8-Unicode-Database I expected this to be the default behaviour.
Maybe you should consider making that the standard behaviour. (I would think the more people using D2009+ the more troubles you will hear of).
At least you should mention this "fix" in the helpfile in the section "Working with Unicode Character Data".
After reading the helpfile I am still not sure:
My whole application is D2010, unicode, UTF8. To use Unicode-Parameters, do I always need to set that explicitly or use .AsWideString.
(under D2010 I would have expected .AsString to be unicode and .AsAnsiString to be AnsiStrings.)
Posted: Fri 18 Mar 2011 08:32
by AndreyZ
We will investigate this question. As soon as we get any results, we will let you know.
Posted: Fri 08 Apr 2011 09:26
by Dimon
To use this functionality by default, you should set the UnicodeMemoParameters global variable from the IBC unit to True.
Posted: Mon 20 Jun 2011 09:19
by ralfiii
Dimon wrote:To use this functionality by default, you should set the UnicodeMemoParameters global variable from the IBC unit to True.
Sounds promising.
Is there any downside/bad sideeffect in doing that?
Posted: Mon 20 Jun 2011 12:14
by AndreyZ
When you set the UnicodeMemoParameters global variable to True, IBDAC automatically sets the following TIBCParam properties: IsUnicode to True, CharsetID to CH_UTF8. In this case, all queries with Memo parameters will transfer parameters' data to the server in the Unicode format.