3 CompressBlobMode Problems

Discussion of open issues, suggestions and bugs regarding MyDAC (Data Access Components for MySQL) for Delphi, C++Builder, Lazarus (and FPC)
Justmade
Posts: 108
Joined: Sat 16 Aug 2008 03:51

3 CompressBlobMode Problems

Post by Justmade » Thu 13 Jan 2011 04:54

1.
Some time ago, I had some TMyQuery components with a WideMemo field and CompressBlobMode set to cbClientServer.

The WideMemo field was still compressed when read back, and MyDAC complained that the content was not valid. So I changed all Memo fields to Blob fields as a workaround, which means I don't have a sample at hand right now, and I am too busy to prepare one at the moment.

2.
When the content contains Unicode data outside the default code page, that data comes back as '?' after being compressed and read back. So I think the compression involves converting the data to AnsiString, which causes data loss. You can simply paste something like "☒,☑" into a field with compression enabled and see the effect.

As Unicode is a worldwide trend, I think it is quite necessary to prevent data loss when compressing.
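A minimal sketch of the setup that shows the loss (the table, column, and component names here are placeholders, not from my real project):

Code: Select all

// Sketch only: assumes a connected TMyConnection and a table like
//   CREATE TABLE t1 (id INT PRIMARY KEY, txt BLOB)
MyQuery1.Options.CompressBlobMode := cbClientServer;
MyQuery1.SQL.Text := 'SELECT id, txt FROM t1';
MyQuery1.Open;
MyQuery1.Append;
MyQuery1.FieldByName('txt').AsString := '☒,☑'; // non-ANSI characters
MyQuery1.Post;
MyQuery1.Refresh;
// the non-ANSI characters now come back as '?', i.e. the data
// was pushed through AnsiString somewhere during compression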

3.
I recall seeing CompressBlobMode in the help of previous versions, but in the help of recent versions I cannot find it any more...


Thanks in advance for your kind help

Justmade
Posts: 108
Joined: Sat 16 Aug 2008 03:51

Post by Justmade » Thu 13 Jan 2011 16:11

4th Problem

When QuickRefresh is used with CompressBlobMode enabled, there are many errors and strange behaviours (such as existing data disappearing).

The QuickRefresh code that had worked very well became unusable once the tables were changed to CompressBlobMode cbClientServer, so I had to change all the QuickRefresh calls to Refresh, and the problems disappeared!

I am currently in a rush for some deliveries, so I don't have time to sit down and test all the strange behaviour :(

AndreyZ

Post by AndreyZ » Fri 14 Jan 2011 09:10

Hello,

I cannot reproduce problems one, two, and four. You are right about the help; we will return the CompressBlobMode topic to the MyDAC Reference Manual.
Please specify the following:
- the table structure (script) you are using;
- the exact version of MyDAC. You can see it in the About sheet of TMyConnection Editor;
- the exact version of your IDE;
- the exact version of MySQL server and client. You can see it in the Info sheet of TMyConnection Editor.

Justmade
Posts: 108
Joined: Sat 16 Aug 2008 03:51

Post by Justmade » Tue 18 Jan 2011 02:38

As I said before, I don't have time to prepare a sample for you at the moment. I will try to make one when I have time.

Regarding point 2, I have just tested further and found that it happens because TBlobField.AsString defaults to AnsiString.

I am using
the latest 6.00.0.3 of MyDAC,
D2010 V14.0.3593.25826
MySQL server version: 5.1.49-community
MySQL client version: Direct

Justmade
Posts: 108
Joined: Sat 16 Aug 2008 03:51

Post by Justmade » Tue 18 Jan 2011 03:58

I have re-tested changing the field type from BLOB to TEXT.

When saving, this error appears:

---------------------------
Debugger Exception Notification
---------------------------
Project EBSAPP.exe raised exception class EMySqlException with message '
#HY000Incorrect string value: '\xF2\x99\xB8\x8A\xD2\xB0...' for column 'FieldSet' at row 1'.
---------------------------
Break Continue Help
---------------------------

If I change the field back from TEXT to BLOB and manually define the field as type Wide Memo, it works well in memory, but when I call ApplyUpdates this message appears:

---------------------------
Debugger Exception Notification
---------------------------
Project EBSAPP.exe raised exception class EMySqlException with message '
#42000You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '?:???Z??U?皝怎??tN}謍?M??x??嚚?????6{H蝵兵璁株?-2?曌萄瘝?' at line 4'.
---------------------------
Break Continue Help
---------------------------

What I want is to have some fields that hold long Unicode string data, and I want them compressed. What should I do to achieve that?

AndreyZ

Post by AndreyZ » Tue 18 Jan 2011 11:02

Please make sure that you have created fields for unicode data with the UTF-8 charset in the following way:

Code: Select all

FieldSet TEXT CHARACTER SET utf8 COLLATE utf8_general_ci

Justmade
Posts: 108
Joined: Sat 16 Aug 2008 03:51

Post by Justmade » Tue 18 Jan 2011 14:23

Of course, if I use TEXT, the field is created as Unicode. My server, database, and tables all use the defaults, and I have checked to make sure they are Unicode.

The problem is that if I use TEXT, the compressed binary data is not accepted by the TEXT field, as you can see from the error message I quoted before:

#HY000Incorrect string value: '\xF2\x99\xB8\x8A\xD2\xB0...' for column 'FieldSet' at row 1'.

I think this error is very easy to reproduce.

On the other hand, if we use BLOB, which is well suited to storing binary data, MyDAC maps it to TBlobField, whose AsString defaults to AnsiString. I can actually read and write successfully using AsWideString. However, the TDBMemo control uses AsString for reading and writing its data.

So I tried to manually define the BLOB field as a TWideStringField (so that AsString uses WideString). It worked well in memory (in cached update mode), but the generated update SQL statement is not valid (as you can see in my last post).
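The manual mapping I tried can be sketched like this (sketched here with TWideMemoField; the field and component names are illustrative, not from my real project):

Code: Select all

// Sketch: create a persistent Unicode field for the BLOB column
// in code instead of letting MyDAC create a TBlobField.
var
  F: TWideMemoField;
begin
  F := TWideMemoField.Create(MyQuery1);
  F.FieldName := 'FieldSet'; // the BLOB column
  F.DataSet := MyQuery1;
  MyQuery1.Open;
  // reading and writing in cached update mode works in memory,
  // but ApplyUpdates generates an invalid UPDATE statement
end;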

Justmade
Posts: 108
Joined: Sat 16 Aug 2008 03:51

Post by Justmade » Fri 21 Jan 2011 02:20

In summary, I think:

1. TEXT fields should not be compressed before being sent to the server, because MySQL does not accept that binary data. This is critical, because currently tables with TEXT fields cannot work with CompressBlobMode.

2. It would be great to have a way to define a BLOB field as another TBlobField descendant (such as TWideMemoField), so that we can operate on it as a Unicode string.

Thanks in advance for your help.

AndreyZ

Post by AndreyZ » Fri 21 Jan 2011 09:39

We have reproduced these problems and the investigation is in progress. As soon as we solve them, we will let you know.

Dimon
Devart Team
Posts: 2910
Joined: Mon 05 Mar 2007 16:32

Post by Dimon » Wed 26 Jan 2011 12:06

The point is that the CompressBlobMode option is more meaningful for text data than for any other type of data, because the compression ratio greatly depends on the stored data; for example, compression of graphic data is usually not as effective as compression of text data.
However, compression of text fields has some problems with the table code page. With the LATIN1 character set, MySQL server allows saving any data, but with the UTF8 character set you can run into problems.
If you use text fields and the UTF8 character set, set the CompressBlobMode option to cbNone.

Justmade
Posts: 108
Joined: Sat 16 Aug 2008 03:51

Post by Justmade » Thu 27 Jan 2011 04:55

If ANSI text needs compression, Unicode text has the same need, if not more. Unless you expect your product to stop at the ANSI age, we have to find solutions that adapt to the Unicode world. Delphi took too long to adapt, and that is one of the reasons it lost popularity. I can see that MyDAC has put much effort into Unicode support, and I hope there will be a solution for the compression issue as well.

For text fields, I now understand that MySQL behaves differently for ANSI and UTF8 fields. I think MyDAC should be able to distinguish between ANSI and Unicode fields, and then compress the former but not the latter.

For BLOB fields, can you please at least let us define a BLOB field as a TWideMemo field, so that we can handle Unicode text at least via BLOB fields? Thanks.

Dimon
Devart Team
Posts: 2910
Joined: Mon 05 Mar 2007 16:32

Post by Dimon » Thu 27 Jan 2011 14:47

The problem with the UTF8 character set arises not because we cannot compress this data, but because the compressed data is a set of bytes and has no meaning as UTF8 text. Therefore, MySQL server does not allow saving such data in a text field.
To work around the problem, try setting the LATIN1 character set for compressed text fields.

Justmade
Posts: 108
Joined: Sat 16 Aug 2008 03:51

Post by Justmade » Fri 28 Jan 2011 03:30

Dimon wrote: To solve the problem try to set the LATIN1 character set for compressed text fields.
No, it does not solve the problem. As long as Connection.Charset is utf8 or Connection.UseUnicode is True, the error is still there, even when the text field is set to latin1.

If I set the connection charset to latin1 and UseUnicode to False, it works, but all other Unicode support is disabled.

That's why I have repeated that using a BLOB field to handle long Unicode strings might be the best approach, as MyDAC can already handle most of the issues. As I said, I have tested that reading and writing through TBlobField.AsWideString works perfectly. The only problem is that most existing visual components use AsString (which routes to AsAnsiString for TBlobField) to populate and update their text. If there were a way to define the field as a TWideMemoField, AsString would use AsWideString instead, and that would solve all the problems.
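Roughly, the working AsWideString access versus what the data-aware controls do (names are illustrative):

Code: Select all

// Works: explicit Unicode access to the compressed BLOB field.
MyQuery1.Edit;
TBlobField(MyQuery1.FieldByName('FieldSet')).AsWideString := '☒,☑';
MyQuery1.Post;
ShowMessage(TBlobField(MyQuery1.FieldByName('FieldSet')).AsWideString);

// Fails: a TDBMemo bound to the same field reads and writes through
// AsString, which for TBlobField routes to AsAnsiString and
// loses the non-ANSI characters.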

AndreyZ

Post by AndreyZ » Mon 14 Feb 2011 14:14

We have thoroughly investigated this question, and we have come to the conclusion that it is impossible to compress blob fields with Unicode data. There are two reasons:
1) when the field is defined as TEXT, the server does not accept the compressed data, because it is not valid text;
2) when the field is defined as BLOB, you will not be able to input Unicode data in any data-aware component, because they use the AsAnsiString method to read and write data. Mapping blob fields to TWideMemoField is not a solution, because most blob fields contain binary data, not text data.
Last edited by AndreyZ on Thu 17 Feb 2011 09:39, edited 1 time in total.

Justmade
Posts: 108
Joined: Sat 16 Aug 2008 03:51

Post by Justmade » Wed 16 Feb 2011 15:37

Thank you for your thorough investigation.

I understand that it is impossible to map BLOB to TWideMemoField by default, as you explained.

I am just asking whether we can manually map a BLOB field to TWideMemoField through the Field Editor or in code, and have the field keep working properly. Currently, if I do that, I get an error.

Thank you.
