OracleConnection.Unicode property

Discussion of open issues, suggestions and bugs regarding ADO.NET provider for Oracle

Post by Alladin » Fri 16 Jan 2009 18:54

What does this property do, and when should I use it or not use it?

All strings in .NET are Unicode, so what's the point of having this parameter?
Are there any performance penalties or benefits when reading and writing data?

Could you please say whether Unicode should be true in each of these cases:

a) Server character set is Unicode UTF8 (NLS_CHARACTERSET=AL32UTF8)

b) Server character set is Unicode UTF16 (NLS_CHARACTERSET=AL16UTF16)

c) Server character set is a single-byte encoding (e.g. NLS_CHARACTERSET=WE8MSWIN1252)

Thank you in advance,
Lex

Post by Shalex » Mon 19 Jan 2009 11:03

The Unicode property defines the charset used at the network level when transferring strings. This choice affects the number of conversions performed in the program and the possibility of data loss. If the Unicode property is set to true, UTF8 is used at the network level. Otherwise, the charset from the Windows regional settings is used.
a) If Unicode=true, the following scheme is implemented:
UTF8 (stored at the server) -> UTF8 (transported through the network) -> UTF16 (string in .NET).
b) If Unicode=true:
UTF16 (server) -> UTF8 (network) -> UTF16 (.NET). Unfortunately, UTF16 cannot be used at the network level.
c) If Unicode=false:
WE8MSWIN1252 (server) -> WE8MSWIN1252 (network) -> UTF16 (.NET).
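
The data-loss point above can be illustrated outside the provider. The following is only a rough sketch using Python's built-in codecs, not the provider's actual implementation: `transport_round_trip` is a hypothetical helper, `cp1252` stands in for WE8MSWIN1252, and replacing unmappable characters with `?` is an assumption about how a lossy conversion might behave.

```python
def transport_round_trip(text: str, wire_charset: str) -> str:
    """Encode text with the wire charset, then decode it back.

    Characters the charset cannot represent are replaced with '?'
    (an assumption made for this illustration).
    """
    wire_bytes = text.encode(wire_charset, errors="replace")
    return wire_bytes.decode(wire_charset)


western = "café €5"             # fits in Windows-1252
mixed = "café Привет 日本語"     # contains Cyrillic and CJK characters

# UTF-8 on the wire: every Unicode string survives the round trip.
utf8_mixed = transport_round_trip(mixed, "utf-8")

# Single-byte charset on the wire: fine for Western text only.
cp1252_western = transport_round_trip(western, "cp1252")
cp1252_mixed = transport_round_trip(mixed, "cp1252")

print(utf8_mixed == mixed)        # True
print(cp1252_western == western)  # True
print(cp1252_mixed)               # café ?????? ???
```

With a single-byte wire charset, text survives only if every character exists in that charset, which is why case c) is safe only when the data really is limited to the server's code page.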

Post by Alladin » Mon 19 Jan 2009 14:38

Thank you very much for the detailed answer.

So the logical conclusion is that it is always safer to use Unicode=true. Right?

And there is no obvious performance penalty: data coming off the network always has to be converted to UTF16 for .NET, whether from a single-byte code page or from UTF8.

So Unicode is our choice!

Post by Shalex » Mon 19 Jan 2009 16:34

The only disadvantage of using Unicode=true is a performance penalty, but it is negligible.
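
One difference between the two wire charsets that is easy to measure is payload size. The thread does not say where the penalty comes from, so the sketch below only compares encoded byte counts with Python's built-in codecs; `cp1252` stands in for WE8MSWIN1252, and the sample strings are arbitrary.

```python
samples = ["Oracle", "café", "Привет"]

# Map each sample to (bytes as UTF-8, bytes as Windows-1252).
# Characters missing from Windows-1252 are replaced, so they cost one byte but are lost.
sizes = {
    s: (len(s.encode("utf-8")), len(s.encode("cp1252", errors="replace")))
    for s in samples
}

for text, (utf8_len, cp1252_len) in sizes.items():
    print(f"{text}: utf-8={utf8_len} bytes, cp1252={cp1252_len} bytes")
```

Pure ASCII text takes the same number of bytes either way. Accented and non-Latin characters take more bytes in UTF8, which is the price of not losing them.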
