Page 1 of 1

UTF8/UTF16 and NVARCHAR2 string comparison failure

Posted: Mon 07 Nov 2011 11:58
by object
We have been using our application with Oracle database using NLS_NCHAR_CHARACTERSET set to AL16UTF16, and everything worked fine.

Now we have connected it to a database that uses UTF8 and we are experiencing strage behavior.

A simple LINQ query that contains join with string comparison suddently returns an empty result set. Here is one:

--> FAIL: returns emtpy result set

Code: Select all

            string mediumtype = "tv";

            var currentSeries = from s in _entities.Series

                                join tm in _entities.SeriesTitles on s.SerieId equals tm.SerieId into tmj
                                from tm in tmj.Where(m =>
                                        m.TitlesGroupType == TitlesGroupType.MainTitle)
                                where s.Medium.ToLower() == mediumtype.ToLower()

                                select s;
But if change mediumtype to a hardcoded string, it succeeds!

--> SUCCEDS

Code: Select all

            var currentSeries = from s in _entities.Series

                                join tm in _entities.SeriesTitles on s.SerieId equals tm.SerieId into tmj
                                from tm in tmj.Where(m =>
                                        m.TitlesGroupType == TitlesGroupType.MainTitle)
                                where s.Medium.ToLower() == "tv"

                                select s;
The string data types are all NVARCHAR2, and here is the generated queries:

SELECT "GroupBy1".A1 AS C1
FROM (SELECT Count(1) AS A1
FROM SERIES "Extent1"
INNER JOIN SERIES_TITLES "Extent2"
ON ("Extent1".SERIE_ID = "Extent2".SERIE_ID)
AND ("Extent2".TITLES_GROUP_TYPE = 'mainTitle' /* @p__linq__0 */)
WHERE (LOWER("Extent1".MEDIUM)) = (LOWER('tv' /* @p__linq__1 */))) "GroupBy1"

SELECT "GroupBy1".A1 AS C1
FROM (SELECT Count(1) AS A1
FROM SERIES "Extent1"
INNER JOIN SERIES_TITLES "Extent2"
ON ("Extent1".SERIE_ID = "Extent2".SERIE_ID)
AND ("Extent2".TITLES_GROUP_TYPE = 'mainTitle' /* @p__linq__0 */)
WHERE (LOWER("Extent1".MEDIUM)) = 'tv') "GroupBy1"

Both queries work fine on a UTF16 database.

Posted: Mon 07 Nov 2011 13:46
by object
I did more tests and narrowed the problem: strange things happen when a query contains a string literal, e.g.

s == MyClass.SomeStringConstant

If I add ToLower() to both parts of the comparison, everything works as it should.

This only happens when Oracle database uses UTF8 for N(VAR)CHAR types.

Posted: Mon 07 Nov 2011 16:59
by Shalex
Please turn on dbMonitor and tell us the data type of the p__linq__1 parameter.

Posted: Tue 08 Nov 2011 07:23
by object
I've added to a test the following code:

var dbMonitor = new OracleMonitor();
dbMonitor.IsActive = true;

However, the only event I see in DB Monitor is "dotConnect for Oracle monitoring is started". I am running DB Monitor with default filter, i.e. it should display all events. The test code is running from within a unit test, I tried different unit test runners with no difference.

Posted: Tue 08 Nov 2011 07:36
by object
Oops, I incorrectly set up dbMonitor first, now it's working, and here's what I get:

1. When query fails (empty result set):
string mediumtype = "tv";

var currentSeries = from s in _entities.Series
join tm in _entities.SeriesTitles on s.SerieId equals tm.SerieId into tmj
from tm in tmj
.Where(m => m.TitlesGroupType == TitlesGroupType.MainTitle)
where s.Medium.ToLower() == mediumtype.ToLower()
select s;
SELECT
"GroupBy1".A1 AS C1
FROM ( SELECT Count(1) AS A1
FROM SERIES "Extent1"
INNER JOIN SERIES_TITLES "Extent2" ON ("Extent1".SERIE_ID = "Extent2".SERIE_ID) AND ("Extent2".TITLES_GROUP_TYPE = :p__linq__0)
WHERE (LOWER("Extent1".MEDIUM)) = (LOWER(:p__linq__1))
) "GroupBy1"

p__linq__0: NVarChar, mainTitle
p__linq__1: VarChar, tv

2. When query succeeds (non-empty result set):
var currentSeries = from s in _entities.Series
join tm in _entities.SeriesTitles on s.SerieId equals tm.SerieId into tmj
from tm in tmj
.Where(m => m.TitlesGroupType == TitlesGroupType.MainTitle)
where s.Medium.ToLower() == "tv"
select s;
SELECT
"GroupBy1".A1 AS C1
FROM ( SELECT Count(1) AS A1
FROM SERIES "Extent1"
INNER JOIN SERIES_TITLES "Extent2" ON ("Extent1".SERIE_ID = "Extent2".SERIE_ID) AND ("Extent2".TITLES_GROUP_TYPE = :p__linq__0)
WHERE (LOWER("Extent1".MEDIUM)) = 'tv'
) "GroupBy1"

p__linq__0: NVarChar, mainTitle

3. Another variant of the same query that succeds:
string mediumtype = "tv";

var currentSeries = from s in _entities.Series
join tm in _entities.SeriesTitles on s.SerieId equals tm.SerieId into tmj
from tm in tmj
.Where(m => m.TitlesGroupType.ToLower() == TitlesGroupType.MainTitle.ToLower())
where s.Medium.ToLower() == mediumtype.ToLower()
select s;
SELECT
"GroupBy1".A1 AS C1
FROM ( SELECT Count(1) AS A1
FROM SERIES "Extent1"
INNER JOIN SERIES_TITLES "Extent2" ON ("Extent1".SERIE_ID = "Extent2".SERIE_ID) AND ((LOWER("Extent2".TITLES_GROUP_TYPE)) = (LOWER(:p__linq__0)))
WHERE (LOWER("Extent1".MEDIUM)) = (LOWER(:p__linq__1))
) "GroupBy1"

p__linq__0: VarChar, mainTitle
p__linq__1: VarChar, tv

Note the difference in representation of p__linq__0 in variants 1 and 3! Looks like when VarChar and NVarChar parameters are mixed, query fails.

Posted: Wed 09 Nov 2011 11:37
by Shalex
1. Please send us a small test project with the corresponding DDL/DML script to reproduce the issue in our environment.
2. Specify:
a) the version and NLS_NCHAR_CHARACTERSET, NLS_CHARACTERSET, NLS_LANGUAGE parameters of your Oracle server;
b) the HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\%HomeName% > NLS_LANG setting of your Oracle client;
c) the exact (x.xx.xxx) version of your dotConnect for Oracle;
d) your connection string (roughly, without credentials). Have you tried using the "Unicode=true;" connection string parameter?

Posted: Wed 09 Nov 2011 15:19
by object
I have sent required information (schema, test code and environment data).

Looking forward to hear from you soon.

Posted: Thu 10 Nov 2011 08:53
by object
In the mail that I sent yesterday I didn't submit and test data, but it's simple to create: just insert some rows in SERIES table with MEDIUM set to 'tv' and 'radio', and some SERIES_TITLES rows with TITLES_GROUP_TYPE set to 'mainTitle' and 'secondaryTitle'. I think this should be sufficient.

Posted: Mon 14 Nov 2011 09:41
by object
I have donwloaded dotConnect 6.50.244 and these UTF8/UTF16 issues are gone. Everything works fine.

Thanks for the update.