Multiple sums in sql when not required

Discussion of open issues, suggestions and bugs regarding ADO.NET provider for PostgreSQL
Post Reply
damon.cognito
Posts: 50
Joined: Wed 22 Jul 2009 09:30

Multiple sums in sql when not required

Post by damon.cognito » Fri 15 Apr 2016 16:19

Found a bit of a performance issue with the generated sql in some instances. I've reproduced on a sample database. The LINQ is:

Code: Select all

var dt = new DateTime(2005, 10, 1);
var q1 = (from mh in entity.Head
            select new
            {
                t = (decimal?)(mh.Accs.Bal1 - mh.Bills.Where(o => o.StampDate > dt).Sum(o => o.Amount - o.Total ?? 0M)) ?? 0M,
                a = (decimal?)(mh.Accs.Bal2 - mh.Bills.Where(o => o.StampDate <= dt).Sum(o => o.Amount)) ?? 0M,
                mh.OIDHead
            }).ToList();
The resultant sql carries out the Sums on TblBills twice each making 4 in total. The issue appears to be cuased by the ?? 0M on the end. As soon as these are removed the sql has the expected two sums (but I do need them as it must return 0 not null if there are no records).

If I then add a where clause based on the calculated variables like so

Code: Select all

var q1 = (from mh in entity.Head
            select new
            {
                t = (decimal?)(mh.Accs.Bal1 - mh.Bills.Where(o => o.StampDate > dt).Sum(o => o.Amount - o.Total ?? 0M)) ?? 0M,
                a = (decimal?(mh.Accs.Bal2 - mh.Bills.Where(o => o.StampDate <= dt).Sum(o => o.Amount)) ?? 0M,
                mh.OIDHead
            }).Where(o => o.t != 0M || o.a != 0M).ToList();

It is doubled, so that the the total amount of sums is 8 instead of 2. The example is trivial but when scaled up it is too slow even with indices. Is there any optimisation you can do your side to the SQL or is this a tree problem?

Head -> Accs is a 1 -> 1
Head -> Bills is a 1 -> *

I can send you the schema if you need it.

Thanks.

VS2013, EF6.01, Devart 7.4.592

Shalex
Site Admin
Posts: 9543
Joined: Thu 14 Aug 2008 12:44

Re: Multiple sums in sql when not required

Post by Shalex » Tue 19 Apr 2016 17:11

damon.cognito wrote:Found a bit of a performance issue with the generated sql in some instances. I've reproduced on a sample database. The LINQ is:

Code: Select all

var dt = new DateTime(2005, 10, 1);
var q1 = (from mh in entity.Head
            select new
            {
                t = (decimal?)(mh.Accs.Bal1 - mh.Bills.Where(o => o.StampDate > dt).Sum(o => o.Amount - o.Total ?? 0M)) ?? 0M,
                a = (decimal?)(mh.Accs.Bal2 - mh.Bills.Where(o => o.StampDate <= dt).Sum(o => o.Amount)) ?? 0M,
                mh.OIDHead
            }).ToList();
The resultant sql carries out the Sums on TblBills twice each making 4 in total. The issue appears to be cuased by the ?? 0M on the end. As soon as these are removed the sql has the expected two sums (but I do need them as it must return 0 not null if there are no records).
We have reproduced the SQL generation and are investigating the case.
damon.cognito wrote:If I then add a where clause based on the calculated variables like so

Code: Select all

var q1 = (from mh in entity.Head
            select new
            {
                t = (decimal?)(mh.Accs.Bal1 - mh.Bills.Where(o => o.StampDate > dt).Sum(o => o.Amount - o.Total ?? 0M)) ?? 0M,
                a = (decimal?(mh.Accs.Bal2 - mh.Bills.Where(o => o.StampDate <= dt).Sum(o => o.Amount)) ?? 0M,
                mh.OIDHead
            }).Where(o => o.t != 0M || o.a != 0M).ToList();

It is doubled, so that the the total amount of sums is 8 instead of 2.
The amount of Sums in the generated SQL is 4 in our test with the newest versions: EFv6.1.3 and dotConnect for PostgreSQL v7.4.616.

We will notify you about the results of our investigation.

Shalex
Site Admin
Posts: 9543
Joined: Thu 14 Aug 2008 12:44

Re: Multiple sums in sql when not required

Post by Shalex » Thu 28 Apr 2016 13:18

damon.cognito wrote:Is there any optimisation you can do your side to the SQL or is this a tree problem?
This is a tree problem. The tree was generated by EF engine which failed to optimize it. There are the following workarounds:

1. Move some functionality to the .NET level:

Code: Select all

var q3 = (from mh in entity.Heads
            select new {
                t = (decimal?)(mh.Acc.Bal1 - mh.Bills.Where(o => o.StampDate > dt).Sum(o => o.Amount - o.Total ?? 0M)),
                a = (decimal?)(mh.Acc.Bal2 - mh.Bills.Where(o => o.StampDate > dt).Sum(o => o.Amount)),
                mh.OIDHead
            }).ToList()
            .Select(r => new {
                t = r.t ?? 0M,
                a = r.a ?? 0M,
                OIDHead = r.OIDHead
            })
            .ToList();
2. Create a stored procedure in your database which will contain an optimized SQL and return result set. Then add it to your model and allow Entity Developer to generate the corresponding ComplexType.

3. Write an optimized SQL and add it to CommandText (Model Explorer > Store > Stored Procedures > Add > New Command Text). Then add it to CSDL and generate the corresponding ComplexType.

Post Reply