I'm using Entity framework with the latest version of DotConnect for MySql (6.50.237.0) in C# 4.0
I'm trying a simple groupby query on a very large table (100,000,000 records) with Linq
var imprs = from axi in db.aximpressions
where axi.Axd_SiteID == 2
group axi by axi.Axd_BannerID
into g
select new
{
g.Key,
Sum = g.Sum(i => i.Axi_Impressions)
};
The query takes a few minutes to execute (and 5.5 gigs of ram)
Here is the generated .ToTraceString() SQL
SELECT GroupBy1.K1 AS Axd_BannerID,
GroupBy1.A1 AS C1
FROM
(
SELECT Extent1.Axd_BannerID AS K1,
Sum(Extent1.Axi_Impressions) AS A1
FROM
(
SELECT aximpression.Axi_ID AS Axi_ID,
[...ALL table other columns...]
FROM adxpose.aximpression AS aximpression
) AS Extent1
WHERE Extent1.Axd_SiteID = 2
GROUP BY Extent1.Axd_BannerID
) AS GroupBy1
The innermost query select ALL records (yes all 100,000,000 records)
Then another query filter on the Axd_SiteID field and do the grouping and sum
Now I understand why it takes forever
But why the generated query isn't something simple like this ?
select axd_bannerid,
sum(axi_impression)
from AxImpression
where axd_siteid = 2
group by axd_bannerid
This execute in a 3-4 seconds
Thanks!
Michel
Linq Group by performance
It seems like you have DefiningQuery in the SSDL part of your *.edmx: http://msdn.microsoft.com/en-us/library/bb738450.aspx. Try removing it.
-
- Posts: 2
- Joined: Thu 03 Nov 2011 01:22