Page 1 of 1
Problem withTmyTable implementation of Handler
Posted: Sat 05 Sep 2009 20:23
by Brent
Your TmyTable can execute a Handler statement but I don't see how I can have the TmyTable loop through all the rows in the table by fetching only 1000 rows at a time.
I have a large table (25 million rows) and the Handler would be one way to quickly traverse the table in INDEX ORDER. The problem is the TmyTable using a Handler will load ALL the rows of the table into memory when the table is opened. This means it sits there for 10 minutes for a table of just 5 million rows before my loop gets executed.
The Handler syntax has an option where it returns only 'x' number of rows at a time, then when it runs out, it executes another Handler to finish from where it left off. I don't see how I can do this in just one loop.
1) Can you provide me with a simple example with a TmyTable using a Handler in a loop that traverses the entire table by fetching only 1000 rows into memory at a time?
Also I can't see how I can use a TmyQuery to traverse this large table in index order (readonly) without it loading up a lot of rows in memory or by taking a long time sorting the 25m rows. Right now I am executing 1500 queries to fetch only a few thousand rows at a time and this is very slow.
I tried Unidirectional and ReadOnly but as soon as I add a Sort clause, it physically sorts the entire table even though I have an index that matches the sort order.
2) Any ideas on how to get the query to run faster?
TIA
Brent
Posted: Mon 07 Sep 2009 07:29
by Dimon
To solve the problem try to use the TMyTable.Offset and Limit properties.
Posted: Mon 07 Sep 2009 16:46
by Brent
Yes I can use OffSet,Limit but I am going through 25 million row table. That means if I retrieve 100k rows each time, that will be 250 queries. There are 2 reasons I don't want to do this. The sorting is killing any speed I hope to achieve. The Handler does not need to sort because it is capable of going through in index order without sorting. The Offset idea is fine if the offset is small. But as it becomes larger it will take longer and longer to skip over the first 'n' rows.
The only "solution" I see is to use TmyTable with Handler. The problem I found with Devart's implementation of Handler is it will try and retrieve all rows in one fell swoop and I'll run out of memory. But the Handler syntax can retrieve 'n' rows at a time which is what I need to do. Unlike SQL's offset, handler can fetch the next 'n' rows from where it left off so it doesn't have to jump over the first 'n' rows to get the next 'n' rows. But the Devart's implementation of TmyTable with Handler doesn't seem to support that. From what I can see it is all or nothing. Which is why I am posting here. To see if there is a way around this limitation.
Also when using SQL I haven't seen any way to traverse the whole table without running out of memory. I have 3gb and there should be some way to do it with SQL (without using offset,limit). Even with FetchAll to False and retrieving only 1k rows at a time, it is trying to return all rows at once. Not sure why. I'm using the latest version of MyDAC. Maybe it is the sort on the 25M row table? Handler would have been so much less resource intensive had it worked.
Brent
Re: Problem withTmyTable implementation of Handler
Posted: Tue 08 Sep 2009 08:31
by Dimon
Brent wrote:The Handler syntax has an option where it returns only 'x' number of rows at a time, then when it runs out, it executes another Handler to finish from where it left off. I don't see how I can do this in just one loop.
Please specify a SQL query that executes this functionality.
Posted: Wed 09 Sep 2009 22:03
by Brent
<<Please specify a SQL query that executes this functionality.<<
Sure. Handler is suppose to be able to traverse a huge table in index order without requiring any sorting and can accomplish it with a modest amount of memory because it only loads a few rows at a time and it remembers where it left off. (Handler remains open between Handler calls!)
Example:
This was executed from the MySQL.Exe command line.
-- This opens the handler on the tmp table and gives is a name "myalias"
handler tmp open as myalias;
-- I have requested the next 10 rows from myalias (tmp table) in ix_Main index order
handler myalias read ix_main next limit 10;
.... I now get back 10 rows from the table ....
handler myalias read ix_main next limit 10;
.... I now get back the NEXT 10 rows from the table ....
I can repeat this handler command as many times as I like, even change the # of rows returned, until I get through the entire table.
It will only return "limit n" rows at a time. The next time the Handler is executed, because the Handler is still open, will get the next 10 rows etc..
So it only uses as much memory as it takes to retrieve the 'n' rows and it requires no sorting. This is a big plus when traversing a table with tens of millions of rows in it.
I don't see how Devart's implementation of Handler with TmyTable can keep retrieving the next 10 rows ad infinitum until it hits the end of table. Unless I am missing something here, it appears I can only retrieve all the rows, or a Offset,Limit number of rows.
Someone suggested to use Offset,Limit which means I would have to keep opening Handler repetitively which defeats the purpose of Handler. As the offset gets larger, it takes more and more time to retrieve the next set of rows. That's why I'd like to traverse the table by opening the Handler only once without using Offset. This will be quite fast.
Is there a way to use TmyTable to achieve this?
TIA
Brent
Posted: Thu 10 Sep 2009 07:41
by Dimon
TMyTable does not provide such functionality. To solve the problem you should use the TMyQuery component.
Posted: Thu 10 Sep 2009 16:02
by Brent
>>TMyTable does not provide such functionality.<<
Actually it does! I finally got it to work. Now I can traverse a table at 20k rows/second in index order without sorting and without using any additional memory.
Brent
Posted: Mon 14 Sep 2009 06:44
by Dimon
Please, descride how you solved the problem.
Posted: Tue 15 Sep 2009 01:51
by chihebbs
Actually, I would like to see this extremely fast solution to traverse a table.
Posted: Sun 20 Sep 2009 02:22
by chihebbs
Any news about the miraculous solution. It is a forum, developers are supposed to help each other.
Posted: Thu 24 Sep 2009 15:26
by Brent
Sorry I didn't mean to leave you guys hanging. I just didn't check back here after I posted.
I'll post the solution in a couple of days. But first I'd like you guys to try TmyTable Handler yourself to see if you run into the same problem I did. Set the TmyTable to a large table, say 10+ million rows, then try and traverse it with Handler using an index. If you do it correctly then it should start retrieving rows immediately without having to sort and without waiting to buffer the results. And it can go through the entire table without using any additional memory. Often when I use SQL and a sort to traverse a large table, it eats up RAM faster than a vacuum cleaner sucking up sand.
So post your results and I get back to you in a couple of days.
Brent