I have about 160 million records to import into one master table and one child table. All input records come from various .csv files. To improve insert speed I defined only two indexes on the table: the IDENTITY field as PK, and a nonclustered index (IX) on a binary(16) digest field, which I use to check for duplicates.
I use MSQuery to import. I have an output IDENTITY parameter (SET :RowId=SCOPE_IDENTITY()); I need the IDENTITY value to insert the row into the child table. I start a transaction, commit it every 10,000 rows, and then start a new one.
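For reference, the per-row statement described above has roughly this shape (table and column names are placeholders, not taken from the thread):

```sql
-- Hypothetical shape of the per-row insert described above.
-- :Digest, :Payload are input parameters; :RowId is the output parameter.
INSERT INTO MasterTable (Digest, Payload)
VALUES (:Digest, :Payload);
SET :RowId = SCOPE_IDENTITY();  -- new IDENTITY, returned to the client
```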
The whole process is painfully slow: it inserts about 60 rows per second.
Questions:
1. How can I improve the performance?
2. I have a feeling that MSQuery is doing AutoCommit after every INSERT (regardless of transaction). Is that true and, if it is, how do I disable it?
Any help is appreciated.
Thanks
BDS2006, SDAC 3.70, SQL Server 2005 SP1, WinXP Pro SP2, 4GB RAM, 500GB Hard Disk
How to speed up INSERT
> 1. How can I improve the performance?
- Use batches. This will reduce the number of network roundtrips to the server.
- Try to use TMSSQL instead of TMSQuery.
- Use prepared execution to execute a parameterized INSERT statement.
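As a sketch of the last point: prepared execution means the server compiles the parameterized statement once and reuses the plan for every row. On the server side this is equivalent to running the INSERT through sp_executesql (table and column names below are illustrative, not from the thread):

```sql
-- Illustrative only: a parameterized INSERT via sp_executesql, so
-- SQL Server caches one plan and reuses it for each parameter set.
EXEC sp_executesql
    N'INSERT INTO MasterTable (Digest, Payload) VALUES (@Digest, @Payload);',
    N'@Digest binary(16), @Payload varchar(100)',
    @Digest  = 0x0123456789ABCDEF0123456789ABCDEF,
    @Payload = 'row data';
```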
> 2. I have a feeling that MSQuery is doing AutoCommit after every INSERT (regardless of transaction). Is that true and, if it is, how do I disable it?
This should not happen. If you test it and ensure that COMMIT actually happens, please send us a complete small test project to reproduce the problem; it is desirable to use Northwind or Master schema objects, otherwise include the definition of your own database objects; don't use third-party components.
> - Use batches. This will reduce the number of network roundtrips to the server.

I think I cannot use batches because I need the IDENTITY column to insert the row into the child table. Is there a way to use batches and get the IDENTITY for each inserted master row?
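One possibility, not confirmed anywhere in this thread: SQL Server 2005 supports the OUTPUT clause, which can capture the generated IDENTITY values for a multi-row INSERT in a single statement. A sketch, with placeholder table and column names and an assumed staging table holding one batch of input rows:

```sql
-- Sketch: insert a batch of master rows and capture each new IDENTITY
-- alongside its digest, so child rows can be matched up afterwards.
DECLARE @NewIds TABLE (RowId int, Digest binary(16));

INSERT INTO MasterTable (Digest, Payload)
OUTPUT INSERTED.RowId, INSERTED.Digest INTO @NewIds
SELECT Digest, Payload
FROM   #StagingRows;   -- hypothetical staging table with the current batch

-- @NewIds now maps each digest to its generated IDENTITY value.
```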
> - Try to use TMSSQL instead of TMSQuery.

Changed from MSQuery to MSSQL but did not see any performance improvement.
> - Use prepared execution to execute a parameterized INSERT statement.

Cannot prepare because I have an output parameter in the INSERT statement (SET :RowId=SCOPE_IDENTITY()).
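A possible workaround, untested here: return the new key as a one-row result set instead of an output parameter. A plain SELECT SCOPE_IDENTITY() at the end of the batch avoids the output parameter, which may allow the statement to be prepared (placeholder names as before):

```sql
-- Same information as the output parameter, but as a result set
-- the client fetches after each execute.
INSERT INTO MasterTable (Digest, Payload) VALUES (:Digest, :Payload);
SELECT SCOPE_IDENTITY() AS RowId;
```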
Well, I guess 60 rows per second is what I'm stuck with. That is roughly 5 million rows in 24 hours. Tough on the initial import, but later additions of new rows will be moderate.
> > 2. I have a feeling that MSQuery is doing AutoCommit after every INSERT (regardless of transaction). Is that true and, if it is, how do I disable it?
>
> This should not happen.

It is not happening. My mistake. The slow speed of the inserts threw me off...
Thanks.