BigQuery Destination Insert Errors

Discussion of issues, suggestions and bugs of Devart SSIS Data Flow Components, our product line for building SSIS-based ETL solutions, performing data access to popular cloud applications and databases.
Post Reply
rjdkey
Posts: 4
Joined: Thu 11 Mar 2021 21:16

BigQuery Destination Insert Errors

Post by rjdkey » Thu 11 Mar 2021 21:22

We have a couple SSIS packages that are throwing errors when inserting data into BigQuery with the errors below. Most of our packages are running fine so these errors are specific to a couple packages:


Error Description: An exception has occurred during insert operation, the message returned from the provider is:
Quota exceeded: Your table exceeded quota for imports or query appends per table.

Error Description: System.Exception: An exception has occurred during insert operation, the message returned from the provider is:
The underlying connection was closed: An unexpected error occurred on a send.


Has anyone else also experience these errors and found a solution? We are using bulk insert.

DmitryGm
Devart Team
Posts: 42
Joined: Fri 11 Dec 2020 10:27

Re: BigQuery Destination Insert Errors

Post by DmitryGm » Mon 15 Mar 2021 08:50

There is restriction from Google.
See the article: Quotas and limits

Note that:
  • Destination tables in a query job are subject to the limit of 1,500 updates per table per day.
  • You are limited to 1,500 operations per table per day whether the operation appends data to a table or truncates a table.
  • If you populate the insertId field for each row inserted, you are limited to 100,000 rows per second per table.
  • If you populate the insertId field for each row inserted, you are limited to 100 MB per second, per table.
First of all, determine what type of restrictions you are facing and try to fix you flow.

rjdkey
Posts: 4
Joined: Thu 11 Mar 2021 21:16

Re: BigQuery Destination Insert Errors

Post by rjdkey » Fri 26 Mar 2021 17:51

I looked through all of the documentation for the DevArt BigQuery components as well as the google documentation around quotas and am still struggling to understand why it is happening. There are a total of 20 million rows we are inserting with the BigQuery destination component. We have the UploadBatchSize set to the default of 31457280 which should break the data up into 30MB csv files while uploading to google cloud storage. The BatchSize was set to 314572800 which should be doing a single load of all of the 30MB csv every time its gets to 300MB of data and deleting the csv files at that point. The interesting thing is that if I monitor the google cloud storage and keep refreshing while the SSIS package is running I only ever see one file in there that keeps changing with each refresh and the file always seems to be under 1MB. That is telling me that it is doing loads more often than it should be based on the way I have the parameters set. I would expect to see between 1 and 10 csv files in there at a time each around 30MB if it was working correctly. If its doing individual loads of a single csv file under 1MB it will definitely hit the quota. Does it matter what type of source I am using? I am using an ODBC connection source set to a batch size of 100,000 records. It is also going through a data conversion control between the source and destination. Other than that its a fairly simple source to destination without anything complicated. Any idea why it wouldn't be creating larger CSV files and waiting until it get to the BatchSize set before it does the load? Would switching the source to an OLEDB or ADO.NET connection help in any way or should the function of the BigQuery Destination be totally independent of the source I am using? I also tried switching the BatchSize up to the max values it lets me enter which is 999999999. In that case it should be doing a load every 953.67MB but that didn't help either. The total size of the 20 million records in the database is 2.5GB so even when the BatchSize is set to 314572800 (300MB) it should only be doing just over 8 loads, correct?

DmitryGm
Devart Team
Posts: 42
Joined: Fri 11 Dec 2020 10:27

Re: BigQuery Destination Insert Errors

Post by DmitryGm » Wed 31 Mar 2021 07:46

Be aware of limitations put by Data Flow Task itself. "SSIS buffer" is allocated for each task, based on the number of columns in the source component, type of these columns, and a lot of other parameters. It's size also depends on the runtime. As the result, different computers may have different SSIS buffer size. However, it won't exceed DefaultBufferSize. Refer to Data Flow Performance Features .

Try to adjust Data Flow Task parameters: DefaultBufferSize and DefaultBufferMaxRows.
Devart SSIS Data Flow Components product cannot change this behavior. This process is performed by SSIS itself.

This articles may be useful:
Improving data flow performance with SSIS AutoAdjustBufferSize property
Buffer Management in SSIS Data Flow

Post Reply