PostgreSQL lookup - inefficient query
Posted: Thu 12 Jul 2018 10:27
Hi,
I'm using Devart SSIS Data Flow components (v 1.7.672) for SQL Server => Postgres data migration.
Trying to investigate weak performance of Devart PostgreSQL Lookup component, I profiled queries generated by it. Long story short - I got the following query (object names changed but logic stays the same):
- there are 10000 OR conditions (this seems to be batch size for Lookup component).
Problem is, that type of query generates the following execution plan in Postgres (tested on Postgres 9.5 and 10):
Basically, for those 10000 conditions I get 10000 index scans which is.... suboptimal to say the least ;)
Now, if that query is rewritten to the following form (which produces exactly the same results):
I get way better plan - one that executes in milliseconds.
My question is - any chance for changes in Lookup component for Postgres that would generate <column> in (all values) query instead of <column> = value OR <column> = value OR....
?
While I understand it's a postgres issue, unfortunately in current form lookup component in unusable because of performance reasons :/
I'm using Devart SSIS Data Flow components (v 1.7.672) for SQL Server => Postgres data migration.
Trying to investigate weak performance of Devart PostgreSQL Lookup component, I profiled queries generated by it. Long story short - I got the following query (object names changed but logic stays the same):
Code: Select all
SELECT col1, col2, col3, col4
FROM my_table
WHERE (col1 = 1) OR
(col1 = 2) OR
(...)
(col1 = 10000)
Problem is, that type of query generates the following execution plan in Postgres (tested on Postgres 9.5 and 10):
Code: Select all
Bitmap Heap Scan on my_table (cost=4686.00..10491.10 rows=1014 width=12) (actual time=3.963..6.279 rows=1020 loops=1)
Recheck Cond: ((col1 = 1) OR (col1 = 2) OR (...) OR (col1 = 10000))
Heap Blocks: exact=760
-> BitmapOr (cost=4686.00..4686.00 rows=1015 width=0) (actual time=3.780..3.780 rows=0 loops=1)
-> Bitmap Index Scan on ix_on_my_table_on_col_1 (cost=0.00..4.43 rows=1 width=0) (actual time=0.030..0.030 rows=1 loops=1)
Index Cond: (col1 = 1)
-> Bitmap Index Scan on ix_on_my_table_on_col_1 (cost=0.00..4.43 rows=1 width=0) (actual time=0.003..0.003 rows=1 loops=1)
Index Cond: (col1 = 2)
(...)
-> Bitmap Index Scan on ix_on_my_table_on_col_1 (cost=0.00..4.43 rows=1 width=0) (actual time=0.003..0.003 rows=1 loops=1)
Index Cond: (col1 = 10000)
Now, if that query is rewritten to the following form (which produces exactly the same results):
Code: Select all
SELECT col1, col2, col3, col4
FROM my_table
WHERE col1 IN (1, 2, (...), 10000)
Code: Select all
Index Scan using ix_on_my_table_on_col_1 on my_table (cost=0.42..14.73 rows=2 width=12) (actual time=0.053..0.059 rows=2 loops=1)
Index Cond: (col1 = ANY ('{1,2,(...), 10000}'::integer[]))
?
While I understand it's a postgres issue, unfortunately in current form lookup component in unusable because of performance reasons :/