I'm using Devart SSIS Data Flow components (v 1.7.672) for SQL Server => Postgres data migration.
Trying to investigate weak performance of Devart PostgreSQL Lookup component, I profiled queries generated by it. Long story short - I got the following query (object names changed but logic stays the same):
Code: Select all
SELECT col1, col2, col3, col4
FROM my_table
WHERE (col1 = 1) OR
(col1 = 2) OR
(...)
(col1 = 10000)
Problem is, that type of query generates the following execution plan in Postgres (tested on Postgres 9.5 and 10):
Code: Select all
Bitmap Heap Scan on my_table (cost=4686.00..10491.10 rows=1014 width=12) (actual time=3.963..6.279 rows=1020 loops=1)
Recheck Cond: ((col1 = 1) OR (col1 = 2) OR (...) OR (col1 = 10000))
Heap Blocks: exact=760
-> BitmapOr (cost=4686.00..4686.00 rows=1015 width=0) (actual time=3.780..3.780 rows=0 loops=1)
-> Bitmap Index Scan on ix_on_my_table_on_col_1 (cost=0.00..4.43 rows=1 width=0) (actual time=0.030..0.030 rows=1 loops=1)
Index Cond: (col1 = 1)
-> Bitmap Index Scan on ix_on_my_table_on_col_1 (cost=0.00..4.43 rows=1 width=0) (actual time=0.003..0.003 rows=1 loops=1)
Index Cond: (col1 = 2)
(...)
-> Bitmap Index Scan on ix_on_my_table_on_col_1 (cost=0.00..4.43 rows=1 width=0) (actual time=0.003..0.003 rows=1 loops=1)
Index Cond: (col1 = 10000)
Now, if that query is rewritten to the following form (which produces exactly the same results):
Code: Select all
SELECT col1, col2, col3, col4
FROM my_table
WHERE col1 IN (1, 2, (...), 10000)
Code: Select all
Index Scan using ix_on_my_table_on_col_1 on my_table (cost=0.42..14.73 rows=2 width=12) (actual time=0.053..0.059 rows=2 loops=1)
Index Cond: (col1 = ANY ('{1,2,(...), 10000}'::integer[]))
?
While I understand it's a postgres issue, unfortunately in current form lookup component in unusable because of performance reasons :/