Hello,
I have a file containing data about persons/firms and their addresses. I want to split this data into two tables: one for persons, and a second for addresses. The two tables are related by an Address_ID constraint. Several persons may share the same address, so before adding a person I have to check whether the address already exists in the address table. If it exists, I get its ID; otherwise I must add a new address and get its ID. At the moment I do this manually (record by record), but it takes a lot of time.
What is the best solution for this?
Regards
Michal
Import data into two tables
Re: Import data into two tables
Unfortunately, the check whether the address already exists in the address table cannot be skipped before adding the person data.
To speed up the address search on the server, you can add an index on the address name field of the address table.
When performing a search in DataSet, you can use a filter, which works faster than search of record by record, for example:
Code: Select all
UniQuery1.Filter := 'AddressName = ' + QuotedStr('Address');
UniQuery1.Filtered := True;
if not UniQuery1.Eof then
  // the address was found
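Putting the filter together with the insert step, the whole lookup-or-add flow could be sketched like this (a rough Delphi outline, not UniDAC's prescribed method; the table and field names Address_ID/AddressName and the assumption that the generated ID is available after Post, e.g. via DMLRefresh, are illustrative):
Code: Select all
// Sketch: return the ID of an existing address, or append it and
// return the new ID. All names here are illustrative assumptions.
function GetOrAddAddressID(AddressQuery: TUniQuery;
  const AName: string): Integer;
begin
  AddressQuery.Filter := 'AddressName = ' + QuotedStr(AName);
  AddressQuery.Filtered := True;
  try
    if not AddressQuery.Eof then
      // found: reuse the existing ID
      Result := AddressQuery.FieldByName('Address_ID').AsInteger
    else
    begin
      // not found: append the address and read back its ID
      AddressQuery.Append;
      AddressQuery.FieldByName('AddressName').AsString := AName;
      AddressQuery.Post; // record is sent to the server here
      Result := AddressQuery.FieldByName('Address_ID').AsInteger;
    end;
  finally
    AddressQuery.Filtered := False;
  end;
end;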
Re: Import data into two tables
Hello,
Thank you.
Can a filter work on several fields of the record (town, street, house, flat)?
I use PostgreSQL. At the moment the address fields are of type Varchar(60) and a UNIQUE index is built on them. Will it work faster if I change the Varchars to Chars?
At the moment I have a separate function which looks for the address before adding a person. It uses a separate UniQuery and queries for the address.
I can change this to a global dataset which reads all the addresses and looks for an address before adding a person. If no address is found, I will add it with the Append method. Should I refresh the global address query after adding a new record, to make it visible for the next search?
I am also considering using a VirtualTable to store the existing addresses. After adding an address to the database I can add it with its ID into the VT, and then search for addresses in the VT first. This would be used only when importing data from the old database.
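Preloading the cache for that VirtualTable idea could look roughly like this (a sketch assuming UniDAC's TVirtualTable; the VTAddr/AddressQuery components and field names are illustrative):
Code: Select all
// Sketch: copy existing addresses into a TVirtualTable so later
// lookups are purely local. Component and field names are assumptions.
procedure FillAddressCache(AddressQuery: TUniQuery; VTAddr: TVirtualTable);
begin
  VTAddr.Open;
  AddressQuery.First;
  while not AddressQuery.Eof do
  begin
    VTAddr.Append;
    VTAddr.FieldByName('ID').AsInteger :=
      AddressQuery.FieldByName('Address_ID').AsInteger;
    VTAddr.FieldByName('AddressName').AsString :=
      AddressQuery.FieldByName('AddressName').AsString;
    VTAddr.Post;
    AddressQuery.Next;
  end;
end;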
Regards
Michal
Re: Import data into two tables
You can use several fields in a filter condition. Use SQL-92 syntax for this, for example:
Code: Select all
town = 'town1' and (street = 'street1' or street = 'street2') and house = 2 and flat in (10, 20, 30)
The performance of PostgreSQL server indexes built on different data types is not related to UniDAC functionality. Please ask the PostgreSQL developers about this.
When calling the TUniQuery.Append method, UniDAC will first add the record to the table on the server. If PostgreSQL reports that the record was added successfully, UniDAC adds it to the dataset, so the Refresh method is not required. If the record meets the condition of the local filter, you will see it immediately.
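Applied in Delphi code, a multi-field filter of that shape might look like this (a sketch; the field names follow the filter example above and are assumptions about the actual schema):
Code: Select all
// Sketch: filter the dataset on several fields using SQL-92 syntax.
UniQuery1.Filter :=
  'town = ' + QuotedStr('town1') +
  ' and (street = ' + QuotedStr('street1') +
  ' or street = ' + QuotedStr('street2') + ')' +
  ' and house = 2 and flat in (10, 20, 30)';
UniQuery1.Filtered := True;
if not UniQuery1.Eof then
  // at least one matching address is now visible in the dataset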
Re: Import data into two tables
Hello,
Thanks for the answers.
Which method is faster: UniQuery.Locate, or UniQuery.Filter with UniQuery.Filtered?
Regards
Michal
Re: Import data into two tables
Searching for a record in a dataset and filtering the whole dataset are different features, so the question of which of them is faster is not quite correct.
It seems that in your particular task, at least three approaches can be used:
1. Using TUniQuery.Locate
2. Using TUniQuery.Filter with TUniQuery.Filtered
3. Combining the first and the second approaches
Compose a small test project in which the same task is solved using the first, the second, and the third approach, and perform the testing in your environment. The results will show which of the approaches has advantages in your case.
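Such a test project could time the two basic approaches like this (a rough sketch with GetTickCount as a coarse timer; the field names and search values are illustrative):
Code: Select all
// Sketch: compare Locate vs. Filter timing on the same search.
// Values, field names and the timer choice are assumptions.
var
  Start: Cardinal;
begin
  Start := GetTickCount;
  if UniQuery1.Locate('town;street;house;flat',
       VarArrayOf(['town1', 'street1', 2, 10]), []) then
    ; // found via Locate
  WriteLn('Locate: ', GetTickCount - Start, ' ms');

  Start := GetTickCount;
  UniQuery1.Filter := 'town = ''town1'' and street = ''street1''' +
    ' and house = 2 and flat = 10';
  UniQuery1.Filtered := True;
  if not UniQuery1.Eof then
    ; // found via Filter
  UniQuery1.Filtered := False;
  WriteLn('Filter: ', GetTickCount - Start, ' ms');
end;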
Re: Import data into two tables
Hello,
Thanks for your advice.
On the same imported data I tested these options:
1. The addresses are read into a UniQuery indexed by the needed fields, and then I set "Filtered" on these fields: time about 16:53 min (the address is added if it doesn't exist).
2. The addresses are read into a UniQuery indexed by the needed fields, and then I use the "Locate" method on these fields: time about 14:53 min (the address is added if it doesn't exist).
3. The addresses are read into a UniQuery indexed by ID. I create a VirtualTable with 3 fields (ID, key, status). The key field is filled with a string which forms a unique search key (key = Town + '|' + Street + '|' + House + '|' + Flat + '|'). Initially the VT contains data corresponding to the UniQuery. After reading an address from the old database, I locate it by the key in the VT. If the address exists, I get its ID and use it for the person. If the address doesn't exist, I append a new record to the UniQuery, read its ID, then create a new record in the VT and set its status field to "1". The whole process runs in a transaction: when the transaction is committed I change the status field to "0", and when there is an error and the transaction is aborted I change the ID field in the VT to '0' for the records with status = 1.
In this option the time is about 6:31 min.
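The key-based lookup at the heart of option 3 could be sketched like this (a rough outline of the approach described above, not the exact implementation; component and field names are illustrative):
Code: Select all
// Sketch: locate an address by its composite key in the VirtualTable;
// append to the server dataset and mirror into the VT if missing.
function FindOrAddAddress(AddressQuery: TUniQuery; VTAddr: TVirtualTable;
  const Town, Street, House, Flat: string): Integer;
var
  Key: string;
begin
  Key := Town + '|' + Street + '|' + House + '|' + Flat + '|';
  if VTAddr.Locate('key', Key, []) then
    // cache hit: reuse the existing address ID
    Result := VTAddr.FieldByName('ID').AsInteger
  else
  begin
    // cache miss: append to the server-side dataset, read the new ID
    AddressQuery.Append;
    AddressQuery.FieldByName('Town').AsString := Town;
    // ... fill the remaining address fields ...
    AddressQuery.Post;
    Result := AddressQuery.FieldByName('Address_ID').AsInteger;
    // mirror the new address into the VT, marked as uncommitted
    VTAddr.Append;
    VTAddr.FieldByName('ID').AsInteger := Result;
    VTAddr.FieldByName('key').AsString := Key;
    VTAddr.FieldByName('status').AsInteger := 1;
    VTAddr.Post;
  end;
end;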
Regards
Michal