Page 1 of 1

Multiple Keys join of Files

Posted: Tue Nov 15, 2005 8:30 am
by maffan76
Hi,
I have a requirement that i want to join 2 data files having multiple keys a primary key.

I have read posts suggesting make one file as HASH and then joining with other file but as per documentation HASH file can have only 1 column as key. So how would i achieve this?

Secondly, Whats the best practice in DS of joining multiple tables as i am using TD and the Stage for fecthing large volume is not that efficient.

What if the size of the table is quite huge then whats the best strategy of joining these kind of tables.

Thanks in advance.

Re: Multiple Keys join of Files

Posted: Tue Nov 15, 2005 8:35 am
by chulett
maffan76 wrote:I have read posts suggesting make one file as HASH and then joining with other file but as per documentation HASH file can have only 1 column as key.
Not sure where you read that, but it is incorrect... hashed file stages support multiple key columns.

Posted: Tue Nov 15, 2005 8:57 am
by maffan76
Below is the line from the Server JOb Developer Guide/HASH file Stage
The primary role of a Hashed File stage is as
a reference table based on a single key field.
I think its says about the primary function, i think then i have mistaken it.

Posted: Tue Nov 15, 2005 8:59 am
by chulett
Exactly. While that may be the 'primary' function, there are many secondary functions it supports and these can include multiple key fields.

Posted: Tue Nov 15, 2005 9:20 am
by maffan76
Thanks for the clarification,

Now coming to the second part of the question is that if have to join source table data coming in a file with some target table in database, whats the optimal way of doing it, i mean one ways is to make a hash file and then join those or is there anyother way of doing it?

eventually have to generate a surrogate key of for the new records and assign exisitng generated key to the existing record while joining the 2 data sets with multiple keys as joining criteria.

the size of the existing table is quite big and will continue to grow so how do we handle this kind of situation in DS because exporting the data on every run of the job to HASH file will be time consuming.