surrogate key generation and linking hdr and detail record

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
dsuser1
Participant
Posts: 14
Joined: Thu Oct 16, 2003 5:58 pm

surrogate key generation and linking hdr and detail record

Post by dsuser1 »

Hi,

I have a couple of files to be loaded to UDB data base. The files to be loaded are like order and order details. Order detail file have the corresponding order key value of the order in each of the order detail record. Now in the data base I will need to load the master order to a order master table. I need to generate a surrogate key as the primary for this table. Also the generated surrogate key will be the foriegn key in the order detail table.
Now to do this loading, one method is I load the order table first. Then while loading the order detail, look up to order table with the order key to get the corresponding surrogate key. But since we have large volume of data to be processed, this will increase load time.

Is there any other methods to do this loading to avoid this database look up and get the surrogate ID passed to detail transaction records in the same job?(or any good solution). The input is flat files and target is UDB database. (Also if the DB foreign key constraint is disabled the header and detail can be loaded in parallel right??)

Please throw some light..

Thanks.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Process the parent first and assign the surrogate for each order number and create two output datasets: one a complete dataset of the ready to load data and a second dataset of order number and surrogate key.

Process the children (detail rows) next, merging on order number the previously stored dataset. Create a second dataset of the ready to load data.

If your error counts, integrity checks, and everything else is in order, then load the parents, followed by the children.

I do have to ask, do you have all transactions for the order, or could some transactions come in the next day? If this is the case, you have to retrieve the currently assigned surrogate key for the order number. This requires you to have to address your holistic framework. It will mean you will want to pre-scan your source data collecting the natural keys (order numbers). You'll take that distinct list and join it to your target on natural key, extracting pre-existing orders from the warehouse and staging it a dataset. You'll then have to change your job designs to not assume the order is original, but check to see if the dataset contains a previously assigned surrogate key and use it else assign a new one. Now, processing your details will work because the parent cross reference dataset is truly complete.

Ahhhh, data staging. You gotta love Kimball.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Post Reply