Validate all the column data in a Table

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
Terala
Premium Member
Premium Member
Posts: 73
Joined: Wed Apr 06, 2005 3:04 pm

Validate all the column data in a Table

Post by Terala »

Hi,
I have a job where i get input in flat file, i need to check the input data againest target DB2 table based on key values, if input row exists in DB2 table validate all the columns data, if any difference - update it in DB2 table, if row not exists insert as a new row.

Thanks in advance!
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Are you performing a slowly-changing-dimension?

Search for that topic in the forum for more info.
Terala
Premium Member
Premium Member
Posts: 73
Joined: Wed Apr 06, 2005 3:04 pm

Post by Terala »

no we are not using slowly-changing-dimension.
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Even then, you can search for that topic as it contains the steps to achieve your requirement.

You may have to decide on which columns you wish to track because finding changes on even one column - especially if it is a varchar - can be tedious and time consuming during execution.
ketfos
Participant
Posts: 562
Joined: Mon May 03, 2004 8:58 pm
Location: san francisco
Contact:

Post by ketfos »

Hi,
You can use in the taget stage - Update Action - Update Existing rows or Insert News.

Thks
Ketfos
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

I agree with keftos. If you want to update change to even one column, you may be better-off performing 'insert new or update existing rows' or vice-versa as that will give you the flexibility to track any change in any column.
Terala
Premium Member
Premium Member
Posts: 73
Joined: Wed Apr 06, 2005 3:04 pm

Post by Terala »

Thanks for the info,
But the solution to use Update or Insert option will hit the data base for each and every record, and i guess sometimes for each record multiple times.. so, i believe it is extensive DB call process. Please advise me if i am wrong.
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

As you are attempting to check for any changes in even one column, you are in a better situation to decide this.

Also you can improve performance using hash files, checksum etc. Search this forum for more info.
elavenil
Premium Member
Premium Member
Posts: 467
Joined: Thu Jan 31, 2002 10:20 pm
Location: Singapore

Post by elavenil »

Based on key value in the input file do a left outer join (Select all records from your input) and matching rows from target table. Based on the value of key columns the row existence can be determined and find the difference in existing records using Change capture stage and update the records if there is any change. new records can be inserted into target table.

HTWH.

Regards
Saravanan
lebos
Participant
Posts: 74
Joined: Mon Jan 20, 2003 10:02 am
Location: USA

Post by lebos »

1. Extract all (or some portion if you can predetermine what portion you will need) data from your target table and create two hash files: KeyHash with only key fields, and CDCHash with all fields (and with all defined as key fields).

2. To determine if a row already exists, do a lookup against the KeyHash file in one Transform.

3. If a row already exists, to validate it against current data do a lookup against the CDCHash in another Transform.

This method is detailed in the Best Practices class and document. Searching this forum and/or the ADN forum might find more detail also.

Good luck!!

Larry
Post Reply