Data Comparision

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
suri
Participant
Posts: 24
Joined: Tue May 25, 2004 12:17 am
Location: piscataway
Contact:

Data Comparision

Post by suri »

Hi ,
Does anybody tell me to how to do compare the source and target data and reject the records into a file which does not fond a match and also the count of the number or records matched and the count of the records that are not matched.
Thanks
ketfos
Participant
Posts: 562
Joined: Mon May 03, 2004 8:58 pm
Location: san francisco
Contact:

Post by ketfos »

Hi,

You can have two inputs to transformer.
1. Source table
2. Target table (as a lookup)

Match the keys in two tables
Use Isnull fucntion on this match

have two outputs from transformer
1. Match records
2. Non match records

Depending upon output of Isnulll fucntion direct to match or not match file.
use @outrownum fucntion to find records written to each link.


Ketfos
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

How is the match to be performed? For the sake of example, let's assume only the key is to be matched.

Optionally pre-load the target's keys into a hashed file. This is for performance, not functionality.

Construct a job that reads the source, performs a lookup against the target, and writes to either of two outputs ("matched" and "unmatched").

To determine which output link to use, constraint expressions are used to determine whether the lookup succeeded - you can use the input link variable NOTFOUND, or an IsNull on the reference input key column.

DataStage itself gives you the count; you can include @OUTROWNUM as a column on each link (the final row contains the count), you can get the counts from the job log, you can write routines that determine the link row counts.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
suri
Participant
Posts: 24
Joined: Tue May 25, 2004 12:17 am
Location: piscataway
Contact:

Post by suri »

Thanks for the help,
Here I need to do the comparision between each and every column of the source table with each and every column of the target table and if there is any difference in atleast column also ,the whole record should be rejected .
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

You can extract the records into a seq file and use a Unix 'diff' or 'cmp' command.

Alternatively, you can lookup source records into the target using the key and matching the checksum value.
amsh76
Charter Member
Charter Member
Posts: 118
Joined: Wed Mar 10, 2004 10:58 pm

Post by amsh76 »

How about comparing key columns and running, CRC32 on Non key columns. The best way would be Unix Cmp or Diff command...if files.

Regards
Post Reply