Comparing complete records in Datastage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
nvuradi
Participant
Posts: 54
Joined: Tue Dec 12, 2006 11:03 pm

Comparing complete records in Datastage

Post by nvuradi »

Hi



I have the following columns in my input dataset which is sorted on ColA



colA colB colC colD colE



The data is such that for a single colA it can have multiple values of other columns, for eg the data can be



colA colB colC colD colE

1 1 1 1 1

1 2 1 1 1

1 3 1 1 1

1 4 1 1 1

1 5 1 1 1

2 1 1 1 1

2 2 1 1 1

2 3 1 1 1

2 4 1 1 1

2 5 1 1 1





I need to check the records based on ColA, for value of ColA(1) I need to check the other columns for the next value of colA(2) and so on. If other columns (col B,C,D,E) match exactly for all the colA values then I need to pass only on record to output.

I need to check whether all the records with value of ColA as 1 matched with the all the records with Value of ColA as value other than 1





Basically, I need to compare the current and previous complete records (consisting of 5 columns). If they match, then I have to pass only one such occurrence of the record.



Consider the second scenario

colA colB colC colD colE

1 1 1 1 1

1 2 1 1 1

1 3 1 1 1

1 4 1 1 1

1 5 1 1 1

2 1 1 1 1

2 2 1 1 1

2 3 1 1 1





In this case even though the columns B, C, D, E match for different values of ColA but not all the rows of ColA with value 1 matched with the colA with Value 2, so in this case I need to output 2 records each for different values of ColA.

Any idea how to go about this?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Have you reviewed the capabilities of the Compare stage?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
nvuradi
Participant
Posts: 54
Joined: Tue Dec 12, 2006 11:03 pm

Post by nvuradi »

ray.wurlod wrote:Have you reviewed the capabilities of the Compare stage? ...
Compare stage compares the data from 2 different input links...and my requirement is to compare the data from the same input file.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Misunderstood. You can do it with stage variables in a Transformer stage, constraining the output for an exact match with what's in the buffer. You could probably leverage the Remove Duplicates stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply