which stage to use

prasad111 · Post by **prasad111** » Thu May 31, 2007 12:39 pm

Hi All,

How to implement this

Ipt_Col1 | Ipt_Col2 | Ipt_Col3

1 | XYZ | 104
1 | ABC | 207
1 | DEF | 302
2 | MNO | 403
2 | PQR | 560
2 | UVW | 30

Transformation rule is, whenever there is a ABC in Ipt_Col2 then the value associated with ABC is replaced in all the columns of Ipt_Col3 with the corresponding column of Ipt_Col1.

In the above file there is a combination of

1 | ABC | 207

so the combination of the above will become

1 | XYZ | 207
1 | ABC | 207
1 | DEF | 207
the rest will be
2 | MNO | 0
2 | PQR | 0
2 | UVW | 0

The complete output should look like

Opt_Col1 | Opt_Col2 | Opt_Col3

1 | XYZ | 207
1 | ABC | 207
1 | DEF | 207
2 | MNO | 0
2 | PQR | 0
2 | UVW | 0

Thank you for your responses
Prasad

DSguru2B · Post by **DSguru2B** » Thu May 31, 2007 12:44 pm

SO in the entire dataset, if you find ABC in col2, then change col3 value of all associated with identical first col, and the rest of the data will have zero? Did I understand you correctly?

prasad111 · Post by **prasad111** » Thu May 31, 2007 12:51 pm

DSguru2B wrote:SO in the entire dataset, if you find ABC in col2, then change col3 value of all associated with identical first col, and the rest of the data will have zero? Did I understand you correctly?

Yes thats correct.

To more elaboarte, I have added one more set of example
Input
1 | XYZ | 207
1 | ABC | 207
1 | DEF | 207
2 | MNO | 0
2 | PQR | 0
2 | UVW | 0
3 | HIJ | 412
3 | ABC | 12
3 | LMN | 5123

Output
1 | XYZ | 207
1 | ABC | 207
1 | DEF | 207
2 | MNO | 0
2 | PQR | 0
2 | UVW | 0
3 | HIJ | 12
3 | ABC | 12
3 | LMN | 12

Thanks
Prasad

DSguru2B · Post by **DSguru2B** » Thu May 31, 2007 1:04 pm

A single stage wont be enough. But you can design a job to do this.
Pass your dataset through filter stage and filter on the second column value = "ABC". This will give you the entire record for every key that has ABC. Load into a lookup file set if data size is small else load it into a dataset (say FilteredResult).
Now start with your original dataset and do a lookup on "FilteredResult". If there is a match, change the third column of input dataset with the third column of "FilteredResult". The joining key will be the first and second column.
The records that fail the join, replace their third column value with 0.