Combining multiple column and record data into single output

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
mgray412
Participant
Posts: 12
Joined: Tue Jan 18, 2011 10:18 am

Combining multiple column and record data into single output

Post by mgray412 »

I have a problem where I am trying to get the unique values of
multiple columns from a group of records and store the
concatenated unique values into a single column in the output.
In the example below, if any of the values of col1 - col5 change
within any group of records (d-num) then the output would be an error
record. Records for d-num 10 and 20 would be errors where d-num 30
records would be acceptable. Any ideas on how to do this?

I am using DataStage v8.5 Parallel job.

Input records:

Rec d-num col1 col2 col3 col4 col5
1 10 a a a a a
2 10 a a a
3 10 a b b a
4 10 c c c c c

5 20 d d d
6 20 e
7 20

8 30 h h h h
9 30 h h h
10 30 h h

Output should be:

Rec d-num outputcol
1 10 a b c
2 20 d e


Thanks in advance for your help
tbharathkumar
Participant
Posts: 26
Joined: Mon Aug 27, 2007 6:27 am
Location: Des Moines

Re: Combining multiple column and record data into single ou

Post by tbharathkumar »

Quick design,

---- remove dup (Col1)-----
---- remove dup (Col2)-----
input ---> copy ---- remove dup (Col3)--- transformer- pivot ----> Output
---- remove dup (Col4)-----
---- remove dup (Col5)-----

This may help, Pls improve the design.
Regards,
Bharath Tipirisetty
Post Reply