how to populate duplicate records in target

venugopal.123
Participant
Posts: 9
Joined: Sun Jun 27, 2010 11:52 pm


Post by venugopal.123 »

Hi All,

I have a source like this:

EmpName EmpNo Sal
venu 123 10000
raju 213 20000
venkat 423 30000

I want the output to look like this:

EmpName EmpNo Sal
venu 123 10000
raju 213 20000
venkat 423 30000
venu 123 10000
raju 213 20000
venkat 423 30000


How can I get duplicate records in the target?
Sometimes I need to populate only some columns in the duplicated rows, for example:

EmpName EmpNo Sal
venu 123 10000
raju 213 20000
venkat 423 30000
venu
raju
venkat

How can this be done?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Welcome aboard.

The easiest way is to:
  • add an artificial key to the data using a Column Generator stage
  • create a copy of the data stream using a Copy stage
  • join the two copies using a Join stage (take care to keep the column names separate)
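
The three steps above can be sketched outside DataStage in plain Python. This is only an illustration of the data flow, not DataStage code: the function names, the `RowKey` column, and the `_2` suffix are hypothetical, and the sketch assumes the artificial key is unique per row, which the post does not specify.

```python
# Sample rows taken from the question.
rows = [
    {"EmpName": "venu", "EmpNo": 123, "Sal": 10000},
    {"EmpName": "raju", "EmpNo": 213, "Sal": 20000},
    {"EmpName": "venkat", "EmpNo": 423, "Sal": 30000},
]


def add_key(rows):
    """Stand-in for the Column Generator step: add an artificial key.

    Assumption: the key is a unique running number per row.
    """
    return [{"RowKey": i, **row} for i, row in enumerate(rows, start=1)]


def copy_stream(rows):
    """Stand-in for the Copy step: produce two independent copies."""
    return [dict(r) for r in rows], [dict(r) for r in rows]


def join_on_key(left, right, key="RowKey", suffix="_2"):
    """Stand-in for the Join step: inner join on the artificial key.

    Columns from the right-hand copy are renamed with a suffix so that
    they do not collide with the left-hand columns.
    """
    right_by_key = {}
    for r in right:
        right_by_key.setdefault(r[key], []).append(r)

    joined = []
    for l in left:
        for r in right_by_key.get(l[key], []):
            merged = dict(l)
            for col, val in r.items():
                if col != key:
                    merged[col + suffix] = val
            joined.append(merged)
    return joined


left, right = copy_stream(add_key(rows))
joined = join_on_key(left, right)
for row in joined:
    print(row)
```

Under the unique-key assumption, each output row carries both copies of the record side by side, with the second copy in the suffixed columns.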
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ersunnys
Participant
Posts: 29
Joined: Wed Sep 13, 2006 1:39 pm
Location: Singapore

Post by ersunnys »

Hi,

Strange requirement... :wink:

There is an alternative way too.

Simply use a Copy stage with Entire partitioning. This creates as many copies of your input as there are nodes in the APT configuration file. With a two-node configuration file you get 2 records per input record, with four nodes 4 records, and so on.
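
The effect described above can be simulated in a few lines of Python. This is a rough model only, not DataStage code: `entire_partition` and `collect` are hypothetical helpers, and the sketch assumes the partitions are collected node by node, which a real parallel job may not guarantee.

```python
# Sample rows taken from the question.
rows = [
    ("venu", 123, 10000),
    ("raju", 213, 20000),
    ("venkat", 423, 30000),
]


def entire_partition(rows, node_count):
    """Model of Entire partitioning: every node receives the full input."""
    return [list(rows) for _ in range(node_count)]


def collect(partitions):
    """Gather all partitions back into one stream, node by node."""
    return [row for partition in partitions for row in partition]


two_nodes = collect(entire_partition(rows, 2))
four_nodes = collect(entire_partition(rows, 4))

print(len(two_nodes))   # 6 rows: each input record appears twice
print(len(four_nodes))  # 12 rows: each input record appears four times
```

The output size depends entirely on the node count, so the same job gives different results under different configuration files.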
Regards,
Sunny Sharma.
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Entire partitioning has two issues:
1.) The node count must be restricted to 2, either in the configuration file or on the stage where the duplication happens.
2.) It cannot produce the second case, where only selected columns are duplicated.
udayk_2007
Participant
Posts: 72
Joined: Wed Dec 12, 2007 2:29 am

Post by udayk_2007 »

Hi,

You can also get the duplicates using the 'Create Cluster Key Column' option in the Sort stage. This option sets the Cluster Key column to 1 for the first record of each group and to 0 for the remaining records of that group. Depending on which records you need to capture, choose ascending or descending sort order.

After the Sort stage, add a Filter stage that keeps all records whose cluster key column value is 0. This way you capture the duplicates.
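
As a minimal sketch of that sort-then-filter logic in plain Python (not DataStage code): `sort_with_key_change` is a hypothetical helper, the grouping key is assumed to be EmpNo, and the input is assumed to already contain repeated rows.

```python
# Input that already contains each employee twice (an assumption for this sketch).
rows = [
    ("venu", 123, 10000),
    ("raju", 213, 20000),
    ("venkat", 423, 30000),
    ("venu", 123, 10000),
    ("raju", 213, 20000),
    ("venkat", 423, 30000),
]


def sort_with_key_change(rows, key):
    """Sort the rows and flag the first row of each key group.

    Returns (row, flag) pairs where flag is 1 for the first record of a
    group and 0 for every later record in the same group.
    """
    flagged = []
    previous = object()  # sentinel that never equals a real key
    for row in sorted(rows, key=key):
        current = key(row)
        flagged.append((row, 1 if current != previous else 0))
        previous = current
    return flagged


# Group on EmpNo (the second field).
flagged = sort_with_key_change(rows, key=lambda r: r[1])

# Filter step: keep only rows whose flag is 0, i.e. the repeats.
duplicates = [row for row, flag in flagged if flag == 0]
print(duplicates)
```

Note that this logic picks out repeats that are already present in the input; it does not create new copies of a row.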

Regards
Ulhas
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

My solution does not require sorting.