How to populate duplicate records in target

venugopal.123 · Post by **venugopal.123** » Mon Jun 28, 2010 12:16 am

Hi All,

I have a source like this:

EmpName EmpNo Sal
venu 123 10000
raju 213 20000
venkat 423 30000

I want the output to look like this:

EmpName EmpNo Sal
venu 123 10000
raju 213 20000
venkat 423 30000
venu 123 10000
raju 213 20000
venkat 423 30000

How can I get duplicate records in the target?
Sometimes I need to populate only some of the columns in the duplicated rows, for example:

EmpName EmpNo Sal
venu 123 10000
raju 213 20000
venkat 423 30000
venu
raju
venkat

How is this possible?

ray.wurlod · Post by **ray.wurlod** » Mon Jun 28, 2010 12:36 am

Welcome aboard.

The easiest way is to:

add an artificial key to the data using a Column Generator stage

create a copy of the data stream using a Copy stage

join the two copies using a Join stage (take care to separate column names)

ersunnys · Post by **ersunnys** » Mon Jun 28, 2010 1:34 am

Hi,

Strange requirement...

There is an alternate way too...

Simply use a Copy stage with Entire partitioning. This creates as many duplicates of your input as there are nodes in your APT configuration file: with a two-node configuration file you get 2 records per input record, with four nodes you get 4 records, and so on.
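As a rough illustration of why the node count drives the number of copies, here is a plain Python simulation (not DataStage code). The function names `entire_partition` and `collect` are made up for this sketch, and it assumes the partitions are collected in node order, which a real parallel job does not guarantee.

```python
from typing import Dict, List

Row = Dict[str, object]


def entire_partition(rows: List[Row], num_nodes: int) -> List[List[Row]]:
    """Simulate Entire partitioning: every node receives a full copy of the input."""
    return [[dict(row) for row in rows] for _ in range(num_nodes)]


def collect(partitions: List[List[Row]]) -> List[Row]:
    """Concatenate the per-node partitions in node order (an assumption of this sketch)."""
    return [row for partition in partitions for row in partition]


source = [
    {"EmpName": "venu", "EmpNo": 123, "Sal": 10000},
    {"EmpName": "raju", "EmpNo": 213, "Sal": 20000},
    {"EmpName": "venkat", "EmpNo": 423, "Sal": 30000},
]

two_node = collect(entire_partition(source, 2))    # 3 input rows x 2 nodes
four_node = collect(entire_partition(source, 4))   # 3 input rows x 4 nodes

for row in two_node:
    print(row["EmpName"], row["EmpNo"], row["Sal"])
```

With two simulated nodes the three source rows appear twice; with four nodes they appear four times, which is why the number of copies is tied to the configuration file.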

Sainath.Srinivasan · Post by **Sainath.Srinivasan** » Mon Jun 28, 2010 2:15 am

Entire partitioning has two issues:
1.) You must restrict the job to 2 nodes, either in the configuration file or within the stage where the duplication must happen.
2.) It cannot produce the second output, where only selected columns are duplicated.

udayk_2007 · Post by **udayk_2007** » Mon Jun 28, 2010 3:47 am

Hi,

You can also get the duplicates using the 'Create Cluster Key Change Column' option in the Sort stage. This option sets the cluster key change column to 1 for the first record of each group and to 0 for the remaining records of that group. Depending on which records you need to capture, choose ascending or descending sort order.

After the Sort stage, you can add a Filter stage to keep all records whose cluster key change column is 0. This way you can capture the duplicates.
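The flag-and-filter logic can be sketched in plain Python (not DataStage code). The helper `add_key_change`, the column name `clusterKeyChange` as used here, and the sample input containing repeated rows are all assumptions made for illustration.

```python
from typing import Dict, List

Row = Dict[str, object]

_NO_PREVIOUS = object()  # sentinel that never equals a real key value


def add_key_change(rows: List[Row], key: str) -> List[Row]:
    """Sort by `key`, then flag the first row of each group with 1 and the rest with 0."""
    ordered = sorted(rows, key=lambda row: row[key])
    flagged = []
    previous = _NO_PREVIOUS
    for row in ordered:
        current = row[key]
        flag = 1 if current != previous else 0
        flagged.append({**row, "clusterKeyChange": flag})
        previous = current
    return flagged


# Hypothetical input that already contains repeated records.
records = [
    {"EmpName": "venu", "EmpNo": 123, "Sal": 10000},
    {"EmpName": "raju", "EmpNo": 213, "Sal": 20000},
    {"EmpName": "venu", "EmpNo": 123, "Sal": 10000},
    {"EmpName": "venkat", "EmpNo": 423, "Sal": 30000},
    {"EmpName": "raju", "EmpNo": 213, "Sal": 20000},
]

flagged = add_key_change(records, "EmpNo")

# Equivalent of the Filter stage: keep only rows whose flag is 0.
duplicates = [row for row in flagged if row["clusterKeyChange"] == 0]

for row in duplicates:
    print(row["EmpName"], row["EmpNo"], row["Sal"])
```

Only the second and later rows of each `EmpNo` group pass the filter, so a key that occurs once (423 here) produces no output row.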

Regards
Ulhas

ray.wurlod · Post by **ray.wurlod** » Mon Jun 28, 2010 3:23 pm

My solution does not require sorting.