
Logic Required

Posted: Wed Jun 01, 2011 1:36 am
by udayk_2007
Hi friends

Can you please help me with the logic for the following requirement?

Source
1
2
2
3
4
4
5

Target 1
1
3
5

Target 2
2
2
4
4

Basically, the records with duplicate values should go to one file and the unique records should be collected in another file.

Thanks in Advance

Regards
Ulhas

Posted: Wed Jun 01, 2011 1:45 am
by jastichoudaryuu
Using the Aggregator stage to count rows will work.

Posted: Wed Jun 01, 2011 3:26 am
by udayk_2007
Using the Aggregator's count of rows, we can separate the unique and duplicate values, but my requirement is directing the records to different files.

I guess, using an Aggregator followed by a Filter, we will get the output as:

File1
1
3
5

File2
2
4

My requirement is:

File1
1
3
5

File2
2
2
4
4

Posted: Wed Jun 01, 2011 3:41 am
by peddakkagari
Process your records in two flows: a main flow and a changed flow.

In the changed flow, use an Aggregator and find the count by grouping on the key; the output from the Aggregator will be (key, count).

Then use a Join stage to join the main-flow and changed-flow records based on the key, returning the count from the changed flow (from the Aggregator).

Next, use a Filter to route the rows based on the count: if count = 1 then the first target, else the second target.
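
A minimal pandas sketch of that aggregate/join/filter pipeline (pandas stands in for the DataStage stages here; the column name "key" and the use of pandas at all are my assumptions, not part of the original job design):

    import pandas as pd

    # Main flow: the source rows from the original post
    df = pd.DataFrame({"key": [1, 2, 2, 3, 4, 4, 5]})

    # Changed flow: Aggregator -- group on key, output (key, count)
    counts = df.groupby("key").size().reset_index(name="cnt")

    # Join stage: join the main flow back to the counts on the key
    joined = df.merge(counts, on="key", how="left")

    # Filter stage: route each row on its count
    target1 = joined.loc[joined["cnt"] == 1, "key"].tolist()  # [1, 3, 5]
    target2 = joined.loc[joined["cnt"] > 1, "key"].tolist()   # [2, 2, 4, 4]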

Thanks,
Sreekanth

Posted: Wed Jun 01, 2011 9:15 am
by Ravi.K
It could be solved with the help of the fork-join concept as well.

Split the data into two streams. In one stream, use an Aggregator stage and count the records.

In the second stream, use the aggregated counts as a lookup on the matching keys to derive each record's count of occurrences.

Next, use a constraint on the count to direct the unique records into one file and the non-unique records into another file.
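
A dict-based Python sketch of that fork-join/lookup flow (collections.Counter stands in for the Aggregator and a dictionary lookup for the Lookup stage; all names are illustrative):

    from collections import Counter

    rows = [1, 2, 2, 3, 4, 4, 5]        # source data from the post

    # Stream 1: Aggregator -- occurrence count per key, used as the lookup reference
    lookup = Counter(rows)

    # Stream 2: each source row looks up its count; the constraint routes it
    unique_file, duplicate_file = [], []
    for row in rows:
        (unique_file if lookup[row] == 1 else duplicate_file).append(row)

    print(unique_file)     # [1, 3, 5]
    print(duplicate_file)  # [2, 2, 4, 4]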

Posted: Wed Jun 08, 2011 2:50 am
by Bicchu
I have modified the solution a little bit:

1. Read the file with the help of a sequential file stage.

2. Next, we sort the data with the help of a Sort stage on the key column and set Create Key Change Column to True.

3. Now we will be using a Transformer stage to separate the data.
On the input link of the Transformer, we hash-partition the data and also perform an ascending sort on both columns (the key column and the key change column). Sorting the key change column ascending means that within each key group the duplicate rows (key change = 0) arrive before the group's key change = 1 row. We will be using three stage variables in the transformer:

a. New
b. Counter
c. Old

The New and Old stage variables are both derived from the key change column coming from the Sort stage; because stage variables are evaluated in the order listed, Old still holds the previous row's key change value when Counter is computed.
The Counter stage variable gets the following derivation:

If New = 0 Then 1 Else If Old = 0 And New = 1 Then 1 Else 0

On the two output links from the Transformer, give the following constraints:

For Unique Record Link : Counter = 0
For Duplicate Record Link: Counter > 0
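
A stand-alone Python simulation of that design (illustrative assumptions: everything runs in a single partition, so the hash partitioning is skipped, and Old is initialised to 1):

    rows = [1, 2, 2, 3, 4, 4, 5]                 # source data from the post

    # Sort stage: order by key and emit keyChange = 1 on the first row of each group
    srt = sorted(rows)
    with_kc = [(k, 1 if i == 0 or k != srt[i - 1] else 0) for i, k in enumerate(srt)]

    # Transformer input link: re-sort ascending on (key, keyChange) so that a group's
    # duplicate rows (keyChange = 0) precede its keyChange = 1 row
    with_kc.sort(key=lambda r: (r[0], r[1]))

    unique_link, duplicate_link = [], []
    old = 1                                      # stage variable Old (assumed initial value)
    for key, kc in with_kc:
        new = kc                                 # stage variable New
        counter = 1 if new == 0 else (1 if old == 0 and new == 1 else 0)
        (duplicate_link if counter > 0 else unique_link).append(key)
        old = new                                # Old picks up this row's key change value

    print(unique_link)     # [1, 3, 5]
    print(duplicate_link)  # [2, 2, 4, 4]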

Posted: Thu Jun 09, 2011 2:56 am
by vishal_rastogi
so u have used the aggregator and u r taking the record count of it in a transformer. You can apply two constraints, record count = 1 and record count > 1; through the first you will get the unique records.
On the other link, add a Lookup stage and take a lookup against the same file u r using initially. Use the lookup with the drop option; the link coming from the transformer should be the reference link and the other one should be the primary link. You will get the duplicate records in another file.

Posted: Thu Jun 09, 2011 5:16 am
by ray.wurlod
U is one of our posters, and not the one who posted this question. The second person personal pronoun in English is spelled "you". The present tense of the verb "to be" is spelled "are". Please use a professional standard of written English on DSXchange; among other things doing so makes life easier for those of our participants whose first language is other than English.

Re: Logic Required

Posted: Thu Jun 09, 2011 5:42 am
by ds_dwh
Hi,

Take an Aggregator stage and group on the column, then take a Filter and write the filter conditions as:

count on the grouped column = 1 -> target1
count on the grouped column > 1 -> target2

Re: Logic Required

Posted: Thu Jun 09, 2011 11:35 am
by soumya5891
Let's say your key column is A.
1. First, group by A in an Aggregator stage to determine the count.
2. Write the count = 1 records to one file.
3. On another link, send the count > 1 records.
4. Now join the record set retrieved in step 3 with the source file.
5. Write the output of the above to a file (see the sketch below).
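
A minimal Python sketch of those steps (collections.Counter plays the Aggregator; the names are illustrative):

    from collections import Counter

    rows = [1, 2, 2, 3, 4, 4, 5]                 # source; the record here is just the key A
    counts = Counter(rows)                       # step 1: group by A, count

    # step 2: count = 1 keys can be written out directly (this works because the whole
    # record is the key; with more columns these keys would need joining back as well)
    file1 = sorted(k for k, c in counts.items() if c == 1)

    # steps 3-4: count > 1 keys joined back to the source keep every duplicate copy
    dup_keys = {k for k, c in counts.items() if c > 1}
    file2 = [r for r in rows if r in dup_keys]

    print(file1)  # [1, 3, 5]
    print(file2)  # [2, 2, 4, 4]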

Hope you will get the desired output.