Logic Required

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

udayk_2007
Participant
Posts: 72
Joined: Wed Dec 12, 2007 2:29 am

Logic Required

Post by udayk_2007 »

Hi friends,

Can you please help me with the logic for the following requirement?

Source
1
2
2
3
4
4
5

Target 1
1
3
5

Target 2
2
2
4
4

Basically, the records with duplicate values should go to one file, and the unique records should be collected in another file.
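Outside DataStage, the required split can be sketched in a few lines of Python — a minimal illustration of the routing rule only; the function and variable names are my own:

```python
from collections import Counter

def split_unique_dup(rows):
    """Route values occurring exactly once to target1, the rest to target2."""
    counts = Counter(rows)
    target1 = [r for r in rows if counts[r] == 1]   # unique values
    target2 = [r for r in rows if counts[r] > 1]    # duplicated values, all occurrences kept
    return target1, target2

source = [1, 2, 2, 3, 4, 4, 5]
target1, target2 = split_unique_dup(source)
print(target1)  # [1, 3, 5]
print(target2)  # [2, 2, 4, 4]
```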

Thanks in Advance

Regards
Ulhas
jastichoudaryuu
Participant
Posts: 12
Joined: Mon Jan 31, 2011 7:19 am

Post by jastichoudaryuu »

Using the Aggregator's count of rows will work.
anilkumar
udayk_2007
Participant
Posts: 72
Joined: Wed Dec 12, 2007 2:29 am

Post by udayk_2007 »

Using the Aggregator's count of rows, we can separate unique and duplicate values, but my requirement is directing the records to different files.

I guess, using an Aggregator followed by a Filter, we will get the output as:

File1
1
3
5

File2
2
4

My requirement is:

File1
1
3
5

File2
2
2
4
4
peddakkagari
Participant
Posts: 26
Joined: Thu Aug 12, 2010 12:07 am

Post by peddakkagari »

Process your records in two flows: a main flow and a changed flow.

In the changed flow, use an Aggregator and find the count by grouping on the key; the output from the Aggregator will be (key, count).

Then use a Join stage to join the main flow and changed flow records based on the key, returning the count from the changed flow (from the Aggregator).

Next, use a Filter to route the rows based on the count: if count = 1 then the first target, else the second target.
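As a hedged sketch, the aggregate–join–filter flow above can be mirrored in plain Python (the function and variable names are illustrative, not DataStage stage names):

```python
from itertools import groupby

def fork_join_route(rows, key=lambda r: r):
    # Changed flow: Aggregator -> one (key, count) row per group
    # (groupby needs sorted input, mirroring a sorted aggregation)
    counts = {k: len(list(g))
              for k, g in groupby(sorted(rows, key=key), key=key)}
    # Join stage: attach the count from the changed flow to each main-flow row
    joined = [(r, counts[key(r)]) for r in rows]
    # Filter stage: count = 1 -> first target, else second target
    first = [r for r, c in joined if c == 1]
    second = [r for r, c in joined if c > 1]
    return first, second
```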

Thanks,
Sreekanth
Ravi.K
Participant
Posts: 209
Joined: Sat Nov 20, 2010 11:33 pm
Location: Bangalore

Post by Ravi.K »

It could be solved with the help of the fork-join concept as well.

Split the data into two streams. In one stream, use an Aggregator stage and count the records.

In the second stream, take the aggregated counts as a lookup on the matching keys and derive the count of occurrences.

At the next level, use a constraint based on the count to direct unique records into one file and non-unique records into another file.
Cheers
Ravi K
Bicchu
Participant
Posts: 26
Joined: Sun Oct 03, 2010 10:49 pm
Location: India

Post by Bicchu »

I have modified the solution a little bit:

1. Read the file with the help of a sequential file stage.

2. Next, we sort the data with the help of a Sort stage on the key column and set Create Key Change Column to True.

3. Now we will use a Transformer stage to separate the data.
In the input link of the transformer, we will hash-partition the data and also perform an ascending sort on both columns (the key column and the key change column). We will use three stage variables in the transformer:

a. New
b. Counter
c. Old

The New and Old stage variables will be fed by the key change column coming from the Sort stage, and in the Counter stage variable we will give the following condition:

If New=0 Then 1 Else If Old = 0 and New = 1 Then 1 Else 0

In the two output links from the transformer, we will give the following constraints:

For Unique Record Link : Counter = 0
For Duplicate Record Link: Counter > 0
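The key-change idea can be sketched outside DataStage as follows. Instead of the stage-variable bookkeeping, this hedged Python illustration uses a one-row lookahead on the key-change flag, which carries the same information (the names are my own, not DataStage's):

```python
def split_with_key_change(rows):
    data = sorted(rows)  # Sort stage, ascending on the key column
    # Create Key Change Column: 1 on the first row of each key group, else 0
    key_change = [1 if i == 0 or data[i] != data[i - 1] else 0
                  for i in range(len(data))]
    unique, dup = [], []
    for i, row in enumerate(data):
        starts_group = key_change[i] == 1
        next_starts = i + 1 == len(data) or key_change[i + 1] == 1
        # A row is unique only if it starts a group and the group ends immediately
        (unique if starts_group and next_starts else dup).append(row)
    return unique, dup
```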
Thanks,
Pratik.
vishal_rastogi
Participant
Posts: 47
Joined: Thu Dec 09, 2010 4:37 am

Post by vishal_rastogi »

so u have used the aggregator and u r taking its record count in the transformer. You can apply two constraints, record count > 1 and record count = 1; through this you will get the unique records.
In another link, add a Lookup stage and take a lookup against the same file u r using initially, with the drop option; the link coming from the transformer should be the lookup link and the other one should be the primary link. You will get the duplicate records in another file.
Vish
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

U is one of our posters, and not the one who posted this question. The second person personal pronoun in English is spelled "you". The present tense of the verb "to be" is spelled "are". Please use a professional standard of written English on DSXchange; among other things doing so makes life easier for those of our participants whose first language is other than English.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ds_dwh
Participant
Posts: 39
Joined: Fri May 14, 2010 6:06 am

Re: Logic Required

Post by ds_dwh »

Hi,

Take an Aggregator stage and group on the column, then take a Filter and write the filter conditions as:
group by col = 1 -> target1
group by col > 1 -> target2
ANJI
soumya5891
Participant
Posts: 152
Joined: Mon Mar 07, 2011 6:16 am

Re: Logic Required

Post by soumya5891 »

Let's say your key column is A.
1. First, group by A in an Aggregator stage to determine the count.
2. Write the count = 1 rows to one file.
3. In another link, send the count > 1 rows.
4. Now join the record set retrieved in step 3 with the source file.
5. Write the output of the above to a file.

Hope you will get the desired output
Soumya