Need to drop records



jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

Sort your data so that the combination of F1/F2 values that triggers a drop appears in the first record for each value of keycolumn. In a transformer after the sort, keep track of key changes. When the key changes, examine F1 and F2 for the drop condition and set a stage variable flag accordingly. Use a constraint to keep or drop records based on the flag.
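As a rough illustration of that logic outside DataStage, here is a minimal Python sketch. It assumes each record is a (key, f1, f2) tuple and that the drop condition is f1 = 'N' and f2 = 'N'; the function name `drop_flagged_keys` is made up for the example, and the `drop` variable plays the role of the stage variable flag.

```python
def drop_flagged_keys(records):
    """Drop every record of a key if that key has an ('N', 'N') record.

    records: iterable of (key, f1, f2) tuples (an assumed layout).
    """
    # Sort by key, placing any ('N', 'N') record first within its key.
    # False sorts before True, so the drop-trigger row leads the group.
    ordered = sorted(
        records,
        key=lambda r: (r[0], not (r[1] == "N" and r[2] == "N")),
    )

    kept = []
    prev_key = object()  # sentinel that never equals a real key
    drop = False         # stands in for the stage variable flag
    for key, f1, f2 in ordered:
        if key != prev_key:
            # Key change: only the first record of the group sets the flag.
            drop = (f1 == "N" and f2 == "N")
            prev_key = key
        if not drop:     # stands in for the output link constraint
            kept.append((key, f1, f2))
    return kept
```

In a parallel job the data would also need to be partitioned on the key so that all records for one key reach the same transformer instance; the sketch runs in a single process and does not model that.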

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
soumya5891
Participant
Posts: 152
Joined: Mon Mar 07, 2011 6:16 am

Post by soumya5891 »

You can also do this in the following way:
1. First use a Sort stage, then a Remove Duplicates stage to retain the first row.
2. Then use a Filter stage with the condition f1<>"N" and f2<>"N".
Soumya
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

No, that will not meet the requirements he has specified. He needs to completely remove only those keys that have at least one record where f1 and f2 are both equal to 'N'. All other keys are left alone (nothing is dropped for a key that has no 'N'/'N' record).
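To make the difference concrete, here is a small Python sketch comparing the stated requirement with the sort / remove-duplicates / filter suggestion. It assumes (key, f1, f2) tuples and, since the suggestion does not say which columns to sort on, assumes a sort on key, f1, f2. Both function names are made up for the example.

```python
def required_output(records):
    """The requirement: remove whole keys that have an ('N', 'N') record."""
    bad_keys = {key for key, f1, f2 in records if f1 == "N" and f2 == "N"}
    return [r for r in records if r[0] not in bad_keys]


def dedupe_then_filter(records):
    """Sort, keep the first row per key, then filter f1<>'N' and f2<>'N'."""
    first_per_key = {}
    for rec in sorted(records):            # assumed sort on key, f1, f2
        first_per_key.setdefault(rec[0], rec)
    return [r for r in first_per_key.values() if r[1] != "N" and r[2] != "N"]


sample = [(1, "Y", "N"), (1, "N", "N"), (2, "Y", "Y"), (2, "N", "Y")]
print(required_output(sample))     # both rows of key 2 survive
print(dedupe_then_filter(sample))  # nothing survives
```

On this sample the requirement keeps both rows for key 2, while the dedupe-then-filter flow returns nothing: it keeps at most one row per key, and it also drops rows where only one of f1 or f2 is 'N'.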

Regards,
- james wiles

peddakkagari
Participant
Posts: 26
Joined: Thu Aug 12, 2010 12:07 am

Post by peddakkagari »

1. Split your source data into two flows (main flow and changed flow) from a Transformer stage.
2. Add dummy=1 in the Transformer for the changed flow.
3. Put a Sort stage in the changed flow, sorting on the 3 columns (key, f1 and f2).
4. Put a Filter stage with the condition f1='N' and f2='N'; the output of this stage will contain the keys that have both f1 and f2 equal to 'N'.
5. Use a left outer join between the main flow and the changed flow; the output of this join will be the 3 columns from the main flow plus the dummy column from the changed flow.
6. Use a Transformer with the constraint dummy<>1 and send the output to the target.
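The steps above can be sketched in Python roughly as follows. This is an illustration only: it assumes (key, f1, f2) tuples, the function name `drop_via_left_join` is made up, and a dictionary keyed on the key column stands in for the filtered changed flow.

```python
def drop_via_left_join(records):
    """Fork, filter, left outer join on key, then keep rows with dummy != 1."""
    records = list(records)

    # Changed flow: keys having an ('N', 'N') record, tagged with dummy=1.
    # A dict holds one entry per key, so the join cannot multiply rows.
    changed = {key: 1 for key, f1, f2 in records if f1 == "N" and f2 == "N"}

    # Left outer join: every main-flow row, with dummy=None when unmatched.
    joined = [(key, f1, f2, changed.get(key)) for key, f1, f2 in records]

    # Final constraint: keep only rows that found no match in the changed flow.
    return [(key, f1, f2) for key, f1, f2, dummy in joined if dummy != 1]
```

Two things the sketch glosses over: if the changed flow can carry the same key more than once, de-duplicating it on the key before the join avoids duplicated output rows, and unmatched rows come out of a left outer join with a null dummy, so the final constraint may need explicit null handling.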
Tejas Pujari
Participant
Posts: 14
Joined: Thu Jul 10, 2008 7:37 am
Location: mumbai


Post by Tejas Pujari »

Step 1:

Use a Filter stage with the condition f1 = 'N' and f2 = 'N'.
You will get all key values for which both non-key columns are 'N'.

Step 2:

Use a Lookup stage with the Step 1 output as the reference, joining on the key column.
Use a reject link for lookup failures.
The data on the reject link will be your desired output.
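A rough Python equivalent of the two steps, assuming (key, f1, f2) tuples; the function name `split_by_lookup` is made up, and a set of keys stands in for the reference link.

```python
def split_by_lookup(records):
    """Return (matched, rejected); rejected holds the rows to keep."""
    records = list(records)

    # Step 1: reference data, i.e. the keys with an ('N', 'N') record.
    reference_keys = {key for key, f1, f2 in records if f1 == "N" and f2 == "N"}

    matched, rejected = [], []
    for rec in records:
        # Step 2: look up on the key; a lookup failure goes to the reject link.
        if rec[0] in reference_keys:
            matched.append(rec)
        else:
            rejected.append(rec)
    return matched, rejected
```

The rows in `rejected` are the ones whose key never appears with f1 and f2 both equal to 'N', which is the output the original question asks for.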