Need help in logic implementation

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
moupanja
Premium Member
Premium Member
Posts: 6
Joined: Thu Aug 31, 2006 6:25 am

Need help in logic implementation

Post by moupanja »

I'm working on a very tricky requirement but I'm really not getting any idea on how to go about it.
Here is the scenario~

Input File:

Bank_Name Cycle AccountNo State Status Balance
ABC 1 123 NJ P 400
ABC 1 234 NJ P 789
ABC 1 567 NJ X 678
ABC 1 567 NJ C 678
ABC 1 789 NJ C 999
ABC 1 666 MI V 899
EFG 1 777 NA C 777
Condition File:

Case_No Condition
-------- ---------
CA1 State=NJ
CA2 Status=C
CA3 Status=V


Output:

Bank_Name Cycle CaseNo AccountNo
ABC 1 CA1 123
ABC 1 CA1 234
ABC 1 CA2 567
ABC 1 CA2 789
ABC 1 CA3 666
EFG 1 CA2 777

Basically we have to fetch 2 records per bank per cycle per criteria. Thus even though row 2 was satisfying condition CA1 but it is not propagated in the output because we have already 2 such cases for that particular bank and cycle.
Row 3 & 4 were satisfying condition CA1 and CA2 both, but since we have already fetched 2 records for CA1, row 3 &4 will be tagged to the next condition that it satisfies, in this case it is CA2 and so on.
Please help me out!
priyadarshikunal
Premium Member
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

This can be done using sort and transformer, you can assign count based on keys (bank, cycle, criteria) using stage variables and then release only records having count as 1 or 2.

If you don't want to fetch data, assuming you have oracle 10g or above, you can use rank function over partition by clause to do the same thing. but this depends on the database you are using.
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
moupanja
Premium Member
Premium Member
Posts: 6
Joined: Thu Aug 31, 2006 6:25 am

Post by moupanja »

Sorry my input file looks like this~
Bank_Name Cycle AccountNo State Status Balance
ABC 1 123 NJ P 400
ABC 1 234 NJ P 789
ABC 1 888 NJ X 878
ABC 1 567 NJ C 678
ABC 1 789 NJ C 999
ABC 1 666 MI V 899
EFG 1 777 NA C 777

Thus even though row 3 was satisfying condition CA1 but it is not propagated in the output because we have already 2 such cases for that particular bank and cycle.
Row 4 & 5 were satisfying condition CA1 and CA2 both, but since we have already fetched 2 records for CA1, row 4&5 will be tagged to the next condition that it satisfies, in this case it is CA2 and so on.

Both the input and output files are flat files, we are not handling with any database as such.
The input file is very huge and we have been given a condition file which has around 400 such criterias.
My question is how can I tag a criteria for each record in input and then filter out just 2 records per bank per cycle per criteria and load it in the output.
I have tried out with sort and transformer but o/p is not coming as expected.
I've first sort with bank_name and cycle. Then declared 3 stage variables :
ChkCaseNo= If STATE='NJ' and counter <=2 Then 'CA1' Else
If STATUS='C' and counter <=2 Then 'CA2' Else
If STATUS='V' and counter <=2 Then 'CA3' Else 'NA'
PrevCaseNo= ChkCaseNo
Counter= If PrevCaseNo=ChkCaseNo Then (Counter+1) Else 0

But o/p is not coming as expected.
Thanks
Moumita
Post Reply