Logic required

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
ajithaselvan
Participant
Posts: 75
Joined: Mon Jul 12, 2010 4:11 am
Location: Chennai

Logic required

Post by ajithaselvan »

Hi,
below is my input.
CUS_TYPE,CUS_ID,REC_TYPE,DATE,DETAIL
C, 123, 1, 2011/02/28, XXXA
C, 123, 2, 2011/02/28, XXXB
C, 123, D, 2011/02/28, XXXC
C, 123, D, 2011/02/28, XXXD
C, 100, 1, 2011/02/28, AAA
C, 100, 2, 2011/02/28, AAA1
C, 100, D, 2011/02/28, AAA2

based on filed(CUS_ID),i need to do look up on other table
and fetch few fields (Eg. BNK,PORT,DESC).

my output should be:
CUS_TYPE,CUS_ID,REC_TYPE,DATE,DETAIL
C, 123, 1, 2011/02/28, XXXA
C, 123, 2, 2011/02/28, XXXB
C, 123, D1, 2011/02/28, XXXC
C, 123, D2, 2011/02/28, XXXD
C, 123, K1, 2011/02/28, 0001
C, 123, K2, 2011/02/28, N2
C, 123, K3, 2011/02/28, YYYY

C, 100, 1, 2011/02/28, AAA
C, 100, 2, 2011/02/28, AAA1
C, 100, D1, 2011/02/28, AAA2
C, 100, K1, 2011/02/28, 0005
C, 100, K2, 2011/02/28, N4
C, 100, K3, 2011/02/28, ZZZ

Note: K1,K2,K3 are the fields BNK,PORT,DESC respectively retrieved from look up table and its corresponding vales are populated in the filed:DETAIL.

Condition:INPUT
1. There should be alteast 3 rows for single CUS_ID.The first row will be
populated with the value '1' (field:REC_TYPE).The second row will be
populated with the value '2' (field:REC_TYPE) and rest of the rows for
same CUS_ID will be populated as 'D' (field:REC_TYPE).

Condition:OUTPUT
1. For the field REC_TYPE, if we find the value '1' or '2' , we need to
keep as it is. If we find the value 'D',for the first row it should be D1
and for the rest of the rows for the same CUS_ID, it should be
incremented by one
2. For the new fields retrieved from look up table, we need to add the
value K1 (for the look up field:BNK, and it should be incremented for
rest of the fields such as K2,K3) and its value will be populated
in the field:DETAIL


Kindly help me to resolve it




Thanks in advance
Ajitha S
nayanpatra
Participant
Posts: 41
Joined: Sat Jun 06, 2009 11:13 pm
Location: Kolkata

Post by nayanpatra »

1. Filter all the records according to the REC_TYPE.

LINK1:(REC_TYPE = '1')
CUS_TYPE,CUS_ID,REC_TYPE,DATE,DETAIL
C,123,1,2011/02/28,XXXA
C,100,1,2011/02/28,AAA

LINK2:(REC_TYPE = '2')
CUS_TYPE,CUS_ID,REC_TYPE,DATE,DETAIL
C,123,2,2011/02/28,XXXB
C,100,2,2011/02/28,AAA1

LINK3:(REC_TYPE = 'D')
CUS_TYPE,CUS_ID,REC_TYPE,DATE,DETAIL
C,123,D,2011/02/28,XXXC
C,123,D,2011/02/28,XXXD
C,100,D,2011/02/28,AAA2

2. Provide the LINK3 into a transformer stage and define 3 stage variables.

SV1 : CURR_KEY
SV2 : If SV1 = SV3 Then SV2 + 1 Else 1
SV3 (initialize it to NULL) : PREV_KEY = CURR_KEY

Feed this SV2 value into REC_TYPE as 'D' : SV2

CUS_TYPE,CUS_ID,REC_TYPE,DATE,DETAIL
C,123,D1,2011/02/28,XXXC
C,123,D2,2011/02/28,XXXD
C,100,D1,2011/02/28,AAA2

3. Funnel LINK1, LINK2 and output of transformer stage into DS1 dataset.

CUS_TYPE,CUS_ID,REC_TYPE,DATE,DETAIL
C,123,1,2011/02/28,XXXA
C,100,1,2011/02/28,AAA
C,123,2,2011/02/28,XXXB
C,100,2,2011/02/28,AAA1
C,123,D1,2011/02/28,XXXC
C,123,D2,2011/02/28,XXXD
C,100,D1,2011/02/28,AAA2

This ensures the first output condition is met.

4. Remove duplicate on CUS_ID in LINK1, LINK2 and output of transformer stage and perform an inner join on CUS_ID.

CUS_ID
123
100

5. Left outer join with DS1 and aggregate all the records.

CUS_TYPE,CUS_ID,DATE,COUNT
C,123,2011/02/28,4
C,100,2011/02/28,3

6. Filter the records for which the count is greater than 3.

CUS_TYPE,CUS_ID,DATE
C,123,2011/02/28
C,100,2011/02/28

Write it into DS2. This ensures (4 through 6) the input condition is met.

7. Feed the output into a transformer, output 3 links and hardcode the REC_TYPE to K1 (for BNK), K2 (for PORT) and K3 (for DESC).

TLNIK1:
CUS_TYPE,CUS_ID,REC_TYPE,DATE
C,123,K1,2011/02/28
C,100,K1,2011/02/28

TLNIK2:
CUS_TYPE,CUS_ID,REC_TYPE,DATE
C,123,K2,2011/02/28
C,100,K2,2011/02/28

TLNIK3:
CUS_TYPE,CUS_ID,REC_TYPE,DATE
C,123,K3,2011/02/28
C,100,K3,2011/02/28

8. Perform lookups for TLINK1, TLINK2 and TLINK3 based on CUS_ID and populate corresponding values in the DETAIL field.

TLNIK1:
CUS_TYPE,CUS_ID,REC_TYPE,DATE,DETAIL
C,123,K1,2011/02/28,0001
C,100,K1,2011/02/28,0005

TLNIK2:
CUS_TYPE,CUS_ID,REC_TYPE,DATE,DETAIL
C,123,K2,2011/02/28,N2
C,100,K2,2011/02/28,N4

TLNIK3:
CUS_TYPE,CUS_ID,REC_TYPE,DATE,DETAIL
C,123,K3,2011/02/28,YYYY
C,100,K3,2011/02/28,ZZZ

9. Now left outer join DS2 (Driver) with DS1 and funnel all the outputs of lookups.

CUS_TYPE,CUS_ID,REC_TYPE,DATE,DETAIL
C,123,1,2011/02/28,XXXA
C,100,1,2011/02/28,AAA
C,123,2,2011/02/28,XXXB
C,100,2,2011/02/28,AAA1
C,123,D1,2011/02/28,XXXC
C,123,D2,2011/02/28,XXXD
C,100,D1,2011/02/28,AAA2
C,123,K1,2011/02/28,0001
C,100,K1,2011/02/28,0005
C,123,K2,2011/02/28,N2
C,100,K2,2011/02/28,N4
C,123,K3,2011/02/28,YYYY
C,100,K3,2011/02/28,ZZZ

This ensures the second output condition is met. You may sort the output as required.
Nayan
Post Reply