Page 1 of 1

Help with Logic

Posted: Tue Nov 29, 2011 9:45 pm
by dsadm_ws
Hi All,

I have issues with the following logic. Below is my scenario:

Input:

cust_id time_period rep_cd
6111 200501 123
6111 200502 123
6111 200503 123
6111 200504 xxx
6111 200505 xxx
6111 200506 xxx
6111 200507 abc
6112 200508 123
6112 200509 123

Desired Output:

cust_id strt_dt End_dt rep_cd
6111 20050101 20050331 123
6111 20050401 20050631 xxx
6111 20050701 20050631 abc
6112 20050801 99991231 123

Logic used:

Used following Stage variables to get the desired output...

CurEffDt PrvEffDt
CurExpDt PrvExDt
If CurRepKey=1 then PrvEffDt Else CurExpDT CurExpDt
time_period CurEffDt
CustKey CurCustKey
RepKey CurRepKey

Constraint: (CurRepKey = 1 and CurCustkey = 1 OR CurCustkey = 0 and CurRepKey = 1)

Logic Output:
cust_id strt_dt End_dt rep_cd
6111 20050101 123
6111 20050401 20050331 xxx
6111 20050701 20050431 abc
6112 20050801 20050731 123

Please let me know, how can I modify my Logic in the Stage variables.. or is it the Constraint??

Your input would be greatly appreciated.

Thanks in advance!!

Posted: Tue Nov 29, 2011 10:07 pm
by major
how are you deriving the CurRepKey , by the way how are you generating the values for desired End_dt ?

Thanks
Major

Posted: Tue Nov 29, 2011 10:18 pm
by dsadm_ws
Thanks for your quick response.

I am deriving the currepkey using another Transformer. Used two Stage variables:
PrvRepkey = Currepkey
Currepkey = Rep_Code

Used the following Stage Variable in the next Transformer using the Below Stage variables:

PrvEffDt = CurEffDt
PrvExDt = CurExpDt
CurExpDt = If CurRepKey=1 then PrvEffDt Else CurExpDT
CurEffDt = time_period
CurCustKey = CustKey
CurRepKey = RepKey
Constraint:
(CurRepKey = 1 and CurCustkey = 1 OR CurCustkey = 0 and CurRepKey = 1)

The stage variable in bold generates the End_Dt. This stage variable has an Initial value = '99991231'.

Posted: Wed Nov 30, 2011 8:04 am
by dsadm_ws
Can anyone suggest, what i am missing in my logic.

Any advise would be appreciated.

Posted: Wed Nov 30, 2011 2:05 pm
by dsadm_ws
Got it. I have used an Aggregator to get the Min & max of the time period.

Posted: Wed Dec 14, 2011 3:34 pm
by dsadm_ws
Hi All,

I have previously marked this topic to be resolved. Thought I got my desired output. However, my logic does not cover one of the important scenario.

Used the aggregator stage to get the min and max of the time period by grouping on Cust_ID and Rep_Cd .

Code: Select all

 my input is in this way:


       Cust_id  Rep_Cd  Time_Period
           001        125        200501
           001        125        200502
           001        125        200503
           001        1BA       200504
           001        125        200506
           001        125        200507

Code: Select all

 
[b]Desired Output:
Cust_id  Rep_Cd  Min_Period   Max_Period
001        125        200501        200503
001        1BA        200504        200504
001        125        200506        200507[/b]

Here we have the Rep_Cd coming in again to serve the customer after the break. And when I try to group it by Cust_Id and Rep_Cd all the recurring
Rep_Cd's after a break seem to combine into one group and output the min and max incorrectly.

Code: Select all

In my case:

Output:

Cust_id  Rep_Cd  Min_Period   Max_Period
[b]001        125        200501        200507[/b]
    001        1BA        200504        200504    



The one bolded does not comeout as inteded. Instead of splitting them separately, it just combines them into one record.

Any advise on this would be highly appreciated. Thanks in Advance!![/code]

Posted: Wed Dec 14, 2011 3:49 pm
by FranklinE
You have a uniqueness problem that Aggregator can't handle as you intend it to. Rep_cd needs to be examined sequetially according to breaks, and when it "reappears" after a break it needs to be made unique compared to the previous occurance of it.

For example (not suggesting you can code it this way), you need to "index" the Rep_cd with each break. The first period 200501 to 200503 would be 125-1, and when it occurs again it would be 125-2 until the next break.

I know how I'd handle that in COBOL, but it presents an interesting problem in DataStage.

Posted: Wed Dec 14, 2011 5:16 pm
by jwiles
You might be able to do this:

1) Partition only on cust_id
2) Sort on cust_id, period, rep_id
3) In a transformer, create a surrogate key, incremented each time the combination of cust_id and rep_id changes (if cust_id <> pre_cust_id or rep_id <> pre_rep_id then key+1 else key)
4) Perform the aggregation on the surrogate key.

You may need to create the final surrogate key by concatenating the cust_id and rep_id onto the end of it, so you can extract them after the aggregation. Or use a fork-join process instead.

Regards,