Help with Logic

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
dsadm_ws
Premium Member
Premium Member
Posts: 65
Joined: Mon Apr 12, 2010 10:30 am

Help with Logic

Post by dsadm_ws »

Hi All,

I have issues with the following logic. Below is my scenario:

Input:

cust_id time_period rep_cd
6111 200501 123
6111 200502 123
6111 200503 123
6111 200504 xxx
6111 200505 xxx
6111 200506 xxx
6111 200507 abc
6112 200508 123
6112 200509 123

Desired Output:

cust_id strt_dt End_dt rep_cd
6111 20050101 20050331 123
6111 20050401 20050631 xxx
6111 20050701 20050631 abc
6112 20050801 99991231 123

Logic used:

Used following Stage variables to get the desired output...

CurEffDt PrvEffDt
CurExpDt PrvExDt
If CurRepKey=1 then PrvEffDt Else CurExpDT CurExpDt
time_period CurEffDt
CustKey CurCustKey
RepKey CurRepKey

Constraint: (CurRepKey = 1 and CurCustkey = 1 OR CurCustkey = 0 and CurRepKey = 1)

Logic Output:
cust_id strt_dt End_dt rep_cd
6111 20050101 123
6111 20050401 20050331 xxx
6111 20050701 20050431 abc
6112 20050801 20050731 123

Please let me know, how can I modify my Logic in the Stage variables.. or is it the Constraint??

Your input would be greatly appreciated.

Thanks in advance!!
major
Premium Member
Premium Member
Posts: 167
Joined: Mon Nov 26, 2007 12:21 am

Post by major »

how are you deriving the CurRepKey , by the way how are you generating the values for desired End_dt ?

Thanks
Major
dsadm_ws
Premium Member
Premium Member
Posts: 65
Joined: Mon Apr 12, 2010 10:30 am

Post by dsadm_ws »

Thanks for your quick response.

I am deriving the currepkey using another Transformer. Used two Stage variables:
PrvRepkey = Currepkey
Currepkey = Rep_Code

Used the following Stage Variable in the next Transformer using the Below Stage variables:

PrvEffDt = CurEffDt
PrvExDt = CurExpDt
CurExpDt = If CurRepKey=1 then PrvEffDt Else CurExpDT
CurEffDt = time_period
CurCustKey = CustKey
CurRepKey = RepKey
Constraint:
(CurRepKey = 1 and CurCustkey = 1 OR CurCustkey = 0 and CurRepKey = 1)

The stage variable in bold generates the End_Dt. This stage variable has an Initial value = '99991231'.
dsadm_ws
Premium Member
Premium Member
Posts: 65
Joined: Mon Apr 12, 2010 10:30 am

Post by dsadm_ws »

Can anyone suggest, what i am missing in my logic.

Any advise would be appreciated.
dsadm_ws
Premium Member
Premium Member
Posts: 65
Joined: Mon Apr 12, 2010 10:30 am

Post by dsadm_ws »

Got it. I have used an Aggregator to get the Min & max of the time period.
dsadm_ws
Premium Member
Premium Member
Posts: 65
Joined: Mon Apr 12, 2010 10:30 am

Post by dsadm_ws »

Hi All,

I have previously marked this topic to be resolved. Thought I got my desired output. However, my logic does not cover one of the important scenario.

Used the aggregator stage to get the min and max of the time period by grouping on Cust_ID and Rep_Cd .

Code: Select all

 my input is in this way:


       Cust_id  Rep_Cd  Time_Period
           001        125        200501
           001        125        200502
           001        125        200503
           001        1BA       200504
           001        125        200506
           001        125        200507

Code: Select all

 
[b]Desired Output:
Cust_id  Rep_Cd  Min_Period   Max_Period
001        125        200501        200503
001        1BA        200504        200504
001        125        200506        200507[/b]

Here we have the Rep_Cd coming in again to serve the customer after the break. And when I try to group it by Cust_Id and Rep_Cd all the recurring
Rep_Cd's after a break seem to combine into one group and output the min and max incorrectly.

Code: Select all

In my case:

Output:

Cust_id  Rep_Cd  Min_Period   Max_Period
[b]001        125        200501        200507[/b]
    001        1BA        200504        200504    



The one bolded does not comeout as inteded. Instead of splitting them separately, it just combines them into one record.

Any advise on this would be highly appreciated. Thanks in Advance!![/code]
FranklinE
Premium Member
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

You have a uniqueness problem that Aggregator can't handle as you intend it to. Rep_cd needs to be examined sequetially according to breaks, and when it "reappears" after a break it needs to be made unique compared to the previous occurance of it.

For example (not suggesting you can code it this way), you need to "index" the Rep_cd with each break. The first period 200501 to 200503 would be 125-1, and when it occurs again it would be 125-2 until the next break.

I know how I'd handle that in COBOL, but it presents an interesting problem in DataStage.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
jwiles
Premium Member
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

You might be able to do this:

1) Partition only on cust_id
2) Sort on cust_id, period, rep_id
3) In a transformer, create a surrogate key, incremented each time the combination of cust_id and rep_id changes (if cust_id <> pre_cust_id or rep_id <> pre_rep_id then key+1 else key)
4) Perform the aggregation on the surrogate key.

You may need to create the final surrogate key by concatenating the cust_id and rep_id onto the end of it, so you can extract them after the aggregation. Or use a fork-join process instead.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
Post Reply