How to handle this logic?

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

chandra.shekhar@tcs.com
Premium Member
Premium Member
Posts: 353
Joined: Mon Jan 17, 2011 5:03 am
Location: Mumbai, India

How to handle this logic?

Post by chandra.shekhar@tcs.com »

Hi,
Structure of my table is:

Code: Select all

Acct_no|Rec_no|Descript
100|10|AA
101|20|BB
100|9|CC
100|7|DD
101|19|EE
101|18|FF
102|30|GG
The reqirement is for a particular Acct_no if the there are multiple Rec_no and if the difference between rec_no is 1 then I need to take the record with lowest rec_no. Also the column Descript will have the concatenated value of all Descript column for the same Acct_no. Also if the difference between Rec_no is not 1 then that record should treated as new record.
So my output will look like

Code: Select all

Acct_no|Rec_no|Descript
100|9|AACC           --(Diff. between 10 and 9 is 1)
100|7|DD             --(Diff. between 9 and 7 is not 1)
101|18|BBEEFF        --(Diff. between 20,19 then 18 is 1)
102|30|GG            --(Single record) 
I have tried different logics in transformer using stage variables like checking old and new value etc but it is of no use.
Thanx and Regards,
ETL User
pandeesh
Premium Member
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

Code: Select all

100|10|AA 
101|20|BB 
100|9|CC 
100|7|DD 
How you are checking the difference as 1?
With the previous record or the first record in the group?
pandeeswaran
prasson_ibm
Premium Member
Premium Member
Posts: 536
Joined: Thu Oct 11, 2007 1:48 am
Location: Bangalore

Re: How to handle this logic?

Post by prasson_ibm »

Hi,

You can achieve this output by below design:-

Seq Input---->Sort----->Transformer---->Filter------>Seq Output

Step1:- Read data from seq. file.

Step2:- Sort input data based on Acct_no(desc) and Rec_No(asc)

Step3:- Create these stage veriables

Code: Select all

CURR=Input.RECNO
DIFF=if (PREV-CURR)<> 1 then 0 else 1 
DECPT=if(DIFF=0) THEN input.RECNO ELSE DECPT|input.RECNO
PREV=Input.RECNO(initia value=0)
Create two new columns named Difference and Descritption to output link.
Pass DIFF and DECPT stagevariables to these columns.

Step4:- Apply filter condition to difference=1

I think it will work.
Thanks
Prasoon
v2kmadhav
Premium Member
Premium Member
Posts: 78
Joined: Fri May 26, 2006 7:31 am
Location: London

Post by v2kmadhav »

Sort your data before your xfm on acct_no, rec_no (asc).

Within your stage variables whilst comparing keys - do a difference between previous_rec_no and present_rec_no and pass only records where the difference is <>1.
prasson_ibm
Premium Member
Premium Member
Posts: 536
Joined: Thu Oct 11, 2007 1:48 am
Location: Bangalore

Re: How to handle this logic?

Post by prasson_ibm »

After Filter also apply remove duplicate with condition "Duplicate to Retain=Last"


Thanks
Prasoon
chandra.shekhar@tcs.com
Premium Member
Premium Member
Posts: 353
Joined: Mon Jan 17, 2011 5:03 am
Location: Mumbai, India

Post by chandra.shekhar@tcs.com »

@Pandeesh
I need to check with the previous value.

@Prasson and Madhav
I need to apply this logic Acct_no wise so I need to take the acct_no in the stage variable too. Also I need to concatenate the value of Descript column with the same the diff. between previous and current Rec_no is 1(This can go upto n levels like for Acct_no 101 in my example). I need to take the row which is having lowest rec_no for a particular acct_no if the consecutive diff. is 1, not all the records.
I have tried similarly like sorting the data acct_no, rec_no wise and checking with previous value but I am still getting no success
Last edited by chandra.shekhar@tcs.com on Mon Jan 16, 2012 5:51 am, edited 1 time in total.
Thanx and Regards,
ETL User
v2kmadhav
Premium Member
Premium Member
Posts: 78
Joined: Fri May 26, 2006 7:31 am
Location: London

Post by v2kmadhav »

Sorting your data and doing something similar to the if-then-else below will certainly ensure you pick the right record.. However concatenating it might be a little tricky.. does it matter in which order you concat the description ?

eg: for 100 - does it matter if its AACC or CCAA ?

Code: Select all

If svPrevKey <> svPresKey Then 0 Else If svPrevKey = svPresKey And svRecNoDiff <>1 Then 0 Else 1
you should be able to code your stage variables in a similar way to concat the description too if CCAA is allowed.
chandra.shekhar@tcs.com
Premium Member
Premium Member
Posts: 353
Joined: Mon Jan 17, 2011 5:03 am
Location: Mumbai, India

Post by chandra.shekhar@tcs.com »

@Madhav
Yes I need to concatenate in the correct order like here AACC only.
And I tried your logic; sorting on Hash partitioning on Acct_no asc, Rec_no Desc but I cannot just filter the data based upon
Apply filter condition to difference=1
because if you see my data then row#1 and row #4 should be considered as different rows. And my data will not pass the above filter.
Thanx and Regards,
ETL User
v2kmadhav
Premium Member
Premium Member
Posts: 78
Joined: Fri May 26, 2006 7:31 am
Location: London

Post by v2kmadhav »

I was suggesting the constraint to be svRecNoDiff <>1 so the first record and the 4th record would satisfy this criteria and pass through.
ShashankDon
Participant
Posts: 9
Joined: Wed Mar 14, 2007 3:34 pm

Post by ShashankDon »

Chandra,

You can add another stage variable to indicate if Acc No is changing between prev and curr record. Out put records from Transfomer only when there is a change in Acc No. Mean while for every record you can keep appending the text to Descript variable. This was you can concatenate values from any number of rows and sort based on first or last what ever RecNo you want. For rest, Prasoon's logic is fine.

Thanks,
Shashank
Shashank
chandra.shekhar@tcs.com
Premium Member
Premium Member
Posts: 353
Joined: Mon Jan 17, 2011 5:03 am
Location: Mumbai, India

Post by chandra.shekhar@tcs.com »

But according to my requirement, I want the record when the diff. is 1 between consecutive rec_no's.
This constraint will not work correctly.

Code: Select all

If svPrevKey <> svPresKey Then 0 Else If svPrevKey = svPresKey And svRecNoDiff <>1 Then 0 Else 1
Thanx and Regards,
ETL User
v2kmadhav
Premium Member
Premium Member
Posts: 78
Joined: Fri May 26, 2006 7:31 am
Location: London

Post by v2kmadhav »

Chandra, When you sort your data ascending by Accno & RecNo your data becomes...

Accno|recno|desc|svRecNoDiff|svChk

100|7|DD |7|0
100|9|CC |2|0
100|10|AA |1|1
101|18|FF |18|0
101|19|EE |1|1
101|20|BB |1|1
102|30|GG |30|0

so when you pass only where svChk = 0

your output would be

100|7|DD |7|0
100|9|CC |2|0
101|18|FF |18|0
102|30|GG |30|0

isnt that what you want? Ofcourse without the concatenation.
v2kmadhav
Premium Member
Premium Member
Posts: 78
Joined: Fri May 26, 2006 7:31 am
Location: London

Post by v2kmadhav »

I did a sample job and got this output..

Added another output from the tranformer that pulled across the concatenated desc value.. followed by a rd stage that retains the last record and then acts as a referential stream to the actual xfm output that lookups into this to get the DESC value...

Let me know if i got the logic wrong :(

Peek_32,0: ACC_NO:100 REC_NO:7 DESC:DD
Peek_32,0: ACC_NO:100 REC_NO:9 DESC:AACC
Peek_32,0: ACC_NO:101 REC_NO:18 DESC:BBEEFF
Peek_32,0: ACC_NO:102 REC_NO:30 DESC:GG
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

chandra.shekhar@tcs.com wrote:@Madhav
Yes I need to concatenate in the correct order like here AACC only.
And I tried your logic; sorting on Hash partitioning on Acct_no asc, Rec_no Desc .
Without looking at the suggested answers too deeply, you do not want to partition on acct_no, rec_no as this will push different rec_no down different partitions. Only partition on acct_no to ensure all records with the same acct_no go down the same partition (and sort on acct/rec)
chandra.shekhar@tcs.com
Premium Member
Premium Member
Posts: 353
Joined: Mon Jan 17, 2011 5:03 am
Location: Mumbai, India

Post by chandra.shekhar@tcs.com »

@Madhav
Added another output from the tranformer that pulled across the concatenated desc value..
How??

@Krypton
Thats what happening with me. When I use hash partitioning in the same transformer(Not using Sort stage) the above logic works for some records only and not for all :cry:
Thanx and Regards,
ETL User
Post Reply