joining rows

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

us1aslam1us
Charter Member
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

joining rows

Post by us1aslam1us »

Hi all,

I am having an input:

name,add,loc,pos
A,l,s,BA
A,l,s,CA
A,l,s,DA

i need to transform it to

name,add,loc,pos
A,l,s,BA/CA/DA

I tried to use stage variables but any help is appreciated.

Thanks
Sam
rasi
Participant
Posts: 464
Joined: Fri Oct 25, 2002 1:33 am
Location: Australia, Sydney

Post by rasi »

Hi

Could you tell what is your source?...Are you running in sequential mode?
Regards
Siva

Listening to the Learned

"The most precious wealth is the wealth acquired by the ear Indeed, of all wealth that wealth is the crown." - Thirukural By Thiruvalluvar
us1aslam1us
Charter Member
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

rasi wrote:Hi

Could you tell what is your source?...Are you running in sequential mode?
source is sequential file!!!

Thanks
sam
rasi
Participant
Posts: 464
Joined: Fri Oct 25, 2002 1:33 am
Location: Australia, Sydney

Post by rasi »

Hi

Sort the name,add,loc column and use your stage variable to hold pos and keep concatinating pos until three columns are changed. On change write write it to output. This should work
Regards
Siva

Listening to the Learned

"The most precious wealth is the wealth acquired by the ear Indeed, of all wealth that wealth is the crown." - Thirukural By Thiruvalluvar
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Maintain 7 Stage variables. And "" as Default value for the follwoing variables. Curr_Name,Curr_Add,Curr_Loc
Sort the data based on Name, Add, Loc

Code: Select all

Curr_Name = InputLink.Name
Curr_Add = InputLink.Add
Curr_Loc = InputLink.Loc
Tot_Pos = If Old_Name=Curr_Name And Old_Add = Curr_Add And Old_Loc = Curr_Loc Then Tot_Pos:"/":InputLink.Pos
Old_Name=Curr_Name
Old_Add = Curr_Add
Old_Loc = Curr_Loc
Retain the last record either by using Remove duplicate or Sort.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
us1aslam1us
Charter Member
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

Hi kumar,

Thanks for your feedback.

But what if i am having 50 columns.That means i need to use 101 stage variables.Any other way to achieve this.

Sorry for not explaining the complete requirement.

col1, col2, col3, col4, col5, col6, col,......................................................
1,00,A, , ,Alp
1,00, ,B, ,Alp
1,00, , ,C,Alp
1,01,A, , ,Alp
1,01, , ,C,Alp

and so on.........


And i am looking for the output to be :

Col1,col2,col3,col4,..........................................
1,00,A,B,C,Alp
1,01,A, ,C,Alp

and so on....


Thanks in advance
sam
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

So? Nothing wrong with having 101 stage variables. Call them dalmatian variables. :lol:
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
rajiivnb
Participant
Posts: 77
Joined: Fri Sep 10, 2004 8:38 am
Location: India

Post by rajiivnb »

Hi

You can use the RowProcCompareWithPreviousValue tranform to check the distinct of those first two columns, if its equal u can have a stage variable, and concate rest of the columns which u want it in a single column. before that, the records should be sorted, based on ur first two columns,

So create a stage varaible and u can use this below logic in that,

IF RowProcCompareWithPreviousValue(col1,previous col1) = 1Then

Stage_col_cat + col1 +col2 + col3
Else
col1+col2+col3

Had not tried this, but this may give u an idea.

Regards
Rajesh
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Hi,
In such case, dont check for all the column to be same, rather just check for the few column which you sure about unique when compared to the other group. And concatinate the POS column.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
us1aslam1us
Charter Member
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

Hi Rajiv

I tried your logic but i am getting the same result as before.

thanks
sam
us1aslam1us
Charter Member
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

kumar_s wrote:Hi,
In such case, dont check for all the column to be same, rather just check for the few column which you sure about unique when compared to the other group. And concatinate the POS column.
Hi

Thanks for your time.But in the output i am just getting the space and /.Am i going wrong somewhere.

Thanks
sam
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Hi,
If you are much concer about the number of variables, you can also customize as follows. Previously i havent mention the Else part in the If condition. Make sure the current derivation is given Else you would get as you are getting.

Code: Select all

Tot_Pos = If Old_Name=InputLink.Name And Old_Add = InputLink.Add And Old_Loc = InputLink.Loc Then Tot_Pos:"/":InputLink.Pos Else InputLink.Pos
Old_Name=InputLink.Name 
Old_Add = InputLink.Add 
Old_Loc = InputLink.Loc
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
us1aslam1us
Charter Member
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

Hi kumar

I believe the concatination is not going fine.

Thanks
sam
shepli
Participant
Posts: 79
Joined: Fri Dec 17, 2004 9:56 am

Post by shepli »

Try if this works for you:

Create a stage variable "Concatenated", initialize it as empty string (""), and it defined as:
If Concatenated = "" Then Input.ColumnD Else Concatenated : "/" : Input.ColumnD

Read through the sequential file and send the data to a hash table with 4 columns.
Column A, B, and C are the values from the source file. And use these three columns as the key of the hash;
Column = Concatenated
us1aslam1us
Charter Member
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

Hi kumar

Its working fine now.But could you suggest me how to take the latest row from that rows.Coz when i am using the Rem Dup stage its giving me the first row not the last row as i need.

Thanks
sam
Post Reply