Page 1 of 1

Split a column

Posted: Sun Sep 28, 2008 3:26 am
by Deepa_Vaidyanathan
I have a record with two columns .
Example :
Job Id | Job desc
1 | Service Manager,Training Insructor,Finance Manager

I need to split it in a way like

Job Id | Job desc
1 | Service Manager
1 | Training Insructor
1 | Finance Manager

Please help as to how I can achieve this . Thanks in advance

Posted: Sun Sep 28, 2008 4:25 am
by ray.wurlod
Parse into separate columns then use a Pivot stage. Probably the easiest parse is to combine the record into a single field using a Transformer stage then run that through a Column Import stage.

Posted: Mon Sep 29, 2008 12:21 am
by meet_deb85
Hi ,

Another way if you are sure that your second column will
have only two commas: -

SeqFILE -----> Tranformer ------> Funnel joining the 3 links------->SeqFILE
3 links

Inside the tranformer for each link use two columns : -

JOBID ----Derived from Input JOBID column
JOB------1)Field(DSLink3.JOBNAME,",",1) //for first link
2)Field(DSLink3.JOBNAME,",",2) //for second link
3)Field(DSLink3.JOBNAME,",",3) //for third link

Then merge the three links using a funnel. I hope it will solve your problem.

Posted: Mon Sep 29, 2008 12:32 am
by tennetiharika
U can use pivot stage

Posted: Mon Sep 29, 2008 12:40 am
by ray.wurlod
U did not ask the question. Deepa_Vaidyanathan did. Can you please explain how a Pivot stage (presumably alone, based on what you said) can be used to solve this particular problem, rather than simply assert?

Posted: Mon Sep 29, 2008 8:05 am
by senthilmp
Hi Deepa,

I can suggest how we can do in Server edition , but i don how it works for parallel

In server

Step 1: Use Field function such that, Field(Read_Eg.Job_Desc,',',1) and so on till you have your end ',' ie if you have 10 ',' seperated then you have to use 10 field functions and make all these field function as Stage Variables(i.e. N(for eg 10) number of stage Variable SV1 to SV10)

Step 2:
In the output of transformer have N number of output for this particular column Job_Desc as Job_Desc1, Job_Desc2 and so on...having a direct link from Stage Variable ie SV1 to Job_Desc1

Step3:
Use a Pivot Stage,
In the Input you will be having Job_Id and Job_Desc(N number of Job_Desc) in our eg 10, so a total of N+1 (ie 11 in our eg). In the Output tab of PIVOT stage just keep two columns as Job_Id and Job_Desc.

Where the derivation of Job_Desc being Job_Desc1;Job_Desc2;Job_Desc3;Job_Desc4 and so on.
Do not give any derivation for Job_ID.

Step 4:
Write the output to the required target.

Thus you can seperate

Job Id | Job desc
1 | Service Manager,Training Insructor,Finance Manager

into

Job Id | Job desc
1 | Service Manager
1 | Training Insructor
1 | Finance Manager

Posted: Mon Sep 29, 2008 10:10 am
by Aruna Gutti
The solution given by Senthilmp can be used in parallel jobs too. I have a similar parallel job.