how to create multiple rows from a row
Hi Everyone,
I have a tilde-delimited file (12 fields), and field3, field4 and field5 can each have one or multiple values separated by commas:
pepsi~1~1,2,3~001,002~hourly,salaried~200612~quarterly~1~U~50~peer group~pepsi report
output should be:
pepsi,1,1,001,hourly,200612,quarterly,1,U,50,peer group,pepsi report
pepsi,1,1,002,hourly,200612,quarterly,1,U,50,peer group,pepsi report
pepsi,1,1,001,salaried,200612,quarterly,1,U,50,peer group,pepsi report
pepsi,1,1,002,salaried,200612,quarterly,1,U,50,peer group,pepsi report
pepsi,1,2,001,hourly,200612,quarterly,1,U,50,peer group,pepsi report
pepsi,1,2,002,hourly,200612,quarterly,1,U,50,peer group,pepsi report
pepsi,1,2,001,salaried,200612,quarterly,1,U,50,peer group,pepsi report
pepsi,1,2,002,salaried,200612,quarterly,1,U,50,peer group,pepsi report
pepsi,1,3,001,hourly,200612,quarterly,1,U,50,peer group,pepsi report
pepsi,1,3,002,hourly,200612,quarterly,1,U,50,peer group,pepsi report
pepsi,1,3,001,salaried,200612,quarterly,1,U,50,peer group,pepsi report
pepsi,1,3,002,salaried,200612,quarterly,1,U,50,peer group,pepsi report
I need your suggestions, please.
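For reference, since only fields 3, 4 and 5 vary, the whole cross-product can be produced by a single awk pass with three nested loops. This is a minimal sketch, with the sample line fed in via echo rather than read from a file:

```shell
echo 'pepsi~1~1,2,3~001,002~hourly,salaried~200612~quarterly~1~U~50~peer group~pepsi report' |
awk 'BEGIN { FS = "~"; OFS = "," }
{
    # Split the three multi-value fields on commas
    n3 = split($3, a, ","); n4 = split($4, b, ","); n5 = split($5, c, ",")
    # Emit one row per combination; field 4 varies fastest,
    # then field 5, then field 3, matching the desired output order
    for (i = 1; i <= n3; i++)
        for (k = 1; k <= n5; k++)
            for (j = 1; j <= n4; j++)
                print $1, $2, a[i], b[j], c[k], $6, $7, $8, $9, $10, $11, $12
}'
```

For the sample line this prints the 12 comma-delimited rows shown above.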
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
It doesn't matter how many values there are. It's horizontal data being pivoted, and the final metadata will remain constant. The Pivot stage will get this done. I am not saying just one Pivot stage; you might need to parse your data a little, too.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
Granted, this will take more work than a simple use of multiple Pivot stages, because the number of values in the comma-delimited fields is unknown. The Pivot stage would have worked if the comma-delimited columns (3, 4 and 5) were static. You need something like awk. Here is what I came up with. This script will work for your input data, i.e.:
- a total of 12 columns
- only columns 3, 4 and 5 vary
Run the following script. Just change the variables inFile and outFile to adapt to your environment.
What I am doing is splitting the fifth column into an array (c, indexed by z) and printing all the other columns as-is once for every value in that array. The output of the first part has the same format as the input file; the only difference is that column 5 has been split. Each row from the first part is then sent to the second part, which does a similar thing for column 4. Each row from the second part is sent to the third part, which splits column 3, and is finally written to the output file. Run this script and you will get the pivoted, comma-delimited file you want.
#!/usr/bin/ksh
# Change the following two lines to adapt to your environment
export inFile=/Data/SFDCDEV/temp/gsym.txt
export outFile=/Data/SFDCDEV/temp/Result_gsym.txt
# Remove the previous output file if it exists
if [ -f "$outFile" ]
then
    rm -f "$outFile"
fi
# The main body
cat "$inFile" | while read -r line
do
    # First part: split column 5 on commas, one output row per value
    echo "$line" | awk 'BEGIN { FS = "~"; OFS = "~" }
    {
        n = split($5, c, ",")
        for (z = 1; z <= n; z++)
            print $1,$2,$3,$4,c[z],$6,$7,$8,$9,$10,$11,$12
    }' | while read -r A
    do
        # Second part: split column 4
        echo "$A" | awk 'BEGIN { FS = "~"; OFS = "~" }
        {
            n = split($4, b, ",")
            for (y = 1; y <= n; y++)
                print $1,$2,$3,b[y],$5,$6,$7,$8,$9,$10,$11,$12
        }' | while read -r B
        do
            # Third part: split column 3 and switch the delimiter to commas
            echo "$B" | awk 'BEGIN { FS = "~"; OFS = "," }
            {
                n = split($3, a, ",")
                for (x = 1; x <= n; x++)
                    print $1,$2,a[x],$4,$5,$6,$7,$8,$9,$10,$11,$12
            }' >> "$outFile"
        done
    done
done
Well, it would have been if the comma-delimited values were static; then they could have been changed to columns and a Pivot stage would do the trick. I don't know how the varying fields would be treated by DataStage. I thought about it, maybe not long enough. I didn't want to burn any more brain cells, so I wrote a script. I would love to know of a way for it to be done in DS. Hey, I am here to learn, after all. :wink:
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
leomauer wrote: 2. In every multi-value field replace commas with @VM (char(253))
DSguru2B wrote: You mean @FM
No. I really mean @VM.
But how would you then read from the hashed file with 12 fields defined on your output?
I wanted to invoke the hashed file's "Normalize on" functionality, not just spread fields out.
There will be 1 record loaded into the first hashed file, 3 records unloaded from it and loaded into the second, 6 records unloaded from the second and loaded into the third, and 12 records unloaded from the third hashed file.
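Outside DataStage, that 1 → 3 → 6 → 12 cascade can be sketched in shell as three passes of one generic normalisation step, each splitting a single tilde-delimited field on commas. The expand helper and the shortened sample line are illustrative, not part of the original posts; note the row order differs from the example output (field 5 varies fastest here), but the same 12 rows come out.

```shell
# expand N: split tilde-delimited field N on commas, emitting one row per value
expand() {
    awk -v f="$1" 'BEGIN { FS = OFS = "~" }
    {
        n = split($f, v, ",")
        # Reassigning $f rebuilds the record with OFS, so each value
        # of the multi-value field produces its own full row
        for (i = 1; i <= n; i++) { $f = v[i]; print }
    }'
}

# 1 record in; 3 rows after expanding field 3; 6 after field 4; 12 after field 5
echo 'pepsi~1~1,2,3~001,002~hourly,salaried~200612~quarterly~1~U~50~peer group~pepsi report' |
    expand 3 | expand 4 | expand 5 | tr '~' ','
```

Each `expand` pass plays the role of one hashed-file normalisation step in the description above.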