Pivot stage question
Moderators: chulett, rschirm, roy
Pivot stage question
I have a pipe delimited file that I need to pivot to multiple columns:
e.g.
key| col1 | col2| col3 | col4| col5
1 |A |B | | |
2 |C | | | |
I want the result to be:
key | col
1 | A
1 | B
2 | C
But I'm getting the following from my pivot stage:
key | col
1 | A
1 | B
1 |
1 |
1 |
2 | C
2 |
2 |
2 |
2 |
Is there any setting in the pivot stage where I can prevent the null values to be output?
e.g.
key| col1 | col2| col3 | col4| col5
1 |A |B | | |
2 |C | | | |
I want the result to be:
key | col
1 | A
1 | B
2 | C
But I'm getting the following from my pivot stage:
key | col
1 | A
1 | B
1 |
1 |
1 |
2 | C
2 |
2 |
2 |
2 |
Is there any setting in the pivot stage where I can prevent the null values to be output?
-
- Participant
- Posts: 612
- Joined: Thu May 03, 2007 4:59 am
- Location: Melbourne
Before pivot do a conversion of '||' to '' (Routine suggested. Do this for the whole file. Recursive till there are no more '||')
Joshy George
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Participant
- Posts: 612
- Joined: Thu May 03, 2007 4:59 am
- Location: Melbourne
How about a routine?
$INCLUDE DSINCLUDE JOBCONTROL.H
File_Name = trim(TempFileDir):trim(FileName):trim(Ext)
File_size = 0
VTemp = ''
Ans = 'Error'
RecVar = ''
OpenSeq File_Name To FVar_Keys
Else
Call DSLogWarn("Unable To Open Text File ": File_Name,"FileOpenError")
ErrorCode = -1 ; * STOP THE ROUTINE
End
For N = 1 To 1000 ;* Find total count of rows in file n substitute for this 1000
ReadSeq FileLine From FVar_Keys
On Error
Call DSLogFatal("Error Reading from ":File_Name:" FILE_Read_STATUS=":Status(),"FileReadError")
ErrorCode = -1 ; * STOP THE ROUTINE
End
Then
* Line Read Successfully
Cnt = Count(FileLine, '|')
Q =2
Loop While Cnt <> 0
TempString = Trim(Field(FileLine,'|',Q))
If NOT(TempString = '' or IsNull(TempString)) Then
RecVar = RecVar : Field(FileLine,'|',1) :'|': Field(FileLine,'|',Q) : CHAR(10)
End
Q=Q+1
Cnt = Cnt - 1
Repeat
End Else
END
NEXT N
CLOSESEQ FVar_Keys
OpenSeq File_Name To FVar_Keys Then WEOFSEQ FVar_Keys
WriteBlk RecVar To FVar_Keys
Else
Call DSLogWarn("Unable write to Text File ": File_Name,"FileOpenError")
ErrorCode = -1 ; * STOP THE ROUTINE
End
Call DSLogInfo("Successful conversion", "FILE_CONVERSION_SUCCESS")
Ans = 'Pivot Success'
CLOSESEQ FVar_Keys
$INCLUDE DSINCLUDE JOBCONTROL.H
File_Name = trim(TempFileDir):trim(FileName):trim(Ext)
File_size = 0
VTemp = ''
Ans = 'Error'
RecVar = ''
OpenSeq File_Name To FVar_Keys
Else
Call DSLogWarn("Unable To Open Text File ": File_Name,"FileOpenError")
ErrorCode = -1 ; * STOP THE ROUTINE
End
For N = 1 To 1000 ;* Find total count of rows in file n substitute for this 1000
ReadSeq FileLine From FVar_Keys
On Error
Call DSLogFatal("Error Reading from ":File_Name:" FILE_Read_STATUS=":Status(),"FileReadError")
ErrorCode = -1 ; * STOP THE ROUTINE
End
Then
* Line Read Successfully
Cnt = Count(FileLine, '|')
Q =2
Loop While Cnt <> 0
TempString = Trim(Field(FileLine,'|',Q))
If NOT(TempString = '' or IsNull(TempString)) Then
RecVar = RecVar : Field(FileLine,'|',1) :'|': Field(FileLine,'|',Q) : CHAR(10)
End
Q=Q+1
Cnt = Cnt - 1
Repeat
End Else
END
NEXT N
CLOSESEQ FVar_Keys
OpenSeq File_Name To FVar_Keys Then WEOFSEQ FVar_Keys
WriteBlk RecVar To FVar_Keys
Else
Call DSLogWarn("Unable write to Text File ": File_Name,"FileOpenError")
ErrorCode = -1 ; * STOP THE ROUTINE
End
Call DSLogInfo("Successful conversion", "FILE_CONVERSION_SUCCESS")
Ans = 'Pivot Success'
CLOSESEQ FVar_Keys
Joshy George
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Premium Member
- Posts: 7
- Joined: Wed Nov 02, 2005 12:19 pm
Pivoting
I am a newbie at this, it is my first post, so please bear with me.
I do a tremendous amount of pivoting in the work I do, and have not been very impressed with the pivot stage. I have 2000+ input formats that I have to standardize into one of a few standard formats. These input formats range from 3 or 4 columns to several thousand, and the output formats are just as variable. I decided to take a metadata driven approach and have stored my input and output formats in Oracle tables.
The approach I used was to write two parallel plugins one to pivot vertically, the other to pivot horizontally. I wrote them in C, and they accept parameters for the delimter, quotes, etc. They are very generic, and I have been able to utilize them in most everything I do.
1) Read line and parse and pivot vertically, write out with key, column number, and value (if not null)
2) Oracle lookup by input format and field number, replacing number with a sysname, and getting data format (number, value, string, date, etc)
3) Oracle lookup by output format and fieldname to get new field number, data type
4) pivot horizontally using key and field_number
Seems to work very well, and I was able to preserve the performance through making the plugins parallel.
Hope this helps. DSXchange has helped me on a number of occasions, and I would like to try and return the favor.
I do a tremendous amount of pivoting in the work I do, and have not been very impressed with the pivot stage. I have 2000+ input formats that I have to standardize into one of a few standard formats. These input formats range from 3 or 4 columns to several thousand, and the output formats are just as variable. I decided to take a metadata driven approach and have stored my input and output formats in Oracle tables.
The approach I used was to write two parallel plugins one to pivot vertically, the other to pivot horizontally. I wrote them in C, and they accept parameters for the delimter, quotes, etc. They are very generic, and I have been able to utilize them in most everything I do.
1) Read line and parse and pivot vertically, write out with key, column number, and value (if not null)
2) Oracle lookup by input format and field number, replacing number with a sysname, and getting data format (number, value, string, date, etc)
3) Oracle lookup by output format and fieldname to get new field number, data type
4) pivot horizontally using key and field_number
Seems to work very well, and I was able to preserve the performance through making the plugins parallel.
Hope this helps. DSXchange has helped me on a number of occasions, and I would like to try and return the favor.
Thanks
Larry
Larry
I came accross same situation and solving it in Transformer used a constraint Len(Trim(somelink.somecolumn)) = 0.
I had no gain in using IsNull() Validation but above logic worked fine.
It works and simple.
Since Transformer is parallel you still have performance enhanced.
I had no gain in using IsNull() Validation but above logic worked fine.
It works and simple.
Since Transformer is parallel you still have performance enhanced.
chulett wrote:No, you need to filter them out post-pivot via a constraint or some other mechanism.
-
- Participant
- Posts: 8
- Joined: Thu Nov 23, 2006 8:43 am
I am also facing similar problem.
I hv to pivot columns dynamically.
For example
1|a|b|c
2|d|e
3|f|g|h|i|j|k|l etc
ie., some times i have to pivot only 3 columns as in the first line of example, some times 2, some times n. This has to be dynamically.
Iam able to do it when i assume the values can be max of some 30 . But iam not able to do it dynamically.
Seeking for help.....
Thanks in advance
Subhashini
I hv to pivot columns dynamically.
For example
1|a|b|c
2|d|e
3|f|g|h|i|j|k|l etc
ie., some times i have to pivot only 3 columns as in the first line of example, some times 2, some times n. This has to be dynamically.
Iam able to do it when i assume the values can be max of some 30 . But iam not able to do it dynamically.
Seeking for help.....
Thanks in advance
Subhashini
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Participant
- Posts: 8
- Joined: Thu Nov 23, 2006 8:43 am
But my problem, i won't always get 30 columns. I may get 30/50/100/2000 columns which iam mentioning as dynamically.
The columns to pivot is not static , it always differ. I can't set it to some max limit also.... b'coz the no. of columns are keep on changing
Waiting for some suggestion.
Thanks in advance.
Subhashini.
The columns to pivot is not static , it always differ. I can't set it to some max limit also.... b'coz the no. of columns are keep on changing
Waiting for some suggestion.
Thanks in advance.
Subhashini.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Can you live with an awk script to get this done. Its an awesome workaround for dynamic variables, such as dynamic columns. You are sitting on parallel engine on windoze. I presume you have MKS Toolkit installed and can run unix scripts, Yes?
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
Re: Pivoting
Larry,Larry.Griffith wrote:I am a newbie at this, it is my first post, so please bear with me.
I do a tremendous amount of pivoting in the work I do, and have not been very impressed with the pivot stage. I have 2000+ input formats that I have to standardize into one of a few standard formats. These input formats range from 3 or 4 columns to several thousand, and the output formats are just as variable. I decided to take a metadata driven approach and have stored my input and output formats in Oracle tables.
The approach I used was to write two parallel plugins one to pivot vertically, the other to pivot horizontally. I wrote them in C, and they accept parameters for the delimter, quotes, etc. They are very generic, and I have been able to utilize them in most everything I do.
1) Read line and parse and pivot vertically, write out with key, column number, and value (if not null)
2) Oracle lookup by input format and field number, replacing number with a sysname, and getting data format (number, value, string, date, etc)
3) Oracle lookup by output format and fieldname to get new field number, data type
4) pivot horizontally using key and field_number
Seems to work very well, and I was able to preserve the performance through making the plugins parallel.
Hope this helps. DSXchange has helped me on a number of occasions, and I would like to try and return the favor.
Thanks for the post. We too have many pivot functionality in our application, and since Pivot is a performance poor, we did our own Buildops for each of these [though we haven't come up with what you did..come up with a generic pivot and customize for different metadata]
We face more of Horizontal pivot and some Vertical pivot.
Would be mind, if you can share the C code for it, which would make our life much simpler. Don;t bother, if that's too much of an asking.
Thanks