Pivot Enterprise Stage - Vertical Pivot issue

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
tehavele
Premium Member
Premium Member
Posts: 79
Joined: Wed Nov 12, 2008 5:41 am
Location: Melbourne

Pivot Enterprise Stage - Vertical Pivot issue

Post by tehavele »

I am trying to pivot rows to columns using Pivot Enterprise Stage (vertical Pivot) Somehow I am unable to execute successfully in parallel mode. Here is the detailed issue. I have mainly three columns KEY (bunch of group by columns) SEQ (values ranging 1 to 48 ) and VALUE (contains actual values - 48 values).

Code: Select all

key   SEQ   VALUE  
123   1       33.45
123   2       44.32
.......
123   48     34.65
124   1       43.21
124   2       09.54
.......
124   48      98.54
I want to convert it to below format.

Code: Select all

key   VAL1    VAL2  ....   VAL48
123   33.45   44.32         34.65
124   43.21   09.54         98.54
I am able to convert it to above format but some how SEQ is not matching with the columns. Meaning value for SEQ-32 is getting mapped to a VAL1.

I tried hash partitioning and sorting but it dint work. Interestingly, I am playing with couple of records at the moment, interestingly for the first records it is failing but the sequence is getting right for 2nd record (sort of unpredictable behavior).

When I run this job in Sequential mode it run fine when I use sort merge collector method at Pivot Stage. But its not working in Parallel mode.

Any help on this would be greatly appreciated.
Tejas
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Re: Pivot Enterprise Stage - Vertical Pivot issue

Post by chulett »

tehavele wrote:I tried hash partitioning and sorting but it dint work.
The fact that it works correctly on a single node and didn't when running on multiple nodes implies your hashed partitioning is not set up correctly. Can you detail your settings for that?
-craig

"You can never have too many knives" -- Logan Nine Fingers
tehavele
Premium Member
Premium Member
Posts: 79
Joined: Wed Nov 12, 2008 5:41 am
Location: Melbourne

Post by tehavele »

Hash partitioning and sorting is done on Input Page of Pivot Enterprise Stage.
  • Hash Partitioning - KEY (all key columns )
    Sorting - KEY (all key columns + SEQ).
Pivot Stage Properties:-
  • GroupBy - KEY( All key columns )
    Pivot - SEQ and VALUE
Tejas
tehavele
Premium Member
Premium Member
Posts: 79
Joined: Wed Nov 12, 2008 5:41 am
Location: Melbourne

Post by tehavele »

This issue is resolved now. Even though hash partitioning and sorting were set up properly it was not behaving as per expectation.

Issue Identification:-
Upon observing dump score it was cleat that Vertical Pivot stage was inserting it's own 'tsort' operator which was only happening on KEY columns and not on SEQ field which was important. I guess that's why the output was having incorrect order of columns after vertical pivot. Also, there was a log warning saying

Code: Select all

 pvVerticalPivot: When checking operator: User inserted sort "pvVerticalPivot.lnkVerticalPivot_Sort" does not fulfill the sort requirements of the downstream operator "pvVerticalPivot"
DataStage did modify sort to fulfill sort requirements of the downstream operator.

Resolution:-
Added and Set up environment variables '$APT_NO_SORT_INSERTION' and '$APT_NO_PART_INSERTION' to TRUE. This ensured that DataStage is not adding tsort operator on its own and then it worked perfectly.

Thanks Craig for your help.
Tejas
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Interesting, thanks for posting your resolution.
-craig

"You can never have too many knives" -- Logan Nine Fingers
priyadarshikunal
Premium Member
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

I think it insert a sort operator if the key defined doesn't match the sort keys upstream. And the way out is either set APT_NO_SORT_INSERTION or have another sort stage to use the key as in pivot and mark "Do not sort (Previously sorted)". You may also set APT_SORT_INSERTION_CHECK_ONLY which will verify the sort order, may produce a warning, but won't sort the data.
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
tehavele
Premium Member
Premium Member
Posts: 79
Joined: Wed Nov 12, 2008 5:41 am
Location: Melbourne

Post by tehavele »

Tried 2nd option by
1) inserting a separate sort stage and setting "Don't Sort (Previously Sorted)"
2) removing previous setting of hash partitions and sorting
3) removing APT_NO_SORT_INSERTION

but it dint give expected output. Orders of pivoted columns were incorrect for some of the random records. It threw below warning which came earlier.

Code: Select all

pvVerticalPivot: When checking operator: User inserted sort "Sort_89" does not fulfill the sort requirements of the downstream operator "pvVerticalPivot"
Tejas
priyadarshikunal
Premium Member
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

you should not remove previous sorting as this sort stage will not sort anything.
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
tehavele
Premium Member
Premium Member
Posts: 79
Joined: Wed Nov 12, 2008 5:41 am
Location: Melbourne

Post by tehavele »

Kept previous settings as it is and added Sort Stage with "Don't Sort (Previously Sorted)" set and It worked just fine.

Thanks Priyadarshi.
Tejas
Post Reply