Pivot Enterprise Stage - Vertical Pivot issue

tehavele · Post by **tehavele** » Fri Jul 10, 2015 8:22 am

I am trying to pivot rows to columns using Pivot Enterprise Stage (vertical Pivot) Somehow I am unable to execute successfully in parallel mode. Here is the detailed issue. I have mainly three columns KEY (bunch of group by columns) SEQ (values ranging 1 to 48 ) and VALUE (contains actual values - 48 values).

Code: Select all

key   SEQ   VALUE  
123   1       33.45
123   2       44.32
.......
123   48     34.65
124   1       43.21
124   2       09.54
.......
124   48      98.54

I want to convert it to below format.

Code: Select all

key   VAL1    VAL2  ....   VAL48
123   33.45   44.32         34.65
124   43.21   09.54         98.54

I am able to convert it to above format but some how SEQ is not matching with the columns. Meaning value for SEQ-32 is getting mapped to a VAL1.

I tried hash partitioning and sorting but it dint work. Interestingly, I am playing with couple of records at the moment, interestingly for the first records it is failing but the sequence is getting right for 2nd record (sort of unpredictable behavior).

When I run this job in Sequential mode it run fine when I use sort merge collector method at Pivot Stage. But its not working in Parallel mode.

Any help on this would be greatly appreciated.

chulett · Post by **chulett** » Fri Jul 10, 2015 9:13 am

tehavele wrote:I tried hash partitioning and sorting but it dint work.

The fact that it works correctly on a single node and didn't when running on multiple nodes implies your hashed partitioning is not set up correctly. Can you detail your settings for that?

tehavele · Post by **tehavele** » Fri Jul 10, 2015 6:23 pm

Hash partitioning and sorting is done on Input Page of Pivot Enterprise Stage.

Hash Partitioning - KEY (all key columns )
Sorting - KEY (all key columns + SEQ).

Pivot Stage Properties:-

GroupBy - KEY( All key columns )
Pivot - SEQ and VALUE

tehavele · Post by **tehavele** » Sat Jul 11, 2015 8:22 pm

This issue is resolved now. Even though hash partitioning and sorting were set up properly it was not behaving as per expectation.

Issue Identification:-
Upon observing dump score it was cleat that Vertical Pivot stage was inserting it's own 'tsort' operator which was only happening on KEY columns and not on SEQ field which was important. I guess that's why the output was having incorrect order of columns after vertical pivot. Also, there was a log warning saying

Code: Select all

 pvVerticalPivot: When checking operator: User inserted sort "pvVerticalPivot.lnkVerticalPivot_Sort" does not fulfill the sort requirements of the downstream operator "pvVerticalPivot"

DataStage did modify sort to fulfill sort requirements of the downstream operator.

Resolution:-
Added and Set up environment variables '$APT_NO_SORT_INSERTION' and '$APT_NO_PART_INSERTION' to TRUE. This ensured that DataStage is not adding tsort operator on its own and then it worked perfectly.

Thanks Craig for your help.

chulett · Post by **chulett** » Sun Jul 12, 2015 6:31 am

Interesting, thanks for posting your resolution.

priyadarshikunal · Post by **priyadarshikunal** » Sun Jul 12, 2015 10:35 am

I think it insert a sort operator if the key defined doesn't match the sort keys upstream. And the way out is either set APT_NO_SORT_INSERTION or have another sort stage to use the key as in pivot and mark "Do not sort (Previously sorted)". You may also set APT_SORT_INSERTION_CHECK_ONLY which will verify the sort order, may produce a warning, but won't sort the data.

tehavele · Post by **tehavele** » Sun Jul 12, 2015 11:42 pm

Tried 2nd option by
1) inserting a separate sort stage and setting "Don't Sort (Previously Sorted)"
2) removing previous setting of hash partitions and sorting
3) removing APT_NO_SORT_INSERTION

but it dint give expected output. Orders of pivoted columns were incorrect for some of the random records. It threw below warning which came earlier.

Code: Select all

pvVerticalPivot: When checking operator: User inserted sort "Sort_89" does not fulfill the sort requirements of the downstream operator "pvVerticalPivot"

priyadarshikunal · Post by **priyadarshikunal** » Sun Jul 12, 2015 11:53 pm

you should not remove previous sorting as this sort stage will not sort anything.

tehavele · Post by **tehavele** » Mon Jul 13, 2015 12:05 am

Kept previous settings as it is and added Sort Stage with "Don't Sort (Previously Sorted)" set and It worked just fine.

Thanks Priyadarshi.

DSXchange

Pivot Enterprise Stage - Vertical Pivot issue

Pivot Enterprise Stage - Vertical Pivot issue

Re: Pivot Enterprise Stage - Vertical Pivot issue