Page 1 of 1

Sort is failing

Posted: Tue Nov 19, 2013 1:10 pm
by Seyed
All,
I have a server job that the lookup fails due to sort problems. To isolate the cause, I simplified the job by limiting it to the source (ORAOCI9 stage), SORT stage, Hash stage and finally outputing to a sequential file. With or without the sort stage, the sort fails. I am already sorting the data in the ORAOCI9 using an 'order by' clause, but nevertheless, both the input and output tabs in the Hash stage end up being out of order.

The ORAOCI9 stage uses the following select statement to pull in the information. When I sample data from the ORAOCI stage, the data is orted.

Code: Select all

select table_name  from all_tables where owner in (#SCHEMAS#)
order by table_name
Any ideas why the sort is failing? By the way, I have deleted the Hashed stage and replaced it with a new one, but it didnt' make any difference.

Thank you,


Seyed

Posted: Tue Nov 19, 2013 1:22 pm
by chulett
A) Why add a Sort stage when the data is already sorted?
B) What makes you think the sort is 'failing'?
C) Are you using the Hashed table to remove duplicates and the 'wrong one' is going in last?
D) Please don't expect any particular order when selecting from a hashed table any more than you would with a database table, you'd need to either do a "sorted select" or use the Sort stage after reading from the Hashed table if the order in the Sequential file is critical.

If nothing above helps, detail for us what exactly "the lookup fails due to sort problems" means.

Posted: Tue Nov 19, 2013 4:10 pm
by ray.wurlod
Hashed files store data according to a hashing algorithm. Any previous sorting will be discarded.

Posted: Tue Nov 19, 2013 5:16 pm
by chulett
Your sort stage worked just fine, it's just the expectation of getting sorted output from the hashed file that was incorrect. Due to the destructive nature of hashed file key handling (duplicates are not allowed and last one in survives) the order records are written to the hashed file can be critical if you are replying on that functionality to remove duplicates. That was why I asked (C) as that is the only time the input order matters. If all of your records are unique, any manner of sorting before the hashed file doesn't buy you anything.

As noted, there are ways to get sorted data out of a hashed file (using SSELECT comes to mind but it's not supported by the stage from what I recall) so best to sort afterwards as you've done now so the final result is as expected.

Hope that helps.