Hash lookup in parallel

nishantrk
Premium Member
Posts: 23
Joined: Fri May 27, 2011 11:43 am

Hash lookup in parallel

Post by nishantrk »

Hi,
I have to extract data from a table based on the SQL below:

select * from xx where col1 = #var1#

The #var1# is a variable to be populated at run time (it could come from a sequential file or another table).

In server edition we can write two jobs: the first populates #var1# into a hash file, and the second uses UtilityHashLookup to retrieve the value and put it in the WHERE condition.

How can I achieve the same in parallel? A Lookup stage against a fileset would be inefficient, as I need to select only a few records out of millions. Is there something similar to UtilityHashLookup, or a way to pass data from one job to another?
Jboyd
Participant
Posts: 15
Joined: Mon Mar 14, 2011 12:55 pm

Post by Jboyd »

I believe using a sparse lookup would work. Select the value you want from the initial table, then look up against the table you mentioned. Set the lookup type to Sparse in the database connector and, in the WHERE clause of the reference SQL, use where col1 = ORCHESTRATE.source_column.
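
As a rough sketch (var1 here just stands in for whatever the key column on your input link is called), the user-defined SQL on the sparse reference link could look like:

select *
from xx
where col1 = ORCHESTRATE.var1

Because the lookup type is Sparse, that query runs once per incoming row, so only the matching rows come back rather than the whole table.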
nishantrk
Premium Member
Premium Member
Posts: 23
Joined: Fri May 27, 2011 11:43 am

Post by nishantrk »

Thanks for your reply... what if var1 is coming from a sequential file or a .ds dataset?
Jboyd
Participant
Posts: 15
Joined: Mon Mar 14, 2011 12:55 pm

Post by Jboyd »

I believe the reference has to be a database table so you can implement a WHERE clause in your SQL; that way you can refer to the variable value coming through.
Jboyd
Participant
Posts: 15
Joined: Mon Mar 14, 2011 12:55 pm

Post by Jboyd »

You could always write that dataset or sequential file to a staging table and then go from there.
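
For example, once the values are in a staging table (stg_var1 and its var1 column are just illustrative names), the extraction SQL could simply join to it instead of using a run-time parameter:

select t.*
from xx t
inner join stg_var1 s
  on t.col1 = s.var1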
kwwilliams
Participant
Posts: 437
Joined: Fri Oct 21, 2005 10:00 pm

Post by kwwilliams »

There are a couple of ways to do it. You can use a sparse lookup as described in the previous posts. Or you can write two jobs and put them both in a sequence: in the sequence, cat the file and pass the command output as a parameter into the second job. If you are going to have more than one value in the WHERE clause, you would need to go with the sparse lookup.
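
For example, assuming the sequence uses an Execute Command activity to cat the file and passes the resulting value into a job parameter named var1 (as in the original post), the second job's source SQL could stay pretty much as it was; just remember the parameter is substituted as plain text, so quote it if col1 is a character column:

select *
from xx
where col1 = '#var1#'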
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

Why move to a parallel job if you aren't really gaining anything by moving? Implementing the overhead of parallel processing for a few rows negates the performance improvement running in parallel can give.
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

Sparse lookup should work if the reference stage supports sparse lookups, and it should not matter what stage var1 comes from in that case.
Choose a job you love, and you will never have to work a day in your life. - Confucius