Page 1 of 1

Slow Lookup in Netezza

Posted: Tue Feb 05, 2008 11:25 am
by just4geeks
I have 3 datasets containing around a 100, 200, 70000 records in each. Each dataset holds a key. The keys are used to lookup a netezza table to pull up a few facts into a Seq file. The table holds 60 million records. The entire job takes around 20 - 25 min. Is there anyway at all to speed up the process? I use a lookup stage to peer into the table and the table is accessed using a ODBC enterprise stage.

Is the ODBC stage responsible for such a slow lookup? We do have a Netezza Enterprise stage but that can only act as a target and not as a source or reference.

I am using a Table Read Method and all options are at default. Will changing Isolation level or Fetch Array Size make a difference? I will try doing that but each run may take upto 25 min.

Any ideas will be appreciated. Thanks for your time.

Re: Slow Lookup in Netezza

Posted: Tue Feb 05, 2008 1:40 pm
by sud
Why will the Netezza enterprise stage act only as a target? I dont use netezza now, but as far as I remember, we used to use the netezza enterprise stage as a source. One thing about netezza, it can usually beat tools like datastage in almost all types of transformations, lookup etc so dont apply the general ETL techniques with it. Try to do stuff as much as possible on the database side. For example, in your case in case your datasets' data is coming from the same netezza database, try do all joins in a single SQL and then read the data.

From what it sounds like, you must be doing a cached lookup, hence datastage is reading the whole 60 million records into memory before doing the lookup. Attach the database stage directly to the lookup and choose a sparse lookup.

If even that takes a lot of time, try loading all the three datasets into temporary tables in netezza(which you can drop later) and do the joins in a SQL and then bring the data to datasatge. In case you are about to write to a file, do the whole stuff including export to file within netezza and dont even involve datastage, you will see the lookup and all happening in matter of seconds.

Posted: Tue Feb 05, 2008 2:37 pm
by Aruna Gutti
Netezza Enterprise Stage is just a bulk loader using either 'ET' or 'NZLoad' options. To use Netezza as target I am also using ODBC Enterprise stage. I haven't seen performance issue with Lookup ... but of course I have only around 10 million records on my Netezza table.

Re: Slow Lookup in Netezza

Posted: Tue Feb 05, 2008 3:18 pm
by just4geeks
sud wrote:Why will the Netezza enterprise stage act only as a target?
Following is the error I get when I try to reference Netezza Stage.

Image

I will try the following.
sud wrote: If even that takes a lot of time, try loading all the three datasets into temporary tables in netezza(which you can drop later) and do the joins in a SQL and then bring the data to datasatge.

Re: Slow Lookup in Netezza

Posted: Tue Feb 05, 2008 3:20 pm
by shawn_ramsey
Have you looked at what is going on with the Netezza? I have found that it is much faster to bulk load the data into a temporary table and use a SQL statement to join it and pull the data back out for stuff like this.

They have added a bulk unload with the new version of the stage but I am not sure that is going to help since it is only available under 8. I am also working on a request for a Netezza bulk join operator.