Performance problem using lookup

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

userasif
Participant
Posts: 50
Joined: Tue Jun 06, 2006 5:57 am

Performance problem using lookup

Post by userasif »

Hi all,

I have a 640 MB sequential source file. I want to extract the data and load it into a DW table, with a single lookup that supplies the description for a code column in the DW table. Without the lookup, my job extracts and loads the data in 5 minutes.

But when I use the lookup (as either a hashed file or a table), the job takes more than 2 hours to complete.

The job structures with the lookup are:

1. Sequential File ----> Transformer -----> Oracle OCI


********* Hashed File is used for lookup on Transformer stage


2. Sequential File ----> Transformer -----> Oracle OCI

********** Oracle OCI table is used for lookup on Transformer stage
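
In other words, per row the job is doing something like this (a rough Python-style sketch of my job, not DataStage code; the column names are invented):

Code: Select all

# The ~70-row lookup: (code1, code2) -> description.
descriptions = {("A", 1): "desc-A1", ("B", 2): "desc-B2"}

def transform(source_rows):
    for row in source_rows:                 # 2.7M rows from the Sequential File
        key = (row["code1"], row["code2"])
        row["description"] = descriptions.get(key)   # the lookup
        yield row                           # on to the Oracle OCI stage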


Can anyone help me? What am I missing, and where am I going wrong?

Thanks in advance to the board,

ITsME
ShaneMuir
Premium Member
Posts: 508
Joined: Tue Jun 15, 2004 5:00 am
Location: London

Post by ShaneMuir »

How many records are in the lookup? Not that it should really matter - I can't see how a hashed file could take that long. An Oracle table lookup could take a while if your lookup was based on non-key fields and you were doing full table scans, but a hashed file should not have that problem. Are you getting any errors in your logs? Are you sure it's being caused by the lookup and not by the insert into the DB? The more information you can provide us for each step of the process, the better.
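
To illustrate the difference I mean (a rough Python sketch, purely illustrative - this is not DataStage internals, and all the names are made up):

Code: Select all

# Style 1: hashed-file-like lookup - one keyed probe per input row.
lookup = {("A", 1): "desc-A1", ("B", 2): "desc-B2"}   # the ~70 lookup rows

def hashed_lookup(k1, k2):
    return lookup.get((k1, k2))          # O(1) per input row

# Style 2: unindexed table lookup - effectively a scan per input row.
rows = [("A", 1, "desc-A1"), ("B", 2, "desc-B2")]     # the same rows

def scan_lookup(k1, k2):
    for c1, c2, desc in rows:            # O(table size) per input row
        if (c1, c2) == (k1, k2):
            return desc
    return None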
userasif
Participant
Posts: 50
Joined: Tue Jun 06, 2006 5:57 am

Post by userasif »

Thanks for the response,

The lookup table contains 70 records, with a composite key on two columns.

I am not receiving any type of error. The job finishes successfully.

The structure of Job1 (without the lookup) is the same as that of Job2 (with the lookup):
Job1: SeqFile ---> Transformer ---> Oracle OCI

Job2: SeqFile ---> Transformer (with Hashed File lookup) ---> Oracle OCI

The source file contains 2,743,873 records. In Job1 the insert rate is 13,066 rows/sec; in Job2 it is 271 rows/sec.

Both jobs drop and recreate the target table.

Thanks & Regards.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

How many columns are in the hashed file? How many of these are actually used in the job? Are you enabling read cache for the hashed file?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
userasif
Participant
Posts: 50
Joined: Tue Jun 06, 2006 5:57 am

Post by userasif »

Hi Ray,

1. The hashed file contains 4 columns:

Column  Type       Key
Col1    Varchar    Yes
Col2    Decimal    Yes
Col3    Decimal    No
Col4    Timestamp  No

2. No, I am not enabling the read cache.

Thanks & Regards.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Well then, the first suggestion would be to enable Read Caching in the lookup and let us know what kind of effect that has on your job speed.
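
In spirit, read caching pulls the whole hashed file into memory once, so each of your 2.7 million lookups becomes a memory probe instead of a disk access. A minimal Python sketch of the idea (not how DataStage implements it; 'lookup.db' is a stand-in file):

Code: Select all

import shelve  # stand-in for an on-disk hashed file

def run_without_cache(keys, path="lookup.db"):
    with shelve.open(path) as db:           # each .get() may hit disk
        return [db.get(k) for k in keys]

def run_with_cache(keys, path="lookup.db"):
    with shelve.open(path) as db:
        cache = dict(db)                    # one pass loads all ~70 rows
    return [cache.get(k) for k in keys]     # memory-only from here on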
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Then, what is the "load rule" in the OCI stage? Did your fast run load into an empty table and your slow run into a non-empty table? Try separating your load into "insert only" and "update only", or even bulk load if they are all supposed to be new rows.
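
Conceptually, the split looks like this (an illustrative Python sketch, not DataStage code - in the real job it would be a transformer constraint feeding two OCI stages; the names here are invented):

Code: Select all

source_rows = [{"col1": "A"}, {"col1": "C"}]   # stub input rows
existing = {"A", "B"}                          # stub: keys already in the target

inserts, updates = [], []
for row in source_rows:
    (updates if row["col1"] in existing else inserts).append(row)

# inserts -> OCI stage with update action "Insert rows without clearing"
#            (or a bulk load)
# updates -> OCI stage with update action "Update existing rows only"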
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
userasif
Participant
Posts: 50
Joined: Tue Jun 06, 2006 5:57 am

Post by userasif »

Thanks all,

Enabling Read Caching in the lookup works fine ... :D ... but :( what happens is that my job starts very well at 6,350 rows/sec and loads 1.5 million rows in just 5 minutes. After that the rate goes down, and the remaining 1.1 million rows take 2 hours to load, with a final rate of 271 rows/sec. I have run the job three times and it shows the same behaviour every time.

I have also increased the Read & Write Cache size for hashed files from the Tunables tab in Administrator, but there is no change in the speed of the job.

The load is into an empty table every time; the OCI stage drops and recreates the table.

Thanks & Regards.
narasimha
Charter Member
Posts: 1236
Joined: Fri Oct 22, 2004 8:59 am
Location: Staten Island, NY

Post by narasimha »

Another test -
Replace the OCI stage with a Sequential File stage.
Check the performance.
That way we can try to identify the cause of the degradation.
Narasimha Kade

Finding answers is simple, all you need to do is come up with the correct questions.
userasif
Participant
Posts: 50
Joined: Tue Jun 06, 2006 5:57 am

Post by userasif »

Thanks Narasimha,

The test was successful: the job completed in just 4 minutes, and the target sequential file contains the lookup column values.

But what is wrong with OCI is still the question...

Regards.
narasimha
Charter Member
Posts: 1236
Joined: Fri Oct 22, 2004 8:59 am
Location: Staten Island, NY

Post by narasimha »

I guess now we can confirm that the actual degradation is happening in the OCI stage.
You can optimize the performance of the OCI stage in many ways - try searching for them here.
Increasing the Array Size in the OCI stage can considerably increase the speed of your load.
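
As a rough analogy (Python with the cx_Oracle driver, just to show the idea - in DataStage the stage does this batching for you; credentials and names are placeholders):

Code: Select all

import cx_Oracle  # assumption: an Oracle client driver is available

con = cx_Oracle.connect("user/password@dsn")    # placeholder connect string
cur = con.cursor()

source_rows = [("A", 1, 2, "2006-01-01")]       # stub for the 2.7M rows
SQL = "INSERT INTO abc (col1, col2, col3, col4) VALUES (:1, :2, :3, :4)"

batch = []
for row in source_rows:
    batch.append(row)
    if len(batch) == 5000:                      # ~= Array Size 5000:
        cur.executemany(SQL, batch)             # one round trip per 5000 rows
        batch.clear()
if batch:
    cur.executemany(SQL, batch)
con.commit()                                    # ~= Transaction Size 0: single commit at the end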
Narasimha Kade

Finding answers is simple, all you need to do is come up with the correct questions.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I doubt there is anything wrong with 'OCI' itself, just with how you are using it. The preferred way to load large numbers of records is through a bulk loader. If all you are doing is inserts, why not give that a shot?

However, you never answered the question posed as to what your update action is, did you? Besides the normal Array Size and Transaction Size issues, you've probably got an ever-increasing full table scan going on if updates are involved and there's no index to help it. That's one of the primary reasons something like... that... slows... down.

So, the only way to answer your question is for you to give us enough information that it is answerable. Explain what you are doing in the OCI stage, what your target looks like, any indexes or triggers or constraints it may have on it, yada yada.
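
For reference, a bulk load would look roughly like this outside of DataStage (a sketch driving Oracle's SQL*Loader from Python; DataStage Server also has an Oracle bulk loader stage - file names, credentials and the column list are placeholders):

Code: Select all

import subprocess

CTL = """\
LOAD DATA
INFILE 'source.dat'
APPEND INTO TABLE abc
FIELDS TERMINATED BY ','
(col1, col2, col3, col4)
"""

with open("abc.ctl", "w") as f:
    f.write(CTL)

# direct=true requests a direct-path load, which bypasses much of the
# conventional INSERT overhead.
subprocess.run(
    ["sqlldr", "userid=user/password@dsn", "control=abc.ctl",
     "log=abc.log", "direct=true"],
    check=True,
)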
-craig

"You can never have too many knives" -- Logan Nine Fingers
rleishman
Premium Member
Posts: 252
Joined: Mon Sep 19, 2005 10:28 pm
Location: Melbourne, Australia
Contact:

Post by rleishman »

Also check that you have "Treat Warning Message as Fatal Error" checked in the OCI stage. If you have Unique/PK violations, ignoring these errors adds a huge overhead.
Ross Leishman
userasif
Participant
Posts: 50
Joined: Tue Jun 06, 2006 5:57 am

Post by userasif »

Well, here are the details of the Oracle OCI 8i stage.

General
Update Action = Insert rows without clearing
Transaction Isolation = Read Committed
Array Size = 5000, Transaction Size = 0
Create Table Action = Generate DDL (Create Table abc .....)
Drop Table Action = Generate DDL (Drop Table abc )
Treat Warning message as Fatal error = NOT checked (OFF)

About table abc: no INDEX, only a primary key on Col1.

Transaction Handling

Rows per transaction = 0

********************************************************
I designed another job that loads data from the sequential file I used as the target in the hashed-file lookup test (so it already contains the lookup column values) into an OCI target with the above-mentioned properties. This job is fine and loads the data in 4 minutes. The only difference is that this job has no LOOKUP.
********************************************************

Thanks to all of you.
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

As mentioned, building up the index takes longer and longer as the number of records increases. That might be one of the reasons you get a high load rate in the initial stages. But still, as per your previous statement, if you load the data without the lookup it loads within 5 minutes. In that case, did you not specify any key during that load?
Try a bulk load, or drop the index before the load and recreate it afterwards.
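
Something like this, in spirit (a Python/cx_Oracle sketch with the placeholder names from earlier in the thread; 'abc_pk' is an invented constraint name):

Code: Select all

import cx_Oracle  # assumption: an Oracle client driver is available

con = cx_Oracle.connect("user/password@dsn")    # placeholder connect string
cur = con.cursor()

# Drop the primary key first -- its underlying index is what keeps
# growing (and slowing) as rows pile up during the load.
cur.execute("ALTER TABLE abc DROP PRIMARY KEY")

def load_rows(con):
    """Stub for the 2.7M-row load (see the executemany sketch above)."""

load_rows(con)                                   # index-free load

# Rebuild the constraint (and its index) once, at the end.
cur.execute("ALTER TABLE abc ADD CONSTRAINT abc_pk PRIMARY KEY (col1)")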
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'