Performance

Post questions here related to DataStage Server Edition, covering areas such as Server job design, DS Basic, Routines, Job Sequences, etc.

rsunny
Participant
Posts: 223
Joined: Sat Jul 03, 2010 10:22 pm

Performance

Post by rsunny »

Hi,

I am trying to load about 2 million records from a flat file into an Oracle database.
I have used Link Partitioner, Link Collector, and IPC stages, and I increased the array size and transaction size on the Oracle stage to 30,000, but the load still takes more than 30 minutes. Can anyone please suggest the best measures to increase performance and reduce the run time?

Thanks in advance
rsunny
Participant
Posts: 223
Joined: Sat Jul 03, 2010 10:22 pm

Post by rsunny »

Hi ,

I am also doing a lookup against a hashed file, and loading the lookup data from the database into the hashed file runs at only around 5,000 rows/sec, so most of the time is spent on the reading and loading steps. Is there anything I can do to speed up loading the data from the database into the hashed file?

Thanks in advance
battaliou
Participant
Posts: 155
Joined: Mon Feb 24, 2003 7:28 am
Location: London

Post by battaliou »

Have you tried a bulk load? Your flat file could be loaded directly into Oracle without needing DataStage at all.
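For example, one way to do it from inside the database is an external table plus a direct-path insert. This is only a sketch: the directory, file, table, and column names below are placeholders, and it assumes the flat file is visible on the database server.

CREATE DIRECTORY load_dir AS '/data/incoming';

CREATE TABLE src_ext (
  cust_id   NUMBER,
  cust_name VARCHAR2(100)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY load_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
  )
  LOCATION ('source.dat')
);

-- the APPEND hint requests a direct-path insert, bypassing the conventional path
INSERT /*+ APPEND */ INTO target_table
SELECT cust_id, cust_name FROM src_ext;
COMMIT;

SQL*Loader with DIRECT=TRUE would achieve much the same thing from the client side.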

Also, remember to check "Allow stage write cache" for your hashed file. If you know the key type of the hashed file, you could get a marginal performance improvement by making it a type 18 or 13 and setting a modulus and group size, but this gets a bit tricky, so it is probably best to stay with type 30 (dynamic).

You can get partitioned parallelism going as well by inserting a second Oracle write stage into your job and sending half the data down each link. If you go down this route, you could feasibly have two hashed files for your lookup too. Preloading the hashed file into memory is always good if it's not too big.
3NF: Every non-key attribute must provide a fact about the key, the whole key, and nothing but the key. So help me Codd.
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

If you are updating records, check with your DBA to see whether you are getting any row-level locking.
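A quick way to check, run as a DBA while the job is loading (v$session is the standard Oracle view; any rows returned are sessions stuck waiting on a lock held by blocking_session):

SELECT sid, serial#, username, blocking_session
FROM   v$session
WHERE  blocking_session IS NOT NULL;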

Did you configure your stage to be parallel or sequential?

If you are set on one, then try the other to see whether performance improves.

Update statistics on the table.
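For example (the schema and table names here are placeholders):

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => 'MY_SCHEMA',
    tabname => 'TARGET_TABLE'
  );
END;
/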
rsunny
Participant
Posts: 223
Joined: Sat Jul 03, 2010 10:22 pm

Post by rsunny »

Hi ,

I am doing some transformations before loading into the DB. Most of the time is taken by the lookup, i.e. loading data from the DB into the hashed file: if my job runs for 10 minutes, loading from the DB into the hashed file takes almost 6 to 7 minutes of that. I tried checking "Allow stage write cache" but saw no improvement in performance. My stage is sequential, and I am using Server Edition; I do not have Enterprise Edition.

Thanks in advance
sajarman
Participant
Posts: 41
Joined: Mon Nov 28, 2005 6:29 am

Post by sajarman »

Try to drop any columns and rows that are not required as early as possible. If some columns from the lookup are already available in the source, do not fetch them in the lookup. I have seen performance degrade rapidly as the data held in memory grows large. You could also try splitting the job into two jobs by logically dividing the data that is processed and written into the hashed files in each job.
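In other words, trim the extraction query that populates the hashed file down to the key plus only the columns the job actually uses, something like this (table, column, and filter names are placeholders):

SELECT cust_id, credit_limit
FROM   customer_dim
WHERE  active_flag = 'Y';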