Performance while loading target database table

Post questions here related to DataStage Server Edition, in such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

nkln@you
Premium Member
Posts: 271
Joined: Wed Nov 17, 2004 5:15 am
Location: US

Performance while loading target database table

Post by nkln@you »

I am selecting records from table T1 of database D1 and inserting them into table T2 of database D2.

This job, J1, took 18 minutes to complete. The number of records selected is 163,497. So I thought the performance was poor.

Then I developed job J2, where I am selecting records from table T1 of database D1 (same as J1) and inserting them into a sequential file.
The time taken for this job J2 is 5 mins 20 secs.



In J1, Array Size = 50000 in the source stage; in the target stage, Array Size = 25000 and Transaction Size = 30000.

In J2, Array Size in the source stage = 50000.

So, where is the problem in job J1 with regard to performance?
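
For reference, here is a quick back-of-the-envelope throughput comparison (a minimal Python sketch, using only the timings quoted above):

Code:

    # Rough throughput for J1 (table to table) vs J2 (table to flat file),
    # using the figures quoted above: 163,497 rows in 18 min vs 5 min 20 sec.
    rows = 163_497

    j1_seconds = 18 * 60          # job J1: 18 minutes
    j2_seconds = 5 * 60 + 20      # job J2: 5 mins 20 secs

    print(f"J1: {rows / j1_seconds:.0f} rows/sec")  # ~151 rows/sec
    print(f"J2: {rows / j2_seconds:.0f} rows/sec")  # ~511 rows/sec

The gap suggests it is the insert into D2.T2, not the select from D1.T1, that accounts for most of the time.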
Aim high
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Your J2 loads the records to a flat file. Build another job that loads this flat file to your D2.T2.
See how much time that takes.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
nkln@you
Premium Member
Posts: 271
Joined: Wed Nov 17, 2004 5:15 am
Location: US

Post by nkln@you »

Is it that you want me to first select records from D1.T1 and store them in a sequential file, and then load the sequential file's records into D2.T2?

If so, instead of modifying J1, I can modify job J2 by adding an output link from the sequential file to D2.T2.

Is this OK for checking your suggestion?
Aim high
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

No no, you got me wrong.
First you were selecting from D1.T1 and loading to D2.T2 in the same job.
Then you created a new job that selects from D1.T1 and loads to a sequential file, which took 5 mins or so.
Create another job that loads the sequential file to D2.T2 to see how much time that takes.
Basically, split your J1 into two jobs.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
nkln@you
Premium Member
Posts: 271
Joined: Wed Nov 17, 2004 5:15 am
Location: US

Post by nkln@you »

I checked. The time taken for the second job is 11 mins.

I feel this is also a long time. The array size is 32767.

When the job started, rows/sec was 600. When the number of records read from the sequential file was around 31000, rows/sec gradually reduced to 300.

Thereafter it continued at 300 rows/sec.

I feel the rows/sec should be 1500, which I heard from others. Where is it going wrong?
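
For scale, the overall average works out as below (a quick sketch with the numbers above); it is consistent with the observed fall from ~600 to a steady ~300 rows/sec.

Code:

    # Average throughput for the file-to-table load: 163,497 rows in 11 minutes.
    rows, seconds = 163_497, 11 * 60
    print(f"average: {rows / seconds:.0f} rows/sec")  # ~248 rows/sec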
Aim high
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

While loading, check the network traffic.
Also, what action are you performing on the table?
1500 rows/sec may be good for someone else's case but not for yours, depending upon your record size, the action performed, etc.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Why have you set the array size so large? Array size is the number of rows sent at a time. Therefore, until that many rows have been processed, no rows will be sent. But the clock is running, so your rows per second (a totally meaningless metric - search the forum for reasons) will be severely understated. Try a much smaller array size - say 100 - and don't forget that there needs to be a sensible relationship between array size and rows per commit (transaction size).
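
To make the trade-off concrete, here is a minimal sketch (Python, illustrative numbers only - not anything DataStage itself executes) of how the array size affects round trips and how many rows are buffered before the first send:

Code:

    # Illustrative only: the trade-off behind the array size setting.
    # Larger arrays mean fewer network round trips, but no rows are sent
    # (or show up in rows/sec) until a full array has been buffered.
    rows = 163_497
    for array_size in (100, 1_000, 25_000, 50_000):
        round_trips = -(-rows // array_size)  # ceiling division
        print(f"array size {array_size:>6}: {round_trips:>5} round trips; "
              f"{array_size} rows buffered before the first send")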
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

And that 'sensible relationship' would typically be this: Transaction Size should be an even multiple of Array Size.
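
As a quick illustration (a Python sketch, not DataStage itself), the settings from this thread check out like this:

Code:

    # "Transaction Size should be an even multiple of Array Size", so that
    # every commit lands exactly on an array boundary.
    def sensible(array_size: int, transaction_size: int) -> bool:
        return transaction_size % array_size == 0

    print(sensible(100, 30_000))     # True:  300 arrays per commit
    print(sensible(25_000, 30_000))  # False: J1's original target settings
                                     #        don't line up on a boundary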
-craig

"You can never have too many knives" -- Logan Nine Fingers
nkln@you
Premium Member
Posts: 271
Joined: Wed Nov 17, 2004 5:15 am
Location: US

Post by nkln@you »

Yes Ray, I made the Array Size 100 and the Transaction Size 30000 (to avoid a larger number of commits).

Now I have gained performance, i.e. the job is taking 12 mins now, down from the 18 minutes it originally took for the same 163,497 records.

Still I believe it can be tuned (as 12 minutes is a long time for that many records). Any other parameters to be tuned?

DSguru, how do I check network traffic, and how do I proceed further?
Aim high
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I don't see an answer to the 'what is your update action' question. Are these strictly 'Insert without clearing' records? If so, why not bulk load them? :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
nkln@you
Premium Member
Posts: 271
Joined: Wed Nov 17, 2004 5:15 am
Location: US

Post by nkln@you »

My update action is "insert only". I tried the Bulk stage now in Normal mode with Maximum Record number 100. The time taken is around 11 mins, so there is not much improvement in performance.

I heard that bulk load doesn't improve performance much. Even some of the threads say this.

Anything else to consider?
How do I check the network traffic and proceed?
Aim high
vipshiva
Participant
Posts: 26
Joined: Thu Sep 28, 2006 4:48 am

Post by vipshiva »

Try partitioning the target table and then bulk loading; it will improve performance. Otherwise, create indexes on the target table.

Regards,
Siva.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

nkln@you wrote: I heard that bulk load doesn't improve performance much. Even some of the threads say this.
You heard wrong.

As an exercise, I just had one of my newbies convert a job from OCI inserts to the OCI Bulk Loader. Ray's favorite metric - rows/second - went from 600 to 15,000. Runtime went from 6.5 hours to 12 minutes. Nope, doesn't improve performance much... it improves it a lot.
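
Putting those figures side by side (a quick sanity check, using only the numbers above):

Code:

    # Speedup from converting OCI inserts to the OCI Bulk Loader.
    rate_before, rate_after = 600, 15_000          # rows/sec
    time_before, time_after = 6.5 * 3600, 12 * 60  # seconds
    print(f"rate:    {rate_after / rate_before:.0f}x faster")   # 25x
    print(f"runtime: {time_before / time_after:.1f}x shorter")  # 32.5x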
-craig

"You can never have too many knives" -- Logan Nine Fingers
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Of course, you need to design the job(s) properly to take advantage of this. A poor design can still give poor performance, even when bulk loading.

What is your target database? Why would you set your 'max' to 100 for a load like this? :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Who said that bulk load doesn't make a difference? It makes a helluva difference. But then again, I am not aware of your environment.
While your job is running, monitor it and see how much CPU is utilized. For network traffic, run top or glance to monitor it.
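
If glance isn't available on your platform, a small cross-platform sketch along these lines can sample CPU and network counters while the load runs (Python with the third-party psutil package installed - an assumption, not something DataStage ships with):

Code:

    # Minimal monitoring sketch; assumes psutil is installed (pip install psutil).
    # Prints CPU utilisation and network throughput once a second; stop with Ctrl-C.
    import psutil

    prev = psutil.net_io_counters()
    while True:
        cpu = psutil.cpu_percent(interval=1.0)  # blocks for one second
        cur = psutil.net_io_counters()
        out_kb = (cur.bytes_sent - prev.bytes_sent) / 1024
        in_kb = (cur.bytes_recv - prev.bytes_recv) / 1024
        prev = cur
        print(f"cpu {cpu:5.1f}%  net out {out_kb:8.1f} KB/s  in {in_kb:8.1f} KB/s")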
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.