Performance while loading target database table

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

nkln@you
Premium Member
Premium Member
Posts: 271
Joined: Wed Nov 17, 2004 5:15 am
Location: US

Post by nkln@you »

When I did a "top 10" command, I got the foolowing statistsics.

load averages: 0.96, 0.93, 0.66 14:52:00
242 processes: 234 sleeping, 5 zombie, 1 stopped, 2 on cpu
CPU states: 77.0% idle, 19.4% user, 3.6% kernel, 0.0% iowait, 0.0% swap
Memory: 4096M real, 1265M free, 66M swap in use, 8134M swap free

PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND
1267 dsadm 4 10 0 45M 20M sleep 5:47 19.00% uvsh
423 dsadm 4 0 0 30M 13M sleep 2:55 1.23% dsapi_slave
7997 oracle 51 12 0 376M 247M cpu1 19.2H 1.19% java
1367 dsadm 1 58 0 2288K 1728K cpu3 0:00 0.16% top
29341 dsadm 4 58 0 28M 10M sleep 0:51 0.11% dsapi_slave
29366 dsadm 4 58 0 28M 10M sleep 0:41 0.11% dsapi_slave
20310 oracle 1 58 0 440M 400M sleep 1:38 0.05% oracle
1246 oracle 1 58 0 172M 134M sleep 0:08 0.03% oracle
269 dsadm 4 58 0 28M 10M sleep 0:23 0.02% dsapi_slave
626 dsadm 4 58 0 28M 10M sleep 0:46 0.01% dsapi_slave

The Process Id of my Job is 423, it is taking 1.23%.

CPU idle time is 77.0%. I am not familiar with these statistsics, are the statistics good or bad....??
Aim high
nkln@you
Premium Member
Premium Member
Posts: 271
Joined: Wed Nov 17, 2004 5:15 am
Location: US

Post by nkln@you »

nkln@you wrote:When I did a "top 10" command, I got the foolowing statistsics.

load averages: 0.96, 0.93, 0.66 14:52:00
242 processes: 234 sleeping, 5 zombie, 1 stopped, 2 on cpu
CPU states: 77.0% idle, 19.4% user, 3.6% kernel, 0.0% iowait, 0.0% swap
Memory: 4096M real, 1265M free, 66M swap in use, 8134M swap free

When I did a "top 10" command, I got the foolowing statistsics.

load averages: 0.96, 0.93, 0.66 14:52:00
242 processes: 234 sleeping, 5 zombie, 1 stopped, 2 on cpu
CPU states: 77.0% idle, 19.4% user, 3.6% kernel, 0.0% iowait, 0.0% swap
Memory: 4096M real, 1265M free, 66M swap in use, 8134M swap free
PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND
1267 dsadm 4 10 0 45M 20M sleep 5:47 19.00% uvsh
423 dsadm 4 0 0 30M 13M sleep 2:55 1.23% dsapi_slave
7997 oracle 51 12 0 376M 247M cpu1 19.2H 1.19% java
1367 dsadm 1 58 0 2288K 1728K cpu3 0:00 0.16% top
29341 dsadm 4 58 0 28M 10M sleep 0:51 0.11% dsapi_slave
29366 dsadm 4 58 0 28M 10M sleep 0:41 0.11% dsapi_slave
20310 oracle 1 58 0 440M 400M sleep 1:38 0.05% oracle
1246 oracle 1 58 0 172M 134M sleep 0:08 0.03% oracle
269 dsadm 4 58 0 28M 10M sleep 0:23 0.02% dsapi_slave
626 dsadm 4 58 0 28M 10M sleep 0:46 0.01% dsapi_slave
The Process Id of my Job is 423, it is taking 1.23%.

CPU idle time is 77.0%. I am not familiar with these statistsics, are the statistics good or bad....?? The Process Id of my Job is 423, it is taking 1.23%.

CPU idle time is 77.0%. I am not familiar with these statistsics, are the statistics good or bad....??
Aim high
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Only taking 1.23% CPU indicates that CPU is not the bottleneck. However, dsapi_slave is an agent of your client process - it is not the process executing the job.
Processes running jobs usually run uvsh. I suspect (lacking other processes) that pid 1267 is the one running your job.
Even this process is only consuming 20% of a CPU.

Therefore I suggest you look elsewhere for the problem. 77% CPU means that there is plenty of spare CPU capacity. Look next at memory and disk, but chances are that the slowness comes from how you're loading the table. Are there indexes? Constraints? Triggers? All of these will slow loading operations.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
nkln@you
Premium Member
Premium Member
Posts: 271
Joined: Wed Nov 17, 2004 5:15 am
Location: US

Post by nkln@you »

Ray, Now I tested as follows:
One Source which populates a Sequential File.

There were 194120 records selected. The time taken for this job is 5 mins 47 sec.

Other details :

In toad the same sql took 8 sec to fetch 500 records.
Array Size:100
CPU utilization for this job : 17.59%
Interprocess Communication is enabled.
Cache size: 128 kb
Then I tried the following
One source stage , whose o/p link is connected to a transformer, the o/p link of transformer is connected to a sequential file.

Then I ran the job, stll the job took 5 min 25 sec.( with same parameters as above)

The number of records flowing on link to transformer are same as that of records flowing on link to sequential file.
I am unable to find out where the problem lies
Aim high
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

What I advised you to do (in person, not here) was to add a constraint to the Transformer stage. The constraint expression should be the system variable @FALSE.

This will isolate the "problem" to the extraction from Oracle or not.

Let us know how long that takes to execute.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Some math, to keep you amused.

500 rows took 8 sec.

194120 rows is 388.24 times 500, so you would expect - if things scaled linearly - that it would take 8 times 388.24 seconds (which is 3105.02 seconds, or 51.76 minutes) to select 194120 rows.

You're getting it in 5 minutes 47 seconds.

Where's the problem?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I'm not sure there's ever been a true problem here per se, only the perception of a problem. :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
nkln@you
Premium Member
Premium Member
Posts: 271
Joined: Wed Nov 17, 2004 5:15 am
Location: US

Post by nkln@you »

Hi Ray, I added constraint @FALSE in the trasformer, now the time taken is 4 min 53 sec.

Is it that fetching records scale up linearly?? Need a clarification on this.

Craig,

Is this not a problem? I think this is a problem, for loading 190,000 records into a sequential file it takes 5 mins (approx), if I write records from this sequential file to another table it takes 7 mins.

So, to load 190,000 records from one table to another table it is taking 12 mins. This is not normal I think.
Aim high
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Define 'normal'. It can depend on so many factors, some of which we're trying to help you quantify.

So, your extraction from Oracle to a flat file job went from 5:47 to 4:53? What does that tell you? Not that much of a difference between the two versions runtimes, especially if you run it a number of times I'll bet.

It tells you your 'problem' is how fast you can pull data from your Oracle database. However, without knowing everything there is to know about your network, database, hardware setup, table structures, indexing, statistics, query complexity, explain plan (etc etc) we have no idea if you can do any better or not. Perhaps that's as normal as it gets for this query. That's something you - with the help of your Friendly Neighborhood DBA Man - will need to work out.

ps. And how fast Toad is in returning the first few records is a fairly meaningless metric.
-craig

"You can never have too many knives" -- Logan Nine Fingers
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

I think this is not a problem. As noted, a lot of factors are involved. If you dont like this, extract your data onto a file and bulk load the data. That otta make you happy.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

chulett wrote:ps. And how fast Toad is in returning the first few records is a fairly meaningless metric.
Craig, you ought to be more assertive. It's a totally meaningless metric.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ds_developer
Premium Member
Premium Member
Posts: 224
Joined: Tue Sep 24, 2002 7:32 am
Location: Denver, CO USA

Post by ds_developer »

Do you know if these records are unique based on the primary key before you try to load them? If you do, you can disable the primary key index before you load them and then enable it again when you are done. Same thing with foreign keys - if you know they are valid then you can disable the FK constraints before and enable them after. You can test this method by manually disabling these objects in the database before running your job.

Try it and let us know how long it takes. Of course, when you incorporate this into the job, enabling the indexes and constraints will add to the runtime.

John
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

He's "loading" them into a text file. Keys are irrelevant.

The problem here is one of expectation management (or maybe of bad math).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ds_developer
Premium Member
Premium Member
Posts: 224
Joined: Tue Sep 24, 2002 7:32 am
Location: Denver, CO USA

Post by ds_developer »

The original post is loading into a table so that is where the keys are relevant. Loading into a text file is a test to see where the slowdown is.

John
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

My bad. :oops: I was reviewing this diagnostic job in person yesterday and forgot about its origin.

But it remains the case that the significant part of the problem lies between the keyboard and the chair.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply