Simple Job Running for long time

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

raj_cipher
Participant
Posts: 90
Joined: Mon Dec 08, 2003 4:48 am
Location: Chennai

Simple Job Running for long time

Post by raj_cipher »

Hi,

I have a job in which both the input and output are sequential files. It uses three transformers and five ODBC lookups.
The data volume is around 9 million rows. Two of the ODBC lookups have about 2.5 million rows each; the rest are in the thousands. This job takes 600 minutes to run.
The server configuration is very good and no other heavy load was running at the time.

Any ideas on how I can look into the job in detail and tune it?
Think Ahead,
Raj.D
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Your 'simple job' is doing 45 million ODBC lookups over the course of its run. :shock: All of them happen one at a time and over the network, unless your lookup sources are co-resident with the DataStage server.

A much better idea is to bring this data into hash files and do the lookups locally against the hashes.
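The difference craig describes can be sketched in a few lines. This is a hypothetical illustration, not DataStage code: the "remote" function stands in for one ODBC round trip per source row, while the dict stands in for a hashed file loaded once up front.

```python
# Hypothetical reference data; each tuple is (key, value).
reference_rows = [("K1", "foo"), ("K2", "bar"), ("K3", "baz")]

def remote_lookup(key):
    # Stands in for one ODBC round trip per source row:
    # the cost is paid again for every single input row.
    for k, v in reference_rows:
        if k == key:
            return v
    return None

# The hash-file approach: load the reference data once,
# then every lookup is a local in-memory hash probe.
local_hash = dict(reference_rows)

source = ["K2", "K3", "K2", "K9"]
results = [local_hash.get(k) for k in source]
print(results)  # ['bar', 'baz', 'bar', None]
```

With 9 million source rows and 5 lookups, eliminating the per-row network round trip is where nearly all the time goes.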
-craig

"You can never have too many knives" -- Logan Nine Fingers
raj_cipher
Participant
Posts: 90
Joined: Mon Dec 08, 2003 4:48 am
Location: Chennai

Post by raj_cipher »

Hi,

My ODBC lookups use queries with <= and >= conditions, and that kind of condition is not possible with hash files. Is there any way I could improve on the current set-up? The network connection is a 1 Gbps VLAN.
Think Ahead,
Raj.D
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

raj_cipher wrote: My ODBC lookups use queries with <= and >= conditions, and that kind of condition is not possible with hash files.
Strictly speaking, this isn't true. Granted, it's not an "out of the box" in-the-GUI kind of thing and you do have to do some work to achieve it, but it can be done. See this post for some ideas that may get you walking down that path.

You could also play the old game of accessing the Hash file using a UV stage, which would allow you to use your SQL syntax on the data. With proper indexing, I would think this would beat the pants off your ODBC lookups. Maybe even without proper indexing. :wink:
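To make the indexed-range idea concrete, here is a hypothetical sketch (plain Python, not DataStage or UV SQL) of what a `low <= key AND key <= high` lookup against an indexed structure does: a binary search on the low bound, then a check of the high bound, instead of scanning every reference row per source row.

```python
import bisect

# Hypothetical ranged reference data: (low, high, value), non-overlapping.
ranges = [(0, 99, "A"), (100, 499, "B"), (500, 999, "C")]
ranges.sort()
lows = [low for low, _, _ in ranges]

def ranged_lookup(key):
    # Binary search on the low bound, then verify the high bound --
    # roughly the work a B-tree index does for a BETWEEN-style query.
    i = bisect.bisect_right(lows, key) - 1
    if i >= 0:
        low, high, value = ranges[i]
        if low <= key <= high:
            return value
    return None

print(ranged_lookup(250))   # B
print(ranged_lookup(1500))  # None
```

An unindexed scan does this comparison work against every reference row for every source row, which is why indexing matters so much here.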

I seem to recall some other posts on this topic recently, but couldn't turn them up. Anyone have any other good posts stashed away in their Favorites that might help? Either that or care to chip in?
-craig

"You can never have too many knives" -- Logan Nine Fingers
tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

With proper indexing, I would think this would beat the pants off your ODBC lookups. Maybe even without proper indexing.
No, you pretty much have to have proper indexes when doing stuff like this unless the amount of data is very small. But it's pretty simple to add extra indexes.

Tony
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Well, a single instance of a job reading a seq file, doing a couple of ODBC lookups, and writing to a seq file will not use multiple CPUs to run. By instantiating the job and using a constraint to partition the source data, you can run as many instances as you have CPUs and increase your net throughput.
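The partitioning constraint Ken describes is commonly done with a MOD on a numeric key: each instance keeps only the rows whose key modulo the instance count equals its own instance number, so N instances together cover the source exactly once with no overlap. A hypothetical sketch of that idea (the key and instance count here are illustrative):

```python
# Each job instance evaluates a constraint like MOD(key, N) = instance_num
# and processes only its own slice of the source data.
instance_count = 4
source_keys = list(range(10))  # stand-in for the source rows' numeric keys

partitions = {
    n: [k for k in source_keys if k % instance_count == n]
    for n in range(instance_count)
}

print(partitions)
# Every row lands in exactly one partition: the union of all slices
# is the original source, with no duplicates.
covered = sorted(k for part in partitions.values() for k in part)
assert covered == source_keys
```

In DataStage terms, that constraint lives in the first transformer of each instance of the multi-instance job.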

Unfortunately, your ODBC lookups will slow you down, and chances are the database will struggle to run N times as many similar lookup queries, so you won't see a linear improvement as you run more instances. You MUST move to using hash files, which have an in-memory caching feature.

Now, to use hash files with a ranged lookup, you must create a memory-cacheable construct. This requires building the hash file in a way that supports your effort. This post viewtopic.php?t=84998&highlight=rollup describes my solution for doing such a thing.
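The linked post isn't quoted here, so the following is an assumption about one common way to make a range lookup memory-cacheable: expand each range into discrete keys at hash-file build time, so the runtime lookup becomes a plain equality probe that the read cache can serve.

```python
# Hypothetical build step: one hash record per discrete key in each range,
# so a plain keyed (equality) lookup answers the range question at run time.
ranges = [(1, 3, "low"), (4, 6, "high")]

hash_file = {}  # stands in for the hashed file being built
for low, high, value in ranges:
    for key in range(low, high + 1):
        hash_file[key] = value

print(hash_file.get(5))   # high
print(hash_file.get(99))  # None
```

This only pays off when the key domain is discrete and the ranges are reasonably small; for wide ranges, bucketing the key (e.g. looking up a truncated or rounded key) is the usual refinement.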
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Funny, that was the very post that I linked to above. :wink:

And Tony, you left off my little winky guy when you quoted me. I was kidding... mostly.
-craig

"You can never have too many knives" -- Logan Nine Fingers
tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

Yeah, I did. Sorry.