No rows process through job


tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

No rows process through job

Post by tonystark622 »

We ran into a problem when running a job: no rows go through or get processed. It appears to happen when we use certain database tables as lookups. While the job is running, the job monitor shows no rows being read and none processed on the lookup link. If I enable all the tracing options when I run the job, two rows are written to the job log on every refresh, indicating some database activity (I can't remember the exact message that's logged, but it's NOT an error or warning).

We made a simple job (from scratch) to test the problem and it fails too. The job reads from a flat file, does a lookup against a table in the database (it doesn't matter whether we use an Oracle OCI stage or an ODBC stage; the results are the same), and writes to a flat file. There are NO constraints or routines of any kind in the job. The data is simply copied from input to output, with a "join" on the lookup table in the transformer.

We have deleted these table definitions from the repository and "reimported" them, but it didn't help. Curiously enough, if I step through the simple test job in the debugger in Designer, it appears to work.

Does anyone have any suggestions?

Thanks for your help,
tony
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Double-check your lookup keys. If I recall correctly, only *successful* lookups show in the monitor, so if they all fail it would look like nothing was processed... but that, in and of itself, shouldn't prevent the rows from moving through the job, unless a constraint is involved, and you've already said one isn't.

If you are saying that the lookups do not fail and the job processes rows when run in the debugger, but *not* outside of it, then I'd report that as a bug to Ascential support.

-craig
tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

Craig,
I believe you're correct that only successful lookup rows are shown in the monitor. However, it doesn't show any rows read from the input flat file or written to the output flat file either. I'll have to go look and see whether any rows were actually written to the output file. As dumb as it sounds, I didn't check whether anything had been written; I just didn't see any rows going through the monitor and assumed nothing was happening. Duh on me :)

Thanks for your reply.
Tony
inter5566
Premium Member
Posts: 57
Joined: Tue Jun 10, 2003 1:51 pm
Location: US - Midwest

Post by inter5566 »

Tony,

How about trimming your key fields? If you press the View Data button, do both sets of keys appear to be in the same format?

Steve
tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

That might be the answer to speeding it up, Steve. I just looked at the job a minute ago. It has been running for almost 24 hours and has processed 12,xxx rows. I think the problem is that the lookups are just taking a loooooooonnnnnngggggg time.

So, it actually is processing rows, just very, very slowly.

Sorry to bother you folks with this. I just need to go figure out what's holding things up.

Thanks again,
Tony
Creo
Participant
Posts: 34
Joined: Wed Mar 19, 2003 1:12 pm
Location: Canada

Post by Creo »

Hi Tony,

You might want to check whether an index can be added to the actual table in the database that you use for your lookup. The index must contain all the fields you need to read from that table, and in the same order. I know it has resolved performance issues for me in the past, so you might want to talk to your DBA about it.
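
Just to illustrate (the table and column names below are made up; your DBA will know the real ones), the DDL is usually a one-liner along these lines:

    CREATE INDEX cust_lkp_idx ON cust_lookup (cust_no, plan_code);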

Hope it helps.

Creo
tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

Thanks, Creo. That was exactly my recommendation to the co-worker who brought this problem to my attention. Glad to know I'm not completely off base.

Best regards,
Tony
mhester
Participant
Posts: 622
Joined: Tue Mar 04, 2003 5:26 am
Location: Phoenix, AZ

Post by mhester »

Also, you may want to consider building a hashed file to use as the lookup, especially if the lookup table is large. The hashed file will give much faster response.
tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

I think that's what he ended up doing.

He started the job yesterday, with the database lookup in it, and just let it run (accidentally). It took 23 hours and 40 minutes to process 12,xxx rows. In the meantime, I think he rewrote the job the way you mentioned.

Tony
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Are you doing the lookups over a network?
If so, load the lookup table into a local table (or, even better, a hashed file) before processing, and stand by to be amazed at the gains.

Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518
tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

Yes, we were doing a lookup across the network (to the Oracle server). I think the guy that wrote that job did end up loading that table into a hash file and doing the lookup that way.

It was pretty weird that we couldn't see any data going through the job, though.

Tony
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Not at all. If you can get the database server to reduce the number of rows to be processed by DataStage, then so much the better. And if you can reduce the total network traffic then that's a bonus too.
Loading a local table or hashed file is decidedly recommended when the source of the reference keys is not the same database as the source of the stream data.
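
For example, when pre-loading the hashed file, use user-defined SQL in the OCI stage so that only the columns and rows you actually need cross the network (the names below are purely illustrative):

    SELECT cust_no, cust_name
    FROM ref_customers
    WHERE active_flag = 'Y'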
tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

Just wanted to send an update along...

I ran into this problem in another job. I was doing a lookup against an Oracle table. Because I needed to return multiple rows from my lookup link for every input row, I couldn't just use a hash file.

I did end up copying the Oracle table into a hash file and doing the lookup against the hash file through a UniVerse stage. This approach was faster than the Oracle lookup, but it still only returned 3 or 4 rows every 5 seconds. I then put a BUILD.INDEX statement in the before-stage routine so that it would build the index on the column I needed (the Before Stage Routine was "ExecTCL BUILD.INDEX hg_hash LSACONXB"). This fixed it... it's now running at > 100 rows/second through that lookup link.
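
In case it helps anyone later, the whole sequence at TCL looks roughly like this (the file and column names are from my job; if I remember right, the index has to be defined once with CREATE.INDEX before BUILD.INDEX can populate it):

    CREATE.INDEX hg_hash LSACONXB
    BUILD.INDEX hg_hash LSACONXB

The BUILD.INDEX line is what the before-stage routine runs via ExecTCL, so the index gets rebuilt on each run.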

I would like to thank everyone for their help. Ray, what you mentioned was just the hint I needed to help me solve this problem.

Thanks everyone!
Tony
tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

Just wanted to ask a question about another, ummm, "feature" that I found using this technique. When I have multiple columns that I need to join on in my hash file/UniVerse stage, it runs slowly. It is slow even if I have an index on each of the columns in the join.

To fix this, when I built the hash/UniVerse table I added a column that contains all the fields I need to join on, concatenated together and separated by a tilde ('~'). Then I built an index on this concatenated key field and joined to it (again concatenating the fields from the source file together, separated by a tilde). This worked wonderfully.
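
To make it concrete (the link and column names here are just examples, not the real ones), the extra column in the hash/UniVerse table is built with a derivation like

    LkpIn.ACCT_NO : "~" : LkpIn.PLAN_CODE

and the key expression on the reference link in the main job concatenates the source fields the same way:

    SrcIn.ACCT_NO : "~" : SrcIn.PLAN_CODE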

Why didn't this work when I had separate indexes on all the join columns? In other words, when I had separate indexes on each of the fields that I needed to join on, why was the lookup still very slow?

Thanks for your help,
Tony
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

As you've found, you need a composite index to do a lookup on a composite key speedily. Having *individual* indexes on all three fields doesn't help; having one index that encompasses all three fields does, just like in the relational world. You did it in an artificial manner by creating a composite field and then indexing that.

Sorry I don't have the syntax off the top of my head, but you can build composite indexes through TCL or the Administrator... or your way works fine (and used to be the only way to do it in the "old days").

-craig