join question

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
dnat
Participant
Posts: 200
Joined: Thu Sep 06, 2007 2:06 am

join question

Post by dnat »

Hi,

I am joining a stage table(around 0.3 millions) with the main table(around 180 millions) based on key columns in the OCI stage using SQL.

But the main table is quite busy all the time, --selects, inserts and updates are being done simultaneously on the same table.
Hence this join takes too much time and results in "snapshot too old error"

If i do a join in datastage i.e extract records from both tables and use the transformer stage to lookup, would it be beneficial?

Any thoughts..Need your suggestion.
John Smith
Charter Member
Charter Member
Posts: 193
Joined: Tue Sep 05, 2006 8:01 pm
Location: Australia

Post by John Smith »

You'll probably get the same Oracle error doing that join in DS.
You need to engage your DBA to work out a feasbile solution. One suggestion is that you might want to consider a replicated database that is used only for reporting.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Would your DBA be amenable to setting up a mechanism to create a snapshot from which DataStage could work?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
dnat
Participant
Posts: 200
Joined: Thu Sep 06, 2007 2:06 am

Post by dnat »

hi,

my problem here is the volume of the data is so huge and even if we try to replicate the db,the time it would take would really be high i guess.

we have nearly 60-70 files daily which would have around 0.1 mil to 0.5 mil each. our batch timing is for around 8 hours to complete all the processing of files.

we had some issues with the files and eneded up in having backlogs. Now we have got nearly 30-40 days backlogs of 4 files. i.e nearly 160 files of each 0.3 mil approx.

Because of this snapshot error, we are not able to load the data into the table. Even if we create a replica of the table, that would have to be in a specific time period isnt it. so we cannot continuously run two days files as the second file will not have the updtated records for lookup.

Even if they create the snapshot of the data, i think it is not going to lookup and update correctly..isnt it?
dnat
Participant
Posts: 200
Joined: Thu Sep 06, 2007 2:06 am

Post by dnat »

Hi,

To my surprise , i made few changes to the subsequent stages to improve performance and this issue doesnt come up any longer. I didnt touch the OCI stage or the oracle tables.

the issue is solved, but i am trying to figure out the root cause so that i can keep this in mind when i design some other jobs. If the problem(performance issue) is related to datastage, then why is it giving oracle related error(snapshot too old).
John Smith
Charter Member
Charter Member
Posts: 193
Joined: Tue Sep 05, 2006 8:01 pm
Location: Australia

Post by John Smith »

perhaps you can elaborate what changes you made? or did the dba do something that you were not aware of?
dnat
Participant
Posts: 200
Joined: Thu Sep 06, 2007 2:06 am

Post by dnat »

No, the DBA didnt make any changes to the table or the setup.

After extracting the records which satisfy the join condition i am writing it into a hashed file which is being used as a reference for a lookup. This hashed file is used as reference and is being updated also, after lookup.

There were two keys for this hashed file and i removed it and made it as one key(with the two key fields concatenated). this improved the performance of the whole job.

Can this be due to reason that if the design is like this

Stage 1(OCI)-->Stage 2(Hashed file)-->Stage 3(Transformer)--Stage 4(Hashed file).

Will the slow transformation of Stage 3 can cause Stage 1 to perform slow?? i.e will it affect the OCI stage?
andy心&#262
Participant
Posts: 18
Joined: Tue Oct 28, 2008 2:31 am

join question

Post by andy心&#262 »

Join tables in oracle database server at first, the system bottleneck is oracle db. And if the db is busy, you will get the error that the snapshot is too old, it is not an error of datastage.
Extract tables then use datastage to join, the bottleneck is transformstage in the job.

First method will be quick if your DBA can optimize the db.
Second method will be slow, and it seems no good ways to get a better performance.
dnat
Participant
Posts: 200
Joined: Thu Sep 06, 2007 2:06 am

Post by dnat »

Hi,

The performance has improved and because of that we are not getting the snapshot too old error anymore.

but what i am asking is about the root cause..
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

:? The root cause of a "snapshot too old"? That's Oracle, not DataStage and a quick google will turn up all kinds of discussions on the subject.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply