Page 1 of 1

join question

Posted: Wed Oct 22, 2008 10:33 pm
by dnat
Hi,

I am joining a stage table(around 0.3 millions) with the main table(around 180 millions) based on key columns in the OCI stage using SQL.

But the main table is quite busy all the time, --selects, inserts and updates are being done simultaneously on the same table.
Hence this join takes too much time and results in "snapshot too old error"

If i do a join in datastage i.e extract records from both tables and use the transformer stage to lookup, would it be beneficial?

Any thoughts..Need your suggestion.

Posted: Wed Oct 22, 2008 11:05 pm
by John Smith
You'll probably get the same Oracle error doing that join in DS.
You need to engage your DBA to work out a feasbile solution. One suggestion is that you might want to consider a replicated database that is used only for reporting.

Posted: Wed Oct 22, 2008 11:24 pm
by ray.wurlod
Would your DBA be amenable to setting up a mechanism to create a snapshot from which DataStage could work?

Posted: Thu Oct 23, 2008 2:58 am
by dnat
hi,

my problem here is the volume of the data is so huge and even if we try to replicate the db,the time it would take would really be high i guess.

we have nearly 60-70 files daily which would have around 0.1 mil to 0.5 mil each. our batch timing is for around 8 hours to complete all the processing of files.

we had some issues with the files and eneded up in having backlogs. Now we have got nearly 30-40 days backlogs of 4 files. i.e nearly 160 files of each 0.3 mil approx.

Because of this snapshot error, we are not able to load the data into the table. Even if we create a replica of the table, that would have to be in a specific time period isnt it. so we cannot continuously run two days files as the second file will not have the updtated records for lookup.

Even if they create the snapshot of the data, i think it is not going to lookup and update correctly..isnt it?

Posted: Mon Oct 27, 2008 9:31 pm
by dnat
Hi,

To my surprise , i made few changes to the subsequent stages to improve performance and this issue doesnt come up any longer. I didnt touch the OCI stage or the oracle tables.

the issue is solved, but i am trying to figure out the root cause so that i can keep this in mind when i design some other jobs. If the problem(performance issue) is related to datastage, then why is it giving oracle related error(snapshot too old).

Posted: Mon Oct 27, 2008 11:46 pm
by John Smith
perhaps you can elaborate what changes you made? or did the dba do something that you were not aware of?

Posted: Tue Oct 28, 2008 12:52 am
by dnat
No, the DBA didnt make any changes to the table or the setup.

After extracting the records which satisfy the join condition i am writing it into a hashed file which is being used as a reference for a lookup. This hashed file is used as reference and is being updated also, after lookup.

There were two keys for this hashed file and i removed it and made it as one key(with the two key fields concatenated). this improved the performance of the whole job.

Can this be due to reason that if the design is like this

Stage 1(OCI)-->Stage 2(Hashed file)-->Stage 3(Transformer)--Stage 4(Hashed file).

Will the slow transformation of Stage 3 can cause Stage 1 to perform slow?? i.e will it affect the OCI stage?

join question

Posted: Tue Oct 28, 2008 2:59 am
by andy心&#262
Join tables in oracle database server at first, the system bottleneck is oracle db. And if the db is busy, you will get the error that the snapshot is too old, it is not an error of datastage.
Extract tables then use datastage to join, the bottleneck is transformstage in the job.

First method will be quick if your DBA can optimize the db.
Second method will be slow, and it seems no good ways to get a better performance.

Posted: Wed Oct 29, 2008 6:01 am
by dnat
Hi,

The performance has improved and because of that we are not getting the snapshot too old error anymore.

but what i am asking is about the root cause..

Posted: Wed Oct 29, 2008 6:59 am
by chulett
:? The root cause of a "snapshot too old"? That's Oracle, not DataStage and a quick google will turn up all kinds of discussions on the subject.