DB issue or DS issue?

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
Nick_6789
Participant
Posts: 32
Joined: Thu Jan 05, 2006 2:07 am

DB issue or DS issue?

Post by Nick_6789 »

I keep encountering these wierd errors. Have yet to determine the root cause.

I have 2 jobs that extract information from a table using ODBC stage.

Both have exactly similar SQL statements referring to the same table with the same conditions (Including similar date condition) hence number of records extracted should be identical...

But this is wierd... Out of 10 seperate sessions I run these jobs, I will at least encounter 1 time inconsistency. Whereby the first job does not extract the supposed number of records whereas the second job does.

What gives? I have encountered this 2 times... There was an incident whereby the first job does not even extract any records but in Director it showed it ran.

Any enlightenment would be much appreciated. Thanks.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

There's not nearly enough information here to provide you with any coherent help. :?

You haven't mentioned the database. You state the two SQL statements are "exactly similar"... are they similar or are they exactly the same? Is the table you are sourcing from being actively written to when the jobs run? What do the job designs look like? The more details you can post, the better chance you'll have of someone doing something other than guessing what might be going on.

Perhaps it would help if you also posted the two SQL statements in question.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Nick_6789
Participant
Posts: 32
Joined: Thu Jan 05, 2006 2:07 am

DB issue or DS issue?

Post by Nick_6789 »

chulett wrote:There's not nearly enough information here to provide you with any coherent help. :?

You haven't mentioned the database. You state the two SQL statements are "exactly similar"... are they similar or are they exactly the same? Is the table you are sourcing from being actively written to when the jobs run? What do the job designs look like? The more details you can post, the better chance you'll have of someone doing something other than guessing what might be going on.

Perhaps it would help if you also posted the two SQL statements in question.

Hi Mr. Chulett,

The sql statement between the 2 are EXACTLY identical.

select * from table where last run = 'date' and curr run = 'date'.

That's about it for the sql statement I am just merely extracting record from specific dates onwards. The job design is a simple extraction from ODBC stage to a transformer and to a sequential file. The inconsistencies in terms of no. of records is seen in the link between the ODBC stage and the transformer.

And further adding on, yups, the database is being actively written into as I run the job by some other application users.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Are you sure that the difference in the row counts can't simply be a result of the fact that the jobs run at different times against a table that is actively being written to? That would seem to be the obvious culprit here. Your conclusion that "the number of records extracted should be identical" is only valid for static data sources.

You still haven't mentioned the database involved. When the results are different, have you compared the two extracted data sets to see what records are dropping out? Also curious what Isolation Level you are using on the Transaction Handling tab of the ODBC stage.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

And is the source database possibly being updated while your selects are running?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Re: DB issue or DS issue?

Post by chulett »

Ray, he's already said that is the case:
Nick_6789 wrote:And further adding on, yups, the database is being actively written into as I run the job by some other application users.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Nick_6789
Participant
Posts: 32
Joined: Thu Jan 05, 2006 2:07 am

DB issue or DS issue?

Post by Nick_6789 »

chulett wrote:Are you sure that the difference in the row counts can't simply be a result of the fact that the jobs run at different times against a table that is actively being written to? That would seem to be the obvious culprit here. Your conclusion that "the number of records extracted should be identical" is only valid for static data sources.

You still haven't mentioned the database involved. When the results are different, have you compared the two extracted data sets to see what records are dropping out? Also curious what Isolation Level you are using on the Transaction Handling tab of the ODBC stage.
Hi all,

Maybe i am still new in this. However, will try my best to clarify the issue.

I am using informix database, what is goin on on that side I am not too sure. I will have to check with my DBA for more info regarding that.

All i can say is that yes, those 2 jobs, though identical in extracting from the same source. They are both run at seperate times. After one another.

As for the isolation level, it's set to none. Is there a significance in setting it? Cause if a row is indeed locked due to updation goin on in the source, My jobs will abort. But this is not the issue I'm facing right now.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Isolation level none means that you CAN read uncommitted transactions. That's what I was trying to get at in my earlier post. If the database is being updated while your select is running, and you are not isolating those transactions, then you will get different results at different times.

Another thing that can affect things is if you have a database created with no logging. You did not mention whether this is the case or not.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

ray.wurlod wrote:Isolation level none means that you CAN read uncommitted transactions. That's what I was trying to get at in my earlier post.
As was I. You'll also read ones committed between the start of one job and the start of the other, even with a 'proper' isolation level. I just don't think it's reasonable to assume they will always pull the same record counts from an actively changing data source.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kwwilliams
Participant
Posts: 437
Joined: Fri Oct 21, 2005 10:00 pm

Post by kwwilliams »

Nick_6789 wrote:All i can say is that yes, those 2 jobs, though identical in extracting from the same source. They are both run at seperate times. After one another.
If the jobs are run right after each other, even though you are using the same sql you cannot expect the same results from an active database. The best analogy that I can think if is this. You have 1 camera (your sql), you take a picture at midnight (you run the sql) and then take a picture at noon. When you go to develop the film they are going to look different even though they were taken by the same camera. Your case is not that drastic, you are running them back to back. But the same thing would apply, you are taking pictures at different times and expecting the results to be the same.

I am curious about this, why would you have two jobs running the sme sql? You have all ready done the work to pick the data up from the database, why do it again?
Nick_6789
Participant
Posts: 32
Joined: Thu Jan 05, 2006 2:07 am

DB issue or DS issue?

Post by Nick_6789 »

kwwilliams wrote:
I am curious about this, why would you have two jobs running the sme sql? You have all ready done the work to pick the data up from the database, why do it again?
Hi all,

Thanks for the replies.

To: Mr kwwilliams

Makes sense, but given the current design... after each extraction when they're going through the transformer there's a different logic applied in the 2 different jobs. Different mappings so the jobs are split.

To: All

I did find the cause of the prob from my dba. The database source should be set as row lock instead of it's current setting of page locking. We will be reverting it to row locking.

And... Yeah, taking source from actively updating database is not a good idea, That's why we're think it may be a good idea to take the information from a specific date right till an hour earlier before I run the jobs. That would make sure that all my source info 1 hour earlier are already commited.

Thanks you guys, will keep you posted on my findings later.
kwwilliams
Participant
Posts: 437
Joined: Fri Oct 21, 2005 10:00 pm

Re: DB issue or DS issue?

Post by kwwilliams »

Selecting data from an active system means that you are still in business, so that's a good thing. It just makes reconciliation of the source and the target more difficult.
Post Reply