Page 1 of 1

DB issue or DS issue?

Posted: Wed Feb 08, 2006 7:21 pm
by Nick_6789
I keep encountering these wierd errors. Have yet to determine the root cause.

I have 2 jobs that extract information from a table using ODBC stage.

Both have exactly similar SQL statements referring to the same table with the same conditions (Including similar date condition) hence number of records extracted should be identical...

But this is wierd... Out of 10 seperate sessions I run these jobs, I will at least encounter 1 time inconsistency. Whereby the first job does not extract the supposed number of records whereas the second job does.

What gives? I have encountered this 2 times... There was an incident whereby the first job does not even extract any records but in Director it showed it ran.

Any enlightenment would be much appreciated. Thanks.

Posted: Wed Feb 08, 2006 8:32 pm
by chulett
There's not nearly enough information here to provide you with any coherent help. :?

You haven't mentioned the database. You state the two SQL statements are "exactly similar"... are they similar or are they exactly the same? Is the table you are sourcing from being actively written to when the jobs run? What do the job designs look like? The more details you can post, the better chance you'll have of someone doing something other than guessing what might be going on.

Perhaps it would help if you also posted the two SQL statements in question.

DB issue or DS issue?

Posted: Wed Feb 08, 2006 9:29 pm
by Nick_6789
chulett wrote:There's not nearly enough information here to provide you with any coherent help. :?

You haven't mentioned the database. You state the two SQL statements are "exactly similar"... are they similar or are they exactly the same? Is the table you are sourcing from being actively written to when the jobs run? What do the job designs look like? The more details you can post, the better chance you'll have of someone doing something other than guessing what might be going on.

Perhaps it would help if you also posted the two SQL statements in question.

Hi Mr. Chulett,

The sql statement between the 2 are EXACTLY identical.

select * from table where last run = 'date' and curr run = 'date'.

That's about it for the sql statement I am just merely extracting record from specific dates onwards. The job design is a simple extraction from ODBC stage to a transformer and to a sequential file. The inconsistencies in terms of no. of records is seen in the link between the ODBC stage and the transformer.

And further adding on, yups, the database is being actively written into as I run the job by some other application users.

Posted: Wed Feb 08, 2006 10:18 pm
by chulett
Are you sure that the difference in the row counts can't simply be a result of the fact that the jobs run at different times against a table that is actively being written to? That would seem to be the obvious culprit here. Your conclusion that "the number of records extracted should be identical" is only valid for static data sources.

You still haven't mentioned the database involved. When the results are different, have you compared the two extracted data sets to see what records are dropping out? Also curious what Isolation Level you are using on the Transaction Handling tab of the ODBC stage.

Posted: Wed Feb 08, 2006 11:24 pm
by ray.wurlod
And is the source database possibly being updated while your selects are running?

Re: DB issue or DS issue?

Posted: Wed Feb 08, 2006 11:35 pm
by chulett
Ray, he's already said that is the case:
Nick_6789 wrote:And further adding on, yups, the database is being actively written into as I run the job by some other application users.

DB issue or DS issue?

Posted: Thu Feb 09, 2006 12:43 am
by Nick_6789
chulett wrote:Are you sure that the difference in the row counts can't simply be a result of the fact that the jobs run at different times against a table that is actively being written to? That would seem to be the obvious culprit here. Your conclusion that "the number of records extracted should be identical" is only valid for static data sources.

You still haven't mentioned the database involved. When the results are different, have you compared the two extracted data sets to see what records are dropping out? Also curious what Isolation Level you are using on the Transaction Handling tab of the ODBC stage.
Hi all,

Maybe i am still new in this. However, will try my best to clarify the issue.

I am using informix database, what is goin on on that side I am not too sure. I will have to check with my DBA for more info regarding that.

All i can say is that yes, those 2 jobs, though identical in extracting from the same source. They are both run at seperate times. After one another.

As for the isolation level, it's set to none. Is there a significance in setting it? Cause if a row is indeed locked due to updation goin on in the source, My jobs will abort. But this is not the issue I'm facing right now.

Posted: Thu Feb 09, 2006 3:21 am
by ray.wurlod
Isolation level none means that you CAN read uncommitted transactions. That's what I was trying to get at in my earlier post. If the database is being updated while your select is running, and you are not isolating those transactions, then you will get different results at different times.

Another thing that can affect things is if you have a database created with no logging. You did not mention whether this is the case or not.

Posted: Thu Feb 09, 2006 7:51 am
by chulett
ray.wurlod wrote:Isolation level none means that you CAN read uncommitted transactions. That's what I was trying to get at in my earlier post.
As was I. You'll also read ones committed between the start of one job and the start of the other, even with a 'proper' isolation level. I just don't think it's reasonable to assume they will always pull the same record counts from an actively changing data source.

Posted: Thu Feb 09, 2006 8:15 am
by kwwilliams
Nick_6789 wrote:All i can say is that yes, those 2 jobs, though identical in extracting from the same source. They are both run at seperate times. After one another.
If the jobs are run right after each other, even though you are using the same sql you cannot expect the same results from an active database. The best analogy that I can think if is this. You have 1 camera (your sql), you take a picture at midnight (you run the sql) and then take a picture at noon. When you go to develop the film they are going to look different even though they were taken by the same camera. Your case is not that drastic, you are running them back to back. But the same thing would apply, you are taking pictures at different times and expecting the results to be the same.

I am curious about this, why would you have two jobs running the sme sql? You have all ready done the work to pick the data up from the database, why do it again?

DB issue or DS issue?

Posted: Thu Feb 09, 2006 7:13 pm
by Nick_6789
kwwilliams wrote:
I am curious about this, why would you have two jobs running the sme sql? You have all ready done the work to pick the data up from the database, why do it again?
Hi all,

Thanks for the replies.

To: Mr kwwilliams

Makes sense, but given the current design... after each extraction when they're going through the transformer there's a different logic applied in the 2 different jobs. Different mappings so the jobs are split.

To: All

I did find the cause of the prob from my dba. The database source should be set as row lock instead of it's current setting of page locking. We will be reverting it to row locking.

And... Yeah, taking source from actively updating database is not a good idea, That's why we're think it may be a good idea to take the information from a specific date right till an hour earlier before I run the jobs. That would make sure that all my source info 1 hour earlier are already commited.

Thanks you guys, will keep you posted on my findings later.

Re: DB issue or DS issue?

Posted: Thu Feb 09, 2006 7:33 pm
by kwwilliams
Selecting data from an active system means that you are still in business, so that's a good thing. It just makes reconciliation of the source and the target more difficult.