Page 1 of 2

Unable to read higher volume of record with ODBC(Sql server)

Posted: Mon Jul 21, 2008 1:06 pm
by Sandeep.pendem
Hello,

I have a simple mapping with i/p as ODBC stage(Sql server as database) and output as a flat file(one to mapping) with an intermediate transformer stage.
i/p(ODBC) sql server------------------Xfm----------o/p flat file

This job works fine for when ur incoming data is comparatively small(arond 10 records or so). But the same job doesnt work when the volume of data coming in from source is around 3557255 records...and it doesnt prompt an error message. it looks like it either timeout sometime or keeps on running for few hours with no output.
Can anyone please let me know if I need to do any changes at the databse level, or anything else?

Posted: Mon Jul 21, 2008 1:11 pm
by chulett
What kind of query - one that does sorting/grouping that would cause it to hold on to all the records until ready? You may have a network/firewall timeout issue if that's the case.

Posted: Mon Jul 21, 2008 1:16 pm
by Sandeep.pendem
[quote="chulett"]What kind of query - one that does sorting/grouping that would cause it to hold on to all the records until ready? You may have a network/firewall timeout issue if that's the case. ...[/quote]

Its a simple query --> This query works fine on sql server

SELECT DISTINCT C.MMOUTLETCD,D.MMBRANDPACKAGECONFIGCD

FROM
dbo.SalesToRetailFact A,
dbo.SourceOutlet B,
dbo.Outlet C,
dbo.OrderableSKUDimension D

WHERE
A.SOURCEOUTLETID=B.SOURCEOUTLETID AND
B.OUTLETID=C.OUTLETID AND
A.OrderableSKUSid=D.OrderableSKUSid

But dont knw why it doesnt return any records when the volume is higher....

Posted: Mon Jul 21, 2008 1:21 pm
by chulett
The distinct clause adds the sorting/grouping I was asking about. When it "works fine on sql server" how long does it take to start returning rows with a "large" volume?

Posted: Mon Jul 21, 2008 1:26 pm
by Sandeep.pendem
[quote="chulett"]The [b]distinct[/b] clause adds the sorting/grouping I was asking about. When it "works fine on sql server" how long does it take to start returning rows with a "large" volume? ...[/quote]


It doesnt return any rows as such...I have started the job more than 1.5 hrs back..it still doesnt show up any records...do you mean that I need to get rid of the disticnt clause...?
or do a need to change a paramter some sort of timeout parameter at DS or database level?

Posted: Mon Jul 21, 2008 1:44 pm
by chulett
Run the query outside of DataStage so you know how long it will take. I suspect you have a network/firewall "inactivity timeout" issue here, but you need to know how long it will appear to be idle to have it addressed.

Posted: Mon Jul 21, 2008 1:46 pm
by Sandeep.pendem
[quote="chulett"]Run the query [i]outside[/i] of DataStage so you know how long it will take. I suspect you have a network/firewall "inactivity timeout" issue here, but you need to know how long it will appear to be i ...[/quote]


Ok, so how we overcome the netwrok/firewall inactivity timeout issue? I mean what parameters I need to change?

Posted: Mon Jul 21, 2008 1:50 pm
by chulett
If that is what is happening, you'd need to work with someone else - a SysAdmin - to see what your options are.

Posted: Tue Jul 22, 2008 1:43 pm
by Sandeep.pendem
[quote="chulett"][i][b]If[/b][/i] that is what is happening, you'd need to work with someone else - a [b]SysAdmin[/b] - to see what your options are. ...[/quote]

Hi,
Thanks for the responses.

I have worked with our sysadmin and he tried all the options as far as the SQL server 2005 is concerned and it doesnt seem to be a problem there.

The followig things occur in datastage and Sql server 2005
1) Query takes 8 mins to materialise in the management studio for sql
2) In Datastage Version8 and Version7, the query thread goes away after 3 minutes and no errors in sql server logs or traces are seen
4) suspect that the ODBC enforces some sort of Inactivity timeout
5) we have ODBC driver version 4.2 on Datastage version7 and ODBC driver version 5.2 on Datastage version 8.
5) Is there a parameter supplied on the ODBC connection defination to increase the Inactivity timeout? or do we need to add any timeout paramter in ODBC.INI file?

Posted: Tue Jul 22, 2008 2:05 pm
by satya99
Job1: Dump data to hashed file( 4 different hashed files)
job2: Now perform your look up operation

Posted: Tue Jul 22, 2008 2:06 pm
by Sandeep.pendem
[quote="satya99"]Job1: Dump data to hashed file( 4 different hashed files)
job2: Now perform your look up operation[/quote]


We have 400 milion records, so dumping to hash file is not an option for us.

Thanks,

Posted: Tue Jul 22, 2008 3:05 pm
by ray.wurlod
Yes it is. You may need to create/convert your hashed files with 64-bit addressing, but that's OK. Such hashed files can theoretically support 19 million TB of data (but most operating systems or file systems restrict this somewhat).

Posted: Wed Jul 23, 2008 7:35 am
by Sandeep.pendem
[quote="ray.wurlod"]Yes it is. You may need to create/convert your hashed files with 64-bit addressing, but that's OK. Such hashed files can theoretically support 19 million TB of data (but most operating systems or fi ...[/quote]


Dont we have any other option other than dumping the 400+ million records into the hash file? We would be have increemental loads and the volume of data would increase over the time, i.e the reason we didnt want to go with hash file.

Any idea on how do we set the timeout parameter on ODBC.INI file? syntax,etc?
Any help is highly appreciated.

Thanks for the support.

Posted: Wed Jul 23, 2008 8:00 am
by chulett
:? Why do you believe you need to "dump 400+ million" records into a hashed file? Best Practice is to constrain your hashed builds to the incremental keys so you only have "just what you need" hashed up.

Posted: Wed Jul 23, 2008 8:43 am
by Sandeep.pendem
[quote="chulett"]:? Why do you believe you need to "dump 400+ million" records into a hashed file? Best Practice is to constrain your hashed builds to the incremental keys so you only have "just what you need" hashed ...[/quote]

Hi,

Let me rephrase my question, Our query takes 8 minutes to materialize and it is acceptable performance to us. We are not looking for a re-design solution unless that is the only way to do it because there is no option to get around the 3 minute timeout happening on datastage We just need to know how to get past the 3 minute timeout that is happening when reading the data using ODBC stage(Sql server 2005 dB) in datastage. the question is do we need to upgrade any drivers or do we need to change the settings like adding a timeout paramter on ODBC,etc