Page 1 of 1

SQL query extraction from oracle stage

Posted: Mon Mar 11, 2013 8:55 am
by davidpsalms77
Hi Team,

I had encountred some problem in datastage oracle stage.

I have query as below

Select * from table A
where id in (select id from table a)
and
id in (select id from table b).

I used the above query to extract the data from oracle database,while runing the query using toad ,i am getting some 129000 rows,But when i run the same thing using datastage some time i am getting 129000 but some time i am getting 116000 or so. I analysied the issue and found that it may be beacuse of the array size which we defined in oracle stage.Please correct me if i am worng and let me know if you have any suggestion to trace the issue. I modified the query as below and run in toad and datastage i am getting the correct results.

Select * from table A
where id in (select id from table a where id in (select id from table b)).

I am using 7.5 Server Edition and my database is oracle and datastage server is UNIX.

Posted: Mon Mar 11, 2013 10:15 am
by prasannakumarkk
Can you please post the correct table names? I mean atleast different name and alias what u used

Posted: Mon Mar 11, 2013 11:52 pm
by davidpsalms77
select * from sample
where id in (select id from sample where unit='522')
and id in (select id from staging where unit='522')

for all the records unit is same.Please help me to find out why i am not getting the correct results running through datastage,but getting correct results while running in database. This is issue coming when there is huge amount of data.

Posted: Tue Mar 12, 2013 3:07 am
by prasannakumarkk
Since there is a IN clause in the query, the defined array size may impact this.

Just try this

select * from sample sa join staging st
on sa.id = st.id
where sa.unit='522'
and st.unit='522'

Posted: Tue Mar 12, 2013 8:29 am
by davidpsalms77
I did not get you what trying to say. i know the problem came with array size defined.

Defined array size may impact means,how it is impacting in the background.I need to show a proof that the problem came with data stage array size.Can you please explain me more clearly.

We kept the array size ass 3276 and memory fetch as 20000

Posted: Tue Mar 12, 2013 8:52 am
by prasannakumarkk
I cant clearly show you the picture.
One thing you can show as a proof is get the explain plan from toad. It will say Oracle is executing the sub query first and then executing the outer query. The inner query will be using Full Access i guess. Now from DS on each call records are fetched. So in that case if the results of inner query which will be in buffer if not available later it will give us different count.
To confirm this, check below methods
1) The records selected by using the IN clause will not be always same in different job run. Try different array size like 10k.
2) Rewrite the query as Join as i mentioned and get the explain plan, it will have nested loops and not Full Access. Or rewrite with exist clause and check the plan
Am telling this with the knowledge i have on DS and Oracle. Gurus please forgive me if you see this as blunder thoughts :(