Hello All,
I have an Oracle table(X) which has 4 columns.
Two lookupfiles are created from this table.
For each lookupfile(i am using datasets),we are having user-defined queries with diffrent filter conditions.
Ouestion:-Which is the good practice.
1.Fetching all the records from the table and dumping on to one dataset once and using this dataset in the transformation job to filter the user-defined filter conditions.
2.Creating two datasets separatelty with the appropriate user-defined queries in the table(X) twice.
Assuming the records from table(X) is more than a million.
Regards,
Bala.
User-defined queries
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Unless you are performing a sparse lookup you will always be using two virtual Data Sets for two lookups via a Lookup stage. You can see this happening in the score; each is built using a LUTCreate operator.
So, if the queries are different, your second approach is probably to be preferred, as it would minimize the number of rows in each virtual Data Set.
So, if the queries are different, your second approach is probably to be preferred, as it would minimize the number of rows in each virtual Data Set.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Hello Ray,
Thanks for the response.I have to re-frame my question I am not doing a sparse lookup.My question is can i create two different lookup files for diffrent userdefined queries or one dataset and use the filter condition in transformer stage or filter stage to yield optimised performance.Then the lookup process.
Regards,
Bala.
Thanks for the response.I have to re-frame my question I am not doing a sparse lookup.My question is can i create two different lookup files for diffrent userdefined queries or one dataset and use the filter condition in transformer stage or filter stage to yield optimised performance.Then the lookup process.
Regards,
Bala.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
You can do either. That's my point. Whichever you do you get TWO virtual Data Sets. It's got nothing whatsoever with sparse lookup - it's how normal lookups work.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Thanks Ray,
So i can go with any approach.Pefromance wise it works the same way.Just to clarify one more point
One the first occasion i am touching the database twice.
second one I am touching it once but fetching more rows and giving extra work to the datastage Engine.Any comments on this.
Regards,
Bala.
So i can go with any approach.Pefromance wise it works the same way.Just to clarify one more point
One the first occasion i am touching the database twice.
second one I am touching it once but fetching more rows and giving extra work to the datastage Engine.Any comments on this.
Regards,
Bala.
-
- Charter Member
- Posts: 822
- Joined: Sat Sep 17, 2005 5:25 pm
- Location: USA
Hi uslaslam1us/Ray,
Thanks are lot.
I have got a fair idea.
But my curiosity to strengthen it I am continuing it.
The queries are diffrent but each touch the same table.
I'll give examples.
query1:-Select Eno from Emp where sal>5000;
query2:-Select Dno from Emp where sal<5000;
so i can do this in two ways.
1.Select Eno,Dno from Emp ;(one user-defined query)
Put the result onto the dataset and then use filter conditions at the ETL level.
2.Treat these queries differenly and put it onto different datasets.
Then perform lookups.
My current scenario is similar to the one.I wish to know the pros and cons touching the database twice vs filter large number of records at ETL.
Note:-These are just examples my queries are complex.
Regards,
Bala.
Thanks are lot.
I have got a fair idea.
But my curiosity to strengthen it I am continuing it.
The queries are diffrent but each touch the same table.
I'll give examples.
query1:-Select Eno from Emp where sal>5000;
query2:-Select Dno from Emp where sal<5000;
so i can do this in two ways.
1.Select Eno,Dno from Emp ;(one user-defined query)
Put the result onto the dataset and then use filter conditions at the ETL level.
2.Treat these queries differenly and put it onto different datasets.
Then perform lookups.
My current scenario is similar to the one.I wish to know the pros and cons touching the database twice vs filter large number of records at ETL.
Note:-These are just examples my queries are complex.
Regards,
Bala.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact: