User-defined queries

bala_135 · Post by **bala_135** » Tue Jun 19, 2007 11:52 pm

Hello All,

I have an Oracle table(X) which has 4 columns.
Two lookupfiles are created from this table.
For each lookupfile(i am using datasets),we are having user-defined queries with diffrent filter conditions.

Ouestion:-Which is the good practice.
1.Fetching all the records from the table and dumping on to one dataset once and using this dataset in the transformation job to filter the user-defined filter conditions.
2.Creating two datasets separatelty with the appropriate user-defined queries in the table(X) twice.

Assuming the records from table(X) is more than a million.

Regards,
Bala.

ray.wurlod · Post by **ray.wurlod** » Wed Jun 20, 2007 1:29 am

Unless you are performing a sparse lookup you will always be using two virtual Data Sets for two lookups via a Lookup stage. You can see this happening in the score; each is built using a LUTCreate operator.

So, if the queries are different, your second approach is probably to be preferred, as it would minimize the number of rows in each virtual Data Set.

bala_135 · Post by **bala_135** » Wed Jun 20, 2007 2:32 am

Hello Ray,

Thanks for the response.I have to re-frame my question I am not doing a sparse lookup.My question is can i create two different lookup files for diffrent userdefined queries or one dataset and use the filter condition in transformer stage or filter stage to yield optimised performance.Then the lookup process.

Regards,
Bala.

ray.wurlod · Post by **ray.wurlod** » Wed Jun 20, 2007 6:07 am

You can do either. That's my point. Whichever you do you get TWO virtual Data Sets. It's got nothing whatsoever with sparse lookup - it's how normal lookups work.

bala_135 · Post by **bala_135** » Wed Jun 20, 2007 7:00 am

Thanks Ray,

So i can go with any approach.Pefromance wise it works the same way.Just to clarify one more point
One the first occasion i am touching the database twice.
second one I am touching it once but fetching more rows and giving extra work to the datastage Engine.Any comments on this.

Regards,
Bala.

us1aslam1us · Post by **us1aslam1us** » Wed Jun 20, 2007 3:30 pm

If the queries are completely different then use the second approach orelse use the first one with filter stage to parse the data depending on the filters into two different datasets.

bala_135 · Post by **bala_135** » Wed Jun 20, 2007 10:18 pm

Hi uslaslam1us/Ray,

Thanks are lot.
I have got a fair idea.
But my curiosity to strengthen it I am continuing it.
The queries are diffrent but each touch the same table.
I'll give examples.

query1:-Select Eno from Emp where sal>5000;
query2:-Select Dno from Emp where sal<5000;

so i can do this in two ways.
1.Select Eno,Dno from Emp ;(one user-defined query)
Put the result onto the dataset and then use filter conditions at the ETL level.
2.Treat these queries differenly and put it onto different datasets.
Then perform lookups.
My current scenario is similar to the one.I wish to know the pros and cons touching the database twice vs filter large number of records at ETL.

Note:-These are just examples my queries are complex.

Regards,
Bala.

ray.wurlod · Post by **ray.wurlod** » Wed Jun 20, 2007 10:46 pm

Your second approach requires half the memory at run time of your first approach.