Query sequential files?

slavik0329 · Post by **slavik0329** » Fri Feb 24, 2006 2:56 pm

Hey,

I have a project with a base query whose results I extract to a sequential file in DataStage. Then I must run 3 separate queries on that data. What is the most efficient way to do this?

Thanks,
Steve

I_Server_Whale · Post by **I_Server_Whale** » Fri Feb 24, 2006 3:00 pm

It is difficult to answer that without knowing which queries you are trying to run.

You can use stage variables and define constraints. But it would help if you could describe the type of queries that you would like to run on this sequential file. That would let us give you a better solution.

Thanks,
Naveen.

slavik0329 · Post by **slavik0329** » Fri Feb 24, 2006 3:10 pm

naveendronavalli wrote:It is difficult to answer that without knowing which queries you are trying to run.

You can use stage variables and define constraints. But it would help if you could describe the type of queries that you would like to run on this sequential file. That would let us give you a better solution.

Thanks,
Naveen.

Type? Umm, I'm not sure what you mean by type. The sub-queries have multiple SELECT statements, WHERE clauses and ORDER BY clauses.

ray.wurlod · Post by **ray.wurlod** » Fri Feb 24, 2006 3:19 pm

The "most efficient" way will depend on exactly what you want to achieve, whether the queries are run in the same database, and a number of other factors. For example, can you form the join (or union) of the three queries in the database itself? That would probably be very efficient, and you may not even need DataStage - just export the result set to a text file. Most databases have this functionality.
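A minimal sketch of the "union in the database" idea, using Python's built-in `sqlite3` as a stand-in for the real database. The `orders` table, its columns, and the three filters are hypothetical placeholders, not the poster's actual queries. Whether the database evaluates the shared `base` expression only once depends on the database and its optimizer, so treat this as an illustration of the shape of the query rather than a performance guarantee.

```python
import csv
import io
import sqlite3

# Stand-in database with a small hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "east", 10.0), (2, "west", 25.0), (3, "east", 40.0)],
)

# One statement: the base query is written once (as a CTE) and the three
# follow-up queries are combined with UNION ALL, tagged by a 'source' column.
combined_sql = """
WITH base AS (
    SELECT id, region, amount FROM orders      -- hypothetical base query
)
SELECT 'q1' AS source, id, region, amount FROM base WHERE region = 'east'
UNION ALL
SELECT 'q2', id, region, amount FROM base WHERE amount > 20
UNION ALL
SELECT 'q3', id, region, amount FROM base WHERE region = 'west'
ORDER BY source, id
"""
rows = conn.execute(combined_sql).fetchall()

# Export the combined result set as delimited text.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
export_text = buffer.getvalue()
```

The `source` column lets a downstream step split the rows back into the three result sets if they need to be kept apart.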

slavik0329 · Post by **slavik0329** » Fri Feb 24, 2006 3:23 pm

ray.wurlod wrote:The "most efficient" way will depend on exactly what you want to achieve, whether the queries are run in the same database, and a number of other factors. For example, can you form the join (or union) of the three queries in the database itself? That would probably be very efficient, and you may not even need DataStage - just export the result set to a text file. Most databases have this functionality.

The point of having the base query run first and get extracted is to avoid running it 3 times (it's a HUGE query). It takes about 2 or more hours to run.

I_Server_Whale · Post by **I_Server_Whale** » Fri Feb 24, 2006 4:22 pm

Hi slavik,

Then, if you don't want to use the same stage for the other three queries that are to be run after the base query, you can use a temporary table, or a UniData or UniVerse stage, to run the other three queries.

I have created a template for this kind of job. You can download it at:

http://s38.yousendit.com/d.aspx?id=2S ... ANCNE3F9

First the base query runs and loads the data into the UniVerse stage, then you query this UniVerse table with your first query, and so on, and finally load the result to the sequential file.

But make sure that these are in separate jobs, not in one job as I showed. I mean, OCI-->Universe will be a separate job, Universe(query1)-->Universe another separate job, and so on.
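The staging idea above (run the expensive base query once, store its result, then run the follow-up queries against the stored copy) can be sketched outside DataStage as follows. This is only an illustration under stated assumptions: Python's built-in `sqlite3` stands in for the real database or UniVerse table, and the `orders` table, the `base_result` temporary table, and the three follow-up queries are all hypothetical.

```python
import sqlite3

# Stand-in source database with a small hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "east", 10.0), (2, "west", 25.0), (3, "east", 40.0), (4, "north", 0.0)],
)

# Step 1: run the (expensive) base query exactly once and keep its result
# in a temporary table instead of re-running it for each follow-up query.
conn.execute(
    "CREATE TEMP TABLE base_result AS "
    "SELECT id, region, amount FROM orders WHERE amount > 0"
)

# Step 2: run the three follow-up queries against the staged result only.
follow_ups = {
    "east_orders": "SELECT id, amount FROM base_result "
                   "WHERE region = 'east' ORDER BY id",
    "totals_by_region": "SELECT region, SUM(amount) FROM base_result "
                        "GROUP BY region ORDER BY region",
    "largest_first": "SELECT id FROM base_result ORDER BY amount DESC",
}
results = {name: conn.execute(sql).fetchall() for name, sql in follow_ups.items()}
```

Each entry in `follow_ups` corresponds roughly to one of the separate jobs described above: it reads only from the staged table, so the cost of the base query is paid once.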

However, I'm not sure about the efficiency of this design when compared to running the union of all queries as our guru Ray suggested.

But in case you try this design, please let us know how it performed.

Thanks,
Naveen.