Page 1 of 1
How to implement this in DataStage
Posted: Wed Jan 31, 2007 10:10 pm
by vij
Hi all,
I have the following requirement:
I have to query a SQL SERVER DB table and get a maximum value a particular column, say count and add this count column to an exsisting file columns.
the general flow is like this:
table ----> file ----> newfile
(get the maximum (add the column
value of count) count as a column
to the exsiting column in the file)
I would want to know how to implement this in DataStage.
Also is it possible to make this in a single job?
please help me.
Thanks in advance!
Posted: Wed Jan 31, 2007 10:25 pm
by DSguru2B
Sure you can do this in one job but tell me this, when you get a max, you will get one value. Your file might have x records. Do you want the max from database to be with all the records of your file?
Posted: Wed Jan 31, 2007 11:14 pm
by ray.wurlod
To paraphrase, do you want the overhead of selecting the (same) max value for every row you process? If so, you can do it in one job.
A more efficient approach would be to select the max value in a prior job, park it somewhere (for example the prior job's user status area), then feed this value into your main job as a job parameter. That way you only select the max value once, regardless of how many rows your main job processes.
Posted: Wed Jan 31, 2007 11:19 pm
by ray.wurlod
Having posted that, I had second thoughts. The Framework is likely to be intelligent enough to do this just once, because the Lookup stage is a composite operator; first it loads the virtual data set, then it performs lookups against the virtual data set. Use Entire partitioning.
Code: Select all
Select Max(col) From table;
|
(virtual Data Set)
|
V
------> Lookup Stage ------------>
I predict you'll see one row processed on the reference input link.
Posted: Thu Feb 01, 2007 12:09 am
by vij
Thanks for your reply, DSguru2B.
yes, exactly.Thats what i want.
Suppose if the maximum from the table is 35 and I want this value to be populated as a column in the file, for all the records, ie., i want the same value, 35.
additional information - For a single run, only once i will query that table and get the value and then use that same value for all the records in the file.
Actually I know that I can query the table, move that to a file and use a Joiner stage and join both the files, but I wanted to know is there any other way I can do this, which can simplify this job, with minimum use of resources.
Posted: Thu Feb 01, 2007 8:29 am
by DSguru2B
Ray has explained it all.