Inserting MaxValue only to a file without using Aggregator

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
nivas
Participant
Posts: 117
Joined: Sun Mar 21, 2004 4:40 pm

Inserting MaxValue only to a file without using Aggregator

Post by nivas »

Hi

I am getting a sequece of values From one input column. After completion of job I would like to insert only the max of input column value into a seq file or a table without using aggregator. Could any body please give me suggestion.

thanks
NIvas
ketfos
Participant
Posts: 562
Joined: Mon May 03, 2004 8:58 pm
Location: san francisco
Contact:

Post by ketfos »

Hi,
You can put these list of values into a table.
Then select MAX(column name) from table

Ketfos
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

If the source is a text file, pre-process it (use Filter in the Sequential File stage, or ExecSH as a before-stage or before-job subroutine). Sort the file then grab the final line. sort filename | tail -1
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
nivas
Participant
Posts: 117
Joined: Sun Mar 21, 2004 4:40 pm

Post by nivas »

Hi

The source is a table. After reading all the input records and finding the max value I would like to insert into table once . I would like to avoid multiple times contacting database. Source has 16 million records . Please suggest.

thanks

Nivas
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I'd suggest using a hash file with a fixed value key and a data field for your max value. Use a Stage Variable to keep track of the max value, in essence putting a new value in it only when it is greater than the previous value. Write the stage variable to the hash with a constant key value (like '1' for instance) so there is only one record in it at all times. You could make use of a contraint to only write to the hash when the value changes, if you are worried about writing to it on each row.

After the first part of the job completes, read from the hash and write it to your database.
-craig

"You can never have too many knives" -- Logan Nine Fingers
nivas
Participant
Posts: 117
Joined: Sun Mar 21, 2004 4:40 pm

Post by nivas »

Hi

My idea is first get a max value in a stage variable. I would like to execute a script which runs after successiful execution of job and inserts the max value into database. I would like to define a job parameter and assign the stage variable value into Job parameter, there by I can pass the Job parameter as parameter to the shell script. My question is, is there any way to assign the stage variable value into Job parameter inside the job? Please suggest.

By the above I can avoid the hash stage. Infuture we can change it to a parallel job. So please suggest me how to assign the stage variable to a job parameter inside the job.

thanks
Nivas
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

:!: Ah, an easy one.

Job parameter values can not be changed once the job has started.
. My question is, is there any way to assign the stage variable value into Job parameter inside the job?
So the answer is no.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
nivas
Participant
Posts: 117
Joined: Sun Mar 21, 2004 4:40 pm

Post by nivas »

Hi

Is there any way by providing a constraint and send the the max value to output link, after end of reading and processing of all input records.
Please suggest.

thanks
Nivas
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

What about making use of the UserStatus area? Once written to, a following Routine Activity stage (via a Sequencer Job) could automatically take the $UserStatus as an input parameter and use that to write out to a flat file or as input to your script. Or another job could use it to write to your target database.

Or is that too 'Server' of an answer again? :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
nivas
Participant
Posts: 117
Joined: Sun Mar 21, 2004 4:40 pm

Post by nivas »

Hi

I did not understand the link between the max value I got in the server job and the $userstatus in the next activity stage. How do I pass the max value I got in the first server job and how do I get that value in the next job through $userstatus.

By the way can we find the end of processing of records inthe server job and we use that in the constrain so that we can tell the datastage to send the value to the output link. Please suggest me.

thanks
Nivas
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Hmmm... have you created Sequencer jobs yet? They allow you to control the 'flow' of several related jobs / routines / activities and help automate the passing of parameters between or to them. When you add an Activity stage to a Sequencer, you can automate the passing of parameters from either the Sequencer job itself or the 'output' from any activity that has run before.

So, say for example your 'FirstJob' wrote this max value to its USERSTATUS area. Your 'SecondJob' could have a job parameter called MaxValue defined in it. In the Sequencer, using the 'External Parameter Helper' from the Properties of 'SecondJob', you could set the value of MaxValue to FirstJob.$UserStatus. This would 'automatically' retrieve the current value from the USERSTATUS area of the first job and pass it into the second job.

There have been quite a number of posts here on the subject of USERSTATUS, searching for them should help with your understanding. It is written to using the DSSetUserStatus function and you'll need to 'wrapper' it in a routine to call it from a derivation. I even posted the two lines of code needed a little while back.

On the subject of 'end of processing', there is no indicator or anything available in the engine itself that would let you know you are on the last row in a Server job. You know it was the last row when the job completes. :wink: One nice thing about the User Status area is it works just like the hash game I described earlier, mostly because it is one that each job has associated with it. It follows the 'destructive overwrite' philosophy as well, such that only the last thing you wrote to it will be available when the job ends.
-craig

"You can never have too many knives" -- Logan Nine Fingers
nivas
Participant
Posts: 117
Joined: Sun Mar 21, 2004 4:40 pm

Post by nivas »

Chulett

Thanks for your support and information. I donot have remote access. As you suggested I am going to develop jobs on Monday at office.

thanks
NIvas
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

No problem, glad to help! Let us know if you have more questions when you get into the thick of it. :lol:
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply