Need an optimized solution

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
ag_ram
Premium Member
Premium Member
Posts: 524
Joined: Wed Feb 28, 2007 3:51 am

Need an optimized solution

Post by ag_ram »

All,

Requirement:
Extract Col1(DataType:TimeStamp) from Table1 where Col2="<value>". Based on the value of Col1, filter the records while extracting from some other table.
In short:
I need to pass the execution result of an SQL Query as the value of a Job parameter to the next DS Job having another extract query.
Possible solutions:
1. Create a Job: [DB Stage] --> [Seq File Stage]. Open that file and fetch the value then pass it as paramter value to next Job.
2. By exploring Dsxchange as of last min, could be done using Job control code or using Server Routine. But need elaboration on this solution.

Please give your suggestion to make it more innovative with optimization on Sol2.
Note:
First extract will result only one record and have one column for ever. So this makes me not go for Sol1.

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Forget Sol2. Anything with code carries a maintenance overhead. Use a Lookup stage to get the comparison value. Perform the comparison downstream of that.

Code: Select all

                 Extract2
                    |
                    |
                    V
  Extract1 ----> Lookup ----> Filter ----->
Extract2 is of the form SELECT 'X', value FROM table (the constant 'X' is important). Extract1 is of the form SELECT Col1, 'X' FROM table (the constant 'X' is important). Perform the lookup based on the constant field - with Entire partitioning you will always find it. If either of your sources is not amenable to SQL, insert a Column Generator stage to generate the constant using a cycle with a single Value property of 'X'.

The reference input data set will contain only one row.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ag_ram
Premium Member
Premium Member
Posts: 524
Joined: Wed Feb 28, 2007 3:51 am

Post by ag_ram »

Thanks Ray,

In production, I may have a chance to handle more than 5 Millions of records as a result of Extract2. Considering your sugesting, one value(Timestamp from Extract1) would be assigned to all records coming from Extract2 - shows undisirable to be considered as better solution but acceptable.

I need a clue for a server routine of:
1. Execute SQL Extract1 and assign the result(Timesatmp) to a local variable(LocVar1).
2. Run the Second Job with assigning the LocVar1 value to the Job parameter of Extract2.

Ray, any light on this kind of solution.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Why do you need a server routine? You can't call that from a parallel job.
What do you believe to be problematic about the solution I already gave you? The lookup will be performed against a one-row virtual Data Set.

If you want to go your way, do it in a job sequence. Don't write code. Code has an additional maintenance overhead. The sequence contains one server job to retrieve the single value and park it somewhere (a file or the user status area of that job) which can be retrieved and passed as a value for the downstream (parallel) job's parameter.

Search the (server) forum for how to use the user status area in a server job.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ag_ram
Premium Member
Premium Member
Posts: 524
Joined: Wed Feb 28, 2007 3:51 am

Post by ag_ram »

Thanks Ray,

Really such a valuable pointer. I created a server Job and before that i created a transform function(Server Routine) then use that routine in the transformer Stage, renders the desired result.

Code: Select all

Call DSSetUserStatus(Arg1)
Ans = Arg1
It is working now.
Post Reply