
Preserve the value of a particular field in transformer

Posted: Sat Jan 19, 2008 3:19 am
by sumesh.abraham
Hi All,

I have a parallel job. The first stage is an Oracle Enterprise stage; the records it fetches are passed to a Transformer, and a few other stages follow to carry out the remaining logic. Every record fetched from the table has the same value in the POST_DATE field. I need to preserve the value of POST_DATE in the Transformer (using a derivation) so that I can use it in multiple stages within the job. I would appreciate it if you could suggest different approaches for this, and say which one is the most efficient (in terms of performance, reusability, etc.).

I have another question as well: can we use an SQL query in a parallel routine?

Thanks,
Sumesh

Posted: Sat Jan 19, 2008 5:40 am
by ray.wurlod
The most efficient is just to keep passing POST_DATE along all the links until you don't need it any more.

You can do whatever you like in a parallel routine, provided that you can encapsulate it all in C++. For example, you could use the functions of the ODBC API. (This is NOT a part of DataStage, and you may need to use licensed ODBC drivers.) Or you could use the individual database's client API - they all have one.

Posted: Sat Jan 19, 2008 6:59 am
by sumesh.abraham
Ray,

Thanks for the reply. Passing POST_DATE along all the links is the first approach that comes to mind, but I am interested in knowing the alternatives.

Posted: Mon Jan 21, 2008 6:48 am
by priyadarshikunal
sumesh.abraham wrote:
Ray,

Thanks for the reply. Passing POST_DATE along all the links is the first approach that comes to mind, but I am interested in knowing the alternatives.

The way Ray suggested is the most appropriate one; that is why he suggested it.

Can you think of a better option? You can't.

Let's look at another option. You could perform a join or lookup on the source data once again to get POST_DATE, but that is a costly operation.

Still, if you don't want to pass the column along, you can use this approach.

Posted: Mon Feb 18, 2008 4:57 pm
by sajarman
Yes, I too think passing them through is the best option.

But this gave me another thought: is there a way to assign the value of this column to a variable, argument, or parameter within one stage and then use it later in another stage, assuming the value is the same across all rows for this column?

Posted: Mon Feb 18, 2008 6:04 pm
by JoshGeorge
Yes, using a parallel routine. Option 1: set and get a job or environment parameter, although this is not an optimised method for this requirement. Option 2: write the value to a file from the first Transformer and read the file in the next one; again, this is not an optimised method. The best approach for this requirement is the one noted above by Ray.
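As a rough illustration of Option 2, here is a minimal C++ sketch of the two helpers such a routine might wrap. The function names, the file path argument, and the one-line file layout are all assumptions made for this example; none of them come from DataStage. It also assumes the reader runs only after the writer has finished, which a job with concurrently running stages would have to arrange separately.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: write the shared POST_DATE value to a small text file.
// Returns true if the value was written and flushed successfully.
bool save_post_date(const std::string& path, const std::string& post_date) {
    std::ofstream out(path, std::ios::trunc);
    if (!out) return false;
    out << post_date << '\n';
    out.flush();
    return static_cast<bool>(out);
}

// Hypothetical helper: read the value back.
// Returns an empty string if the file is missing or empty.
std::string load_post_date(const std::string& path) {
    std::ifstream in(path);
    std::string value;
    if (in) std::getline(in, value);
    return value;
}
```

Every call opens and closes a file, so invoking this per row would add I/O to each record. That is consistent with the remark above that this is not an optimised method compared with simply passing the column along.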

Posted: Thu Feb 21, 2008 3:57 pm
by sajarman
Thanks Joshy!

Posted: Thu Feb 21, 2008 5:16 pm
by ray.wurlod
Stages (operators) run (possibly) in separate processes. Therefore the variables available to one are not available to the others.
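The effect of separate processes can be seen in a generic POSIX sketch that has nothing DataStage-specific in it. The variable and function names are hypothetical, and fork() is used here only as a stand-in for "two stages running in different processes"; it is not a description of how the engine actually starts its operators.

```cpp
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Stand-in for a variable held inside one stage's process.
static std::string g_post_date = "unset";

// Forks a child that assigns g_post_date, waits for it to exit, and returns
// the value the parent process still sees afterwards.
// Returns "fork failed" if the child process cannot be created.
std::string parent_view_after_child_sets(const std::string& new_value) {
    pid_t pid = fork();
    if (pid < 0) return "fork failed";
    if (pid == 0) {
        g_post_date = new_value;  // changes only the child's private copy
        _exit(0);
    }
    int status = 0;
    waitpid(pid, &status, 0);
    return g_post_date;  // the parent's copy was never touched
}
```

The parent still sees its original value after the child has assigned a new one, because each process has its own copy of the variable. To share the value, it has to travel through something outside the process, such as the data on the link or a file.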