
DSGetStageInfo in Parallel job

Posted: Wed Apr 18, 2007 2:18 am
by pravin1581
Hi All,

I have a requirement to derive the data corresponding to the row number. In a server job this was achieved using DSGetStageInfo(DSJ.ME, DSJ.ME, DSJ.STAGEINROWNUM) in the constraint of the Transformer stage, but DSGetStageInfo is not available in parallel jobs. How can the same be done there?

Re: DSGetStageInfo in Parallel job

Posted: Wed Apr 18, 2007 3:31 am
by priyadarshikunal
The same function is available in parallel jobs as well.

Syntax:

Result = DSGetStageInfo (JobHandle, StageName, InfoType)

JobHandle is the handle for the job as derived from DSAttachJob, or it
may be DSJ.ME to refer to the current job.

StageName is the name of the stage to be interrogated. It may also be
DSJ.ME to refer to the current stage.

InfoType specifies the information required; it may be DSJ.STAGEINROWNUM.

Try using it in a routine :)

Re: DSGetStageInfo in Parallel job

Posted: Wed Apr 18, 2007 4:02 am
by pravin1581
Can I use it directly in the Transformer constraint, or do I need to write a routine for it and then call that routine in the Transformer?

Re: DSGetStageInfo in Parallel job

Posted: Wed Apr 18, 2007 5:14 am
by priyadarshikunal
The function returns a value that depends on the InfoType supplied, no matter where you use it; how you handle the result after that is up to you.

Your call will return the primary link's input row number.

Posted: Wed Apr 18, 2007 3:48 pm
by ray.wurlod
Forget the function. The same number is available in the system variable @INROWNUM.

Posted: Wed Apr 18, 2007 10:32 pm
by pravin1581
@INROWNUM returns the incremental row count. What I need is the value in the column corresponding to the row number, not the row number itself.

Posted: Wed Apr 18, 2007 10:52 pm
by ray.wurlod
You won't get that from DSGetStageInfo().

Use a stage variable to keep the running total.

Posted: Wed Apr 18, 2007 11:02 pm
by pravin1581
How can a running total help in this case? I need the values corresponding to the row number.

rownum  ID
------  --
10      5
11      4

I need the values 5 and 4 in the output file.

Posted: Wed Apr 18, 2007 11:35 pm
by ray.wurlod
How did you propose to get it from DSGetStageInfo()?!!

Is the source a Sequential File? If so you can use the Row Number Column property to yield the row number from the file - even if you are reading multiple files or using multiple readers per node.

Presumably - because you've never said - the value of ID comes from the row you've imported.
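
To make the idea concrete: once the row number travels with each record as an ordinary column, picking out the ID for particular row numbers is just a filter. The sketch below only simulates that logic in plain C++; it is not DataStage code. The names Record, row_number and ids_for_rows are hypothetical, and row_number stands in for whatever column the Row Number Column property would populate.

```cpp
#include <set>
#include <vector>

// One imported record. row_number stands in for the column that the
// Row Number Column property would populate (hypothetical name).
struct Record {
    long long row_number;
    int id;
};

// Return the ID values of the records whose row number is in `wanted`,
// preserving the input order.
std::vector<int> ids_for_rows(const std::vector<Record>& records,
                              const std::set<long long>& wanted) {
    std::vector<int> ids;
    for (const Record& r : records) {
        if (wanted.count(r.row_number) != 0) {
            ids.push_back(r.id);
        }
    }
    return ids;
}
```

With the sample data from earlier in the thread, asking for row numbers 10 and 11 would yield the ID values 5 and 4. In a job, the equivalent would presumably be a constraint on the row number column rather than a call to any DS function.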

Posted: Thu Apr 19, 2007 1:22 am
by pravin1581
The source is a sequential file, and DSGetStageInfo(DSJ.ME, DSJ.ME, DSJ.STAGEINROWNUM) works in the server job to return the value corresponding to the row.

Posted: Thu Apr 19, 2007 5:01 am
by ray.wurlod
So what?

BASIC functions do not work in the C++ environment of parallel jobs.

If you want to write your own C++ equivalent of DSGetStageInfo() as a parallel routine - and you prefer such an inefficient approach - then there is a C-callable DataStage API that offers a DSGetStageInfo() function.

Posted: Thu Apr 19, 2007 8:49 am
by pravin1581
Can you please elaborate on the C-callable DataStage API? I am new to parallel jobs, and we will need quite a few DS functions in our job, as we did in the server job. Please explain how DS functions can be used in a PX job, since they are not available in the Transformer.

Posted: Thu Apr 19, 2007 3:03 pm
by ray.wurlod
DS functions cannot be used in a parallel job until they are available from the Transformer (expression editor).

For this to happen, you have to create the parallel routines.

A parallel routine in the Repository is merely an interlude to a C++ function that you yourself have written, compiled and linked. And, presumably, tested.

Within that routine, you can call functions in the DataStage API. The functions in the API are documented in Chapter 7 of the Parallel Job Advanced Developer's Guide.

There is nothing "straight out of the box", which is what you appear to be expecting.

Posted: Thu Apr 19, 2007 10:42 pm
by pravin1581
But when we create a routine, it asks for a library path in which to save it.

Posted: Thu Apr 19, 2007 10:56 pm
by ray.wurlod
If that requirement bewilders you, hire a competent C++ programmer.

What you are creating in the Repository is not the routine itself, but an interlude that allows the C++ function to be found.

It is not asking you for a library in which to save anything - it is asking you for the pathname of the library in which the C++ function can be found.
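
As a rough illustration of the kind of C++ function that could sit behind such a routine definition, here is a minimal sketch. The function name row_in_range and its arguments are hypothetical, and it deliberately does not call the DataStage API. It assumes the function is exported with C linkage and built into an object file or shared library, whose pathname is then what the routine definition asks for; check the Parallel Job Advanced Developer's Guide for the exact build requirements on your platform.

```cpp
// Hypothetical helper intended to be called from a Transformer expression
// via a parallel routine definition. Returns 1 when rownum lies in the
// inclusive range [first, last], otherwise 0.
// extern "C" keeps the symbol name unmangled so it can be located by name
// in the compiled library (an assumption about how the routine is resolved).
extern "C" int row_in_range(long long rownum, long long first, long long last) {
    return (rownum >= first && rownum <= last) ? 1 : 0;
}
```

A constraint could then call the routine with the row number column and the bounds of interest, for example 10 and 11 for the sample data earlier in the thread.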