1) I need to summarize (sum total) one column and get a record count of the input data using a Transformer (BASIC or parallel). I want to avoid using the Aggregator. Also, since I need to call a server routine, I need to use a BASIC Transformer.
2) I cannot call a server routine from a parallel Transformer. Can I abort a job using a parallel Transformer if some condition is not satisfied for the input data? I have a server routine for this. Is anybody aware of a parallel routine, callable from a parallel Transformer, that does the same job?
Summarizing Columns in Transformer
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 13
- Joined: Tue Jan 25, 2005 12:07 am
- Location: Mumbai,India
-
- Participant
- Posts: 232
- Joined: Sat May 07, 2005 2:49 pm
- Location: USA
Hi Mahesh,
To sum one column using a Transformer, you can use stage variables.
To get a record count of the input data, use the system variable @INROWNUM.
You can use a BASIC Transformer in the PX job; that will allow you to call your server routine.
I hope this helps.
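The stage-variable approach can be sketched like this (a minimal sketch; svTotal and svCount are hypothetical stage-variable names, and InLink.Amount stands for whichever input column you are summing). Give both stage variables an initial value of 0; they are evaluated top to bottom for every row:

```
svTotal = svTotal + InLink.Amount   ; * running total of the column
svCount = @INROWNUM                 ; * rows read so far
```

Every output row then carries the running total and count in its derivations; the last row processed holds the grand total.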
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Welcome aboard! :D
Best practice is never to abort, so that you retain control. Pre-process the data to look for violations. If any is found, your job sequence can choose not to run the "real" job, or to run an intermediate job to correct those violations, if such action is appropriate/possible.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 13
- Joined: Tue Jan 25, 2005 12:07 am
- Location: Mumbai,India
pnchowdary wrote: Hi Mahesh,
To sum one column using a Transformer, you can use stage variables.
Mahesh: I used a stage variable, but it outputs as many rows as the input. I just need one summarized row.
pnchowdary wrote: To get a record count of the input data, use the system variable @INROWNUM. You can use a BASIC Transformer in the PX job; that will allow you to call your server routine.
Mahesh: I'm already doing that. My question is: can I call a server routine in a parallel Transformer? Or how can I convert it into a parallel routine?
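On getting just one summarized row out of the stage-variable approach: since the Transformer emits the running total on every row, one design (a sketch, not the only option) is to follow the Transformer with a Tail stage set to keep a single row, so only the final total survives. The stream must run on a single partition (sequentially, or collected first) so that "last row" is well defined; column names here are illustrative:

```
Input ──> Transformer ─────────────────> Tail ──────────> Output
          svTotal = svTotal + InLink.Amount   (Number of Rows = 1,
          OutLink.Total = svTotal              single partition)
          OutLink.RecCount = @INROWNUM
```

An Aggregator after the Transformer achieves the same result, which is why it is the usual recommendation.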
-
- Participant
- Posts: 13
- Joined: Tue Jan 25, 2005 12:07 am
- Location: Mumbai,India
Summarizing Using Transformer (BASIC / PARALLEL)
Thanks for the warm welcome! :D
ray.wurlod wrote: Welcome aboard! :D
Best practice is never to abort, so that you retain control. Pre-process the data to look for violations. If any is found, your job sequence can choose not to run the "real" job, or to run an intermediate job to correct those violations, if such action is appropriate/possible.
I agree with you, Ray, but you see, the client is the KING. I will try your approach, but it's important to mention that we are not supposed to pre-process the data; we are to abort the job if the rejects exceed the threshold value specified by the client. Ray, I would request you to read my questions about using a Transformer for summarization; I'm sure your inputs would be valuable. Have a wonderful weekend!
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
The client may always be KING but even kings are fallible, or ignorant. If the client has demanded that you construct jobs in a way that is against your better (educated) judgment, demand to know why and point out the consequences of following the client's mandate versus your better design. The courage to do so marks out the truly professional consultant from the ordinary.
For example, why avoid the right tool for the job? The Aggregator stage is designed precisely to group and count (among other aggregation functions). You can have the Aggregator stage following your Transformer stage. (Incidentally, the parallel Aggregator stage is more robust than the server Aggregator stage, so any argument about it being flaky can be dismissed.)
Why must you call a server routine? Can't you implement the same logic using parallel job techniques, perhaps even an equivalent parallel routine? What does this routine do that it must be a server routine? The BASIC Transformer stage will prove to be a major throughput bottleneck, because it must run in Sequential mode. Create something that can take advantage of the parallel execution architecture.
I don't believe you or your client has thought this through. Challenge your client!
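For what converting to a parallel routine involves: a PX parallel routine is an external C/C++ function, compiled into an object file or shared library and registered in the Repository as an External Function, which the parallel Transformer can then call like any other function. A minimal sketch, assuming the server routine's job is a threshold check (the name checkThreshold and its arguments are hypothetical; the real signature must match what you register):

```cpp
// checkThreshold.cpp - hypothetical parallel routine sketch.
// Compile with the same compiler and flags used for PX routines,
// then register it in the Repository as an External Function.

extern "C" int checkThreshold(int rejectCount, int threshold)
{
    // Return 1 when the reject count breaches the client's threshold,
    // 0 otherwise; the calling Transformer constraint decides what to do.
    return (rejectCount > threshold) ? 1 : 0;
}
```

The Transformer can then call checkThreshold() in a derivation or constraint instead of the BASIC server routine, keeping the stage fully parallel.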
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
- Contact:
I agree with Ray: introduce a milestone point where data is staged, run the data to this milestone point, and then decide whether the load can continue. We pre-process our data into load-ready dataset files and then use simple database load jobs to get it in. Use job reporting and link counting to work out how successful the processing was and whether it passed threshold levels.
You could hack the behaviour of parallel jobs to get what you want. The problem is that, unlike server jobs, you do not have much control over a Transformer reject link: you cannot define a reject message or choose which rows go down it. The hack is to create a link out of the Transformer that leads to a Peek stage, which produces a log message for each row, and use a custom job message handler to turn the Peek message into a warning message. You now have a link that produces a warning for every row, which will trigger the 50-warning limit, or whatever limit you set (via your sequence job or job control code). Also include a standard reject link and you get both custom rejects and standard rejects from that Transformer, both producing warning messages.
I haven't used this; I prefer a message handling system that delivers meaningful messages to a log file or log table.
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn