Determine end of input data
Background:
We have several simple jobs which read a source table/sequential file and output the data to a target table via a transformer.
Requirement:
Write a row to a common audit table which includes the input and output row count columns.
Issue:
I was thinking of simply inserting a row into the audit table via a link from the transformer. But this link should only be taken once all the input rows have been processed.
Which constraint can I use to determine the end of input data?
I'm also open to other suggestions on meeting the above requirement.
Ok, I've tried to use an Aggregator stage. The input columns to the stage are:
'1' = Key1
@INROWCOUNT = wk_Input_Row
@OUTROWCOUNT = wk_Output_Row
In the Aggregator stage I am grouping by Key1 and taking the maximum value of the other two columns.
But the output from the Aggregator stage seems to be one row, with the value from only one partition. I'm running on a 4-node configuration, so the value is roughly a quarter of the expected total. The partition method is currently set to Auto on the Aggregator stage.
Which partition method should I use, to ensure I get the correct value?
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Get ETLStats from Kim Duke's website - it will do all you require (and more) without you re-inventing the wheel.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ETLStats is still applicable.
The only way to aggregate over the entire data set is to run the Aggregator stage in a single node (that is, sequentially). You can use a Sort/Merge collector to preserve any existing sorting.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
mydsworld wrote: How do you capture the counts in 'after-job' in Job properties. The procedures listed there do not contain 'DSGetLinkInfo'.

If this is another lovely 'not in the v8 online help' thing, then use your pdf documentation files - the Advanced PX, Server Developer and BASIC PDFs all mention those functions.
I meant something that takes Job Name, Stage Name and Link Name as parameters and gets link information for any job. Typically this would be a routine, and then the routine could pass the result to a generic 'load this metric' job that puts the result in your audit table.

mydsworld also wrote: For a trailing generic job (for getting those counts), how can we do that in sequence?
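Such a generic routine might look like the following sketch in DataStage BASIC. The DSAttachJob, DSGetLinkInfo and DSDetachJob calls (and the DSJ.LINKROWCOUNT constant) are the standard DataStage BASIC API; the routine name, error handling and return convention here are purely illustrative assumptions, not code from this thread.

```basic
* Sketch only: a generic routine that returns the row count for a
* named link in a named job. DSAttachJob/DSGetLinkInfo/DSDetachJob
* are the standard DataStage BASIC API; the rest is illustrative.
FUNCTION GetLinkRowCount(JobName, StageName, LinkName)
      Ans = -1                                  ;* default: count unavailable
      hJob = DSAttachJob(JobName, DSJ.ERRWARN)  ;* attach to the target job
      If NOT(hJob) Then
         Call DSLogWarn("Cannot attach to job " : JobName, "GetLinkRowCount")
      End Else
         Ans = DSGetLinkInfo(hJob, StageName, LinkName, DSJ.LINKROWCOUNT)
         Dummy = DSDetachJob(hJob)              ;* always release the handle
      End
RETURN(Ans)
```

A trailing job in the sequence could then call this routine for each monitored link and write the results to the audit table.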
Use a single node to... erk, never mind, Ray is up and online.
-craig
"You can never have too many knives" -- Logan Nine Fingers
Suppose I have a server routine 'MyServRoutine' (written in DS BASIC) which I want to include in a PX job as a 'Before-job' or 'After-job' subroutine. Now when I open the Job properties of the PX job, I find only the following options under 'Before-job' / 'After-job':
(none)
DSJobReport
DSSendMail
DSWaitForFile
ExecSH
ExecSHSilent
ExecTCL
So, there is no way to choose the routine 'MyServRoutine'.
Please advise.
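For what it's worth (this is not stated in the thread above): the Before-job/After-job drop-down normally only lists server routines that were created in the Designer Routines branch with the type "Before/After Subroutine", which take exactly two arguments. A minimal skeleton, with the logging line purely illustrative:

```basic
* Skeleton of a routine created with type "Before/After Subroutine" -
* only that routine type appears in the Before-job/After-job drop-down.
* InputArg carries the string entered in Job Properties.
SUBROUTINE MyServRoutine(InputArg, ErrorCode)
      ErrorCode = 0      ;* 0 = success; a non-zero value stops the job
      Call DSLogInfo("Called with: " : InputArg, "MyServRoutine")
RETURN
```

A routine compiled as a plain Function or Transform type will not show up in that list, which would explain the symptom described.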