
Reject record number cut off

Posted: Mon Jan 23, 2012 4:29 pm
by evee1
My jobs may reject records at various stages, but I would like to log only a specific number of rejected records (a project-wide parameter X) and disregard the rest.
I have a shared container that I send all the rejects to. The container logs the rejects into database tables. I'm only interested in X records, and any X of the rejected records will do.
I tried using @INROWNUM to populate a record number column in the Transformer, but this does not work with parallel processing, as it counts records on each node separately.
Given that the container also has an Aggregator after the Transformer stage (I need to abort if the number of rejects goes above a certain threshold, specific to each job), can I run the Transformer in sequential mode to use @INROWNUM without a detrimental effect on performance?
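
To illustrate the per-node counting problem described above, here is a minimal Python sketch (not DataStage code). It assumes each partition keeps its own row counter starting at 1, as @INROWNUM does, and the function and variable names are hypothetical. A filter of "row number <= X" then lets through up to X rows per partition rather than X rows in total.

```python
def per_partition_filter(partitions, limit):
    """Keep rows whose per-partition row number is <= limit.

    Mimics applying "@INROWNUM <= X" independently on every node:
    each partition has its own counter that starts at 1.
    """
    kept = []
    for rows in partitions:
        for inrownum, row in enumerate(rows, start=1):
            if inrownum <= limit:
                kept.append(row)
    return kept


# Hypothetical data: 4 partitions with 10 rejected rows each.
partitions = [[f"p{p}_row{i}" for i in range(10)] for p in range(4)]

kept = per_partition_filter(partitions, limit=5)
print(len(kept))  # 20, i.e. 5 rows from each of the 4 partitions, not 5 in total
```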

Posted: Mon Jan 23, 2012 4:52 pm
by ray.wurlod
Try using a Head stage on the input to the shared container.

Posted: Mon Jan 23, 2012 6:30 pm
by evee1
The Head stage limit is also per partition: the property is Number of Rows (Per Partition).

Posted: Mon Jan 23, 2012 8:15 pm
by qt_ky
I think running the Transformer sequentially would not have much impact, because the records coming into it are presumably only rejects, which should be a small fraction of the overall records processed by the main part of the job.

You can also use various combinations of System Variables with a formula to calculate a sequential input row number in a parallel Transformer.

@INROWNUM
Input row counter.

@NUMPARTITIONS
The total number of partitions for the stage.

@PARTITIONNUM
The partition number for the particular instance.
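
One commonly quoted combination of these variables is (@INROWNUM - 1) * @NUMPARTITIONS + @PARTITIONNUM + 1. The thread does not spell out a formula, so treat this as an assumption; it also assumes @INROWNUM starts at 1 and @PARTITIONNUM starts at 0. The Python sketch below (hypothetical names, not DataStage code) simulates it. The numbers it produces are unique across partitions, so filtering on "number <= X" keeps at most X rows. They are contiguous only when rows are spread evenly (for example round robin); with skewed partitions fewer than X rows get through.

```python
def global_row_number(inrownum, numpartitions, partitionnum):
    """Interleave per-partition counters into one unique numbering.

    inrownum is 1-based, partitionnum is 0-based (assumed).
    """
    return (inrownum - 1) * numpartitions + partitionnum + 1


def limit_rejects(partitions, limit):
    """Keep rows whose computed global row number is <= limit."""
    numpartitions = len(partitions)
    kept = []
    for partitionnum, rows in enumerate(partitions):
        for inrownum, row in enumerate(rows, start=1):
            if global_row_number(inrownum, numpartitions, partitionnum) <= limit:
                kept.append(row)
    return kept


# Hypothetical data: 4 evenly loaded partitions with 10 rejected rows each.
even = [[f"p{p}_row{i}" for i in range(10)] for p in range(4)]
print(len(limit_rejects(even, limit=5)))  # 5

# Skewed case: all rows on partition 0 get numbers 1, 5, 9, ...
skewed = [[f"row{i}" for i in range(10)], [], [], []]
print(len(limit_rejects(skewed, limit=5)))  # 2
```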

Posted: Tue Jan 24, 2012 12:05 am
by ray.wurlod
True, but given the likely small number of reject rows, why not run the Head stage in Sequential mode?
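
As a rough Python analogy (not DataStage code, names are hypothetical): running the Head stage sequentially amounts to collecting the rows from all partitions into a single stream and taking the first X of that stream. The sketch reads the partitions one after another, which is only one possible collection order; since any X rejects will do, the order should not matter here.

```python
from itertools import chain, islice


def sequential_head(partitions, limit):
    """Collect all partitions into one stream and keep the first `limit` rows."""
    single_stream = chain.from_iterable(partitions)
    return list(islice(single_stream, limit))


# Hypothetical data: 4 partitions with 10 rejected rows each.
partitions = [[f"p{p}_row{i}" for i in range(10)] for p in range(4)]

print(len(sequential_head(partitions, limit=5)))  # 5, regardless of partition count
```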

Posted: Tue Jan 24, 2012 7:20 pm
by qt_ky
That works for me too! Just throwing out some options. :D