Reject record number cut off

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

evee1
Premium Member
Posts: 96
Joined: Tue Oct 06, 2009 4:17 pm
Location: Melbourne, AU

Post by evee1 »

My jobs may reject records in various stages, but I would like to log only a specific number of rejected records (a project-wide parameter, X) and disregard the rest.
I have a shared container that I send all the rejects to. The container logs the rejects into database tables. I only need X records logged; it does not matter which of the rejected records they are.
I tried using @INROWNUM and a record number column in the Transformer, but this does not work with parallel processing, because the rows are counted separately on each node.
The Transformer stage in this container is followed by an Aggregator (I need to abort if the number of rejects goes above a threshold specific to each job). Given that, can I run the Transformer in sequential mode and use @INROWNUM without a detrimental effect on performance?
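
To illustrate the problem, here is a small Python simulation (not DataStage code) of what a per-partition row counter does to the limit. The four partitions, the row names, and the function name are made up for the example; the only assumption carried over from DataStage is that @INROWNUM starts at 1 on each partition.

```python
def per_partition_filter(partitions, limit):
    """Keep rows whose per-partition counter is <= limit.

    Mimics a Transformer constraint like "@INROWNUM <= X" running in
    parallel: every partition keeps its own counter starting at 1.
    """
    kept = []
    for rows in partitions:
        for inrownum, row in enumerate(rows, start=1):
            if inrownum <= limit:
                kept.append(row)
    return kept


# Hypothetical data: 4 partitions, 10 rejected rows each.
partitions = [[f"p{p}_r{r}" for r in range(10)] for p in range(4)]
kept = per_partition_filter(partitions, limit=5)
print(len(kept))  # 20, i.e. limit * number of partitions, not 5
```

So with X = 5 on a 4-node configuration, up to 20 rows get through.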
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Try using a Head stage on the input link inside the shared container.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.

Post by evee1 »

The Head stage limit is also applied per partition: the property is Number of Rows (Per Partition).
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

I think running the Transformer sequentially would not have much impact, on the assumption that the only records coming into it are rejects, which are probably a small fraction of the overall records processed by the main part of the job.

You can also use various combinations of System Variables with a formula to calculate a sequential input row number in a parallel Transformer.

@INROWNUM
Input row counter.

@NUMPARTITIONS
The total number of partitions for the stage.

@PARTITIONNUM
The partition number for the particular instance.
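
One formula often used for this is (@INROWNUM - 1) * @NUMPARTITIONS + @PARTITIONNUM + 1. The Python sketch below only simulates it; it assumes @INROWNUM starts at 1 and @PARTITIONNUM starts at 0, and the function names and data are made up for illustration. Note that the numbers it produces are unique across partitions but only contiguous when rows are spread evenly, so a constraint of "number <= X" passes at most X rows, possibly fewer.

```python
def global_row_number(inrownum, numpartitions, partitionnum):
    """Interleave per-partition counters into job-wide unique numbers.

    Assumes inrownum is 1-based and partitionnum is 0-based.
    """
    return (inrownum - 1) * numpartitions + partitionnum + 1


def first_x(partitions, limit):
    """Keep rows whose computed global number is <= limit."""
    n = len(partitions)
    kept = []
    for partitionnum, rows in enumerate(partitions):
        for inrownum, row in enumerate(rows, start=1):
            if global_row_number(inrownum, n, partitionnum) <= limit:
                kept.append(row)
    return kept


# Evenly spread rejects: exactly 5 rows pass.
balanced = [[f"p{p}_r{r}" for r in range(10)] for p in range(4)]
print(len(first_x(balanced, 5)))  # 5

# All rejects on one partition: only rows numbered 1 and 5 pass.
skewed = [[f"p0_r{r}" for r in range(10)], [], [], []]
print(len(first_x(skewed, 5)))  # 2
```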
Choose a job you love, and you will never have to work a day in your life. - Confucius

Post by ray.wurlod »

True but, given the small number of reject rows likely, why not run the Head stage in Sequential mode?
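
As a rough illustration in Python (not DataStage code), running the Head stage sequentially amounts to collecting all partitions into one stream and taking the first X rows from it. The function name and data are hypothetical, and the order in which partitions are chained here is only one possibility; in a real job it depends on the collection method.

```python
from itertools import chain, islice


def sequential_head(partitions, limit):
    """Collect every partition into one stream, then keep the first rows."""
    return list(islice(chain.from_iterable(partitions), limit))


# Even when all rejects land on one partition, exactly `limit` rows pass
# (or all of them, if there are fewer than `limit`).
skewed = [[f"p0_r{r}" for r in range(10)], [], [], []]
print(len(sequential_head(skewed, 5)))  # 5
```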

Post by qt_ky »

That works for me too! Just throwing out some options. :D