1. Create a parallel job like the one below:
                 Seq file2
                     |
                     |
   Aggregator (record count, group by Field1)
                     |
                     |
Seq file1 --------- Join --------- Seq File2 (write the join output)
              (based on Field1)
2. Execute a Unix command as a post-job subroutine to add a column, as follows:
a. For records where the record count is > 1, add a new field with value Y
b. For records where the record count is = 1, add a new field with value N
But I need to create a dataset output, not a text file, and Unix commands do not work on dataset files.
Since I only need to introduce a new column, can't the Column Generator stage serve this purpose, without using any Transformer/Sort?
Please help.
Regards,
Kumarjit.
Pain is the best teacher, but very few attend his class..
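For readers following along outside DataStage, the requirement in the question above can be sketched in Python. This is only an illustration of the logic (count records per Field1, then flag each record Y or N), not a DataStage job. The column name DupFlag, the Value column, and the sample rows are hypothetical.

```python
from collections import Counter


def flag_duplicates(rows, key="Field1"):
    """Add a DupFlag column: 'Y' if the key occurs more than once, else 'N'."""
    # Equivalent of the Aggregator: record count grouped by the key.
    counts = Counter(row[key] for row in rows)
    # Equivalent of the Join plus the new column: look up each row's count.
    return [dict(row, DupFlag="Y" if counts[row[key]] > 1 else "N") for row in rows]


# Hypothetical sample data; only Field1 comes from the original question.
rows = [
    {"Field1": "A", "Value": 1},
    {"Field1": "B", "Value": 2},
    {"Field1": "A", "Value": 3},
]
flagged = flag_duplicates(rows)
for row in flagged:
    print(row)
```

Both rows with Field1 = "A" get the flag Y and the single "B" row gets N, which matches conditions a and b.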
You can use a Sort stage to generate a key change column. Then, in the Transformer, use stage variables to check for a change in the key change column and assign the value accordingly.
If the key change column is '0', assign 'Y'. If the previous key change column was '0' and the current one is '1', assign 'Y'; else assign 'N'.
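The key change idea can be sketched in Python as follows. This is a rough illustration and not Transformer derivation code. It assumes the key change column is 1 on the first row of each sorted key group and 0 on the rows that repeat the key. Note one deliberate difference from the reply above: to give the first row of a multi-row group a Y as well, this sketch peeks at the next row rather than the previous one. That look-ahead is my own assumption; the function names are hypothetical.

```python
def add_key_change(sorted_keys):
    """Mimic a key change column: 1 on the first row of each key group, else 0."""
    flags = []
    previous = object()  # sentinel that never equals a real key
    for key in sorted_keys:
        flags.append(1 if key != previous else 0)
        previous = key
    return flags


def flag_from_key_change(key_change):
    """Return 'Y' for rows in groups of more than one row, 'N' otherwise."""
    result = []
    for i, current in enumerate(key_change):
        # Treat "no next row" like the start of a new group.
        next_flag = key_change[i + 1] if i + 1 < len(key_change) else 1
        # A row is in a multi-row group if it repeats the key (0)
        # or if the row right after it repeats the key.
        result.append("Y" if current == 0 or next_flag == 0 else "N")
    return result


keys = sorted(["B", "A", "C", "A", "C", "C"])  # A A B C C C
key_change = add_key_change(keys)              # 1 0 1 1 0 0
flags = flag_from_key_change(key_change)       # Y Y N Y Y Y
print(list(zip(keys, key_change, flags)))
```

The input must be sorted on the key first, just as the Sort stage would guarantee in the job.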
If yes, use a Column Generator to generate "Y" for all rows, then use the Modify stage to convert the NULL from the left outer join into "N". And/or use a fork/join to split the streams based on the result of the join (or lookup).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
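Since only part of that reply is visible, the following Python sketch is a guess at the general pattern, not a reconstruction of the full suggestion. It assumes the right-hand side of the left outer join holds one row per Field1 value that occurs more than once, each carrying a constant "Y", so unmatched rows come back with a null that a later step turns into "N". The names DupFlag, Value, left_outer_join_flag and null_to_n are hypothetical.

```python
def left_outer_join_flag(rows, duplicate_keys, key="Field1"):
    """Left outer join: matched rows pick up 'Y', unmatched rows get None."""
    # Assumed right-hand link: a constant "Y" generated for every duplicated key.
    right = {k: "Y" for k in duplicate_keys}
    return [dict(row, DupFlag=right.get(row[key])) for row in rows]


def null_to_n(rows, column="DupFlag"):
    """Replace the nulls left by the outer join with 'N'."""
    return [
        dict(row, **{column: "N" if row[column] is None else row[column]})
        for row in rows
    ]


# Hypothetical sample data; only Field1 comes from the thread.
rows = [
    {"Field1": "A", "Value": 1},
    {"Field1": "B", "Value": 2},
    {"Field1": "A", "Value": 3},
]
joined = left_outer_join_flag(rows, duplicate_keys={"A"})
result = null_to_n(joined)
for row in result:
    print(row)
```

Keeping the null replacement as a separate step mirrors the split between the join and the null-handling stage described in the reply.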
Thanks Ray, but I was not able to view your full post as it's Premium Content.
However, I will try to change the design to the extent I was able to see in your post.
Regards,
Kumarjit.
kumarjit wrote: I intend not to load the job, when the same can be achieved by other lightweight stages like column generator and/or modify stages.
You are relying on out-of-date knowledge. These days (since about version 8.7) the Transformer stage is no less efficient than most other stages; sometimes it's more efficient (for example, than the Filter stage).
ray.wurlod wrote:You are relying on out-of-date knowledge. ...
I'm afraid I have to admit that it's true in some sense. But if there are no more than 1K rows in the input, should I be trying something as time-consuming as a Transformer?
Please advise.
Regards.
What makes you think the Transformer is a time-consuming stage? The weight of the Transformer has decreased over time, and it's not an expensive stage anymore. Now it's even lighter than the Filter and Switch stages. If you can combine the work of two or more stages in a Transformer, it may give you a better result as well.
I think you were not able to see the complete reply from Ray.
Priyadarshi Kunal
Genius may have its limitations, but stupidity is not thus handicapped.