All,
Fact: The functionality of the Remove Duplicates stage can be implemented in a Transformer stage.
Does this guarantee any improvement in execution time or in the handling of bulk record volumes, or does it depend on specific options?
Please advise, so that I can get a thorough understanding of this.
Performance: Remove Duplicates or Transformer?
Moderators: chulett, rschirm, roy
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
Fact: Remove duplicates can be performed in any stage that has an input link.
The answer to your question may well be dependent on the data type of the keys and the size of the records. I would suggest experimentation to determine whether there is any difference at all. There is a startup (calling) overhead for a Transformer stage but, for a sufficiently large volume of data, this may be considered to be negligible.
All remove-duplicates methods require sorted data, so that cost can be factored out of the equation.
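To illustrate why sorted input matters, here is a minimal sketch of the key-change logic that a sorted-input de-duplication relies on. It is plain Python, not DataStage code, and the names (remove_duplicates_sorted, cust_id, the sample rows) are made up for illustration. It assumes the rows arrive already sorted on the key and keeps the first row of each key group.

```python
from typing import Any, Callable, Iterable, Iterator, TypeVar

Row = TypeVar("Row")


def remove_duplicates_sorted(
    rows: Iterable[Row], key: Callable[[Row], Any]
) -> Iterator[Row]:
    """Yield the first row of each run of equal keys.

    Assumption: rows are already sorted (or at least grouped) by key.
    Only the previous key is remembered, so memory use stays constant.
    """
    no_previous = object()  # marker meaning "no row seen yet"
    prev_key: Any = no_previous
    for row in rows:
        curr_key = key(row)
        # A row is kept only when its key differs from the previous row's key.
        if prev_key is no_previous or curr_key != prev_key:
            yield row
        prev_key = curr_key


# Hypothetical sample data, already sorted on cust_id.
rows = [
    {"cust_id": 1, "name": "a"},
    {"cust_id": 1, "name": "b"},
    {"cust_id": 2, "name": "c"},
    {"cust_id": 3, "name": "d"},
    {"cust_id": 3, "name": "e"},
]
deduped = list(remove_duplicates_sorted(rows, key=lambda r: r["cust_id"]))

# Unsorted input is NOT fully de-duplicated, which is why the sort is needed.
unsorted_result = list(remove_duplicates_sorted([1, 2, 1], key=lambda x: x))
```

Because each row is compared only with the one before it, the per-row work is a single key comparison whichever stage performs it. That fits the suggestion above that key data type and record size are the things to vary when experimenting.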
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.