Yes, while a link collector stage performs no SQL "join" functions, it does perform, in effect, a "UNION" of n-input links into one output link regardless of values as long as the columns are identical.
Do we have to use the Link Collector and Link Partitioner as a pair, or does either one work on its own? That is, to use a Link Collector, do we need to use a Link Partitioner as well?
ArndW wrote: The two are independent of each other and do not need to be used as a pair. ...
Actually, I have a job whose source is a DRS stage with a source query that is a union between two tables. This stage writes to a table and then to a hash file. The source query takes a long time to execute, and hence so does the job. Are there any ways (other than optimising the source query and enabling row buffering) to improve the job's performance, say by using IPC stages, a Link Partitioner, or a Link Collector? How are these stages used?
Those are a lot of questions for one post. Since a server job link collector cannot be set to take input from one link and then the other, you cannot optimize that way. Do you have PX available?
The Link Collector stage can be used independently if you have enabled inter-process row buffering in the job properties.
From what I recall, the Link Collector provides SQL 'UNION ALL' functionality, not SQL 'UNION' functionality. So you might have to design your job to remove duplicates, for example by using an intermediate hash file.
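To make the UNION ALL versus UNION difference concrete, here is a rough Python sketch. It is not DataStage code; the function names, the sample rows, and the "id" key column are all made up for illustration. It assumes the collector simply passes every input row through, and that a keyed hash file keeps one row per key, with a later write replacing an earlier one. A real collector interleaves rows (round-robin or sort/merge), so the output order would differ.

```python
def collect_union_all(*links):
    """Pass every row from every input link through, duplicates included
    (the UNION ALL-like behaviour described above)."""
    out = []
    for link in links:
        out.extend(link)
    return out


def dedupe_by_key(rows, key_columns):
    """Keep one row per key, mimicking a write to a keyed hash file:
    a later row with the same key replaces the earlier one."""
    by_key = {}
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        by_key[key] = row
    return list(by_key.values())


# Hypothetical sample data: two input links with identical columns.
link_a = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
link_b = [{"id": 2, "name": "b"}, {"id": 3, "name": "c"}]

collected = collect_union_all(link_a, link_b)  # 4 rows, id 2 appears twice
deduped = dedupe_by_key(collected, ["id"])     # 3 rows, one per id
```

So if the original query relied on UNION to drop duplicates, the collector alone will not reproduce that; the keyed write afterwards is what removes them.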