Partitioning to be used in transformer stage
Moderators: chulett, rschirm, roy
Hi,
I need to use a Transformer stage after a Sequential File stage, which is the source. I have no idea which partitioning method to use in the Transformer stage that follows the Sequential File stage. Can anyone help me with this?
Thanks in advance....
Regards,
Grace J.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
By default the Transformer stage will execute in all nodes of the default node pool specified in the current configuration file (named in $APT_CONFIG_FILE). You can choose to override this if you wish, but it's not necessary in the job design you specified unless you are doing some task that requires execution on only one node.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 597
- Joined: Fri Apr 29, 2005 6:19 am
- Location: Singapore
Just applying a trimming function and filtering
-
- Participant
- Posts: 597
- Joined: Fri Apr 29, 2005 6:19 am
- Location: Singapore
Re: Just applying a trimming function and filtering
Then you don't have to worry about the partitioning type in the Transformer stage. The Auto partitioning method in the Transformer stage will decide the best partitioning method for you.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
False.
There will be a partitioner between the Sequential File stage and any downstream stage that executes in parallel.
Further, for sufficiently large sequential files, you can assign multiple readers. If you assign N readers, each processes 1/N of the lines in the file.
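To illustrate the "each of N readers processes 1/N of the lines" point, here is a minimal Python sketch of dividing a record count among readers. It only models the arithmetic; the function name `reader_ranges` is hypothetical, and the way the engine actually divides a file between readers may differ.

```python
from typing import List, Tuple


def reader_ranges(total_records: int, readers: int) -> List[Tuple[int, int]]:
    """Split record indices into `readers` contiguous half-open ranges.

    Hypothetical helper for illustration only; not the engine's own logic.
    """
    if readers < 1:
        raise ValueError("readers must be at least 1")
    base, extra = divmod(total_records, readers)
    ranges = []
    start = 0
    for i in range(readers):
        # The first `extra` readers take one additional record each.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    # 10 records shared among 3 readers.
    print(reader_ranges(10, 3))  # [(0, 4), (4, 7), (7, 10)]
```

Each reader gets roughly the same share, which is why adding readers only pays off for sufficiently large files.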
If we increase the number of readers per node to, say, 5 and we have some 10 million records in a single large file, each reader will take about 2 million records and the data will be processed in parallel.
If the number of readers is set to one, will the whole downstream not run sequentially? It won't induce any partitioning.
Can you please explain how this works without changing the number of readers per node?
Thanks
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
With one reader per node, the Sequential File stage will execute sequentially. But you will note that there is a "fan out" icon on the link between this and the next stage, indicating that the next stage will execute in parallel (which you can verify by inspecting its Advanced tab).
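The "fan out" can be pictured with a small Python sketch: one sequential reader hands records to a partitioner, which deals them out to several parallel instances of the next stage. This assumes round-robin distribution, which is what Auto typically selects on a sequential-to-parallel link; the function names are hypothetical and this is not the engine's implementation.

```python
from typing import List


def round_robin_partition(records: List[str], partitions: int) -> List[List[str]]:
    """Deal records out to `partitions` buckets in turn (hypothetical helper)."""
    if partitions < 1:
        raise ValueError("partitions must be at least 1")
    buckets: List[List[str]] = [[] for _ in range(partitions)]
    for i, record in enumerate(records):
        buckets[i % partitions].append(record)
    return buckets


def trim_and_filter(bucket: List[str]) -> List[str]:
    """Stand-in for the Transformer work: trim each record, drop empty ones."""
    trimmed = (record.strip() for record in bucket)
    return [record for record in trimmed if record]


if __name__ == "__main__":
    # One sequential read produces these lines; two nodes process them.
    lines = [" a ", "b ", "   ", "d", " e "]
    for node, bucket in enumerate(round_robin_partition(lines, 2)):
        print(node, trim_and_filter(bucket))
```

Because trimming and filtering work on one row at a time, the result does not depend on which partition a row lands in, which is why the partitioning method matters little for this job.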