Partitioning to be used in Transformer stage

Grace J.
Participant
Posts: 22
Joined: Mon Nov 03, 2008 5:34 am

Post by Grace J. »

Hi,

I need to use a Transformer stage after a Sequential File stage, which is the source. I don't know which partitioning method to use in the Transformer stage that follows the Sequential File stage. Can anyone help me with this?
Thanks in advance.

Regards,
Grace J.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

By default the Transformer stage will execute in all nodes of the default node pool specified in the current configuration file (named in $APT_CONFIG_FILE). You can choose to override this if you wish, but it's not necessary in the job design you specified unless you are doing some task that requires execution on only one node.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kandyshandy
Participant
Posts: 597
Joined: Fri Apr 29, 2005 6:19 am
Location: Singapore

Post by kandyshandy »

What are you trying to do in the transformer?
Grace J.
Participant
Posts: 22
Joined: Mon Nov 03, 2008 5:34 am

Post by Grace J. »

Just applying a trim function and filtering.
kandyshandy
Participant
Posts: 597
Joined: Fri Apr 29, 2005 6:19 am
Location: Singapore

Re: Just applying a trim function and filtering

Post by kandyshandy »

Then you don't have to worry about the partitioning type in the Transformer stage. Leave the partitioning method set to Auto and DataStage will choose a suitable partitioning method for you.
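
To illustrate why the partitioning type does not matter for trimming and filtering, here is a minimal Python sketch (not DataStage code). Both operations look at one row at a time, so however the rows are distributed across partitions, the same set of rows comes out. The function names, the sample rows, and the two partitioners are hypothetical stand-ins chosen for the illustration.

```python
from collections import Counter

def trim_and_filter(rows):
    """Row-wise logic standing in for the Transformer: trim, then drop empty rows."""
    trimmed = (r.strip() for r in rows)
    return [r for r in trimmed if r]

def block_partition(rows, n):
    """Split rows into n contiguous blocks (hypothetical partitioner)."""
    size = -(-len(rows) // n)  # ceiling division
    return [rows[i * size:(i + 1) * size] for i in range(n)]

def hash_partition(rows, n):
    """Send each row to a partition derived from its content (hypothetical partitioner)."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[sum(row.encode()) % n].append(row)
    return parts

def run_partitioned(rows, partitioner, n):
    """Apply the row-wise logic to each partition and collect the results."""
    out = []
    for part in partitioner(rows, n):
        out.extend(trim_and_filter(part))
    return out

rows = ["  alpha ", "beta", "   ", " gamma", "", "delta  "]
sequential = trim_and_filter(rows)
by_block = run_partitioned(rows, block_partition, 3)
by_hash = run_partitioned(rows, hash_partition, 3)

# Row order may differ between partitionings, but the set of rows does not.
assert Counter(by_block) == Counter(sequential) == Counter(by_hash)
```

Partitioning starts to matter only when the logic depends on related rows being in the same partition (for example, key-based aggregation), which is not the case here.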
Nagaraj
Premium Member
Posts: 383
Joined: Thu Nov 08, 2007 12:32 am
Location: Bangalore

Post by Nagaraj »

Since you are reading from a Sequential File stage, which, as the name says, runs in sequential mode, I believe the next stage will also run sequentially.
There is not much you can do here apart from changing the properties of the Sequential File stage.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

False.

There will be a partitioner between the Sequential File stage and any downstream stage that executes in parallel.
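
The idea can be sketched in Python (an illustration only, not how the engine is implemented): a single sequential reader produces records, a partitioner deals them out, and the downstream logic runs once per partition in parallel. Round-robin distribution, the node count of 4, and all function names are assumptions made for the sketch.

```python
import io
from concurrent.futures import ThreadPoolExecutor

NODES = 4  # hypothetical number of processing nodes

def read_sequentially(handle):
    """Single reader: yields one record per line, in file order."""
    for line in handle:
        yield line.rstrip("\n")

def fan_out(records, nodes):
    """Partitioner: deal records to the nodes in turn (round robin, assumed)."""
    partitions = [[] for _ in range(nodes)]
    for i, rec in enumerate(records):
        partitions[i % nodes].append(rec)
    return partitions

def transform(partition):
    """Downstream stage: runs independently on each partition."""
    return [rec.strip().upper() for rec in partition]

source = io.StringIO("a \n b\nc\n d \ne\nf\n")
partitions = fan_out(read_sequentially(source), NODES)

# One worker per partition, so the downstream logic executes in parallel
# even though the source was read by a single sequential reader.
with ThreadPoolExecutor(max_workers=NODES) as pool:
    results = list(pool.map(transform, partitions))
```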

Further, for sufficiently large sequential files, you can assign multiple readers. If you assign N readers, each processes 1/N of the lines in the file.
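
As a rough sketch of what "N readers, each taking 1/N of the file" can look like, the Python below splits a file into N byte ranges and lets each reader return only the lines that begin inside its range. This is an assumed scheme for illustration, not a description of DataStage internals; `split_ranges` and `read_slice` are hypothetical names, and the shares come out roughly equal rather than exactly equal because ranges are aligned to line boundaries.

```python
import os
import tempfile

def split_ranges(size, n):
    """Divide [0, size) into n contiguous byte ranges."""
    step = size // n
    return [(i * step, size if i == n - 1 else (i + 1) * step) for i in range(n)]

def read_slice(path, start, end):
    """Return the lines whose first byte lies in [start, end)."""
    lines = []
    with open(path, "rb") as f:
        if start > 0:
            # Step back one byte and read to the next newline, so the reader
            # lands on the first line that begins at or after `start`.
            f.seek(start - 1)
            f.readline()
        while f.tell() < end:
            line = f.readline()
            if not line:
                break
            lines.append(line.rstrip(b"\n").decode())
    return lines

# Demo: 10 records shared among 3 readers.
expected = [f"row{i}" for i in range(10)]
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write("".join(r + "\n" for r in expected).encode())
    path = tmp.name

try:
    ranges = split_ranges(os.path.getsize(path), 3)
    slices = [read_slice(path, start, end) for start, end in ranges]
finally:
    os.remove(path)

# Every line is read by exactly one reader, and none is lost.
combined = [line for part in slices for line in part]
assert combined == expected
```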
Nagaraj
Premium Member
Posts: 383
Joined: Thu Nov 08, 2007 12:32 am
Location: Bangalore

Post by Nagaraj »

If we increase the number of readers per node to, say, 5 and we have 10 million records in a single large file, each reader will take about 2 million records and the data will be processed in parallel.

But if the number of readers is set to one, won't everything downstream run sequentially? It won't introduce any partitioning.

Can you please explain how this works without changing the number of readers per node?

Thanks
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

With one reader per node, the Sequential File stage will execute sequentially. But you will note that there is a "fan out" icon on the link between this and the next stage, indicating that the next stage will execute in parallel (which you can verify by inspecting its Advanced tab).