same partition after a sequential file stage??

srinivas.nettalam
Participant
Posts: 134
Joined: Tue Jun 15, 2010 2:10 am
Location: Bangalore

Post by srinivas.nettalam »

This is not an interview question and is just for my understanding.

A Sequential File stage is executed in parallel (using a file pattern), and there is a data link to a Copy stage and then to a Data Set. The Copy stage uses Same partitioning by default. By definition, Same partitioning keeps the preceding stage's partitioning, but which partitioning does the Sequential File stage apply in this scenario?
N.Srinivas
India.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Partitioning only ever occurs on an input link. How the data are partitioned within the Sequential File stage will depend on a number of factors, but will typically be one file per partition if you have the same number of files as partitions. There are other properties such as "treat as File Set" that can affect the way that this works.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
srinivas.nettalam
Participant
Posts: 134
Joined: Tue Jun 15, 2010 2:10 am
Location: Bangalore

Post by srinivas.nettalam »

Thanks, Ray, for your reply. Partitioning occurs only on an input link, but in this scenario which partitioning would Same invoke?
Since I am not a premium member, what I could understand from your reply is that there is one file per partition at the Sequential File stage and that the same is applied to the Copy stage. But what about the rows in each file? Would they be distributed round robin?
N.Srinivas
India.
zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

When you have "Same" partitioning set on the Copy stage's input link, it means that the input link should not attempt any repartitioning. Whatever records are in whatever partitions remain in their respective partitions, unless you have node map or node pool constraints set on the Copy stage.
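
To illustrate the idea, here is a toy Python model of "Same" partitioning. It is only a sketch and not DataStage code: the name `same_partition` and the dictionary of partitions are made up for the example, and node map or node pool constraints are not modelled.

```python
from typing import Dict, List

# Hypothetical model: partition number -> records held in that partition.
Partitions = Dict[int, List[str]]


def same_partition(upstream: Partitions) -> Partitions:
    """Model "Same": each record stays in the partition it arrived in."""
    # Copy the lists so the downstream stage owns its data, but keep
    # the record-to-partition assignment exactly as it was upstream.
    return {node: list(rows) for node, rows in upstream.items()}


# Assumed upstream layout: three partitions of uneven size.
upstream = {0: ["a1", "a2"], 1: ["b1"], 2: ["c1", "c2", "c3"]}
downstream = same_partition(upstream)
```

Note that the uneven sizes carry through unchanged: in this model "Same" neither rebalances nor reorders anything.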

Regarding the partitioning at the Sequential File stage, as Ray said, it should be one file per partition; otherwise it is round robin, which is the default in most cases.
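
The two cases mentioned here can be sketched in Python as a toy model. This is an illustration under assumptions, not how the DataStage engine is implemented: `one_file_per_partition` and `round_robin` are hypothetical helper names, and the file names and rows are invented.

```python
from typing import Dict, List


def one_file_per_partition(files: Dict[str, List[str]]) -> Dict[int, List[str]]:
    """Model the case where each matched file feeds exactly one partition."""
    return {p: list(rows) for p, rows in enumerate(files.values())}


def round_robin(rows: List[str], num_partitions: int) -> Dict[int, List[str]]:
    """Model round robin: deal rows out to partitions in turn."""
    partitions: Dict[int, List[str]] = {p: [] for p in range(num_partitions)}
    for i, row in enumerate(rows):
        partitions[i % num_partitions].append(row)
    return partitions


# Assumed input: two files matched by a file pattern, two partitions.
files = {"in_1.txt": ["r1", "r2", "r3"], "in_2.txt": ["r4", "r5"]}
by_file = one_file_per_partition(files)

# Assumed input: the same five rows read as one stream, then dealt out.
dealt = round_robin(["r1", "r2", "r3", "r4", "r5"], 2)
```

In the first case the rows of a file stay together in one partition; in the second they are interleaved across partitions regardless of which file they came from.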