One stage is sequential in parallel job

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

vishu19aug
Participant
Posts: 39
Joined: Mon Feb 13, 2012 1:30 pm

Post by vishu19aug »

Hi,

In my parallel job, one stage (a Transformer) runs sequentially. Does that have any negative impact on performance? Will it be a bottleneck for the complete parallel flow?

Thanks,
Vishal
mandyli
Premium Member
Posts: 898
Joined: Wed May 26, 2004 10:45 pm
Location: Chicago

Post by mandyli »

Hi

Have you developed and tested the job with any data?

Without knowing anything about the job, how can we say anything about its performance?

Is this a simple job with a low number of rows, or a huge volume?

Thanks
Man
vishu19aug
Participant
Posts: 39
Joined: Mon Feb 13, 2012 1:30 pm

Post by vishu19aug »

Yes, I did, but with a very small volume of data. However, the job will process around 1 million records.
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

Is the transformer sequential for a certain reason? Does it HAVE to be sequential in order to meet business logic requirements?

Any sequentially operating stage in a parallel job has the potential to be a bottleneck, depending upon the job design and the data being processed, but that doesn't mean it will be.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
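
To put rough numbers on the "potential bottleneck" point, here is a minimal sketch using Amdahl's law. It is only an approximation: DataStage pipelines its stages, so real jobs will not follow it exactly, and the sequential shares and the partition count below are made-up values, not measurements from this job.

```python
def amdahl_speedup(sequential_fraction, partitions):
    """Upper bound on speedup when only part of the work can run in parallel."""
    parallel_fraction = 1.0 - sequential_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / partitions)


# Hypothetical shares of total work spent in the sequential stage.
for share in (0.05, 0.25, 0.50):
    speedup = amdahl_speedup(share, partitions=4)
    print(f"sequential share {share:.0%}: at most {speedup:.2f}x on 4 partitions")
```

If the sequential Transformer does only a small share of the work, the ceiling stays close to the partition count. If it dominates, adding partitions helps very little.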
vishu19aug
Participant
Posts: 39
Joined: Mon Feb 13, 2012 1:30 pm

Post by vishu19aug »

I have the following logic to compare the names of two consecutive records. The file is sorted on name and row number, because there can be two groups with the same name that are separated by a group with a different name. The logic below does not work correctly if the stage runs in parallel.

StageVar1 = Input.name
StageVar2 = (if StageVar1 = StageVar3 then StageVar4 else StageVar4 + 1)
StageVar3 = StageVar1
StageVar4 = StageVar2
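
For readers not used to stage variables, the derivations above amount to a running group counter that increments whenever the name differs from the previous row. A minimal Python sketch of the same idea follows. It assumes the counter starts at 0 and that no previous name exists before the first row, since the post does not give the initial values, and the function name and sample rows are made up for illustration.

```python
def assign_group_ids(names):
    """Number consecutive runs of equal names: 1, 2, 3, ..."""
    prev_name = None   # plays the role of StageVar3 (previous row's name)
    group_id = 0       # plays the role of StageVar4 (running counter)
    result = []
    for name in names:              # StageVar1 = Input.name
        if name != prev_name:       # StageVar2: bump the counter on a name change
            group_id += 1
        prev_name = name            # remember this row for the next comparison
        result.append(group_id)
    return result


rows = ["A", "A", "B", "A", "A", "C"]
print(assign_group_ids(rows))  # [1, 1, 2, 3, 3, 4]
```

Each row depends on the row before it, which is why the result changes when rows of one group end up in different partitions.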
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Get the partitioning right and it WILL work properly in parallel. A sequential stage in a parallel job will always be a bottleneck.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
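
As a rough illustration of the partitioning point, the sketch below simulates two partitions in plain Python. It is not DataStage code: the helper names, the `zlib.crc32` hash, and the sample data are all assumptions made for the example. It compares round-robin distribution with hash partitioning on name, using a (partition, local counter) pair as the group key so that keys stay unique across partitions.

```python
import zlib


def run_counter(names):
    """Per-partition counter that increments when the name changes."""
    prev, gid, out = None, 0, []
    for name in names:
        if name != prev:
            gid += 1
        prev = name
        out.append(gid)
    return out


def round_robin(names, n_parts):
    """Deal rows out in turn, ignoring the key."""
    parts = [[] for _ in range(n_parts)]
    for i, name in enumerate(names):
        parts[i % n_parts].append(name)
    return parts


def hash_partition(names, n_parts):
    """Send every row with the same name to the same partition."""
    parts = [[] for _ in range(n_parts)]
    for name in names:
        parts[zlib.crc32(name.encode()) % n_parts].append(name)
    return parts


def group_keys(parts):
    """Pair each row's name with a (partition number, local counter) key."""
    keys = []
    for p, names in enumerate(parts):
        keys.extend((name, (p, gid)) for name, gid in zip(names, run_counter(names)))
    return keys


names = ["A", "A", "A", "B", "B", "C", "D", "D"]  # already sorted by name
rr = group_keys(round_robin(names, 2))
hashed = group_keys(hash_partition(names, 2))

print("round robin, keys for A:", sorted({k for n, k in rr if n == "A"}))
print("hash on name, distinct keys:", len({k for _, k in hashed}))
```

With round robin, the rows for "A" are split across both partitions and get two different keys. With hash partitioning on name, each name lands in one partition and gets exactly one key. One caveat: if two separated runs of the same name must stay distinct groups, hashing on name alone may merge them, so the partitioning key has to match the business rule. The resulting keys also differ from the 1, 2, 3, ... sequence a sequential run would produce.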