Folks,
I just had an interesting conversation with one of the other developers here on the topic of Basic Routines in a parallel job.
I am aware that if I were to include a Basic Transform in a parallel job, the job would only ever be able to run in an SMP environment, not in an MPP environment.
Is the same true if I were to use a Basic Routine as a Before/After Routine in a parallel job, or are routines run as Before/After treated differently?
Rob W.
Basic Routines in Parallel
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 209
- Joined: Fri Jan 09, 2004 1:14 pm
- Location: Toronto, Canada
Rob Wierdsma
Toronto, Canada
bartonbishop.com
-
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
Server routines can be called from the BASIC Transformer in a parallel job. These jobs will run on SMP or MPP, but the BASIC Transformer becomes a bottleneck because it runs as a single process instead of multiple processes. You can get around this by designing your BASIC Transformer jobs as multiple-instance jobs and partitioning the data manually, which gives you one copy of the BASIC Transformer in each instance. This is a lot of development effort, and it may be easier to convert your BASIC routines to C++ routines that can be called from the parallel Transformer.
The before/after routine is executed just once for each job, both server and parallel.
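To give a feel for the C++ route, here is a minimal sketch of the kind of per-row function that could be compiled and registered as a parallel routine. The function name and its logic are hypothetical, chosen only for illustration; how the object file is built and linked depends on your platform and compiler settings.

```cpp
#include <cctype>

// Hypothetical per-row routine: counts the digit characters in a string.
// It keeps no state between calls, so every partition can call it
// independently, which is what lets it scale where a single BASIC
// Transformer process cannot.
int countDigits(const char* value)
{
    if (value == nullptr) {
        return 0;  // treat a missing value as containing no digits
    }
    int digits = 0;
    for (const char* p = value; *p != '\0'; ++p) {
        // Cast to unsigned char to keep std::isdigit well defined.
        if (std::isdigit(static_cast<unsigned char>(*p))) {
            ++digits;
        }
    }
    return digits;
}
```

A stateless function with simple argument and return types like this is the easiest kind of BASIC routine to port; routines that read hashed files or call other BASIC code need more redesign.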
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn: Vincent McBurney
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
Any BASIC can only execute where the BASIC run machine is. Before-job and after-job subroutines execute in the conductor process (and therefore on the conductor node) which, by default at least, is where the DataStage server is, so you get away with it. If you start fiddling around with the configuration (for example APT_PM_CONDUCTOR_HOSTNAME) then you run the risk of disallowing execution of before/after subroutines.
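As a rough way to spot that situation, the sketch below checks whether APT_PM_CONDUCTOR_HOSTNAME is set to something other than the local host name. The helper names are hypothetical, and it assumes a POSIX system and that the variable holds a host name in the same form that gethostname returns (short versus fully qualified names would need extra handling).

```cpp
#include <cstdlib>
#include <string>
#include <unistd.h>  // gethostname (POSIX)

// Returns true when an override value is present and names a host
// other than localHost. An unset or empty value means "no override".
bool conductorMovedAway(const char* overrideValue, const std::string& localHost)
{
    if (overrideValue == nullptr || *overrideValue == '\0') {
        return false;
    }
    return localHost != overrideValue;
}

// Returns the local host name, or an empty string if it cannot be read.
std::string localHostName()
{
    char buf[256] = {0};
    if (gethostname(buf, sizeof(buf) - 1) != 0) {
        return std::string();
    }
    return std::string(buf);
}

// Convenience wrapper that reads the environment of the current process.
bool conductorOverrideInEffect()
{
    return conductorMovedAway(std::getenv("APT_PM_CONDUCTOR_HOSTNAME"),
                              localHostName());
}
```

If conductorOverrideInEffect() returns true on the DataStage server machine, that is a hint that before/after subroutines may no longer be running where the BASIC run machine is.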
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.