Server Vs PX

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

ippie02
Participant
Posts: 14
Joined: Tue Aug 16, 2005 2:13 pm

Server Vs PX

Post by ippie02 »

Hi, I worked with DataStage 7.1 on both server and PX jobs, but always got vague answers from coworkers as to what the differences were, other than that stages in server had a correspondence in PX.

Now that I'm looking for a job, companies seem to discard my resume if I don't mention PX. Those who don't, demand that I modify my resume to show PX experience.

As the great Stevie Wonder says: so what the funk?

What are the differences?
Why does it seem to be so important?
Is it REALLY different from server, or is it the same, honestly?

thank you
Consulting
If you're not part of the solution, there's good money to be made in prolonging the problem.
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Re: Server Vs PX

Post by roy »

Hi and Welcome Aboard :),
IMHO, for someone who has worked with both Server and EE/PX, this sounds like a very odd question!
I think this was answered before, so I recommend you use the search option in the top menu above.

No offence meant!
Roy R.
Time is money but when you don't have money time is all you can afford.

Search before posting:)

Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
ippie02
Participant
Posts: 14
Joined: Tue Aug 16, 2005 2:13 pm

Reputation

Post by ippie02 »

Just to keep my reputation...

The reason I don't know the difference is that I did support: it broke, I fixed it. I never did development with it. I'll continue searching through the posts, then.

thanks
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

All three variants are hugely different.

Server generates DataStage BASIC, parallel generates Orchestrate shell script (osh) and C++, mainframe generates COBOL and JCL.

In server and mainframe you tend to do most of the work in Transformer stage. In parallel you tend to use specific stage types for specific tasks (and the Transformer stage doesn't do lookups). There are many more stage types for parallel than server or mainframe, and parallel stages correspond to Orchestrate operators.

Finally, of course, there's the automatic partitioning and collection of data in the parallel environment, which would have to be managed manually (if at all) in the server environment.
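To make "partitioning and collection" concrete, here is a minimal Python sketch of the idea, not DataStage code. It assumes a key-based (hash) partitioner and a simple collector that reads one partition after another; the function names, the `cust_id` column and the sample rows are all hypothetical.

```python
import zlib


def hash_partition(rows, key, num_partitions):
    """Route each row to a partition chosen by a stable hash of its key value."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # crc32 is used only because it gives the same answer on every run.
        index = zlib.crc32(str(row[key]).encode("utf-8")) % num_partitions
        partitions[index].append(row)
    return partitions


def collect(partitions):
    """Gather all partitions back into one stream, one partition after another."""
    return [row for part in partitions for row in part]


# Hypothetical sample data: 20 rows spread over 5 customer ids.
rows = [{"cust_id": i % 5, "amount": i} for i in range(20)]
parts = hash_partition(rows, "cust_id", 3)
merged = collect(parts)
```

Rows with the same key always land in the same partition, which is what lets each partition be processed independently before collection. In a server job you would have to build this routing yourself; in a parallel job the engine inserts it for you.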

That's a few off the top of my head.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

From a career point of view, the parallel architecture is the future of the product suite: all new functionality is going into parallel jobs, while server jobs are on cruise control. Most new customers will probably want to use Enterprise Edition. There are still a huge number of customers with Server Edition, including a lot of PeopleSoft EPM customers. It will be interesting to see the packaging and pricing of the Hawk release, to see whether server and parallel are offered at the same price and whether SOA becomes free.
ippie02
Participant
Posts: 14
Joined: Tue Aug 16, 2005 2:13 pm

Post by ippie02 »

Thank you, this is very helpful.
nishant_prakash
Participant
Posts: 27
Joined: Wed Aug 17, 2005 5:18 am

Post by nishant_prakash »

The basic difference between server and parallel jobs is the degree of parallelism that PX offers.

Server job stages do not have a built-in partitioning and parallelism mechanism for extracting and loading data between stages. What you can do to improve speed and performance in server jobs is enable inter-process row buffering through the Administrator. This lets stages exchange data as soon as it is available on the link. You could also use the IPC stage, which lets one passive stage read data from another as soon as data is available. In other words, stages do not have to wait for the entire set of records to be read before it is passed to the next stage. Link Partitioner and Link Collector stages can be used to achieve a certain degree of partitioning parallelism.

All of the above features, which have to be set up by hand in server jobs, are built into DataStage PX. The PX engine runs on a multiprocessor system and takes full advantage of the processing nodes defined in the configuration file. Both SMP and MPP architectures are supported by DataStage PX.
PX takes advantage of both pipeline parallelism and partitioning parallelism. Pipeline parallelism means that as soon as data is available between stages (in pipes or links), it can be exchanged between them without waiting for the entire record set to be read. Partitioning parallelism means that the entire record set is partitioned into smaller sets that are processed on different nodes (logical processors). For example, if there are 100 records and 4 logical nodes, then each node would process 25 records. This enhances the speed at which loading takes place to an amazing degree. Imagine situations where billions of records have to be loaded daily. This is where DataStage PX comes as a boon for ETL processing and surpasses all other ETL tools in the market.
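The 100-records-on-4-nodes arithmetic can be sketched in a few lines of Python. This is an illustration only, assuming a round-robin partitioner; `round_robin_partition` is a hypothetical helper, not a DataStage API.

```python
def round_robin_partition(records, num_nodes):
    """Deal records out to the nodes in turn, like dealing cards."""
    partitions = [[] for _ in range(num_nodes)]
    for position, record in enumerate(records):
        partitions[position % num_nodes].append(record)
    return partitions


records = list(range(1, 101))            # 100 records
partitions = round_robin_partition(records, 4)
sizes = [len(p) for p in partitions]
print(sizes)                             # [25, 25, 25, 25]
```

The even 25/25/25/25 split is a property of round robin. With a key-based partitioning method the counts depend on how the key values are distributed, so some nodes may get more rows than others.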

This is all that I could think of at this point in time. More can be found in the PX PDFs that come with Enterprise Edition.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Welcome aboard! :D

You can do both pipeline parallelism and partition parallelism in server jobs; it's just that it's more automatic in parallel jobs.

In server jobs pipeline parallelism is mainly effected through row buffering (and/or IPC stages), and partition parallelism through either multi-instance running or link partitioners/collectors.
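As a rough analogy for pipeline parallelism through row buffering, the Python sketch below runs three stages as threads joined by small bounded queues. It does not show how DataStage implements buffering; the stage functions, the buffer size and the doubling "transform" are all assumptions made for illustration.

```python
import queue
import threading

END_OF_DATA = object()  # marker a stage puts on its link when it has no more rows


def extract(out_link, row_count):
    """Source stage: emit rows one at a time."""
    for row in range(row_count):
        out_link.put(row)
    out_link.put(END_OF_DATA)


def transform(in_link, out_link):
    """Middle stage: process each row as soon as it arrives."""
    while True:
        row = in_link.get()
        if row is END_OF_DATA:
            out_link.put(END_OF_DATA)
            break
        out_link.put(row * 2)


def load(in_link, results):
    """Target stage: consume rows while upstream stages are still running."""
    while True:
        row = in_link.get()
        if row is END_OF_DATA:
            break
        results.append(row)


def run_pipeline(row_count, buffer_size=4):
    # Each bounded queue plays the role of a buffered link between two stages.
    link_a = queue.Queue(maxsize=buffer_size)
    link_b = queue.Queue(maxsize=buffer_size)
    results = []
    stages = [
        threading.Thread(target=extract, args=(link_a, row_count)),
        threading.Thread(target=transform, args=(link_a, link_b)),
        threading.Thread(target=load, args=(link_b, results)),
    ]
    for stage in stages:
        stage.start()
    for stage in stages:
        stage.join()
    return results


results = run_pipeline(10)
```

Because each link holds at most `buffer_size` rows, no stage ever waits for the whole record set: the downstream stage starts working as soon as the first row is available, and a fast producer blocks when the buffer is full.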