Hi, I worked with DataStage 7.1 with both server and PX jobs, but I always got vague answers from coworkers as to what the differences were, other than that stages in server had a correspondence in PX.
Now that I'm looking for a job, companies seem to discard my resume if I don't mention PX. Those who don't discard it demand that I modify my resume to show PX experience.
As the great Stevie Wonder says, "So what the funk?"
What are the differences?
Why does it seem to be so important?
Is it REALLY different from server, or is it the same, honestly?
Thank you.
Server Vs PX
Consulting
If you're not part of the solution, there's good money to be made in prolonging the problem.
Re: Server Vs PX
Hi, and welcome aboard!
IMHO, for someone who worked with both Server and EE/PX this sounds like a very odd question!
I think this was answered before, so I recommend you use the search option in the top menu above.
No offence meant!
Roy R.
Time is money but when you don't have money time is all you can afford.
Search before posting:)
Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
Reputation
Just to keep my reputation...
The reason I don't know the difference is that I did support: it broke, I fixed it. I never did development with it. I'll continue searching through the posts, then.
Thanks
All three variants are hugely different.
Server generates DataStage BASIC, parallel generates Orchestrate shell script (osh) and C++, mainframe generates COBOL and JCL.
In server and mainframe you tend to do most of the work in the Transformer stage. In parallel you tend to use specific stage types for specific tasks (and the Transformer stage doesn't do lookups). There are many more stage types for parallel than for server or mainframe, and parallel stages correspond to Orchestrate operators.
Finally, of course, there's the automatic partitioning and collection of data in the parallel environment, which would have to be managed manually (if at all) in the server environment.
That's a few off the top of my head.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
From a career point of view, the parallel architecture is the future of the product suite: all new functionality is going into parallel jobs, while server jobs are in cruise control. Most new customers will probably want to use Enterprise Edition. There are still a huge number of customers on Server Edition, including a lot of PeopleSoft EPM customers. It will be interesting to see the packaging and pricing of the Hawk release, and whether server and parallel are offered at the same price and SOA becomes free.
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn
The basic difference between server and parallel jobs is the degree of parallelism that PX offers.
Server job stages do not have built-in partitioning and parallelism mechanisms for extracting and loading data between different stages. All you can do to enhance speed and performance in server jobs is to enable inter-process row buffering through the Administrator. This lets stages exchange data as soon as it is available on the link. You can also use the IPC stage, which lets one passive stage read data from another as soon as data is available. In other words, stages do not have to wait for the entire set of records to be read first and then transferred to the next stage. The Link Partitioner and Link Collector stages can be used to achieve a certain degree of partitioning parallelism.
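To make the row-buffering idea concrete, here is a rough sketch in plain Python (not DataStage itself) of what inter-process row buffering or the IPC stage buys you: the downstream "stage" starts consuming rows as soon as they appear on the link, instead of waiting for the upstream stage to finish reading the whole record set. The stage names and the transform are made up for illustration.

```python
# Sketch of pipeline parallelism via a bounded buffer between two "stages".
import queue
import threading

link = queue.Queue(maxsize=10)   # bounded buffer, like a link's row buffer
SENTINEL = object()              # end-of-data marker
out = []

def extract_stage():
    for row in range(5):         # pretend these rows come from a source
        link.put(row)            # each row is on the link immediately
    link.put(SENTINEL)

def transform_stage():
    while True:
        row = link.get()
        if row is SENTINEL:
            break
        out.append(row * 10)     # process each row as soon as it arrives

t1 = threading.Thread(target=extract_stage)
t2 = threading.Thread(target=transform_stage)
t1.start(); t2.start()
t1.join(); t2.join()
print(out)  # [0, 10, 20, 30, 40]
```

The bounded queue is the key point: the consumer never waits for the full record set, and the producer blocks once the buffer fills, just as a link's row buffer throttles the upstream stage.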
All of the above features, which have to be set up explicitly in server jobs, are built into DataStage PX. The PX engine runs on a multiprocessor system and takes full advantage of the processing nodes defined in the configuration file. Both SMP and MPP architectures are supported by DataStage PX.
PX takes advantage of both pipeline parallelism and partitioning parallelism. Pipeline parallelism means that as soon as data is available between stages (in pipes or links), it can be exchanged between them without waiting for the entire record set to be read. Partitioning parallelism means that the entire record set is partitioned into small sets and processed on different nodes (logical processors). For example, with 100 records and 4 logical nodes, each node processes 25 records. This enhances the speed at which loading takes place to an amazing degree. Imagine situations where billions of records have to be loaded daily; this is where DataStage PX comes as a boon for the ETL process and outpaces many other ETL tools in the market.
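The 100-records-across-4-nodes arithmetic above can be sketched in plain Python (again, not the PX engine; the doubling transform and node count are made up for illustration). Round-robin is one of the partitioning methods PX offers; here record i simply goes to node i % 4:

```python
# Sketch of partitioning parallelism: partition, process per node, collect.
records = list(range(100))
NODES = 4

# Round-robin partitioning: record i goes to node i % NODES.
partitions = [records[n::NODES] for n in range(NODES)]

# Each "node" processes its own partition independently
# (sequentially here, in parallel on real processing nodes).
processed = [[r * 2 for r in part] for part in partitions]

# Collect the partitions back into a single stream.
collected = sorted(r for part in processed for r in part)

print([len(p) for p in partitions])           # [25, 25, 25, 25]
print(collected == [r * 2 for r in records])  # True
```

Each node sees only a quarter of the data, which is where the speed-up comes from; the collect step at the end corresponds to PX gathering partitions back together (and, as with real partitioned runs, row order is only restored because we explicitly sort on collection).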
This is all that I could think of at this point in time. More can be found in the PX PDFs that come with Enterprise Edition.
Welcome aboard! :D
You can do both pipeline parallelism and partition parallelism in server jobs; it's just that it's more automatic in parallel jobs.
In server jobs pipeline parallelism is mainly effected through row buffering (and/or IPC stages), and partition parallelism through either multi-instance running or link partitioners/collectors.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.