Getting my foot wet with PX .......

PilotBaha
Premium Member
Posts: 202
Joined: Mon Jan 12, 2004 8:05 pm

Post by PilotBaha »

OK boys and girls, I finally got my hands on PX and I am going through training for the Enterprise Edition. The initial course, provided by "them", doesn't offer much for a person who has been working on Server. I learned a lot in the past 3 days, but only slightly more than I would get from reading the manual.

What the course lacks is the internal workings of PX and more tips and tricks. One thing I'd really like to learn is how things get processed and how we can maximize performance. I suspect the answers I am looking for are not going to be in the Ascential product documentation. Even a book or a white paper about parallel processing would work.

Any ideas from the experienced folks? Ray? :)

thanks..
Earthbound misfit I..
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

There are a couple of good chapters on parallel processing and partitioning in the Parallel Job Developers Guide. According to Ray, a new training course on parallel edition for server developers is starting after Live Int 2005.

A lot of tips are spread through the forum. Some of the key discussions:
- Use lookups for low-volume reference checks. How much reference data you can load depends on the RAM available per node; we can execute lookups on a small number of columns with over 1,000,000 rows.
- Use a modify, filter or copy stage instead of a transformer unless you need a wide range of transformer functionality, e.g. derivations, constraints, counters and rejects.
- Stage the data less often than you would with server jobs. Stage to datasets, not sequential files.
- Don't waste resources partitioning a small volume job across a lot of nodes. Either write it as a server job or run it on just one node.
- Don't have a heart attack when your first job produces ten warning messages and twenty error messages.
- Either write jobs that do not produce warning messages or filter out production warning messages using a message handler.
- Don't run transformer functions (if, trim etc.) against a field that might contain nulls without doing null handling on it first (e.g. nulltovalue, handle_null). If a null is found in that field, it could drop the row or abort the job.
- Write a lot of small test jobs that isolate the specific functionality you are having trouble with. For example, a small generic change data capture job, or a small job with transformer functions. Trying to debug these stages in a complex job can drive you mad, and trial and error in a small job is easier.
- Get a copy of the Orchestrate Operators Guide as it has better information on some of the transformer and modify functions.
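
To make the null handling tip above concrete, here is a rough sketch in Python rather than in transformer syntax. It is only an analogy: None stands in for a null column value, and null_to_value, trim_unsafe and trim_safe are hypothetical helper names made up for this illustration, not DataStage functions.

```python
from typing import List, Optional


def null_to_value(value: Optional[str], default: str) -> str:
    """Return the default when the value is null (None), else the value itself."""
    return default if value is None else value


def trim_unsafe(value: Optional[str]) -> str:
    """Trim with no null handling; raises AttributeError when value is None."""
    return value.strip()


def trim_safe(value: Optional[str]) -> str:
    """Replace a null first, then trim, so a null never reaches the function."""
    return null_to_value(value, "").strip()


# A column in which one row carries a null.
rows: List[Optional[str]] = ["  alpha ", None, "beta  "]

# Every row survives because the null is replaced before trimming.
cleaned = [trim_safe(v) for v in rows]
```

In this analogy, calling trim_unsafe on the second row fails outright, which is roughly the Python equivalent of a dropped row or an aborted job. Replacing the null first keeps all three rows flowing.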
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Watch this site for an announcement. The course title is "Server to Parallel Transition". It's aimed at server job developers who are both experienced and competent, and aims to equip them with the correct mindset and skill set for developing parallel jobs.

This is NOT an IILive event; it's a training class to be run by DSX after IILive. More news as it comes to hand...

(The class was initially presented last April in Dallas to a group of guinea pigs from the Inner Circle group. Based on their feedback certain improvements have been made.)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Guinea pigs, eh? :lol:
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Find me a better term and I'll use it. Lab rats? :lol: