Key points for PX development

Posted: Sun Jul 23, 2006 8:46 am
by clarcombe
Bit of a woolly topic, I know, but I have just landed my first PX contract and start tomorrow :? It is a 3-month migration of MVS flat-file data to MILOS or something like that (a SAP load tool), also with flat-file output.

I did the PX training a couple of years ago and have reached a relatively proficient level in Server. I have been rereading the PX server guide, although I am only at page 79 of 1130 :cry: I have also just looked at the modify training tip (could do with a few more like that).

If there were a list of 10 PX commandments, what would they be?

Thanks

Posted: Sun Jul 23, 2006 9:18 am
by ArndW
(This could be an interesting thread if everyone would post 1 additional commandment; we'll certainly get more than 10 but it will be interesting!)

#1 Don't think or design in terms of the server jobs you know
The designer canvas and identical looking designer objects lull you into the false sense of security that you really aren't doing anything very different from server jobs when you design parallel jobs. Without exception the inferior PX jobs that I've seen to date are badly written because of this basic error, by developers who came from server and were not given sufficient training or experimentation time to realize the differences.
Always remember that you are working on a different product that requires a different mindset in designer and later on when executing jobs.
Since the number of concurrent parallel streams that your job will run in depends upon the configuration file and, to some extent, on the parallelism of your databases, you need to understand that a certain amount of implicit repartitioning and interaction is going to happen to your data.
Always design with those parallel streams in mind, even if your job will most likely end up running on a 1-node configuration. I am sure that each and every one of us who has worked in PX has, at one time, done a lookup without thinking about the partitioning, and then had to revisit and fix that stage during the testing phase when the results weren't quite right.
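
To make the lookup pitfall concrete, here is a toy model in plain Python (not DataStage syntax, and not how the engine is implemented). The function names, the 4-node count and the sample rows are all made up for illustration. Each "node" only sees its own partition of the reference data, so a lookup only finds every row when both inputs are partitioned the same way on the key.

```python
from zlib import crc32

def hash_partition(rows, key, nodes):
    """Send each row to the partition chosen by a stable hash of its key."""
    parts = [[] for _ in range(nodes)]
    for row in rows:
        parts[crc32(str(row[key]).encode()) % nodes].append(row)
    return parts

def round_robin(rows, nodes):
    """Deal rows out to partitions in turn, ignoring the key."""
    parts = [[] for _ in range(nodes)]
    for i, row in enumerate(rows):
        parts[i % nodes].append(row)
    return parts

def lookup(stream_parts, ref_parts, key):
    """Count matches when each node can only see its own reference partition."""
    matched = 0
    for stream, ref in zip(stream_parts, ref_parts):
        ref_keys = {r[key] for r in ref}
        matched += sum(1 for row in stream if row[key] in ref_keys)
    return matched

NODES = 4  # hypothetical node count
stream = [{"id": i} for i in range(100)]
reference = [{"id": i} for i in range(100)]

# Stream and reference partitioned differently: some rows miss their match.
mismatched = lookup(round_robin(stream, NODES),
                    hash_partition(reference, "id", NODES), "id")

# Both sides hash-partitioned on the lookup key: every row finds its match.
aligned = lookup(hash_partition(stream, "id", NODES),
                 hash_partition(reference, "id", NODES), "id")

# On a single node the mismatch is invisible, which is why it tends to
# surface only once the job runs on a multi-node configuration.
one_node = lookup(round_robin(stream, 1),
                  hash_partition(reference, "id", 1), "id")

print(mismatched, aligned, one_node)
```

The single-node run finds every row even with the mismatched partitioning, so testing on one node alone would not reveal the problem.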

Posted: Sun Jul 23, 2006 10:51 am
by kris007
Having made the transition from server to parallel recently, here are 3 of the many points I noticed.
1. Thou shalt not perform any functions without properly handling NULLs (this one will be more useful for you, as you are dealing with flat files in both source and target).
2. Thou shalt perform explicit type conversion whenever possible and not trust DataStage to do type conversion (PX is stricter about data types; a strongly typed environment, if I may say so).
3. Thou shalt remember to follow dsxchange and use it wisely (that's what I have been doing :) ).
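
Points 1 and 2 can be sketched in plain Python (again an illustration only, not PX code). The null markers, the field width and the helper names are assumptions made up for the example. The idea is to decide explicitly what counts as NULL on input, convert types yourself, and decide explicitly what a NULL looks like in the flat-file output.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional

# Assumption: blank fields and the literal text "NULL" both mean NULL.
NULL_MARKERS = {"", "NULL"}

def to_decimal(raw: str) -> Optional[Decimal]:
    """Explicitly convert a flat-file field to a decimal, or None for NULL."""
    text = raw.strip()
    if text in NULL_MARKERS:
        return None
    try:
        return Decimal(text)
    except InvalidOperation:
        # Fail loudly rather than let a bad value be coerced silently.
        raise ValueError(f"not a decimal: {raw!r}") from None

def add_amounts(a: Optional[Decimal], b: Optional[Decimal]) -> Optional[Decimal]:
    """Handle NULLs before calling the function, not after it has failed."""
    if a is None or b is None:
        return None
    return a + b

def to_output_field(value: Optional[Decimal], width: int = 12,
                    null_text: str = "") -> str:
    """Write a right-aligned fixed-width field, with an explicit NULL form."""
    text = null_text if value is None else f"{value:.2f}"
    return text.rjust(width)

total = add_amounts(to_decimal("10.5"), to_decimal(" 2.25 "))
missing = add_amounts(to_decimal("10.5"), to_decimal("   "))
print(repr(to_output_field(total)), repr(to_output_field(missing)))
```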

Amen,
HTH

Posted: Sun Jul 23, 2006 9:19 pm
by pigpen
Think carefully about the selection of partition keys when creating a dataset. A poor choice may hinder the performance of jobs that use the dataset afterwards.
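
A rough way to see why the key matters, sketched in plain Python with made-up columns (`customer_id`, `country`) and a hypothetical 4-node layout: a key with few distinct values piles most rows onto one or two partitions, while a high-cardinality key spreads them out.

```python
from collections import Counter
from zlib import crc32

def partition_sizes(rows, key, nodes):
    """Return the row count per partition when hash-partitioning on `key`."""
    counts = Counter(crc32(str(row[key]).encode()) % nodes for row in rows)
    return [counts.get(n, 0) for n in range(nodes)]

# Hypothetical data: 1000 customers, 90% of them in one country.
rows = [{"customer_id": i, "country": "FR" if i % 10 else "DE"}
        for i in range(1000)]

by_country = partition_sizes(rows, "country", 4)       # heavily skewed
by_customer = partition_sizes(rows, "customer_id", 4)  # roughly even

print("country key: ", by_country)
print("customer key:", by_customer)
```

With the skewed key, one partition does most of the work while the others sit nearly idle, and every later job that reads the dataset with that partitioning inherits the imbalance.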

Posted: Sun Jul 23, 2006 9:28 pm
by kcbland
1. Thou shalt remember that PX comes with Server as well, so decide when to do it as pure PX and when not.
2. Thou shalt remember not to hog the server nodes until running full stress tests.
3. Thou shalt use a wimpy node pool during minor test runs so that Director doesn't crawl and (-14) errors don't occur.
4. Thou shalt remember that it is not a sign of skill to reach the limit of stages allowed in a single job design.
5. Thou shalt remember that even though you're doing a one-time data conversion project, your standards and practices should still be good, proper programming: no variables named x, y, or z.

and I've run out of pointers... :cry:

Posted: Tue Jul 25, 2006 1:06 am
by clarcombe
Many thanks for your comments. Luckily I discovered that the project is at an advanced stage, so there are norms and rules in place, some of which have been mentioned here.

Posted: Tue Jul 25, 2006 2:50 am
by kumar_s
The 4th point given by Kenneth should be noted.
Adding consecutive active stages wouldn't affect performance much in a Server job, but the same doesn't hold in PX. People might happily add more stages, which will directly affect performance, even when combinable operators are enabled.

Posted: Tue Jul 25, 2006 6:26 am
by rwierdsm
I would add that you need to remember in your estimates that PX jobs WILL take longer to code.

The server transformer stage was like a Swiss Army knife. Now you have a whole drawer full of utensils, none of which does everything the Swiss Army knife did, but each of which does its individual task better.

Rob