Faster method to move from Server Routines to PX ones?

manuel.gomez
Premium Member
Posts: 291
Joined: Wed Sep 26, 2007 11:23 am
Location: Madrid, Spain

Post by manuel.gomez »

Hello everybody,

I am very used to the Server edition of DataStage and am now working in a PX one. I am afraid I still think in terms of Server, and I am finding it difficult to adjust my thinking to the Enterprise edition.

I want to define a routine that carries out some simple validations on a character string (length, certain characters at given positions...), for example, checking that a string is an email address.

In Server, I would have defined a very simple routine and reused it in every job where needed.

But now I have to write C++ code (I have very little experience with it) even for such an easy task.

Any ideas on how to do this more easily?

Thanks a lot!!!
Meera
Premium Member
Posts: 21
Joined: Mon Nov 28, 2005 8:42 pm

Re: Faster method to move from Server Routines to PX ones?

Post by Meera »

You can use a string function like Count to check for the occurrence of @ and verify that the string is an email address. I do not think we need any C++ routines. Try different string functions to see which works best for your requirements.
gateleys
Premium Member
Posts: 992
Joined: Mon Aug 08, 2005 5:08 pm
Location: USA

Re: Faster method to move from Server Routines to PX ones?

Post by gateleys »

Meera, I think manuel.gomez just gave the email address parsing as an example, and you seem to have taken it literally.

Gomez, you can still go ahead and use server routines and call them via the BASIC Transformer. However, this is not recommended, since you will lose the advantages of the "parallel" features. You will have to look into the C++ routines that already exist and learn from them.
gateleys
cdp
Premium Member
Posts: 113
Joined: Tue Dec 15, 2009 9:28 pm
Location: New Zealand

Post by cdp »

Hi Manuel,

Did you ever get a resolution on this?

Or did you find an easy way to centralise functionality that can be called through a Transformer stage without having to resort to C++?

Thanks...Jonathan
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

I agree.

The strength of Server is these kinds of routines and the ease of creating good, strong, reusable rules.

The strength of PX is speed. You cannot scrub data as easily.
Mamu Kim
rameshrr3
Premium Member
Posts: 609
Joined: Mon May 10, 2004 3:32 am
Location: BRENTWOOD, TN

Post by rameshrr3 »

I came up with this algorithm for deciding when to use what, depending on transformation complexity (the original post seems to be pretty old).

Step 1 : Check data volumes to decide between Server and Parallel jobs. What counts as 'high' depends on your system resources and job logic. If you decide on Server jobs, you can use the large number of functions and options, or even develop routines in all-too-familiar BASIC.
Step 2 : If data volumes are high and it is a regular incremental load, use Parallel jobs.
Step 3 : If the source data comes from a database, check whether the database's native SQL has functions that can be used, or whether a stored procedure stage can be used as a transform stage. (For example, I have successfully used pattern match functions in Oracle SQL, whereas the equivalent functionality would lead to lengthy derivations in a parallel Transformer.)
Step 4 : If the source is a file, check whether you can use Perl or another scripting language or tool (awk/sed) to make the transformation, using an External Source stage or External Filter stage, or even filter commands in Server job Sequential File stages. Even Windows may have some scripting packages available. (Yes, I have shirked the responsibility of creating PX routines by using such 'shortcuts', or even wrapper stages.)
Step 5 : If you haven't yet reached a solution, it may be time to think of a parallel routine. Check whether you can get hold of a good C/C++ programmer in your organization to develop a routine based on its specification. *** You will need to provide very clean specs. *** I think such routines should not have a main() method. Attaching the library file to the PX routine is a cakewalk, but make sure to test.
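
To give an idea of how small such a routine can be for the email example from the original post, here is a rough sketch. It is only an illustration, not a tested PX routine: the function name is_plausible_email, the length limits, and the rules it checks (exactly one @, something before it, a dot inside the domain) are all my own assumptions. How the compiled object is registered as a parallel routine, and which linkage conventions it needs, should be checked against the DataStage documentation for your version.

```cpp
#include <cstring>

// Hypothetical routine name and rules, chosen only for illustration.
// Returns 1 if the string loosely resembles an email address, 0 otherwise.
// Note there is no main(): the function is meant to be compiled to an
// object file or library and called from elsewhere.
int is_plausible_email(const char* s)
{
    if (!s) return 0;                        // treat a null pointer as invalid

    const std::size_t len = std::strlen(s);
    if (len < 5 || len > 254) return 0;      // assumed length limits

    const char* at = std::strchr(s, '@');
    if (!at || at == s) return 0;            // no '@', or nothing before it
    if (std::strchr(at + 1, '@')) return 0;  // more than one '@'

    const char* dot = std::strrchr(at + 1, '.');
    if (!dot || dot == at + 1) return 0;     // no dot in domain, or domain starts with one
    if (*(dot + 1) == '\0') return 0;        // dot is the last character

    return 1;
}
```

A Transformer derivation would then only need to compare the returned value with 1. The rules above are deliberately loose, so real address validation would need a much cleaner spec, as Step 5 says.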

If all else fails, see whether the processing logic can be moved to another development tool such as Java/.NET, and whether you can call the executable from a DataStage job/sequence or a plug-in stage. (I have even had this experience with an older version of the XML Output stage, which couldn't loop over multiple sets of attribute-qualified elements, so I had to check with the Java folks, who could use DOM.)