Substring replacement

Posted: Mon Nov 21, 2011 9:24 am
by PhilHibbs
I've just come across a requirement on my project to replace " -" or "- " with "-", so I looked up Replace, and found that it is Server only, not PX. This is astonishing! There really is no built-in string replace function in parallel DataStage? I will try to get the client to accept the pxEreplace function, but I think they have two policies that will be a problem: "No parallel routines" and "No BASIC transformers". I imagine if I added an External Filter stage with a Unix command to do the replacement, they would ban that as well. Am I out of options for a parallel DataStage out-of-the-box solution for substring replacement?

Posted: Mon Nov 21, 2011 10:54 am
by qt_ky
Do you mean no server routines or truly no parallel routines? If parallel derivations are used hundreds of times across jobs, it seems like a parallel routine would be easier to maintain.

What if your input contains " - " or "- " with multiple spaces?

"      - " or "-      "
Perhaps using a loop within a Transformer stage would allow you to output "-" in these cases.

Posted: Mon Nov 21, 2011 11:56 am
by PhilHibbs
qt_ky wrote:Do you mean no server routines or truly no parallel routines?
They are ok with BASIC Routines for use within Job Sequences - we have a number of routines that do various DB2 operations, get file row counts, etc.
qt_ky wrote:What if your input contains " - " or "- " with multiple spaces?
Trim() will have dealt with that.
qt_ky wrote:Perhaps using a loop within a Transformer stage would allow you to output "-" in these cases.
Transformer loops are for outputting multiple rows. I don't think it would be efficient to use a Transformer loop to step through a string character by character, and there are multiple strings to perform this replace operation on as well.

I think I am winning the argument on allowing Parallel Routines though which is good. We just need to work out some support and maintenance documentation.

Posted: Mon Nov 21, 2011 2:26 pm
by qt_ky
I didn't think it would be efficient, but given all the client constraints, it seemed like they were eliminating all your options. I would lean towards the External Filter stage you mentioned, calling the sed command. It's supported out of the box, whereas, like you said, pxEreplace is not a built-in function that comes "out of the box".

Posted: Mon Nov 21, 2011 11:24 pm
by chandra.shekhar@tcs.com
Use the Convert() function in a Transformer.

Posted: Tue Nov 22, 2011 4:16 am
by ray.wurlod
If it's only " -" and "- " there's a tolerably ugly solution involving nested If..Then..Else and Index() functions. But one which would meet your clients' stated requirements.

<rant>Resist stupid requirements!</rant>

Posted: Tue Nov 22, 2011 5:28 am
by PhilHibbs
ray.wurlod wrote:If it's only " -" and "- " there's a tolerably ugly solution involving nested If..Then..Else and Index() functions. But one which would meet your clients' stated requirements.
Only if you can set an upper limit on the number of occurrences.
ray.wurlod wrote:<rant>Resist stupid requirements!</rant>
There is a valid point to be made: for an organization that is using DataStage for the first time, and which may not have anyone with C programming in their skill set, the maintenance that parallel routines add is something they might not want to take on. You have to compile the routine and put the .o file in the right place, remember to recompile any jobs that use it whenever it changes, and make sure you update the .o file on the UAT and production servers. If you want a different version in different projects on the same server, that is tricky as well.
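
For a sense of how much C/C++ is actually involved, here is a minimal sketch of the replacement logic such a routine would need. This is not the pxEreplace source. The names replace_all and tighten_hyphens are hypothetical, and the sketch uses std::string for clarity; a real parallel routine would presumably need a C-style wrapper around it, which is not shown. It assumes, as discussed above, that runs of spaces have already been reduced to single spaces before the call.

```cpp
#include <string>

// Hypothetical helper (not the pxEreplace source): replace every
// occurrence of `from` in `input` with `to`, scanning left to right.
std::string replace_all(std::string input,
                        const std::string& from,
                        const std::string& to) {
    if (from.empty()) {
        return input;  // nothing to search for; avoid an endless loop
    }
    std::string::size_type pos = 0;
    while ((pos = input.find(from, pos)) != std::string::npos) {
        input.replace(pos, from.size(), to);
        pos += to.size();  // resume the search after the inserted text
    }
    return input;
}

// Hypothetical wrapper for the case in this thread: turn " -" and "- "
// into "-". Assumes the input has at most one space on each side of a
// hyphen; longer runs of spaces are only shortened by one.
std::string tighten_hyphens(const std::string& input) {
    return replace_all(replace_all(input, " -", "-"), "- ", "-");
}
```

With single spaces around the hyphen, tighten_hyphens("a - b") yields "a-b". The logic itself is short; the maintenance burden described above is in building and deploying it, not in the code.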

Posted: Tue Nov 22, 2011 8:04 am
by qt_ky
If they don't have C programming, it seems like they would more easily accept using a BASIC Transformer with the eReplace function.