Escaping # in Execute Command definition

Posted: Wed Mar 28, 2012 6:05 pm
by Gazelle
Requirement:
We need to read a text file that has a record for each of the tables to be loaded.
The file can have comment lines, which begin with a #.
e.g.

Code: Select all

#Batch Code, Source File, Schema File, Target Table, Key Field(s)
#PRS tables
PRS,PRS-EMPLOYEE-*-EXTRACT.DAT,EmployeeFile.schema,EMPLOYEE,EMP_ID
#FIN tables
FIN,FIN-GLCODE-*-EXTRACT.DAT,GLCodeFile.schema,GLCODE,GLID
Approach:
We're trying to use a Sequence to read the file using an Execute Command stage, and ignore any records starting with #. The command gets a list of Tables that we can loop through.
e.g. (the part that ignores comment lines is !/^#/):

awk -v BatchCode="#PS_BatchCode#" -F, '!/^#/ && $1~BatchCode {print $4}' | paste -s -d, - | tr -d '\n'

Problem:
However, when we try to compile the Sequence, it complains that the parameter "/^# &&" is missing.
i.e. when the compiler reads a # symbol, it treats it as the start of a parameter reference and tries to do a substitution.

Resolution:
Using \# did not work, so we created a parameter PS_HashSymbol with the value '#', and used that in the expression:
awk -v BatchCode="#PS_BatchCode#" -F, '!/^#PS_HashSymbol#/ && $1~BatchCode {print $4}' | paste -s -d, - | tr -d '\n'

But I'd prefer to use an escape character, rather than create another parameter.

Is there an escape character that we can use, so that the Sequence compiler treats the # as a literal, not as a parameter identifier?

Posted: Wed Mar 28, 2012 6:08 pm
by qt_ky
A related idea: How about putting the awk command into a shell script and calling the shell script from the Execute Command stage? Parameters can still be used.
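A rough sketch of that idea, in case it helps. The function name list_tables, the argument order and the temporary sample file are all made up for illustration; in a real script the function body would simply be the script body. The point is that the # in the awk pattern lives in the script, so the Sequence compiler never sees it:

```shell
#!/bin/sh
# Hypothetical wrapper: list_tables BATCH_CODE CONTROL_FILE
# Prints the target tables (field 4) for one batch code, comma-separated.
list_tables() {
    # The # below sits in the script file, not in the Execute Command
    # stage's command field, so no parameter substitution is attempted on it.
    awk -F, -v BatchCode="$1" '!/^#/ && $1 ~ BatchCode {print $4}' "$2" |
        paste -s -d, - | tr -d '\n'
}

# Demo against a throwaway copy of the sample control file
ctl=$(mktemp)
cat > "$ctl" <<'EOF'
#Batch Code, Source File, Schema File, Target Table, Key Field(s)
#PRS tables
PRS,PRS-EMPLOYEE-*-EXTRACT.DAT,EmployeeFile.schema,EMPLOYEE,EMP_ID
#FIN tables
FIN,FIN-GLCODE-*-EXTRACT.DAT,GLCodeFile.schema,GLCODE,GLID
EOF
list_tables PRS "$ctl"
rm -f "$ctl"
```

The Execute Command stage would then call the script with #PS_BatchCode# as an argument (script name and path being whatever you choose), and the only # characters left in the stage are the ones around the parameter name.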

Posted: Wed Mar 28, 2012 9:00 pm
by ray.wurlod
Possibly you could use grep -v ^\# as the first part of the pipeline, to eliminate lines beginning with "#".
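As a sketch of how that pipeline might look in an ordinary shell (the temporary file and the hard-coded batch code FIN are just for illustration; this does not show whether the Execute Command stage passes the # through to grep untouched):

```shell
# Throwaway copy of the sample control file
ctl=$(mktemp)
cat > "$ctl" <<'EOF'
#Batch Code, Source File, Schema File, Target Table, Key Field(s)
#PRS tables
PRS,PRS-EMPLOYEE-*-EXTRACT.DAT,EmployeeFile.schema,EMPLOYEE,EMP_ID
#FIN tables
FIN,FIN-GLCODE-*-EXTRACT.DAT,GLCodeFile.schema,GLCODE,GLID
EOF

# grep drops the comment lines first, so awk only has to match the batch code
tables=$(grep -v '^#' "$ctl" |
    awk -F, -v BatchCode="FIN" '$1 ~ BatchCode {print $4}' |
    paste -s -d, -)
printf '%s' "$tables"
rm -f "$ctl"
```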

Posted: Wed Mar 28, 2012 9:42 pm
by qt_ky
DataStage removes any single # from ExecCommand stage upon execution.

I think it is a bug, as it should give a warning and also provide an escape character for a # sign.

In place of this:

Code: Select all

!/^#/
Substitute the octal code \043 for the # in the regular expression, like so:

Code: Select all

!/^\043/
Hex may also work, as in \xhh (\x23 for #), but it depends upon the awk program on your OS. I played around on mine (AIX) and it would not accept hex; only octal.
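A quick way to check the octal form on your own awk before putting it in the stage is to run both patterns over the same lines and compare. This is only a sketch; the variable names are arbitrary and the sample lines come from the first post:

```shell
sample='#PRS tables
PRS,PRS-EMPLOYEE-*-EXTRACT.DAT,EmployeeFile.schema,EMPLOYEE,EMP_ID
#FIN tables
FIN,FIN-GLCODE-*-EXTRACT.DAT,GLCodeFile.schema,GLCODE,GLID'

# Literal # in the pattern (what the Sequence compiler trips over)
with_hash=$(printf '%s\n' "$sample" | awk -F, '!/^#/ {print $4}')

# Octal escape \043 in place of # (no # left for the compiler to find)
with_octal=$(printf '%s\n' "$sample" | awk -F, '!/^\043/ {print $4}')

if [ "$with_hash" = "$with_octal" ]; then
    echo "octal escape behaves the same as a literal #"
fi
printf '%s\n' "$with_octal"
```

If the two results match, the original command should work in the stage with only the pattern changed:
awk -v BatchCode="#PS_BatchCode#" -F, '!/^\043/ && $1~BatchCode {print $4}' | paste -s -d, - | tr -d '\n'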

Posted: Wed Mar 28, 2012 10:32 pm
by Gazelle
Thanks Eric. We're also using AIX, so the octal code will do very nicely.