Hi,
I need to build a custom operator that should run under DataStage 9.1 on Red Hat Linux 5 (64-bit). There is an introduction on the IBM site on how to build custom operators for the Windows platform here: http://www.ibm.com/developerworks/data/ ... 0702chard/
I was trying to find something similar that describes the compilation process for Linux, but no luck so far... The g++ compiler is set up correctly (transformers in a job work fine), but when I try to build the example from the link above, it returns hundreds of errors in the .h files referenced by the sample code.
Is there an example that describes the compilation process of a custom operator for Linux? Or maybe somebody can share their personal experience on the subject.
Thanks
How to compile a custom operator for RHEL 5
Custom operators are compiled within the stage, irrespective of platform. The compiler is specified through environment variables such as APT_COMPILER, and the Build stage allows you to specify additional flags if necessary.
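To see what the engine would pick up, the compiler-related variables can be printed from a shell that has the DataStage environment loaded. This is only a minimal sketch: APT_COMPILER and APT_COMPILEOPT are named in this thread, while APT_LINKER and APT_LINKOPT are assumed to be the matching linker settings on your installation, and the function name show_apt_build_env is made up for illustration.

```shell
#!/bin/sh
# Print the parallel-engine compiler settings visible in the current shell.
# show_apt_build_env is a hypothetical helper name, not part of DataStage.
show_apt_build_env() {
    for var in APT_COMPILER APT_COMPILEOPT APT_LINKER APT_LINKOPT; do
        # eval gives portable indirect expansion in plain sh
        eval "value=\${$var:-}"
        if [ -n "$value" ]; then
            printf '%s=%s\n' "$var" "$value"
        else
            printf '%s is not set\n' "$var"
        fi
    done
}

show_apt_build_env
```

If a variable shows up as not set here, it may still be defined at project or job level, so treat the output as a first hint only.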
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Hello Stephan,
When I started working on building custom operators, I thought I'd start with the few examples that IBM has posted on the web. That ended up being a bit of a mistake, as the published examples don't work, or only work partially and on one platform (Windows).
There are 3 types of stages available for you to create: "build", "custom" and "wrapped".
The "build" type lets you define a rigid input and output schema and create C++ code within the DataStage Designer GUI to manipulate the data. When you compile/build this type of stage, DataStage will include the requisite header files, which makes your design effort a bit simpler. On the other hand, it splits the final program logic into "definitions", "pre-loop", "per-record" and "post-loop" sections. The compilation is done under the covers by DataStage and uses the settings defined in the DSPARAMS file. If what you want to do fits into this structure, then I would suggest you use this method.
The next type is "custom", and this is somewhat more involved. The DataStage Designer for this stage defines the entry point and the command line options (using the "properties" tab). You can open up existing stages and look at how their "properties" are defined to see how easy this part is.
The coding of your custom operator in this case is done outside of DataStage. You need to ensure that you include the requisite classes and headers as well as other libraries, and write a standard C++ program as a library. I've found that I use almost the same compiler settings as found in the DSPARAMS file. On Windows one needs to make some changes when using the Design Studio, but on UNIX that hasn't been necessary.
Once you have a library, it needs to be placed in a directory listed in the library path environment setting and also registered with DataStage; it will then be dynamically linked at runtime. While I don't go into great depth, you can download the most recent document at Hist-Op Documentation and check out chapter 8 for some installation information.
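One way to confirm the placement step is to walk the library path and look for the file. The sketch below assumes the relevant setting is LD_LIBRARY_PATH (as on Linux) and uses libhello.so, the library name that comes up later in this thread; find_lib_in_path is a hypothetical helper, not a DataStage tool.

```shell
#!/bin/sh
# find_lib_in_path is a hypothetical helper: it walks a colon-separated
# directory list and prints the first directory that contains the file.
find_lib_in_path() {
    lib_name=$1
    path_list=$2
    old_ifs=$IFS
    IFS=:
    for dir in $path_list; do
        if [ -n "$dir" ] && [ -f "$dir/$lib_name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Example: look for libhello.so in the current LD_LIBRARY_PATH.
if found=$(find_lib_in_path libhello.so "${LD_LIBRARY_PATH:-}"); then
    echo "libhello.so found in $found"
else
    echo "libhello.so not found in LD_LIBRARY_PATH"
fi
```

Run it from a shell with the same environment as the job, since the engine's library path can differ from your login shell's.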
The third option is the "wrapped" type and is one I haven't actually used, so I will let someone else comment on that approach if that is what you need.
Hello and thank you for the responses. I'm happy to say that I have made progress on this...
For future reference, if somebody wants to compile the code from the IBM article mentioned in the first post, here is what needs to be done:
1) The following code needs to be removed from src/myhelloworld.c, because it causes the wrong file to be included later on:
Code: Select all
#define __NUTC__
Otherwise the following compilation error is raised:
Code: Select all
/opt/IBM/InformationServer/Server/PXEngine/include/unicode/umachine.h:47:30: error: unicode/pnutc.h: No such file or directory
2) The source code can be compiled like this (the parameters are like the APT_COMPILEOPT, as Arnd said above):
Code: Select all
g++ -c -O -fPIC -Wno-deprecated -m64 -mtune=generic -mcmodel=small -I/opt/IBM/InformationServer/Server/PXEngine/include src/myhelloworld.c
3) The link step is:
Code: Select all
g++ -shared -m64 -L/opt/IBM/InformationServer/Server/PXEngine/lib -lorchgeneralx86_64 -lorchx86_64 -lorchmonitorx86_64 -lorchcorex86_64 -lorchsortx86_64 myhelloworld.o
Now about integrating this into the Designer and running a job with the custom operator: it was a partial success...
I followed Arnd's PDF and created an entry in /opt/IBM/InformationServer/Server/PXEngine/etc/operator.apt
Code: Select all
hello libhello 1
Then I created a new "Parallel Stage Type (Custom)", entered "hello" as the operator, and moved the libhello.so library here: /opt/IBM/InformationServer/Server/DSComponents/bin
Unfortunately, at runtime the job aborts with:
main_program: PATH search failure:
main_program: Could not locate operator definition, wrapper, or Unix command for "hello"; please check that all needed libraries are preloaded, and check the PATH for the wrappers
But the PATH includes the deployment directory:
PATH=/opt/IBM/InformationServer/Server/Projects/dstage1/wrapped:/opt/IBM/InformationServer/Server/Projects/dstage1/buildop
:/opt/IBM/InformationServer/Server/Projects/dstage1/RT_BP8.O
:/opt/IBM/InformationServer/Server/DSComponents/lib:/opt/IBM/InformationServer/Server/DSComponents/bin
:/opt/IBM/InformationServer/Server/DSParallel:/opt/IBM/InformationServer/Server/PXEngine/user_osh_wrappers
:/opt/IBM/InformationServer/Server/PXEngine/osh_wrappers
:/opt/IBM/InformationServer/Server/PXEngine/bin
:/opt/IBM/InformationServer/Server/PXEngine/grid:/opt/IBM/InformationServer/ASBNode/apps/jre/bin:/sbin
:/usr/sbin:/bin:/usr/bin:/usr/local/nz/bin64:/home/loadl/bin:/usr/kerberos/bin:/usr/local/bin:/usr/X11R6/bin:.
I also tried copying libhello.so into a folder referenced by LD_LIBRARY_PATH. No luck...
The libhello.so has permissions like this:
-rwxr-xr-x 1 root root 144264 Sep 15 22:01 libhello.so
So what could be the problem here?
The only way I was able to run the job was to introduce a wrapper script for the operator. Having this script in one of the PATH directories makes the job run without problems. Could we avoid running this script? The script looks like this:
Code: Select all
#!/bin/sh
props=''
numtimes=''
uppercase=0
usage="hello [-u] [-n times] < input > output"
status=0
error () {
    echo "select: $1" 1>&2 ;
    if [ $# -eq 1 ] ; then status=1 ; else status=$2 ; fi
}
# Parse argument list
while [ $# -gt 0 ] ; do
    case "$1" in
        -n) # number of times
            if [ $# -lt 2 ] ; then
                error "no value specified for -n argument"
                break
            else
                numtimes="$2"
                shift; shift
            fi
            ;;
        -u) # print uppercase
            uppercase=1
            shift
            ;;
        *) # otherwise
            error "Unrecognized argument, $1"
            shift # skip to next argument
            ;;
    esac
done
# Check for properties
if [ ${status} -eq 0 ] ; then
    if [ $uppercase -eq 1 ] ; then
        props="${props:+$props,}uppercase"
    fi
    if [ -n "$numtimes" ] ; then
        props="${props:+$props,}numtimes=$numtimes"
    fi
fi
# On error print only the usage; otherwise print the operator definition
if [ ${status} -ne 0 ] ; then
    echo "{
usage=\"$usage\"
}"
else
    echo "{
class=HelloWorldOp,
initialization={${props}},
usage=\"$usage\",
library=\"hello\"
}"
fi
exit $status
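The compile and link commands from steps 2) and 3) above could also be collected into one small script. This is only a sketch under assumptions: it takes APT_ORCHHOME to point at the PXEngine directory (falling back to the path used in the post), and it adds "-o libhello.so" to the link step so the output gets the library name used in this thread; the posted link command has no -o option. The names PX_HOME, run and build are made up for illustration.

```shell
#!/bin/sh
# Sketch of steps 2) and 3) as one script; the flags are copied from the post.
# Assumptions: APT_ORCHHOME points at the PXEngine directory (falls back to
# the path used in the post), and "-o libhello.so" is added to name the output.
PX_HOME=${APT_ORCHHOME:-/opt/IBM/InformationServer/Server/PXEngine}
SRC=${SRC:-src/myhelloworld.c}
OUT=${OUT:-libhello.so}

# Echo each command; execute it only when RUN=1 is set.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = 1 ]; then
        "$@" || exit 1
    fi
}

build() {
    # Compile to myhelloworld.o in the current directory
    run g++ -c -O -fPIC -Wno-deprecated -m64 -mtune=generic -mcmodel=small \
        "-I$PX_HOME/include" "$SRC"
    # Link the shared library
    run g++ -shared -m64 "-L$PX_HOME/lib" \
        -lorchgeneralx86_64 -lorchx86_64 -lorchmonitorx86_64 \
        -lorchcorex86_64 -lorchsortx86_64 myhelloworld.o -o "$OUT"
}

build
```

Run as is, it only prints the two commands; with RUN=1 in the environment it executes them and stops at the first failure.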
Stephan
http://szahariev.blogspot.com
OK, I have managed to resolve it. The issue described in my last post was due to a wrong invocation of the APT_DEFINE_OSH_NAME macro.
If anyone is interested I have summarized the experience here: http://szahariev.blogspot.com/2013/09/C ... Howto.html
Stephan
http://szahariev.blogspot.com