Basic Transformer in Parallel job

Posted: Fri Dec 22, 2006 4:11 am
by Suman
I have one server routine that needs to be called from a transformer. Since a normal transformer in a parallel job cannot call a server routine, I am thinking of using a BASIC Transformer instead. But as it reportedly reduces performance a lot, I want to know whether any statistics exist showing how much performance is hampered by using a BASIC Transformer in a parallel job. Would it be better to use a server job than a BASIC Transformer in a parallel job?

Posted: Fri Dec 22, 2006 6:35 am
by johnthomas
Try using a shared container instead for calling the server routine from a transformer.

Posted: Fri Dec 22, 2006 6:39 am
by DSguru2B
If you could re-do the BASIC routine in C, that would be great. That way you can call the routine in a 'normal' transformer. This is, of course, assuming the data volume is expected to be large.

Posted: Fri Dec 22, 2006 4:00 pm
by ray.wurlod
If you don't have statistics how can you assert that "it reduces performance a lot"?!

Depending on the data volumes it may be better (easier) to use a server job.

Are you sure that the server routine's functionality could not be migrated to a parallel routine or BuildOp stage?

Posted: Tue Dec 26, 2006 3:17 am
by Suman
johnthomas wrote:try using shared container instead for calling the server routine from a transformer
The shared container option is fine when the server routine is applied to every record coming from the input. But here the server routine is used only when no value is found for a particular field inside the transformer. The derivation is like:
If IsNull(A) Then ServerRoutineValue Else A, where A is the field value.
So the shared container's value should be taken only when field A is null.
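If the routine does get ported to C along the lines DSguru2B suggested, the conditional above could be sketched roughly as below. The function name, signature, and null handling here are illustrative assumptions, not the actual server routine:

```c
#include <stddef.h>

/* Hypothetical sketch of the derivation
   "If IsNull(A) Then ServerRoutineValue Else A".
   The real server routine's lookup logic would replace `fallback`. */
const char *resolve_field(const char *a, const char *fallback)
{
    /* treat NULL (and an empty string, as a stand-in for a null
       field) as "no value found" */
    if (a == NULL || a[0] == '\0')
        return fallback;   /* value the server routine would supply */
    return a;              /* field A already has a value */
}
```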

Posted: Tue Dec 26, 2006 3:32 am
by Suman
ray.wurlod wrote:If you don't have statistics how can you assert that "it reduces performance a lot"?!

Depending on the data volumes it may be better (easier) to use a server job.

Are you sure that the server ro ...
The server job already exists and takes around 17-20 seconds. There is no parallel job yet. The server job now has to be converted to a parallel job, and the options I am considering are a BASIC Transformer or a parallel routine written in C. That the BASIC Transformer reduces performance in a parallel job is stated by Ascential, and I came to know about it from my colleagues.

Posted: Tue Dec 26, 2006 4:49 am
by ray.wurlod
Suman wrote:Server job now has to be converted to parallel job
Why?

If it isn't broken, don't fix it.

Posted: Tue Dec 26, 2006 7:17 am
by DSguru2B
It doesn't "have to" be changed. Argue your case. If that's to no avail, then at least post the logic; maybe we can help you build a C routine.

Posted: Tue Dec 26, 2006 8:26 am
by johnthomas
Suman,

As per your reply, "server routine is used only if there is no value found for a particular field". This could be achieved using the Switch and Merge stages. Also, since you can include a server job in a sequence job along with parallel jobs, I would go with what Ray has said: "if it isn't broken, don't fix it".

Posted: Wed Dec 27, 2006 1:00 am
by Suman
ray.wurlod wrote:
Suman wrote:Server job now has to be converted to parallel job
Why?

If it isn't broken, don't fix it. ...
A parallel job is required because I am converting all server jobs into parallel jobs to improve the performance of the whole process. The output of the existing server job is a hashed file, which is used for lookups in the next few jobs, which will be parallel jobs. As there is no hashed file in a parallel job, a Data Set is required for the lookup instead, and that is the reason for a parallel job rather than a server job.

Suman

Posted: Wed Dec 27, 2006 1:10 am
by ray.wurlod
Suman wrote:Parallel job is required as I am converting all server jobs into parallel jobs to improve the performance of the whole process.
It won't.

For small tasks server jobs, at versions earlier than 8.0, will always be more efficient (finish faster) than the equivalent parallel job. This is mainly because of the much greater startup costs of parallel jobs.

There is no reason not to use both server jobs and parallel jobs. They can be started from the same job sequence.

If you want to get rid of the hashed file, write a parallel job that loads a Lookup File Set and use that instead.

Posted: Wed Dec 27, 2006 7:27 am
by DSguru2B
Build the hashed file and dump its contents to a sequential file. Then use that sequential file to build your Lookup File Set or Data Set.

Posted: Wed Dec 27, 2006 7:54 am
by chulett
Suman wrote:Parallel job is required as I am converting all server jobs into parallel jobs to improve the performance of the whole process.
You'll actually decrease 'the performance of the whole process' by doing that. As noted, smaller tasks will be more efficient and take less time as Server jobs. And someone sold you a bill of goods if they told you that you 'needed' to do this just because you upgraded to EE. Keep a mixture of the two job types. Convert only what would benefit from the conversion.

Posted: Wed Jan 03, 2007 4:40 am
by Suman
I have written the C program and created the .o file. But during compilation I am getting the following errors from the transformer:

##E TBLD 000000 02:19:10(000) <main_program> Error when checking composite operator: Subprocess command failed with exit status 256.
##E TFSR 000019 02:19:10(001) <main_program> Could not check all operators because of previous error(s)
##W TFCP 000000 02:19:10(002) <transform> Error when checking composite operator: The number of reject datasets "0" is less than the number of input datasets "1".
##W TBLD 000000 02:19:10(003) <main_program> Error when checking composite operator: Output from subprocess: "/opt/ds/app/ETLDev/RT_BP633.O/V0S3_SampleRoutinetest_Transformer_3.C", line 523: error #2390: function "main" may not be called or have its address taken
output0Int32B[0]=main();
^

##W TBLD 000000 02:19:10(004) <main_program> Error when checking composite operator: Output from subprocess:

##I TFCP 000000 02:19:10(005) <transform> Error when checking composite operator: /opt/aCC/bin/aCC -L/opt/ds/app/ETLDev/RT_BP633.O/ -L/home/dsadm/Ascential/DataStage/PXEngine/lib -L/home/dsadm/Ascential/DataStage/PXEngine/user_lib +DD64 -b -Wl,+s -Wl,+vnocompatwarnings -lorchhpia64 -lorchcorehpia64 -lorchbuildophpia64 /home/skundu/Test/Read1.o /opt/ds/app/ETLDev/RT_BP633.O/V0S3_SampleRoutinetest_Transformer_3.tmp.o -o /opt/ds/app/ETLDev/RT_BP633.O/V0S3_SampleRoutinetest_Transformer_3.so.
##W TBLD 000000 02:19:10(006) <main_program> Error when checking composite operator: Output from subprocess: 1 error detected in the compilation of "/opt/ds/app/ETLDev/RT_BP633.O/V0S3_SampleRoutinetest_Transformer_3.C".

##W TBLD 000000 02:19:10(007) <main_program> Error when checking composite operator: Output from subprocess: aCC: warning 1913: `/opt/ds/app/ETLDev/RT_BP633.O/V0S3_SampleRoutinetest_Transformer_3.tmp.o' does not exist or cannot be read

##W TBLD 000000 02:19:10(008) <main_program> Error when checking composite operator: Output from subprocess: ld: Can't find library or mismatched ABI for -lorchhpia64
Fatal error.

Do I need to change the library path in the environment variable LD_LIBRARY_PATH, or is there some other setting? One of the errors indicates it cannot find the library.

Any ideas about these errors would be helpful.
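For what it's worth, the '#2390: function "main" may not be called' error suggests the routine source defines main() as its entry point. A parallel routine has to be an ordinary function that the generated transformer code can call directly. A minimal sketch follows; the name Read1 is taken from the object file in the log, but the signature and body are assumptions:

```c
/* Sketch: the entry point as a plain function, not main().
   The generated transformer code calls it directly
   (output0Int32B[0] = ...), and C forbids calling main(),
   hence error #2390 at line 523 of the generated source. */
int Read1(void)   /* signature assumed; it must match the routine
                     definition registered in the DataStage repository */
{
    int result = 0;
    /* ... original routine logic goes here ... */
    return result;
}

/* Compile as 64-bit position-independent code on HP-UX, e.g.:
     aCC +DD64 +Z -c Read1.c -o Read1.o
   (+Z is aCC's PIC flag, which the transformer-generated
   shared library build requires.) */
```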

Posted: Wed Jan 03, 2007 8:04 am
by DSguru2B
When you test your routine from the command line, does it work? How are you compiling the source code? Make sure you compile it with the +Z option.