Server vs parallel jobs

Posted: Tue Feb 08, 2011 10:28 am
by rsunny
Hi,

Since we can run server jobs in Enterprise Edition, which configuration file is used when we run them there? And when we run server jobs in Enterprise Edition, are they compiled in BASIC or with the C++ compiler? Can we compile the Transformer in BASIC in Enterprise Edition, or do we have to compile it with the C++ compiler only?

And if I am not wrong, we can't run parallel jobs in Server Edition, since there is no parallel engine installed, right?

Thanks in advance

Posted: Tue Feb 08, 2011 10:47 am
by chulett
Just because you have the "Enterprise Edition" doesn't mean Server jobs work or run any differently.

Posted: Tue Feb 08, 2011 10:52 am
by jwiles
The Enterprise Edition (Information Server for v8.x) contains both the Server (UniVerse) and Parallel (Orchestrate) engines.

Server jobs are server jobs...they run in the server engine as always. They will not use a parallel configuration file because they do not utilize the parallel engine.

Parallel jobs run only on the parallel engine, except for those instances where you can use a BASIC transformer or other server-based stage as part of a parallel job. Those particular stages will be run in the server engine (probably as a child process of a parallel skeleton operator) while the rest of the parallel job runs in the parallel engine. BASIC transformers cannot be compiled as a C++ operator.

If you have not installed the parallel engine in an instance, due to licensing or other reason, you cannot run parallel jobs.

Regards,

Posted: Tue Feb 08, 2011 11:06 am
by chulett
Ah... a much more thorough reply. Thank goodness for new fingers. :wink:

Posted: Tue Feb 08, 2011 11:51 am
by rsunny
jwiles wrote: Server jobs are server jobs...they run in the server engine as always. They will not use a parallel configuration file because they do not utilize the parallel engine.

Parallel jobs run only on the parallel engine, except for those instances where you can use a BASIC transformer or other server-based stage as part of a parallel job. Those particular stages will be run in the server engine (probably as a child process of a parallel skeleton operator) while the rest of the parallel job runs in the parallel engine. BASIC transformers cannot be compiled as a C++ operator.

Regards,
Hi,

Thank you for your reply.

So when we purchase Enterprise Edition, it also includes the server engine, right?

Can we use server-based stages in a parallel job? I thought a parallel job contains only parallel stages, not server-based stages. I am a little confused; can you tell me which server-based stages we can use in parallel jobs?

How does DataStage differentiate when we use server-based stages in parallel jobs? For example, the Sort stage is included in both server and parallel.

Thanks in advance

Posted: Tue Feb 08, 2011 12:11 pm
by ray.wurlod
There are two Sort stages, even though they're called the same thing. (What else would you call it?) Open the properties in the Stage Types branch of the Repository to note significant differences.

Posted: Tue Feb 08, 2011 12:17 pm
by rsunny
Hi,

Thanks for your reply, but how can we use a server-based stage as part of a parallel job? I mean, can we use server stages in parallel jobs? I am sorry for giving you a hard time, but I am confused, since parallel jobs can use only parallel stages and not server-based stages.

Thanks in advance

Posted: Tue Feb 08, 2011 12:23 pm
by ray.wurlod
Why are you confused? Server jobs use server stages, parallel jobs use parallel stages, sequence jobs use sequence stages. You even get a different Palette depending on what job type you have open.

(You can use server Shared Containers in parallel jobs under certain circumstances, but not server stage types directly.)

Posted: Tue Feb 08, 2011 12:55 pm
by rsunny
jwiles wrote: Parallel jobs run only on the parallel engine, except for those instances where you can use a BASIC transformer or other server-based stage as part of a parallel job. Those particular stages will be run in the server engine (probably as a child process of a parallel skeleton operator) while the rest of the parallel job runs in the parallel engine. BASIC transformers cannot be compiled as a C++ operator.

Regards,
Hi Ray,
As James mentioned, "use other server-based stage as part of a parallel job", so I was wondering how we can use a server-based job as part of a parallel job.

Thanks in advance

Posted: Tue Feb 08, 2011 1:10 pm
by chulett
Server Shared Container.

Posted: Tue Feb 08, 2011 3:06 pm
by jwiles
(My fingers don't feel all that new today...gotta be the cold here in Chicago :) )

Be sure to read the documentation for the server shared container to understand how to use it properly. Also, it won't be a magic bullet for your server jobs as far as performance goes...they are still running in the server engine. The performance of a parallel job that uses one will likely be hindered somewhat as well.

Regards,

Posted: Tue Feb 08, 2011 10:35 pm
by ray.wurlod
Ah, "performance". Is it time for that argument again?

For small volumes of data a server job can be started, run and finished before a parallel job has completed its startup phase. This startup phase is allegedly improved (quicker) in version 8.5 but I've not yet had a chance to verify that one way or another.

So the real question remains: how should you define "performance" in an ETL context? I tend heavily to the view that it's about being able to meet time window KPIs with a margin for safety, and that it's not properly measured as a rate at all.

Server components in parallel jobs will always slow them down, if only because of the need to transition between strongly-typed and typeless environments.

Posted: Tue Feb 08, 2011 11:06 pm
by jwiles
I agree. Startup time is a killer for small files in the parallel engine (small being a loosely-typed term). 30+ seconds to start up a job that runs in <1 sec is a waste.

Poorly chosen wording after a long day on-site...I'm allowed at least 10-12 of those I think. :)

Posted: Wed Feb 09, 2011 11:07 am
by gateleys
ray.wurlod wrote:For small volumes of data a server job can be started, run and finished before a parallel job has completed its startup phase.
Ray, what would you consider a "small" volume, versus a "medium" or "large" volume?

What about load time differences between server and parallel jobs, both in terms of conventional inserts and direct path loads?

Posted: Wed Feb 09, 2011 11:11 am
by ray.wurlod
Case by case basis. I guess anything under 1-2 MB (possibly larger) would fit in the small category. There's nothing to prevent a server job setting up a parallel load.
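
To put rough numbers on the startup-versus-volume point made in this thread, here is a minimal back-of-envelope sketch in Python. It assumes a simple model where elapsed time is a fixed startup cost plus volume divided by a steady throughput. The startup and throughput figures are made-up placeholders, not measurements of any DataStage release, and the function names are hypothetical. The last helper reflects the view that performance is about fitting a time window with a safety margin rather than about a rate.

```python
# Back-of-envelope model only: elapsed = startup + volume / throughput.
# All constants below are ASSUMED illustrative values, not benchmarks.

SERVER_STARTUP_S = 2.0      # assumed server job startup time, seconds
SERVER_RATE_MB_S = 5.0      # assumed server job throughput, MB per second
PARALLEL_STARTUP_S = 30.0   # assumed parallel job startup time, seconds
PARALLEL_RATE_MB_S = 40.0   # assumed parallel job throughput, MB per second


def elapsed_seconds(volume_mb, startup_s, rate_mb_per_s):
    """Total wall-clock time: fixed startup cost plus time to move the data."""
    return startup_s + volume_mb / rate_mb_per_s


def break_even_mb(startup_a, rate_a, startup_b, rate_b):
    """Volume at which job A and job B finish at the same moment.

    Solves startup_a + v / rate_a == startup_b + v / rate_b for v.
    The two rates must differ, otherwise there is no single crossing point.
    """
    return (startup_b - startup_a) / (1.0 / rate_a - 1.0 / rate_b)


def meets_window(elapsed_s, window_s, safety_margin=0.2):
    """True if the job fits the time window with the given safety margin."""
    return elapsed_s <= window_s * (1.0 - safety_margin)


if __name__ == "__main__":
    for volume_mb in (1, 2, 100, 1000):
        server = elapsed_seconds(volume_mb, SERVER_STARTUP_S, SERVER_RATE_MB_S)
        parallel = elapsed_seconds(volume_mb, PARALLEL_STARTUP_S, PARALLEL_RATE_MB_S)
        print(f"{volume_mb:>5} MB  server {server:7.1f} s  parallel {parallel:7.1f} s")

    crossover = break_even_mb(SERVER_STARTUP_S, SERVER_RATE_MB_S,
                              PARALLEL_STARTUP_S, PARALLEL_RATE_MB_S)
    print(f"break-even volume under these assumptions: {crossover:.0f} MB")
```

Under this model the fixed startup cost dominates for small inputs, so the server job finishes first at 1-2 MB, and the higher assumed parallel throughput only pays off past the break-even volume. Where that point falls depends entirely on the figures plugged in, so it needs to be measured case by case.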