Shared containers in parallel -- multiple instances?

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

RodBarnes
Charter Member
Posts: 182
Joined: Fri Mar 18, 2005 2:10 pm

Post by RodBarnes »

If I create a shared container and then use it in four separate jobs, can those jobs run in parallel without conflicting with one another? Asked another way, does each job get its own instance of the shared container?

I've skimmed through the posts on "shared containers" and haven't found a definitive answer to this question. I know that there is value in being able to reuse the container (i.e., not having to rewrite those stages multiple times) but couldn't determine if multiple instances can be run in parallel.

Thanks.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Can they run in parallel? Yes. Without conflicting one another? Depends on what you are doing in the container.

It's just reusable code that gets compiled into your job. It's more about whether your jobs can run concurrently without issue, not just what is in the shared container.
-craig

"You can never have too many knives" -- Logan Nine Fingers
RodBarnes
Charter Member
Posts: 182
Joined: Fri Mar 18, 2005 2:10 pm

Post by RodBarnes »

chulett wrote: Can they run in parallel? Yes. Without conflicting one another? Depends on what you are doing in the container.
Fair enough. My question is more along the general line of "does each instance get its own thread and/or variables? Or are the variables shared?"

To be more specific: I have an update job and an insert job that each read from the same sequential file, use the same lookup hash-files, and update or insert into the same Oracle table. The only real difference is at the beginning: the insert job passes only new records to be processed while the update job passes only existing records. The update job has an additional stage to compare the CRC value to see if anything changed in the current record compared to the existing one in the table.

It seems reasonable to create a shared container that does all this and use that same container in each of the insert and update jobs. So, the insert job would have an input from the sequential file to a test for new record and then into the container. It would have an output to the table.

The update job would have an input from the sequential file to a test for existing records and then into the container. It would have an output to a test for the CRC and then on to update the table for changed records.

Ok, given that, would the fact that these two jobs run at the same time cause issues? It seems like they could run together safely, since the stages within the container would only be doing lookups to transform the record.

Long, I know, but trying to be clear. :)
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

RodBarnes wrote: My question is more along the general line of "does each instance get its own thread and/or variables? Or are the variables shared?"
Ah... not what you asked. :wink:

You are overthinking it. Think of what the two jobs do without regard to the fact that they have a 'shared container' in them. Can you run them at the same time without issue? The fact that some of the code is encapsulated into a shared container plays no role here. Other than to make your job as a developer easier, that is.

As previously noted, they are just reusable code that is compiled into your job. Nothing else is magical about them. So the end result, the job code that ends up actually running, is no different than if you had coded those stages directly into your job.
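A rough analogy for the "compiled into your job" point, sketched in Python rather than DataStage. Everything here is hypothetical (the stage names, `make_stage`, `compile_job`, `run_job`); it only models the idea that each job ends up with its own copy of the container's stages, and says nothing about how the DataStage compiler actually works.

```python
import copy

def make_stage(name):
    """A stage with one piece of private state: a row counter."""
    return {"stage": name, "rows_seen": 0}

# Hypothetical shared container: two stages that do a lookup and a transform.
SHARED_CONTAINER = [make_stage("lookup"), make_stage("transform")]

def compile_job(name, before, after):
    """Model compilation: the container's stages are copied into the job."""
    stages = (
        [make_stage(n) for n in before]
        + copy.deepcopy(SHARED_CONTAINER)   # each job gets its own copy
        + [make_stage(n) for n in after]
    )
    return {"name": name, "stages": stages}

def run_job(job, rows):
    """Pass `rows` records through every stage of this job's own copy."""
    for _ in range(rows):
        for stage in job["stages"]:
            stage["rows_seen"] += 1
    return {s["stage"]: s["rows_seen"] for s in job["stages"]}

insert_job = compile_job("insert_job", ["test_new_record"], [])
update_job = compile_job("update_job", ["test_existing_record"], ["compare_crc"])

insert_counts = run_job(insert_job, 3)
update_counts = run_job(update_job, 5)
print(insert_counts["lookup"], update_counts["lookup"])  # 3 5
```

The counters in one job never show up in the other, and the container definition itself is left untouched. What the sketch does not model is anything external the stages touch, which is where the real "can these jobs run together" question lives.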

Does that help?
-craig
RodBarnes
Charter Member
Posts: 182
Joined: Fri Mar 18, 2005 2:10 pm

Post by RodBarnes »

So a shared container is really just like including code from a shared library; e.g., a set of code that gets compiled into the overall module. You can use that code (and run it) in as many modules as you choose because each one gets its own copy at compile time.

I just wasn't clear whether the container ended up compiled into its own separate component somehow (like on the first compile) and each module then used that same instance. It's just my experience as a Windows SW eng dealing with shared DLLs and such that made me wonder if there was something similar going on here.

Thanks.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

DataStage server uses multiple processes rather than multiple threads, with very few exceptions (for example, the sort engine is multi-threaded). Since separate jobs run in separate processes, you can correctly deduce that their local variables are independent of those in other jobs.
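To make the process point concrete, here is a minimal sketch in Python (not DataStage). It assumes each "job" is simply a separate operating-system process running the same code; `JOB_CODE`, `run_jobs`, and the job names are made up for the illustration.

```python
import subprocess
import sys

# Hypothetical stand-in for a job: the same code for every job,
# with a local counter that lives only inside its own process.
JOB_CODE = """
import sys
counter = 0
for _ in range(int(sys.argv[2])):
    counter += 1
print(f"{sys.argv[1]}:{counter}")
"""

def run_jobs(specs):
    """Start one process per (name, rows) pair at once, then collect output."""
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", JOB_CODE, name, str(rows)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for name, rows in specs
    ]
    results = {}
    for proc in procs:
        out, _ = proc.communicate()
        name, count = out.strip().split(":")
        results[name] = int(count)
    return results

results = run_jobs([("insert_job", 3), ("update_job", 5)])
print(results)
```

Both processes run the same code concurrently, yet each ends with its own `counter` value. Anything outside the process, such as the hash-files or the Oracle table in the jobs described above, is still shared between them.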
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
sb_akarmarkar
Participant
Posts: 232
Joined: Fri Sep 30, 2005 4:52 am

Post by sb_akarmarkar »

RodBarnes wrote: To be more specific: I have an update job and an insert job that each read from the same sequential file, use the same lookup hash-files, and update or insert into the same Oracle table. The only real difference is at the beginning, the insert job passes only new records to be processed while the update job passes only existing records. The update job has an additional stage to compare the CRC value to see if anything changed in the current record compared to the existing one in the table.
I think you are trying to do the insert and update against the Oracle table in the shared container... That is not possible in parallel for jobs... :? If you try to run them in a sequence, it may abort or it may hang... :)


Thanks,
Anupam
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

... but here we're talking about server jobs, so your point is moot.