shared container called more than once in a single job

Posted: Tue Mar 13, 2012 10:34 am
by samyamkrishna
Hi,

I have created a container.

The design is as shown below

Code:


ipSC----->Transformer---------CS--------SF (dd conv=ascii 2>/dev/null)
              |               |
          Constraint          |
          fail if more than   |
          #threshold          SF
              |
              |
              CS

ipSC=Input Link Shared Container
CS= Copy Stage
SF=Sequential File
The filter for one sequential file is dd conv=ascii 2>/dev/null; there is no filter for the other sequential file.

The data on the input link to the shared container is EBCDIC.

Both sequential file names are parameterised.
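For reference, dd's conv=ascii option translates EBCDIC input to ASCII, and 2>/dev/null just discards the block-count summary that dd prints to stderr. A minimal stand-alone sketch of what that stage filter does (input.ebc and output.txt are hypothetical names):

Code:

# conv=ascii translates EBCDIC to ASCII; dd's block-count chatter on
# stderr is discarded, exactly as in the stage's filter option
dd conv=ascii < input.ebc > output.txt 2>/dev/null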

****************************************************


The problem I am facing is that when I use the same shared container in a job more than once, the job hangs.

When the job hangs, the Director log shows only:

main_program: orchgeneral: loaded
orchsort: loaded
orchstats: loaded

There are no locks on the files, because the file names are parameterised, and there are no database stages.

I need to know if there is something I am missing.

Thanks in advance.

Samyam

Re: shared container called more than once in a single job

Posted: Tue Mar 13, 2012 10:36 am
by samyamkrishna
updating the code

Code:


ipSC----->Transformer---------CS--------SF (dd conv=ascii 2>/dev/null)
              |               |
          Constraint          |
          fail if more than   |
          #threshold          SF
              |
              |
              CS

Posted: Tue Mar 13, 2012 12:06 pm
by jwiles
Are you using different filenames for all four sequential file outputs when you have two copies of the shared container?

Posted: Tue Mar 13, 2012 12:50 pm
by samyamkrishna
Yes, the file names are parameterised, and I am passing different values for the two shared containers.

Posted: Tue Mar 13, 2012 10:26 pm
by jwiles
When you are using multiple copies of the shared container, what does your job design look like? Perhaps you are inadvertently creating a deadlock condition.

Regards,

Posted: Wed Mar 14, 2012 5:00 am
by samyamkrishna

Code:


        |--------CI1----------CP
        |
        |--------CI2----------CP
        |            .
        |            .
        |            .
SF------|            .
        |            .
        |            .
        |            .
        |--------CIn----------CP
SF----Sequential File
CI----Column Import
CP----Copy stage

The output link from each Column Import stage goes to a Copy stage, and the reject link from each Column Import stage goes to a shared container.

There are 30 Column Import stages, and the 30 reject links go to 30 shared containers.

The purpose of this job is to check whether the source file is in the correct format.

All 30 CI stages use different schema files to import the data.
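In case it helps to picture it, here is a minimal sketch of what one of the 30 schema files could look like; the file name, field names, and types are made up, not the real layout:

Code:

# Write one (made-up) record layout; each Column Import stage would
# point at a file like this and reject any row that does not parse
cat > layout01.schema <<'EOF'
record
{final_delim=end, delim=','}
(
    account_id: string[max=10];
    amount: decimal[10,2];
)
EOF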

Posted: Wed Mar 14, 2012 3:07 pm
by jwiles
I expect there's more logic between the source sequential file and the column importers :)

When you are running with only one copy of the shared container, are you also running with one CI (that is, only processing one schema)? Or do you still have the multiple CIs and only one SC?

Does the job appear to freeze when you add a second SC, or can you add several (how many?) before it freezes?

Is any other processing occurring within the job (sorts/joins/etc.)?

What degree of parallelism is the job being run with? Have you tried running with a single partition and still encountered the freeze?

Regards,

Posted: Wed Mar 14, 2012 4:30 pm
by samyamkrishna
Hi,

I tried a simple design with the shared container.

Code:

              SC
              |
              |
RG------CP---------SC
              |
              |
              SC

RG = Row Generator
CP = Copy
SC = Shared Container

The Row Generator generates 10 records, and these records are sent into the same shared container three times.

With one shared container it works fine; with two shared containers it works fine.

It hangs at the same point when the third container is added.

The Director log stops at this point:

main_program: orchgeneral: loaded
orchsort: loaded
orchstats: loaded



:( :( :( :( :(

Posted: Wed Mar 14, 2012 4:51 pm
by jwiles
Add the following environment variables to your job to get a closer look at what is happening internally:

$APT_DUMP_SCORE=True
$OSH_DUMP=True
$APT_STARTUP_STATUS=True
$APT_PM_SHOWRSH=True

And to keep operators from being combined:
$APT_DISABLE_COMBINATION=True
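If the variables have been added to the job as parameters, they can also be set for a single run from the command line. A rough sketch with dsjob (MyProject and MyJob are placeholders; check the exact -param syntax for environment variables on your version):

Code:

# Run the job with the debug variables set for this run only
# (assumes they are defined as job parameters; names are placeholders)
dsjob -run \
    -param \$APT_DUMP_SCORE=True \
    -param \$APT_DISABLE_COMBINATION=True \
    MyProject MyJob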

Regards,

Posted: Wed Mar 14, 2012 5:02 pm
by samyamkrishna
OK, I will try it and let you know...

Posted: Thu Mar 15, 2012 7:17 am
by samyamkrishna
Hey, I just added the variables and it ran fine. :D

Can we conclude something from this?

Posted: Thu Mar 15, 2012 7:26 am
by chulett
That your issue is resolved?

In all seriousness, we see this sometimes when operators are not combined. Not sure if that's a bug, or if sometimes the combinations that are done are a little too... overzealous. [shrug]

Posted: Thu Mar 15, 2012 7:39 am
by samyamkrishna
Thanks to the variable

$APT_DISABLE_COMBINATION=True

which made it all possible.

Posted: Thu Mar 15, 2012 7:42 am
by samyamkrishna
And thanks to jwiles too :)

Posted: Thu Mar 15, 2012 7:44 am
by jwiles
Along the subject of Craig's post, try rerunning the job with APT_DISABLE_COMBINATION=False (if you want to; with it working, you probably don't :) ). I suspect it will hang as before, as that's the only change made that would have affected how the job was built and run.

Overzealous combination is sometimes an understatement :)

Regards,