Startup time high in jobs having a DB operation
Moderators: chulett, rschirm, roy
Hi,
Startup time of parallel jobs that contain database lookups and/or inserts is high:
around 40 seconds for startup, while the actual processing time is 5 seconds.
I read the posts here and came across the &PH& directory cleanup. There were around 1600 files, which we cleared.
This did not result in any change in the startup time.
I read about the time involved in establishing a session and authenticating the username and password.
The same jobs against the development database take very little startup time, around 8 to 9 seconds.
So this makes me think that there is something about the DB that is making this happen.
I would like to know if there is some way I can confirm that the DB is causing the increased startup time, or at least rule out my assumption. (Unfortunately, contacting the DBA is not an option, as apparently I have to be 100% sure about what I'm saying before I say it!)
Or is there anything on the installation box that I need to check?
Also, some more information on startup time would be really helpful to me.
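One way to rule the database in or out without involving the DBA is to time a bare connection from the same server, outside DataStage, and compare it against dev. A minimal timing sketch (the `oracledb.connect` call in the comment is an assumption about your client; substitute whatever Oracle driver or `sqlplus` wrapper you actually have installed):

```python
import time

def time_call(label, fn, *args, **kwargs):
    """Time an arbitrary call and report wall-clock seconds."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.2f}s")
    return result, elapsed

# Stand-in for the real connect call. In practice you would pass the
# driver's connect function instead, e.g.:
#   time_call("oracle connect", oracledb.connect,
#             user=..., password=..., dsn=...)
_, elapsed = time_call("dummy connect", time.sleep, 0.1)
```

If a bare connect from the production box takes tens of seconds while the same connect from the dev box is fast, you have hard evidence the delay sits on the database or network side rather than in DataStage.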
Thanks,
Chetan.C
Re: Startup time high in jobs having a DB operation
chetan.c wrote: I read the posts here and came across the &PH& directory cleanup. There were around 1600 files, which we cleared. This did not result in any change in the startup time.
It wouldn't affect the "start up time" of the lookups.
chetan.c also wrote: The same jobs against the development database take very little startup time, around 8 to 9 seconds.
I would imagine this is all about the volume of data in dev versus prod.
-craig
"You can never have too many knives" -- Logan Nine Fingers
Could you be on DB2 by any chance? If so, there is definitely a DB setting (maybe for other DBs too), as we had our Prod machine taking 2+ minutes to start up before the DB guys made some adjustments that sped up the connection.
What would be nice is if DataStage created one connection for all queries to use; we have some jobs with many DB2 stages, and each one has to make its own connection in turn.
Hi Paul and Kryt0n,
Sorry, I didn't mention the DB.
It's Oracle.
Lookup configuration:
Enable partitioned reads: Yes
Partitioned read method: Rowid Hash
Record count: 20000
Array size: 2000
Prefetch count: 1
No before or after SQL statements.
Insert configuration:
Write mode: Insert
Table action: Append
Record count: 2000
Array size: 1
Drop unmatched fields: Yes
No before or after SQL statements.
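As an aside on those settings: the array size governs how many rows travel per client/server round trip, so Array Size 1 on the insert link means one round trip per row. This bears on run time rather than startup, but it is worth double-checking when environments behave differently. A back-of-envelope sketch (pure arithmetic, no database involved):

```python
import math

def round_trips(rows: int, array_size: int) -> int:
    """Number of client/server round trips needed to move `rows` rows."""
    return math.ceil(rows / array_size)

# Lookup side: 20000 rows fetched with array size 2000
print(round_trips(20000, 2000))   # 10 round trips
# Insert side: the same 20000 rows written with array size 1
print(round_trips(20000, 1))      # 20000 round trips
```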
Hi Craig,
Actually, we are using the same load that is used in Dev; there is absolutely no change in the load.
Also, is there anything that needs to be monitored on the box that could lead to this?
Thanks,
Chetan.C
chetan.c wrote: Actually we are using the same load which is used in the Dev. No change absolutely in the load.
If by "load" you mean the data you are loading, that's not what I meant. I meant the amount of data in the target database, in other words the amount of data it needs to cache before the main pipeline can start delivering data.
-craig
"You can never have too many knives" -- Logan Nine Fingers
Hi Craig,
Actually, we are just testing the job runs in this environment, so it's not with actual live data.
The data maintained in the tables of both environments is almost the same (I checked record counts). In fact, in the new environment it's less than in the dev environment.
I understand it's difficult to guess the reason without actually seeing it, but I'm open to all suggestions. Thanks for your patience.
Thanks,
Chetan.C
chulett wrote: ...add $APT_STARTUP_STATUS and $APT_PM_PLAYER_TIMING to your job set to True in both environments. Compare and contrast. ...
Hi Craig,
I added the two environment variables $APT_STARTUP_STATUS and $APT_PM_PLAYER_TIMING to the job, and this is the pattern I found for all DB lookup tables.
Environment One
Code:
DIM_DATES,0: Operator completed. status: APT_StatusOk elapsed: 0.73 user: 0.12 sys: 0.06 (total CPU: 0.18 )
DIM_DATES,1: Operator completed. status: APT_StatusOk elapsed: 0.67 user: 0.09 sys: 0.08 (total CPU: 0.17)
Environment Two
Code:
DIM_DATES,0: Operator completed. status: APT_StatusOk elapsed: 5.28 user: 0.14 sys: 0.06 (total CPU: 0.20)
DIM_DATES,1: Operator completed. status: APT_StatusOk elapsed: 5.32 user: 0.14 sys: 0.06 (total CPU: 0.20)
Could it be that DataStage is not getting a sufficient allocation of CPU because of other processes?
Also in the second environment I see
Code:
repartition(5),0: Operator completed. status: APT_StatusOk elapsed: 4.73 user: 0.00 sys: 0.00 (total CPU: 0.00)
The job is the same one, migrated from Env1 to Env2.
What does the above information mean?
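For what it's worth, the gap between elapsed and total CPU in those lines can be pulled out mechanically: when elapsed is large but CPU is near zero (as in the 5.28s/0.20 lines and the repartition line), the operator was waiting on something external (network, database, other players) rather than being starved of CPU. A small sketch that parses the log format shown above (the regex assumes exactly that format):

```python
import re

LINE_RE = re.compile(
    r"(?P<op>\S+),(?P<part>\d+): Operator completed.*"
    r"elapsed: (?P<elapsed>[\d.]+).*total CPU: (?P<cpu>[\d.]+)"
)

def wait_time(log_line: str) -> float:
    """Seconds the operator spent waiting rather than computing."""
    m = LINE_RE.search(log_line)
    if not m:
        raise ValueError("unrecognised log line")
    return float(m.group("elapsed")) - float(m.group("cpu"))

line = ("DIM_DATES,0: Operator completed. status: APT_StatusOk "
        "elapsed: 5.28 user: 0.14 sys: 0.06 (total CPU: 0.20)")
print(f"waiting: {wait_time(line):.2f}s")  # waiting: 5.08s
```

Roughly 5 seconds of pure waiting per lookup operator in Environment Two, versus about half a second in Environment One, points back at connection or network latency rather than CPU contention on the DataStage box.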
Thanks,
Chetan.C