Rows/Sec
Moderators: chulett, rschirm, roy
Hi all,
I have two questions here.
1. Suppose the load to the target database is around 100-400 rows every day. If we define the array size and transaction size somewhere near 10,000-15,000, will this affect the job performance adversely?
2. I have a job in which we are dealing entirely with DB2 databases; there are no hashed files or sequential files.
I am reading from a DB2 stage (it is a view, 1,000-2,000 records/day), and if there are any error records I write them to an error table (0-10 records/day). I have a lookup on the target table through which I check if the record is already present in the target. If yes, then I have another lookup on the view, which tells me the changed column value, and using this I update my target database. Else, I insert it.
Now what happens is: if I get around 100-300 records a day for update, the load finishes in 15 minutes; otherwise it takes 45 minutes to an hour for around 700 updates.
I have checked the forum thoroughly and made as many changes as I can to the array/transaction sizes.
The inserts and updates have array/transaction sizes of 1 and 1.
The view lookup has a prefetch of 1 row and the target dimension table lookup has a prefetch value of 100 rows.
Please tell me what I can do to improve the performance.
After a certain time, the rows/sec goes to 0 rows/sec everywhere in the job.
Thanks
Prathyusha
Doing a lookup directly against the database is your bottleneck. Unload your table to a hashed file and reference that hashed file instead. You should see a boost in your job performance.
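The gap between the two approaches is easy to demonstrate outside DataStage. Below is a minimal Python sketch of the idea, with sqlite3 standing in for DB2 and made-up table and key names: the per-row version issues one query (one network round trip, on a real database) per input record, while the hashed-file style unloads the reference table once into an in-memory dictionary and does every lookup locally.

```python
import sqlite3

# Build a stand-in reference table (sqlite3 replaces DB2 here;
# target_dim, biz_key and col_val are invented names).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE target_dim (biz_key TEXT PRIMARY KEY, col_val TEXT)")
con.executemany("INSERT INTO target_dim VALUES (?, ?)",
                [(f"K{i}", f"V{i}") for i in range(2000)])

incoming = [f"K{i}" for i in range(700)]  # ~700 records, as in the post

# Per-row lookup: one query per input row.
def lookup_per_row(keys):
    out = []
    for k in keys:
        row = con.execute(
            "SELECT col_val FROM target_dim WHERE biz_key = ?", (k,)).fetchone()
        out.append(row[0] if row else None)
    return out

# Hashed-file style: unload the table once, then look up in memory.
def lookup_preloaded(keys):
    cache = dict(con.execute("SELECT biz_key, col_val FROM target_dim"))
    return [cache.get(k) for k in keys]

# Both return the same answers; only the number of round trips differs.
assert lookup_per_row(incoming) == lookup_preloaded(incoming)
```

In-process, the difference is modest; across a network to a real DB2 server, the per-row version pays latency 700 times where the preloaded version pays it once.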
P.S.: Job tuning may, and sometimes will, require design changes. Make that clear to your client.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
Well... you are severely hampered in your 'tuning' ability if you "can't change the job design".
Those direct database lookups (which it sounds like are being done twice) are the least performant way the job could have been designed. 45 minutes to an hour to process 700 records is pretty bad. About all you can do is ensure the lookup queries are as efficient as possible. Heck, since that seems to be your only avenue of attack, all of your SQL should be tuned to the best of your ability.
-craig
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Again, ensure that all of your queries are as efficient as possible. I suspect they can be improved because your current performance is horrid, but who knows. For example, check whether any new indexes would help.
Either that or get a waiver from the Powers That Be. Redesign the jobs.
-craig
"You can never have too many knives" -- Logan Nine Fingers
pratyusha wrote: All the other dimensions have the same design. The size of the data is more or less the same in each dimension, so we are clueless.
Even though you are "not allowed" to change the job design, were you able to try a test implementation of this job using a hashed file lookup?
Management sometimes accepts such "tuning" efforts and timelines if they see a dramatic improvement in performance.
If you can create 2 test jobs quickly to show the numbers, it should help your cause.
Job1: No hash lookup, but pure table joins and write to target table or file
Job2: Using hash lookup and write to a target table or file.
Good luck,
-V
pratyusha wrote: Hi all,
Thanks for the replies.
All the other dimensions have the same design.
The size of the data is more or less the same in each dimension.
So we are clueless.
You could try a few more options to identify where exactly you have the problem.
1) Remove the lookup and do a straight DB update/insert (this change is from the original job).
2) Keep a sequential file stage instead of the DB update/insert (this change is from the original job).
3) Try to split the insert and update separately, if it is an upsert.
I guess by doing this you would be able to identify the issue.
Hmm, I will fill this in whenever I get one.
Hi
Thanks for the suggestions.
I have tried all the options mentioned and also made some changes to the commit level, which brought the run time down by about ten minutes.
One more thing I would like to clarify here:
They are using a routine for generating an ERR_ID. This routine interacts with the database, invokes a sequence, gets the value from it, and assigns it to the column.
Please find the code below:
If IsNull(Db2GetMaxColVal($p_TargetServerNm, $p_TargetUserNm, $p_TargetUserPwd, 'Err_Id', 'Etl_Job_Err_Log')) Then 1 Else Db2GetNextSeqVal($p_TargetServerNm, $p_TargetUserNm, $p_TargetUserPwd, 'Seq_Err')
Most of the time, more than 2,000 records come through this routine.
Will the execution of this routine slow down the process?
Because when I look at the log, there is a big time lag at this point.
Please advise.
Thanks
Prathyusha
pratyusha wrote: They are using a routine for generating an ERR_ID. This routine interacts with the database, invokes a sequence, gets the value from it, and assigns it to the column.
Please find the code below:
If IsNull(Db2GetMaxColVal($p_TargetServerNm, $p_TargetUserNm, $p_TargetUserPwd, 'Err_Id', 'Etl_Job_Err_Log')) Then 1 Else Db2GetNextSeqVal($p_TargetServerNm, $p_TargetUserNm, $p_TargetUserPwd, 'Seq_Err')
Most of the time, more than 2,000 records come through this routine.
Will the execution of this routine slow down the process?
Because when I look at the log, there is a big time lag at this point.
Clearly the answer to your time-lag question is yes; your monitoring shows this. But if you're not permitted to change the design, there's nothing you can do to address it.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Here, let me quote him for you (I don't think he'll mind). It's nothing new or anything we haven't said over and over already:
ray.wurlod wrote:<snip> But, if you're not permitted to change the design, there's nothing you can do to address this
-craig
"You can never have too many knives" -- Logan Nine Fingers