Job taking a long time to execute

Post questions here relating to DataStage Enterprise/PX Edition, in areas such as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

devesh_ssingh
Participant
Posts: 148
Joined: Thu Apr 10, 2008 12:47 am

Job taking a long time to execute

Post by devesh_ssingh »

Hi mentors,

One of my jobs has two joins in sequence, with both the reference and the source sorted on the key, as shown below. I am performing a left outer join on both occasions.


                     db2_1                  db2_2       rej1
                     |4 bill                    |2.5 mill  |
                  sort_2                  sort_4           |
                     |                           |         |
src-->trx1-->sort_1--->join1----trx2--sort_3-->join2--->trx3-->Load_OP
                                                           |
                                                           |
                                                           rej2

The source is approximately 2 million rows.

Every sort stage uses hash partitioning and sorts on the key column. The joins use the same partitioning.

This job takes approximately 4 hours to execute. What is wrong with the design? Please suggest some changes.

Also, in trx3 I am capturing the unmatched data coming out of both joins and passing only the matched rows to the output, through constraints like:

rej1: mer_id (int 10) = 0 (this is achieved)
rej2: IP_ID (int 10) = 0 (this is achieved too)
Load_OP: mer_id <> 0 or IP_ID <> 0 (this one is not; it still passes records that have a value of 0. Please suggest how I can achieve this.)
As far as I know, unmatched integer columns are populated with 0.
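
Looking at it again: should the Load_OP constraint use And rather than Or, so that only rows matched in both joins pass? Something like this (just a sketch, assuming unmatched integer keys really do default to 0):

Code: Select all

rej1:     mer_id = 0                      (unmatched in join1)
rej2:     IP_ID = 0                       (unmatched in join2)
Load_OP:  mer_id <> 0 And IP_ID <> 0      (matched in both joins)

With Or, a row that matched in either one of the joins still passes, which would explain the records with a 0 key reaching Load_OP.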
devesh_ssingh
Participant
Posts: 148
Joined: Thu Apr 10, 2008 12:47 am

Post by devesh_ssingh »

The job design diagram didn't come through properly, so to restate:

I have two joins in the job. Both the reference and the source are fed through sort stages (hash partitioned and sorted on the key), and the joins are given the same partitioning on both reference and source.

Join 2 performs a similar operation.

The last transformer has three output datasets:


rej1: mer_id (int 10) = 0 (this is achieved)
rej2: IP_ID (int 10) = 0 (this is achieved too)
Load_OP: mer_id <> 0 or IP_ID <> 0 (this one is not; it still passes records that have a value of 0. Please suggest how I can achieve this.)
As far as I know, unmatched integer columns are populated with 0.

The job is taking a long time to execute. Please advise.

ref1 and ref2 have approximately 4 billion and 2.5 million rows respectively.

The source has around 2 million rows.

Thanks in advance
Devesh
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Use Code tags to get the layout right. Then it might be clearer what you're trying to accomplish. Please also advise what the join keys are for each of the Join stages.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
devesh_ssingh
Participant
Posts: 148
Joined: Thu Apr 10, 2008 12:47 am

Post by devesh_ssingh »

Code: Select all

                     db2_1                  db2_2       rej1                      
                     |4 bill                    |2.5 mill  |
                  sort_2                  sort_4           |
                     |                           |         |
src-->trx1-->sort_1--->join1----trx2--sort_3-->join2--->trx3-->Load_OP
                                                           |
                                                           |	
                                                           rej2                               
Join1 key is Mer_no (char 10).
Join2 key is MKT_code (char 3).
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

Do you have an understanding of the actual volume of data (number of bytes) represented by both your main dataset and your reference data? Volume = record_width * number_of_records.
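
For example (the 100-byte record width here is purely an assumption for illustration):

Code: Select all

4,000,000,000 records x 100 bytes/record = 400,000,000,000 bytes ~ 400 GB

All of that must pass through the sort before the first join can emit a single row.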

You have 4 billion records going into the first join, being sourced by DB2. How long does it take to read and sort just that data? (DB2->Sort->Copy) Do those records contain only the data required to join to and apply to the main records?

How many degrees of parallelism is your job running with?

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
devesh_ssingh
Participant
Posts: 148
Joined: Thu Apr 10, 2008 12:47 am

Post by devesh_ssingh »

Whatever I explained above is the actual volume of data.

Also, the transformers are not just copies; they carry out some transformations.

It's running on a 4-node configuration.

The first join reads 4567 rows/sec from the reference. It seems that after the first join performance degrades drastically; after trx2 the rate is about 54 rows/sec.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

It doesn't degrade at all, actually. But the statistics are misleading. The downstream link from the Join stage started its clock (the "seconds" part of rows/sec) running as soon as the job started, but didn't actually receive any rows until the sort was complete, at least on one partition. That's why rows/sec appears low. It's one of the many reasons I counsel not to use rows/sec as a measure of ETL performance.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

Are the counts you provided number of records or number of bytes? In your first join, which is your main data and which is your reference data? 4,000,000,000 rows at 4567/sec ~ 10 days.

Are the extracts from DB2 running in parallel?

Is there a way you can limit the number of rows returned by the first DB2 extract (currently 4 billion rows)? Can you use a where clause in SQL, perhaps?
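
Even if every row is needed, trimming the select list to just the join keys and carried columns reduces the bytes sorted. A sketch (table and column names are placeholders, not your actual schema):

Code: Select all

SELECT mer_no, only_the_columns_the_job_actually_uses
FROM   big_ref_table
WHERE  some_limiting_predicate_if_one_exists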

Have you done any tuning on the sort for the 4 billion rows? Try adjusting the memory usage above the default 20MB/partition; 256MB per partition is a reasonable starting point.

Can you run with more than four partitions?

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
Mike
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

Are your join keys unique? Are you removing duplicates in your sorts? 4 billion unique values in a char(10)? Maybe. 2.5 million unique values in a char(3)? That seems unlikely.

How many rows arrive at Load_OP? Replace Load_OP with a copy stage having no outputs and see what happens.

Use the DB2 connector stage set to do partitioned reads. There are a couple of options available even for a table that is not partitioned in the database.
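
If you prefer user-defined SQL, one common pattern is a modulus split on a numeric column; the connector can substitute node placeholders at runtime when partitioned reads are enabled. This is only a sketch, so check the connector documentation for your version (table and column names are placeholders too):

Code: Select all

SELECT mer_no, mkt_code, other_needed_columns
FROM   big_ref_table
WHERE  MOD(numeric_key_column, [[node-count]]) = [[node-number]]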

Increase the parallelism. Four nodes may be a bit light for 4 billion rows.

Mike
devesh_ssingh
Participant
Posts: 148
Joined: Thu Apr 10, 2008 12:47 am

Post by devesh_ssingh »

@jwiles

The counts are numbers of records.
As shown in the figure, db2_1 is my reference data.

I shouldn't limit it, since I need to see all the data in the table.

I am not sure how to adjust the memory usage, because we are restricted from changing any environment settings.
Also, our staging server is configured with only 4 nodes.

What I have done so far is split it into two jobs, and now I can see some improvement, but no radical change.
I think the sorts need to be given more memory to perform the sort operation.

@Mike...
The join keys are not unique.
I am already using the DB2 connector stage.
Do you think I should increase the array size? The default is 2000; I can try 10000.
Also, please let me know if there is any way to improve parallelism while reading through the DB2 connector. It might help me improve performance in other jobs as well.

@Ray
So how are we supposed to check performance if the statistics are misleading?
Until now I have been monitoring jobs this way, and many times it has helped me figure out problems.



Thanks!!!!!
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

Thank you for verifying that the values are record counts. I'll just forget about the number of bytes, as it looks like I'll never get that answer.

To adjust sort memory usage, set the "Restrict Memory Usage" option on the sort stage. The default value is 20 (20MB), set it to 256 (256MB) on the sort stage for the 4 billion records. This will reduce the number of temp files created by that stage by a factor of 12-13, which can have a significant effect on sorting large volumes of data. This will also increase the memory usage on your staging server, so if that server is limited in memory you may want to try a smaller value to avoid forcing the O/S into paging.
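
Since you mentioned being restricted from environment settings: "Restrict Memory Usage" is a stage option, set directly in the Sort stage properties, so no administrator change should be needed. Roughly (exact property placement may vary by version):

Code: Select all

Sort stage -> Properties -> Options -> Restrict Memory Usage (MB) = 256

If your shop does allow job-level environment variables, APT_TSORT_STRESS_BLOCKSIZE is, as far as I recall, the equivalent knob; verify against your version's documentation.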

The rows/sec count can be misleading if you don't understand how the numbers shown are calculated. They are NOT instantaneous values, although most users seem to think so initially. They are the AVERAGE rows/sec that has passed through the link since the job started. Therefore, the displayed value is affected by how quickly the upstream stages can complete. You can manually calculate a more accurate average by noting when a stage starts writing data to the output link.
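
As a worked example with made-up numbers: suppose the job has been running for 4 hours, the upstream sort took 3.5 of those hours, and 2 million rows have crossed the link since the join started emitting:

Code: Select all

displayed : 2,000,000 rows / 14,400 sec (since job start)  ~   139 rows/sec
actual    : 2,000,000 rows /  1,800 sec (since first row)  ~ 1,111 rows/sec

Same link, same data; the displayed figure is simply averaging over the time the link spent waiting.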

What is the DB2 version and O/S for the database server?

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.