Hashed File loading getting slow
Moderators: chulett, rschirm, roy
Hi All,
I have a problem loading data into a hashed file from an Oracle database.
It was working fine before, but since last week it has been taking much longer to load.
Job design:
Oracle database (ORAOCI9) ---> Hashed File stage (the hashed file is used as a lookup later in the design)
What I have tried:
1) Read from Oracle with the transformer constraint set to @NULL (so no rows are written); it reads at normal speed (about 3000 rows/sec).
2) Read from Oracle --> transformer --> sequential file (about 2000 rows/sec).
3) Deleted the hashed file and created a new one.
4) Verified all buffering options at the project level; they are unchanged.
Note: there has been no change to the job design.
But the result is the same: loading at about 300 rows/sec.
So is there any way to tune the hashed file, or how can I debug this issue?
Any help would be appreciated.
Thanks.
-
- Premium Member
- Posts: 210
- Joined: Wed Feb 16, 2005 7:17 am
Please provide more information:
1. Is it a static or dynamic hashed file?
2. How many records are to be loaded in total?
3. After approximately how many records does the rows/sec value dip (even if it dips gradually)?
Moreover, if it is just an @NULL constraint on a column, what difference would it make to put the constraint in the SQL and remove the transformer?
Stop trying to be perfect… let's evolve.
-
- Premium Member
- Posts: 210
- Joined: Wed Feb 16, 2005 7:17 am
Right... it could be a tuning issue.
First of all, what exactly are you using your transformer for? (Sorry, but it is not clear from your post.)
Then:
What record sizes are you loading into your hashed file?
Did you create the hashed file with all the default parameters?
From your initial post, I understand that writing to a sequential file gives you about 2000-3000 rows/sec, but doing the same thing to a hashed file makes the rate dip. Is my understanding of the order of execution you described correct?
Thanks
Stop trying to be perfect… let's evolve.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
This is expected behaviour if write cache is used.
Running data into the cache is very fast. Once all the rows are processed, the cache must be flushed to disk. However, the clock keeps running during this process, so the rows/sec figure (which I have already decried as meaningless) appears to fall.
The same condition obtains if the volume of data exceeds the size of the cache, and DataStage begins to write directly to disk.
Finally, you would expect the rate to fall even if the write cache is disabled. As the dynamic hashed file grows, writes are random to a larger population of groups, so there is more (and larger) seek activity in the file system as adjacent records are written to non-adjacent groups.
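The cache-flush effect described above can be shown with a quick back-of-the-envelope sketch. All the row counts and timings below are invented for illustration only, not measured from DataStage:

```python
# Why the rows/sec figure appears to fall when a write cache is flushed.
# The numbers below are made-up assumptions for illustration only.

rows = 100_000
cache_rate = 3000.0             # rows/sec while streaming into the cache
cache_secs = rows / cache_rate  # time spent filling the cache
flush_secs = 60.0               # assumed time to flush the cache to disk

# The monitor divides total rows by total elapsed time, flush included,
# so the displayed rate drops even though the writes themselves were fast.
apparent_rate = rows / (cache_secs + flush_secs)
print(f"rate while caching : {cache_rate:.0f} rows/sec")
print(f"apparent final rate: {apparent_rate:.0f} rows/sec")
```

The point is only that the quoted rows/sec is an average over the whole elapsed time, so any end-of-job flush drags it down without any write actually being slow.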
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Premium Member
- Posts: 210
- Joined: Wed Feb 16, 2005 7:17 am
aakashahuja wrote: Right... could be a tuning issue....
I am using the transformer only for debugging purposes; I don't really need it in my job design. My hashed file was created with the default parameters, so can you describe what the solution would be? ("Allow stage write cache" is not enabled.)
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
The only "solution" involves managing your expectations (in terms of elapsed time) better and realizing that rows/sec is an almost entirely meaningless metric in this context.
Rows/sec doesn't matter if you're not watching the job. So don't watch it.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Premium Member
- Posts: 210
- Joined: Wed Feb 16, 2005 7:17 am
First, you need to work out the Minimum Modulus and Group Size you are going to need.
I will tell you my way:
I take the record size, divide 2048 by the record size, and then divide 4096 by the record size. Whichever division leaves the lower remainder, I choose the corresponding group size.
P.S.: The record size will not simply be the size implied by your data types. I assume you know the details of how to calculate it.
Then you simply need to calculate the modulus using this formula:
Modulus = (Number of rows * Average record length) / (Group size * 2048 * 0.8)
To the calculated value you can add another 10% or 20% if your data is expected to grow.
Hope it helps.
Cheers
Aakas
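The sizing method above can be sketched in Python. This is only a sketch under my assumptions: group size 1 means 2048-byte groups, group size 2 means 4096-byte groups, and the final constant in the formula (truncated in the original post) is taken to be a 0.8 load factor:

```python
# Sketch of the hashed-file sizing method described above.
# Assumptions: group size 1 = 2048-byte groups, 2 = 4096-byte groups,
# and an 0.8 (80%) load factor in the modulus formula.

def choose_group_size(record_size: int) -> int:
    """Pick the group size whose byte size leaves the smaller remainder."""
    rem_2048 = 2048 % record_size
    rem_4096 = 4096 % record_size
    return 1 if rem_2048 <= rem_4096 else 2

def minimum_modulus(num_rows: int, avg_record_length: float,
                    group_size: int, growth: float = 0.0) -> int:
    """Modulus = (rows * avg length) / (group size * 2048 * 0.8),
    optionally inflated by a growth fraction (e.g. 0.2 for 20%)."""
    modulus = (num_rows * avg_record_length) / (group_size * 2048 * 0.8)
    return int(modulus * (1 + growth)) + 1

gs = choose_group_size(300)
print("group size:", gs)
print("minimum modulus:", minimum_modulus(1_000_000, 300, gs, growth=0.2))
```

For example, with 300-byte records, 4096 % 300 leaves a smaller remainder than 2048 % 300, so the sketch picks group size 2 before computing the modulus.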
Stop trying to be perfect… let's evolve.