Poor performance from hashed file following RedHat upgrade

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

PaulS
Premium Member
Posts: 45
Joined: Fri Nov 05, 2010 4:38 am

Poor performance from hashed file following RedHat upgrade

Post by PaulS »

Hi,

We have recently upgraded the OS from Red Hat 5.5 to 5.9, and one job in particular has dramatically slowed down. Lots of stages write to one hashed file, and it's this hashed file that is giving poor performance.

It was being created as dynamic, so I've used the HFC to get the settings for a static file, which were:
CREATE.FILE HashFileName 5 589163 1 32BIT
Unfortunately that was worse.
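As a sanity check on that modulus, here is a rough, purely illustrative sketch (in Python) of the sizing arithmetic, fed with the ANALYZE.FILE figures shown below. The 12-byte per-record overhead and the 80% target load are assumptions for the example rather than HFC's actual formula, so it will not reproduce 589163 exactly.

Code: Select all

# Back-of-envelope static hashed file sizing from the ANALYZE.FILE output.
# Assumptions: 12 bytes of per-record overhead, an 80% target load factor,
# and 2048-byte groups (the group size of the existing dynamic file).
records       = 7_370_417
data_bytes    = 178_568_525      # "Total size of record data"
id_bytes      = 489_893_515      # "Total size of record IDs"
group_bytes   = 2_048            # assumed group/buffer size
per_rec_extra = 12               # assumed per-record header overhead
target_load   = 0.80             # assumed target load factor

payload = data_bytes + id_bytes + records * per_rec_extra
min_modulus = payload // int(group_bytes * target_load) + 1
print(f"estimated minimum modulus: {min_modulus:,}")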

Here are the stats from before testing, i.e. when it was still a dynamic file:

Code: Select all

ANALYZE.FILE HashFileName STATS
name ....................... HashFileName
Pathname ................... HashFileName
File Type .................. DYNAMIC
NLS Character Set Mapping .. NONE
Hashing Algorithm .......... GENERAL
No. of groups (modulus) .... 402960 current ( minimum 1, 3 empty,
                                          110195 overflowed, 5151 badly )
Number of records .......... 7370417
Large record size .......... 1628 bytes
Number of large records .... 0
Group size ................. 2048 bytes
Load factors ............... 90% (split), 50% (merge) and 80% (actual)
Total size ................. 1061656576 bytes
Total size of record data .. 178568525 bytes
Total size of record IDs ... 489893515 bytes
Unused space ............... 393190440 bytes
Total space for records .... 1061652480 bytes
File name .................. HashFileName
                             Number per group ( total of 402960 groups )
                             Average    Minimum    Maximum     StdDev
Group buffers ..............    1.28          1          3       0.45
Records ....................   18.29          1         60       8.31
Large records ..............    0.00          0          0       0.00          
Data bytes .................  443.14         18       1469     202.04
Record ID bytes ............ 1215.74         53       3939     555.54
Unused bytes ...............  953.41         12       2116     478.20
Total bytes ................ 2612.29       2048       6144       0.00

                             Number per record ( total of 7370417 records )
                             Average    Minimum    Maximum     StdDev
Data bytes .................   24.23         21         33       2.07
Record ID bytes ............   66.47         47         87       8.38    
File name .................. HashFileName
                         Histogram of records and ID lengths

                                                                      100.0%
    Bytes ------------------------------------------------------------------

  up to 4|
  up to 8|
 up to 16|
 up to 32|  
 up to 64|
up to 128| >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
up to 256|
up to 512|
 up to 1k|
 up to 2k|
 up to 4k|
 up to 8k|
up to 16k|
     More|
          ------------------------------------------------------------------
The hashed file looks like this...

Code: Select all

HashFileName
Column name   Key   SQL Type   Length   Scale   Nullable
FIELD1         Y    Varchar         6            No
FIELD2         Y    Varchar        11            Yes
FIELD3         Y    Decimal        38            Yes
FIELD4         Y    Varchar        64            Yes
FIELD5         Y    Varchar        10            Yes
FIELD6         Y    Varchar        15            Yes
FIELD7         Y    Varchar        50            Yes
FIELD8         Y    Varchar        12            Yes
FIELD9         Y    Varchar         3            No
FIELD10        N    Decimal        38      2     Yes
FIELD11        N    Decimal        38            Yes
FIELD12        N    Decimal        38      2     Yes
Any help on this would be very much appreciated.

Thanks in advance

Paul
PaulS
Premium Member
Posts: 45
Joined: Fri Nov 05, 2010 4:38 am

Post by PaulS »

I'm sure ArndW replied to this message - Admins, any idea where it's gone?!

Paul
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

I know I responded as well, and that PaulS posted a response to that, so somehow two messages have been:

(a) redacted due to security concerns, or
(b) deleted due to bad language, or
(c) misplaced onto some other forum, or
(d) a loop and fold in the time-continuum have removed their existence.
PaulS
Premium Member
Posts: 45
Joined: Fri Nov 05, 2010 4:38 am

Post by PaulS »

Please let it be (d).. maybe it'll loop back to when the sun always shone, Texan bars were just 10p and I'd never heard of DataStage!
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

I can't recall exactly what I recommended - either a bigger MINIMUM.MODULUS if staying with a Type 30, or choosing a better hashing method on the key. Oh, that was it - a DataStage hashed file with multiple keys is stored with all the key columns concatenated together with @FM into one key. If only the last 2 bytes of this are used to create the hash, you will get a bad distribution of keys and thus a lot of empty groups and a lot of overflowing ones. In this case it would be the last characters of FIELD9 that get used.
Which part of the key to use for hashing depends upon your data contents for FIELD1 through FIELD9.
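To make the distribution point concrete, here is a toy simulation (Python, not the actual UniVerse GENERAL algorithm) comparing a hash driven only by the last two bytes of the @FM-joined key against one that mixes in every byte. The column contents and cardinalities below are invented for illustration; only the joining of key columns with @FM (CHAR(254)) mirrors how multi-column hashed file keys are built.

Code: Select all

# Toy demonstration of why hashing only the tail of a concatenated key
# crams records into a handful of groups. Key contents are made up.
from collections import Counter
import random

FM = chr(254)      # @FM, the field mark used to join multi-column keys
MODULUS = 1_000    # toy modulus, far smaller than the real file's

def make_key(i):
    # Stand-ins for a few of FIELD1..FIELD9; the final column is a short,
    # low-cardinality code like FIELD9 in the posted layout.
    return FM.join([f"{i:06d}",
                    f"ACCT{i % 50:07d}",
                    f"CODE{random.randint(0, 2)}"])

def tail_hash(key, modulus):
    # Hash driven only by the last 2 bytes of the key.
    return sum(ord(c) for c in key[-2:]) % modulus

def whole_key_hash(key, modulus):
    # Simple polynomial hash over every byte of the key.
    h = 0
    for c in key:
        h = (h * 31 + ord(c)) % modulus
    return h

keys = [make_key(i) for i in range(100_000)]
for name, fn in (("tail-only", tail_hash), ("whole-key", whole_key_hash)):
    groups = Counter(fn(k, MODULUS) for k in keys)
    print(f"{name:9s}: {len(groups):4d} of {MODULUS} groups used, "
          f"largest group holds {max(groups.values()):,} records")

In a run like this the tail-only variant lands every record in just three groups while the whole-key variant spreads them across nearly all one thousand, which is the kind of skew a poorly chosen hashing method produces on a real file.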
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Wow... or simply (e) a moderator moved it to the proper forum. I can't help that the way Ray moves topics between forums nukes the 'moved' notice. :P

Please take the discussion back over there, then I'll remove this one.
-craig

"You can never have too many knives" -- Logan Nine Fingers