analyze.shm

Post questions here relating to DataStage Server Edition, in areas such as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

attu
Participant
Posts: 225
Joined: Sat Oct 23, 2004 8:45 pm
Location: Texas

analyze.shm

Post by attu »

Hi,
I want to find out how my hashed file is tuned.
I am trying to run the analyze.shm command, but it is not in the VOC.
How do I add it to the VOC?

Thanks
attu
Participant
Posts: 225
Joined: Sat Oct 23, 2004 8:45 pm
Location: Texas

Post by attu »

Sorry, that should have been ANALYZE.FILE, not analyze.shm.
attu
Participant
Posts: 225
Joined: Sat Oct 23, 2004 8:45 pm
Location: Texas

Post by attu »

Can someone tell me the syntax for this command?

Code: Select all

>ANALYZE.FILE
File name        =  "/dsadm/hash/myhashfile"
Must specify file name.
narasimha
Charter Member
Charter Member
Posts: 1236
Joined: Fri Oct 22, 2004 8:59 am
Location: Staten Island, NY

Post by narasimha »

First establish a pointer in the VOC by issuing the command

Code: Select all

SETFILE /dsadm/hash/myhashfile myhashfile;
Next

Code: Select all

ANALYZE.FILE myhashfile;
Narasimha Kade

Finding answers is simple, all you need to do is come up with the correct questions.
attu
Participant
Posts: 225
Joined: Sat Oct 23, 2004 8:45 pm
Location: Texas

Post by attu »

narasimha wrote: First establish a pointer in the VOC by issuing the command

Code: Select all

SETFILE /dsadm/hash/myhashfile myhashfile;
I get this message

what do you want to call it in your VOC file =
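
At that prompt you can simply type the VOC name you want (for example, myhashfile). To avoid the prompt altogether, supply both arguments on the command line. A sketch, assuming a UniVerse-style SETFILE; the OVERWRITING keyword, which replaces any existing VOC entry of that name, is worth verifying in your environment:

Code: Select all

SETFILE /dsadm/hash/myhashfile myhashfile OVERWRITING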
attu
Participant
Posts: 225
Joined: Sat Oct 23, 2004 8:45 pm
Location: Texas

Post by attu »

Thanks.

Here is the output.

File type .................. DYNAMIC
Hashing Algorithm .......... GENERAL
No. of groups (modulus) .... 12003 current ( minimum 1 )
Large record size .......... 1628 bytes
Group size ................. 2048 bytes
Load factors ............... 80% (split), 50% (merge) and 80% (actual)
Total size ................. 32661504 bytes


Is it badly tuned?
narasimha
Charter Member
Charter Member
Posts: 1236
Joined: Fri Oct 22, 2004 8:59 am
Location: Staten Island, NY

Post by narasimha »

attu,

That would depend on your requirements.
There is a small utility called HFC.exe available on your DataStage installation CD. It can help you tune your hashed file.
Narasimha Kade

Finding answers is simple, all you need to do is come up with the correct questions.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Lose the semi-colons.
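That is, using the same path and name as above:

Code: Select all

SETFILE /dsadm/hash/myhashfile myhashfile
ANALYZE.FILE myhashfile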
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
attu
Participant
Posts: 225
Joined: Sat Oct 23, 2004 8:45 pm
Location: Texas

Post by attu »

Thanks Narasimha.

The issue is that we are doing lookups from hashed files and the throughput is very low, around 17 rows/sec.

Is there an issue with the hashed file index, or do we need to recreate the hashed files?

Is there anything else I can do to improve the performance?

Thanks
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Without the STATISTICS keyword, ANALYZE.FILE reports only the tuning settings (the parameters that can be set when the hashed file is created). There is no way to tell from that whether the hashed file is well tuned.

Add this keyword to have sizing information reported:

Code: Select all

ANALYZE.FILE myhashedfile STATISTICS

Note, however, that a dynamic hashed file is a moving target; as the data volume to be stored in it changes, it will automatically alter its shape (in particular the number of groups, or modulus).

Therefore "tuned" is an ephemeral characteristic.
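If the concern is a file that has to split its way up from a small modulus after every clear and reload, the minimum modulus of a dynamic file can be raised so that it starts at a sensible size. A sketch, assuming the UniVerse CONFIGURE.FILE verb is available on your release (verify in your environment):

Code: Select all

CONFIGURE.FILE myhashedfile MINIMUM.MODULUS 12003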
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
attu
Participant
Posts: 225
Joined: Sat Oct 23, 2004 8:45 pm
Location: Texas

Post by attu »

Thanks, Ray. I ran it as you suggested.

Here is the output.

File type .................. DYNAMIC
Hashing Algorithm .......... GENERAL
No. of groups (modulus) .... 12003 current ( minimum 1, 0 empty,
3896 overflowed, 1 badly )
Number of records .......... 297932
Large record size .......... 1628 bytes
Number of large records .... 0
Group size ................. 2048 bytes
Load factors ............... 80% (split), 50% (merge) and 80% (actual)
Total size ................. 32661504 bytes
Total size of record data .. 18122595 bytes
Total size of record IDs ... 1787593 bytes
Unused space ............... 12747220 bytes
Total space for records .... 32657408 bytes
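
If I am reading this right, the actual load works out to (record data + record IDs) / (groups x group size) = (18122595 + 1787593) / (12003 x 2048), or about 81%, which lines up with the 80% actual load factor reported - assuming that is how the load is computed.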

Any advice?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Yes. Never react to a single sample. Monitor over time - maybe four weekly samples.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
narasimha
Charter Member
Charter Member
Posts: 1236
Joined: Fri Oct 22, 2004 8:59 am
Location: Staten Island, NY

Post by narasimha »

attu wrote: No. of groups (modulus) .... 12003 current ( minimum 1, 0 empty,
3896 overflowed, 1 badly )
What does "badly" mean in the above context?
Narasimha Kade

Finding answers is simple, all you need to do is come up with the correct questions.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

"Badly" means more than one secondary (overflow) buffer in the group. One group out of 12003 is not a problem.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
attu
Participant
Posts: 225
Joined: Sat Oct 23, 2004 8:45 pm
Location: Texas

Post by attu »

Thanks for the info, guys.
Our issue is still not resolved. We are doing a couple of lookups using hashed files and the throughput is very slow, around 17 rows/sec.
I tried tuning the performance by increasing row buffering to 1024 KB, but it still does not help.
What other options do I have? We are using a dynamic (type 30) hashed file; can I resize it? Memory on the box is also at 100% when I run nmon, and the hashed files are using the pre-load file to memory option; can I disable that?

My job design is

Code: Select all


I/P --\
       \
I/P --> Link Collector --> Transformer <-- Hashed File
                               |
                           Transformer <-- Hashed File
                               |
                  HF -->   Transformer <-- Hashed File
                               |
                  HF -->   Transformer
                               |
                            Seq File
Appreciate your input.
Thanks
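
If resizing or rebuilding turns out to be the way to go, the UniVerse verb for that is RESIZE. A sketch only - the type, modulus and separation below are purely illustrative, and the right values for your data would come from something like the HFC utility mentioned earlier:

Code: Select all

RESIZE myhashfile 18 9967 4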
Post Reply