Hash File Usage Analysis

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
anupam
Participant
Posts: 172
Joined: Fri Apr 04, 2003 10:51 pm
Location: India

Hash File Usage Analysis

Post by anupam »

Hello All,

In our production Datastage Environment, we have many hash files existing. I want to know the names of the Jobs which are referring a particular hash file.

Some of them have not been modified from more than a year. But this is not sufficient for me to believe that they are not being referred by any of the existing Jobs.

Is there any way to find out the names of the Jobs referring to a particular hash file (Say "HashFile"). My requirement is to get the names of Jobs,category of Jobs and the last date (If Possible) on which that "HashFile" is referred.

Like we have a utility for Usage Analysis of Jobs, can we have some similar kind of utility.
----------------
Rgds,
Anupam
----------------
The future is not something we enter. The future is something we create.
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Ray posted a query to find all jobs on this topic viewtopic.php?t=87959 . You can also use MetaStage or Reporting Assistant. Do a search. If you cannot find the other topics then we can help.
Mamu Kim
anupam
Participant
Posts: 172
Joined: Fri Apr 04, 2003 10:51 pm
Location: India

Post by anupam »

Hello Duke,

I am able to see the name of the Jobs which are referring to a particular hash file. The only thing which i want is to check whether that hash file has been referred by the Job recently (say in previous month).

In fact i want one additional information, that is, the date on which the particular hash file has been referred.

Actually, I want to identify all the hash files which are not in use right now and have not been referred by any of the jobs from say 1st of April 2004.

So that after identifying the name of the hash files which are not in use, I can delete these hash files from the project.
----------------
Rgds,
Anupam
----------------
The future is not something we enter. The future is something we create.
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

I get it. At the UNIX level you can look at the access time. I am not sure that is adequate. If you know all the jobs which read or write to this hash file then Ray did a post which shows the last time these jobs ran. As long as jobs are not compiled then you can also get the last run times from JobReport or dsjob -report.
Mamu Kim
anupam
Participant
Posts: 172
Joined: Fri Apr 04, 2003 10:51 pm
Location: India

Post by anupam »

I have taken the date from unix level only but not very much sure to believe on that date or not.

If some other better option is available then definately i will not believe on the date derived from unix level else no choice but to believe what ever is available.
----------------
Rgds,
Anupam
----------------
The future is not something we enter. The future is something we create.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Why would you "not believe" in the UNIX date? :? Which ones are you checking? I'd suggest (assuming dynamic hash) dropping into the hash directory and checking the dates on the DATA.30 and OVER.30 files themselves.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Craig

A lot of commands can change the UNIX dates. If they are old then believe them. If they are new then backups or some other command can change them. You could of done a LIST at TCL or within the job.
Mamu Kim
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Understood. I personally haven't seen where backups change the dates, but that could depend on the system and what is being used to perform the backup, I suppose. But in general though, I would think they would be reliable as long as one understands the type of actions that actually do update the timestamps. Certainly not so unreliable as to not even consider them.

IMHO, of course. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
chucksmith
Premium Member
Premium Member
Posts: 385
Joined: Wed Jun 16, 2004 12:43 pm
Location: Virginia, USA
Contact:

Post by chucksmith »

My tools: (1) List all files and tables used by jobs in a dsx file, and (2) Search a dsx file for a string, may help you. The first tool will let you know all hash (and other type) files used by your jobs and routines. The second tool will give you a list of all jobs and routines that reference a particular file name.

You will have to judge if the jobs are current, or out dated.

The tools are on the DataStage Tools page of www.anotheritco.com.
anupam
Participant
Posts: 172
Joined: Fri Apr 04, 2003 10:51 pm
Location: India

Post by anupam »

I have used this query for getting the names of the Jobs which are referring to the Hash files

Code: Select all

SELECT NAME FMT '40L', OBJIDNO FMT '6R' COL.HDG 'Job No' 
FROM DS_JOBOBJECTS 
WHERE OBJNAME='ROOT' AND 
           OBJIDNO IN 
           (SELECT DISTINCT OBJIDNO FROM DS_JOBOBJECTS 
           WHERE NAME ='HASHFILENAME') 
ORDER BY NAME,OBJIDNO;

HASHFILENAME -- Name of the hash files of which usage has to be derived.
1. If i am getting 0 records as the output for this query, I am assuming
that the hash file is not being referred by any of the existing Jobs.

2. I am considering Unix level date for DATA.30 asd the last modified
date.

My ultimate objective is to identify all hash files which have not been modified since 1st of April 2004 and not being currently referred by any of the Applications / Jobs.

Based on point 1 and 2, I am putting all those Hash files in unused category where the Unix level Date for DATA.30 is before 1st April 2004 and i am getting 0 records as the output from the Query given above.

Is it correct conclusion ???
----------------
Rgds,
Anupam
----------------
The future is not something we enter. The future is something we create.
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Your query is wrong. The NAME field in DS_JOBOBJECTS is normally the link name. So unless you name your links and hash files the same name then this will fail.

I would use this query.

Code: Select all

select 
   DS_JOBS.NAME AS JOB_NAME FMT '35L' 
from 
   DS_JOBOBJECTS, 
   DS_JOBS
where 
   DS_JOBOBJECTS.OLETYPE = 'CHashedInput' 
   and DS_JOBOBJECTS.OBJIDNO = DS_JOBS.JOBNO
   and EVAL DS_JOBOBJECTS."@RECORD<6>" = 'MyHashFileName'
group by 
   JOB_NAME
;
and here is the other side of this query.

Code: Select all

select 
   DS_JOBS.NAME AS JOB_NAME FMT '35L' 
from 
   DS_JOBOBJECTS, 
   DS_JOBS
where 
   DS_JOBOBJECTS.OLETYPE = 'CHashedOutput' 
   and DS_JOBOBJECTS.OBJIDNO = DS_JOBS.JOBNO
   and EVAL DS_JOBOBJECTS."@RECORD<6>" = 'MyHashFileName'
group by 
   JOB_NAME
;
Mamu Kim
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

What we think as output is OLETYPE has input on the end like CHashedInput. Here are the 2 queries above together.

Code: Select all

select 
   DS_JOBS.NAME AS JOB_NAME FMT '35L' 
from 
   DS_JOBOBJECTS, 
   DS_JOBS
where 
   DS_JOBOBJECTS.OLETYPE in ('CHashedOutput','CHashedInput') 
   and DS_JOBOBJECTS.OBJIDNO = DS_JOBS.JOBNO
   and EVAL DS_JOBOBJECTS."@RECORD<6>" = 'MyHashFileName'
group by 
   JOB_NAME
;
Last edited by kduke on Mon Sep 06, 2004 6:13 am, edited 1 time in total.
Mamu Kim
anupam
Participant
Posts: 172
Joined: Fri Apr 04, 2003 10:51 pm
Location: India

Post by anupam »

Thanx Kim,

Will try this command and will not use Input and Output togather. :roll:
----------------
Rgds,
Anupam
----------------
The future is not something we enter. The future is something we create.
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Anupam

I am sorry. I reworded the above answer. What I was trying to say was CHashedInput really outputs to a hash file. The way ASCL named the OLETYPEs was confusing.
Mamu Kim
Post Reply