New EtlStats soon

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

New EtlStats soon

Post by kduke »

I should be posting a new EtlStats soon. I am installing it at a new customer site. Found some missing routines. Found somethings I do not like.

If you downloaded it then you are missing routines IsJobSequence and DateConv. Also routine EmailToRtn needs rewriting. If you tried to install it and could not then email me. I will send you the changes or post the new stuff soon.

Needs an install script or notes. Nobody bugged me so I assumed they got it working.
Mamu Kim
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Sorry, it has been out there for several days now. I just patched the missing routines. Craig this should compile now.

I am doing a rewrite of the table structures. It may take a while before I can finish it. It should be a star schema with a fact table when I am done. I have a couple of fact tables to tie together. Need a reporting guy to put it all in MicroStrategy for us. ETL star schema finally. Talked about it a long time ago. Who needs MetaStage anyway. Just kidding Big Blue.
Mamu Kim
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Cool, thanks. Not much time for fun stuff like this right now but I'll check it out when I can. 8)
-craig

"You can never have too many knives" -- Logan Nine Fingers
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Thanks Kim,
for all the effort you put in it :)
Alas I have yet to try it :(
Roy R.
Time is money but when you don't have money time is all you can afford.

Search before posting:)

Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
Image
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

EtlStats is free. You can download it from my tips page. If you do an exact search on this site for EtlStats then you will find intructions on how to install. If you have problems then post a question here.

Thanks Kim.
Mamu Kim
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

I have a DB2 version for anyone who is interested. We needed good process metadata for administration and troubleshooting so we chose it ahead of MetaStage. We were looking for a interim solution to take us through to the end of the year when DataStage Hawk delivers better metadata reporting against a shared repository. MetaStage was a pain to install and use so we went for the simpler option.
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Thanks Vincent. Try the job GenHtmlFromSqlDriver. If you fix the Email.ksh it should email to you. If not then look in Sql2Html directory below the project.
Mamu Kim
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

EtlStats install.

You need to create some files at TCL. Probably DataStage Administrator
is the easiest. You can also telnet into your DataStage server and get
to TCL. In UNIX:

export DSHOME=`cat /.dshome`
cd $DSHOME
export PATH=$PATH:$DSHOME/bin
. ./dsenv
uv

# At this point you should see > the TCL prompt
LOGTO ProjectName

CREATE.FILE Scripts 19
CREATE.FILE SqlScripts 19
CREATE.FILE KimD 19
CREATE.FILE Sql2Html 19
CREATE.FILE Staging 19
CREATE.FILE Rejects 19
CREATE.FILE Source 19

Now in your zip file you have Scripts and SqlScripts folders. Copy
these to the project directory.

You need an ODBC connection to your TargetDSN. It is also called
StatDSN and SqlDSN in the EtlStats jobs. You need to import the DSX
files. You need to create the tables in this database. The DDL for
Oracle is included.

You should be able to run job SeqJobStatus. Point TargetDSN and SourceDsn
to the DSN where you created the tables above. SouceDSN parameters
are not used and need to be delete. FilePath default is okay. Some
jobs may fail. Reset them. You need to run this once.

Next you need to run SeqPrepareEtlStats. You need to run this once
everyday usually at the begining of all sequences. Both of these
sequences are examples of how to use DSJobReportDbDriver.

You need to edit Email.ksh and routine EmailToRtn. Change to meet
your needs. Do not send me these emails. There are reports sent
out with row counts. You will need to put a jog for your company's
logo in Sql2Html. Name it OurLogo.jpg or change the parameter
LogoPath to point to whatever you called it.

DSJobReportDbDriver is the main job to gather row counts. It runs
DSJobReportDb. You may need to change the parameter dsjobcmd and
projName. dsjobcmd points to the UNIX or DOS path for dsjob.
projName is the DataStage project name.

I would change all the defaults on all the parameters in these jobs.
Make them all point to the database with the ETL tables created above.
Mamu Kim
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Here is an email I got and some answers which may help someone else:
Sanjay

DateConv is a routine included. If you have an old download then you need to download it again. If this routine conflicts with one you already have then rename it in the job and rename the routine in the DSX file. It needs this routine to function.

1) After execution, I could see only EtlStats job execution status. We
have executed many more jobs during day and in the past days. How do I see
the job status of all these jobs?

DSJobReportDbDriver is the controlling job. If you run this and have parameter
SeqName = All then it will get row counts for all jobs. SeqName can get row
counts for all jobs in a sequence if the value is the name of a sequence job.
If SeqName is just a job's name then it will get row counts for that one job.

How can I see the job execution status of all the jobs ran and for which job status is present in RT_LOGxxx files?

If a job is compiled then it will not get row counts because they are cleared out of RT_STATUSxxx.
DSJobReportDbDriver does not read RT_LOGxxx to get old values.

2) Which job should get triggered every day?

I run DSJobReportDbDriver at the end of each sequence as the last job activity in the sequence.
The parameter SeqName will need the name of the sequence being run.

3) What are the further steps?

Next step is to automate reports. If you look at SeqQaCompleteness. It runs several reports
to QA the whole data warehouse load. This is a lot more work to set this up but you can
run some of the reports on the last few steps.

4) How do I execute and see the reports generated using GenHtml?

GenHtml is similar setup. It is controlled by a sequence which runs KgdGenHtmlJobDocs and
KgdGenHtmlRoutineDocs. CategoryToDoc parameter will need All to document a whole project.
It can also be a sequence name, category name or job name. It will write html files into

ProjectDir/KimD/Jobs
ProjectDir/KimD/Routines

You may need to create these below the project. It will add YYYYMMDD directories below there.

You also need to get DSaveAsBmpDriver.bat and DSaveAsBmp.bat. These will get the bmp images
from each job design. Most people convert these from bmp files to gif or jpg. You need to edit
each html file display the correct image. This is very slow. The josb run very fast. Seconds to
document the whole project.

DSJobReportDbDriver is fairly slow to get row counts for a whole project maybe a hour for lots
of jobs.

5) What are the other functionality of ETLSTATS and GetHtml?

Reports and QA. It can turn any SQL statement into html. If you add ".html" on the end of a
column then it will turn that column into a link. There are all kinds of ETL metadata reports
included. Run them.

I think DwNav and ParameterNavigator both do a better job of all of this. These do cost money
and are available on my web site. They are lots easier to use.

Thanks Kim.


>From: Sanjay.Desai@relianceinfo.com
>To: "Kim Duke" <kim_g_duke@hotmail.com>
>Subject: RE: ETLSTATS and GenHtml - Request for Jobs execution order information
>Date: Thu, 15 Sep 2005 18:13:55 +0530
>
>Hi Kim,
>
>Thanks for your prompt reply.
>
>I have followed your instruction for EtlStats installation. Only one
>error which i encountered is listed below.
>
>"GetEtlQaSqlStat" gave following error while compilation.
>Error preprocessing Transform Derivation for stage GetEtlQaSqlStat.XfmStat.
>Link ETL_QA_RESULT, Column CREATE_TS: DateConv('',1,'')
>Too many arguments to function DateConv
>
>I could manage to run SeqJobStatus and SeqPrepareEtlStats.
>
>Questions:
>1) After execution, I could see only EtlStats job execution status. We
>have executed many more jobs during day and in the past days. How do I see
>the job status of all these jobs? How can I see the job execution
>status of all the jobs ran and for which job status is present in RT_LOGxxx
>files?
>2) Which job should get triggered every day?
>3) What are the further steps?
>4) How do I execute and see the reports generated using GenHtml?
>5) What are the other functionality of ETLSTATS and GetHtml?
>
>Appreciate your response.
>
>Thanks and Regards
>
>Sanjay Desai
Mamu Kim
snassimr
Premium Member
Premium Member
Posts: 281
Joined: Tue May 17, 2005 5:27 am

Post by snassimr »

I downloaded your dsx file . After import some jobs include only Annotation box. Why cant see the job design ?
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

They are all job control code or what they used to be called batch jobs. Look in job properties.
Mamu Kim
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

New EtlStats is now available. It uses DSSendMail. So it works on Windows. I am not sure the UNIX version works now. This version uses ODBC. It currently works with SQL Server. So DDL is included for both Oracle and SQL Server. Maybe Vincent will send the DDL for DB2.

This is much easier to install. Do the above CREATE.FILE statements at TCL. Then copy the directories below Copy2ProjectDir in the zip to the project directory. This should setup everything pretty nicely.

If you have trouble installing then post a question in the server forum. I will be glad to answer or anyone who has it working.

Hopefully I will get mb/sec included soon. I have the routine just need to add the column and call the routine.

Next is to gather stats from all the users and post averages. Sooner if people need it. Post a reply if you would like to see other users rows per second for hash files, sequential files and whatever stage types you might use.
Mamu Kim
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Download EtlStats and a bunch more jobs and routines for free on my tips page below.
Mamu Kim
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Oh I forgot to mention, there is a job included which will get row counts from the job log. It posts them to a hash file. There is a second job which posts the hash file records to the same tables as the other jobs.

This gets row counts about every way but using the API. That is next. I already have the code just need to add it. I also want to get it in a utility job like UtilityRunJob included in the SDK.

Anyone else what to add code to this then go for it. Just post it on the forum.
Mamu Kim
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

I did know that Walter had posted this vtt/ETLStats.php This talks you through installing EtlStats. I talk way too fast and you need to crank up the volume but if you can understand what I am saying then it will teach you a little about TCL and the repository.

Thanks Walter.

The link was on this newsletter http://www.datastagexchange.com/Newslet ... t/rant.php I do hope Any does more video tips. I think his is much better than mine. Maybe I will do some more of these. If anybody thinks these help then let me know. I prefer these over writing help manuals.
Mamu Kim
Post Reply