Extracting order of execution of objects from DSX file

clarcombe
Premium Member
Posts: 515
Joined: Wed Jun 08, 2005 9:54 am
Location: Europe

Post by clarcombe »

We have been asked to create an Excel spreadsheet listing all of our job sequences and parallel jobs in order of execution.

Thus
S1 - J1- OCI In - OCI Out
S1 - J2- DS In - OCI Out - DS Out
S2 - S3
S3 - J3 - FF In - OCI Out
etc

At present my colleague is doing this by hand and we have tens of job sequences to evaluate.

I have tried opening the DSX in Excel and filtering on "*** Activity" and "jb$", but the objects do not always come out in the right order.

For example, the DSX has:

Code:

*** Activity "Repair_PS_S_COMBO_CF_DEFN": Initialize job
jb$V120S7 = "OCN_J_Stage_HASH_CRC_REPAIR":'.':("Repair_PS
*** Activity "Repair_PS_S_BUS_UNIT_AP": Initialize job
jb$V121S0 = "OCN_J_Stage_HASH_CRC_REPAIR":'.':("Repair_PS

However, the calling order in the DSX is actually:

Repair_PS_PAYMENT_TBL
Repair_PS_S_BUS_UNIT_AP
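
For reference, the Excel filtering described above can be approximated on the command line. This is only a minimal sketch: the function name `list_activities` and the file name `export.dsx` are hypothetical, and it assumes every job activity shows up as an `*** Activity "<name>": Initialize job` line like the ones in the extract. It lists activities in the order they appear in the text, which, as noted, is not necessarily the calling order.

```shell
#!/bin/sh
# list_activities FILE
# Prints "line_number;activity_name" for each line of the form
#   *** Activity "<name>": Initialize job
# Assumption: the DSX text contains one such line per job activity.
# Caveat: the order printed is the order in the file, not the run order.
list_activities() {
        grep -n '\*\*\* Activity ".*": Initialize job' "$1" |
        sed -e 's/^\([0-9]*\):[^"]*"\([^"]*\)".*/\1;\2/'
}

# Example call (export.dsx is a hypothetical file name):
# list_activities export.dsx > activities.csv
```

Keeping the line number in the first column makes it easy to see where each activity sits in the generated code when comparing against the real calling order.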

Questions
1) If I have a compiled DSX, is there any way to get the objects out in the right order by reading the generated BASIC?
2) Are there any other ways, apart from reading the DSX, to get this information?

Thanks

Colin
Colin Larcombe
-------------------

Certified IBM Infosphere Datastage Developer
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

First ask why. You can call jobs in parallel from a sequence job and those can be difficult to represent in a spreadsheet.

Go into Designer under the File menu and generate a job report for each sequence job. A picture is worth a thousand words. It will produce documentation for you by generating the report as a web page.

One great point with DataStage is that it's a GUI. By reducing a GUI job design back to text only format, you're making it more difficult for other people to understand.
Choose a job you love, and you will never have to work a day in your life. - Confucius
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

There's no such thing as a compiled DSX, so option 1 is out. The best way is to look at the sequence log - near the end there's a "summary of sequence run" event.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
clarcombe
Premium Member
Posts: 515
Joined: Wed Jun 08, 2005 9:54 am
Location: Europe

Post by clarcombe »

Ray, thanks for that. That's what I needed.

As for the compiled DSX, I meant a job that has been compiled and then exported as a DSX (so the export includes the generated UniVerse BASIC), as opposed to a job that was saved without ever being compiled.
eph
Premium Member
Posts: 110
Joined: Mon Oct 18, 2010 10:25 am

Post by eph »

Hi!

Following qt_ky's idea, once you have generated a report for every job, it's quite easy to parse the results into a CSV.

First, generate the documentation using DsJobReport (see Kim Duke's website).
Then run the following script in a Unix shell, passing the pattern from your job sequence naming convention (such as Jq_ or Seq_):

Code:

#!/bin/ksh
# Arg 1 : path of the generated documentation
# Arg 2 : job sequence pattern from the naming convention

# Process every report file whose name contains the pattern
find "$1" -name "*$2*.htm*" | while IFS= read -r file
do
        echo "File processed : ${file}"
        # Replace the .htm/.html extension with .csv
        outputfile="${file%.htm*}.csv"
        echo "Type;Stage;Link info;Stage I/O" > "${outputfile}"
        # Keep only the stage headings and the link descriptions
        grep -E "</A></B></TR>|Input from|Outputs to" "${file}" | while IFS= read -r line
        do
                case "${line}" in
                        *A\ name*)
                                # Stage heading: remember the name for the links that follow
                                stage=$(printf '%s\n' "${line}" | sed -e 's/.*">//;s/<.*//')
                                echo "Stage;${stage}" >> "${outputfile}"
                                ;;
                        *A\ href*)
                                # Link line: direction text, then the stage at the other end
                                inOut=$(printf '%s\n' "${line}" | sed -e 's/.*<dd>//;s/<.*//')
                                stageInOut=$(printf '%s\n' "${line}" | sed -e 's/.*">//;s/<.*//')
                                echo "Link;${stage};${inOut};${stageInOut}" >> "${outputfile}"
                                ;;
                esac
        done
done

The results look like this:

Code:

Type   Stage                    Link info                                Stage I/O
Stage  uv_Connection
Link   uv_Connection            Outputs to UserVariables Activity stage  uv_ConnectionPS
Stage  Jx_SRV_RDD_CONTEXTE_DUP
Link   Jx_SRV_RDD_CONTEXTE_DUP  Input from Job Activity stage            Jx_RD_T_ME_OPD_CLIENT_DUP_001
Link   Jx_SRV_RDD_CONTEXTE_DUP  Outputs to Job Activity stage            Jx_RD_T_ME_OPD_CLIENT_DUP_002

It's just a quick and dirty script. If you improve it, let me know, I'm interested :)
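
One possible improvement, given that the goal is a single spreadsheet: merge the per-report CSV files into one. The sketch below is only an illustration. The function name `merge_reports` and the extra "Report" column are hypothetical, and it assumes each CSV starts with the "Type;Stage;Link info;Stage I/O" header written by the script above.

```shell
#!/bin/sh
# merge_reports DIR
# Writes one combined CSV to stdout from every *.csv found under DIR.
# Each row is prefixed with the base name of the file it came from
# (hypothetical "Report" column), and only one header line is kept.
merge_reports() {
        echo "Report;Type;Stage;Link info;Stage I/O"
        find "$1" -name '*.csv' | sort | while IFS= read -r csv
        do
                name=$(basename "${csv}" .csv)
                # Skip the per-file header (line 1) and prefix the remaining rows
                awk -v name="${name}" 'NR > 1 { print name ";" $0 }' "${csv}"
        done
}

# Example call (paths are hypothetical); write the result outside DIR
# so that the merged file is not picked up on the next run:
# merge_reports /path/to/docs > /tmp/all_sequences.csv
```

The merged file opens directly in Excel with ";" as the delimiter, and filtering on the first column gives one sequence at a time.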

Eric