Scanning the Dataset

Gopinath
Participant
Posts: 52
Joined: Wed Apr 25, 2007 2:18 am
Location: Chennai


Post by Gopinath »

Hi,

I have 100+ jobs in my project. When I need to debug a particular record, I currently scan each job individually to see exactly where the record was dropped. If I had developed all 100 jobs myself, I could make an educated guess at the right job, since I would know the business requirements.

What I want instead is to identify the list of datasets that contain a given search string.

Example:

Dataset1: MEM1 DATE1
Dataset2: MEM1 DATE1 GENDR1
          MEM2 DATE1 GENDR2
Dataset3: MEM1 DATE1 GENDR1 ST1
          MEM2 DATE1 GENDR2 ST1
Dataset4: MEM2 DATE1 GENDR2 ST1
Dataset5: MEM2 DATE1 GENDR2 ST1
          MEM3 DATE1 GENDR2 ST1
Dataset6: MEM1 DATE1 GENDR2 ST1
          MEM3 DATE1 GENDR2 ST1
Dataset7: MEM1 DATE1 GENDR2 ST1
          MEM4 DATE1 GENDR2 ST1

Search string: MEM1
My output should be:

Dataset1
Dataset2
Dataset3
Dataset6
Dataset7

Thanks.
Last edited by Gopinath on Mon Apr 05, 2010 8:36 am, edited 2 times in total.
Gopinath
nikhilanshuman
Participant
Posts: 58
Joined: Tue Nov 17, 2009 3:38 am

Post by nikhilanshuman »

Datasets are stored in an internal format, so they cannot be searched directly.

You will have to write a shell script to accomplish this.

Datasets can be viewed with the orchadmin dump command, so you will have to build that into your shell script.

The steps are as follows:

a) In the shell script, accept the string to search for as a parameter.
b) Prepare an array of dataset names.
c) Iterate through that array one element at a time.
d) For each element, run orchadmin dump datasetname > filename. This writes the dataset's contents to a sequential file.
e) Use standard UNIX tools to search that file for the string passed to the script. For example, grep has an option (-l) that prints the file name when the search string is found.
f) Repeat steps d) and e) for each element of the array.
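
The steps above could be sketched roughly as follows. This is only an illustration under a few assumptions: orchadmin is on the PATH and the DataStage environment (for example APT_CONFIG_FILE) is already set up, the function name search_datasets is made up for this example, and the dataset names are passed as extra arguments rather than hard-coded in an array. It also pipes the dump straight into grep instead of writing an intermediate file, so it prints the dataset name itself rather than relying on grep -l.

```shell
#!/usr/bin/env bash
# searchds.sh: print the names of the datasets whose dumped records
# contain a search string. A sketch, not a tested production script.

# search_datasets SEARCH_STRING DATASET...
# (hypothetical helper name; the positional arguments play the role of
# the "array of dataset names" from step b)
search_datasets() {
    local needle=$1
    shift
    local ds
    for ds in "$@"; do
        # Dump the dataset as text and search the dump.
        # -F: treat the search string literally, -q: no output, status only.
        if orchadmin dump "$ds" 2>/dev/null | grep -qF -- "$needle"; then
            printf '%s\n' "$ds"
        fi
    done
}

# When run as a script: first argument is the search string,
# the remaining arguments are the dataset descriptor files.
if [ "$#" -ge 2 ]; then
    search_datasets "$@"
fi
```

With this variant the call would look like bash searchds.sh MEM1 Dataset1.ds Dataset2.ds, where the .ds file names are placeholders for your own descriptor files. Note that grep matches substrings, so MEM1 would also match a value such as MEM10; adding -w to the grep options restricts it to whole words.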

The output of this shell script will be what you asked for.

For example, if the script is named searchds.sh and the string to search for is "MEM1", the command would be: bash searchds.sh MEM1

The output will be the names of the datasets that contain the search string, e.g.
Dataset1
Dataset2
Dataset3
Dataset6
Dataset7
Nikhil