getting metadata using a shell script

pdevi1234 · Post by **pdevi1234** » Thu Apr 21, 2005 1:46 pm

I have the following information:
1) filename
2) name of the datastage job that creates the file.
3) name of the input link to the file.

I want to know whether there exists a way to query the datastage engine (database) from a shell script to get the names of columns in the file and and their dispaly lengths . I know I can get the information using a datastage job . But I need to do this from a script.

Any help is appreciated.

Thank you,
Parvathi

ray.wurlod · Post by **ray.wurlod** » Thu Apr 21, 2005 5:44 pm

There's no easy way.

There's also no way that doesn't rely on undocumented information about the structure of repository tables, which is subject to change (and likely to change in the next release). And even then your shell script would need to invoke the DataStage (~ UniVerse) query engine.

kduke · Post by **kduke** » Thu Apr 21, 2005 8:04 pm

How about this:

Code: Select all

#!/bin/ksh
FILENAME=`echo "$1" | tr [a-z] [A-Z]`
uvsh <<!
SELECT 
   DS_JOBS.NAME AS JOB_NAME, 
   DS_JOBS.CATEGORY, 
   DS_JOBOBJECTS.NAME AS OBJECT_NAME, 
   DS_JOBOBJECTS.OLETYPE, 
   EVAL DS_JOBOBJECTS."if index(upcase(@RECORD),':1',1) > 0 then 'FOUND' else ''" AS FOUND FMT '5L'
FROM 
   DS_JOBS, 
   DS_JOBOBJECTS 
WHERE 
   DS_JOBS.JOBNO = DS_JOBOBJECTS.OBJIDNO 
   and FOUND = 'FOUND'
GROUP BY
   JOB_NAME, 
   DS_JOBS.CATEGORY, 
   OBJECT_NAME, 
   DS_JOBOBJECTS.OLETYPE, 
   FOUND
;
!

ray.wurlod · Post by **ray.wurlod** » Thu Apr 21, 2005 8:29 pm

Only if you're attached to project directory and uvsh can be found via your PATH.

And no guarantees into the future - Hawk release promises an open, common repository.

And what about the requirement for display lengths?

leomauer · Post by **leomauer** » Fri Apr 22, 2005 6:40 am

I played with tis task before and came up with the following code:

Code: Select all

Call DSGetLinkMetaData(Result, JobName, StageName, LinkName)
print "JobName = ":JobName
print "LinkName = ":LinkName
print "Result =":Result
print "StageName = ":StageName

print Result<1,1>
print Result<6,1>
print Result<1,2>
print Result<6,2>
print Result<1,3>
print Result<6,3>
print Result<1,4>
print Result<6,4>
....and so forth........

Ans = Result

This is a test routine that uses DSGetLinkMetaData function.
The print statement are just to show the elements of the returned dynamic array.
Instead of prints the results may be captured in variables, parsed and written to the file in the required format. I did not do this part of the routine.

I wrote it as a transform routine, but I think it can be written in job control and used as a stand alone code (DS job with Job control only) to create any metadata text. file.

pdevi1234 · Post by **pdevi1234** » Fri Apr 22, 2005 7:13 am

Thank you everyone. I will try the different alternatives and let you know.
Thanks again.

Parvathi

pdevi1234 · Post by **pdevi1234** » Fri Apr 22, 2005 1:05 pm

hi,

I tried Kim's code.It directly queries the repository tables like I wanted. As Ray said I first attached to the project directory and then ran the script. It works. But the results are not what I wanted.

I need to pass the name of the datastage job and name of the link to the script and the script should be able to obtain the names of all the columns used in the link and their display lengths as shown below:

Code: Select all

column_1                   5
column_2                  10
column_3                   8

Kim, I also didn't understand what the EVAl part in your code is doing. Could you please explain?

Leo's suggestion works. But then, I have to run a datastage job from the script.

Again "Thank you " for all the help. I couldn't have found an answer without you. Also where can I find information about the repository tables. Are they "undocumented" as Ray said?

Parvathi

kduke · Post by **kduke** » Fri Apr 22, 2005 3:24 pm

The EVAL is a column on the fly without creating the dictionary item.

I posted dictionary items recently to get the column names and lengths. Do a search if you cannot find them then let me know. Do an exact search for ColNames or MvCols. You already have the job name and link name in the previous SQL.

ray.wurlod · Post by **ray.wurlod** » Fri Apr 22, 2005 6:03 pm

They're not officially documented - though quite a few have been revealed over the years on this Forum.

Beware that substantial changes are afoot in the repository structure in the next major release.