How to parse a datastage job and fetch all col derivations
I need to come up with a way to export all the columns used in a DataStage job into a text file. The details have to be stage-wise.

For example: suppose I have a job with three stages (Oracle, Transformer and Sequential File). My output file should look like:

Oracle             Transformer                                      Seq
Emp (Varchar)      Emp                                              Emp
Empid (Integer)    Empid                                            Empid
Salary (Integer)   If Salary > 100 Then Salary Else Salary + 100    TotSalary

So, as in the example above, I need the derivation of each column in each stage of a given job.

Please let me know if this is possible.
I'd suspect you'd be into the land of back-end querying; that is to say, reading out of the DataStage repository itself. Failing that, some amalgamation of exporting jobs over the command line and then scripting something to strip the necessary bits out of each entry in the export dump.
Mark Winter
<i>Nothing appeases a troubled mind more than <b>good</b> music</i>
Perhaps with an advanced understanding of the underlying repository you could engineer something directly from there. The API will get you most of that, but not the derivations. Otherwise, you may want to look into generating a job report and then processing that. There's an icon in the Designer that looks like an IE icon which will generate one manually, and from what I recall there is also a way to generate the same report from the command line. The output is HTML, but the intermediate XML can (again, from what I recall) be preserved and then reprocessed by you into your desired format.
You could also look into exporting the jobs to .dsx or .xml format and then processing that... perhaps via another job.
I would think other people will chime in with ideas as well.
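To illustrate the export-and-parse approach, here is a minimal Python sketch that pulls column name / derivation pairs out of a .dsx-style export. The sample fragment and the exact keyword layout (DSSUBRECORD blocks with quoted Name and Derivation lines) are assumptions based on the general shape of .dsx files; inspect a real export from your own version before relying on the pattern.

```python
import re

# Hypothetical fragment in the shape of a .dsx export: column metadata
# is assumed to live in DSSUBRECORD blocks with quoted Name/Derivation
# lines. Verify against a real export before using.
SAMPLE_DSX = '''\
   BEGIN DSSUBRECORD
      Name "Empid"
      Derivation "Empid"
   END DSSUBRECORD
   BEGIN DSSUBRECORD
      Name "Salary"
      Derivation "If Salary > 100 Then Salary Else Salary + 100"
   END DSSUBRECORD
'''

# Capture everything between BEGIN/END DSSUBRECORD markers.
SUBREC = re.compile(r'BEGIN DSSUBRECORD(.*?)END DSSUBRECORD', re.S)
# Capture keyword "value" pairs inside a block.
FIELD = re.compile(r'(\w+) "([^"]*)"')

def column_derivations(dsx_text):
    """Return [(column, derivation), ...] found in DSSUBRECORD blocks."""
    rows = []
    for block in SUBREC.findall(dsx_text):
        fields = dict(FIELD.findall(block))
        if "Name" in fields and "Derivation" in fields:
            rows.append((fields["Name"], fields["Derivation"]))
    return rows
```

In practice you would read the whole exported .dsx file and also track which stage or link each DSRECORD belongs to, so the output can be grouped stage-wise as the original poster wants.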
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
The API call to DSGetLinkMetaData() will return a complete list of column names and metadata for each link in a stage. You would need to use DSOpenJob(), DSGetJobInfo(), DSGetStageInfo() and DSCloseJob() in order to get this to work.
Craig - you are correct; I didn't read the complete post and thought that the original poster just wanted column names. As you correctly noted, the derivation is not available as part of the link information, and the DSGetStageInfo() routine call does not have an option to return derivations.
So one would need to delve into the (undocumented) depths of the hashed files or to export the job and then parse the XML or DSX format.
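For the XML-export route, a sketch along these lines would walk the exported document and collect derivations per stage. The element and attribute names below (Record, Property, Collection, SubRecord) are illustrative assumptions about the export's shape, not a documented schema; compare with an actual XML export from your installation first.

```python
import xml.etree.ElementTree as ET

# Simplified sample mimicking the assumed shape of a DataStage XML
# export. Element/attribute names are illustrative only.
SAMPLE = """\
<DSExport>
  <Job Identifier="MyJob">
    <Record Identifier="V0S1" Type="CustomOutput">
      <Property Name="Name">Transformer_1</Property>
      <Collection Name="Columns" Type="OutputColumn">
        <SubRecord>
          <Property Name="Name">Empid</Property>
          <Property Name="Derivation">Empid</Property>
        </SubRecord>
        <SubRecord>
          <Property Name="Name">Salary</Property>
          <Property Name="Derivation">If Salary &gt; 100 Then Salary Else Salary + 100</Property>
        </SubRecord>
      </Collection>
    </Record>
  </Job>
</DSExport>
"""

def prop(elem, name):
    """Text of a child <Property Name="..."> element, or None."""
    for p in elem.findall("Property"):
        if p.get("Name") == name:
            return p.text
    return None

def derivations(xml_text):
    """Map each stage record to {column: derivation}."""
    root = ET.fromstring(xml_text)
    out = {}
    for rec in root.iter("Record"):
        stage = prop(rec, "Name")
        cols = {}
        for coll in rec.findall("Collection"):
            if coll.get("Name") != "Columns":
                continue
            for sub in coll.findall("SubRecord"):
                name = prop(sub, "Name")
                if name:
                    cols[name] = prop(sub, "Derivation")
        if stage and cols:
            out[stage] = cols
    return out

if __name__ == "__main__":
    for stage, cols in derivations(SAMPLE).items():
        for col, deriv in cols.items():
            print(f"{stage}\t{col}\t{deriv}")
```

From the resulting dict it is straightforward to emit the tab-separated, stage-wise listing the original poster described.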
Here's why we have such a requirement. Our system gets its data from various sources like mainframes, DB2 and MQ. We perform a lot of calculations on the incoming feeds and then send the results to our warehouse, which serves as a reporting platform for other applications. The plan is to have an intermediate system in place that does all these calculations and sends us a feed that we just load into the warehouse.
Now don't ask me why they are doing this, because I have no clue.
So we need to come up with all the fields that are being calculated in the jobs and send that list to another team.
That is the reason I requested a solution for such a STUPID requirement!
What about using the lineage and impact analysis functionality built into the product?