how to generate control files in a datastage job?

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
pongal
Participant
Posts: 77
Joined: Thu Mar 04, 2004 4:46 am

how to generate control files in a datastage job?

Post by pongal »

Hi,

I have a requirement in such a way that i need to generate control files, one should contain no of rows extracted and another should contain no of rows loaded and both the files should put at remote server location through ftp.

Thanks in advance.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Hello Pongal,

Do you wish to write DataStage results to these files and ftp them? The tterm "control file" is a bit confusing in that case.

But to answer your question - this is quite easy to do. You can write a very short after-job routine to write this information out to the 2 files; then you can call the FTP command line to copy the files, or you can use the FTP stage from the DataStage job to do so.
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

To get row counts in XML format then

../../DSEngine/bin/dsjob -report #projName# #jobName# XML >#jobName#.xml

There is another form of this report that is not XML if you leave off "XML" then you get that version.
Mamu Kim
pongal
Participant
Posts: 77
Joined: Thu Mar 04, 2004 4:46 am

Post by pongal »

Hi ArndW

Can i write code like this in after job routine?
Result= DSGetLinkInfo(DSJ.ME,<<stageName>>,<<LinkName>>,DSJ.LINKROWCOUNT)
and my doubt is how to create control files with rowcount information and transfer it to particular location in DataStage server.
i can use ftp,but how to generate control files.
can u explain more elaborately..

Thanks,
Pongal
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Provided that you are running with MetaStage proxy mode enabled and the listener is running, process metadata can automatically be captured by MetaStage. You can arrange for the captured data to be distributed from MetaStage.

In an upcoming release, DataStage and MetaStage will share the same repository, so that this ought to be your preferred approach looking forward.

Otherwise, it's pretty much a code-it-yourself solution (or hire a good consultant to code to your exact specifications).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi,
If you want to create cotrol files containing number of rows and such in ds server your most simple solution would be:
1. add a hash file with dummy key column having 1 as it's value.
2. split the output in a transformer to add the hash file with fixed 1 in the key column and the @OUTROWNUM or @INROWNUM, or whatever your loginc for outputing a row is, in the count column; at the end of the input rows you'll have the count in the hash file.
3. read that hash file to a sequential file building your control file.

IMHO,this is the most simple and pure GUI way to build a control file in server for you data files

IHTH,
Roy R.
Time is money but when you don't have money time is all you can afford.

Search before posting:)

Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
Image
pongal
Participant
Posts: 77
Joined: Thu Mar 04, 2004 4:46 am

Post by pongal »

Thanks roy...
I got the rowcount.. but performance is dam slow as we have 7.2 million records at source file. can we take any performance tuning?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Take an extra output link. It has one integer column. The value sent along this link is @INROWNUM. Run it through an Aggregator stage in which the Last function is specified.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi,
Do you only need to build the control file/s from existing files or ones you build in an ETL proces?

How do you define slow?
it should be relative to task and resources at hand.

Tricks for getting best performance include IPC stage to make sure a process gets max CPU doing a single task like only read or only write(though at times it can be only a small aproovment)

you might try a filter command for cutting only the first char of the line if you only need row count (just an idea I really never tested if it will make adifference)

also you might use filter command to wc -l the file and handle the result.

but theese are only options you might want to check out.

IHTH,
Roy R.
Time is money but when you don't have money time is all you can afford.

Search before posting:)

Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
Image
Post Reply