Page 1 of 1

how to generate control files in a datastage job?

Posted: Fri Mar 25, 2005 4:27 am
by pongal
Hi,

I have a requirement in such a way that i need to generate control files, one should contain no of rows extracted and another should contain no of rows loaded and both the files should put at remote server location through ftp.

Thanks in advance.

Posted: Fri Mar 25, 2005 5:04 am
by ArndW
Hello Pongal,

Do you wish to write DataStage results to these files and ftp them? The tterm "control file" is a bit confusing in that case.

But to answer your question - this is quite easy to do. You can write a very short after-job routine to write this information out to the 2 files; then you can call the FTP command line to copy the files, or you can use the FTP stage from the DataStage job to do so.

Posted: Fri Mar 25, 2005 9:47 am
by kduke
To get row counts in XML format then

../../DSEngine/bin/dsjob -report #projName# #jobName# XML >#jobName#.xml

There is another form of this report that is not XML if you leave off "XML" then you get that version.

Posted: Mon Mar 28, 2005 3:46 am
by pongal
Hi ArndW

Can i write code like this in after job routine?
Result= DSGetLinkInfo(DSJ.ME,<<stageName>>,<<LinkName>>,DSJ.LINKROWCOUNT)
and my doubt is how to create control files with rowcount information and transfer it to particular location in DataStage server.
i can use ftp,but how to generate control files.
can u explain more elaborately..

Thanks,
Pongal

Posted: Mon Mar 28, 2005 4:38 am
by ray.wurlod
Provided that you are running with MetaStage proxy mode enabled and the listener is running, process metadata can automatically be captured by MetaStage. You can arrange for the captured data to be distributed from MetaStage.

In an upcoming release, DataStage and MetaStage will share the same repository, so that this ought to be your preferred approach looking forward.

Otherwise, it's pretty much a code-it-yourself solution (or hire a good consultant to code to your exact specifications).

Posted: Mon Mar 28, 2005 5:26 am
by roy
Hi,
If you want to create cotrol files containing number of rows and such in ds server your most simple solution would be:
1. add a hash file with dummy key column having 1 as it's value.
2. split the output in a transformer to add the hash file with fixed 1 in the key column and the @OUTROWNUM or @INROWNUM, or whatever your loginc for outputing a row is, in the count column; at the end of the input rows you'll have the count in the hash file.
3. read that hash file to a sequential file building your control file.

IMHO,this is the most simple and pure GUI way to build a control file in server for you data files

IHTH,

Posted: Mon Mar 28, 2005 12:45 pm
by pongal
Thanks roy...
I got the rowcount.. but performance is dam slow as we have 7.2 million records at source file. can we take any performance tuning?

Posted: Mon Mar 28, 2005 1:24 pm
by ray.wurlod
Take an extra output link. It has one integer column. The value sent along this link is @INROWNUM. Run it through an Aggregator stage in which the Last function is specified.

Posted: Tue Mar 29, 2005 4:51 am
by roy
Hi,
Do you only need to build the control file/s from existing files or ones you build in an ETL proces?

How do you define slow?
it should be relative to task and resources at hand.

Tricks for getting best performance include IPC stage to make sure a process gets max CPU doing a single task like only read or only write(though at times it can be only a small aproovment)

you might try a filter command for cutting only the first char of the line if you only need row count (just an idea I really never tested if it will make adifference)

also you might use filter command to wc -l the file and handle the result.

but theese are only options you might want to check out.

IHTH,