server routine for eliminating headers

Posted: Fri Nov 02, 2007 6:38 am
by kausmone
Hello everyone,

I am writing an application that should read data from csv files for further processing. The problem is that these csv files may or may not contain a header record, depending on the source.

I know how to eliminate the header (if present) using UNIX (sed, for example, does it quite elegantly), but I would like to do it using a DataStage server routine. The routine needs to check for a certain string (e.g. "Country_Code") whose presence would indicate that it's a header record, then delete that record from the input file, and leave the file untouched if the string doesn't exist in it.

Any help much appreciated!

Thanks,
kaus

Posted: Fri Nov 02, 2007 7:02 am
by chulett
Could you simply not check the 'First record is header' option in the Sequential file stage and add a constraint that skipped the first record if it found your 'certain string' there?

Posted: Fri Nov 02, 2007 7:09 am
by kausmone
Unfortunately, the project has some auditing set up which forces me to update the file before it gets read by the jobs. Otherwise, I would be 'silently' dropping a record in an audited job.

Posted: Fri Nov 02, 2007 7:19 am
by chulett
While you certainly could write a routine to do this, why not stick with 'sed'? It would certainly be... simpler. Heck, you could even leverage that in the Filter option of the stage. Otherwise, you'll need to investigate all of the BASIC functions for handling sequential files:

OPENSEQ
READSEQ
WRITESEQ
CLOSESEQ
WEOFSEQ
SEEK

Etc. The BASIC pdf manual will have examples while more useful ones would be found by searching the forums here for those functions.

Posted: Fri Nov 02, 2007 7:26 am
by kausmone
It's more a case of trying to use a new thing as much as possible just to get to know it better :) , hence trying the routine instead of 'sed'.

Anyways, thanks for your help!

Posted: Fri Nov 02, 2007 7:33 am
by ray.wurlod
How about just encapsulating this command as the filter command in your Sequential File stage?

Code:

grep -v Country_Code #SourceFile#
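
Note that grep -v drops every line containing the string, not only a header on line 1. If a data row could also contain "Country_Code", a stricter variant is possible. The following is a minimal sketch, assuming the header can only ever be the first line; skip_header is a hypothetical helper name, not anything provided by DataStage.

```shell
#!/bin/sh
# Hypothetical helper: print a file without its header line.
# Line 1 is deleted only if it matches the marker; every other line is
# passed through unchanged, even if it contains the marker as well.
skip_header() {
    # $1 = marker (used as a basic regular expression, must not contain "/")
    # $2 = path of the file to read
    sed "1{/$1/d;}" "$2"
}
```

Used directly as a command, the equivalent would be something like sed '1{/Country_Code/d;}' #SourceFile#.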

Posted: Mon Nov 05, 2007 2:24 am
by kausmone
Hi Ray,

Thanks for the input; but I need to do this before the job that reads the file. Hence the routine/UNIX script.

Rgds,
Kaus

Posted: Mon Nov 05, 2007 3:14 am
by ray.wurlod
Then do it in a Routine.

Code:

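* Run the command through the UNIX shell.
Shell = "UNIX"
* grep -v writes every line that does not contain Country_Code to <path>_noheader.
Command = "grep -v Country_Code " : argSourceFilePath : " > " : argSourceFilePath : "_noheader"
Call DSExecute(Shell, Command, Result, ExitStatus)
* A non-zero exit status is logged as a warning.
If ExitStatus <> 0
Then
   Call DSLogWarn("Error (" : ExitStatus : ") removing header.", "MyRoutine")
End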

Then process the new file (the original path with _noheader appended) in your Sequential File stage.
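
The original question asked for the file itself to be updated, and left untouched when no header is present. A minimal shell sketch of that behaviour follows, assuming the header is always the first line and that a temporary file next to the source is acceptable; strip_header_in_place is a hypothetical name used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: remove the header from a file in place, but only when
# the first line contains the marker; otherwise the file is not modified.
strip_header_in_place() {
    marker=$1
    file=$2
    # -F treats the marker as a fixed string, -q suppresses grep's output.
    if head -n 1 "$file" | grep -qF -- "$marker"; then
        tmp="$file.tmp.$$"
        # Copy everything from line 2 onward, then replace the original.
        tail -n +2 "$file" > "$tmp" && mv "$tmp" "$file"
    fi
}
```

A call such as strip_header_in_place Country_Code /path/to/file.csv could be saved in a script and run before the job, for example through the same DSExecute call used in the routine.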