server routine for eliminating headers

kausmone
Participant
Posts: 39
Joined: Fri Sep 21, 2007 1:47 am
Location: Prague

Post by kausmone »

Hello everyone,

I am writing an application that should read data from csv files for further processing. The problem is that these csv files may or may not contain a header record, depending on the source.

I know how to eliminate the header (if present) using UNIX tools (sed, for example, does it quite elegantly), but I would like to do it using a DataStage server routine. The routine needs to check for a certain string (e.g. "Country_Code") whose presence indicates a header record, delete that record from the input file if it is found, and leave the file untouched otherwise.
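
For reference, the sed approach mentioned above might look something like the following. This is only a sketch: it assumes the header, when present, is always the first line and that the marker string contains no characters special to sed. The helper name `strip_header` and the sample file names are made up for illustration.

```shell
#!/bin/sh
# strip_header FILE MARKER  (hypothetical helper, not part of DataStage)
# Deletes line 1 of FILE only if it contains MARKER.
# Assumes MARKER has no regex metacharacters or slashes.
strip_header() {
    file=$1
    marker=$2
    tmp="$file.tmp.$$"
    # The address "1" restricts the delete to the first line.
    sed "1{/$marker/d;}" "$file" > "$tmp" && mv "$tmp" "$file"
}

# Made-up sample data: one file with a header, one without.
printf 'Country_Code,Amount\nCZ,10\nAU,20\n' > with_header.csv
printf 'CZ,10\nAU,20\n' > without_header.csv

strip_header with_header.csv Country_Code
strip_header without_header.csv Country_Code
cat with_header.csv
```

Note that this version rewrites the file in both cases; when there is no header the content is simply copied through unchanged.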

Any help much appreciated!

Thanks,
kaus
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Could you not simply leave the 'First record is header' option unchecked in the Sequential File stage and add a constraint that skips the first record if it finds your 'certain string' there?
-craig

"You can never have too many knives" -- Logan Nine Fingers

Post by kausmone »

Unfortunately, the project has some auditing set up which forces me to update the file before it gets read by the jobs. Otherwise, I would be 'silently' dropping a record in an audited job.

Post by chulett »

While you certainly could write a routine to do this, why not stick with 'sed'? It would certainly be... simpler. Heck, you could even leverage that in the Filter option of the stage. Otherwise, you'll need to investigate all of the BASIC functions for handling sequential files:

OPENSEQ
READSEQ
WRITESEQ
CLOSESEQ
WEOFSEQ
SEEK

And so on. The BASIC PDF manual has examples, though you will find more useful ones by searching the forums here for those functions.
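
To make the logic concrete, here is the sequence such a routine would follow (open the file, read the first record, test it, rewrite the remainder), sketched in shell rather than BASIC. It does not use the functions listed above; the helper name `remove_header_if_present` and the sample files are made up, and the header is assumed to be the first line.

```shell
#!/bin/sh
# remove_header_if_present FILE MARKER  (hypothetical helper)
# Rewrites FILE without its first line only when that line contains MARKER.
# If the marker is absent the file is not touched at all.
remove_header_if_present() {
    file=$1
    marker=$2
    # Read just the first record.
    first_line=$(head -n 1 "$file")
    case $first_line in
        *"$marker"*)
            tmp="$file.tmp.$$"
            # Copy everything from line 2 onward, then replace the original.
            tail -n +2 "$file" > "$tmp" && mv "$tmp" "$file"
            echo "header removed"
            ;;
        *)
            echo "no header found"
            ;;
    esac
}

# Made-up sample data.
printf 'Country_Code,Amount\nCZ,10\n' > a.csv
printf 'CZ,10\n' > b.csv

result_a=$(remove_header_if_present a.csv Country_Code)
result_b=$(remove_header_if_present b.csv Country_Code)
echo "a.csv: $result_a / b.csv: $result_b"
```

A BASIC routine built on OPENSEQ, READSEQ and WRITESEQ would follow the same steps, reading the first record and then copying the rest to a new file.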

Post by kausmone »

It's more a case of trying to use a new thing as much as possible just to get to know it better :), hence trying the routine instead of 'sed'.

Anyways, thanks for your help!
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

How about just encapsulating this command as the filter command in your Sequential File stage?

Code:

grep -v Country_Code #SourceFile#
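
One thing to keep in mind with this filter: `grep -v` drops every line that contains the string, not only a leading header. If the string could also turn up in a data row, a filter limited to the first line may be safer. The sketch below compares the two on made-up sample data, assuming awk is available on the server.

```shell
#!/bin/sh
# Made-up sample: a header plus a data row that happens to contain the marker text.
printf 'Country_Code,Note\nCZ,ok\nAU,missing Country_Code\n' > sample.csv

# grep -v removes every line containing the marker, including the data row.
grep_lines=$(grep -v Country_Code sample.csv | wc -l | tr -d ' ')

# awk skips a line only when it is line 1 and contains the marker.
awk_lines=$(awk 'NR == 1 && /Country_Code/ { next } { print }' sample.csv | wc -l | tr -d ' ')

echo "grep kept $grep_lines line(s), awk kept $awk_lines line(s)"
```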
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.

Post by kausmone »

Hi Ray,

Thanks for the input, but I need to do this before the job that reads the file runs. Hence the routine/UNIX script.

Rgds,
Kaus

Post by ray.wurlod »

Then do it in a Routine.

Code:

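* Write every line that does not contain the header marker to a new file.
Shell = "UNIX"
Command = "grep -v Country_Code " : argSourceFilePath : " > " : argSourceFilePath : "_noheader"
Call DSExecute(Shell, Command, Result, ExitStatus)
* Note: grep also returns a non-zero status (1) when it outputs no lines at all.
If ExitStatus <> 0 Then
   Call DSLogWarn("Error (" : ExitStatus : ") removing header.", "MyRoutine")
End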
Process the new file (filename_noheader) in your Sequential File stage.
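
A note on the exit status check: grep returns 0 when it outputs at least one line, 1 when it outputs none, and a value greater than 1 on a real error. With `grep -v`, a file that consists only of a header would therefore give status 1 and trigger the warning even though nothing went wrong. The sketch below shows the three cases; the helper name `run_filter` and the file names are made up.

```shell
#!/bin/sh
# run_filter FILE  (hypothetical helper)
# Runs the same grep as the routine and prints grep's exit status.
run_filter() {
    grep -v Country_Code "$1" > "${1}_noheader" 2>/dev/null && echo 0 || echo $?
}

# Made-up sample data.
printf 'Country_Code,Amount\nCZ,10\n' > data.csv
printf 'Country_Code,Amount\n' > header_only.csv

status_data=$(run_filter data.csv)                 # header plus data rows
status_header_only=$(run_filter header_only.csv)   # nothing left after filtering
status_missing=$(run_filter no_such_file.csv)      # input file does not exist

echo "data: $status_data, header only: $status_header_only, missing file: $status_missing"
```

If a header-only file is a legitimate input, testing for a status greater than 1 instead of any non-zero status would avoid the spurious warning.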