Parse Header, Trailer and Detail using CFF

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
rhettbutlernc
Charter Member
Charter Member
Posts: 14
Joined: Thu Dec 04, 2003 11:34 am

Parse Header, Trailer and Detail using CFF

Post by rhettbutlernc »

How can I parse header, detail and trailer of the incoming file using the CFF stage if no 'ID field'( a unique number to indicate Hdr, detail and Trlr) is provided in the input file.

I can get the header by specifying the 'Start Rec #' and 'End Rec #'
in the Selection criteria as 0 and 1 but how can I get the body and trailer with out calculating the total no of records.

Thanks in advance!
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

That one's easy: you can't. With no way of determining the record type from within the data, and no a priori knowledge of the number of lines in the file, you're stuck.
An easy mechanism, however, is to have a little routine that captures the number of lines (be exeuting wc -l command, for example) and returns this value. The return value can be used (in a job sequence) as the value of a job parameter for your ETL job.
For the header line, @INROWNUM = 1. For the trailer line, @INROWNUM = #MaxLine#. For a detail line, @INROWNUM <> 1 And @INROWNUM <> #MaxLine#.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
talk2shaanc
Charter Member
Charter Member
Posts: 199
Joined: Tue Jan 18, 2005 2:50 am
Location: India

Post by talk2shaanc »

If i dont have any use of details in my header and trailer in my design, then i would avoid counting the no of lines in the file, if your file size is huge it would take time.

Another way could be, remove the Tail of the file by using TAIL command of unix. In CFF stage give start rec=1 and End rec=max no(999999999999). Then read the file, which would extract only detail record.
Shantanu Choudhary
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

... and how do you think the tail command works out which are the final lines in the file? :roll:
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
talk2shaanc
Charter Member
Charter Member
Posts: 199
Joined: Tue Jan 18, 2005 2:50 am
Location: India

Post by talk2shaanc »

ray.wurlod wrote:... and how do you think the tail command works out which are the final lines in the file? :roll:
Definately it does not reach tail or end of file, by reading through all the lines :shock: and keeping the count. if I remember exactly, It looks for End-Of-File Mark in the last few bytes and reads according to the option given with the tail command. I dont remember the exact detail now, will post it if i get one.
Shantanu Choudhary
dsxdev
Participant
Posts: 92
Joined: Mon Sep 20, 2004 8:37 am

Post by dsxdev »

Hi rhettbutlernc,
If you are not really interested in Header and trailer record then just give schema of Detail record.
All teh records which do not match this schema would be just rejected with out bothering you.
I you want to capture these Header and Trailer records add a reject link to the Sequential file Stage these Header and Trailer records are written to this file.
which you can process when you need.

If you want to read all three type of records then I guess rays option is better.

Regards
dsxdev
Happy DataStaging
rhettbutlernc
Charter Member
Charter Member
Posts: 14
Joined: Thu Dec 04, 2003 11:34 am

Post by rhettbutlernc »

Thanks for all your suggestions...

This is how I have implemented this...(I forgot to mention that I get the total no of records in the header)

I have initially parsed the header based on the Start rec# and End record # (0& 1) and stored the 'NumberOfRecs' into a hash file. The hash file is then used to constrain the incoming records using

@INROWNUM >1 or @INROWNUM < (lnkRecCnt.NumberOfRecs + 1)

Initially I am verifying the no of rows in the header with the total no of records in the incoming file using the function 'DSGetLinkInfo'.

Your comments most welcome...
Post Reply