
Reading a Sequential File with a Header and Footer

Posted: Thu Sep 02, 2004 4:28 am
by alexnirmal
How do I read a sequential file like the one below, which contains a header and a footer, and skip those so that only the necessary columns are read? The header is the name and the number of people, and the footer is a row of dashes. I just want the detail rows: col1, col2, col3 and col4.

Name :uselss
no of ppl :89


col1 col2 col3 col4
1 2 3 4
5 6 7 8


--------------------------------------------

Name :us
no of ppl :8df9


col1 col2 col3 col4
11 21 31 14
51 61 71 18

--------------------------------------------

Posted: Thu Sep 02, 2004 4:50 am
by denzilsyb
In your transform, from the sequential file add this constraint:

@INROWNUM > 2 (if header is first two lines)

This will omit the header. You can also ignore the footer by adding something to the constraint:

@INROWNUM > 1 and col1 <> '------'

It seems a dirty way of doing it, though.

You could also put this in BASIC code if you wanted.
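Outside DataStage, the same constraint logic can be sketched as an awk filter (a hypothetical equivalent, assuming the first two lines are the header and the footer row begins with dashes; the sample file name is a placeholder):

```shell
#!/bin/sh
# Create a small sample in the thread's format (sample.txt is a placeholder name).
printf 'Name :us\nno of ppl :89\n1 2 3 4\n5 6 7 8\n------\n' > sample.txt

# Hypothetical awk version of the constraint: skip the first two (header)
# lines and drop any footer row whose first field starts with a dash.
awk 'NR > 2 && $1 !~ /^-/' sample.txt
```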

Posted: Thu Sep 02, 2004 4:57 am
by anupam
If the format is going to remain the same, that is, if each and every file will have a header and footer like this:

Code:

Name :uselss 
no of ppl :89 


col1 col2 col3 col4 
1 2 3 4 
5 6 7 8 


-------------------------------------------- 
Then you can use a simple Unix script to delete the header, footer, and blank lines.

You can call that Unix script from your job using the before-job subroutine ExecSH. The script will remove the header, footer, and empty records. Then your file would look like this:

Code:

col1 col2 col3 col4
1 2 3 4 
5 6 7 8
And you know how to read this file, which uses a space as the delimiter.

This is one way of reading the file; there may be other ways as well. The only drawback here is that you have to call a script that manipulates the file so that it becomes readable by your job.
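The cleanup could be sketched like this in a shell script (the grep pattern and the filenames raw.txt/clean.txt are assumptions based on the sample layout above, not from the original post):

```shell
#!/bin/sh
# Write a small sample in the thread's format (raw.txt is a placeholder name).
printf 'Name :us\nno of ppl :89\n\ncol1 col2 col3 col4\n1 2 3 4\n5 6 7 8\n\n------------\n' > raw.txt

# Drop the "Name"/"no of ppl" header lines, the column-label line,
# the dashed footer, and blank lines, keeping only the detail rows.
grep -Ev '^(Name |no of ppl |col1 |-+[[:space:]]*$|[[:space:]]*$)' raw.txt > clean.txt

cat clean.txt
```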

Posted: Thu Sep 02, 2004 5:01 am
by anupam
I don't think what dnzl has suggested will work, as the problem is with reading the file, not with discarding the first two records.

The first problem is with the file format itself. If DataStage is able to read the file without any kind of warning, then the problem is solved.

So tricks like

@INROWNUM > 2 (if header is first two lines)

will not work.

Correct me if I am wrong.

Posted: Thu Sep 02, 2004 5:09 am
by denzilsyb
anupam wrote: I don't think what dnzl has suggested will work
:shock: anupam, you pointed out something that I overlooked: the file format. But if there is a CR/LF at the end of each row, the header constraint should work; we were using this process before we started to validate the headers and trailers.

Posted: Thu Sep 02, 2004 5:41 am
by alexnirmal
But since the file is in this format, I cannot even read it; it gives errors like the following and the job fails:

controlDfile..Sequential_File_0.DSLink40: nls_read_delimited() - row 1, column Field003

Posted: Thu Sep 02, 2004 5:55 am
by anupam
That is what I pointed out.

Unless and until the file is readable by DataStage, no other trick can be used; it will always give this kind of warning while reading the file.

The best alternative is to delete the header, footer, and blank lines before reading the file, making it readable.

Then a space delimiter can be used while reading the file.

Try this; it will work.

If you want to test it, don't write any script in the first phase. Just delete the blank lines, header, and footer using the vi editor and test that sample file. If it works, then you can write the script for deleting the header/footer/blank lines.

Try it out.

Posted: Thu Sep 02, 2004 6:55 am
by denzilsyb
Very simply, in BASIC you would code this algorithm:

open file
loop
   read record
   if it is a detail record then
      write it to the output file
   end if
repeat until end of file
close file

OK, what happens here is that you loop through the file, testing whether each record is a detail record. This works if you have a CR/LF at the end of each row, because you're just testing the first value (col1). It happens in code, though, and not visually as one would like, but sometimes code is the way to go.

Once you have run this code, you are left with a sequential file that you can run through DataStage transforms, etc.
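As a sketch, the same loop-and-test can be done with awk, keeping only rows whose first field is numeric (the filename and the integer test are assumptions, not part of the original post):

```shell
#!/bin/sh
# Sample input in the thread's format (seq.txt is a placeholder name).
printf 'Name :us\nno of ppl :8df9\n\ncol1 col2 col3 col4\n11 21 31 14\n51 61 71 18\n------\n' > seq.txt

# Keep only records whose first field is an unsigned integer,
# i.e. the detail rows; everything else is header/footer noise.
awk '$1 ~ /^[0-9]+$/' seq.txt
```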

Posted: Thu Sep 02, 2004 4:18 pm
by ray.wurlod
alexnirmal wrote: But since the file is in this format, I cannot even read it; it gives errors like the following and the job fails:

controlDfile..Sequential_File_0.DSLink40: nls_read_delimited() - row 1, column Field003
Within the Sequential File stage on the Columns grid there is scope for setting "missing columns" rules. Scroll the Columns grid to the right to find them. Ignore the missing Field003 and Field004 in the header lines.
(Anupam, this has been used at least once at your site, in Feb 2003.)

Posted: Sat Sep 04, 2004 3:38 am
by anupam
As Ray suggested, you can use the "missing columns" rules for reading the file. In fact, in one of our applications we have used this method, but somehow I am not very comfortable with this approach.

There is no technical reason for this; I just feel more comfortable with Unix-level manipulation of the file.

I tested this method for reading the sample data provided by alexnirmal, and I am able to read the file without any kind of error, but then obviously some kind of transformation logic needs to be implemented.