Reading Sequential File with Header and Footer

Post questions here related to DataStage Server Edition, covering areas such as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

alexnirmal
Participant
Posts: 2
Joined: Thu Sep 02, 2004 4:10 am

Reading Sequential File with Header and Footer

Post by alexnirmal »

How do I read a sequential file like this, which contains a header and a footer? How do I skip those and read only the columns I need? The header is the name and the number of people, and the footer is a line of dashes. I just want the detail columns: col1, col2, col3 and col4.

Name :uselss
no of ppl :89


col1 col2 col3 col4
1 2 3 4
5 6 7 8


--------------------------------------------

Name :us
no of ppl :8df9


col1 col2 col3 col4
11 21 31 14
51 61 71 18

--------------------------------------------
ETL developer,
Satyam Computer Services Limited, India.
denzilsyb
Participant
Posts: 186
Joined: Mon Sep 22, 2003 7:38 am
Location: South Africa
Contact:

Post by denzilsyb »

In your Transformer, add this constraint to the input from the Sequential File stage:

@INROWNUM > 2 (if the header is the first two lines)

This will omit the header. You can ignore the footer as well by extending the constraint:

@INROWNUM > 2 and col1 <> '------'

It seems a dirty way of doing it, though.

You could also put this in BASIC code if you wanted.
dnzl
"what the thinker thinks, the prover proves" - Robert Anton Wilson
anupam
Participant
Posts: 172
Joined: Fri Apr 04, 2003 10:51 pm
Location: India

Post by anupam »

If the format is going to remain the same, that is, every file will have a header and footer like this:

Code:

Name :uselss 
no of ppl :89 


col1 col2 col3 col4 
1 2 3 4 
5 6 7 8 


-------------------------------------------- 
Then you can use a simple unix script to delete the header, footer and blank lines.

You can call that script from your job using the Before-job subroutine ExecSH. The script removes the header, footer and empty records, leaving the file like this:

Code:

col1 col2 col3 col4
1 2 3 4 
5 6 7 8
And you know how to read this file, which has a space as its delimiter.

This is one way of reading the file; there may be others. The only drawback is that you have to call a script that manipulates the file so that it becomes readable by your job.
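As an illustration, the cleanup described above could be sketched as a small awk filter (my sketch, not code from the thread; it assumes detail rows start with a digit, the footer is the dashed line, and the "col1 ..." header should be kept only once):

```shell
# Hypothetical pre-processing filter: drop the Name / "no of ppl" header lines,
# blank lines and the dashed footer, keeping the "col1 ..." column header once.
# Reads stdin, writes stdout.
strip_header_footer() {
  awk '
    /^col1/  { if (!seen++) print; next }  # column-header line: print only once
    /^[0-9]/ { print }                     # numeric detail rows: keep
    # Name / "no of ppl" lines, blanks and dashed lines fall through (dropped)
  '
}
```

A script wrapping a filter like this could then be called from the job's Before-job subroutine (ExecSH), as described above.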
----------------
Rgds,
Anupam
----------------
The future is not something we enter. The future is something we create.
anupam
Participant
Posts: 172
Joined: Fri Apr 04, 2003 10:51 pm
Location: India

Post by anupam »

I don't think what dnzl has suggested will work, as the problem is with reading the file, not with discarding the first two records.

The first problem is with the file format itself. If DataStage can read the file without any warnings, then the problem is solved.

So tricks like

@INROWNUM > 2 (if the header is the first two lines)

will not work.

Correct me if I am wrong.
----------------
Rgds,
Anupam
----------------
The future is not something we enter. The future is something we create.
denzilsyb
Participant
Posts: 186
Joined: Mon Sep 22, 2003 7:38 am
Location: South Africa
Contact:

Post by denzilsyb »

anupam wrote: I don't think what dnzl has suggested will work
:shock: anupam, you pointed out something I overlooked: the file format. But if there is a CR/LF at the end of each row, the header constraint should work; we used this process before we started validating headers and trailers.
dnzl
"what the thinker thinks, the prover proves" - Robert Anton Wilson
alexnirmal
Participant
Posts: 2
Joined: Thu Sep 02, 2004 4:10 am

Post by alexnirmal »

But since the file is in this format, I cannot even read it; it gives an error like this and the job fails:

controlDfile..Sequential_File_0.DSLink40: nls_read_delimited() - row 1, column Field003
ETL developer,
Satyam Computer Services Limited, India.
anupam
Participant
Posts: 172
Joined: Fri Apr 04, 2003 10:51 pm
Location: India

Post by anupam »

That is what I pointed out.

Unless and until the file is readable by DataStage, no other trick can be used; it will always give this kind of warning while reading the file.

The best alternative is to delete the header, footer and blank lines before reading the file, making it readable.

Then a space delimiter can be used while reading the file.

Try this; it will work.

If you want to test it, don't write any script at first. Just delete the blank lines, header and footer using the vi editor and test that sample file. If that works, then you can write the script that deletes the header/footer/blank lines.

Try it out.
----------------
Rgds,
Anupam
----------------
The future is not something we enter. The future is something we create.
denzilsyb
Participant
Posts: 186
Joined: Mon Sep 22, 2003 7:38 am
Location: South Africa
Contact:

Post by denzilsyb »

Very simply, in BASIC you would code this algorithm:

open the file
loop until end of file
   read a record
   if it is a detail record then
      write it to the output file
   end if
repeat
close the file

OK, what happens here is that you loop through the file, testing whether each record is a detail record. This works if you have a CR/LF at the end of each row, because you are just testing the first value (col1). It happens in code rather than visually, as one would like, but sometimes code is the way to go.

Once you have run this code, you are left with a sequential file that you can run through DataStage transforms etc.
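As a rough illustration of the same loop outside BASIC, a shell filter along these lines (my sketch; the function name and the leading-digit test for detail records are assumptions, not dnzl's actual code) reads each record and writes out only the detail records:

```shell
# Sketch of the algorithm above as a shell filter: read each record from
# stdin, test whether it is a detail record (here: starts with a digit,
# i.e. testing the first value, col1), and write only those to stdout.
filter_details() {
  while IFS= read -r record; do
    case $record in
      [0-9]*) printf '%s\n' "$record" ;;  # detail record: write it out
      *)      : ;;                        # header/footer/blank: skip
    esac
  done
}
```

Run over the raw file, this leaves only the numeric detail rows for DataStage to read.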
dnzl
"what the thinker thinks, the prover proves" - Robert Anton Wilson
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

alexnirmal wrote: But since the file is in this format, I cannot even read it; it gives an error like this and the job fails:

controlDfile..Sequential_File_0.DSLink40: nls_read_delimited() - row 1, column Field003
Within the Sequential File stage, on the Columns grid, there is scope for setting "missing columns" rules; scroll the Columns grid to the right to find them. Ignore the missing Field003 and Field004 in the header lines.
(Anupam, this has been used at least once at your site, in Feb 2003.)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
anupam
Participant
Posts: 172
Joined: Fri Apr 04, 2003 10:51 pm
Location: India

Post by anupam »

As Ray suggested, you can use the "missing columns" rules to read the file. In fact, in one of our applications we have used this method, but somehow I am not very comfortable with the approach.

There is no technical reason for this; I just feel more comfortable with unix-level manipulation of the file.

I tested this method on the sample data alexnirmal provided, and I was able to read the file without any errors, though obviously some transformation logic then needs to be implemented.
----------------
Rgds,
Anupam
----------------
The future is not something we enter. The future is something we create.