
File Reading

Posted: Tue Mar 04, 2008 11:58 am
by wong_ricky
I am having trouble reading a simple text file whose first line contains only blanks but whose subsequent lines contain records. How do I tell DataStage to ignore the blank first line? I tried checking the "First line is column names" box on the Format tab of the output, expecting it to skip the first line, but I got the error "read_delimited() row 1 column X, required column missing". Any suggestions or help would be much appreciated.

Posted: Tue Mar 04, 2008 12:58 pm
by throbinson
Activate filtering and put in this filter:
awk 'NR != 1'

This will skip the first line of the file.
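
As a rough illustration of how such a filter behaves outside DataStage, the sketch below pipes a small sample through awk. The helper name skip_first_line and the sample records are made up for the example; NR is awk's built-in record counter, so the condition passes every line except the first.

```shell
# Hypothetical helper: pass every record except the first.
# NR is awk's running record (line) number.
skip_first_line() {
    awk 'NR != 1'
}

# Made-up sample: a blank first line followed by two records.
printf '\nA,1\nB,2\n' | skip_first_line
```

Only the first line is dropped, so blank lines further down the file would still reach the stage.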

Posted: Wed Mar 05, 2008 4:16 am
by sachin1
Hello throbinson, may I ask how this would be done on a Windows system?

Posted: Wed Mar 05, 2008 4:38 am
by ArndW
Usually you can install UNIX commands with MKS Toolkit or a similar product. In your case just fool DataStage and, in the sequential stage, state "File has column headers" so the first line is skipped.

Posted: Wed Mar 05, 2008 5:40 am
by sachin1
Yes, that's fine if only my first line is blank. But if I have subsequent blank lines, I need to suppress those lines too before processing, on Windows.

Posted: Wed Mar 05, 2008 6:15 am
by ray.wurlod
That was not a part of your original specification!

Why not write a little routine to pre-process the file, to remove all the totally blank lines? This can be executed, for example, as a before-job subroutine, and the job reads its result (a different, but associated, file name).
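
One possible shape for such a routine, sketched in shell under a few assumptions: the function name strip_blank_lines and the ".noblank" suffix for the associated output file are invented for this example, and a UNIX-style shell is assumed to be available (on Windows that would mean something like the MKS Toolkit mentioned earlier in the thread).

```shell
# Hypothetical pre-processing routine: copy the file named in $1 to
# "$1.noblank", dropping lines that are empty or contain only spaces/tabs.
# The ".noblank" suffix is an assumed naming convention, not a DataStage one.
strip_blank_lines() {
    infile=$1
    outfile="${infile}.noblank"
    # NF is awk's field count; with the default separator it is 0 for
    # empty and whitespace-only lines, so only real records pass.
    awk 'NF > 0' "$infile" > "$outfile" || return 1
    # Report the name of the cleaned file so the caller can read it.
    printf '%s\n' "$outfile"
}
```

The job would then read the ".noblank" file instead of the original. Writing to a separate file leaves the source untouched, which makes a rerun safe.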

Posted: Wed Mar 05, 2008 7:44 am
by throbinson
I think the OP wong_ricky said that checking the checkbox saying the first line is column headers requires that the first line still be formatted correctly. That is, contain the right number of delimiters. I assume that is correct. Is it? A blank line isn't formatted according to the file schema. Sachin, I think, hijacked the post and so don't need no stinkin' specifications. You're on your own for Windows functionality.

Posted: Wed Mar 05, 2008 7:59 am
by chulett
throbinson wrote: I think the OP wong_ricky said that checking the checkbox saying the first line is column headers requires that the first line still be formatted correctly. That is, contain the right number of delimiters. I assume that is correct. Is it? A blank line isn't formatted according to the file schema.
Yes, that is correct.

Posted: Wed Mar 05, 2008 8:21 am
by ArndW
Checking "first line is column names" makes DS skip that line - regardless of contents or format.
If you have multiple empty lines then pre-process as Ray suggested. You can even do a simple UNIX "sed '/^$/d' InFile.txt >OutFile.txt"
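
Note that the pattern /^$/ matches only lines with no characters at all. If the "blank" lines may hold spaces, tabs, or a trailing carriage return (as in files created on Windows), a slightly wider pattern is one option. The following is a minimal sketch; the helper name drop_blank_lines and the sample data are made up for illustration.

```shell
# Hypothetical helper: delete lines that are empty or hold only whitespace.
# The POSIX class [[:space:]] also matches a carriage return, so blank
# lines with CRLF endings are removed as well.
drop_blank_lines() {
    sed '/^[[:space:]]*$/d'
}

# Made-up sample: two records separated by an empty line and a
# line holding only spaces and a tab.
printf 'A,1\n\n  \t\nB,2\n' | drop_blank_lines
```

Non-blank lines pass through unchanged, including any carriage return they carry.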

Posted: Wed Mar 05, 2008 8:37 am
by chulett
Hmmm... I really don't think that's true, Arnd. It still has to be read.

I believe a small test is in order.