remove blank lines from end of file

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Jay
Participant
Posts: 105
Joined: Tue Nov 11, 2003 8:28 pm

Post by Jay »

Hi All,

We get flat files which have blank lines at the end of the file. This creates the "Mismatch in number of columns" error.

How can we get rid of these lines? Would a before-job subroutine calling a shell script work?
Can I get some ideas on how to write the script?

Thanks in advance
J
martin
Participant
Posts: 67
Joined: Fri Jul 30, 2004 7:19 am
Location: New Jersey

Post by martin »

Use a Transformer stage, map all the columns, and define the constraint below to eliminate the trailing newline or blank line.
Make sure you apply it to a key column (or key columns) that does not contain spaces.

DSLink.ColumnName > ' ' (here, take one key column)
or
(Col1:Col2:Col3:Col4) > ' ' (here you can take multiple key columns)

Good Luck
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

The sed command is very fast and can delete blank lines. I can't remember the exact syntax, but it is something like

sed '/^$/d'

This says: delete every line where the beginning of the line (^) is immediately followed by the end of the line ($), with nothing in between.
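
For reference, here is a minimal sketch of two sed forms that should work with a standard sed. The file name input.txt and the sample rows are made up for illustration. The first command drops every empty line; the second is a well-known sed idiom that drops only the run of empty lines at the end of the file, which is closer to what was originally asked.

```shell
#!/bin/sh
# Hypothetical sample file: two data rows, one blank line in the middle,
# and two blank lines at the end.
workdir=$(mktemp -d)
printf 'a,b,c\n\nd,e,f\n\n\n' > "$workdir/input.txt"

# 1) Delete every empty line (start of line immediately followed by end of line).
sed '/^$/d' "$workdir/input.txt" > "$workdir/no_blank.txt"

# 2) Delete only the blank lines at the end of the file, keeping interior ones.
#    While the pattern space holds nothing but newlines, either delete it
#    (if the last input line has been reached) or append the next line and loop.
sed -e :a -e '/^\n*$/{$d;N;ba' -e '}' "$workdir/input.txt" > "$workdir/no_trailing.txt"

# Count the lines left in each result.
all_count=$(awk 'END { print NR }' "$workdir/no_blank.txt")
trailing_count=$(awk 'END { print NR }' "$workdir/no_trailing.txt")
echo "all blank lines removed: $all_count lines"
echo "trailing blank lines removed: $trailing_count lines"
```

On this sample the first form leaves 2 lines and the second leaves 3, because the interior blank line survives. Neither form matches lines that contain spaces or a carriage return, so files from Windows systems may need a pattern such as '/^[[:space:]]*$/d' instead.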
Mamu Kim
martin
Participant
Posts: 67
Joined: Fri Jul 30, 2004 7:19 am
Location: New Jersey

Post by martin »

Would you please educate us on this command: how to use it and where to use it?
Thanks
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

It's the "stream editor", available on any UNIX box. Try typing "man sed" at the command line and check it out! :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

Craig is correct. It is a simple UNIX command. It is very fast, but it has very ugly syntax. You suggested that you had many blank lines, and this will solve that problem. I think this has been covered before, so I would search this forum and maybe Google.

Alternatively, you can ignore these errors in the Sequential File stage; you would then also need a constraint on the output link. So there are a couple of options available to you.
Mamu Kim
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Expanding slightly on what Kim said.

You can set the "missing columns" rules in the Sequential File stage (Columns grid - scroll right to find them) to have DataStage ignore the fact that there are missing columns, pad with null, etc.

Then you constrain the output so that these rows are not output, using an output constraint expression in the Transformer stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Jay
Participant
Posts: 105
Joined: Tue Nov 11, 2003 8:28 pm

Post by Jay »

Hi All,

Thanks for all the replies.

I have been playing around with the 'sed' command. I'll let you know what I end up with.

Thanks
j
saprebv
Participant
Posts: 11
Joined: Wed Aug 25, 2004 1:37 am

Post by saprebv »

Put a constraint in the first Transformer:
trim(field1)<>'' and trim(field2)<>'' and trim(field3)<>'' and .....

:)
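
For comparison, the same kind of test can be sketched outside DataStage. This is only an analogy in awk, not a DataStage expression; the comma delimiter, the three-column layout, and the sample rows are assumptions. It checks a single key column, as martin suggested, rather than every field.

```shell
#!/bin/sh
# Hypothetical comma-delimited sample: two good rows, one whitespace-only
# line, and one empty line at the end.
workdir=$(mktemp -d)
printf '1,alice,NY\n2,bob,NJ\n   \n\n' > "$workdir/input.txt"

# Keep a row only when the first field (standing in for the key column)
# is non-empty after stripping surrounding spaces and tabs.
awk -F',' '{
    key = $1
    gsub(/^[ \t]+|[ \t]+$/, "", key)   # rough equivalent of trim()
    if (key != "") print
}' "$workdir/input.txt" > "$workdir/filtered.txt"

kept=$(awk 'END { print NR }' "$workdir/filtered.txt")
echo "rows kept: $kept"
```

Both the whitespace-only line and the empty line have an empty key after trimming, so only the two data rows are kept.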
Jay
Participant
Posts: 105
Joined: Tue Nov 11, 2003 8:28 pm

Post by Jay »

Hey All

I ended up with this one-liner:

awk 'length>1' <file_name> | sed -e '/^^/d'

Is this OK?

Thanks, all
j
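
A quick way to check a one-liner like this is to run it on a small sample. The sketch below uses made-up rows and looks only at the awk stage: awk 'length>1' already drops empty lines, so the sed stage should have no empty lines left to remove. It also drops any row that is exactly one character long, which may or may not be wanted.

```shell
#!/bin/sh
# Hypothetical sample: a normal row, a one-character row, an empty line in
# the middle, and two empty lines at the end.
workdir=$(mktemp -d)
printf 'a,b,c\nx\n\nd,e,f\n\n\n' > "$workdir/input.txt"

# The awk stage on its own: keep lines longer than one character.
awk 'length > 1' "$workdir/input.txt" > "$workdir/awk_only.txt"

# Alternative that drops empty lines but keeps one-character rows.
awk 'length > 0' "$workdir/input.txt" > "$workdir/nonempty.txt"

awk_only=$(awk 'END { print NR }' "$workdir/awk_only.txt")
nonempty=$(awk 'END { print NR }' "$workdir/nonempty.txt")
echo "length > 1 keeps $awk_only lines"
echo "length > 0 keeps $nonempty lines"
```

On this sample, length > 1 keeps 2 lines and length > 0 keeps 3. Using length > 1 can be a deliberate choice when the "blank" lines actually contain a lone carriage return, but that is an assumption about the input files rather than something stated in the thread.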