Remove first four rows from file

dubuku_01
Participant
Posts: 79
Joined: Fri Nov 18, 2011 2:18 pm
Location: chennai

Post by dubuku_01 »

Hi all,

I am trying to read a text file using the Sequential File stage. The file contains 10 columns, and I have defined 10 column names in the Columns section. The column headers start on the fifth line; the first four lines contain some descriptive information about the file. Please help me remove those four lines before proceeding with the load.

Thanks in advance
srds2
Premium Member
Posts: 66
Joined: Tue Nov 29, 2011 6:56 pm

Post by srds2 »

Use a before-job subroutine to call a UNIX script that deletes the first 5 lines (the four descriptive lines plus the column header row).
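A minimal sketch of such a script, assuming a POSIX shell with tail and mktemp-style temp handling is available on the engine host. The function name strip_header_lines and the temp-file naming are hypothetical, chosen only for illustration; the line count defaults to 4 and can be set to 5 to drop the header row as well.

```shell
#!/bin/sh
# Hypothetical helper: remove the first N lines of a file in place.
# Usage: strip_header_lines FILE [N]    (N defaults to 4)
strip_header_lines() {
    file=$1
    count=${2:-4}
    tmp="${file}.tmp.$$"    # temp file next to the original (assumed writable)
    # tail -n +K starts output at line K, so K = count + 1 skips `count` lines.
    # The original is replaced only if tail succeeded.
    tail -n "+$((count + 1))" "$file" > "$tmp" && mv "$tmp" "$file"
}
```

Because this rewrites the source file, it should run only once per file; running it again would strip real data rows.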
kwwilliams
Participant
Posts: 437
Joined: Fri Oct 21, 2005 10:00 pm

Post by kwwilliams »

Just put a reject link on your Sequential File stage. The lines that don't match the file's format, i.e. the four descriptive lines, will flow to the reject link, and the rest of the data will flow through the job normally. Pretty simple.
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

Unless they just happen to meet the defined format of course...

There is a filter option on the Sequential File stage; just define a suitable filter to knock out the lines you don't want.
kwwilliams
Participant
Posts: 437
Joined: Fri Oct 21, 2005 10:00 pm

Post by kwwilliams »

Of course ... It is probable that the delimiter (pipe, tab, comma) used within the file would preclude a header description from matching the format of the file's content. A filter only works if the format matches, so that there is a defined field on which to filter out the first four lines, which is why I didn't suggest it.

If you don't want to do it this way, have the Sequential File stage read each line of the file as a single column, use a Transformer constraint to eliminate the first four rows, and then use the Column Import stage to break the single line into multiple columns.
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

I was thinking more along the lines of a filter that drops the first 4/5 lines, rather than one that depends on anything within the data.
kwwilliams
Participant
Posts: 437
Joined: Fri Oct 21, 2005 10:00 pm

Post by kwwilliams »

According to the documentation, the filter only works after reading the data.
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

Hmmm, without testing, I would have hoped it passed the data through the filter before it attempted to parse it (like it does in Server jobs). As far as I remember (which admittedly isn't much), it should be a command line call to process the file prior to DataStage processing... still, will give the OP something to confirm if they so desire...
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Some UNIXes allow a positive argument to the tail command, for example tail +5 filename, which starts output at line 5 (the POSIX spelling is tail -n +5 filename). This would be your filter if you have that capability.
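A small sketch of the positive-offset idea using tail -n +5, which starts output at line 5. The sample function and its contents are made up for illustration, and whether the stage hands the raw file to the filter before parsing is exactly what is debated above, so treat this as something to test rather than a confirmed recipe.

```shell
# Hypothetical sample: four descriptive lines, a header row, two data rows.
sample() {
    printf 'Report: demo\nSource: demo\nDate: demo\n\nid,name\n1,alpha\n2,beta\n'
}

# Start output at line 5, so the header row and the data rows survive.
sample | tail -n +5
```

Using tail -n +6 instead would drop the header row too, which may be preferable if the stage is not set to treat the first line as column names.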
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kandyshandy
Participant
Posts: 597
Joined: Fri Apr 29, 2005 6:19 am
Location: Singapore

Post by kandyshandy »

If it is a one-time load, simply edit the file and remove the first 4 lines manually ;)

If not, there are many ways to remove the first 4 lines using UNIX commands, for example the sed command. Run the command at the appropriate point and then call the load job!!
Kandy
_________________
Try and Try again…You will succeed atlast!!
pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

This is what you require:

sed '1,4d'
pandeeswaran
kandyshandy
Participant
Posts: 597
Joined: Fri Apr 29, 2005 6:19 am
Location: Singapore

Post by kandyshandy »

Pandeesh, will this command just stream the output? Don't you need to write to a temp file and replace the original file with the temp file?
Last edited by kandyshandy on Tue Feb 07, 2012 9:51 pm, edited 1 time in total.
pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

I don't know what you mean by streaming.
There is nothing harmful in specifying that command in the filter option.
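To make the streaming point concrete, here is a sketch assuming a UNIX-like shell with sed available. Without a file argument, sed reads standard input and writes the remaining lines to standard output, so the source file is never modified and no temp file is needed when the command is used as a filter. The wrapper name filter_first_four is hypothetical.

```shell
# Hypothetical wrapper: the sed command from above as a stdin-to-stdout filter.
filter_first_four() {
    sed '1,4d'    # delete lines 1 through 4, pass everything else through
}

# Example use (input.txt is a placeholder name); the file stays unchanged:
#   filter_first_four < input.txt
```

A temp file and replace step is only needed when the file itself must be rewritten before the job runs.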
pandeeswaran
dubuku_01
Participant
Posts: 79
Joined: Fri Nov 18, 2011 2:18 pm
Location: chennai

Post by dubuku_01 »

Operating system : Windows
pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

Then you need to have sed or the MKS Toolkit installed.
pandeeswaran