Sequential File - Logic required

DSDexter
Participant
Posts: 94
Joined: Wed Jul 11, 2007 9:36 pm
Location: Pune,India

Post by DSDexter »

Hi Gurus,

I have a flat file with data records followed by a trailer record that contains the record count. Below this count I am getting a blank line and an EOF character on the next line, as shown below:

col1,col2,col3\n
100\n
................................................................................\n
EOF

(I am including \n for better understanding only).

How can I process only the lines above the record count? To add to that, the number of blank lines is not consistent :(

Can I get rid of all the unwanted lines using the Filter property of the Sequential File stage?

Any help will be appreciated.
Thanks
DSDexter
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

If you have multiple columns in real data records and only one column (no commas) in unwanted records you can use the reject row facility in the flat file stage to discard those.
DSDexter

Post by DSDexter »

ArndW,

But in that case it will throw a warning in the log saying the import was unsuccessful, and I don't want that to happen.
Also, on the reject link I have a Transformer that aborts the job if even a single reject is encountered (it's not evil, it's the requirement) 8) .

So I have to avoid the above two scenarios.
Thanks
DSDexter
ArndW

Post by ArndW »

You can use an external filter and call awk, sed, or another program of your choice that copies only the part of the file you want. Alternatively, declare the file with just one big string record, run it through a Transformer that uses substring functions to filter out the unwanted records, and then use a column export stage to create your columns.
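As a rough illustration of the awk option, here is a minimal sketch that copies input up to (but not including) the trailer line. It assumes the filter command receives the file on standard input, that the trailer is a line holding only digits, and that no genuine data record looks like that. The function name strip_trailer and the sample data are made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: copy stdin to stdout until the trailer line is seen.
# Assumption: the trailer is a line containing only digits (the record
# count), optionally followed by spaces or tabs.
strip_trailer() {
    # On the first digits-only line, stop reading; print every line before it.
    awk '/^[0-9]+[ \t]*$/ { exit } { print }'
}

# Example run with inline sample data (invented for illustration).
printf 'a,b,c\nd,e,f\n2\n\n   \n' | strip_trailer
```

Because awk stops at the trailer, the number of blank lines after it does not matter, and nothing reaches the reject link.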
DSDexter

Post by DSDexter »

ArndW,

I am using the following approach:

1. Remove all the blank lines using

sed -e "/^[ ]*$/d"

2. Get the line count of the result using wc -l.
3. Use head to keep (line count - 1) records.

Can I use multiple filters, pipe-separated, in the External Filter stage?
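For reference, the three steps can be sketched as a single pass over standard input, which avoids reading the file twice for wc and head. This is only a sketch: it assumes the record count is the last non-blank line of the input, and the function name drop_blank_and_trailer and the sample data are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper combining the three steps in one pipeline.
# Assumption: after blank lines are removed, the trailer (record count)
# is the last line of the input.
drop_blank_and_trailer() {
    # First sed: delete lines that are empty or contain only spaces.
    # Second sed: '$d' deletes the last remaining line, which has the same
    # effect as running head with (line count - 1).
    sed -e '/^[ ]*$/d' | sed -e '$d'
}

# Example run with inline sample data (invented for illustration).
printf 'a,b,c\nd,e,f\n2\n\n   \n' | drop_blank_and_trailer
```

If the EOF marker sits on a line of its own, that line would be the one dropped and the count would survive, so the input would need checking before relying on this.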
Thanks
DSDexter
ArndW

Post by ArndW »

Yes, you can use multiple filters. You might encounter runtime errors when using multiple pipes (I can't recall what they were), but that can be solved by making the command "sh" and the arguments "'{your commands}'".
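In plain shell terms, the wrapping idea looks roughly like the sketch below, where the whole pipeline is handed to sh as one string through its standard -c option. How the command and its arguments are split across the stage's properties is an assumption here, and the pipeline and the function name run_filter are only examples.

```shell
#!/bin/sh
# Sketch: run a multi-command pipeline as a single "sh -c" invocation,
# so the caller only ever sees one command.
run_filter() {
    # The outer double quotes make the pipeline one argument to sh.
    # \$ stops the outer shell from expanding $ before the inner sh sees it.
    sh -c "sed -e '/^[ ]*\$/d' | sed -e '\$d'"
}

# Example run with inline sample data (invented for illustration).
printf 'a,b,c\nd,e,f\n2\n\n   \n' | run_filter
```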