Multiple file reading using file pattern

Roopanwita
Participant
Posts: 125
Joined: Mon Sep 11, 2006 4:22 am
Location: India

Post by Roopanwita »

Hi,

I am trying to read multiple files with the same layout and similar file names using the File Pattern read method of the Sequential File stage.

Every file has a header row. I set "First Line is Column Names" in the Sequential File stage, but it only skips the header of one file; for all the other files, the header is still read as the first record.

Is there a way to make it skip the header of every file?

Thank you in advance.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Sorry, I have nothing useful to say other than - dang, I would have thought they would have fixed / handled that by now. :?

Well... you could probably handle this via sed / awk / grep in the Filter option of the stage.
-craig

"You can never have too many knives" -- Logan Nine Fingers
eph
Premium Member
Posts: 110
Joined: Mon Oct 18, 2010 10:25 am

Post by eph »

Hi,

I guess you can do it this way:
- Enable both the file name output column and the row number output column in the Sequential File stage.
- Hash-partition on filename and sort on filename and row number going into a Transformer, then use stage variable logic to get rid of the first line of each file, like below.

Code:

svIN = filename
svTestFirstLine = If svIN <> svOLDIN Then 1 Else 0
svOLDIN = filename

Let us know if it works.
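
For readers who want to check the logic outside DataStage, here is a minimal Python sketch of the same idea. It is an illustration only, not DataStage code: the function name `drop_first_row_per_file` and the `(filename, row_number, record)` tuple layout are assumptions made for the example, standing in for the file name and row number columns described above.

```python
def drop_first_row_per_file(rows):
    """Drop the first row of each file from (filename, row_number, record) tuples.

    Hypothetical helper: mirrors the stage-variable logic, not a DataStage API.
    """
    kept = []
    previous_file = None  # plays the role of svOLDIN
    # Sort on filename, then row number, as in the sort feeding the Transformer.
    for filename, row_number, record in sorted(rows, key=lambda r: (r[0], r[1])):
        is_first_line = filename != previous_file  # plays the role of svTestFirstLine
        previous_file = filename  # updated only after the comparison
        if not is_first_line:
            kept.append(record)
    return kept


rows = [
    ("a.csv", 1, "id,name"),
    ("a.csv", 2, "1,alice"),
    ("b.csv", 1, "id,name"),
    ("b.csv", 2, "2,bob"),
    ("a.csv", 3, "3,carol"),
]
result = drop_first_row_per_file(rows)
```

The order of the assignments matters: the previous file name has to be updated after the comparison, otherwise no row would ever be flagged as a first line.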

Eric
Roopanwita
Participant
Posts: 125
Joined: Mon Sep 11, 2006 4:22 am
Location: India

Post by Roopanwita »

Hi,

Thank you for the reply.

I have filtered out the headers in a Transformer stage and the job is working fine, but I wanted to check whether there is any option or parameter that treats the first line of each file as a header when reading with a file pattern.

Thank you.
eph
Premium Member
Posts: 110
Joined: Mon Oct 18, 2010 10:25 am

Post by eph »

Hi,

As I understand it, the Sequential File stage runs a "cat" (short for concatenate) command on the file pattern, which means it concatenates all files matching the pattern into one stream. This is why it can't handle the header in each file: only the first line of the combined stream is treated as a header.
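
To make that concrete, here is a small Python sketch of the behaviour as described above. It only models the "concatenate, then skip one line" explanation; it is not how the stage is actually implemented, and the helper name `read_like_cat` and the sample files are made up for the example.

```python
import os
import tempfile


def read_like_cat(paths, skip_first_line=True):
    """Concatenate files into one stream, then skip at most one header line.

    Hypothetical helper modelling the explanation above, not a DataStage API.
    """
    lines = []
    for path in paths:
        with open(path) as handle:
            lines.extend(handle.read().splitlines())
    return lines[1:] if skip_first_line else lines


with tempfile.TemporaryDirectory() as workdir:
    paths = []
    for name, body in [("a.csv", "id,name\n1,alice\n"), ("b.csv", "id,name\n2,bob\n")]:
        path = os.path.join(workdir, name)
        with open(path, "w") as handle:
            handle.write(body)
        paths.append(path)
    stream = read_like_cat(paths)
```

Under this model the header of the second file stays in the stream as an ordinary record, which matches the symptom reported in the first post.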

Eric
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

Version 11.3 processes a file pattern in parallel by default.
Choose a job you love, and you will never have to work a day in your life. - Confucius