Reading multiple files using sequential file or other stage

highpoint
Premium Member
Posts: 123
Joined: Sat Jun 19, 2010 12:01 am
Location: Chicago

Post by highpoint »

Hi,

I have a file that contains a list of the files I need to read. All of these files will be in the same directory, but their names do not follow a common pattern.

How can I read these files using the Sequential File stage or any other stage?

Say my main file is Filelist.txt. It contains the following data:

input1.txt
hours2.txt
customer3.txt

The number of files will vary each time.

I would like to read the contents of all these files, perform transformations, and load the results into a target table.

I could do this using the External Source stage.

I am looking for the other options I have, and for the easiest one with the best performance.

Note: These files are huge, around 10+ million each.

Any help is appreciated.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Read the file of file names in an Execute Command activity, using the cat command.
In a User Variables activity, convert the line terminator characters into, for example, commas, so that you have a comma-delimited list.
Use this comma-delimited list in a Start Loop activity.
A job within the loop is passed the next file name from the list as the StartLoop.$Counter activity variable.
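
The list conversion in the steps above can be sketched outside DataStage. This is only a plain Python illustration of the logic, not DataStage syntax; the function name to_comma_list is hypothetical, and the sample listing is the Filelist.txt content from the question.

```python
def to_comma_list(raw: str) -> str:
    """Turn newline-separated file names into one comma-delimited string."""
    # splitlines() copes with both \n and \r\n terminators; blank lines are skipped.
    names = [line.strip() for line in raw.splitlines() if line.strip()]
    return ",".join(names)


# Sample contents of Filelist.txt, as the cat command would return them.
raw_listing = "input1.txt\nhours2.txt\ncustomer3.txt\n"

delimited = to_comma_list(raw_listing)
print(delimited)  # input1.txt,hours2.txt,customer3.txt

# A loop over the delimited list hands one file name to each iteration,
# which is where the job that reads the file would be invoked.
for file_name in delimited.split(","):
    print("would run the read job for", file_name)
```

Skipping blank lines matters here because the output of cat normally ends with a line terminator, which would otherwise leave an empty last item in the list.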
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Why not just set the Sequential File stage to File Pattern and then supply the name of your filelist as the file to process? It will read the list of files and load each one, much the same as an "indirect" load in Informatica.
-craig

"You can never have too many knives" -- Logan Nine Fingers
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

So the file pattern in that case would need to be something like *.txt or *.* ?
Choose a job you love, and you will never have to work a day in your life. - Confucius
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

No, from what I understand it would be the name of the file with the list of files in it.
-craig
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

Cool. Will have to try that one... !
kandyshandy
Participant
Posts: 597
Joined: Fri Apr 29, 2005 6:19 am
Location: Singapore

Post by kandyshandy »

qt_ky wrote: So the file pattern in that case would need to be something like *.txt or *.* ?
This works for sure, as I have used it in the past.

The IBM documentation says what Craig said, but I had no luck when I tried it a few years ago.
Kandy
_________________
Try and Try again…You will succeed atlast!!
eph
Premium Member
Posts: 110
Joined: Mon Oct 18, 2010 10:25 am

Post by eph »

Hi,

You can use a file pattern as described above, or use a bash command in the file pattern expression to read your configuration file and pass the file names through.
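
To make clear what "reading the configuration file and passing the file names through" amounts to, here is a rough Python sketch of the idea, not DataStage or bash syntax. The names listed_paths and stream_records are hypothetical, and it assumes the data files sit in the same directory as the list file.

```python
from pathlib import Path
from typing import Iterator, List


def listed_paths(list_file: str) -> List[Path]:
    """Return the full path of every file named in list_file."""
    list_path = Path(list_file)
    with list_path.open() as handle:
        # One file name per line; blank lines are ignored.
        names = [line.strip() for line in handle if line.strip()]
    # Assumption: the data files live in the same directory as the list file.
    return [list_path.parent / name for name in names]


def stream_records(list_file: str) -> Iterator[str]:
    """Yield the records of each listed file in turn, one line at a time."""
    for path in listed_paths(list_file):
        with path.open() as handle:
            for line in handle:
                yield line.rstrip("\n")
```

Because stream_records is a generator, only one line is held in memory at a time, which matters for files as large as the ones described in the question.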

Eric