filepattern - reading list of files

mekrreddy
Participant
Posts: 88
Joined: Wed Oct 08, 2008 11:12 am

Post by mekrreddy »

Hi, I have a file list which contains the names of the files to be read using a Sequential File stage; they all have the same metadata. Using File Pattern, can I just supply the name of the file which contains the list of files to be read, and extract the data?

file_list.txt contains:
datafile1.dat
datafile2.dat

In other words, I have to use only the file_list.txt to read the data files. Please advise.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

No, file pattern means you'd need to supply a wildcard pattern that would match all of the files you'd want to process, not a specific list of filenames. Typical solution for a 'list of files' would be a looping Sequence job that runs a single job once per filename.
-craig

"You can never have too many knives" -- Logan Nine Fingers
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

I prefer the transformer looping to the wildcard approach, as you can then log to a control table the names of the files you have processed and the success of that processing. If you use the wildcard approach and the job aborts halfway through, you have no idea which of the files have been processed.
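A minimal sketch of that per-file loop with control-table logging, written in Python rather than as a DataStage Sequence. Everything named here is an assumption for illustration: `process_file` stands in for one run of the real job, `file_control` is a hypothetical control table, and the file names in the list are assumed to be relative to the list's own directory.

```python
import sqlite3
import tempfile
from pathlib import Path


def process_file(path: Path) -> int:
    """Hypothetical stand-in for one job run: count the records in a file."""
    with path.open() as handle:
        return sum(1 for _ in handle)


def run_file_list(list_path: Path, conn: sqlite3.Connection) -> None:
    """Process each file named in list_path and log one row per file."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_control "
        "(file_name TEXT, status TEXT, row_count INTEGER)"
    )
    for line in list_path.read_text().splitlines():
        name = line.strip()
        if not name:
            continue  # ignore blank lines in the list
        try:
            rows = process_file(list_path.parent / name)
            status = "OK"
        except OSError:
            rows, status = None, "FAILED"
        conn.execute(
            "INSERT INTO file_control VALUES (?, ?, ?)", (name, status, rows)
        )
        conn.commit()  # commit per file so earlier entries survive an abort


def demo() -> list:
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        (base / "datafile1.dat").write_text("a,1\nb,2\n")
        # datafile2.dat is deliberately missing to show a failed entry
        (base / "file_list.txt").write_text("datafile1.dat\ndatafile2.dat\n")
        conn = sqlite3.connect(":memory:")
        run_file_list(base / "file_list.txt", conn)
        return conn.execute(
            "SELECT file_name, status, row_count FROM file_control"
        ).fetchall()


if __name__ == "__main__":
    print(demo())
```

Because a row is written after every file, the control table shows which files completed and which did not, which is the information a single multi-file read cannot give you after an abort.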
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

I usually prefer to do this looping in a controlling sequence.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

Sorry - when I said Transformer looping I meant Sequence looping! Wrong looping.
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

The file pattern option allows you to also specify a file containing a list of files, or a shell command which will return a list of files.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

In Sequential file stage?
pandeeswaran
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Yup. Crap, how did I miss that? It's right there in black and white pixels in the documentation of all places:

File Pattern: Specifies a group of files to import. Specify file containing a list of files or a job parameter representing the file. The file could also contain be any valid shell expression, in Bourne shell syntax, that generates a list of file names.

To make matters worse, there's no "they musta just added that in the 8.x release" - I just found the same text in the 7.x docs as well. D'oh.
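To make the documented behaviour concrete, here is a rough Python sketch of what "a file containing a list of files" is expected to mean: the list file is dereferenced and the records of the listed files are returned, not the names. This only illustrates the semantics described in the quoted text; it is not how the stage is implemented, and `read_listed_files` plus the assumption that names are relative to the list's directory are mine.

```python
import tempfile
from pathlib import Path
from typing import Iterator


def read_listed_files(list_path: Path) -> Iterator[str]:
    """Yield the records of every file named in list_path, in list order."""
    for line in list_path.read_text().splitlines():
        name = line.strip()
        if not name:
            continue  # ignore blank lines in the list
        with (list_path.parent / name).open() as handle:
            for record in handle:
                yield record.rstrip("\n")


def demo() -> list:
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        (base / "datafile1.dat").write_text("1,alpha\n2,beta\n")
        (base / "datafile2.dat").write_text("3,gamma\n")
        (base / "file_list.txt").write_text("datafile1.dat\ndatafile2.dat\n")
        # Materialise the records while the temporary files still exist
        return list(read_listed_files(base / "file_list.txt"))


if __name__ == "__main__":
    print(demo())
```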
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Worth logging in today. Learned something.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

Careful, you'll make me feel like I accomplished something...
- james wiles


All generalizations are false, including this one - Mark Twain.
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

I knew I had read about it somewhere, but when I looked at the Sequential File stage I was expecting a "Filename file" kind of option, could find nothing, and so gave up. Certainly didn't know it was available in 7.x... thought it was something new in 8.5!
Satwika
Participant
Posts: 45
Joined: Mon Jan 02, 2012 11:29 pm

Post by Satwika »

Hi,

Can you please elaborate on this? I tried to read from a file which contains the list of file names, but it's not considering the columns in the listed files.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Please explain for us what "considering the columns" means. If by that you mean that it doesn't respect the "First record is column names" setting for any but the first file, that's a known limitation when reading multiple files using any of the available mechanisms. Unless they've managed to fix that rather silly issue and you mean something else entirely...
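To show what that limitation looks like, here is a small Python sketch (not DataStage code). It contrasts a read that drops the header only from the first file with one that drops it from every file. The second is one possible pre-processing idea under my own assumptions, not something stated in this thread; `read_skipping_headers` is a hypothetical helper.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List, Tuple


def read_skipping_headers(paths: Iterable[Path]) -> List[str]:
    """Concatenate records from several files, dropping line one of each."""
    records: List[str] = []
    for path in paths:
        lines = path.read_text().splitlines()
        records.extend(lines[1:])  # lines[0] is assumed to be the column names
    return records


def demo() -> Tuple[List[str], List[str]]:
    with tempfile.TemporaryDirectory() as tmp:
        first = Path(tmp) / "datafile1.dat"
        second = Path(tmp) / "datafile2.dat"
        first.write_text("id,name\n1,alpha\n")
        second.write_text("id,name\n2,beta\n")
        # Header dropped for the first file only: the second header leaks in
        first_only = (
            first.read_text().splitlines()[1:] + second.read_text().splitlines()
        )
        return first_only, read_skipping_headers([first, second])


if __name__ == "__main__":
    print(demo())
```

In the first result the second file's `id,name` line arrives as if it were a data record, which is the symptom to look for in rejected or oddly valued rows.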
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Does the metadata in the link properties mention these columns that it's "not considering"?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Satwika
Participant
Posts: 45
Joined: Mon Jan 02, 2012 11:29 pm

Post by Satwika »

[quote="chulett"]Yup. Crap, how did I miss that? It's right there in black and white pixels in the documentation of all places:

File Pattern: Specifies a group of files to import. Specify file containing a list of files or a job parameter representing the file. The file could also contain be any valid shell expression, in Bourne shell syntax, that generates a list of file names.[/quote]

Hi chulett / ray,

Good morning:

As per chulett, I have created a file which contains the list of file names, and those files all have the same metadata. I created a job which has:

Sequential File (File Pattern, set to the name of the file which holds the file names) --> Transformer --> Sequential File.

Now I'm trying to load the data from the listed files. The job is not performing as expected (meaning it's not loading the data properly). It just takes the file names as values and outputs the same file names.
Now I think you can understand the issue. Please clarify this for me. Thank you.