How many files can File Pattern handle?

Posted: Mon Oct 22, 2007 5:35 pm
by Pavan_Yelugula
Hi All,
We are using the File Pattern property to read a set of files in a directory. The job hangs when more than 500 files match the pattern. It doesn't log any messages and just sits there forever.
With fewer than 50 files in the directory the job seems stable, but if we increase the count beyond that, say to 70 or 80, it becomes unstable: it runs successfully a couple of times and then aborts after a certain time.

I was wondering if there is a limit on the number of files the file pattern can handle.

Any help will be greatly appreciated

Thanks
Pavan

Posted: Mon Oct 22, 2007 6:50 pm
by ray.wurlod
Only the same limit that the operating system imposes on the results of wildcard expansion. There is no limit built into DataStage.

May I suggest that you are running out of some other kind of resource, trying to process so many files in parallel? Perhaps a different approach is indicated?

Posted: Mon Oct 22, 2007 6:57 pm
by ArndW
Each operating system has some limit on the number of objects returned from a wildcard operation such as "ls *.txt". Sometimes the limit is hard, and other times it can be modified with user settings. Can you run an ls with a wildcard on that directory without an error message? (A plain ls won't ever error; you need to use a wildcard.) The error will be something like "...parameter list too long...".
If you get the error from your shell, then most likely this is what is causing your problems in the PX job. You will need to use your docs or Google to find out what you can do on your OS.
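
For anyone who would rather run that check from a script than from the shell, here is a minimal Python sketch. It counts the files matching a wildcard and compares a rough estimate of the argument-list size against the system's ARG_MAX value. The function names and the example path are made up for illustration, and the byte count is only a lower bound, since the real limit also covers the environment and per-argument overhead.

```python
import glob
import os


def estimate_arg_bytes(paths):
    """Rough size of the argument list a shell would build for these paths.

    Each argument is counted as its encoded length plus one byte for the
    terminating NUL. Real limits also count the environment and pointer
    overhead, so treat the result as a lower bound.
    """
    return sum(len(os.fsencode(p)) + 1 for p in paths)


def check_pattern(pattern):
    """Return (match_count, estimated_bytes, arg_max) for a wildcard pattern."""
    matches = sorted(glob.glob(pattern))
    try:
        arg_max = os.sysconf("SC_ARG_MAX")  # available on POSIX systems only
    except (AttributeError, ValueError, OSError):
        arg_max = None  # limit cannot be queried on this platform
    return len(matches), estimate_arg_bytes(matches), arg_max


if __name__ == "__main__":
    # "/data/incoming/*.txt" is a hypothetical pattern; substitute your own.
    count, used, limit = check_pattern("/data/incoming/*.txt")
    print(f"{count} files, about {used} bytes of arguments, ARG_MAX = {limit}")
```

If the estimated byte count comes anywhere near the reported ARG_MAX, the shell-level wildcard limit is a plausible suspect; if it is far below, the cause is more likely elsewhere.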

Posted: Wed Oct 24, 2007 5:31 pm
by Pavan_Yelugula
Arnd and Ray,
Thanks a lot for the replies. We actually figured out the problem: there is a small parallel routine in the job that is called from a Transformer as a before-stage routine. The moment we remove this routine, the file pattern works like a charm and takes any number of files.

We really don't have a clue why the C routine makes the job hang with 40 files; we are trying to figure that out.

By the way, can we call a parallel routine from a server routine?

Thanks
Pavan

Posted: Wed Oct 24, 2007 6:35 pm
by ArndW
Parallel routines are C++ based; server routines are DS/BASIC based. You can, with a lot of work, make external routines callable directly from DS/BASIC (that includes calling them directly in server jobs), but it involves reloading the DS core and is not undertaken lightly. So even though technically the answer to your question is "Yes", the actual answer in 99% of all cases is "No".

Posted: Thu Oct 25, 2007 8:06 pm
by vijayrc
Pavan,
We have been using 100+ files in a file pattern here and we don't have any issues :)