Reading a number of files

Posted: Fri Jun 21, 2013 9:52 am
by kennyapril
Hello,

I have a job that reads data with a Sequential File stage, applies some transformations, and loads the result to a new file.

I have around 2000 files that need to go through the same process. They all have the same layout and are dated according to their arrival.

Is there any way to automate this in a parallel job? Please suggest.

Thank you

Posted: Fri Jun 21, 2013 10:12 am
by priyadarshikunal
Do you have to read them sequentially, i.e. should the file with the earliest date be processed first, or does the order not matter?

Posted: Fri Jun 21, 2013 10:51 am
by chulett
The stage supports a wildcard pattern so you could do them all at once. If they need to be loaded individually (one at a time) build a looping Sequence job.

Posted: Fri Jun 21, 2013 11:07 am
by kennyapril
@Priyadarshikunal
Order does not matter

@Chulett
OK, I will use a wildcard like filename*.txt so that it pulls in all the files.
The output does not need to be loaded one file at a time; the new files can all be created at once.
I will use looping if they do need to be loaded one at a time in sequence. In that case, can I use the same parallel job in the Sequence job and execute it once per file?

Thank you
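
Before pointing the stage at a pattern such as filename*.txt, it can help to check which names the pattern would actually select. The snippet below is only a rough sketch in Python: it uses shell-style matching from the standard library, which is assumed (not confirmed) to behave like the stage's file pattern, and the sample file names are made up for illustration.

```python
import fnmatch


def preview_matches(names, pattern):
    """Return the names selected by a shell-style wildcard pattern, sorted."""
    # fnmatchcase is case-sensitive on every platform, so results are predictable.
    return sorted(n for n in names if fnmatch.fnmatchcase(n, pattern))


# Hypothetical file names, not taken from the original post.
sample = [
    "filename_20130619.txt",
    "filename_20130620.txt",
    "other_20130620.txt",
    "filename_20130621.csv",
]

matched = preview_matches(sample, "filename*.txt")
print(matched)
```

Anything that shares the prefix but should not be loaded (backups, partial transfers) would also match, so a tighter pattern may be worth considering.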

Posted: Fri Jun 21, 2013 11:18 am
by chulett
Yes, for a loop you can use 'the same job' as long as it doesn't use a wildcard but instead passes the filename in as a job parameter each time.
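
The one-file-per-run idea can also be sketched outside a Sequence job. The Python below is not the looping Sequence described above but a script-based stand-in for it: it builds one dsjob command line per matching file and passes the path in as a job parameter. The project name MyProject, the job name LoadOneFile and the parameter name InputFile are all hypothetical, and the job is assumed to read its file name from that parameter instead of a wildcard.

```python
import glob
import os
import tempfile


def build_run_commands(input_dir, pattern, project, job, param_name="InputFile"):
    """Build one dsjob command per matching file; nothing is executed here."""
    commands = []
    for path in sorted(glob.glob(os.path.join(input_dir, pattern))):
        commands.append([
            "dsjob", "-run",
            "-param", f"{param_name}={path}",  # file name passed as a job parameter
            "-jobstatus",                      # wait for the job and report its status
            project, job,
        ])
    return commands


# Demo with two empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("filename_20130620.txt", "filename_20130621.txt"):
        open(os.path.join(tmp, name), "w").close()
    commands = build_run_commands(tmp, "filename*.txt", "MyProject", "LoadOneFile")

for cmd in commands:
    print(" ".join(cmd))
```

To actually run the jobs, each list could be handed to subprocess.run one after another. With -jobstatus the exit code reflects the job's finishing status rather than the usual zero-for-success convention, so check it against the dsjob documentation for your version before treating a non-zero value as a failure.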

Posted: Fri Jun 21, 2013 3:09 pm
by vinnz
Are you expecting your input files to come in ad hoc while your job is executing? If not, you could concatenate your files with operating system commands before processing, using a command stage in the sequence.
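
As a rough illustration of the concatenation step, here is a Python sketch that does the job of the operating system commands mentioned above (it is a swap-in for something like cat, not what a command stage would literally run). It assumes the files have no header rows, that each one ends with a newline, and that the output name combined.dat (hypothetical) does not itself match the input pattern.

```python
import glob
import os
import shutil
import tempfile


def concatenate_files(input_dir, pattern, output_path):
    """Append every matching file, in name order, to output_path."""
    # Collect the inputs before the output file is created.
    paths = sorted(glob.glob(os.path.join(input_dir, pattern)))
    with open(output_path, "wb") as out:
        for path in paths:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)  # streamed copy, no full read into memory
    return paths


# Demo with two small made-up files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name, body in (("filename_a.txt", "1,foo\n"), ("filename_b.txt", "2,bar\n")):
        with open(os.path.join(tmp, name), "w") as f:
            f.write(body)
    combined = os.path.join(tmp, "combined.dat")
    merged = concatenate_files(tmp, "filename*.txt", combined)
    with open(combined) as f:
        result = f.read()

print(result)
```

The job would then read the single combined file, which only works if no new files arrive while the concatenation is running.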