Processing multiple files

Posted: Thu Jan 25, 2007 10:12 am
by arun_im4u
Hello,

What would be the best approach to process multiple files matching the same pattern in a folder, one after the other, so that the run-time logs for each file can be captured separately?

I tried using the Execute Command activity to capture the head file and send it as a parameter to the Job activity, but it did not work. The other option would be to write scripts outside DS.

Any suggestions would be helpful.

Thanks.

Posted: Thu Jan 25, 2007 10:20 am
by DSguru2B
Did you try looking at Read Method "File Pattern" in the sequential file stage properties?

Posted: Thu Jan 25, 2007 10:31 am
by velagapudi_k
Select 'File Pattern' for the read method in the Sequential File stage properties. That functionality is pretty good. Logs will be separate for each file. It allows wildcard characters, e.g. *filename*.

Posted: Thu Jan 25, 2007 11:07 am
by arun_im4u
Yeah. I did look at the File Pattern option, but it concatenates all the files matching the pattern into one and tries to process that. I would like to process one file after the other.

Posted: Thu Jan 25, 2007 11:40 am
by DSguru2B
Use a StartLoop and an EndLoop activity in a sequence job, and pass the file names as a list to the StartLoop. Specify #StartLoopName.$Counter# as the derivation for the job parameter. The loop steps through the list and passes one file name to your job per iteration, so your job processes each file individually.
If the file names in the folder are dynamic, you can have a BASIC routine that uses DSExecute() to get all the files present in that folder and pass them as a comma-delimited list to your StartLoop activity.
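
As a rough illustration of what the loop does with a delimited list, here is a shell sketch (outside DataStage) that splits a comma-delimited list and handles one file name per pass, writing a separate log for each. The function name process_each, the log naming, and the echo that stands in for the job run are all assumptions made for illustration; they are not DataStage features.

```shell
#!/bin/sh
# Hypothetical sketch: walk a comma-delimited list one file name at a time,
# similar in spirit to a StartLoop/EndLoop pair handing one value per iteration to a job.
# The body runs in a subshell ( ... ) so the IFS and globbing changes do not leak out.
process_each() (
    list=$1      # e.g. "a.csv,b.csv,c.csv"
    log_dir=$2   # directory that receives one log per file (assumed layout)

    IFS=,        # split the list on commas only
    set -f       # do not expand wildcards while splitting
    for name in $list; do
        # Stand-in for running the job with this file as its parameter.
        echo "processing $name" > "$log_dir/$name.log"
    done
)
```

Calling process_each "a.csv,b.csv" /tmp/logs would leave one log per file in /tmp/logs, which is the per-file separation the original question asks for.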

Posted: Thu Jan 25, 2007 11:40 am
by umamahes
Make a file list containing the set of files you want to process. In the job sequence, use a StartLoop activity and an EndLoop activity to process all the files in the file list. To do this, first count the number of files in the file list and set this value as the upper limit of the StartLoop activity, then write a routine to get the file name from the file list.
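
The two pieces this approach needs (a count for the loop's upper limit, and a lookup of the Nth name) could be sketched in shell roughly as follows. The helper names count_files and nth_file are hypothetical, and the sketch assumes the file list holds one file name per line.

```shell
#!/bin/sh
# Hypothetical helpers for the file-list approach (one file name per line assumed).

# count_files LISTFILE: number of non-empty lines, usable as the loop's upper limit.
count_files() {
    grep -c . "$1"
}

# nth_file LISTFILE N: the Nth non-empty line, i.e. the file name for loop counter N.
nth_file() {
    grep . "$1" | sed -n "${2}p"
}
```

With a counter that runs from 1 to the value from count_files, each iteration would fetch its own name with nth_file.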

Posted: Thu Jan 25, 2007 3:50 pm
by ray.wurlod
Does not the items list in a StartLoop activity support limited regular expressions?

Posted: Thu Jan 25, 2007 6:31 pm
by velagapudi_k
No, Ray. I had a similar problem where I had to process one file after the other, so I wrote a routine which executes an operating system command and returns the files in a comma-delimited list. I pass this as the input to the loop activity and it works fine. So far I have had a maximum of 15 files, and my sequence iterated through all 15, so I have had no problem.

Posted: Tue Jan 30, 2007 4:40 pm
by arun_im4u
I wrote a routine to build a comma-delimited list of the files and pass it as a parameter to the StartLoop activity. It worked fine, but if there are many files the routine's output contains a new line and the job fails.

Code:

* pFilePath (Arg1) is the directory where the files exist; pFilePattern (Arg2) is the pattern to look for

InputArg = 'cd ' : pFilePath : ' ; ls -m ' : trim(pFilePattern)

Call DSExecute("UNIX", InputArg, Output, SystemReturnCode)

If SystemReturnCode<>0 Then
   Call DSLogFatal('GetFilesList routine failed to execute command ' : InputArg : ' with return code ' : SystemReturnCode : ' and with msg ' : Output, 'GetFilesList')
End
Else
   out1 = convert(" ", "",Output)
   out2 = Left(out1,len(out1)-1)
End

Call DSLogInfo("Command Output is " : out2,"GetFilesList")

Ans=out2

Any help would be great.
Thanks.

Posted: Tue Jan 30, 2007 6:26 pm
by ray.wurlod
Your difficulty is with your operating system's output line length limit. You may be able to modify that. You could certainly adapt your Convert() function to remove the line terminators as well (on UNIX - on Windows you'd need Ereplace()).

Or you could use ls -1 to get a single COLUMN of output, and convert the field mark characters in that to commas. Something like:

Code:

InputArg = 'ls -1 ' : pFilePath : "/" : trim(pFilePattern)

Call DSExecute("UNIX", InputArg, Output, SystemReturnCode)

If SystemReturnCode<>0 Then
   Call DSLogFatal('GetFilesList routine failed to execute command ' : InputArg : ' with return code ' : SystemReturnCode : ' and with msg ' : Output, 'GetFilesList')
End
Else
   * Build list of non-empty lines
   out1 = Output
   out2 = ""
   Loop
      Remove Element From out1 Setting MoreElements
      * <-1> appends Element as a new field; <1> would overwrite the first field each time
      If Len(Element) Then out2<-1> = Element
   While MoreElements
   Repeat
End

Ans=Convert(@FM, ",", out2 )
Call DSLogInfo("Command Output is (sort of) " : Ans, "GetFilesList")
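
For comparison, the same idea (one name per line, drop empty lines, join with commas) can be sketched at the shell level. This is only an illustration under assumptions: the function names join_nonempty and list_files are hypothetical, and it assumes file names contain no commas or newlines.

```shell
#!/bin/sh
# Hypothetical shell-level sketch of the same technique.

# join_nonempty: read lines on stdin, drop empty ones, join the rest with commas.
join_nonempty() {
    grep . | paste -sd, -
}

# list_files DIR PATTERN: comma-delimited list of names in DIR matching PATTERN.
list_files() {
    dir=$1
    pattern=$2
    # $pattern is deliberately unquoted so the shell expands the wildcard after the cd.
    (cd "$dir" && ls -1 $pattern) | join_nonempty
}
```

Because every name arrives on its own line before the join, the result does not depend on how wide ls -m decides to make its comma-separated lines.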

Posted: Wed Jan 31, 2007 4:41 pm
by arun_im4u
Thanks, Ray. It worked fine. I modified it a little to serve my purpose. I didn't know what "Element" in the code means.

Code:

InputArg = "cd " : pFilePath : "; ls -1 " : trim(pFilePattern)
Call DSExecute("UNIX", InputArg, Output, SystemReturnCode)
If SystemReturnCode<>0 Then
   Call DSLogFatal('GetFilesList routine failed to execute command ' : InputArg : ' with return code ' : SystemReturnCode : ' and with msg ' : Output, 'GetFilesList')
End
Else
   Print Output
End

* Turn the field marks into commas, then drop the trailing comma
Out1=Convert(@FM, ",", Output )
Ans =Left(Out1,len(Out1)-1)
Call DSLogInfo("Command Output is " : Ans, "GetFilesList")

I also wrote one that worked, but your suggestion is better.

Code:

InputArg = 'cd ' : pFilePath : ' ; ls -m ' : trim(pFilePattern)

Call DSExecute("UNIX", InputArg, Output, SystemReturnCode)

If SystemReturnCode<>0 Then
   Call DSLogFatal('GetFilesList routine failed to execute command ' : InputArg : ' with return code ' : SystemReturnCode : ' and with msg ' : Output, 'GetFilesList')
End
Else

   *out1 = convert(char(10), ",", Output)
   *out2 = Left(out1,len(out1)-1)

   * MCP turns non-printable characters (the line breaks) into periods
   Out1 = OConv(Output, "MCP")
   * Remove the period left where a line break followed ".csv,"
   Out2 = EReplace(Out1, ".csv,.", ".csv,")
   Out3 = Left(Out2,len(Out2)-1)
   Out4 = Convert(" ","",Out3)

   Ans=Out4

End
Call DSLogInfo("Command Output is " : Output,"GetFilesList")

Thanks.

Posted: Wed Jan 31, 2007 4:47 pm
by ray.wurlod
"Element" is an element in a dynamic array. The output is returned to DataStage as a dynamic array (a field-mark-delimited string). The loop served to remove any empty lines (at the beginning and end, typically, from an ls output). Your solution removes only the one at the end.