Filtering folder content according to file size, date, etc.?

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

alraaayeq
Participant
Posts: 35
Joined: Sun Apr 04, 2004 5:57 am
Location: Riyadh,Saudi Arabia

Post by alraaayeq »

Hi all,

I am trying to find a simple way to filter a list of files from one directory. The usual filtering criteria are the file's last modification date, file size, etc.

Can anyone please help me out?

PS: I would prefer not to use C++/CLI and similar stuff.
Amos.Rosmarin
Premium Member
Posts: 385
Joined: Tue Oct 07, 2003 4:55 am

Post by Amos.Rosmarin »

Enjoy this routine:

Code: Select all

Function getFileList(SearchDir,SearchPattern,SearchCriteria)

Ans = ''

* Put here the command that will bring you what you want, for example:
Begin Case
   Case SearchCriteria = 'L'   ;* get the newest file
      ListCmd = 'ls -rt'
      ListFilter = ' | tail -1'
   Case @TRUE
      ListCmd = 'ls'
      ListFilter = ''
End Case

If SearchDir = '' Then
   SearchFiles = ' ':SearchPattern
End Else
   SearchFiles = Convert(@FM,'',Splice(Reuse(' ':SearchDir),"/",Convert(',',@FM,SearchPattern)))
End

* The file arguments must follow ls itself, ahead of any pipe.
Call DSExecute("UNIX",ListCmd:SearchFiles:ListFilter,Ans,OsStatus)

If OsStatus<>0 Then
   Call DSLogWarn ( Ans, 'GetDirFileList')
End

Return(Ans)
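
A minimal sketch of calling the routine from job control code, assuming it has been saved and compiled as a server routine named getFileList (user-written routines are catalogued with a DSU. prefix). The directory and pattern are hypothetical.

```basic
* Declare the routine so it can be called as a function from job control.
Deffun getFileList(SearchDir, SearchPattern, SearchCriteria) Calling "DSU.getFileList"

* Hypothetical directory and pattern, for illustration only.
NewestFile = getFileList("/data/incoming", "*.dat", "L")
Call DSLogInfo("Newest file: " : NewestFile, "JobControl")
```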

alraaayeq
Participant
Posts: 35
Joined: Sun Apr 04, 2004 5:57 am
Location: Riyadh,Saudi Arabia

Post by alraaayeq »

Amos.Rosmarin wrote:Enjoy this routine:

Code: Select all

Function getFileList(SearchDir,SearchPattern,SearchCriteria)
Ans = 
.....................


Return(Ans)

Thank you Amos.

I am not familiar with BASIC; what I did was take some lines from your routine and put them in a new job under "Job Properties>Job Control".

"Ans" returns a list of files, which is good. The questions now are:

1- I get the whole list of files at once, but I need to pick one file name at a time. Should I use string or array manipulation, or do you suggest something else?

2- Can I forward "Ans" to a transformer, either as a variable or in some other way?


Many thanks
alraaayeq
Participant
Posts: 35
Joined: Sun Apr 04, 2004 5:57 am
Location: Riyadh,Saudi Arabia

Post by alraaayeq »

Again, I ran into more trouble and got stuck.

The reason is that I went back to using UNIX-specific commands, which I hate.

I am used to C++, where there are objects and I simply write <objectname>.<value> and that is it!

Again, can we have similar capabilities <SOMEHOW> rather than using CLI/C++/Shell?
ogmios
Participant
Posts: 659
Joined: Tue Mar 11, 2003 3:40 pm

Post by ogmios »

alraaayeq wrote:Again, can we have similar capabilities <SOMEHOW> rather than using CLI/C++/Shell?
Simply put NO.

DataStage is similar to other third-party tools. The stuff which is "in" the tool is easy; the stuff which is not core functionality is "harder" to implement.

Can I suggest some UNIX training? You have a DataStage server running on UNIX, so some of your people will need to be pretty good with UNIX anyway. Also, for the harder stuff in DataStage you will always need BASIC, ...

Ogmios

P.S. For your previous question: you can put the list in an array. And it would be hard to give them to a transformer.
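
One way to pick the names one at a time, as a sketch: DSExecute returns its output with each line separated by a field mark (@FM), so Ans can be read as a dynamic array. The log tag is arbitrary.

```basic
* Ans is assumed to hold the DSExecute output, one file name per field.
FileCount = DCount(Ans, @FM)
For i = 1 To FileCount
   FileName = Trim(Ans<i>)
   * Skip the empty entry left by a trailing field mark.
   If FileName <> "" Then
      Call DSLogInfo("Processing file: " : FileName, "JobControl")
   End
Next i
```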
In theory there's no difference between theory and practice. In practice there is.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

alraaayeq wrote:I get the whole list of files at once, but I need to pick one file name at a time.
Have you looked into the Folder stage? It was built to help with the processing of XML data from what I've read, but it does allow one to pull a list of filenames from a folder based on user-supplied criteria. It would then present those filenames (and optionally the data in the file) one at a time to your job.

Because of its heritage and the fact that it technically wants to send you the contents of the file as well, it can't be used for 'large' files - even if all you want is the filename part. :? Now, what 'large' means seems to vary from system to system, seemingly anywhere from 50MB to 300MB, but you may want to check it out.

And just to add to what Ogmios said, don't be leery of taking advantage of your operating system via whatever means necessary - shell scripts, DSExecute, whatever. You'll find it comes in very handy for all kinds of things, especially in the before/after job arena. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
alraaayeq
Participant
Posts: 35
Joined: Sun Apr 04, 2004 5:57 am
Location: Riyadh,Saudi Arabia

Post by alraaayeq »

chulett wrote:And just to add to what Ogmios said, don't be leery of taking advantage of your operating system via whatever means necessary - shell scripts, DSExecute, whatever. You'll find it comes in very handy for all kinds of things, especially in the before/after job arena. :wink:

Let me tell you the story behind why I am not willing to use CLI/C++/Shell. I built an application for a "lovely" company and found that their Datacenter group has very restrictive rules on "home-built" applications; they believe DS can and should do such things. Moreover, they do not "easily" take over or accept such an application, because they want to review every bit and byte. I spent a lot of my time in a dark tunnel, and not by choice. :x

I am already using the C++ API, but they are not happy with it and usually refer to the application as "the spaghetti application", since I used DS, C++ and shell scripts as well. :?

Let's go back to the technical details. Here is the latest thing I have:
I am going to use "Ans" as an array and loop over it; for each file name I will open the file and get its STATUS. I found that STATUS gives many useful details about any file.

Code: Select all

OpenSeq PathName To FileVar Then
   Status stat From FileVar Then Call DSLogInfo("File status : ":stat, "hmm")
   CloseSeq FileVar
End Else
   Call DSLogWarn("Unable to open ":PathName, "hmm")
End
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You're correct in asserting that the STATUS statement gives you lots of information about a currently open file. To process all the files in the directory, you need to open each file separately to apply this statement. (This is not really a problem; under the covers any utility you use would have to do this.)
Use a Select List to gather all the file names in the directory. For example:

Code: Select all

$INCLUDE UNIVERSE.INCLUDE FILEINFO.H

* Obtain directory pathname from this job's parameters.
DirPath = DSGetParamInfo(DSJ.ME, "DirPath", DSJ.PARAMVALUE)

* Open the directory as if it were a table
OpenPath DirPath To Dir.fvar
On Error

   Call DSLogWarn("Error opening " : Quote(DirPath) : ", status " : Status(), "Job Control")

End
Then

   * Form Select List of entries in directory
   ClearSelect 9
   Select Dir.fvar To 9

   * Process each file name in turn.  Open the file, apply status statement.
   Loop
   While ReadNext FileName From 9

      FilePath = DirPath : "/" : FileName 

      OpenSeq FilePath To File.fvar
      On Error

         Call DSLogWarn("Error opening " : Quote(FilePath) : ", status " : Status(), "JobControl")

      End
      Locked

         Call DSLogWarn(Quote(FilePath) : " locked by another process.", "JobControl")

      End
      Then

         Status FileStuff From File.fvar
         Then

            FileMode = FileStuff<5>  ; * permissions
            FileSize = FileStuff<6>
            Inode = FileStuff<10>
            ModTime = FileStuff<15>
            ModDate = FileStuff<16>
            Owner = FileStuff<30>
            * Use the above values how you will.  Use others if needed.
            * For example, if your conditions are met, DSAttachJob, 
            * DSSetParam, DSRunJob, DSWaitForJob, DSGetJobInfo, 
            * DSDetachJob.

         End
         Else

            Call DSLogWarn("Cannot obtain information from " : Quote(FilePath), "Job Control")

         End

      End
      Else

         Call DSLogWarn("Unable to open " : Quote(FilePath), "Job Control")

      End

      * This is important; it releases lock set by OpenSeq.
      If FileInfo(File.fvar, FINFO$IS.FILEVAR) Then CloseSeq File.fvar

   Repeat

End 
Else

   Call DSLogWarn("Unable to open " : Quote(DirPath), "Job Control")

End

If FileInfo(Dir.fvar, FINFO$IS.FILEVAR) Then Close Dir.fvar
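
The job-control calls named in the comments inside the loop could be strung together roughly as follows for each file that meets the conditions. This is only a sketch: the job name ProcessOneFile and its parameter SourceFile are hypothetical, and most error checking is left out.

```basic
* Hypothetical job and parameter names; FilePath is set inside the loop.
hJob = DSAttachJob("ProcessOneFile", DSJ.ERRFATAL)
ErrCode = DSSetParam(hJob, "SourceFile", FilePath)
ErrCode = DSRunJob(hJob, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob)
JobStatus = DSGetJobInfo(hJob, DSJ.JOBSTATUS)
If JobStatus = DSJS.RUNFAILED Then
   Call DSLogWarn("Job failed for " : Quote(FilePath), "Job Control")
End
ErrCode = DSDetachJob(hJob)
```

Passing the file name as a job parameter like this is also one route for getting a value from job control into a job that contains a Transformer.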
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
alraaayeq
Participant
Posts: 35
Joined: Sun Apr 04, 2004 5:57 am
Location: Riyadh,Saudi Arabia

Post by alraaayeq »

Thanks ray.wurlod for your code.

I have added DSLogInfo calls in order to view the file information:

Code: Select all

                  FileMode = FileStuff<5>          ; * permissions
                  FileSize = FileStuff<6>
                  Inode = FileStuff<10>
                  AccessTime = FileStuff<13>
                  AccessDate = FileStuff<14>
                  ModTime = FileStuff<15>
                  ModDate = FileStuff<16>
                  Owner = FileStuff<30>
*Start of MYCODE here
                  Call DSLogInfo("FileName: ":FileName, "MYCODE")
                  Call DSLogInfo("AccessTime : ":AccessTime , "MYCODE")
                  Call DSLogInfo("AccessDate : ":AccessDate , "MYCODE")
                  Call DSLogInfo("Owner: ":Owner , "MYCODE")
                  Call DSLogInfo("ModDate : ":ModDate , "MYCODE")
                  Call DSLogInfo("ModTime : ":ModTime , "MYCODE")
*End of MYCODE here


I got results similar to the following:

AccessTime : 32730
AccessDate : 13517
ModDate : 13517
...etc


Now, looking at the above values, they do not make any sense to me! I do not know how to read, calculate with, or compare them.

I think knowing the format would help a lot.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

alraaayeq wrote:I got results similar to the following:

AccessTime : 32730
AccessDate : 13517
ModDate : 13517
...etc


Now, looking at the above values, they do not make any sense to me! I do not know how to read, calculate with, or compare them.
If you read the documentation on the STATUS statement (which is in the Basic.pdf file on your client PC, btw... I don't believe the online help goes into this detail) you'll see that these values are in 'Internal Format'. So, depending on exactly what you need to do, you can use them directly or convert them to an 'external' or output format that works better for you.

For example, dates and times can be compared directly in internal format. To see what they equate to, use the Oconv function with the appropriate conversion code - one of the 'D' codes for dates or one of the 'MT' codes for the times. Several examples for both are in the online help.
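
As a sketch of comparing directly in internal format: an internal date is a day count and an internal time is a number of seconds since midnight, so both compare as plain numbers. The seven-day window and the size threshold below are made-up criteria, and ModDate, FileSize and FileName are assumed to have been set from the STATUS output as in the earlier code.

```basic
* Made-up criteria: modified within the last 7 days and at least 1024 bytes.
CutoffDate = Date() - 7   ;* Date() returns today's date in internal format
MinSize = 1024

If ModDate >= CutoffDate And FileSize >= MinSize Then
   Call DSLogInfo(FileName : " matches the filter", "MYCODE")
End
```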
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Here's one example, following the ISO 8601 standard for presenting date and time.

Code: Select all

*Start of MYCODE here
                  Call DSLogInfo("FileName: ":FileName, "MYCODE") 
                  Call DSLogInfo("AccessTime : ":Oconv(AccessTime,"MTS:") , "MYCODE") 
                  Call DSLogInfo("AccessDate : ":Oconv(AccessDate,"D-YMD[4,2,2]") , "MYCODE") 
                  Call DSLogInfo("OWner: ":Owner , "MYCODE") 
                  Call DSLogInfo("ModDate : ":Oconv(ModDate, "D-YMD[4,2,2]") , "MYCODE") 
                  Call DSLogInfo("ModTime : ":Oconv(ModTime, "MTS:") , "MYCODE") 
*End of MYCODE here
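
Applied to the sample values reported earlier in the thread, the same conversion codes should give the results noted in the comments below. Internal dates count days from 31 December 1967 (day 0), so 13517 falls on 2 January 2005; 32730 seconds after midnight is 9 hours, 5 minutes and 30 seconds.

```basic
* Literal values taken from the log output quoted earlier in the thread.
Call DSLogInfo(Oconv(13517, "D-YMD[4,2,2]"), "MYCODE")   ;* 2005-01-02
Call DSLogInfo(Oconv(32730, "MTS:"), "MYCODE")           ;* 09:05:30
```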
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.