Extracting files from unix

times29
Premium Member
Posts: 202
Joined: Mon Sep 22, 2008 3:47 pm

Post by times29 »

Hi,
I have a few files on a Unix box and want to pick up the latest one based on the latest date. What is the best approach to do it?

I want to extract the latest file, which will be (111AAA_201102251428300731_test_code.csv).

Sample of files on the Unix box:

111AAA_201102251428300731_test_code.csv
1222AAA_201101041428300731_test_code.csv
1444AAA_2011010314283007311_test_code.csv

Thanks
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

I'd usually do it from a job sequence. Do an execute command prior to job execution using "ls" with the appropriate wild card and options to return the correct filename first in its list.

Then use the command output status variable for that execute command to pass the correct filename to the job in a parameter.
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
times29
Premium Member
Posts: 202
Joined: Mon Sep 22, 2008 3:47 pm

Post by times29 »

Hi,
Can you draw a flow diagram, if possible?

Thanks
karthi_gana
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Post by karthi_gana »

Use an Aggregator stage and the max() function.
Karthik
zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

asorrell wrote:I'd usually do it from a job sequence. Do an execute command prior to job execution using "ls" with the appropriate wild card and options to return the correct filename first in its list.

Then use the command output status variable for that execute command to pass the correct filename to the job in a parameter.
The same can be done using a before-job subroutine if a sequence job is not preferred.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

No, it cannot if the goal is to pass the output of the routine into the job as a job parameter value.
-craig

"You can never have too many knives" -- Logan Nine Fingers
times29
Premium Member
Posts: 202
Joined: Mon Sep 22, 2008 3:47 pm

Post by times29 »

That is right, we need to pass the routine's output as a job parameter value. I can run the script below to see that file on the AIX box, but how do I pass the script below to a file?
ls -1 *.csv | sort -n -t"_" -k 2.1,2.6 | tail -1
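
A note on the sort key in the command above: with `-t"_"`, the key `-k 2.1,2.6` compares only the first six characters of the second field, which is the year and month of the timestamp, so two files from the same month tie on that key. A minimal sketch that compares the first 14 characters instead, assuming the second field always starts with a `YYYYMMDDHHMMSS` timestamp and that file names contain no newlines (the function name `latest_csv` is made up for illustration):

```shell
# Print the name of the .csv file in directory $1 (default: current
# directory) whose timestamp field sorts highest.
latest_csv() {
    (
        cd "${1:-.}" || exit 1
        # -t_ splits on underscores; -k2.1,2.14 compares characters 1-14
        # of field 2 as text, which orders fixed-width digit strings
        # chronologically. tail -1 keeps the last (newest) name.
        ls -1 *.csv 2>/dev/null | LC_ALL=C sort -t_ -k2.1,2.14 | tail -1
    )
}
```

Called as `latest_csv /some/dir` on a directory holding the three sample files from the first post, this prints 111AAA_201102251428300731_test_code.csv. It prints nothing when no .csv file is present.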
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Andy has already explained how to do that.
-craig

zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

zulfi123786 wrote: The same can be done using a before-job subroutine if a sequence job is not preferred.
Not exactly the same thing, but you could have a small script that copies or moves the latest file to a fixed name, say xxx.dat, and then use that same xxx.dat in the Sequential File stage.

Run the script using a before-job subroutine.
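
The copy-to-a-fixed-name idea could look something like the sketch below. It is only an illustration: the function name `stage_latest_csv` and both arguments are placeholders, and it assumes the second underscore-separated field of each file name starts with a `YYYYMMDDHHMMSS` timestamp, as in the sample names earlier in the thread.

```shell
# Copy the newest .csv in directory $1 to the fixed target path $2.
# Returns non-zero (and copies nothing) when no .csv file is found.
stage_latest_csv() {
    src_dir=$1
    target=$2
    # Sort names on the first 14 characters of the timestamp field and
    # keep the last one, i.e. the most recent file.
    latest=$(cd "$src_dir" && ls -1 *.csv 2>/dev/null |
        LC_ALL=C sort -t_ -k2.1,2.14 | tail -1)
    if [ -z "$latest" ]; then
        echo "no .csv files found in $src_dir" >&2
        return 1
    fi
    cp "$src_dir/$latest" "$target"
}
```

A call such as `stage_latest_csv /aa/test /aa/test/xxx.dat` could sit in a small script run from the before-job subroutine (for example via ExecSH), with the Sequential File stage always reading xxx.dat. Copying rather than moving leaves the original file in place.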
times29
Premium Member
Posts: 202
Joined: Mon Sep 22, 2008 3:47 pm

Post by times29 »

I am passing the file name in the Sequential File stage and it's not liking it. I just want to test it before passing parameters.

/aa/test/latestfile=$(ls *.csv | sort -nt_ -k2.1,2.6 | tail -1)
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Well, what the heck does "not liking it" mean? :?
-craig

times29
Premium Member
Posts: 202
Joined: Mon Sep 22, 2008 3:47 pm

Post by times29 »

I mean I can't view data because no file is found, but I can see data if I give the same path name with a specific file name.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

times29 wrote: I am passing the file name in the Sequential File stage and it's not liking it. I just want to test it before passing parameters.

/aa/test/latestfile=$(ls *.csv | sort -nt_ -k2.1,2.6 | tail -1)
What does this mean? If this is what you put in the Sequential File stage as the filename, you can't do any such thing. What you can do is put those commands into an Execute Command stage in a Sequence job and then pass $CommandOutput to a Job Activity stage as a parameter value... as Andy noted.
-craig
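
As an illustration of the kind of command that could go into the Execute Command stage, here is a sketch that prints the full path of the newest file on a single line, so that the stage's output is just the value the job parameter needs. The function name `latest_csv_path` is hypothetical, and the sort key assumes the second underscore-separated field of the file name starts with a `YYYYMMDDHHMMSS` timestamp.

```shell
# Print the full path of the newest .csv file in directory $1.
# Returns non-zero and prints nothing when no .csv file is found.
latest_csv_path() {
    dir=$1
    # Pick the name whose timestamp (characters 1-14 of field 2)
    # sorts last.
    name=$(cd "$dir" && ls -1 *.csv 2>/dev/null |
        LC_ALL=C sort -t_ -k2.1,2.14 | tail -1)
    [ -n "$name" ] || return 1
    printf '%s\n' "$dir/$name"
}
```

The non-zero return gives the sequence something to branch on when the directory is empty. The captured output may also carry a trailing line terminator, so it is worth checking the value that actually reaches the job parameter in the log.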

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Are you giving this expression as the file name or as the filter command?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
times29
Premium Member
Posts: 202
Joined: Mon Sep 22, 2008 3:47 pm

Post by times29 »

Hi,
I don't understand what Andy is saying. I put the command in an Execute Command stage and it returned reply 0, which is the desired output. The next activity in the sequence is a job to which this output needs to be passed. My question is: where should I put this output as input, as Chulett said ($CommandOutput to a Job Activity stage as a parameter value)?

So what I understand is that the INPUT_FILE_NAME value expression will be $CommandOutput in the Job Activity for job 2, as job 1 will create the command output below:

Output from command ====>
/aa/test/aa_201005051223400386_test1.csv (from job 1)

So for job 2, the INPUT_FILE_NAME value expression will be $CommandOutput.

Is that right?

Thanks