File Pattern -- Reading sequentially

Shruthi · Post by **Shruthi** » Thu Dec 30, 2010 6:40 am

Hi,

I have 3 files which match a pattern.

JobName_Date1.txt
JobName_Date2.txt
JobName_Date3.txt

I want JobName_Date1.txt to be read first, then JobName_Date2.txt, and then JobName_Date3.txt.

Is there anywhere we can specify that the files must be read in sorted order?

Thanks,
Anitha

Sreenivasulu · Post by **Sreenivasulu** » Thu Dec 30, 2010 7:24 am

Use 'ls' with the appropriate options in an Execute Command activity.
Check in UNIX whether the output comes sorted the way you want; if it does, the same command can be plugged into DataStage.

Regards
Sreeni

Shruthi · Post by **Shruthi** » Thu Dec 30, 2010 7:39 am

Our OS is Windows. How can it be done here?

samyamkrishna · Post by **samyamkrishna** » Thu Dec 30, 2010 8:04 am

Use the DIR command.

chulett · Post by **chulett** » Thu Dec 30, 2010 9:37 am

DataStage installs the MKS Toolkit on Windows, so you should in fact have UNIX-like capabilities, and thus commands like "ls" can be used. Or just stick with "dir /b"; either will display files alphabetically by default.

That "/b" option gives you a "bare" listing, one with just the filenames, which is what would be appropriate in the Sequential File stage's File Pattern option.

Abhijeet1980 · Post by **Abhijeet1980** » Thu Dec 30, 2010 9:40 am

Shruthi,

Kindly explain to us the logic for sorting the files.

Files may be sorted on the following attributes:
Modified Date
File Name
...
and many other attributes

I hope, that helps.

daignault · Post by **daignault** » Thu Dec 30, 2010 10:36 am

Set up your Seq stage to read the 3 files using a wildcard. Create a new column as part of the take-on that holds the name of the source file (use a parameter, maybe?).

Use a hash partition method on the source file key you created above and the data will be grouped in the same way as the take-on.

Cheers,

Ray D

DSguru2B · Post by **DSguru2B** » Thu Dec 30, 2010 10:45 am

Read the files regardless of the order. Sort the data downstream on the file name. This way you ensure the proper sort order regardless of the order in which the files were read.

Shruthi · Post by **Shruthi** » Mon Jan 03, 2011 3:10 am

Thanks for all your replies.
If I want to specify "dir /b" or "ls -a", where should I specify it? I tried the following:

dir /b D:/Employee*.txt

But it's failing with the following error message:

Sequential_File_1,0: Couldn't find any files on host nibc1521 with pattern dir\ /b\ D:/Employee*.txt.

samyamkrishna · Post by **samyamkrishna** » Mon Jan 03, 2011 4:01 am

It's in the File property, under something like "command".

chulett · Post by **chulett** » Mon Jan 03, 2011 7:11 am

You've got things in the right place as it is trying to do what you asked. And I wasn't specific enough in my previous reply - that "dir /b" output is an example of what the stage needs, not the actual command itself to be used there. Sorry.

Try simply putting "D:/Employee*.txt" there and letting us know how it goes.

Shruthi · Post by **Shruthi** » Sat Jan 08, 2011 9:40 pm

If I simply put "D:/Employee*.txt", the files are read in alphabetical order. Now I want them to be read in order of system date and time.

Is there such a thing as indirect file reading in DataStage? That is, the names of all the files to be read would be put in one file, and DataStage would read that file to get the file names and in turn read the files.

ray.wurlod · Post by **ray.wurlod** » Sat Jan 08, 2011 10:12 pm

Capture the output of the ls -1rt Employee*.txt command (one name per line, oldest file first), convert the line terminators to, say, commas, and use that string as the "list of things" processed by a Start Loop activity.
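
A minimal sketch of that capture-and-convert step, assuming a POSIX-style shell such as the one the MKS Toolkit provides. The function name `list_oldest_first_csv` is made up for illustration, and the sketch assumes the file names contain no whitespace or commas.

```shell
# Hypothetical helper: print the files matching a wildcard pattern as a single
# comma-separated line, oldest modification time first.
# Assumes file names contain no whitespace, commas, or newlines.
list_oldest_first_csv() {
    # $1 is deliberately left unquoted so the shell expands the wildcard.
    # ls -1: one name per line; -t: sort by modification time; -r: reverse (oldest first).
    # paste -s -d, joins the lines with commas and leaves no trailing comma.
    ls -1rt $1 | paste -s -d, -
}
```

Called as `list_oldest_first_csv 'Employee*.txt'` from the directory holding the files, it prints one delimited string that could be handed to the loop. Quoting the pattern at the call site keeps the wildcard intact until the function expands it.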

jwiles · Post by **jwiles** » Sun Jan 09, 2011 9:35 am

Shruthi wrote: Is there such a thing as indirect file reading in DataStage? That is, the names of all the files to be read would be put in one file, and DataStage would read that file to get the file names and in turn read the files.

You could use a FileSet stage. Create a fileset file (name.fs) containing rows which look like this:

nodename:full_filename

and pass the fileset name to the FileSet stage. You'll probably need to set the "Use Schema defined in File Set" to False.

An example would be:

compute1:/home/dsadm/my_input_file_1.txt
compute1:/home/dsadm/my_input_file_2.txt
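
If the rows need to follow file time order, they could be generated rather than typed by hand. The following is only a sketch: the function name `make_fileset_rows` is hypothetical, the row layout is simply the nodename:full_filename form described above (not checked against what the FileSet stage actually accepts), and file names are assumed to contain no whitespace.

```shell
# Hypothetical helper: print one "nodename:full_filename" row per matching
# file, oldest modification time first.
# Assumes file names contain no whitespace or newlines.
make_fileset_rows() {
    node=$1      # node name to prefix each row with, e.g. compute1
    pattern=$2   # wildcard pattern including the full path
    # $pattern is deliberately left unquoted so the shell expands the wildcard.
    ls -1rt $pattern | while IFS= read -r file; do
        printf '%s:%s\n' "$node" "$file"
    done
}
```

For example, `make_fileset_rows compute1 '/home/dsadm/my_input_file_*.txt' > name.fs` would write the rows to the fileset file. Dropping the `-r` flag would list the newest file first instead.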

However, building on Craig's and Ray's suggestions, you should be able to do the following with the SeqFile stage:

ls -1t D:/Employee*.txt

Either method will work and I expect the second one will be easier for you.

Regards,
