Need to read the filenames coming to the server

dsedi · Post by **dsedi** » Tue Mar 21, 2006 7:34 am

Hi All

Could you please suggest how to read the file name(s) and the first row of each file, and then store them in an Oracle table?

In a Server job, we have a source stage which gives us the filename and the entire data of that file.

If I only want the first row of the file instead of the entire data in a Server job, how do I go about it?

And how do I go about it in a Parallel job?

Thanks

kumar_s · Post by **kumar_s** » Tue Mar 21, 2006 7:39 am

If you need to read only the first line of the file, use the 'Read first n rows' option available in the Sequential File stage and set it to 1.

If you need the name of the file in the same stage as another column, use the 'File Name Column' option available in the Sequential File stage.

If you need to read the names of the files present on the server, assign /dev/null in the file name option and give ls -1 in the filter option of the stage.
Or you can use the FilePattern option. You can give '/path/' so that the input data for this stage is the names of the files present in the directory.
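
As a rough sketch of what the ls -1 filter would hand to the stage, here is the command run against a throwaway directory. The directory and the two file names are hypothetical stand-ins, not anything from the actual job:

```shell
# Hypothetical landing directory with two empty sample files.
indir=$(mktemp -d)
touch "$indir/Filename1" "$indir/Filename2"

# ls -1 prints one file name per line; this is the text the stage
# would receive as its input rows when ls -1 is used as the filter.
file_list=$(ls -1 "$indir")
printf '%s\n' "$file_list"
```

Each output line is a single file name, so the stage would need just one character column to hold it.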

chulett · Post by **chulett** » Tue Mar 21, 2006 7:48 am

kumar_s wrote:If you need to read only the first line of the file, use the 'Read first n rows' option available in the Sequential File stage and set it to 1.

Interesting. In this thread people claim there is no such option.

kumar_s · Post by **kumar_s** » Tue Mar 21, 2006 7:58 am

chulett wrote:
kumar_s wrote:If you need to read only the first line of the file, use the 'Read first n rows' option available in the Sequential File stage and set it to 1.
Interesting. In this thread people claim there is no such option.

The issue in the linked thread is skipping the first n rows, whereas here it is reading only the first row.
Still, I have suggested one option in the other post.

chulett · Post by **chulett** » Tue Mar 21, 2006 8:06 am

Man, I really should stop posting first thing in the morning.

kumar_s · Post by **kumar_s** » Tue Mar 21, 2006 8:10 am

chulett wrote:Man, I really should stop posting first thing in the morning.

Just the one after 5000 successful posts should stop you.

dsedi · Post by **dsedi** » Wed Mar 22, 2006 3:40 am

Hi Kumar

I am working on PX 7.0. I found neither the 'Read first n rows' option nor the 'File Name Column' option.

The filter command will work on the entire file, but in my case I need to convert something like this

Column1    Column2

Filename1  022222222222222222222
           222222222222222222211

Filename2  022222222222222222222
           222222222222222222211

into

Column1    Column2

Filename1  022222222222222222222
Filename2  022222222222222222222

and so on.

Thanks & Regards
Dsedi

kumar_s · Post by **kumar_s** » Wed Mar 22, 2006 4:49 am

In that case, I guess you can use a before-job subroutine to run:
head -1 filenames/filepatterns > NewFileName.txt

If a list of files is used,

Code:

head -1 filename1 filename2 .. | grep -v '==>' > NewFileName.txt

keeps the file-name header lines that head prints out of the content.
Read NewFileName.txt with the proper metadata and skip the intermediate blank lines.
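
A minimal sketch of the same idea, run on made-up sample files. The file names and contents mirror the example earlier in this thread. FirstRows.txt and the awk variant are assumptions added for illustration, not something tested on PX 7.0. The first command drops both the headers and the blank lines that head emits for multiple files. The awk line is an alternative that keeps the file name next to its first row, which is the layout asked for above:

```shell
# Hypothetical sample files, two rows each, in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '022222222222222222222\n222222222222222222211\n' > Filename1
printf '022222222222222222222\n222222222222222222211\n' > Filename2

# With several files, head prints a "==> name <==" header before each
# file and a blank line between files; filter both out so that only
# the first rows remain.
head -n 1 Filename1 Filename2 | grep -v -e '^==> .* <==$' -e '^$' > FirstRows.txt

# Alternative: FNR restarts at 1 for every input file, so this prints
# each file's name followed by its first row only.
awk 'FNR == 1 { print FILENAME, $0 }' Filename1 Filename2 > NewFileName.txt

cat NewFileName.txt
```

The awk output has two space-separated fields per line, so it can be read with a two-column definition (Column1, Column2) before loading into the Oracle table.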