Need to read the filenames coming to the server

dsedi
Participant
Posts: 220
Joined: Wed Jun 02, 2004 12:38 am

Post by dsedi »

Hi All

Could you please suggest how to read the file name(s) and the first row of each file, and then store them in an Oracle table?

In a Server job, we have a source stage that gives us the file name and the entire contents of that file.

If I only want the first row of the file instead of the entire contents, how do I do that in a Server job?

And how do I do it in a Parallel job?

Thanks
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

If you need to read only the first line of the file, use the 'Read first n rows' option available in the Sequential File stage and set it to 1.

If you need the name of the file as another column in the same stage, use the 'File Name Column' option available in the Sequential File stage.

If you need to read the names of the files present on the server, assign /dev/null in the file name option and give ls -1 in the filter option of the stage.
Or you can use the File Pattern option. You can give '/path/' so that the input data for this stage would be the names of the files present in the directory.
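
To illustrate the /dev/null plus ls -1 idea outside DataStage, here is a minimal shell sketch. It assumes the filter command receives the named file on standard input and that its output becomes the stage's rows. The function name list_file_names and the directory argument are hypothetical, used only for illustration:

```shell
#!/bin/sh
# Hypothetical helper: list the entries of a directory, one name per line.
list_file_names() {
    dir=$1
    # ls ignores standard input, so feeding it /dev/null changes nothing;
    # -1 forces exactly one entry per output line.
    ls -1 "$dir" < /dev/null
}

# Example call (the path is a placeholder):
# list_file_names /path/
```

Each output line can then be read as a single character column holding the file name.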
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

kumar_s wrote:If you need to read only the first line of the file, use the 'Read first n rows' option available in the Sequential File stage and set it to 1.
Interesting. In this thread people claim there is no such option. :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

chulett wrote:
kumar_s wrote:If you need to read only the first line of the file, use the 'Read first n rows' option available in the Sequential File stage and set it to 1.
Interesting. In this thread people claim there is no such option. :?
The issue in the linked thread is about skipping the first n rows, whereas here it is about reading only the first row.
Still, I have suggested one option in the other post.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Man, I really should stop posting first thing in the morning. :roll:
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

chulett wrote:Man, I really should stop posting first thing in the morning. :roll:
Just the one after 5000 successful posts should stop you. :wink:
dsedi
Participant
Posts: 220
Joined: Wed Jun 02, 2004 12:38 am

Post by dsedi »

Hi Kumar

I am working on PX 7.0. I found neither the 'Read first n rows' option nor the 'File Name Column' option.

The filter command will work on the entire file, but in my case I need to convert something like this

Column1    Column2

Filename1  022222222222222222222
           222222222222222222211

Filename2  022222222222222222222
           222222222222222222211


into

Column1    Column2

Filename1  022222222222222222222
Filename2  022222222222222222222

and so on.


Thanks & Regards
Dsedi
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

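But I guess you can make use of a before-job subroutine:
head -1 filenames/filepatterns > NewFileName.txt

If a list of files is used, head prints a '==> filename <==' header before each file's first line, so filter those headers out:

head -1 filename1 filename2 .. | grep -v '^==> .* <==$' > NewFileName.txt
to keep the file names out of the content. (A plain grep -v = would also drop any data row that contains an = sign.)
Read NewFileName.txt with the proper metadata.
Skip the intermediate blank lines.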
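
As a rough sketch of the same idea that also keeps each file name next to its first row (the two-column layout asked for earlier in the thread), a small shell loop could be used instead of a single head call. The function name first_rows, the | delimiter, and the paths are assumptions for illustration only, not anything provided by DataStage:

```shell
#!/bin/sh
# Hypothetical helper: print "<file name>|<first line>" for each file given.
first_rows() {
    for f in "$@"; do
        # Skip anything that is not a readable regular file.
        [ -f "$f" ] && [ -r "$f" ] || continue
        # head -n 1 is the POSIX spelling of head -1; calling it once per
        # file avoids the "==> name <==" headers and the blank separators.
        printf '%s|%s\n' "$(basename "$f")" "$(head -n 1 "$f")"
    done
}

# Example call from a before-job script (paths are placeholders):
# first_rows /path/to/files/* > NewFileName.txt
```

The resulting file can then be read with a two-column definition using | as the delimiter, assuming that character never appears in the file names.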