Wild Card - Get File Name


synsog
Premium Member
Posts: 232
Joined: Sun Aug 01, 2010 11:01 pm
Location: Pune


Post by synsog »

Hi,

DS 8.7 on Windows

We are using a File Pattern in a Sequential File stage because we receive dated files such as abc_20120322.txt (we expect only one per day).

Reading and processing this file is not a problem, but I want to know the exact name of the file so that an email notification can be sent.

How can I get the full file name (abc_20120322.txt)? Currently my SrcFileName param = 'abc_' and we append *.txt in the Sequential File stage itself.

Thanks
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

There's an option in the stage to add an output link for the filename, one of the properties from what I recall. Don't have any documentation here, so I can't quote chapter and verse, I'm afraid. You may also need to add a specific $APT environment variable to help it, but try it first without it and see what happens.
-craig

"You can never have too many knives" -- Logan Nine Fingers
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

That would be the $APT_IMPORT_PATTERN_USES_FILESET environment variable.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Couldn't pull it out of my hat earlier today, thanks for that. :wink:
-craig

synsog
Premium Member
Posts: 232
Joined: Sun Aug 01, 2010 11:01 pm
Location: Pune

Post by synsog »

I tried without using the environment variable (which seems to be more for parallelising the read in case of multiple files by treating them as a FileSet).

I added the "File Name Column" option in the Seq File stage, and it returns the file name without the date part, as \Path\abc*.Txt, in a new output column, but I need it to return \Path\abc_20130322.Txt
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Add the environment variable.
-craig

jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

"which seems to be more for parallelising the read in case of multiple files by treating it as a FileSet"

That is its primary purpose, but a side effect is that you are returned the actual filenames. The pattern alone (without the env variable set) causes the operator to concatenate the matching files together prior to reading them (along the lines of `cat \Path\abc*.Txt | program` from the command line). Individual filenames are not available, and so the pattern is returned as the filename. Adding the variable requests that the files be read individually (that's how the fileset works) and thus provides the individual filenames.
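The difference between the two read modes can be illustrated with a small Python sketch. This is only an analogy for the behaviour described above, not how the DataStage import operator is implemented; the function names `read_concatenated` and `read_individually` are made up for the illustration.

```python
import glob
import io


def read_concatenated(pattern):
    """Analogy for `cat pattern | program`: all matches are merged into one
    stream first, so the only "file name" left to report is the pattern."""
    stream = io.StringIO()
    for path in sorted(glob.glob(pattern)):
        with open(path) as handle:
            stream.write(handle.read())
    stream.seek(0)
    return [(line.rstrip("\n"), pattern) for line in stream]


def read_individually(pattern):
    """Analogy for the fileset-style read: each match is opened on its own,
    so every record can be tagged with the real file name."""
    rows = []
    for path in sorted(glob.glob(pattern)):
        with open(path) as handle:
            for line in handle:
                rows.append((line.rstrip("\n"), path))
    return rows
```

In the first function the record source is lost at the moment the files are merged, which mirrors why the File Name Column comes back as the pattern; in the second, the name is still known when each record is read.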

Regards,
- james wiles


synsog
Premium Member
Posts: 232
Joined: Sun Aug 01, 2010 11:01 pm
Location: Pune

Post by synsog »

After adding the environment variable and setting it to True, the job runs successfully but does not read from the file; it shows 0 records (I verified that the file is present and has 45 records).

It gives this warning message:

SEF_WNOACCS3_D34001: Fileset /tmp/import_tmp_729269132fb8.fs contains no files. If a file pattern was specified this indicates the pattern returned no files.

This is my File Pattern in the Seq Stage: #FileConnections.pSrcFilePath##FileConnections.pSrcFileName_Campus#*.TXT

Without this environment variable, the job runs successfully and reads all the data.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Odd... I don't recall anyone posting that issue before. Usually it's what we've been discussing - it 'works' without the variable but returns the pattern as the filename. Add the variable and you get the full filename, easy peasy.

With that behaviour I'd contact your official support provider, ask them what the heck is going on. :wink:
-craig

jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

Have you selected "File" or "File Pattern" in the source/read method options on the output link tab? Try both to see if there is any difference in results. I don't have access to a DS server right now (visiting a client) or I would look up what I've done in the past.

Looks like they have at least rewritten the error message since 8.0...the original message was very misleading, but can't recall what it said.

Regards,
- james wiles


synsog
Premium Member
Posts: 232
Joined: Sun Aug 01, 2010 11:01 pm
Location: Pune

Post by synsog »

jwiles,
I had selected File Pattern.
It does not work with the Specific File option either, which I guess is expected, as this is not a file name but a pattern.

I see a similar post here for v8.5; it says this is a bug that IBM is to fix, but it doesn't look like it has been :(

viewtopic.php?p=416953
prasannakumarkk
Participant
Posts: 117
Joined: Wed Feb 06, 2013 9:24 am
Location: Chennai,TN, India

Post by prasannakumarkk »

Here is a workaround based on your statement:
"We expect only 1 per day"
Get the name of the latest available file in your folder with whichever Unix command you are comfortable with, or, if you clean up the files after processing, take the name(s) of the files that have not been processed yet. This should be better than generating the file name for every record (space usage) and taking the first record.
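Since the original poster is on Windows, where a Unix one-liner may not be available, the same idea can be sketched in Python. This is a minimal sketch, assuming "latest" means most recently modified; the function name `latest_matching_file` and its parameters are hypothetical.

```python
import glob
import os


def latest_matching_file(directory, prefix, suffix=".txt"):
    """Return the most recently modified file named <prefix>*<suffix>,
    or None when nothing matches."""
    candidates = glob.glob(os.path.join(directory, prefix + "*" + suffix))
    if not candidates:
        return None
    # Modification time decides; with YYYYMMDD in the name, picking the
    # lexicographically largest name would be an alternative.
    return max(candidates, key=os.path.getmtime)
```

Called with the landing directory and the 'abc_' prefix, this would return the full path of the newest abc_*.txt file, which could then go into the notification email.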
Thanks,
Prasanna
prasannakumarkk
Participant
Posts: 117
Joined: Wed Feb 06, 2013 9:24 am
Location: Chennai,TN, India

Post by prasannakumarkk »

Basically, I think the file name on the output link is only useful when the data needs to be processed based on the file name. You don't have such a requirement, so go with commands.
Thanks,
Prasanna
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

Workarounds exist anyway; one is to read the file name in a sequence, before calling the job, and pass it as a parameter.

There may be a requirement where rejected records should carry the file name, to show which file they came from, or where the file name is loaded into a target column for audit purposes.

I never had a requirement like that, but in our case we needed the file name not in the data but for rejections and audit: the file name had to be stored to control the run (which file we processed, the record count in the file, etc.). We used scripts to get the name of the file we were processing.
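The "resolve the name first, then pass it as a parameter" approach might look roughly like the sketch below when driven from a script rather than a job sequence. It assumes the `dsjob` command-line client is available; the project name, job name and parameter name used in the usage note are placeholders, not values from this thread. The command is only built, not executed.

```python
import glob
import os


def resolve_single_file(directory, prefix, suffix=".txt"):
    """Return the one file matching <prefix>*<suffix>; fail loudly otherwise,
    since only one file per day is expected."""
    matches = sorted(glob.glob(os.path.join(directory, prefix + "*" + suffix)))
    if len(matches) != 1:
        raise RuntimeError(f"expected exactly 1 file, found {len(matches)}: {matches}")
    return matches[0]


def build_dsjob_command(project, job, param_name, file_path):
    """Build (but do not run) a dsjob call that hands the resolved file name
    to the job as a parameter."""
    return ["dsjob", "-run", "-jobstatus",
            "-param", f"{param_name}={file_path}",
            project, job]
```

For example, `build_dsjob_command("MyProject", "LoadCampus", "pSrcFullFileName", path)` yields an argument list that could be handed to `subprocess.run`. The job would then read a specific file instead of a pattern, and the same value is available for the email or audit record.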
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
nvalia
Premium Member
Posts: 180
Joined: Thu May 26, 2005 6:44 am

Post by nvalia »

I have a similar issue.
$APT_IMPORT_PATTERN_USES_FILESET = TRUE has been set in the job.

When I pass parameters like this in the File Pattern - #FileConnections.pSrcFilePath##FileConnections.pSrcFileName#*.TXT
(Read Method = File Pattern) I get an error as below, even though the file is present

Fileset /tmp/import_tmp_13008d3a6b2e5.fs contains no files. If a file pattern was specified this indicates the pattern returned no files.

The cause seems to be that the source path is on a network drive and not on the same server as the DataStage engine, e.g. \\DASD\asxcd$\TEST\filename*.TXT

I verified this by manually copying the file over to the DS server: the job then works fine and also returns the exact file name as expected.

How can I resolve this issue? For me the files will always be on a network drive and not on the DS server.
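Until the underlying problem is resolved, the manual copy described above could be automated as a pre-job step. The following is a minimal sketch, assuming the account running the script can read the network share; `stage_locally` and the local staging directory are hypothetical names, and this works around the symptom rather than fixing it.

```python
import glob
import os
import shutil


def stage_locally(source_pattern, local_dir):
    """Copy every file matching source_pattern into local_dir and return
    the list of local copies."""
    os.makedirs(local_dir, exist_ok=True)
    staged = []
    for source in sorted(glob.glob(source_pattern)):
        target = os.path.join(local_dir, os.path.basename(source))
        shutil.copy2(source, target)  # copy2 also keeps the modification time
        staged.append(target)
    return staged
```

The job's file pattern would then point at the local staging directory instead of the UNC path, matching the setup that was reported to work.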