Find a file name coming out of a folder stage

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

BIuser
Premium Member
Premium Member
Posts: 238
Joined: Thu Feb 02, 2006 4:03 am
Location: South Africa

Find a file name coming out of a folder stage

Post by BIuser »

Hi there

My job contains a Folder stage that uses a wildcard character to select any document within a folder with the extension of .csv.

Problem is that I need to do an Audit Control check within my job (custom built) but the stored proc I call to implement the Audit Control check uses a SPECIFIC file name (so I can't use a wild card). What can I do to identify the file name that has been selected from the folder stage so that I can reference that in my stored Proc ?

Thanks.
Barbara
-------------------------
https://www.ssa.co.za
throbinson
Charter Member
Charter Member
Posts: 299
Joined: Wed Nov 13, 2002 5:38 pm
Location: USA

Post by throbinson »

Define an output Column of FileName. The FileName being read by the folder stage will be passed to the output link. Make it the key, Varchar 255. Set Fully qualifed to yes if you need it.
BIuser
Premium Member
Premium Member
Posts: 238
Joined: Thu Feb 02, 2006 4:03 am
Location: South Africa

Post by BIuser »

I have defined an output column within the folder stage of FILENAME as you suggested. This stage then joins onto a filesplitter stage.

I get this error:
CopyOfCopyOfLoad_Reserve_Summary_02..reserve_summary_this_month.: DSD_DIROpen Folder Stage output link cannot have more than 2 columns.
-------------------------
https://www.ssa.co.za
throbinson
Charter Member
Charter Member
Posts: 299
Joined: Wed Nov 13, 2002 5:38 pm
Location: USA

Post by throbinson »

Ah. You were hoping that your other column definitions would work on the folder stage as they do in the Sequential File Stage. They will not. You can only have two columns FileName and a column for the entire row of a given file. Nothing else. It's documented.
Page 201 of the Server's Developer Guide;
The Columns tab defines a maximum of two columns. The first
column must be marked as the Key and receives the file name. The
second column, if present, receives the contents of the file.
Believe me, I feel your pain.
BIuser
Premium Member
Premium Member
Posts: 238
Joined: Thu Feb 02, 2006 4:03 am
Location: South Africa

Post by BIuser »

*&$#@$#^%@#$%$#
:x
Thanks ... glad to see that you've read your documentation ! :-)

Ok, so could I then use TWO folder stages, one just to get the filename? But how would that link to anything else in the job ? I can't go from a folder to a sequential file to a folder stage again ... can I ?

Is there any other way to do this ?
-------------------------
https://www.ssa.co.za
throbinson
Charter Member
Charter Member
Posts: 299
Joined: Wed Nov 13, 2002 5:38 pm
Location: USA

Post by throbinson »

Well said. no, you can't use two folder stages nor can you pass the filename to a Sequential stage. You will have to parse the row into columns yourself.
I only use the Folder stage for reading XML chunks, one per file. For that it works great. Others can advise you but I think you'd have to read the lines of the files and then parse them manually in a Transformer. Either the files are fixed width or they are delimited. Simple substrings would cut out the fields if fixed width, if delimited then you can use the FIELD function, I believe.
BIuser
Premium Member
Premium Member
Posts: 238
Joined: Thu Feb 02, 2006 4:03 am
Location: South Africa

Post by BIuser »

*SIGH* :roll:

What about using an Execute Command stage and then using DOS commands to bring back the filename ?

Can you output that to a sequential file stage ?
-------------------------
https://www.ssa.co.za
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Right, as noted the Folder stage was designed for processing XML files but (technically) can be used with other file types. You can only create one or two columns, with two columns:

* Column 1 contains the filename of the current file being processed.
* Column 2 contains the complete contents of the file in one chunk.

So you will get one row per file out of the stage and you'd need to parse it into individual records and fields from there.
-craig

"You can never have too many knives" -- Logan Nine Fingers
BIuser
Premium Member
Premium Member
Posts: 238
Joined: Thu Feb 02, 2006 4:03 am
Location: South Africa

Post by BIuser »

Ok, but here's the problem ... I need a filename to send to a stored procedure. If I cannot output the filename from the folder stage to a sequential file, how am I going to use that filename to send as an input parameter to a stored procedure ?

Will I be able to use the Execute Command stage and use DOS commands to read the filename ?
-------------------------
https://www.ssa.co.za
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Typically I would concatenate all of the files together than process the concatenated result, but you need the individual filenames, yes?

Then you may need to consider a looping Sequence job, one that gets a list of files with the UserVariables Activity stage and then the Start/End Loop stages to run your processing job once per incoming file.
-craig

"You can never have too many knives" -- Logan Nine Fingers
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Dang, you need to slow down on the replies. :wink:
BIuser wrote:Ok, but here's the problem ... I need a filename to send to a stored procedure. If I cannot output the filename from the folder stage to a sequential file, how am I going to use that filename to send as an input parameter to a stored procedure ?
Of course you can do that. From a Transformer after the Folder stage, send the contents of the first Folder field to a sequential file. Your problem remains the need to parse the file itself, however.
BIuser also wrote:Will I be able to use the Execute Command stage and use DOS commands to read the filename ?
More than likely, yes. There certainly are OS commands to get lists of files (dir, ls -l, etc), that part should be pretty easy. Can you process the filenames and the files themselves separately or does that need to happen simultaneously?
-craig

"You can never have too many knives" -- Logan Nine Fingers
throbinson
Charter Member
Charter Member
Posts: 299
Joined: Wed Nov 13, 2002 5:38 pm
Location: USA

Post by throbinson »

Why can't you pass the FileName column directly from the Folder Stage to the stage that contains the stored procedure?
BIuser
Premium Member
Premium Member
Posts: 238
Joined: Thu Feb 02, 2006 4:03 am
Location: South Africa

Post by BIuser »

Slow down on the replies ... are you struggling to keep up with the lady ??? :wink:

Yes, sadly, I have to do all this in one step. So the stages have to run after each other ... :-(

But at least I know now that I can send that FILENAME field to a Transformer ...

Thanks for that one ...
-------------------------
https://www.ssa.co.za
BIuser
Premium Member
Premium Member
Posts: 238
Joined: Thu Feb 02, 2006 4:03 am
Location: South Africa

Post by BIuser »

throbinson wrote:Why can't you pass the FileName column directly from the Folder Stage to the stage that contains the stored procedure?
If I can do that then that will solve most of my problems ... I think ...

The problem is figuring out what happens when the stored procedure sends back either a "fail" or a "good to go" response. If the response is good to go then I need to parse the rows from the file. Will I be able to go to the stored proc and then process results in a Transformer and then revert back to looking at the folder stage again and take the processing from there ?
-------------------------
https://www.ssa.co.za
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

:lol:

In your shoes (even though now I know they probably wouldn't be my style) I'd build a Looping Sequence to do this rather than fight with the Folder stage, especially if the files are of any size. The Folder stage has a well known problem with 'large' files and will tend to fall over dead if the contents of the file can't be stored in memory.

What 'large' means seems to vary from system to system and O/S to O/S. Some can handle hundreds of megabytes while we've seen others that can only handle tens. FYI.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply