Load from sequential files


hondaccord94
Participant
Posts: 46
Joined: Tue Aug 10, 2004 11:07 am
Location: Mclean VA

Post by hondaccord94 »

Hi,

I have a situation here that I don't know how to tackle on Windows.
I have a DS Server job that loads from a .csv file into an Oracle table.
On some days the data arrives in one file, and on other days it arrives in two or more files. Is there a way to automate this process, for example by finding all the files in that particular folder and loading them one after the other?
Please advise.

Thanks
Sandeep
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

There are different ways to approach this and it depends on what you are comfortable with.

DataStage has a Folder stage that can be used to bring a list of filenames in from a directory. Those filenames can then be passed to another job or routine that, one by one, processes them. One problem, though, is the stage will also want to pass the entire contents of the file through as well. It was really built to handle XML data, hence the behaviour, which is meant to feed the XML Reader stage from what I recall.
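As a rough illustration of the behaviour described above (this is a plain Python analogy, not DataStage code, and the function name `folder_rows` is made up for the example), each matching file becomes one row carrying both its name and its entire contents:

```python
from pathlib import Path
from typing import Iterator, Tuple


def folder_rows(directory: str, pattern: str = "*") -> Iterator[Tuple[str, str]]:
    """Yield one (file name, full file contents) pair per matching file."""
    for path in sorted(Path(directory).glob(pattern)):
        if path.is_file():
            # The whole file travels as a single field, not one row per line,
            # which is why this shape suits XML documents better than CSV data.
            yield path.name, path.read_text()
```

A downstream step that only wants the list of names would simply ignore the second element of each pair.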

Or you can go the Job Control / scripted route. You could consider taking any and all files found in the folder that match your criteria and concatenating them into one file - and then processing that file. Or manually looping through a directory listing and launching the processing job over and over, passing in the next filename as a job parameter. That would depend on how comfortable you were writing either job control code in DataStage or batch files in Windows.
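For the looping variant, here is a minimal sketch in Python rather than job control code or a batch file. It only builds one `dsjob -run` command line per file and does not execute anything. The project name, job name and parameter name used in the usage note are hypothetical, and the exact `dsjob` options should be checked against your own installation.

```python
import glob
import os
from typing import List


def find_input_files(directory: str, pattern: str) -> List[str]:
    """Return the matching files in a stable (sorted) order."""
    return sorted(glob.glob(os.path.join(directory, pattern)))


def build_run_command(project: str, job: str, param: str, filename: str) -> List[str]:
    """Build one dsjob call that passes the file name as a job parameter."""
    return ["dsjob", "-run", "-jobstatus",
            "-param", f"{param}={filename}", project, job]


def plan_runs(directory: str, pattern: str,
              project: str, job: str, param: str) -> List[List[str]]:
    """One command per file found; the caller decides how to launch them."""
    return [build_run_command(project, job, param, name)
            for name in find_input_files(directory, pattern)]
```

For example, `plan_runs(r"E:\dev\srcdata", "*.csv", "MyProject", "LoadCsvJob", "SourceFile")` would return one command list per .csv file, and each could then be handed to `subprocess.run` in turn.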
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Welcome aboard! :D

If the multiple files are all in the one directory, all are governed by the same metadata ("table definition") and are moved out of there after processing, you can still use a Sequential File stage. Simply specify a filter command such as type *.* - the output from this command will become the "stdin" of the Sequential File stage. On UNIX you'd use cat *
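To picture what the stage receives in that setup, here is a small Python sketch of the same idea (an analogy only; the function name is invented for the example): every matching file is read in turn and the lines are handed on as one continuous stream, much as `type *.*` or `cat *` would write them to standard output.

```python
import glob
import os
from typing import Iterator


def concatenated_lines(directory: str, pattern: str = "*") -> Iterator[str]:
    """Yield the lines of every matching file, one file after another."""
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        if os.path.isfile(path):
            with open(path) as handle:
                # No separator is added between files, so all of them must
                # share the same column layout for the result to make sense.
                yield from handle
```

Because the files are simply joined end to end, any header row present in each file would appear repeatedly in the combined stream.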
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
hondaccord94
Participant
Posts: 46
Joined: Tue Aug 10, 2004 11:07 am
Location: Mclean VA

Post by hondaccord94 »

Ray/Chulett,

Thanks for your help. I will try all three options and see which one suits my requirements. I will get back to you if I run into any problems.
Thanks once again.
hondaccord94
Participant
Posts: 46
Joined: Tue Aug 10, 2004 11:07 am
Location: Mclean VA

Post by hondaccord94 »

Sorry Craig, I got your name wrong.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

No problem, happens all the time. :wink:

Ray's suggestion of leveraging the Filter command of the Sequential File stage would be best, I would think. The only issue after that would be how you keep it from reprocessing the same files later. And the answer to that will depend on how 'dynamic' the list of files in that folder is. :D

You'd have that issue with any of the suggested methods, however.
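One common way around the reprocessing problem is the one Ray hinted at: move each file out of the source folder once it has been loaded. A minimal Python sketch, assuming a hypothetical archive folder and an invented function name:

```python
import os
import shutil
from typing import List


def archive_processed(files: List[str], archive_dir: str) -> List[str]:
    """Move already-loaded files into archive_dir and return their new paths."""
    os.makedirs(archive_dir, exist_ok=True)
    moved = []
    for path in files:
        target = os.path.join(archive_dir, os.path.basename(path))
        # Name clashes with files already in the archive are not handled here.
        shutil.move(path, target)
        moved.append(target)
    return moved
```

This would run only after the load has finished successfully, so that a failed run leaves the files in place for the next attempt.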
-craig

hondaccord94
Participant
Posts: 46
Joined: Tue Aug 10, 2004 11:07 am
Location: Mclean VA

Post by hondaccord94 »

Ray/Craig,

I am now trying to use the filter command option that Ray described. I have a question that might sound silly, but I hope you can help me with it.

Say I am trying to find the files with the extension 'txt' in the folder E:\dev\srcdata. I assumed I don't have to give anything in the filename option, and in the filter command window I gave this command:
E:\dev\srcdata\type *.txt. This doesn't seem to work; it throws an error. Please advise.

Thanks
Sandeep
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Try this instead:

Code:

type E:\dev\srcdata\*.txt
-craig

hondaccord94
Participant
Posts: 46
Joined: Tue Aug 10, 2004 11:07 am
Location: Mclean VA

Post by hondaccord94 »

Ray,

Tried that; it throws the error "no file name to open". Do I have to give anything in the file name option? For your information, my test job looks like this:

seq_file_stg -----------> transformer ----------------> seq_file_stg
(straight moves)

Please advise.

Thanks
Sandeep
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You can call me Ray if you like. :wink:

Yes - you do have to put something in the filename box. As far as I know it doesn't use it, but you need something there.
-craig

hondaccord94
Participant
Posts: 46
Joined: Tue Aug 10, 2004 11:07 am
Location: Mclean VA

Post by hondaccord94 »

Craig,

I am sorry again... :( .

I tried doing that, and now I am getting this error:

ds_seqopen() - Win32 error in CreateProcess() - The system cannot find the file specified.

Please advise

Thanks
Sandeep
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

That should mean that there are no files in your directory that match your wildcard pattern.

What happens if you take your exact same filter command and execute it from the command line in a 'Cmd window'? I'm guessing you'll get the exact same error.
-craig

hondaccord94
Participant
Posts: 46
Joined: Tue Aug 10, 2004 11:07 am
Location: Mclean VA

Post by hondaccord94 »

Craig,

I ran the same command in the Cmd window and it works fine. I am not sure why it is not working in DS.

I gave the complete file name in the file name option and the wildcard filter type <dir>*.txt. I also tried giving just the filter option.
No use.
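One possible explanation for a command that works in a Cmd window but fails with a CreateProcess error, offered as an assumption and not confirmed anywhere in this thread: `type` is a built-in of cmd.exe rather than a standalone program, so a process launcher that bypasses the command interpreter has no file called `type` to start. If that is what happens here, routing the filter through the interpreter might help. The Python sketch below only shows the shape of that idea; the function names are invented for the example.

```python
import shutil
from typing import List


def is_standalone_program(name: str) -> bool:
    """True if `name` resolves to an executable file on PATH."""
    return shutil.which(name) is not None


def wrap_in_cmd(filter_command: str) -> List[str]:
    """Run an interpreter built-in (such as `type`) through cmd.exe itself."""
    # /c tells cmd.exe to run the command that follows and then exit.
    return ["cmd", "/c", filter_command]
```

Under that assumption the filter command would read `cmd /c type E:\dev\srcdata\*.txt`; whether the Sequential File stage accepts it has not been tested here.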

I hope I am making some sense here.

Thanks for your time
Sandeep
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Unfortunately, I don't have a Windows based server to try this out on. It "should" work fine once you figure out whatever the issue is, as I've done things like this from the UNIX side more than once in the past.

Sorry. :?
-craig

hondaccord94
Participant
Posts: 46
Joined: Tue Aug 10, 2004 11:07 am
Location: Mclean VA

Post by hondaccord94 »

Craig,

I am going with the Folder stage, which seems to work, because my boss says I have spent too much time on this. Thanks for your help.

Sandeep