Load from sequential files
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 46
- Joined: Tue Aug 10, 2004 11:07 am
- Location: Mclean VA
Hi,
I have a situation here and I don't know how to tackle it on Windows.
I have a DS server job that loads from a .csv file into an Oracle table.
On some days the data comes in one file, and on other days it comes in two or more files. So my question is: is there a way to automate this process, that is, to find all the files in that particular folder and load them one after the other?
Please advise.
Thanks
Sandeep
There are different ways to approach this and it depends on what you are comfortable with.
DataStage has a Folder stage that can be used to bring a list of filenames in from a directory. Those filenames can then be passed to another job or routine that, one by one, processes them. One problem, though, is the stage will also want to pass the entire contents of the file through as well. It was really built to handle XML data, hence the behaviour, which is meant to feed the XML Reader stage from what I recall.
Or you can go the Job Control / scripted route. You could consider taking any and all files found in the folder that match your criteria and concatenating them into one file - and then processing that file. Or manually looping through a directory listing and launching the processing job over and over, passing in the next filename as a job parameter. That would depend on how comfortable you were writing either job control code in DataStage or batch files in Windows.
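Both scripted options can be sketched in a few lines. This is only a rough illustration in Python, not job control code and not a batch file. The function names, the project and job names, and the SourceFile job parameter are all made up for the example, and the dsjob options shown (-run, -jobstatus, -param) should be checked against the dsjob documentation for your release. Nothing here launches a job; the last function only builds the command lines.

```python
import glob
import os


def find_source_files(folder, pattern="*.csv"):
    """Return the files in folder that match pattern, in sorted order."""
    return sorted(glob.glob(os.path.join(folder, pattern)))


def concatenate(files, target):
    """Option 1: copy every source file, in order, into one target file."""
    with open(target, "wb") as out:
        for path in files:
            with open(path, "rb") as src:
                data = src.read()
            out.write(data)
            # If the last row has no line terminator, add one so the
            # first row of the next file does not get glued onto it.
            if data and not data.endswith(b"\n"):
                out.write(b"\n")
    return target


def build_dsjob_commands(files, project, job, param="SourceFile"):
    """Option 2: one dsjob command per file (hypothetical parameter name).

    Returns argument lists only; nothing is executed here.
    """
    return [
        ["dsjob", "-run", "-jobstatus", "-param", f"{param}={path}", project, job]
        for path in files
    ]
```

Write the concatenated file somewhere the wildcard will not pick it up on the next run (a different folder or extension). Each list returned by build_dsjob_commands could be handed to subprocess.run to launch the job for that file. Note that concatenate does nothing about header rows, so it assumes the files have none.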
-craig
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Welcome aboard! :D
If the multiple files are all in the one directory, are all governed by the same metadata ("table definition") and are moved out of there after processing, you can still use a Sequential File stage. Simply specify a filter like type *.* - the output from this command becomes the "stdin" of the Sequential File stage. On UNIX you'd use cat *
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 46
- Joined: Tue Aug 10, 2004 11:07 am
- Location: Mclean VA
-
- Participant
- Posts: 46
- Joined: Tue Aug 10, 2004 11:07 am
- Location: Mclean VA
No problem, happens all the time.
Ray's suggestion of leveraging the Filter command of the Sequential File stage would be best, I would think. The only issue after that would be how you keep it from reprocessing the same files later. And the answer to that will depend on how 'dynamic' the list of files in that folder is. :D
You'd have that issue with any of the suggested methods, however.
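One common way to avoid reprocessing, in line with Ray's note about moving files out of the directory after processing, is to shift each loaded file into an archive folder once the load has finished. A minimal sketch in Python follows. The function name and the archive folder are assumptions for illustration, and it would need to be called only after the load has actually succeeded.

```python
import os
import shutil


def archive_processed(files, archive_dir):
    """Move each loaded file into archive_dir and return the new paths."""
    os.makedirs(archive_dir, exist_ok=True)
    moved = []
    for path in files:
        destination = os.path.join(archive_dir, os.path.basename(path))
        # Refuse to overwrite an earlier file that had the same name.
        if os.path.exists(destination):
            raise FileExistsError(destination)
        shutil.move(path, destination)
        moved.append(destination)
    return moved
```

Keeping the archive folder outside the source folder means the wildcard in the filter command never sees the archived files again.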
-craig
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 46
- Joined: Tue Aug 10, 2004 11:07 am
- Location: Mclean VA
Ray/Craig,
I am now trying to use the filter command option that Ray mentioned. I have a question that might sound silly, but I hope you can help me with it.
Say, for example, I am trying to pick up the files with the extension 'txt' in the folder E:\dev\srcdata. I guess I don't have to give anything in the filename option. In the filter command window I gave this command:
E:\dev\srcdata\type *.txt
This doesn't seem to work; it throws an error. Please advise.
Thanks
Sandeep
Try this instead:
type E:\dev\srcdata\*.txt
-craig
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 46
- Joined: Tue Aug 10, 2004 11:07 am
- Location: Mclean VA
-
- Participant
- Posts: 46
- Joined: Tue Aug 10, 2004 11:07 am
- Location: Mclean VA
That should mean that there are no files in your directory that match your wildcard pattern.
What happens if you take your exact same filter command and execute it from the command line in a 'Cmd window'? I'm guessing you'll get the exact same error.
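Another rough way to check what a wildcard matches is to list the files and their sizes from a script. The sketch below is in Python and the function name is made up for the example. Python's glob rules are not identical to the wildcard expansion done by the Windows type command, and the script only shows what the account running it can see, which may differ from what the DataStage job sees.

```python
import glob
import os


def describe_matches(pattern):
    """Return (path, size_in_bytes) for every file the wildcard matches."""
    return [
        (path, os.path.getsize(path))
        for path in sorted(glob.glob(pattern))
        if os.path.isfile(path)
    ]
```

For the folder in this thread the call would be describe_matches(r"E:\dev\srcdata\*.txt"); an empty list means nothing matched the pattern.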
-craig
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 46
- Joined: Tue Aug 10, 2004 11:07 am
- Location: Mclean VA
Craig,
I ran the same command in the cmd window and it works fine. I am not sure why it is not working in DS.
I gave the complete file name in the file name option window and the wildcard filter type <dir>*.txt. I also tried giving just the filter option.
No use.
I guess I am making some sense here.
Thanks for your time
Sandeep
-
- Participant
- Posts: 46
- Joined: Tue Aug 10, 2004 11:07 am
- Location: Mclean VA