I have a table, blobs, consisting of blob_id (char 16) and blob_cont (char, 4 MB).
I need to create one sequential file for each row in blobs, where filename = blob_id and file content = blob_cont.
Is there a way of creating multiple, differently named sequential files simultaneously in DataStage?
multiple sequential file creation?
-
- Premium Member
- Posts: 33
- Joined: Tue Nov 06, 2007 1:09 pm
Hmmm... not really, unless you know the maximum number of files ahead of time and build that many output stages into your job. The typical answer would be to create a single file and then script something after-job to split it into multiple files with the names you require.
Another answer might be to write the output to a type 19 (?) hashed file, which is basically a directory and every 'record' becomes a file inside that directory. Pretty sure it's a type 19 but again you may need to rename the files post-job if you have a particular naming scheme in mind as I don't believe you can control the filenames.
Yet another answer may be (oddly enough) an XML Output stage with a 'trigger' column, just letting your data pass thru the stage with no 'xmling' going on. It creates new files whenever the value in the trigger column changes and the trigger column doesn't need to be output.
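For the "single file, then split it after-job" suggestion, here is a minimal sketch of what the splitter could look like. It is not from the thread: the function name `split_combined_file`, the one-record-per-line layout and the `|` delimiter are all assumptions made for illustration, and content containing embedded newlines would need a different record format.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not a DataStage API): reads a combined file holding
// one "blob_id<delimiter>blob_cont" record per line and writes each record's
// content to its own file, named after blob_id, inside out_dir.
// Returns the number of files written, or -1 if the input cannot be opened.
long split_combined_file(const std::string& combined_path,
                         const std::string& out_dir,
                         char delimiter = '|') {
    std::ifstream in(combined_path, std::ios::binary);
    if (!in) {
        return -1;
    }

    long written = 0;
    std::string line;
    while (std::getline(in, line)) {
        // Only the first delimiter separates id from content, so the
        // content itself may contain further delimiter characters.
        const std::string::size_type pos = line.find(delimiter);
        if (pos == std::string::npos || pos == 0) {
            continue;  // malformed record: no delimiter or empty id
        }
        const std::string id = line.substr(0, pos);
        if (id.find('/') != std::string::npos) {
            continue;  // refuse ids that would escape out_dir
        }

        std::ofstream out(out_dir + "/" + id,
                          std::ios::binary | std::ios::trunc);
        if (!out) {
            continue;  // could not create the target file
        }
        out << line.substr(pos + 1);
        out.close();
        if (out) {
            ++written;  // count only files that were written and closed cleanly
        }
    }
    return written;
}
```

Compiled into a small utility, something like this could be called from an after-job subroutine with the combined file and the target directory as arguments.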
-craig
"You can never have too many knives" -- Logan Nine Fingers
Well... not sure I would have ponied up some of those thoughts if I'd known we were talking about two million files daily. Except perhaps academically.
I think your "Option 3" is a perfectly valid solution, kudos for coming up with that.
-craig
-
- Charter Member
- Posts: 299
- Joined: Wed Nov 13, 2002 5:38 pm
- Location: USA
Option 4 also works, though it is slightly slower than the routine version. In my testing I was not successful in passing the path as a column, i.e. the path must be set by a job parameter. Column 1 sent to the Folder stage becomes the filename; column 2 onwards becomes the file content.
Despite the theoretical likelihood that Parallel jobs will not be particularly helpful for this requirement, we are still interested in comparing the performance of parallel against server.
However, there is no Folder stage available for parallel jobs. Is there an equivalent or alternative? Can file sets be utilised like this for output?
A parallel routine cannot be written in BASIC. Has anybody ever come across a BASIC to C++ converter? :D
Otherwise, I guess I need to write a C++ routine that sits on the server file system to be called from the parallel job? Is that the right theory?
Any other comments or remarks?
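As a starting point for the C++ routine idea, here is a minimal sketch of a function that writes one file per call. Nothing in it comes from the thread: the name `write_blob_file`, its argument list and its return codes are hypothetical, and it assumes the routine is exposed with C linkage, takes plain NUL-terminated strings, and that the target directory already exists on the engine's file system.

```cpp
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical routine (name and signature are assumptions): writes
// `content` to the file <dir>/<name>, replacing any existing file.
// Returns 0 on success and a non-zero code on failure.
// Note: content is treated as a NUL-terminated string, so binary data
// with embedded NUL bytes would be truncated at the first NUL.
extern "C" int write_blob_file(const char* dir,
                               const char* name,
                               const char* content) {
    if (dir == nullptr || name == nullptr || content == nullptr ||
        name[0] == '\0') {
        return 1;  // missing argument or empty file name
    }
    if (std::strchr(name, '/') != nullptr) {
        return 2;  // refuse names that would escape the target directory
    }

    const std::string path = std::string(dir) + "/" + name;
    std::FILE* fp = std::fopen(path.c_str(), "wb");
    if (fp == nullptr) {
        return 3;  // directory missing or not writable
    }

    const std::size_t len = std::strlen(content);
    const std::size_t n = std::fwrite(content, 1, len, fp);
    const int close_rc = std::fclose(fp);

    // Success only if every byte was written and the close flushed cleanly.
    return (n == len && close_rc == 0) ? 0 : 4;
}
```

In a parallel job this would presumably be called once per row from a Transformer derivation, passing the directory from a job parameter and blob_id and blob_cont as the other two arguments. Whether it beats the server job would still need measuring.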