I need to read multiple files with similar structure in a single Data Set stage. Since I can't specify a pattern in the Data Set stage, how would I achieve this?
For example, I have four files: Item_A.DS, Item_B.DS, Item_C.DS and Item_D.ds (all with similar column layouts). I have to read in all four. If they were text files, I would have used the Sequential File stage and read them with the pattern Item_*.Txt.
How would I do the same with DataSets?
Thanks for your responses in advance.
Kishore Nagururu
Pattern read in the DataSet Stage.
-
- Participant
- Posts: 54
- Joined: Mon Dec 24, 2007 9:27 am
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
This cannot be done with a Data Set stage. You might have some luck with a File Set. "Similar" structure is not good enough - "identical" structure is required.
If you are trying to read four Data Sets, however, this is an entirely different ball game. There is no multiple reader. Use a Funnel stage. However, the Data Sets must have identical parallelism.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 54
- Joined: Mon Dec 24, 2007 9:27 am
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Are they Data Sets or files?
You cannot do the "File Set" thing if they are Data Set descriptor files, at least not sensibly, because you lack proper metadata for Data Set descriptor files.
For files there is an option in the Sequential File stage to read multiple files as a File Set.
-
- Participant
- Posts: 54
- Joined: Mon Dec 24, 2007 9:27 am
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Four Data Set stages, one Funnel stage. The Data Set stage can only read one Data Set (which is, itself, a parallel structure, so has at least as many data files as there are processing nodes).
-
- Participant
- Posts: 54
- Joined: Mon Dec 24, 2007 9:27 am
Thanks, Ray.
The situation we are in is... the number of dataset files is not fixed. We are in the pilot, and as we mass... we may get more dataset files as input. We cannot afford to open up the code to add a new Data Set stage every time.
That is why I was exploring the pattern read for datasets. Since I can't do pattern reads for datasets, what are my other options?
Thanks again.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
As they say in the classics, "tough bikkies". You don't have any alternative. You could always create ten jobs, that handle one through ten Data Sets, and use a job sequence to decide which one to run.
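For what it's worth, the "one job per Data Set count" idea above could be driven by a small wrapper along these lines. This is only a rough sketch under stated assumptions: the job names (ReadItems_1 through ReadItems_10), the job parameter names (DataSet1, DataSet2, ...), and the project name in the usage note are all hypothetical, and the jobs themselves (N Data Set stages feeding one Funnel) still have to be built by hand. The script only builds a dsjob command line; it does not run anything. The same decision could equally be made inside a job sequence.

```python
import fnmatch
import os

MAX_DATASETS = 10  # the reply suggests jobs handling one through ten Data Sets


def match_descriptors(names, pattern="Item_*.ds"):
    """Return the names matching the pattern, ignoring case, in sorted order."""
    wanted = pattern.lower()
    return [n for n in sorted(names) if fnmatch.fnmatchcase(n.lower(), wanted)]


def find_descriptors(directory, pattern="Item_*.ds"):
    """List Data Set descriptor files in a directory that match the pattern."""
    return match_descriptors(os.listdir(directory), pattern)


def choose_job(count, prefix="ReadItems_"):
    """Map the number of Data Sets found to a job name (hypothetical naming)."""
    if not 1 <= count <= MAX_DATASETS:
        raise ValueError(f"no job has been built for {count} Data Sets")
    return f"{prefix}{count}"


def build_dsjob_command(project, job, descriptors, directory):
    """Build a dsjob command line; DataSet1..N are hypothetical job parameters."""
    cmd = ["dsjob", "-run", "-jobstatus"]
    for i, name in enumerate(descriptors, start=1):
        cmd += ["-param", f"DataSet{i}={os.path.join(directory, name)}"]
    return cmd + [project, job]
```

With the four files from the original question, match_descriptors picks up all of them regardless of the .DS/.ds case difference, choose_job(4) selects ReadItems_4, and build_dsjob_command("MyProject", "ReadItems_4", ...) yields a command that passes each descriptor path as a job parameter. Adding an eleventh Data Set would still mean building another job, so this does not remove the limitation, it only automates the choice.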