Pattern read in the DataSet Stage.

kishorenvkb
Participant
Posts: 54
Joined: Mon Dec 24, 2007 9:27 am

Post by kishorenvkb »

I need to read multiple files with a similar structure in one DataSet stage. Since I can't specify a pattern in the DataSet stage, how would I achieve this?

For example, I have four files, Item_A.DS, Item_B.DS, Item_C.DS and Item_D.ds (all with similar column layouts). I have to read in all four files. If they were text files, I would have used the Sequential File stage and read them with the pattern Item_*.Txt.
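To make the wildcard idea concrete, here is a minimal Python sketch of what a pattern such as Item_*.Txt selects. It only illustrates shell-style matching on file names; it is not how the Sequential File stage is implemented, and the helper name `match_pattern` and the sample file names are assumptions made up for the example.

```python
from fnmatch import fnmatchcase


def match_pattern(names, pattern):
    """Return the names that match a shell-style wildcard pattern, sorted.

    fnmatchcase compares case-sensitively on every platform.
    """
    return sorted(n for n in names if fnmatchcase(n, pattern))


# Hypothetical directory listing, used only for illustration.
names = ["Item_A.Txt", "Item_B.Txt", "Item_C.Txt", "Item_D.Txt", "Other.Txt"]
matched = match_pattern(names, "Item_*.Txt")
print(matched)
```

Because the comparison here is case-sensitive, a pattern like Item_*.DS would skip a file named Item_D.ds, so consistent extensions matter for any pattern-based approach.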

How would I do the same with DataSets?

Thanks for your responses in advance.
Kishore Nagururu
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

This cannot be done with a Data Set stage. You might have some luck with a File Set. "Similar" structure is not good enough - "identical" structure is required.

If you are trying to read four Data Sets, however, this is an entirely different ball game. There is no multiple reader. Use a Funnel stage. However, the Data Sets must have identical parallelism.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.

Post by kishorenvkb »

:-) Yes, they are identical in layout. How do you use the File Set?

Post by ray.wurlod »

Are they Data Sets or files?

You cannot do the "File Set" thing if they are Data Set descriptor files, at least not sensibly, because you lack proper metadata for Data Set descriptor files.

For files there is an option in the Sequential File stage to read multiple files as a File Set.

Post by kishorenvkb »

They are datasets. We are planning to move from sequential files to Datasets for obvious performance reasons. Any help is greatly appreciated.

Post by ray.wurlod »

Four Data Set stages, one Funnel stage. The Data Set stage can only read one Data Set (which is, itself, a parallel structure, so has at least as many data files as there are processing nodes).
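As a rough illustration of the "several inputs, one Funnel" idea, the Python sketch below concatenates rows from several inputs after checking that their column layouts are identical. The `(columns, rows)` pairs are a hypothetical in-memory stand-in for Data Sets, and `funnel` is an invented helper, not a DataStage API; it ignores partitioning entirely.

```python
from itertools import chain


def funnel(*inputs):
    """Concatenate rows from several inputs that share one column layout.

    Each input is a (columns, rows) pair. This is a hypothetical stand-in
    for a Data Set, used only to show the identical-structure requirement.
    """
    if not inputs:
        return [], []
    columns = inputs[0][0]
    for cols, _ in inputs[1:]:
        # "Similar" is not enough: the layouts must match exactly.
        if cols != columns:
            raise ValueError(f"column layout mismatch: {cols} != {columns}")
    return columns, list(chain.from_iterable(rows for _, rows in inputs))


# Made-up sample data for illustration.
item_a = (["id", "name"], [(1, "bolt")])
item_b = (["id", "name"], [(2, "nut"), (3, "washer")])
columns, rows = funnel(item_a, item_b)
print(columns, rows)
```

The number of inputs is fixed at the point where `funnel` is called, which mirrors the design-time limitation: each extra Data Set needs its own input link.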

Post by kishorenvkb »

Thanks Ray.

The situation we are in is that the number of dataset files is not fixed. We are in the pilot and as we mass... we may get more dataset files as input. We cannot afford to open up the code to add a new dataset stage every time.

That is why I was exploring the pattern read for datasets. Since I can't do pattern reads for datasets, what are my other options?

Thanks again.

Post by ray.wurlod »

As they say in the classics, "tough bikkies". You don't have any alternative. You could always create ten jobs that handle one through ten Data Sets, and use a job sequence to decide which one to run.
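The selection logic a job sequence would need can be sketched in a few lines of Python. This is only an outline of the decision, under the assumption that the descriptor files follow an Item_*.ds naming pattern and that the ten jobs are named with a numeric suffix; the prefix `ReadItems_` and the function `pick_job` are hypothetical, and actually launching the chosen job is left out.

```python
from fnmatch import fnmatchcase

MAX_INPUTS = 10  # the suggestion above covers one through ten Data Sets


def pick_job(file_names, pattern="Item_*.ds", prefix="ReadItems_"):
    """Choose the job variant whose input count matches the descriptors found.

    The pattern, the prefix and the resulting job names are hypothetical.
    """
    count = sum(1 for n in file_names if fnmatchcase(n, pattern))
    if not 1 <= count <= MAX_INPUTS:
        raise ValueError(f"no job variant handles {count} Data Sets")
    return f"{prefix}{count}"


# Made-up directory listing for illustration.
job_name = pick_job(["Item_A.ds", "Item_B.ds", "Item_C.ds", "notes.txt"])
print(job_name)
```

Failing loudly when the count falls outside one to ten is a deliberate choice here: it signals that another job variant has to be built rather than silently skipping input.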