
Reading Multiple DataSets using File Pattern

Posted: Thu Jan 27, 2011 10:55 pm
by Nagin
Hi,
Is there a way to read multiple DataSets that share a name pattern in a single stage?

For example, I have multiple datasets with the same metadata and the same partitioning:

TestData_1.ds
TestData_2.ds
TestData_3.ds

I would want to pick up all of these datasets with a pattern such as TestData_*.ds, the way we can for flat files.

I don't see a pattern option in the Data Set stage. Is there any way to achieve this?

Thanks.

Posted: Thu Jan 27, 2011 11:39 pm
by ray.wurlod
No. Use separate Data Set stages and run them into a Funnel stage.

Posted: Fri Jan 28, 2011 1:04 am
by Nagin
ray.wurlod wrote:No. Use separate Data Set stages and run them into a Funnel stage.
The problem I have is that I wouldn't know how many DataSets there will be; they are generated dynamically. Today I may have 10 DataSets, tomorrow it could be 20.

Posted: Fri Jan 28, 2011 1:26 am
by ray.wurlod
The answer is still no.

Posted: Fri Jan 28, 2011 2:26 am
by gssr
Nagin wrote:
ray.wurlod wrote:No. Use separate Data Set stages and run them into a Funnel stage.
The problem I have is that I wouldn't know how many DataSets there will be; they are generated dynamically. Today I may have 10 DataSets, tomorrow it could be 20.
Replace the Data Set stage with a Sequential File stage as the target in the job that creates the (dynamic) datasets; a Sequential File stage can then read them back using a file pattern.

Posted: Fri Jan 28, 2011 2:51 am
by meet_deb85
Well, I faced the same challenge, but I could do it in the following way.

You will need one common sequence job and one parallel job for this.

Parallel job
Dataset1 --------------------> Dataset2
Parameterize the dataset name in Dataset1 and use any name of your choice for Dataset2.

Sequence Job
I am mentioning only the first few stages; the rest I guess you will be able to figure out.
Stage 1 - Execute Command activity; put this command in the stage:
orchadmin truncate #Name of the Dataset used in Dataset2 of the parallel job#

Stage 2 - Execute Command activity; put this command in the stage:
ls #The pattern of your Datasets#

Stage 3 - Start Loop activity; run the loop as many times as the number of datasets found in Stage 2, passing each dataset name into the parameterized dataset name of Dataset1.

Don't forget to set append mode on Dataset2 in the parallel job.
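
For what it's worth, here is a rough shell sketch of the same flow, just to make it concrete. The paths, the Combined.ds target name, the SourceDataSet parameter and the project/job names below are examples made up for illustration, not values from this thread. In the real solution the looping is done by the Start Loop / End Loop activities in the sequence, not by a shell script.

# Stage 1: empty the target dataset (the parallel job appends to it)
orchadmin truncate /data/work/Combined.ds

# Stage 2 + loop: run the parallel job once per source dataset matching the pattern,
# passing the dataset name in through the job parameter
for ds in /data/work/TestData_*.ds
do
    dsjob -run -param SourceDataSet=$ds -wait MyProject LoadOneDataset
done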