Hi,
Is there a way to read multiple DataSets whose names follow a similar pattern in a single stage?
For example, I have multiple datasets with the same metadata and the same partitioning.
TestData_1.ds
TestData_2.ds
TestData_3.ds
I want to pick up all of these datasets with TestData_*.ds, as we can do for flat files.
I don't see a pattern option in the DataSet stage. Is there any way to achieve this?
Thanks.
Reading Multiple DataSets using File Pattern
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
ray.wurlod wrote: No. Use separate Data Set stages and run them into a Funnel stage.
Nagin wrote: The problem I have is I wouldn't know how many DataSets will be there. They will be generated dynamically. Today I may have 10 DataSets; tomorrow it could be 20.
Replace the Data Set with a Sequential File as the target stage in the job that creates the (dynamic) datasets.
RAJ
-
- Premium Member
- Posts: 132
- Joined: Tue Sep 04, 2007 11:38 am
- Location: NOIDA
Well, I faced the same challenge, and I could do it in the following way.
You will need one common sequence job and one parallel job for this.
Parallel job
Dataset1 -------------------->>>Dataset2
Parameterize the dataset name in Dataset1 and put any name of your choice in Dataset2.
Sequence Job
I am mentioning only the first three stages; the rest I guess you will be able to figure out.
Stage 1 - Execute Command; put this command in the stage:
orchadmin truncate #Name of the Dataset used in Dataset2 of the parallel job#
Stage 2 - Execute Command; put this command in the stage:
ls #The pattern of your Datasets#
Stage 3 - Start Loop
Run the loop once for each dataset found in Stage 2, passing that dataset's name to the parallel job as the parameter.
Don't forget to keep append mode on Dataset2 of the parallel job.
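To make Stage 2 a bit more concrete, here is a rough sketch of turning the matching dataset names into one comma-delimited line that a loop could iterate over. The function name list_datasets and its two arguments are made up for illustration, and it assumes the descriptor file names contain no spaces or commas. It uses a plain shell glob instead of ls so that an empty match simply gives an empty list.

```shell
#!/bin/sh
# Hypothetical helper (not part of DataStage): collect the dataset
# descriptor files that match a pattern into one comma-delimited line.
# Assumption: file names contain no spaces or commas.
list_datasets() {
    dir=$1       # directory holding the .ds descriptor files
    pattern=$2   # e.g. 'TestData_*.ds' (quote it when calling)
    list=""
    # $pattern is left unquoted on purpose so the shell expands the glob;
    # the expansion is returned in sorted order.
    for f in "$dir"/$pattern; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$f" ] || continue
        # Append with a comma only when the list is already non-empty.
        list="${list:+$list,}$f"
    done
    printf '%s\n' "$list"
}
```

Called as list_datasets /some/dir 'TestData_*.ds', it prints the matching paths joined by commas. As far as I know the Start Loop activity can loop over a delimited list of values, so a line like this could be used there, but check that against your own version before relying on it.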