Split the Input Source into Multiple Datasets

Posted: Mon Sep 28, 2015 1:54 am
by samratisking
Hi All,

My requirement is to split the data in a table (80 - 100 fields) into multiple datasets at the target.

The target datasets should be created dynamically in the following way:

Dataset1 - 1 to 30 columns
Dataset2 - 31 to 60 columns
Dataset3 - 61 to 90 columns
Dataset4 - 91 to 100 columns

We are not sure of the structure (metadata) of the input, and we are using RCP (runtime column propagation) to read all the fields in DataStage.

Any ideas are greatly appreciated.

Best Regards,
Sam

Posted: Mon Sep 28, 2015 4:30 am
by Thomas.B
For me, the first step is to create a job that writes all the records from your database to a Dataset.
Then you can create a multiple-instance job to modify the schema, like this:

Input Dataset ---> Modify ---> Output Dataset
In the Modify stage, set the specification to "KEEP #PARAM1#", where PARAM1 is a job parameter containing the list of columns you want in that Dataset.
After that, create a sequence job that calls your first job once, then calls your second job four times with different column lists.
You could also declare the column list directly on the ODBC stage, but then you would have to call that job four times.

I don't think you can do it without naming the columns somewhere in DataStage, short of querying the catalog tables of your database to get them (syscat.columns for DB2, dbc.columns for Teradata ...).
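To make the parameter-building step concrete, here is a minimal sketch in Python (the splitting logic itself is independent of DataStage) of how the four PARAM1 column lists could be derived from an ordered column list, e.g. one fetched from the catalog tables above. The column names are purely illustrative:

```python
# Sketch: split an ordered column list into chunks of 30 and build
# one "KEEP col, col, ..." Modify specification per chunk.
# Each spec would be passed as PARAM1 to one instance of the Modify job.

def keep_specs(columns, chunk_size=30):
    """Return one 'KEEP col1, col2, ...' spec per chunk of columns."""
    return [
        "KEEP " + ", ".join(columns[i:i + chunk_size])
        for i in range(0, len(columns), chunk_size)
    ]

# Illustrative 100-column table: COL001 .. COL100 (made-up names).
columns = ["COL%03d" % n for n in range(1, 101)]
specs = keep_specs(columns)
# specs[0] keeps COL001..COL030, ..., specs[3] keeps COL091..COL100.
```

The sequence job would then loop over the four specs, passing each one to the multiple-instance Modify job.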

Posted: Mon Sep 28, 2015 8:43 pm
by stuartjvnorton
How do you use them together coherently afterwards?
Are you taking a copy of the PK field[s] along into each of the files?