Multiple files based on number

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

dodda
Premium Member
Posts: 244
Joined: Tue May 29, 2007 11:31 am

Multiple files based on number

Post by dodda »

Hello

I have a requirement where, let's say, I have a flat file with 200 records, and after doing some transformations I need to produce one file for every 50 records. So I would have to produce 4 files with 50 records each. If the input file has 60 records, I need to produce 2 files: one with 50 records and a second with 10. Is there a way this can be done through DataStage?

Thanks
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Yes. Create sufficient output links to handle the worst-case scenario, and filter on row number (ideally the row number from the original source). Run in sequential mode.
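For illustration (not spelled out in the post itself): with four output links off a Transformer running in sequential mode, the link constraints could use the @INROWNUM system variable to route each band of 50 rows. The link names are hypothetical:

```
Link_1: @INROWNUM <= 50
Link_2: @INROWNUM > 50 And @INROWNUM <= 100
Link_3: @INROWNUM > 100 And @INROWNUM <= 150
Link_4: @INROWNUM > 150
```

Links past the end of the data simply receive no rows, e.g. a 60-record input fills only the first two.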
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Pagadrai
Participant
Posts: 111
Joined: Fri Dec 31, 2004 1:16 am
Location: Chennai

Re: Multiple files based on number

Post by Pagadrai »

Hi,
If predicting the number of branches you might need is tough, you can try this: create a Wrapped stage (a type of Custom stage) that calls a Unix script to partition the data and write it to multiple files.

This is just an idea. I will also try this for learning purposes and post the result.
dodda
Premium Member
Posts: 244
Joined: Tue May 29, 2007 11:31 am

Post by dodda »

Hi

Thanks for your response. Yes, the input file might have any number of records, but I need to produce a file for every 50 records. I have never created custom stages before. Is there a way other than creating custom stages?

Thanks
Pagadrai
Participant
Posts: 111
Joined: Fri Dec 31, 2004 1:16 am
Location: Chennai

Post by Pagadrai »

dodda wrote: Hi

Thanks for your response. Yes, the input file might have any number of records, but I need to produce a file for every 50 records. I have never created custom stages before. Is there a way other than creating custom stages?

Thanks
Hi,
Once you have the Unix script for the purpose, implementing it is not difficult. Or, instead of a stage, you can land the data in an intermediate sequential file and call the script once the job is complete.
dodda
Premium Member
Posts: 244
Joined: Tue May 29, 2007 11:31 am

Post by dodda »

OK

Thanks for your help
verify
Premium Member
Posts: 99
Joined: Sun Mar 30, 2008 8:35 am

Post by verify »

After doing the transformations for the entire set of records, load the records into a sequential file, then call your script through an after-job routine that will split the file into chunks of 50 records.
RK Raju
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

If you go that route, the UNIX "split" command can be used to chunk the full file into smaller files, and then you may need to loop through the results and rename the files, unless you can live with the naming convention the command uses.
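A minimal sketch of that approach; the landed file name `output.txt`, the `chunk_` prefix, and the `part_N.txt` naming are all assumptions for illustration:

```shell
#!/bin/sh
# Sample input standing in for the job's landed output: 120 one-line records.
seq 1 120 > output.txt

# Split into pieces of at most 50 lines each; by default split names
# the pieces with alphabetic suffixes: chunk_aa, chunk_ab, chunk_ac, ...
split -l 50 output.txt chunk_

# Rename the pieces to a friendlier numbered convention: part_1.txt, part_2.txt, ...
# The glob expands in sorted order, so the numbering preserves record order.
n=1
for f in chunk_*; do
    mv "$f" "part_${n}.txt"
    n=$((n + 1))
done
```

For 120 input records this yields three files: two of 50 lines and one of 20.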
-craig

"You can never have too many knives" -- Logan Nine Fingers
wahi80
Participant
Posts: 214
Joined: Thu Feb 07, 2008 4:37 pm

Post by wahi80 »

That's right. Just use the csplit command in Unix and you will be able to achieve your objective.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Actually, just the "split" command since the requirement is by row count. There's no need for the "context split" capability (based on file contents) that the csplit command brings to the table.
-craig

"You can never have too many knives" -- Logan Nine Fingers