split data based on size in datastage

prasson_ibm
Premium Member
Posts: 536
Joined: Thu Oct 11, 2007 1:48 am
Location: Bangalore

split data based on size in datastage

Post by prasson_ibm »

Hi,

Is there any way in DataStage that I can split an input file (e.g. 1 GB) based on size, let's say 1 MB, and create multiple files?
vamsi.4a6
Participant
Posts: 334
Joined: Sun Jan 22, 2012 7:06 am

Post by vamsi.4a6 »

In 9.1 we can split a file based on a key, but otherwise I think the UNIX split command is the only option. In one post I read that we can use the Folder stage in Server jobs to split a file, but I'm not sure exactly how.
Thanks and Regards
Vamsi krishna.v
http://datastage-vamsi.blogspot.in/
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Perhaps, but why not simply do it at the command line?
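For instance, a minimal sketch using the standard UNIX split utility (the file names are illustrative):

    # Cut into 1 MB pieces at exact byte offsets (may split a record in two)
    split -b 1m /data/input.dat /data/part_

    # GNU split can instead pack whole lines and stay under 1 MB per piece
    split -C 1m /data/input.dat /data/part_

With -b a record can straddle two output files, so -C is usually the safer choice for delimited data.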
-craig

"You can never have too many knives" -- Logan Nine Fingers
prasson_ibm
Premium Member
Posts: 536
Joined: Thu Oct 11, 2007 1:48 am
Location: Bangalore

Post by prasson_ibm »

Hi,

This is exactly what I suggested, but my client is rigid and wants to check the capability of DataStage :cry:
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I'm not aware of any "split by size" option. Of course, you could BuildOp whatever you like and still call it "in DataStage". Otherwise you would have to do something by row count after you figure out approximately how many records will generally equal 1 MB. Using the Folder stage as a target is one option, but it is a bit of a pain in the butt. Other suggestions have been made in the past when this question was asked, and a search should turn them up.
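A sketch of that row-count approach, assuming reasonably uniform record lengths (paths are illustrative):

    # Estimate records per ~1 MB from the average record length, then split by line count
    bytes=$(wc -c < /data/input.dat)
    lines=$(wc -l < /data/input.dat)
    perfile=$(( 1048576 / (bytes / lines) ))
    split -l $perfile /data/input.dat /data/part_

The same calculated count could just as well feed a job design that rolls over to a new output file every N rows.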
-craig

"You can never have too many knives" -- Logan Nine Fingers
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Since the input file is, by definition, a source, it is not changed in the job. If you wish to use a UNIX utility such as "split" to split the file, you can do so as part of the before-job calls; likewise, you could call a BASIC before-job routine to perform more complex tasks on the file.
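As a sketch of that before-job approach (ExecSH is the built-in shell-execution subroutine; the command and paths are illustrative):

    Before-job subroutine: ExecSH
    Input value:           split -C 1m /data/landing/input.dat /data/landing/part_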

Are you sure you didn't mean to split the output file?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Sam Ting. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

The Big Data File stage in 9.1 has an optional Max File Size property for writing target files. When the maximum size (in MB) is reached, it generates another target file. I found this in the documentation but have not tested it myself.
Choose a job you love, and you will never have to work a day in your life. - Confucius