Creating sequential text files on the fly

rafik2k · Post by **rafik2k** » Sat Jun 30, 2007 4:55 pm

I am not sure whether this is possible or not.
I have an input data set of customer information for different regions such as North, South, West, North_West, etc.

My requirement is to create exactly one output file for each region.
I do not know how many regions there will be until the input file has been read.

Based on that, I have to create output files dynamically, one for every region that exists in the input data set.

If the input data set contains data for two regions, then I need to create only two output files, one for each of those regions.

If any of you have any ideas or hints, kindly let me know.

Any help would be greatly appreciated.

Thanks in advance.

JoshGeorge · Post by **JoshGeorge** » Sat Jun 30, 2007 6:57 pm

Do this in 2 jobs. The first job just identifies the regions in the input. The second job creates the files according to the output of the first job. Both jobs can be called from a sequence.
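For anyone wondering what the first job actually has to produce, here is a rough sketch in Python. It is only an illustration of the logic, not DataStage code. It assumes a comma-delimited input with a header row and a column named `region`; the function name `distinct_regions`, the column name, and the sample data are all made up for the example.

```python
import csv
import io


def distinct_regions(lines, region_column="region"):
    """Return the distinct region values, in the order they first appear."""
    seen = {}  # dict keys keep insertion order, so this doubles as an ordered set
    for row in csv.DictReader(lines):
        seen.setdefault(row[region_column].strip(), None)
    return list(seen)


# Made-up sample data standing in for the real input file.
sample = io.StringIO(
    "customer,region\n"
    "Acme,North\n"
    "Bolt,South\n"
    "Cord,North\n"
)
regions = distinct_regions(sample)
print(regions)
```

The resulting list is what the second job would be driven by, for example one run per region.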

chulett · Post by **chulett** » Sun Jul 01, 2007 7:06 am

Not a whole lotta 'how' there.

Rafik, all things are possible. Do you need to do any preprocessing to determine region? I get the impression you've already got all the information you need for that. Actually, do you need to do any processing on this 'data set' or do you simply need to split it per region?

If you do need to process the data in order to create properly formatted flat file(s) then fine, I'm sure you can handle that part. You can use a couple of stages to create multiple output files: either the Folder stage or (oddly enough) the XML Output stage, if you're not picky about the output filenames or are willing to run a 'rename' post-processing step. Both stages would require you to deliver the final product as a single 'record'. The difference is that the Folder stage allows you to control the filename, while the XML Output stage lets you name the first file and then appends an incremented number to it as the value of the 'Trigger' column changes.

Or if you just need to split the file, I'd suggest something like an awk or perl script if you have it available. This would also allow you to create 1 output file from a DataStage job and then split it afterwards, as long as the region (or whatever value controls the splitting) is actually in the output file.
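To give an idea of how small such a splitting script can be, here is a minimal sketch. It is written in Python rather than awk or perl, purely as an illustration. It assumes a comma-delimited file with a header row, a column named `region`, and region values that are safe to use as filenames; the function name `split_by_region` and the sample file names are hypothetical.

```python
import csv
import os
import tempfile


def split_by_region(input_path, output_dir, region_column="region"):
    """Write one CSV file per distinct region value; return {region: path}."""
    writers = {}  # region -> csv.DictWriter
    handles = {}  # region -> open file handle, closed at the end
    paths = {}    # region -> output file path
    try:
        with open(input_path, newline="") as src:
            reader = csv.DictReader(src)
            for row in reader:
                region = row[region_column].strip()
                if region not in writers:
                    # First time this region is seen: open its file on the fly.
                    path = os.path.join(output_dir, f"{region}.csv")
                    handle = open(path, "w", newline="")
                    handles[region] = handle
                    paths[region] = path
                    writer = csv.DictWriter(handle, fieldnames=reader.fieldnames)
                    writer.writeheader()
                    writers[region] = writer
                writers[region].writerow(row)
    finally:
        for handle in handles.values():
            handle.close()
    return paths


# Small demonstration with made-up data in a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "customers.csv")
    with open(source, "w", newline="") as f:
        f.write("customer,region\nAcme,North\nBolt,South\nCord,North\n")
    out_dir = os.path.join(workdir, "out")
    os.makedirs(out_dir)
    result = split_by_region(source, out_dir)
    print(sorted(result))
```

This keeps one file handle open per region, which is fine for a handful of regions. The number of regions never has to be known up front, since each output file is opened the first time its region shows up.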

Or... if you have mad skillz in BASIC or C (etc) you could always write some hand code to read your source sequential file and create multiple output files. That would give you full and explicit control over exactly what goes on.