split one input file into multiple output files

Posted: Sun Feb 03, 2008 3:23 pm
by kishorenvkb
Hi All,

What is the best way to split one input into 3 output files based on a field in the input?

Input is being read from the Teradata table and the output should go into 3 different files based on the value in one column of the input data that is read from Teradata.

Thanks in advance

Kishore

Posted: Sun Feb 03, 2008 9:23 pm
by ray.wurlod
If there are only three (or few) distinct values in the column, Switch stage.

If you're reading in sequential mode from Teradata, consider using a three-node configuration file and a modulus partitioning algorithm. You may need to map the three values to integers using a Modify stage. Hash partitioning cannot be guaranteed to work, because two of the three values may hash to the same partition. Write to a Sequential File stage that writes to precisely three files; one should be written from each partition.
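
To illustrate the modulus idea outside DataStage, here is a rough Python sketch. It is not DataStage code; the column name "region", the values "A", "B", "C" and the integer mapping are all made up for the example, since the thread does not say what the real values are. It only shows why mapping the three values to 0, 1 and 2 and taking the key modulo 3 puts each value in its own partition.

```python
from collections import defaultdict

# Hypothetical mapping of the three column values to integers
# (this stands in for the Modify stage step; the values are assumed).
VALUE_TO_INT = {"A": 0, "B": 1, "C": 2}


def modulus_partition(key: int, partitions: int) -> int:
    """Return the partition number for an integer key."""
    return key % partitions


def split_rows(rows, column, partitions=3):
    """Group rows by the partition their key column maps to."""
    out = defaultdict(list)
    for row in rows:
        key = VALUE_TO_INT[row[column]]
        out[modulus_partition(key, partitions)].append(row)
    return dict(out)


# Sample rows standing in for what would be read from the table.
rows = [
    {"id": 1, "region": "A"},
    {"id": 2, "region": "C"},
    {"id": 3, "region": "B"},
    {"id": 4, "region": "A"},
    {"id": 5, "region": "C"},
]

result = split_rows(rows, "region")
for partition in sorted(result):
    print(partition, [r["id"] for r in result[partition]])
```

With consecutive integers 0, 1, 2 and three partitions, every value gets a partition to itself, so each output file holds exactly one value. A hash function gives no such guarantee for three arbitrary values.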

Otherwise investigate Filter or Transformer stage, each of which has a computation overhead.