Delimited file or fixed-width or CSV file

Posted: Fri Dec 17, 2010 8:48 am
by Shruthi
Hi,

We have DB2 as the source and want to write the data to a temporary file and, in the next step, read from that file. We will be processing more than 5 million records. Which is the better approach? To store it in CSV file or Flat file or delimited file?

Thanks in advance,

Re: Delimited file or fixed-width or CSV file

Posted: Fri Dec 17, 2010 8:56 am
by chulett
Shruthi wrote: Which is the better approach? To store it in CSV file or Flat file or delimited file?
Everything you've mentioned is a "flat file", and a "csv" file is one flavor of delimited flat file; it just means a specific delimiter - a comma. So... six of one.
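To illustrate the point that CSV is simply a delimited file whose delimiter happens to be a comma, here is a minimal Python sketch (not DataStage code). The sample rows and the helper names `write_delimited` and `read_delimited` are made up for illustration; only the standard-library `csv` module is used.

```python
import csv
import io

# Hypothetical sample rows; note the comma inside the second field.
rows = [["1001", "Smith, John", "42.50"], ["1002", "Lee", "7.00"]]

def write_delimited(rows, delimiter):
    """Serialize rows to delimited text using the given delimiter."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()

def read_delimited(text, delimiter):
    """Parse delimited text back into a list of rows."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

# Same data, same code path; only the delimiter differs.
csv_text = write_delimited(rows, ",")
pipe_text = write_delimited(rows, "|")
```

With a comma delimiter the writer has to quote `Smith, John` because the field contains the delimiter; with a pipe it does not. That is about the only practical difference between the two flavors.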

Your other option is a dataset and the pros/cons of that could depend on how much space you are willing to dedicate to this and what exactly this 'next step' is and needs.

Posted: Fri Dec 17, 2010 8:56 am
by vinothkumar
If your next step is also in DataStage, Dataset will be a good option.

Posted: Fri Dec 17, 2010 10:49 am
by Shruthi
This file is needed by other teams for processing in other software, so the dataset option is ruled out.
The next step is to load into the data warehouse. Before loading, we have many stages as per business needs.
I am quite new to DataStage. I read about the properties "Number of readers per node" and "Read from multiple nodes". These are available only for fixed-width files.
When DataStage PX is used, does it read in parallel from delimited files? Is there no difference between using delimited and fixed-width files as far as performance is concerned?

Posted: Fri Dec 17, 2010 10:57 am
by chulett
As you noted, fixed-width files can be read by multiple reader nodes while delimited ones cannot. Nature of the beast rather than a specific DataStage 'thing'.
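Why this is the "nature of the beast" can be sketched in a few lines of Python. This is an illustration of the general idea, not how DataStage implements it, and the function name `fixed_width_ranges` is hypothetical. With fixed-width records, every record starts at a multiple of the record length, so each reader's byte range can be computed by arithmetic alone without looking at the data.

```python
def fixed_width_ranges(file_size, record_len, readers):
    """Split a fixed-width file into per-reader (start, end) byte ranges.

    Assumes every record, including any line terminator, is exactly
    record_len bytes, so record k starts at byte k * record_len.
    """
    total_records = file_size // record_len
    base, extra = divmod(total_records, readers)
    ranges = []
    start_rec = 0
    for i in range(readers):
        # The first `extra` readers take one additional record each.
        count = base + (1 if i < extra else 0)
        ranges.append((start_rec * record_len, (start_rec + count) * record_len))
        start_rec += count
    return ranges

# Example: 10 records of 10 bytes each, shared among 3 readers.
example = fixed_width_ranges(100, 10, 3)
```

Each reader can seek straight to its start offset and begin parsing. A delimited file offers no such guarantee, because record boundaries are only known after scanning for the record delimiter.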

In your shoes, I would ask them (your other teams) what they would prefer. Again, it will depend on the tool used to do the actual loading but fixed-width files tend to be more 'performant' there as well.

Posted: Fri Dec 17, 2010 2:42 pm
by ray.wurlod
In version 8.1 delimited files CAN be read using multiple readers per node, though it remains true that this is more efficient with fixed-width files.
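One plausible way multiple readers can share a delimited file, and why it costs more than the fixed-width case, is sketched below in Python. This is an assumption about the general technique, not a description of DataStage internals, and `delimited_ranges` is a hypothetical name. Each reader starts from an approximate byte offset and then has to scan forward to the next record delimiter to find a real record boundary.

```python
def delimited_ranges(data: bytes, readers: int):
    """Split newline-delimited data into per-reader (start, end) byte ranges.

    Assumes records end with b"\n" and that no field contains an
    embedded newline (quoted multi-line fields would break this).
    """
    size = len(data)
    cuts = [0]
    for i in range(1, readers):
        guess = size * i // readers       # approximate split point
        nl = data.find(b"\n", guess)      # scan forward to a record end
        cut = size if nl == -1 else nl + 1
        cuts.append(max(cut, cuts[-1]))   # keep boundaries non-decreasing
    cuts.append(size)
    return list(zip(cuts[:-1], cuts[1:]))

# Hypothetical sample: three records of different lengths.
sample = b"a,b\ncc,dd\neee,f\n"
parts = delimited_ranges(sample, 2)
chunks = [sample[start:end] for start, end in parts]
```

The extra scan per reader, plus the uneven chunk sizes that result when record lengths vary, is one reason this approach tends to be less efficient than splitting fixed-width records.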

Posted: Fri Dec 17, 2010 2:52 pm
by chulett
Ah... that's good to know about 8.1.

Posted: Sat Dec 18, 2010 6:05 am
by Shruthi
Thanks Ray! That was of great help.