Hi,
We have DB2 as the source and want to write the data to a temporary file, then read from that file in the next step. We will be processing more than 5 million records. Which is the better approach: storing it in a CSV file, a flat file, or a delimited file?
Thanks in advance,
Delimited file or fixed-width or CSV file
Re: Delimited file or fixed-width or CSV file
Shruthi wrote: Which is the better approach? To store it in CSV file or Flat file or delimited file?
Everything you've mentioned is a "flat file", and a "csv" file is one flavor of a delimited flat file; it just means a specific delimiter, a comma. So... six of one.
Your other option is a dataset and the pros/cons of that could depend on how much space you are willing to dedicate to this and what exactly this 'next step' is and needs.
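To make the "flavors of flat file" point concrete, here is a minimal Python sketch (not DataStage code) that writes the same rows as a comma-delimited file, a pipe-delimited file, and a fixed-width file. The sample rows, the column widths, and the function names are all made up for illustration.

```python
import csv
import io

# Hypothetical sample rows and column widths, assumed for illustration only.
ROWS = [(1, "Alice", "Chennai"), (2, "Bob", "Sydney")]
WIDTHS = (6, 10, 10)  # characters reserved for each column


def to_delimited(rows, delimiter):
    """Render rows as a delimited flat file; delimiter="," is what people call CSV."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


def to_fixed_width(rows, widths):
    """Render rows as a fixed-width flat file: each field is cut or padded to its width."""
    lines = []
    for row in rows:
        # Truncate over-long values, then pad with spaces on the right.
        lines.append("".join(str(v)[:w].ljust(w) for v, w in zip(row, widths)))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_delimited(ROWS, ","), end="")   # comma-delimited (CSV)
    print(to_delimited(ROWS, "|"), end="")   # pipe-delimited
    print(to_fixed_width(ROWS, WIDTHS), end="")
```

All three outputs are flat files. The only real difference is whether field boundaries are marked by a delimiter character or implied by position.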
-craig
"You can never have too many knives" -- Logan Nine Fingers
This file is needed by other teams for processing in other software, so the dataset option is ruled out.
The next step is the load into the data warehouse. Before loading, we have many stages, as required by the business.
I am quite new to DataStage. I read about the properties "Number of readers per node" and "Read from multiple nodes". These are available only for fixed-width files.
When DataStage PX is used, does it read delimited files in parallel? Is there no difference in performance between delimited and fixed-width files?
As you noted, fixed-width files can be read by multiple reader nodes while delimited ones cannot. Nature of the beast rather than a specific DataStage 'thing'.
In your shoes, I would ask them (your other teams) what they would prefer. Again, it will depend on the tool used to do the actual loading but fixed-width files tend to be more 'performant' there as well.
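The "nature of the beast" remark comes down to offset arithmetic. In a fixed-width file, record N starts at byte N times the record length, so a file can be cut into chunks on record boundaries without reading it. In a delimited file, a record's start is unknown until the preceding data has been scanned. The Python sketch below illustrates that idea only. It is not how DataStage is implemented, and the function name and parameters are hypothetical.

```python
def reader_ranges(file_size, record_len, readers):
    """Split a fixed-width file into one (start_byte, end_byte) range per reader.

    record_len is the full record length in bytes, including any line
    terminator. Every range starts on a record boundary, so no reader
    has to scan for the start of a record.
    """
    if record_len <= 0 or readers <= 0:
        raise ValueError("record_len and readers must be positive")
    if file_size % record_len != 0:
        raise ValueError("file size is not a whole number of records")

    total_records = file_size // record_len
    base, extra = divmod(total_records, readers)

    ranges = []
    start = 0
    for i in range(readers):
        # The first `extra` readers take one additional record each.
        count = base + (1 if i < extra else 0)
        end = start + count * record_len
        ranges.append((start, end))
        start = end
    return ranges


if __name__ == "__main__":
    # 10 records of 100 bytes each, shared between 4 readers.
    for start, end in reader_ranges(1000, 100, 4):
        print(start, end)
```

Each reader could then seek to its start offset and read up to its end offset independently of the others. No equivalent calculation exists for a delimited file, because record lengths vary.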
-craig
"You can never have too many knives" -- Logan Nine Fingers