
Which is the best tool for handling 20 TB to 30 TB of data?

Posted: Wed Mar 30, 2005 12:56 am
by Subu
Hi,
Which is the best tool for handling 20 TB to 30 TB of data? How do DataStage and Ab Initio compare?
Please advise me.

Thanks
Subu

Posted: Wed Mar 30, 2005 1:40 am
by ArndW
Subu,

it really depends on a lot of factors, and there is no single answer. DataStage PX handles large amounts of data well, as does Ab Initio. Both tools will do the job. The deciding factors are then price, availability of resources, the salesperson, and so on.

Apart from that, this question has been asked in various forms several times in this forum, and a search of the thread titles should turn up a number of interesting threads with many different opinions and views. But since this is a DataStage site, the views will be somewhat slanted.

Posted: Wed Mar 30, 2005 1:51 am
by roy
Hi,
Does Ab Initio have a forum like this one? (That must carry some weight, in my opinion. ;))

Posted: Wed Mar 30, 2005 4:32 am
by Subu
Thanks for your help. I am still confused :) because I need to give our client a good reason why I chose DataStage and not Ab Initio.

Thanks
Subu

Posted: Wed Mar 30, 2005 4:57 am
by ArndW
Subu,

in order to recommend one over the other you would need to know more about the data, platforms, databases, metadata, infrastructure, and so on. How is the 20-30 TB of data organized: is that 300 rows at 100 GB apiece? 10 tables? Does it come from complex sources, such as PL/1 or COBOL? Does the client require metadata management at a low or high level?

All ETL tools will move data from A to B, with no real transformations or logic, in roughly the same amount of time, so using size alone as a differentiator is of no use.
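
As a rough back-of-the-envelope sketch (the per-node throughput and node counts below are assumed figures, not anything benchmarked), a straight bulk copy is dominated by aggregate I/O bandwidth rather than by which engine drives it:

    # Back-of-the-envelope only: how long a straight bulk copy takes at an
    # assumed aggregate throughput. The MB/s rate and node counts are made up.
    def copy_hours(volume_tb, mb_per_sec_per_node, nodes):
        total_mb = volume_tb * 1024 * 1024              # TB -> MB
        aggregate_mb_per_sec = mb_per_sec_per_node * nodes
        return total_mb / aggregate_mb_per_sec / 3600

    for nodes in (4, 8, 16):
        print("%2d nodes: %5.1f hours for 20 TB" % (nodes, copy_hours(20, 100, nodes)))
    # ~14.6 h on 4 nodes, ~7.3 h on 8, ~3.6 h on 16 -- the time scales with
    # hardware bandwidth, which is why volume alone won't separate the tools.

Real figures will vary with disks and network, of course; the point is only that moving this much data is a hardware question more than a tool question.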

Posted: Wed Mar 30, 2005 6:21 pm
by vmcburney
You are asking the question on a DataStage forum, so people will tend to favour DataStage over Ab Initio. Both ETL tools have a robust parallel processing engine that is better than any other data integration tool on the market. They both have data quality plugins and metadata management. We cannot tell you which one is better. It comes down to price, availability of resources, and best fit in your enterprise. Why don't you let each company present to your client and let the client decide?

Re: Which is the best tool for handling 20 TB to 30 TB of data?

Posted: Wed Mar 30, 2005 11:30 pm
by T42
Subu wrote: Hi,
Which is the best tool for handling 20 TB to 30 TB of data? How do DataStage and Ab Initio compare?
DataStage EE. Seriously. Ascential has an open challenge to anyone who can beat its benchmarked performance (I don't have the specific URL handy at the moment), and no one, not even Ab Initio, has come close performance-wise.

Ascential has plenty of very large clients who throw terabytes of data around left and right daily, so they are very experienced with your data size. I have worked on databases that are 2-3 terabytes in total, and DataStage does a decent job handling them.

One interesting bit of trivia: DataStage EE was originally developed by Torrent. Torrent and Ab Initio used to be a single company before they split onto separate paths over a disagreement about how to handle data. The combined company was founded by people who built the Thinking Machines systems back in the late '80s/early '90s. My memory is fuzzy on the details, but I do think Torrent chose the right approach, and I bet Ascential would agree. :-)

Posted: Wed Apr 06, 2005 9:55 am
by diamondabhi
DataStage EE, of course; no question about it, even in terms of performance, economics, and technical support.

Posted: Wed Apr 06, 2005 4:07 pm
by ray.wurlod
Go to Ascential's web site and do a search for "benchmark". There are some success stories (with big data volumes) that you can use to support your pro-DataStage case.

Posted: Wed Apr 06, 2005 10:16 pm
by Subu
Thanks for the reply. I found some material from Google as well.

Subu