
Evaluation of Datastage as an ETL tool

Posted: Wed Mar 03, 2010 11:46 am
by ratuldey
We need to evaluate the Datastage ETL tool against the questions below. If anyone has this information, please share it. Any help would be appreciated:

(1) Version/Change Control Support: Ability to save multiple versions. Ability to port code across environments. Does Datastage support a multi-developer environment?
(2) Vendor Viability: How committed is IBM to Datastage as a product and to BI as a concept? How well does IBM support Datastage?
(3) Security: Does Datastage support LDAP? How granular is the security?
(4) Performance: Does Datastage support parallelism? Does it have a bulk data load function?
(5) Database/Platform: Does Datastage run on Intel, RISC, and mainframe? Which source and target DBMSs are supported?
(6) Ease of Use: Does Datastage have an intuitive GUI? How much training is required?
(7) Functionality: Built in functions. Custom-built and reusable? Is there support for external scripts or code?
(8) Supportability: Can Datastage be supported through the web? Does Datastage interface with an external batch scheduler? How sophisticated is the monitoring?
(9) Compatibility: How well does Datastage interface with the existing environment? Does Datastage provide connectors to existing data sources and applications?
(10) Error Handling/Debugging: How well does Datastage handle errors?
(11) Auditability: How well does Datastage log activity? How available are the logs?
(12) Extraction Mechanism: Does Datastage support both push and pull options?
(13) Data Cleansing: Are there any built in cleansing functions?
(14) Metadata: How open is the collected metadata? In which database is it stored?
(15) Cost: Approximate total cost of ownership.

Posted: Wed Mar 03, 2010 12:08 pm
by chulett
You should be asking all those questions of IBM Sales rather than people here. Have you done so, or are you at least working with someone?

Posted: Wed Mar 03, 2010 4:28 pm
by vmcburney
This is the type of thing IBM Sales do all the time.

(1) Version/Change Control Support: Ability to save multiple versions. Ability to port code across environments. Does Datastage support a multi-developer environment?
YES, YES, YES
(2) Vendor Viability: How committed is IBM to Datastage as a product and to BI as a concept? How well does IBM support Datastage?
VERY COMMITTED. VERY WELL.
(3) Security: Does Datastage support LDAP? How granular is the security?
YES. VERY.
(4) Performance: Does Datastage support parallelism? Does it have a bulk data load function?
YES. YES - FOR ORACLE, TERADATA, DB2, SQLSERVER, NETEZZA, SYBASE.
(5) Database/Platform: Does Datastage run on Intel, RISC, and mainframe? Which source and target DBMSs are supported?
WINDOWS, UNIX, LINUX, Z-LINUX, MOST DATABASES NATIVELY AND ODBC AND MAINFRAME MVS AND UNIX SYSTEM SERVICES
(6) Ease of Use: Does Datastage have an intuitive GUI? How much training is required?
YES. 5 DAYS DEVELOPER, 3 DAYS ADMINISTRATOR, 10 DAYS ADVANCED DEVELOPER, 5 YEARS GURU.
(7) Functionality: Built in functions. Custom-built and reusable? Is there support for external scripts or code?
YES, YES, YES, YES, YES
(8) Supportability: Can Datastage be supported through the web? Does Datastage interface with an external batch scheduler? How sophisticated is the monitoring?
YES, YES, BASIC SCHEDULING BUT ADVANCED SEQUENCING AND JOB CONTROL
(9) Compatibility: How well does Datastage interface with the existing environment? Does Datastage provide connectors to existing data sources and applications?
WHAT EXISTING ENVIRONMENT? YES - MOST ARE FREE, ERP PACKS COST MORE, INFOSPHERE CDC IS AN ADDON.
(10) Error Handling/Debugging: How well does Datastage handle errors?
AS WELL AS THE DEVELOPER ALLOWS
(11) Auditability: How well does Datastage log activity? How available are the logs?
ALL EVENTS LOGGED, LOG VIEWS IN DIRECTOR TOOL OR WEB REPORT TOOL OR RETRIEVED AND EMAILED BY JOB CONTROL CODE
(12) Extraction Mechanism: Does Datastage support both push and pull options?
YES
(13) Data Cleansing: Are there any built in cleansing functions?
YES
(14) Metadata: How open is the collected metadata? In which database is it stored?
COMPLETELY OPEN VIA METADATA WORKBENCH. STORED IN DB2, ORACLE OR SQLSERVER.
(15) Cost: Approximate total cost of ownership.
ANYWHERE BETWEEN $30 (ONE HOUR ON THE CLOUD) AND $1 BILLION (WORLD'S GREATEST SUPER GRID). PLUS TAX IN SOME TERRITORIES.

Posted: Wed Mar 03, 2010 4:43 pm
by DSguru2B
Vincent, you should be working as a sales rep. Well answered in clear, declarative, unambiguous language. Very difficult to misinterpret.

Posted: Wed Mar 03, 2010 5:48 pm
by ray.wurlod
Even so, I thought I'd have a crack at this one purely as an intellectual exercise.

My opinions follow. They are not necessarily those of my employer nor of anyone else.

(1) Yes.
(2) IBM has a bigger picture for Information Management than just ETL, but the DataStage ETL tool is a vital part of that story. DataStage will be around for decades and will remain an IBM product. It is supported by very competent people.
(3) Yes. Users authenticate using LDAP user names. The login/security service accesses the LDAP server for user and group authentication.
(4) Yes and Yes. The parallel execution engine scales linearly and automatically.
(5) Yes, yes and yes. Any source, any target.
(6) The GUI is reasonably intuitive, but the tool can do so much that at least entry-level training (four days) is needed. For more advanced areas, such as working with SAP, further training is recommended.
(7) Yes, yes and yes.
(8) Depends what you mean by "supported". Many of the administrative functions are performed using browser-based clients. A command line interface allows any third-party scheduler to be used - monitoring is whatever you script the scheduler to interrogate. Another component, Metadata Workbench (again browser-based) can perform extremely sophisticated monitoring.
(9) Very well. Yes.
(10) This is of course a subjective judgment. I prefer to prevent errors. Error reporting is quite comprehensive.
(11) There are various strategies available. You choose the one that best suits your requirements.
(12) Yes.
(13) Yes, enhanced with collaborative profiling and data quality tools in the Information Server suite.
(14) You control who sees metadata. It is accessible through a browser-based interface. Choice of database is DB2, Oracle or Microsoft SQL Server.
(15) Depends what you buy. Expensive, but you get what you pay for.
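
As a rough illustration of point (8), here is how a third-party scheduler might wrap the command line interface. This is only a sketch under assumptions: it uses the dsjob command with its -run, -jobstatus and -param options, and it assumes the exit codes 1 (finished OK), 2 (finished with warnings) and 3 (aborted) that -jobstatus is understood to return. Check both against the documentation for your release. The project name, job name and parameter are made up for the example.

```python
import subprocess
from typing import Dict, List, Optional

# Assumed meaning of the exit code returned by `dsjob -run -jobstatus`.
# Verify these values against the documentation for your release.
JOB_STATUS = {
    1: "finished OK",
    2: "finished with warnings",
    3: "aborted",
}


def build_dsjob_command(project: str, job: str,
                        params: Optional[Dict[str, str]] = None) -> List[str]:
    """Build the argument list a scheduler would execute to run one job."""
    cmd = ["dsjob", "-run", "-jobstatus"]
    for name, value in (params or {}).items():
        cmd += ["-param", f"{name}={value}"]
    cmd += [project, job]
    return cmd


def describe_status(code: int) -> str:
    """Translate an exit code into text for the scheduler's own log."""
    return JOB_STATUS.get(code, f"unexpected status {code}")


def is_success(code: int, allow_warnings: bool = True) -> bool:
    """Decide whether the scheduler should treat the run as successful."""
    return code == 1 or (allow_warnings and code == 2)


def run_job(project: str, job: str,
            params: Optional[Dict[str, str]] = None) -> int:
    """Run the job and return its exit code (needs dsjob on the PATH)."""
    return subprocess.run(build_dsjob_command(project, job, params)).returncode
```

A scheduler step would call something like run_job("dwh", "LoadCustomers", {"RunDate": "2010-03-03"}) and pass the result to is_success to decide whether downstream steps may start. Whether warnings count as success is a policy choice, which is why it is a parameter here.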