DS connectivity with Hadoop

mydsworld
Participant
Posts: 321
Joined: Thu Sep 07, 2006 3:55 am

Post by mydsworld »

Please let me know if DS 11.3 can connect to the Hadoop file system (Cloudera 5) and to Hive/HBase tables. If yes, how do I configure it?

Thanks in advance.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Yes. Use the HDFS stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
mydsworld
Participant
Posts: 321
Joined: Thu Sep 07, 2006 3:55 am

Post by mydsworld »

Hi Ray,
Thanks for your reply. I haven't used it before, so I take it the HDFS stage will let me access files on HDFS. Will I also be able to access tables in Hive or HBase using the 'HDFS stage'?

Thanks.
mydsworld
Participant
Posts: 321
Joined: Thu Sep 07, 2006 3:55 am

Post by mydsworld »

Ray or Others,

Is there any stage called 'HDFS stage' in DS 11.3? I thought it was the 'Big Data File' stage that accesses files sitting on Hadoop.

I'm also curious whether the following can be set up:

1. ODBC for Hive tables, called from DataStage.
2. JDBC for HBase tables, called from DataStage.

Thanks.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

The 'HDFS stage' is the Big Data File stage.
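
Before configuring the stage, it can help to confirm that HDFS is reachable from the DataStage engine host at all. The following is a minimal sketch, assuming the cluster has the standard Hadoop WebHDFS REST interface enabled on the NameNode (port 50070 is the usual Hadoop 2 default, but check your cluster). The host name, path, and user shown are hypothetical placeholders, and nothing here is DataStage-specific.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def webhdfs_url(host, path, op="LISTSTATUS", port=50070, user=None):
    """Build a WebHDFS REST URL for an absolute HDFS path."""
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    params = {"op": op}
    if user:
        # Simple (non-Kerberos) authentication passes the user as a query parameter.
        params["user.name"] = user
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{urlencode(params)}"


def list_dir(host, path, **kwargs):
    """Return the entry names in an HDFS directory via WebHDFS LISTSTATUS."""
    with urlopen(webhdfs_url(host, path, **kwargs), timeout=10) as resp:
        body = json.load(resp)
    return [entry["pathSuffix"] for entry in body["FileStatuses"]["FileStatus"]]
```

Calling `list_dir("namenode.example.com", "/user/etl", user="dsadm")` from the engine host should return the directory's entries if the network path and permissions are in order. A Kerberized cluster would need SPNEGO authentication, which this sketch does not handle.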
-craig

"You can never have too many knives" -- Logan Nine Fingers
mydsworld
Participant
Posts: 321
Joined: Thu Sep 07, 2006 3:55 am

Post by mydsworld »

Thanks for the clarification. I'm still curious about the following:

Using ODBC/JDBC to access Hive/HBase tables from DataStage.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

If those access methods are supported they can be 'called' from DataStage. The JDBC side would take extra shenanigans because, well... Java. You could always ask your official support provider as well.
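
For the Hive side of the JDBC route, the connection string follows the standard HiveServer2 form `jdbc:hive2://host:port/database`, optionally followed by `;key=value` session variables. Here is a small sketch of how that URL is assembled. The host name and Kerberos principal are hypothetical, 10000 is the usual HiveServer2 default port, and the driver jar and class (`org.apache.hive.jdbc.HiveDriver` for the Apache driver) would still have to be registered with DataStage as described in IBM's JDBC connector documentation. HBase is not covered here.

```python
def hive_jdbc_url(host, port=10000, database="default", **session_vars):
    """Build a HiveServer2 JDBC URL; session_vars become ;key=value pairs."""
    url = f"jdbc:hive2://{host}:{port}/{database}"
    for key, value in session_vars.items():
        # Session variables such as the Kerberos principal follow the database name.
        url += f";{key}={value}"
    return url


# Hypothetical examples: a plain connection and a Kerberized one.
plain = hive_jdbc_url("hiveserver.example.com")
secure = hive_jdbc_url(
    "hiveserver.example.com",
    database="sales",
    principal="hive/_HOST@EXAMPLE.COM",
)
```

Testing the resulting URL with a standalone JDBC client on the engine host first is a reasonable way to separate driver and network problems from DataStage configuration problems.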
-craig

"You can never have too many knives" -- Logan Nine Fingers
JPalatianos
Premium Member
Posts: 306
Joined: Wed Jun 21, 2006 11:41 am

Post by JPalatianos »

Hi,
I was just curious if you were ever able to configure your connectivity to Cloudera using the "Big Data File Stage"?

We have just been tasked with a similar exercise connecting to Cloudera from our 11.5 DataStage installation.

Thanks - - John
atulgoel
Participant
Posts: 84
Joined: Tue Feb 03, 2009 1:09 am
Location: Bangalore, India

Post by atulgoel »

Hi, I just wanted to know whether you were able to read Hive tables or HDFS files using the Big Data File stage. I have a similar requirement and am researching the configuration.
Atul