Hadoop and DataStage EE


daignault
Premium Member
Posts: 165
Joined: Tue Mar 30, 2004 2:44 pm

Hadoop and Datastage EE

Post by daignault »

Hadoop - http://en.wikipedia.org/wiki/Hadoop


I'm looking at playing with the Hadoop API and creating a buildop for write operations. Has anyone out there worked with Hadoop?

I've created buildops before, so I'm fine on that side of the ledger. I suspect that with Hadoop I'll need to pay attention to partitioning on the outbound connector.

Just to clarify, at the present time I'm only interested in updating the HDFS file system, not in playing with MapReduce, etc.
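
To make the partitioning concern concrete: HDFS allows only one writer per file at a time, so a parallel write operator would most likely need each partition to write its own file under a common target directory. Below is a minimal Python sketch of that naming plan only, not a buildop and not a call into the Hadoop API. The function names, the `/data/out` directory, and the `part-NNNNN` pattern (borrowed from the convention Hadoop jobs use for their output files) are all assumptions for illustration.

```python
import posixpath
from typing import List


def part_file_path(target_dir: str, partition: int, partition_count: int) -> str:
    """Return a distinct HDFS-style path for one partition's output file.

    Hypothetical helper: the 'part-NNNNN' naming is an assumption,
    not something DataStage or a buildop provides.
    """
    if partition_count < 1:
        raise ValueError("partition_count must be at least 1")
    if not 0 <= partition < partition_count:
        raise ValueError("partition must be in the range [0, partition_count)")
    # Zero-pad the partition number so the files sort in partition order.
    return posixpath.join(target_dir, f"part-{partition:05d}")


def plan_part_files(target_dir: str, partition_count: int) -> List[str]:
    """List the output paths for every partition, one file per writer."""
    return [
        part_file_path(target_dir, p, partition_count)
        for p in range(partition_count)
    ]


if __name__ == "__main__":
    # Example: a four-node configuration writing under a hypothetical directory.
    for path in plan_part_files("/data/out", 4):
        print(path)
```

In an actual buildop the partition number and partition count would have to come from the parallel framework at run time, and the write itself would go through whichever Hadoop client API is chosen. Neither of those is shown here.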

Thanks for any insight.

Regards,

Ray D
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Stay tuned to IBM for much more coming around Hadoop and processing unstructured data generally, particularly using InfoSphere Streams. One of the two big themes for the coming year for Big Blue is handling the 80% of data that are unstructured.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
akrish1982
Participant
Posts: 7
Joined: Thu May 24, 2012 12:55 pm

Hadoop is now part of V8.7

Post by akrish1982 »

Hadoop integration is now part of the IIS suite. Hadoop is very important in large-scale computing today.

http://datastageetlexpert.blogspot.com/ ... adoop.html
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Indeed. As noted in a different thread right here on DSXchange, there is a Big Data stage available for version 8.7, which is essentially a Sequential File stage that connects to a Hadoop file system. So you still get all the benefits of the STREAMS I/O module, with straightforward access to Hadoop.
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

Mr Daignault's company is bogged down with red tape and only has 8.1 and 8.5 installed in an ETL environment.

Getting an 8.7 environment is another year away, if that.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Meh.

Try working with Defence and Taxation authorities. I'm doing both at the moment. MUCH waiting on bureaucratic processes, and even then what they give you may not be what you asked for.

Don't get me started.