Infosphere 11.5 on Hadoop

sendmkpk
Premium Member
Posts: 97
Joined: Mon Apr 02, 2007 2:47 am

Post by sendmkpk »

We are about to upgrade from 8.7 on a grid with LSF to IIS 11.5.

We are unsure whether we should go with IIS 11.5 on the grid or IIS on Hadoop (not BigIntegrate or BigQuality).

Any thoughts on how regular DataStage jobs would behave on a Hadoop cluster?

How would all the input files and scripts actually work on a Hadoop cluster?

Any thoughts from the gurus are much appreciated.

Thanks
Praveen
Salegueule
Participant
Posts: 35
Joined: Fri May 21, 2004 4:22 pm

Re: Infosphere 11.5 on Hadoop

Post by Salegueule »

One thing to remember is that you won't see any performance gain on anything that involves small files. The Hadoop file system (HDFS) does not work well with small files. A small file can be defined as any file that is much smaller than the Hadoop block size, which is usually set to 64, 128, or 256 MB.
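
To make the block-size point concrete, here is a rough Python sketch that flags files which are much smaller than one HDFS block. The 128 MB block size, the 10% cutoff for "much smaller", and the file names are assumptions chosen for illustration only; check the block size configured on your own cluster.

```python
MB = 1024 * 1024

def blocks_needed(size_bytes, block_size_bytes=128 * MB):
    """Number of HDFS blocks a file of this size occupies (0 for an empty file)."""
    return -(-size_bytes // block_size_bytes)  # ceiling division

def is_small_file(size_bytes, block_size_bytes=128 * MB, ratio=0.1):
    """True if the file is below `ratio` of one block.

    The 0.1 ratio is an arbitrary assumption for "much smaller than the block size".
    """
    return size_bytes < block_size_bytes * ratio

# Hypothetical input files for a job (names and sizes are made up).
files = {
    "lookup.csv": 2 * MB,
    "daily_feed.dat": 40 * MB,
    "history.dat": 900 * MB,
}

for name, size in files.items():
    label = "small" if is_small_file(size) else "ok"
    print(f"{name}: {size // MB} MB, {blocks_needed(size)} block(s), {label}")
```

Anything flagged as small here still occupies its own block entry, so jobs that read many such files are unlikely to benefit from running on HDFS.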

Thanks.