Greenplum

ratna
Participant
Posts: 16
Joined: Mon Aug 13, 2007 3:33 am

Post by ratna »

Hi all,

I have a question: can we use Greenplum Database with DataStage 7.5.2?
If so, can you tell me how? Is it done through ODBC?

Thanks,
Ratna
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Probably. You would need to obtain an ODBC driver (presumably from Greenplum themselves) that is capable of managing their parallel access functionality. You could also write text files and have their parallel bulk loader do the heavy lifting. And you could, of course, write your own custom stage (if you can get access to documentation for a Greenplum or PostgreSQL client API).

A recent press release indicates that Greenplum has "certified interoperability with ... IBM DataStage", so it may be well worth contacting them directly to see precisely what this means. And then you can post the answer here!
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
tkbharani
Premium Member
Posts: 71
Joined: Wed Dec 27, 2006 8:12 am
Location: Sydney

Post by tkbharani »

Yes, you can use it in DS 7.5.2.
We have implemented it even in a 7.1 server job. But inserts/updates are slow when using ODBC.
The best way to use Greenplum with DataStage is "gpfdist", the fast Greenplum loader, run from Unix. Native connectivity between Greenplum and DataStage is coming very soon; GP is working on it.
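
For readers who have not seen a gpfdist load, here is a rough sketch of the usual pattern, not the exact setup described in this post: DataStage writes delimited text files, a gpfdist process serves that directory (typically started with something like `gpfdist -d <dir> -p <port>`), and Greenplum pulls the files in parallel through an external table. The table name `sales`, host `etlhost`, port `8081`, and the file pattern are all made-up placeholders, and the helper `gpfdist_load_sql` is hypothetical; check the SQL against the documentation for your Greenplum version.

```python
def gpfdist_load_sql(target_table, host, port, file_pattern, delimiter="|"):
    """Build the SQL statements for a gpfdist-based load (illustrative only).

    All arguments are placeholders chosen for this example; nothing here
    connects to a database, it only assembles the statement text.
    """
    ext_table = f"ext_{target_table}"
    # gpfdist serves files over its own protocol: gpfdist://host:port/path
    location = f"gpfdist://{host}:{port}/{file_pattern}"
    create_ext = (
        f"CREATE EXTERNAL TABLE {ext_table} (LIKE {target_table})\n"
        f"LOCATION ('{location}')\n"
        f"FORMAT 'TEXT' (DELIMITER '{delimiter}');"
    )
    # The segments read from gpfdist in parallel while this INSERT runs.
    load = f"INSERT INTO {target_table} SELECT * FROM {ext_table};"
    drop_ext = f"DROP EXTERNAL TABLE {ext_table};"
    return [create_ext, load, drop_ext]


if __name__ == "__main__":
    for statement in gpfdist_load_sql("sales", "etlhost", 8081, "sales_*.txt"):
        print(statement)
        print()
```

The generated statements could be run with psql from an after-job script once the DataStage job has finished writing the files.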

When u have tons of data to be loaded and queried, use GP and DS for best results. :wink:
Thanks, BK
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

We don't actually know the volumes of data that U is processing. We haven't heard from U for some while and even then no details were vouchsafed as to data volumes.
ethanr
Participant
Posts: 23
Joined: Fri Apr 11, 2008 9:37 am
Location: Delhi

Post by ethanr »

If the question is about data loading volumes in Greenplum,
you can load approximately 1 terabyte of data in less than 3 hours (12 CPUs, dual core).
For querying, you can scan 1 terabyte of data in 16 minutes.
For more accurate benchmarks you can always contact Greenplum.
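
To put those figures in more familiar units, here is a quick back-of-the-envelope conversion. It assumes 1 terabyte means 10^12 bytes and treats "less than 3 hours" as exactly 3 hours, so the load rate is a lower bound; the function name is just for illustration.

```python
def throughput_mb_per_s(total_bytes, seconds):
    """Average throughput in decimal megabytes per second."""
    return total_bytes / seconds / 1e6


TERABYTE = 10**12  # assumption: decimal terabyte

load_rate = throughput_mb_per_s(TERABYTE, 3 * 3600)  # 1 TB loaded in 3 hours
scan_rate = throughput_mb_per_s(TERABYTE, 16 * 60)   # 1 TB scanned in 16 minutes

if __name__ == "__main__":
    print(f"load: about {load_rate:.0f} MB/s or better")
    print(f"scan: about {scan_rate:.0f} MB/s")
```

That works out to roughly 93 MB/s for loading and a little over 1 GB/s for scanning, aggregated across the whole cluster.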
Thanks, EthanR
mike369
Participant
Posts: 7
Joined: Fri Mar 23, 2012 12:21 am

Post by mike369 »

tkbharani, can you share the details of the configuration steps? Thank you.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

From almost four years ago?
:shock: