Slowly Changing Dimension Stage - size limitations?

ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

I'm not in a position to test this directly and didn't have any luck with a search so I'll ask the question here:

Is there a 2 GB limit on the reference data for the SCD stage? Our source and reference data will exceed 5 GB, but the documentation indicates that an in-memory table is built, which makes me think of the Lookup stage and its inherent memory limit.

Does anyone know for certain that this limit does, or does not, exist?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Isn't that limitation based on the "bitness" of the software? 2 GB only in the 32-bit version?
-craig

"You can never have too many knives" -- Logan Nine Fingers
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Yes, but if the stage were written to support larger amounts of data, it could get around this limitation even with a 32-bit pointer (by writing to disk, for example). If it uses the old lookup-stage functionality, then I'd be "SOL".
allavivek
Premium Member
Posts: 211
Joined: Sat May 01, 2010 5:07 pm

Post by allavivek »

I think the reference data is loaded into memory as in the Lookup stage, but based on purpose codes. The amount of data loaded into memory depends on the purpose code settings.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Would it make any difference if your reference data were partitioned three or more ways, so that any one process had less than 2GB of reference data with which to deal?
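
As a rough check on the arithmetic behind this suggestion, here is a minimal sketch. It assumes the ceiling is 2 GiB (2**31 bytes) per process, that the reference data is split evenly, and that each process loads only its own partition. None of that is confirmed for the SCD stage in this thread, and the function names are made up for illustration.

```python
import math

# Assumed per-process ceiling: 2 GiB, the usual signed 32-bit addressing limit.
LIMIT_BYTES = 2**31


def min_partitions(total_bytes: int, limit_bytes: int = LIMIT_BYTES) -> int:
    """Smallest partition count whose even share is strictly below the limit."""
    return total_bytes // limit_bytes + 1


def per_partition_bytes(total_bytes: int, partitions: int) -> int:
    """Largest share, in bytes, when the data is split as evenly as possible."""
    return math.ceil(total_bytes / partitions)


total = 5 * 2**30  # the 5 GB of reference data mentioned in the question
n = min_partitions(total)
share = per_partition_bytes(total, n)
print(f"{n} partitions, {share / 2**30:.2f} GiB each")
```

Under these assumptions, two partitions would still leave 2.5 GiB per process, which is why the split has to be three ways or more. If the stage loads the full reference set in every process, the split does not help at all.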
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

If it uses the same code as the Lookup stage, then the 2 GB limit wouldn't be affected by the number of processing nodes. I guess we'll have to code an example and see if it blows up on us. Once we have a result, I'll post it here.