Slowly Changing Dimension Stage - size limitations?

Posted: Wed Jan 23, 2013 9:05 am
by ArndW
I'm not in a position to test this directly and didn't have any luck with a search so I'll ask the question here:

Is there a 2 GB limit on the reference data for the SCD stage? Our source and reference data will be in excess of 5 GB, but the documentation indicates that an in-memory table is built, which makes me think along the lines of the lookup stage and its inherent memory limit.

Does anyone know for certain that this limit does, or does not, exist?

Posted: Wed Jan 23, 2013 9:31 am
by chulett
Isn't that limitation based on the "bitness" of the software? 2 GB only in the 32-bit version?

Posted: Wed Jan 23, 2013 9:44 am
by ArndW
Yes, but if the stage were written to support larger amounts of data it could, even with a 32-bit pointer, get around this limitation (by writing to disk, for example). If it uses the old lookup-stage functionality then I'd be "SOL".

Posted: Wed Jan 23, 2013 10:45 am
by allavivek
I think the reference data is loaded into memory as in the lookup stage, but based on purpose codes. The amount of data that gets loaded into memory depends on the purpose code settings.

Posted: Wed Jan 23, 2013 1:21 pm
by ray.wurlod
Would it make any difference if your reference data were partitioned three or more ways, so that any one process had less than 2GB of reference data with which to deal?
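
As a rough check of the "three or more ways" figure, here is a minimal Python sketch of the arithmetic. It assumes the limit is 2 GiB (2**31 bytes, the range of a signed 32-bit offset), that it applies per process, and that the reference data splits evenly across partitions; none of these is confirmed for the SCD stage. The function names are made up for illustration, and 5 GiB is only the lower bound given in the question.

```python
import math

# Assumed limit: 2 GiB, the range of a signed 32-bit offset.
LIMIT_BYTES = 2**31


def min_partitions(total_bytes, limit_bytes=LIMIT_BYTES):
    """Smallest partition count so an even split stays strictly below the limit."""
    return total_bytes // limit_bytes + 1


def per_partition_bytes(total_bytes, partitions):
    """Bytes in the largest partition, assuming a perfectly even split."""
    return math.ceil(total_bytes / partitions)


total = 5 * 2**30  # 5 GiB of reference data, the lower bound from the question
n = min_partitions(total)
print(n, per_partition_bytes(total, n) / 2**30)
```

Real partitions are rarely perfectly even, so skew in the partitioning key would push the required count higher.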

Posted: Thu Jan 24, 2013 5:13 am
by ArndW
If it uses the same code as the lookup stage, then the 2 GB limit wouldn't be affected by the number of processing nodes. I guess we'll have to code an example and see if it blows up on us. Once we have a result I'll post it here.