I'm not in a position to test this directly and didn't have any luck with a search so I'll ask the question here:
Is there a 2GB limit on the reference data to the SCD stage? Our source and reference data will be in excess of 5GB, but the documentation indicates an in-memory table is built, which makes me think along the lines of the Lookup stage and its inherent memory limit.
Does anyone know for certain that this limit does, or does not, exist?
Yes, but if the stage were written to support larger amounts of data it could, even with a 32-bit pointer, get around this limitation (by writing to disk, for example). If it uses the old Lookup stage functionality then I'd be "SOL".
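To illustrate the spill-to-disk idea (this is just a sketch in Python, not DataStage internals; the keys and values are made up): a lookup table backed by a file keeps the data out of the process heap, so the 32-bit per-process address-space ceiling stops being the bottleneck.

```python
# Sketch: a disk-backed lookup table. Records live in a dbm file on
# disk rather than in process memory, so the table can grow past what
# a 32-bit address space could hold in RAM.
import os
import shelve
import tempfile

path = os.path.join(tempfile.mkdtemp(), "ref_lookup")

# Load phase: write reference records to the file-backed table.
with shelve.open(path) as table:
    for i in range(1000):
        table[f"C{i:05d}"] = {"name": f"name-{i}"}

# Lookup phase: reopen and probe by key; only the probed record is
# pulled into memory.
with shelve.open(path) as table:
    print(table["C00042"]["name"])  # -> name-42
```

The trade-off, of course, is that every probe becomes a disk (or at best OS page-cache) access instead of a memory read, which is presumably why the stage builds an in-memory table in the first place.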
I think the reference data is loaded into memory as in the Lookup stage, but based on purpose codes. The amount of data that gets loaded into memory depends on the purpose code settings.
Would it make any difference if your reference data were partitioned three or more ways, so that any one process had less than 2GB of reference data with which to deal?
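The partitioning idea above can be sketched outside DataStage (illustrative Python only; the column names, key scheme, and row counts are invented): hash each record's key to a partition so that, with a reasonably uniform key distribution, each process holds roughly 1/N of the reference data.

```python
# Sketch: hash-partitioning reference rows across N partitions so each
# process's in-memory lookup table holds only its share of the data.
import hashlib

def partition_for(key: str, num_partitions: int) -> int:
    """Assign a record to a partition by hashing its key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions

def partition_rows(rows, key_col, num_partitions):
    """Split reference rows into num_partitions hash partitions."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[partition_for(row[key_col], num_partitions)].append(row)
    return partitions

# Invented sample data standing in for the reference link.
rows = [{"cust_id": f"C{i:05d}", "name": f"name-{i}"} for i in range(9000)]
parts = partition_rows(rows, "cust_id", 3)
print([len(p) for p in parts])  # roughly a third of the rows each
```

For this to work the source data would have to be hash-partitioned on the same key, so that each process only ever probes the reference rows it actually holds.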
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
If it uses the same code as the Lookup stage, then the 2GB limit wouldn't be affected by the number of processing nodes. I guess we'll have to code an example and see if it blows up on us. Once we have a result I'll post it here.