Surrogate generator not taking initial value from flat file.

partheev123
Premium Member
Posts: 20
Joined: Sun Dec 20, 2009 10:46 pm

Surrogate generator not taking initial value from flat file.

Post by partheev123 »

Hi,

I am generating a surrogate key using the flat file (state file) option, but the initial value for the next run is not coming from the flat file; it is taking what looks like a random value. Can you please advise what I should give for the block size? I think that is what I am missing.
Appreciate your help.
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Re: Surrogate generator not taking initial value from flat file.

Post by jwiles »

partheev123 wrote:Hi,

I am generating a surrogate key using the flat file (state file) option, but the initial value for the next run is not coming from the flat file; it is taking what looks like a random value. Can you please advise what I should give for the block size? I think that is what I am missing.
Appreciate your help.
My first impression is that the next available value is being assigned within a multi-partition job, which can appear to be random.

Do you want the next run to start from the next available value or from the next highest value (highest previously assigned + 1)? Those are usually two different numbers, especially when you are running in parallel. The SKG stage (and, in 8.x, the Transformer) gives you the option of whether or not to start with highest+1.
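
To make that distinction concrete, here is a minimal Python sketch; the two-partition layout and the numbers are made up for illustration, not read from a real SKG state file:

```python
# Hypothetical example: two partitions were each handed a block of 1000 values.
# Partition 0 assigned 400 of its block, partition 1 assigned 950 of its block.
blocks = {0: range(1, 1001), 1: range(1001, 2001)}   # block handed to each partition
used   = {0: 400, 1: 950}                            # values actually assigned per partition

highest_assigned = max(blocks[p][used[p] - 1] for p in blocks)
next_available   = max(b.stop for b in blocks.values())  # start of the first unissued block

print("highest assigned + 1:", highest_assigned + 1)  # 1951
print("next available value:", next_available)        # 2001
```

Depending on which option the stage is set to use, the next run starts from one or the other, and neither is random even though the jump between them can look arbitrary.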

Is the flat file being manipulated in any way between job runs?

The flat file (also called a state file) tells SKG which ranges of values are available for assignment, in blocks of xxxx values; each instance of SKG assigns values from its own block until it either exhausts those values or the job completes.

I would not lower the block size unless it turns out to be the only way to achieve what you need: setting it too low can severely impact performance in a medium-to-high volume job. Caveat: if this is typically a low-volume job, you could set the block size to the approximate average record count to reduce the perceived randomness and the gaps in assigned values caused by partitioning.
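
For a rough feel of why block size matters, the following Python sketch models block-style allocation across uneven partitions. The allocation logic and row counts are assumptions for illustration only, not the actual SKG internals, but they show how unused block tails become gaps and how a smaller block size shrinks those gaps at the cost of more block fetches:

```python
# Rough model: each partition grabs the next free block of `block_size` values
# from a shared counter and assigns from that block until it runs out of rows.
def simulate(rows_per_partition, block_size):
    next_block_start = 1
    assigned = []
    for rows in rows_per_partition:          # one entry per partition
        remaining = rows
        while remaining > 0:
            block = list(range(next_block_start, next_block_start + block_size))
            next_block_start += block_size
            take = min(remaining, block_size)
            assigned.extend(block[:take])    # the unused tail of the block becomes a gap
            remaining -= take
    return assigned, next_block_start        # the next run starts at next_block_start

rows = [400, 950, 10, 700]                   # made-up, uneven row counts across 4 partitions

for bs in (1000, 100):
    keys, next_start = simulate(rows, bs)
    unused = (next_start - 1) - len(keys)
    print(f"block size {bs:4}: rows={len(keys)}, highest key={max(keys)}, "
          f"next run starts at {next_start}, unused values (gaps)={unused}")
```

With these made-up numbers, a block size of 1000 leaves far more unused values between runs than a block size of 100, which is the trade-off described above.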

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.