Surrogate Key producing one extra key

Moderators: chulett, rschirm, roy

bkumar103
Participant
Posts: 214
Joined: Wed Jul 25, 2007 2:29 am
Location: Chennai

Post by bkumar103 »

My job looks as follows:

Job 1
Dataset --> Surrogate Key Generator --> Oracle

The input dataset has 475 records.

When I looked at the log, there was something I could not understand.
The log shows that Output 0 produced 476 records, although the dataset contains only 475 records.

Can we do anything about this?

Thanks,
Birendra
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

I do not think it's a surrogate key issue. Can you check in Data Set Management how many records the dataset contains, and how many records made it to the database?
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
bkumar103
Participant
Posts: 214
Joined: Wed Jul 25, 2007 2:29 am
Location: Chennai

Post by bkumar103 »

The dataset contains exactly 475 records, and 475 records were loaded to the database. The message is logged against the Surrogate Key stage. Also, one extra surrogate key is being generated: for example, if the first load generates keys 1 to 475, the next run starts from 477 instead of 476.
Birendra
bkumar103
Participant
Posts: 214
Joined: Wed Jul 25, 2007 2:29 am
Location: Chennai

Post by bkumar103 »

I have replaced the Oracle stage with a Sequential File stage.
So my job is now:

Dataset --> Surrogate Key Generator --> Sequential File

I ran the job twice. The Surrogate Key stage runs in sequential mode.
In the first run the keys were generated from 1 to 475, and in the next run from 477 to 951. There is a gap between the last value generated in the first run and the first value generated in the second run. Has anyone seen such a case?

Thanks in advance,
Birendra
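
The numbers reported for the two runs can be checked with a few lines of Python. This is only arithmetic on the ranges quoted in the post; the helper name `keys_in_range` is made up for illustration and has nothing to do with DataStage itself.

```python
def keys_in_range(first, last):
    """Number of keys in an inclusive range first..last."""
    return last - first + 1

# Key ranges as reported in the post above.
run1 = (1, 475)
run2 = (477, 951)

run1_count = keys_in_range(*run1)
run2_count = keys_in_range(*run2)

# Keys that fall strictly between the end of run 1 and the start of run 2.
skipped = list(range(run1[1] + 1, run2[0]))

print(run1_count, run2_count, skipped)
```

Both runs assign 475 keys, matching the 475 input records, and exactly one value (476) is skipped between them. That is consistent with the log reporting 476 records on Output 0, although it does not by itself explain why the extra key is consumed.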
prasson_ibm
Premium Member
Posts: 536
Joined: Thu Oct 11, 2007 1:48 am
Location: Bangalore

Post by prasson_ibm »

What block size have you selected?
prasson_ibm
Premium Member
Posts: 536
Joined: Thu Oct 11, 2007 1:48 am
Location: Bangalore

Post by prasson_ibm »

If the key source is a flat file, specify how keys are generated:

■ To generate keys in sequence from the highest value that was last used, set the Generate Key from Last Highest Value property to Yes. Any gaps in the key range are ignored.
■ To specify a value to initialize the key source, add the File Initial Value property to the Options group, and specify the start value for key generation.
■ To control the block size for key ranges, add the File Block Size property to the Options group, set this property to User specified, and specify a value for the block size.
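
To make the block size question concrete, here is a rough Python sketch of how a block-based key source could leave gaps between runs. It is a toy model and not the actual DataStage implementation: the class `ToyKeySource`, the function `run_job`, and the assumption that unused keys in a reserved block are simply discarded at the end of a run are all hypothetical.

```python
class ToyKeySource:
    """Hypothetical in-memory stand-in for a flat-file key source."""

    def __init__(self, initial_value=1):
        # First key that has not yet been handed out in any block.
        self.next_unreserved = initial_value

    def reserve_block(self, block_size):
        """Reserve block_size consecutive keys and return (first, last)."""
        start = self.next_unreserved
        self.next_unreserved += block_size
        return start, start + block_size - 1


def run_job(source, row_count, block_size):
    """Assign one key per row, reserving keys from the source block by block."""
    keys = []
    current, block_end = 0, -1  # empty block, forces a reservation on the first row
    for _ in range(row_count):
        if current > block_end:
            current, block_end = source.reserve_block(block_size)
        keys.append(current)
        current += 1
    # Assumption: keys left over in the last block are not returned to the source.
    return keys


# Block size 100: the last block (401-500) is only partly used.
wide = ToyKeySource(initial_value=1)
wide_run1 = run_job(wide, 475, block_size=100)
wide_run2 = run_job(wide, 475, block_size=100)

# Block size 1: every reserved key is used.
tight = ToyKeySource(initial_value=1)
tight_run1 = run_job(tight, 475, block_size=1)
tight_run2 = run_job(tight, 475, block_size=1)

print(wide_run1[-1], wide_run2[0])    # 475 501
print(tight_run1[-1], tight_run2[0])  # 475 476
```

In this toy model a block size of 100 leaves keys 476 to 500 unused, so the second run starts at 501, while a block size of 1 leaves no gap at all. Whether the real stage behaves this way depends on its actual reservation logic, which the sketch does not claim to reproduce.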
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

I have to ask: why do you care? Surrogate keys work regardless of "gaps", and according to industry best practices you should never assign meaning to them (e.g. record counts). As long as every record gets a unique key, it shouldn't matter.
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
bkumar103
Participant
Posts: 214
Joined: Wed Jul 25, 2007 2:29 am
Location: Chennai

Post by bkumar103 »

The block size is set to 1.
Birendra