Page 1 of 3

Surrogate Key Generator questions

Posted: Mon Jun 27, 2011 9:40 pm
by pandeesh
prem84 wrote:I created a state file using UNIX touch command and in the surrogate key generator stage
......And the state file is also updated
What's the purpose of creating the state file?

Posted: Mon Jun 27, 2011 9:41 pm
by pandeesh
ray.wurlod wrote:Keys are allocated in blocks. If not all keys in a block are used (due to slight variations in the number of rows per partition) they are discarded.
Does that mean it's not possible at all for the keys to be contiguous?

thanks

Posted: Mon Jun 27, 2011 9:50 pm
by chulett
Since these are your questions, I split them out into your own thread rather than clutter up someone else's issue. And that way you can decide when your doubts have been resolved.

For the record, split from this post if you want to see the original context.

Posted: Tue Jun 28, 2011 1:02 am
by ray.wurlod
Of course it's possible to be contiguous. You could run just a single partition.

Beyond that, you cannot guarantee contiguity unless you can also guarantee that there are precisely the same number of rows in every partition.

And even then, the Surrogate Key Generator stage is not really the preferred way to do it (you could set the block size to 1, but that is rather slow). You should instead use a Column Generator or Transformer stage to generate the unique values.
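To illustrate the point about partitions: a Transformer derivation commonly used for this is @PARTITIONNUM + (@INROWNUM - 1) * @NUMPARTITIONS + 1. Here is a small Python sketch of that arithmetic (Python purely as illustration; the real derivation lives in a Transformer stage) showing that the keys are always unique, but contiguous only when every partition processes the same number of rows:

```python
# Sketch (not DataStage itself): simulate the Transformer derivation
#   @PARTITIONNUM + (@INROWNUM - 1) * @NUMPARTITIONS + 1
# across partitions to see when the generated keys are contiguous.

def generate_keys(rows_per_partition):
    """rows_per_partition[p] = number of rows processed by partition p."""
    num_partitions = len(rows_per_partition)
    keys = []
    for partition in range(num_partitions):
        for inrownum in range(1, rows_per_partition[partition] + 1):
            keys.append(partition + (inrownum - 1) * num_partitions + 1)
    return sorted(keys)

# Balanced partitions: keys come out contiguous, 1..N.
print(generate_keys([3, 3, 3]))   # -> [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Unbalanced partitions: keys are still unique, but a gap appears.
print(generate_keys([3, 4, 3]))   # -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 11]
```

The keys can never collide because each partition's keys are congruent to its partition number modulo the partition count.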

Posted: Tue Jun 28, 2011 5:43 am
by pandeesh
So the state file concept is not applicable in DataStage 7.5.x?

I could not find any option for it in the SKG stage.

Posted: Tue Jun 28, 2011 6:36 am
by chulett
Correct, the state file and/or support of a sequence object were added in the 8.x release from what I recall.

Posted: Tue Jun 28, 2011 7:04 am
by pandeesh
So how can I achieve the same in 7.5.x?

Posted: Tue Jun 28, 2011 7:13 am
by chulett
What "same" would that be? The stage works the way it works in your version, not much you can do about that.

Posted: Tue Jun 28, 2011 7:58 am
by pandeesh
chulett wrote:What "same" would that be? The stage works the way it works in your version, not much you can do about that.
My understanding is as follows:

In DS 7.5.x, the SKG stage is suited to key generation for a one-time load.

For example, in the first run 100 records are loaded, so surrogate keys are generated from 1 to 100.

When the same job runs the next day, it will again generate keys from 1 to 100, not from 101.

In DS 7.5.x it's not possible to continue from the last value using the SKG stage.

Correct me if my understanding is wrong.

Posted: Tue Jun 28, 2011 8:10 am
by jwiles
You are essentially correct. However, you can achieve close to what you want by specifying the starting number for the SKG and passing it in as a job parameter.

Regards,

Posted: Tue Jun 28, 2011 8:18 am
by pandeesh
If we run the job manually, we can follow your suggestion.

But if it's a scheduled job, how do we pass the last surrogate key loaded in the previous run to the current run as a parameter?

How can this communication be achieved?

Thanks

Posted: Tue Jun 28, 2011 8:26 am
by mhester
You may have to create a job sequence which retrieves the max() value, writes it to a sequential file and then binds this value to a job parameter in the sequence. This is one possible way to do it; I am sure there are others.
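A minimal Python sketch of the pattern described above (the file name and helper functions are illustrative only, not DataStage APIs; in DataStage these steps would be activities in a job sequence): persist the maximum key after each load and use it to derive the start value for the next run.

```python
# Sketch of the max()+1 state-file pattern: the last key used is kept
# in a small state file between runs.
import os

STATE_FILE = "sk_state.txt"        # illustrative name

if os.path.exists(STATE_FILE):     # start fresh for the demo
    os.remove(STATE_FILE)

def read_start_value():
    """Next key to use: last persisted maximum + 1 (1 on the first run)."""
    if not os.path.exists(STATE_FILE):
        return 1
    with open(STATE_FILE) as f:
        return int(f.read().strip()) + 1

def load_batch(rows, start):
    """Stand-in for the load job: assign keys start, start+1, ..."""
    keyed = [(start + i, row) for i, row in enumerate(rows)]
    if keyed:
        with open(STATE_FILE, "w") as f:
            f.write(str(keyed[-1][0]))   # persist the new max()
    return keyed

first = load_batch(["a", "b", "c"], read_start_value())   # keys 1, 2, 3
second = load_batch(["d", "e"], read_start_value())       # keys 4, 5
```

The second run picks up at 4 because the first run wrote its maximum (3) to the state file; this is the communication between scheduled runs that the question asks about.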

Posted: Wed Jun 29, 2011 1:24 am
by pandeesh
mhester wrote:You may have to create a job sequence which retrieves the max() value, writes it to a sequential file and then binds this value to a job parameter in the sequence. This is one possible way to do it; I am sure there are others.
Yes! I have achieved it like this:

1) Initially I created a dummy hashed file containing the value 1.

2) Then I created the parallel job with a source sequential file, SKG stage and target sequential file. In the SKG stage I passed a parameter for the start value.

3) I created one more server job that uses the target file of the previous job as its source. I used an Aggregator stage to find max() of the SK field, then in a Transformer stage I calculated max()+1 and overwrote the value in the hashed file.

4) In the main job sequence I placed a User Variables activity first, got the value from the hashed file using GetHashValueByKey("","") and passed this value to the first job as the parameter for the SK start value.

In this way I achieved it with a parallel job and a server job in a sequence.

Any other ways or ideas welcome!
Thanks

Posted: Wed Jun 29, 2011 2:16 am
by ray.wurlod
Well, if you're using hashed files, you may as well use the Key Management routines in the SDK for working with them.

Posted: Wed Jun 29, 2011 2:26 am
by pandeesh
ray.wurlod wrote:Well, if you're using hashed files, you may as well use the Key Management routines in the SDK for working with them.
Is it possible to retrieve the value from a sequential file and pass it in the User Variables activity? (I have done this with a hashed file; in that design, can the hashed file be replaced by a sequential file?)

Thanks