Hi,
I am facing an issue when the same surrogate key state file is used by separate jobs running in parallel.
1. Currently I am using the stage in parallel mode with 'Generate key from last highest value=YES'.
Because of this, as expected, the values generated will not be continuous. But this approach is desired because, if the state file is accidentally deleted, we can repopulate it with the maximum value retrieved from the table (which was populated using this state file).
Before the file was deleted, the content of the state file was:
8002 8002
After the state file is recreated and updated with the max value from the table, I can view the content as:
1 8001
8002 8002
Since I have used 'Generate key from last highest value=YES', the job which uses this surrogate key will have values starting from 8001. So no issue here, but there is a gap in the generated sequence.
2. If I use the surrogate key stage in parallel mode (I don't want to use sequential) with 'Generate key from last highest value=NO', there is no problem as long as the state file is intact. Even though the values may not be contiguous as they are handed out, every value will eventually be used, so there is no gap in the sequence. The problem comes when the state file is deleted.
Before the file was deleted, the content of the state file was:
8002 8002
After the state file is recreated and updated with the max value from the table, I can view the content as:
1 8001
8002 8002
In this case, when I run a job which uses the above state file to populate the table, the values start from 1. That data is already present in the table, so the job aborts with a unique key constraint violation.
Can you suggest any way [using the surrogate key stage in parallel mode (I don't want to use sequential) and 'Generate key from last highest value=NO'] so that, after the state file is created and updated with the maximum value from the table, it contains just '8002 8002' (8001 is the maximum value generated in the table)? I don't want it to contain '1 8001'.
Thanks,
Suma
issue when using 'Generate key from last highest value=NO'
Re: issue when using 'Generate key from last highest value=NO'
Let's get this out of the way first:
Suma Reddy wrote: Because of this, as expected, the values generated will not be continuous.
Why is this any kind of concern? Having "gaps" in surrogate key sequences is normal, expected, and nothing to lose any sleep over. Do you have any particular reason for actually needing them to be contiguous and "gap free"?
Suma Reddy also wrote: So no issue here, but there is a gap in the sequence generated.
-craig
"You can never have too many knives" -- Logan Nine Fingers
No, although contiguous values are preferred, they are not mandatory. Can you please let me know the maximum value that can be generated by the surrogate key generator using a state file?
In my project I am expecting a huge volume of data, and this table will grow with each passing day.
I am just wondering, if I continue to lose values to these gaps in the sequence, whether I might reach the limit of values the surrogate key generator can produce.
Suma Reddy wrote: I am just wondering, if I continue to lose values to these gaps in the sequence, whether I might reach the limit of values the surrogate key generator can produce.
The holes in the key values do exist if the block size is greater than one, but the holes keep filling up as new keys are generated until the block is completely utilised, and then new blocks are reserved. So you don't have to worry about the gaps to the extent of going beyond the unsigned bigint range.
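To make that concrete, here is a toy model of block-based key allocation. This is purely illustrative (not DataStage internals): it just shows that when each node reserves a block and fills it before reserving a new one, gaps are temporary and every reserved key is eventually handed out.

```python
def simulate(block_size, requests_per_node, nodes=2):
    """Toy model: each node reserves a block of `block_size` keys from a
    shared high-water mark and fills it before reserving another block."""
    next_block_start = 1          # shared "state file" high-water mark
    cursors = {}                  # node -> (next_key, keys_left_in_block)
    used = []
    for _ in range(requests_per_node):
        for node in range(nodes):
            key, left = cursors.get(node, (None, 0))
            if left == 0:         # current block exhausted: reserve a new one
                key = next_block_start
                next_block_start += block_size
                left = block_size
            used.append(key)
            cursors[node] = (key + 1, left - 1)
    return used
```

With `simulate(5, 7)`, 14 distinct keys are generated; the only "hole" is the unfilled tail of each node's current block, so the highest key used stays close to the count of keys generated.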
- Zulfi
Hi Suma,
IMO, the surrogate key state file has been a little "flaky" since the beginning.
Once you understand what is going on, you can deal with it.
When you create a new state file having a value of 8002 using the "Create and Update" action, you see the results that you have seen:
1 8001
8002 8002
When you update an existing state file with a value of 8002 using the "Update" action you see:
1 1
8002 8002
With the update action, you still have 1 key that will be "reused".
I have built myself a generic utility job sequence to sync a surrogate key state file:
1) remove an existing state file
2) create a state file
3) update the state file with the max value selected from database table
4) generate 1 surrogate key (to burn the "1 1" entry)
The final result is a state file with a single record:
8002 8002
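If you prefer, the end state of those four steps can also be produced directly. As a sketch only, assuming the state file layout is pairs of native-endian unsigned 64-bit integers with no delimiters (verify against one of your own files with od before trusting this):

```python
import struct

def rebuild_state_file(path, next_key):
    """Rewrite a surrogate key state file so it holds the single record
    (next_key, next_key).

    ASSUMPTION: 16-byte records of two native-endian unsigned 64-bit
    integers, no field or record delimiters.
    """
    with open(path, "wb") as f:
        f.write(struct.pack("=QQ", next_key, next_key))

# Example: leave only the record "8002 8002" in the file.
rebuild_state_file("skg_state.bin", 8002)
```

Writing the file by hand bypasses the "burn one key" step, but the job-sequence approach above is safer since it uses only the stage's own documented actions.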
The format of the state file is quite simple. It is simply two unsigned bigint fields stored in binary format with no field or record delimiters (16 bytes per record). You can use the od command to view it. I suspect they are stored with native endianness.
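If you don't have od handy, the same inspection can be done in a few lines of Python, again assuming the layout just described (two native-endian unsigned bigints per 16-byte record):

```python
import struct

def dump_state_file(path):
    """Print each (start, end) record of a surrogate key state file,
    assuming 16-byte records of two native-endian unsigned bigints."""
    with open(path, "rb") as f:
        data = f.read()
    records = [struct.unpack_from("=QQ", data, off)
               for off in range(0, len(data), 16)]
    for start, end in records:
        print(start, end)
    return records
```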
Mike
-
- Premium Member
- Posts: 730
- Joined: Tue Nov 04, 2008 10:14 am
- Location: Bangalore
No such experience. I can't imagine how od "did not correctly read the values".
I'm more fluent in hex than octal, so I use od to dump it as hex.
Another quick way to view it is to set up a Sequential File stage: define the record type as implicit, set delimiter and quote to none, define two fields as unsigned bigint, add the property indicating that each field is binary, then use View Data.
Mike