KeyMgtKeyNextValue transform
Moderators: chulett, rschirm, roy
KeyMgtKeyNextValue transform
I am trying to find a way to duplicate the identity function in SQL. I have found the KeyMgt transforms but I have been unable to find documentation on these transforms. I do know that the transform KeyMgtKeyNextValue will generate sequential numbers but how to use this transform exactly and what literal string I should give it have been guesswork.
Re: KeyMgtKeyNextValue transform
The identify functionality is a means to generate a unique identifier for each row. What you are talking about is a surrogate key assignment functionality. Your question, while seemingly easy, can invoke a theoretical, spiritual, and philosophical debate.
If you simply want an ever increasing unique identifier, then all you need to do is create a "seed" process (can be a job, script, whatever) that simply selects the maximum value currently in use in the table and puts that output somewhere (DS hash file, sequential text file, etc). Then, during transformation process you simply use that value as a seed to a running variable and keep incrementing it every time you want the next value.
However, you have not stated how you want to use it in a complete sense. Do you need to do a natural key lookup to see if the row is already in the target table, then use currently assigned surrogate key (your identity assigned column) else create the next value. This activity is typical of a data warehouse environment. In this case, the process is almost the same as above, except now you have to interrogate the target to see if a row is already assigned a value.
At this point, I'm going to recommend that you pursue Ascential's training programs, as this is covered not only in the elementary training course, but very much in depth in their methodology course. It is available for e-learning, as well as contains many working examples that will better situate you for this type of design.
If you simply want an ever increasing unique identifier, then all you need to do is create a "seed" process (can be a job, script, whatever) that simply selects the maximum value currently in use in the table and puts that output somewhere (DS hash file, sequential text file, etc). Then, during transformation process you simply use that value as a seed to a running variable and keep incrementing it every time you want the next value.
However, you have not stated how you want to use it in a complete sense. Do you need to do a natural key lookup to see if the row is already in the target table, then use currently assigned surrogate key (your identity assigned column) else create the next value. This activity is typical of a data warehouse environment. In this case, the process is almost the same as above, except now you have to interrogate the target to see if a row is already assigned a value.
At this point, I'm going to recommend that you pursue Ascential's training programs, as this is covered not only in the elementary training course, but very much in depth in their methodology course. It is available for e-learning, as well as contains many working examples that will better situate you for this type of design.
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Re: KeyMgtKeyNextValue transform
Thanks for answering my question. I have taken the Ascential training and am aware of using lookup tables to get the next value. I noticed the KeyMgtKeyNextValue transform in Vers. 7 but there is very little documentation and was wondering if this was Ascential's way of introducing key generation functionality within DataStage.justlrng wrote:I am trying to find a way to duplicate the identity function in SQL. I have found the KeyMgt transforms but I have been unable to find documentation on these transforms. I do know that the transform KeyMgtKeyNextValue will generate sequential numbers but how to use this transform exactly and what literal string I should give it have been guesswork.
as ray mentioned, the SEQNAME.NEXTVAL is a great way to replace the IDENTITY function in SQL, keeping the persistence in the DB. you probably only want to use the KeyMgtKeyNextValue if you specifically want to move the persistence out of the database for use by multiple datasources, as Ken describes. It uses Common storage to hold the Key value, and you should only use this if you are familiar with clearing, deleting, creating common variables (do a help on "common" from datastage).
The function creates a type 2 hash file called "SDKSequences" in your datastage project to store the "last Value", using the argument as the name of the Key of the file. It then post-increments the number by 1 and writes it back to the common variable. To use it, simply call the function inside a derivation in the transform: if the comman variable is not created, it will create it. To reset the value, you need to delete the SDKSequences file from your project.
There are several methods for calling NEXTVAL. Here are some suggestions in order of my personal preference:
1. Call NEXTVAL as a subselect in the Attribute derivation from the source stage column derivations
2. Call Nextval in a separate stage/page which is referenced in the Xform: The stage should not contain a key value to "lookup", resulting in a Nextval call for evey input stream record, regardless if the value is used in the output link of the Xform
3. use @NULL as the derivation for the target field, calling Nextval in a userdefined insert statement of the target stage, wraping the call in an NVL or isnull function: NVL(:1, SEQNAME.NEXTVAL)
4. write a routine to exec a SQL command to call NEXTVAL
Send me an email if you have more specific questions offline.
The function creates a type 2 hash file called "SDKSequences" in your datastage project to store the "last Value", using the argument as the name of the Key of the file. It then post-increments the number by 1 and writes it back to the common variable. To use it, simply call the function inside a derivation in the transform: if the comman variable is not created, it will create it. To reset the value, you need to delete the SDKSequences file from your project.
There are several methods for calling NEXTVAL. Here are some suggestions in order of my personal preference:
1. Call NEXTVAL as a subselect in the Attribute derivation from the source stage column derivations
2. Call Nextval in a separate stage/page which is referenced in the Xform: The stage should not contain a key value to "lookup", resulting in a Nextval call for evey input stream record, regardless if the value is used in the output link of the Xform
3. use @NULL as the derivation for the target field, calling Nextval in a userdefined insert statement of the target stage, wraping the call in an NVL or isnull function: NVL(:1, SEQNAME.NEXTVAL)
4. write a routine to exec a SQL command to call NEXTVAL
Send me an email if you have more specific questions offline.
I would avoid using the routine KeyMgtKeyNextValue if you are going to be using a instantiated jobs (think of Agent Smith from Matrix 2, more me) to tackle a transformation. Because this routine uses internal BASIC commands that read-lock the hash file maintaining the last value assigned, all of your jobs will stall as one job hogs the lock table.
What I have done is accept gaps in surrogate key sequencing (remember: "Gaps are okay" spake Kimball, "because surrogate keys should be meaningless") Anyway, what I do is use a modified version that only locks the row long enough to grab a handfull of keys, say 10000 values, and write back a value 10000 greater. The job will work off that internal list, safe in the knowledge that no one else will be assigning in that rate. This means another instantiated clone (Agent Smith) will start from that value+10000 and add 10000 to it and use that range. Once the 10000 keys are exhausted, it will go get some more.
Usage: AssignSkeyParallel("TABLENAME", "BLOCKSIZE", "INCREMENTBY")
Set the hash file name to whatever you want, and make sure you seed the hash file before each run with the current max values on the tables.
Good luck!
What I have done is accept gaps in surrogate key sequencing (remember: "Gaps are okay" spake Kimball, "because surrogate keys should be meaningless") Anyway, what I do is use a modified version that only locks the row long enough to grab a handfull of keys, say 10000 values, and write back a value 10000 greater. The job will work off that internal list, safe in the knowledge that no one else will be assigning in that rate. This means another instantiated clone (Agent Smith) will start from that value+10000 and add 10000 to it and use that range. Once the 10000 keys are exhausted, it will go get some more.
Usage: AssignSkeyParallel("TABLENAME", "BLOCKSIZE", "INCREMENTBY")
Set the hash file name to whatever you want, and make sure you seed the hash file before each run with the current max values on the tables.
Code: Select all
COMMON /SURRKEYLastSurrogateKey/ LastSurrogateKeyUsed, PresizeChunk, KeysToUse, Initialized, F.FILE
FunctionName = "AssignSkeyParallel"
LastSurrogateKeyFile = "TABLE.COUNT"
If Initialized # "INITIALIZED" OR KeysToUse = 0 Then
If Initialized # "INITIALIZED" Then
Call DSLogInfo("Initializing", FunctionName)
OpenFailed = @TRUE
OPEN LastSurrogateKeyFile TO F.FILE Then
OpenFailed = @FALSE
End Else
TCL = "CREATE.FILE ":LastSurrogateKeyFile:" DYNAMIC"
Call DSLogInfo("TCL: ":TCL, FunctionName)
Call DSExecute("TCL", TCL, ScreenOutput, SystemReturnCode)
Call DSLogInfo(ScreenOutput, FunctionName)
OPEN LastSurrogateKeyFile TO F.FILE Then
OpenFailed = @FALSE
End
End
If OpenFailed Then
Call DSLogFatal("Unable to open/create the file ":LastSurrogateKeyFile, FunctionName)
End
Initialized = "INITIALIZED"
Call DSLogInfo("Initialization finished", FunctionName)
End
TableName = UPCASE(TRIM(Arg1))
PresizeChunk = TRIM(Arg2)
If PresizeChunk = "" Then PresizeChunk = 5000
Call DSLogInfo("Retrieving next surrogate key for ":TableName, FunctionName)
RecordLocked = @FALSE
TimeOut = 60 * 5 ; * 60 seconds * 5 minutes
SleepPulse = 15
Loop
ReadU row From F.FILE,TableName LOCKED
TimeOut -= 1
Sleep 1
SleepPulse -= 1
If SleepPulse = 0 Then
Call DSLogInfo("Waiting on last surrogate key for ":TableName, FunctionName)
SleepPulse = 15
End
RecordLocked = @TRUE
End Then
RecordLocked= @FALSE
CONVERT '"' TO '' IN row
End Else
RecordLocked = @FALSE
row = ""
row<1> = 0
End
LastSurrogateKeyUsed = row<1>
Until TimeOut = 0 OR NOT(RecordLocked) Do Repeat
If TimeOut > 0 Then
Call DSLogInfo("Read last surrogate key ":LastSurrogateKeyUsed:" for ":TableName, FunctionName)
NewLastSurrogateKeyUsed = LastSurrogateKeyUsed + PresizeChunk
KeysToUse = PresizeChunk
row<1> = NewLastSurrogateKeyUsed
WRITE row ON F.FILE,TableName
LastSurrogateKeyUsed = LastSurrogateKeyUsed + Arg3
KeysToUse = PresizeChunk - Arg3
Ans = LastSurrogateKeyUsed
End Else
Call DSLogFatal("Unable to obtain lock on ":LastSurrogateKeyFile:" record ":TableName, FunctionName)
End
End Else
*
* Increment the last surrogate key used
*
LastSurrogateKeyUsed = LastSurrogateKeyUsed + Arg3
KeysToUse -= Arg3
Ans = LastSurrogateKeyUsed
End
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle