DSXchange

Posted: **Tue Apr 03, 2012 8:21 am**

Hi,

I have to generate SurrogateKey. In S2T they did not mention any thing.

1.NextSurrogateKey() function in transformer.
2. SurrogateKey stage
3.( (@INROWNUM - 1) * @NUMPARTITIONS + @PARTITIONNUM + 1)

Can you suggest me which one is best approach in above list.

Posted: **Tue Apr 03, 2012 5:09 pm**

"Best" in what sense?

Posted: **Tue Apr 03, 2012 8:14 pm**

Hi ray,

in terms of performence.

and also could you plesae explain what is difference between three approches.

Posted: **Tue Apr 03, 2012 9:34 pm**

Define "performance" in an ETL context.

Rows/sec is not a good metric, due to the huge number of variables, not least row size, buffering, and the like.

NextSurrogateKey() function and Surrogate Key Generator stage are functionally identical - they get the next key value from a set maintained in either a state file or a database sequence.

The expression makes use of some system variables that are automatically assigned values.

If I were to be asked my opinion I would opt for the expression being quickest to execute, because the other two have to perform at least some I/O. But in a large data set there wouldn't be a lot in it, as caching effects would probably kick in. But I have no evidence upon which to base that opinion.

Posted: **Wed Apr 04, 2012 8:22 am**

Keep in mind the fact that 'the expression' as show has no 'pick up where it left off last time' logic nor the ability to handle concurrency.

Posted: **Wed Apr 04, 2012 3:19 pm**

True, but those are correctness issues, not performance issues!