Surrogate Key Generator, prefixing Surrogate Key

fmou · Post by **fmou** » Mon Jun 06, 2011 8:11 pm

Hi,

How can I prefix the Surrogate Key to an existing column instead of adding another column in Surrogate Key Generator? I.e., I want the output field to be

Code: Select all

SK : "," : RecIn

but seems to me that the output field is not editable in Surrogate Key Generator.

Reason behind this request,

I want to define columns, data types, and other format options in the *parallel* Column Import stage, not the Sequential stage, which can improve performance because it separates parsing process from sequential read process.

However, I need to assign a row number in parallel using the Surrogate Key Generator to the record read sequentially.

Please help.

Thanks

ray.wurlod · Post by **ray.wurlod** » Mon Jun 06, 2011 9:09 pm

Not possible in the way you describe. But you can interpolate a Column Generator stage in the flow to generate a constant (a Cycle with only one Value).

fmou · Post by **fmou** » Tue Jun 07, 2011 7:13 am

fmou wrote:I want to define columns, data types, and other format options in the *parallel* Column Import stage, not the Sequential stage, which can improve performance because it separates parsing process from sequential read process.

However, I need to assign a row number in parallel using the Surrogate Key Generator to the record read sequentially.

Actually, the reason that I'm asking was,

from the book that I am reading,

IBM WebSphere DataStage Advanced Enterprise Edition (DX436), Draft - 5/19/2005, on page 280

It says,

Column Import Examples
...
Using a Sequential File stage (running sequentially):
If the incoming file does not have a way of uniquely identifying a
given row, it may be necessary to read the file sequentially, and
assign a row number in parallel using the Surrogate Key
Generator

So I guess that the above saying is wrong then.

jwiles · Post by **jwiles** » Tue Jun 07, 2011 7:54 am

You can always use a transformer to concatenate the surrogate key with whichever column. However, if your intent is to just parse it back out with Column Import, don't even bother with the concatenation. Just let the surrogate key column pass through to the output of Column Import....no need to waste time and CPU cycles concatenating and parsing the same generated column.

Regards,

fmou · Post by **fmou** » Tue Jun 07, 2011 8:27 am

jwiles wrote:. . . if your intent is to just parse it back out with Column Import, don't even bother with the concatenation. Just let the surrogate key column pass through to the output of Column Import....

Thanks for the reply,

Actually that's the first method that I tried, however, due to my limited knowledge to DS, I wasn't able to -- didn't know how to have Column Import handle two fields. Could you elaborate a bit please?

Thanks

jwiles · Post by **jwiles** » Tue Jun 07, 2011 12:39 pm

simply map the surrogate key column from the input link to the output link (as you would in most other stages) on the Output link Mapping tab.

Regards,

ray.wurlod · Post by **ray.wurlod** » Tue Jun 07, 2011 2:49 pm

That is a very old reference!

The way some things work has changed over the years.

fmou · Post by **fmou** » Tue Jun 07, 2011 3:56 pm

Could you elaborate a bit more, Ray, please?

I guess the principle to define columns, data types, and other format options in the *parallel* Column Import stage, not the Sequential stage still hold, just the practice how to add the surrogate key has changed, right?

thanks

fmou · Post by **fmou** » Sat Jun 18, 2011 10:03 pm

jwiles wrote:simply map the surrogate key column from the input link to the output link (as you would in most other stages) on the Output link Mapping tab.

Oh, just figured out why you think that way. No that doesn't work for my case because I'm using schema file as the Column Method of the Column Import stage. I.e., my Output link Mapping was empty. Guess the only way to make it work is to switch to Explicit Column Method then.

ray.wurlod · Post by **ray.wurlod** » Sun Jun 19, 2011 12:09 am

Why not just have the Sequential File stage generate the row number? That would serve as the key you seek.

fmou · Post by **fmou** » Tue Jun 21, 2011 9:51 am

ray.wurlod wrote:Why not just have the Sequential File stage generate the row number? That would serve as the key you seek. ...

Ah, good idea, didn't know that.

Type: Output Column
Adds an unsigned BigInt column with the specified name to the output that contains the row number.

thanks

fmou · Post by **fmou** » Wed Jun 22, 2011 1:53 pm

ray.wurlod wrote:Why not just have the Sequential File stage generate the row number? That would serve as the key you seek. ...

Ops, spoke too soon. Actually, the limitation is at the Column Import stage.

I'm using the schema file as the Column Method of the Column Import stage. I.e., my Output link Mapping was empty. Any columns other than the importing one can not be passed along with the imported ones, correct?

thanks

jwiles · Post by **jwiles** » Wed Jun 22, 2011 2:24 pm

You've asked your sequential file stage to create a record number column. Is it defined in the file stage's output metadata grid (in the stage GUI)?

Is that column defined in your input link metadata grid? If so, you should be able to drag and drop it in the output link mapping tab.

If not, it won't show up as a mappable input column in the stage GUI. Do you have runtime column propagation enabled? If so, you can manually add that row count column to the metadata grid so that you can access it through the GUI.

Regards,

DSXchange

Surrogate Key Generator, prefixing Surrogate Key

Surrogate Key Generator, prefixing Surrogate Key

Re: Surrogate Key Generator, prefixing Surrogate Key