surrogate key generator

Posted: Tue Feb 07, 2012 9:45 am
by karthi_gana
All,

I have designed a simple job to get familiar with the surrogate key generator stage.

seq file --> surrogate_key_generator --> seq_file

columns:
name --> (surrogate_key_generator) --> seq, name

surrogate key properties:

source_type = flat file
source name = path of my input file with the file name.
Generated Output column name = seq

seq_file content:

karthik
gana
mani
ray
chulet

My requirement is to generate the output below using the surrogate key generator stage.

1 karthik
2 gana
3 mani
4 ray
5 chulet
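
For reference, the requirement amounts to attaching a running number, starting at 1, to each input row. A minimal Python sketch of that target output, outside DataStage; the helper name `number_rows` is made up for illustration:

```python
def number_rows(names, start=1):
    """Pair each name with a sequential key, beginning at `start`."""
    return [(seq, name) for seq, name in enumerate(names, start=start)]


# Input rows as listed in the post above.
names = ["karthik", "gana", "mani", "ray", "chulet"]
for seq, name in number_rows(names):
    print(seq, name)
```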

When I run the job, it fails with the message below.

Surrogate_Key_Generator_0,0: Unable to lock state file /bis_data/msg/mfr/surr_key_test.txt: Input/output error.

Note that I am able to see the file content when I click 'view file' in the first seq_file.

Re: surrogate key generator

Posted: Tue Feb 07, 2012 10:03 am
by karthi_gana
What I have learned from here is that the "state file" is a file that maintains the sequence number values (again, not sure), and that the extension of this file is .sk.

Is that correct? If yes, where can I see that .sk file?

Note: I changed the partition mode to "Sequential" in the surrogate key generator stage and ran the job again. I got the same error message as above.

Posted: Tue Feb 07, 2012 10:52 am
by rupeshg
#1 You first have to generate the .sk file with an initial value. You can do this with:

row generator ----> surrogate key generator stage


row generator:-
1 row
column- seq
data type- integer
initial value=0

Surrogate key stage:-
input column name- seq
action- Create and Update
Source name- /<full path>/<filename>.sk
source type- Flat file

Run this job to create surrogate key file.

#2 Generating sequences from .sk file created above -
Try this -

row generator---->surrogate key stage---->peek stage


row generator:-
10 rows
column- name
datatype- char

surrogate key stage:-
generated output column- seq
source name- /<full path>/<filename>.sk
under options:
File Initial Value=1
File block size=user specified
User-specified block size=1

Run the above job and you should see this output-

Peek_6,0: seq:1 name:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
Peek_6,1: seq:2 name:bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
Peek_6,0: seq:3 name:cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Peek_6,1: seq:4 name:dddddddddddddddddddddddddddddddddddddddddddddddddddd
Peek_6,0: seq:5 name:eeeeeeeeeeeeeeeeeeeeeeee
Peek_6,1: seq:6 name:ffffffffffffffffffffffffffffffffffff
Peek_6,0: seq:7 name:ggggggggggggggggggg
Peek_6,1: seq:8 name:hhhhhhhhhhhhhhhhhh
Peek_6,0: seq:9 name:iiiiiiiiiiiiiii
Peek_6,1: seq:10 name:jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj


There you go !!
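
To see why a block size of 1 gives consecutive keys across the two partitions in the peek output, here is a rough conceptual model in Python. It is only a sketch under assumptions: the plain-text state file, the names `next_block` and `simulate`, and the file name `demo_state.sk` are all made up for illustration, the real state file format is internal to DataStage, and no file locking is modelled.

```python
import os
import tempfile


def next_block(state_path, block_size=1, initial_value=1):
    """Reserve `block_size` consecutive keys and record the new high-water mark.

    The state file here is plain text holding the last key handed out.
    This is an illustrative format only, not the one DataStage uses.
    """
    if os.path.exists(state_path) and os.path.getsize(state_path) > 0:
        with open(state_path) as f:
            last = int(f.read().strip())
    else:
        last = initial_value - 1  # missing or empty file: start at initial_value
    new_last = last + block_size
    with open(state_path, "w") as f:
        f.write(str(new_last))
    return list(range(last + 1, new_last + 1))


def simulate(names, partitions=2, block_size=1):
    """Deal rows round-robin to partitions; a partition fetches a new block
    of keys from the shared state file whenever its current block runs out."""
    with tempfile.TemporaryDirectory() as tmp:
        state_path = os.path.join(tmp, "demo_state.sk")  # hypothetical file name
        pending = [[] for _ in range(partitions)]
        rows = []
        for i, name in enumerate(names):
            part = i % partitions
            if not pending[part]:
                pending[part] = next_block(state_path, block_size)
            rows.append((part, pending[part].pop(0), name))
        return rows


for part, seq, name in simulate(["aaa", "bbb", "ccc", "ddd"]):
    print(f"partition {part}: seq:{seq} name:{name}")
```

In this model, calling `simulate(..., block_size=2)` gives partition 0 the keys 1 and 2 and partition 1 the keys 3 and 4, so the keys are no longer consecutive in row order.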

Posted: Tue Feb 07, 2012 9:42 pm
by kandyshandy
Depending on the requirements, you can also use the touch command to create the state file the first time ;)
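
For anyone scripting this step, the same "create an empty file if it does not exist yet" effect can be had from Python. A small sketch; the function name `create_empty_state_file` and the file name `surr_key_demo.sk` are hypothetical, and a temporary directory stands in for the real state file path:

```python
import tempfile
from pathlib import Path


def create_empty_state_file(path):
    """Create an empty file at `path` if absent, like `touch`; return the Path."""
    p = Path(path)
    p.touch(exist_ok=True)  # leaves the content of an existing file untouched
    return p


with tempfile.TemporaryDirectory() as tmp:
    state = create_empty_state_file(Path(tmp) / "surr_key_demo.sk")  # hypothetical name
    print(state.exists(), state.stat().st_size)
```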

Surrogate key generator stage

Posted: Wed Feb 15, 2012 4:18 am
by sunnygupta
Hi everyone,

I am unable to run a job that has surrogate_key_generator in sequential mode.

The error is: Input data set on port 0 has a partition method, but the operator is not parallel.

I have changed preserve partitioning to "clear" but still get the same error.

Can we run it sequentially or not? If so, please provide the solution. The job is:

seqfile------->surr_key_generator---------->targetseqfile

Thanks in advance.

Posted: Wed Feb 15, 2012 3:22 pm
by ray.wurlod
The extension is not required. Put another way, no extension is required for the state file name.