
Generic stage - usage of "transform" operator

Posted: Tue Apr 16, 2013 2:05 am
by tomasz.koc
Hi All,

Can anyone give a simple working example of using the "transform" operator in a "generic" stage?

I am trying to do something simple: trim and concatenate two or more columns whose names are known only at runtime.

example inputs:
col1: 'aa '
col2: ' bb'

example outputs:
col1: 'aa '
col2: ' bb'
col3: 'aabb'
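
To pin down the intended result, here is a minimal Python sketch of the same logic outside DataStage. The function name trim_concat is made up for illustration, and it assumes that "trim" means removing leading and trailing spaces only. It is not DataStage code, just a statement of the expected behaviour.

```python
def trim_concat(*values):
    """Strip leading and trailing spaces from each value, then join them.

    Hypothetical helper; it mirrors what the transform expression is
    meant to do, assuming only spaces need to be trimmed.
    """
    return "".join(str(v).strip(" ") for v in values)


row = {"col1": "aa ", "col2": " bb"}
# The input columns are left untouched; col3 is added next to them.
row["col3"] = trim_concat(row["col1"], row["col2"])
print(row)
```
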

I am trying this code, which I put into the Generic stage:

Code: Select all

transform
-expression 'col3=u_trim_leading_trailing(col1)+u_trim_leading_trailing(col2)'
-flag compileAndRun
The Generic stage has one input link and one output link, and I use RCP (runtime column propagation).

This setup gives these problems:

Code: Select all

Warning:Generic_270: The number of reject datasets "0" is less than the number of input datasets "1". [transform/transform.C:1950]
Fatal:Generic_270: Expected semi-colon; [line 1, character 89]. [transform/parse.C:1433]
Usually, what I do is:
    1. I build a static job using the stage that I later want to replace with a Generic stage (for example, a Lookup stage).
    2. I compile that static job.
    3. I look in "Job properties -> Generated OSH" and copy the code for the lookup operator.
    4. In my target job I use a Generic stage and paste the copied code into the "Stage -> Properties -> Options -> Operator" field.
    5. I make that field parametric, and my whole job is dynamic.
The problem with "generated OSH" for a transformer stage is that the generated script just gives me this:

Code: Select all

transform
-flag run
-name 'V0S285_CopyOfst_SK_update1_Transformer_285'
This means the whole expression logic is in the precompiled file called V0S285_CopyOfst_SK_update1_Transformer_285, which does not help me build a correct script.

I tried the DS documentation but all I found is this:
http://pic.dhe.ibm.com/infocenter/iisin ... sform.html

...and it does not really give an example of a correct "-expression" option.

Any help is greatly appreciated.

Regards
Tomasz

Re: Generic stage - usage of "transform" operator

Posted: Tue Apr 16, 2013 2:49 am
by pkll

Code: Select all

col3---->col1:col2(concatenate col1 and col2)

Posted: Tue Apr 16, 2013 3:27 am
by ray.wurlod
Try adding the trailing semi-colon.

Take a look at the generated OSH or score for a job where a Transformer stage is used. This will show you exactly how the transform operator can be used.

Posted: Tue Apr 16, 2013 4:12 am
by tomasz.koc
Hi Ray,

I did try with semicolons.
The code below is one example of the generated OSH for the Generic stage with a semicolon:

Code: Select all

#################################################################
#### STAGE: Generic_270
## Operator
transform
-expression  'NEW_ID=u_trim_leading_trailing(B40OPU)+u_trim_leading_trailing(B40ACC)'
-flag compileAndRun;
## General options
[ident('Generic_270'); jobmon_ident('Generic_270')]
## Inputs
0< [] 'V101:in.v'
## Outputs
0> [] 'Generic_270:to_sort.v'
;
# End of OSH code
But it also gives me an error. It complains about this part:

Code: Select all

[ident('Generic_270'); jobmon_ident('Generic_270')]
Which is added automatically by the OSH generator.

Code: Select all

main_program: Syntax error: Expected operator name, got: "[", line 103; text: copy
It looks to me as if there should be only one semicolon, at the end of the operator script, and that one is already added by the OSH generator.

As I wrote before - I looked at the generated OSH for a job where a Transformer stage is used.

I got this:

Code: Select all

#################################################################
#### STAGE: Transformer_285
## Operator
transform
## Operator options
-flag run
-name 'V0S285_CopyOfst_SK_update1_Transformer_285'
## General options
[ident('Transformer_285'); jobmon_ident('Transformer_285')]
## Inputs
0< [] 'V101:Inrec.v'
## Outputs
0> [] 'Transformer_285:outRec.v'
;
The OSH generator uses the "-name" option, which is not useful to me because it points to a compiled file.
I am trying to use the "-expression" option, which is not very well covered in the documentation.

Below is an extract from documentation...
http://pic.dhe.ibm.com/infocenter/iisin ... sform.html

Transform: syntax and options

Code: Select all

transform
-fileset fileset_description
-table -key field [ci | cs]
[-key field [ci | cs] ...]
[-allow_dups]
[-save fileset_descriptor]
[-diskpool pool]
[-schema schema | -schemafile schema_file]
[-argvalue job_parameter_name=job_parameter_value ...]
[-collation_sequence locale | collation_file_pathname | OFF]
[-expression expression_string | -expressionfile expressionfile_path]
[-maxrejectlogs integer]
[-sort [-input | -output [port] -key field_name sort_key_suboptions ...]
[-part [-input | -output [port] -key field_name part_key_suboptions ...]
[-flag {compile | run | compileAndRun} [flag_compilation_options]]
[-inputschema schema | -inputschemafile schema_file]
[-outputschema schema | -outputschemafile schema_file]
[-reject [-rejectinfo reject_info_column_name_string]]
[-oldnullhandling]
[-abortonnull]

flag_compilation_options are:
[-dir dir_name_for_compilation] [-name library_path_name]
[-optimize | -debug] [-verbose] [-compiler cpath]
[-staticobj absolute_path_name] [-sharedobj absolute_path_name] [-t options]
[compileopt options] [-linker lpath] [-linkopt options]

Posted: Wed Apr 17, 2013 2:03 am
by tomasz.koc
Hi All,

After some trial and error I finally managed to achieve what I needed by putting this code into the Generic stage:

Code: Select all

transform

-expression '
outputname 0 outRec;
mainloop
{
outRec.NEW_ID=u_trim_leading_trailing(col1)+u_trim_leading_trailing(col2);
writerecord 0;
}
'
-inputschema record(col1:ustring[4];col2:decimal[11,0];inRec:*;)
-outputschema record(NEW_ID:ustring[max=1000];outRec:*;)

-flag compileAndRun
-dir '/opt/SW/IBM/InformationServer/somepath/buildop'
-name 'transform_bmf40p'
Which translates to this "generated" OSH:

Code: Select all

#################################################################
#### STAGE: add_fields
## Operator
transform

-expression '
outputname 0 outRec;
mainloop
{
outRec.NEW_ID=u_trim_leading_trailing(col1)+u_trim_leading_trailing(col2);
writerecord 0;
}
'
-inputschema record(col1:ustring[4];col2:decimal[11,0];inRec:*;)
-outputschema record(NEW_ID:ustring[max=1000];outRec:*;)

-flag compileAndRun
-dir '/opt/SW/IBM/InformationServer/somepath/buildop'
-name 'transform_bmf40p'

## General options
[ident('add_fields'); jobmon_ident('add_fields')]
## Inputs
0< [] 'st_tr_syntaxicControl:to_keys.v'
## Outputs
0> [] 'add_fields:outRec.v'
;
The parts under these headings are generated by DS; everything else is what I typed into the Generic stage:

Code: Select all

## General options
## Inputs
## Outputs
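
Because the column names are only known at runtime, the operator text has to be produced outside the job and passed in as a parameter. Below is a rough Python sketch of how that text could be generated from a list of columns. The function name build_transform_operator, the placeholder directory '/path/to/buildop' and the library name 'transform_generated' are all made up for illustration. The layout simply copies the working operator text above, so treat it as a starting point rather than a tested recipe.

```python
def build_transform_operator(columns, out_field="NEW_ID",
                             lib_dir="/path/to/buildop",
                             lib_name="transform_generated"):
    """Build the operator text for the Generic stage.

    columns: list of (name, osh_type) pairs, e.g. [("col1", "ustring[4]")].
    lib_dir and lib_name are placeholders (assumptions), not real paths.
    """
    # One u_trim_leading_trailing() call per column, joined with "+".
    expr = "+".join(f"u_trim_leading_trailing({name})" for name, _ in columns)
    # Input schema lists the named columns, then passes the rest through.
    in_fields = "".join(f"{name}:{osh_type};" for name, osh_type in columns)
    lines = [
        "transform",
        "-expression '",
        "outputname 0 outRec;",
        "mainloop",
        "{",
        f"outRec.{out_field}={expr};",
        "writerecord 0;",
        "}",
        "'",
        f"-inputschema record({in_fields}inRec:*;)",
        f"-outputschema record({out_field}:ustring[max=1000];outRec:*;)",
        "-flag compileAndRun",
        f"-dir '{lib_dir}'",
        f"-name '{lib_name}'",
    ]
    return "\n".join(lines)


operator_text = build_transform_operator(
    [("col1", "ustring[4]"), ("col2", "decimal[11,0]")]
)
print(operator_text)
```

The printed text can then be supplied as the value of the job parameter used in the Operator field.
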
Regards
Tomasz