Generic stage - usage of "transform" operator

tomasz.koc
Participant
Posts: 7
Joined: Wed Dec 14, 2011 8:48 am
Location: Austria
Contact:


Post by tomasz.koc »

Hi All,

Can anyone give a simple working example of using the "transform" operator in a Generic stage?

I am trying to achieve a simple thing: trim and concatenate two or more columns whose names are known only at runtime.

example inputs:
col1: 'aa '
col2: ' bb'

example outputs:
col1: 'aa '
col2: ' bb'
col3: 'aabb'

I am trying with this code (which I put into generic stage as below):

Code: Select all

transform
-expression 'col3=u_trim_leading_trailing(col1)+u_trim_leading_trailing(col2)'
-flag compileAndRun
I have one input link and one output link on the Generic stage, and I use RCP (runtime column propagation).

This setup gives these problems:

Code: Select all

Warning:Generic_270: The number of reject datasets "0" is less than the number of input datasets "1". [transform/transform.C:1950]
Fatal:Generic_270: Expected semi-colon; [line 1, character 89]. [transform/parse.C:1433]
Usually, what I do is this:
    1. I build a static job using the stage that I later want to replace with a Generic stage (for example, a Lookup stage).
    2. I compile that static job.
    3. I look into "Job properties -> Generated OSH" and copy the code for the lookup operator.
    4. In my target job I use a Generic stage and paste the copied code into the "Stage -> Properties -> Options -> Operator" field.
    5. I make that field parametric, and my whole job is dynamic.
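
Step 5 can be pictured with a small Python sketch. It only mimics how #Name#-style job parameter references in the Operator field text are replaced by values; it is not how DataStage itself is implemented. The function name resolve_job_parameters and the parameter name TransformName are hypothetical, chosen for illustration.

```python
import re

# Matches a job parameter reference of the form #Name# (hypothetical helper pattern).
_PARAM_REF = re.compile(r"#([A-Za-z_][A-Za-z0-9_.]*)#")


def resolve_job_parameters(operator_text, parameters):
    """Return operator_text with every #Name# reference replaced by its value.

    Raises KeyError if a referenced parameter has no value, so that a typo in
    the Operator field is noticed before the job is run.
    """
    def _lookup(match):
        name = match.group(1)
        if name not in parameters:
            raise KeyError(f"no value supplied for job parameter {name!r}")
        return str(parameters[name])

    return _PARAM_REF.sub(_lookup, operator_text)


if __name__ == "__main__":
    # TransformName is an assumed parameter name, not one taken from a real job.
    template = "transform\n-flag run\n-name '#TransformName#'"
    print(resolve_job_parameters(template, {"TransformName": "my_transform"}))
```

Raising on a missing parameter rather than leaving the reference in place is a design choice of this sketch.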
The problem with the generated OSH for a Transformer stage is that it only gives me this:

Code: Select all

transform
-flag run
-name 'V0S285_CopyOfst_SK_update1_Transformer_285'
This means the whole expression logic is in the precompiled file called V0S285_CopyOfst_SK_update1_Transformer_285, which does not help me build a correct script.

I checked the DataStage documentation, but all I found is this:
http://pic.dhe.ibm.com/infocenter/iisin ... sform.html

...and it does not really give an example of a correct "-expression" option.

Any help is greatly appreciated.

Regards
Tomasz
Tomasz Koc
pkll
Participant
Posts: 73
Joined: Thu Oct 25, 2012 9:45 pm

Re: Generic stage - usage of "transform" operator

Post by pkll »

Code: Select all

col3---->col1:col2(concatenate col1 and col2)
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Try adding the trailing semi-colon.

Take a look at the generated OSH or score for a job where a Transformer stage is used. This will show you exactly how the transform operator can be used.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
tomasz.koc
Participant
Posts: 7
Joined: Wed Dec 14, 2011 8:48 am
Location: Austria
Contact:

Post by tomasz.koc »

Hi Ray,

I did try with semicolons.
The code below is one example of the generated OSH for the Generic stage with a semicolon:

Code: Select all

#################################################################
#### STAGE: Generic_270
## Operator
transform
-expression  'NEW_ID=u_trim_leading_trailing(B40OPU)+u_trim_leading_trailing(B40ACC)'
-flag compileAndRun;
## General options
[ident('Generic_270'); jobmon_ident('Generic_270')]
## Inputs
0< [] 'V101:in.v'
## Outputs
0> [] 'Generic_270:to_sort.v'
;
# End of OSH code
But it also gives me an error. It complains about this part:

Code: Select all

[ident('Generic_270'); jobmon_ident('Generic_270')]
That part is added automatically by the OSH generator. The error message is:

Code: Select all

main_program: Syntax error: Expected operator name, got: "[", line 103; text: copy
It looks to me as if there should be only one semicolon at the end of the operator script, and exactly that one is added by the OSH generator.
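
That reasoning can be checked mechanically. In the generated OSH listings in this thread, the operator text comes first, then the general options, inputs and outputs, and the generator closes the step with its own ";". A ";" in the pasted operator text therefore appears to end the step early, which would match the 'Expected operator name, got: "["' message. The Python sketch below is a hypothetical lint helper (find_step_terminator is not a DataStage function). It assumes that semicolons inside single quotes or parentheses are harmless, which is an inference from the operator text posted in this thread and not documented OSH parsing behaviour.

```python
def find_step_terminator(operator_text):
    """Return the index of the first ';' outside quotes and parentheses, or -1.

    Assumption: a ';' inside '...' (an expression string) or inside (...)
    (a schema such as record(...)) does not terminate the OSH step.
    """
    in_quote = False
    depth = 0
    for index, char in enumerate(operator_text):
        if char == "'":
            in_quote = not in_quote      # entering or leaving a quoted string
        elif in_quote:
            continue                     # ignore everything inside quotes
        elif char == "(":
            depth += 1
        elif char == ")":
            depth = max(depth - 1, 0)
        elif char == ";" and depth == 0:
            return index                 # this one would end the step early
    return -1


if __name__ == "__main__":
    pasted = "transform\n-expression 'a=b'\n-flag compileAndRun;"
    position = find_step_terminator(pasted)
    if position >= 0:
        print(f"remove the ';' at offset {position} before pasting")
```

A result of -1 only means this heuristic found nothing to flag; it does not prove that the operator text is valid.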

As I wrote before, I looked at the generated OSH for a job where a Transformer stage is used.

I got this:

Code: Select all

#################################################################
#### STAGE: Transformer_285
## Operator
transform
## Operator options
-flag run
-name 'V0S285_CopyOfst_SK_update1_Transformer_285'
## General options
[ident('Transformer_285'); jobmon_ident('Transformer_285')]
## Inputs
0< [] 'V101:Inrec.v'
## Outputs
0> [] 'Transformer_285:outRec.v'
;
The OSH generator uses the "-name" option, which is not useful to me because it points to a compiled file.
I am trying to use the "-expression" option, which is not very well covered in the documentation.

Below is an extract from the documentation:
http://pic.dhe.ibm.com/infocenter/iisin ... sform.html

Transform: syntax and options

Code: Select all

transform
-fileset fileset_description
-table -key field [ci | cs]
[-key field [ci | cs] ...]
[-allow_dups]
[-save fileset_descriptor]
[-diskpool pool]
[-schema schema | -schemafile schema_file]
[-argvalue job_parameter_name=job_parameter_value ...]
[-collation_sequence locale | collation_file_pathname | OFF]
[-expression expression_string | -expressionfile expressionfile_path]
[-maxrejectlogs integer]
[-sort [-input | -output [port] -key field_name sort_key_suboptions ...]
[-part [-input | -output [port] -key field_name part_key_suboptions ...]
[-flag {compile | run | compileAndRun} [flag_compilation_options]]
[-inputschema schema | -inputschemafile schema_file]
[-outputschema schema | -outputschemafile schema_file]
[-reject [-rejectinfo reject_info_column_name_string]]
[-oldnullhandling]
[-abortonnull]

flag_compilation_options are:
[-dir dir_name_for_compilation] [-name library_path_name]
[-optimize | -debug] [-verbose] [-compiler cpath]
[-staticobj absolute_path_name] [-sharedobj absolute_path_name] [-t options]
[compileopt options] [-linker lpath] [-linkopt options]
Tomasz Koc
tomasz.koc
Participant
Posts: 7
Joined: Wed Dec 14, 2011 8:48 am
Location: Austria
Contact:

Post by tomasz.koc »

Hi All,

After some trial and error, I finally managed to achieve what I needed by putting this code into the Generic stage:

Code: Select all

transform

-expression '
outputname 0 outRec;
mainloop
{
outRec.NEW_ID=u_trim_leading_trailing(col1)+u_trim_leading_trailing(col2);
writerecord 0;
}
'
-inputschema record(col1:ustring[4];col2:decimal[11,0];inRec:*;)
-outputschema record(NEW_ID:ustring[max=1000];outRec:*;)

-flag compileAndRun
-dir '/opt/SW/IBM/InformationServer/somepath/buildop'
-name 'transform_bmf40p'
This translates to the following "generated" OSH:

Code: Select all

#################################################################
#### STAGE: add_fields
## Operator
transform

-expression '
outputname 0 outRec;
mainloop
{
outRec.NEW_ID=u_trim_leading_trailing(col1)+u_trim_leading_trailing(col2);
writerecord 0;
}
'
-inputschema record(col1:ustring[4];col2:decimal[11,0];inRec:*;)
-outputschema record(NEW_ID:ustring[max=1000];outRec:*;)

-flag compileAndRun
-dir '/opt/SW/IBM/InformationServer/somepath/buildop'
-name 'transform_bmf40p'

## General options
[ident('add_fields'); jobmon_ident('add_fields')]
## Inputs
0< [] 'st_tr_syntaxicControl:to_keys.v'
## Outputs
0> [] 'add_fields:outRec.v'
;
The parts under these section headings are generated by DataStage:

Code: Select all

## General options
## Inputs
## Outputs
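
Since the column names are only known at runtime, the operator text above could be produced by a small script and passed in as a job parameter. The following Python sketch is one way to do that, not a tested DataStage utility. The function name build_transform_operator and its arguments are hypothetical; the OSH it emits simply follows the layout of the operator text posted above, and the caller has to supply each column's type, the -dir path and the -name value.

```python
import re

# Conservative check so a column name cannot break out of the expression string.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_transform_operator(columns, lib_dir, lib_name, target="NEW_ID"):
    """Build transform operator text that trims and concatenates `columns`.

    columns is a list of (name, osh_type) pairs, for example
    [("col1", "ustring[4]"), ("col2", "decimal[11,0]")].
    """
    if not columns:
        raise ValueError("at least one input column is required")
    for name, _ in columns:
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unexpected column name: {name!r}")

    # One trimmed term per column, joined with '+', as in the posted expression.
    concat = "+".join(f"u_trim_leading_trailing({name})" for name, _ in columns)
    in_fields = "".join(f"{name}:{osh_type};" for name, osh_type in columns)

    lines = [
        "transform",
        "-expression '",
        "outputname 0 outRec;",
        "mainloop",
        "{",
        f"outRec.{target}={concat};",
        "writerecord 0;",
        "}",
        "'",
        f"-inputschema record({in_fields}inRec:*;)",
        f"-outputschema record({target}:ustring[max=1000];outRec:*;)",
        "-flag compileAndRun",
        f"-dir '{lib_dir}'",
        f"-name '{lib_name}'",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_transform_operator(
        [("col1", "ustring[4]"), ("col2", "decimal[11,0]")],
        lib_dir="/opt/SW/IBM/InformationServer/somepath/buildop",
        lib_name="transform_bmf40p",
    ))
```

The output width ustring[max=1000] is copied from the example above and may need adjusting when more columns are concatenated.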
Regards
Tomasz
Tomasz Koc