Passing transformations dynamically
Moderators: chulett, rschirm, roy
Passing transformations dynamically
Hi Everyone ,
I have the following requirement :
There are some 100 columns and i need to do some transformations on these columns .
col1 , col2 , col3 , col4 ...col100 are columns .
The type of transformations are stored in a table called rule_table .This table will be populated daily . Suppose record1 is coming then it has rule id on the record .Based on this rule id , we will querying the table rule_table based on rule id .
Suppose let us say the record in rule table is :
ruleid ( 100 ) , transformation ( col1 ='A' and col2 ='B' )
ruleid ( 101 ) , transformation ( col10 = 'C' or col59 = 'G' ) and so on .
I need to join those tables based on ruleid and do the transformations based on the transformation column in rule_table .
If incoming record has ruleid =100 then i need to verify
( col = 'A' and col2 = 'B' ) is true or not .Please note that here transformations , number of columns involved in the transformation , name of the columns in the transformations are dynamic .
Could you please tell if this can be acheived through datastage or not ? Please let me know if i have not clearly explained the scenario ?
I have the following requirement :
There are some 100 columns and i need to do some transformations on these columns .
col1 , col2 , col3 , col4 ...col100 are columns .
The type of transformations are stored in a table called rule_table .This table will be populated daily . Suppose record1 is coming then it has rule id on the record .Based on this rule id , we will querying the table rule_table based on rule id .
Suppose let us say the record in rule table is :
ruleid ( 100 ) , transformation ( col1 ='A' and col2 ='B' )
ruleid ( 101 ) , transformation ( col10 = 'C' or col59 = 'G' ) and so on .
I need to join those tables based on ruleid and do the transformations based on the transformation column in rule_table .
If incoming record has ruleid =100 then i need to verify
( col = 'A' and col2 = 'B' ) is true or not .Please note that here transformations , number of columns involved in the transformation , name of the columns in the transformations are dynamic .
Could you please tell if this can be acheived through datastage or not ? Please let me know if i have not clearly explained the scenario ?
Why not ask your friend?
OSH means the Orchestrate Scripts that the PX job 'compiler' creates and which you can find in the directories associated with the job. The scripts run the various 'operators' that each stage represents. My first thought is that a problem with that approach is the 'dynamic transformations' that you want to do would be performed by a transformer and that stage actually compiles down to a C++ executable rather than any script. Second issue would be that you could not generate / manipulate the OSH dynamically, i.e. while the job runs, before it starts perhaps but not while it runs.
You can certainly take a look at some examples from your existing jobs, but I seriously doubt any answer lies down that path.
![Wink :wink:](./images/smilies/icon_wink.gif)
OSH means the Orchestrate Scripts that the PX job 'compiler' creates and which you can find in the directories associated with the job. The scripts run the various 'operators' that each stage represents. My first thought is that a problem with that approach is the 'dynamic transformations' that you want to do would be performed by a transformer and that stage actually compiles down to a C++ executable rather than any script. Second issue would be that you could not generate / manipulate the OSH dynamically, i.e. while the job runs, before it starts perhaps but not while it runs.
You can certainly take a look at some examples from your existing jobs, but I seriously doubt any answer lies down that path.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
Thanks craig . Seeing all the posts related to " rule engine " , It looks like its not that simple to implement .I am still unable to figure it out which technology will be a good option to implement the same . I am sure this type of requirement must have been done before . Any suggestions or directions are truly appreciated .Thanks .
From reading those same posts and looking into this myself some years ago, the conclusion that I and others came to was the need for an external dedicated third-party 'Rules Engine' product. We also decided it wasn't worth the time, effort and money it would take to implement. I don't recall hearing about them for quite some time now, so no clue what the current state of that 'market' is but I would guess you could google around or ask your support provider for suggestions.
After that, it would be all about how difficult it would be to interface with whatever you managed to find from inside a job. Make sure whatever mechanism that turns out to be is something your company has the resources / expertise to support and maintain.
Let us know what you find out, curious now.![Wink :wink:](./images/smilies/icon_wink.gif)
After that, it would be all about how difficult it would be to interface with whatever you managed to find from inside a job. Make sure whatever mechanism that turns out to be is something your company has the resources / expertise to support and maintain.
Let us know what you find out, curious now.
![Wink :wink:](./images/smilies/icon_wink.gif)
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Premium Member
- Posts: 892
- Joined: Thu Oct 16, 2003 5:18 am
DataStage will allow you to dynamically load and execute such rules, but you will be restricted to functions/rules that can be executed within the scope of a modify stage; unfortunately the transform stage is not part of the OSH mechanism directly but gets compiled from c source and is not dynamically interpreted at runtime.
This could be done in a generic stage with calls up the modify operator and needs to be done inside of a DataStage job using runtime column propagation.
This could be done in a generic stage with calls up the modify operator and needs to be done inside of a DataStage job using runtime column propagation.