6 Transformers Or 60 Stage Variables
Hi,
I am at a crossroads, choosing between two designs.
Problem:
From a database I will receive 6 different types of levels. Based on the level (e.g. 010, 050, 030, 060), I need to:
a. Split the string into multiple columns
b. Validate each of these columns and concatenate all the error messages into a single string for each level
I can achieve this with either of the following designs:
Solution:
1. Use 60 stage variables to perform the validation
or
2. Use 6 transformers, one per level, each performing that level's validation and building the ErrorMessage string
My question is, which is the better approach?
1. Have 60 stage variables
2. Have 6 transformers
Please suggest.
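For readers trying to picture the single-transformer option, it amounts to one pass over each record, with every validation playing the role of a stage variable and the messages joined at the end. The sketch below is a Python illustration of that shape, not DataStage code. The level codes come from the example above, but the "|" delimiter, the field names and the validation rules are all hypothetical.

```python
# Illustrative only: the "|" delimiter, field layouts and rules are assumptions,
# not taken from the original post.
from typing import Callable, Dict, List, Tuple

# (field name, rule returning True when the value is valid, error message)
Check = Tuple[str, Callable[[str], bool], str]

# One list of checks per level code; each check stands in for one stage variable.
CHECKS_BY_LEVEL: Dict[str, List[Check]] = {
    "010": [
        ("account_id", lambda v: v.isdigit(), "account_id must be numeric"),
        ("name", lambda v: v.strip() != "", "name is required"),
    ],
    "050": [
        ("amount", lambda v: v.replace(".", "", 1).isdigit(), "amount must be a number"),
    ],
}


def validate(level: str, payload: str, delimiter: str = "|") -> str:
    """Split the payload for this level and return all error messages as one string."""
    checks = CHECKS_BY_LEVEL.get(level)
    if checks is None:
        return f"unknown level {level}"
    values = payload.split(delimiter)
    if len(values) != len(checks):
        return f"expected {len(checks)} fields, got {len(values)}"
    # Evaluate every check for the row and keep the messages of the failing ones.
    errors = [msg for (_, rule, msg), value in zip(checks, values) if not rule(value)]
    return "; ".join(errors)
```

An empty return value means the record passed every check for its level. Keeping the rules in one table per level is roughly what meaningful stage variable names buy you in the transformer: each rule can be read and changed on its own.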
I'd advocate 60 meaningfully-named stage variables, then monitor the stage to determine its resource consumption. If that's less than 100%, leave it alone. Otherwise break it into two and monitor again. Repeat until no process is demanding more than 100% of one CPU.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
I would go so far as to say 60 well-named stage variables with well-thought-out and efficiently-written derivations. Sure wouldn't want to see anything like this:
if input_link.thedate = StringToDate('1299-01-01') then 'NULL' else if input_link.thedate >= StringToDate('2000-01-01') and input_link.thedate <= StringToDate('2010-12-31') then 'VALID' else 'INVALID'
On the other hand, this type of code does generate billable work for me
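The post does not say what is wrong with that derivation. One plausible reading is that the three constant strings are converted to dates again for every row. A minimal Python sketch of the alternative follows, assuming that reading: the constants are converted once and the per-row logic only compares. The names are hypothetical, and the idea of holding such constants in stage variables that are set once is a suggestion, not something stated in the thread.

```python
from datetime import date

# Constants converted once, outside the per-row function. In a transformer the
# analogue would be values held in stage variables rather than re-derived per row
# (an assumption about the intended fix, not a quote from the thread).
NULL_SENTINEL = date(1299, 1, 1)
RANGE_START = date(2000, 1, 1)
RANGE_END = date(2010, 12, 31)


def classify_date(the_date: date) -> str:
    """Return 'NULL', 'VALID' or 'INVALID' using the same rules as the derivation above."""
    if the_date == NULL_SENTINEL:
        return "NULL"
    if RANGE_START <= the_date <= RANGE_END:
        return "VALID"
    return "INVALID"
```

The result is the same as the nested if/then/else; only the repeated conversions are gone.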
- james wiles
All generalizations are false, including this one - Mark Twain.
daignault wrote: Every time you use a transformer stage, the data is exported, processed by the C++ code in the transformer and re-imported with data validation into the Datastage run machine.
Ray D
At one time maybe so, but not since at least 7.5 and maybe 7.0 IIRC. Transformer-generated code is framework-native and while it can't match hand-rubbed C++ custom operators, it can be pretty darn good when well written. Most performance issues I see with transformers are due to poor derivation logic and/or job design practices by developers.
Although, writing 6 separate transformers wouldn't be recommended even now. That's just extra stages for the data to be transported between and to pass through.
Regards,
- james wiles
All generalizations are false, including this one - Mark Twain.
abc123 wrote: jwiles, just 2 questions about your comment.
1) What do you mean by "framework-native"?
2) Are you saying that after DS 7.5, a job with a transformer is not interpreted into C++ under the hood?
1: Maybe not the best term, but: The executable created by the transformer compilation is not isolated from the Orchestrate framework by means of the export/import process as an external source would be, or a wrappered operator somewhat is. It can also be combined with other operators by Orchestrate at runtime.
2: No, I am not saying that. The compilation process converts the logic in transformer derivations to the Orchestrate transform language (a subset of C++), then plugs that into a C++ operator framework and compiles the resulting operator. You still are required to have a supported C++ compiler in order to use transformers in a parallel job.
And yes, you still see the transform operator named in your OSH. Many improvements have been made to the operator and the generated code to eliminate the notoriously slow performance of the earlier releases. A well-written transformer can have performance that approaches that of some native stages. I just really hate to hear that old paradigm still taught as the absolute truth.
Regards,
- james wiles
All generalizations are false, including this one - Mark Twain.