Hi,
We are struggling to decide whether to go with the Audit stage or the normal Transformer approach.
In our case, business rules will keep being added and will eventually be huge. The options we are considering are:
1. Use the Audit stage, define the rules as filters, generate an exception table, and then use that exception table together with the source (maybe a staging table) to generate the target table.
2. Use a Transformer and filter (WHERE clause) at the source.
We think option 1 is very easy from a maintenance point of view, but we suspect it is very performance intensive.
With option 2, maintainability is hectic.
Looking forward to your suggestions.
Thanks,
Madhav
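To make the trade-off concrete, here is a minimal sketch (in Python, outside DataStage) of what option 1's data-driven approach looks like: rules are kept as a table of named checks, failing rows are routed to an exception set, and clean rows flow on to the target. The rule names and sample columns here are invented for illustration only.

```python
# Rules as data: adding a business rule means adding an entry here,
# not editing a WHERE clause (this is the maintainability win of option 1).
# Rule names and columns are hypothetical examples.
rules = {
    "AMOUNT_POSITIVE": lambda row: row["amount"] > 0,
    "CUST_ID_PRESENT": lambda row: bool(row["cust_id"]),
}

def split_rows(rows):
    """Route each row to the clean set or the exception set."""
    clean, exceptions = [], []
    for row in rows:
        failed = [name for name, check in rules.items() if not check(row)]
        if failed:
            # Exception table keeps the row plus the list of failed rules.
            exceptions.append({**row, "failed_rules": failed})
        else:
            clean.append(row)
    return clean, exceptions

source = [
    {"cust_id": "C1", "amount": 100},
    {"cust_id": "",   "amount": -5},
]
clean, exceptions = split_rows(source)
```

The performance cost the poster worries about comes from evaluating every rule against every row and then re-joining the exception table back to the source; option 2 pushes that work into the Transformer (or the database) instead.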
Pros & Cons of Audit stage
We went with business rules and constraint filters written in Transformers, directing messages to sequential files. These files were then collected and loaded into a couple of custom message tables. From these we can peruse messages for failed or dirty rows and get metrics. We audit the data as it moves through the ETL jobs. We use stage variables for most of these checks and have a standard output format so all jobs produce the same set of message columns.
Stage variables help you organise your rules within a single Transformer at the end of a DataStage job. You can check and output multiple rule messages. Good naming conventions, standard error codes, and standard checks help keep it organised.
We have the ability to decide whether a rule is KEEP or DROP for each row. Sometimes we want to report a rule failure but still deliver the row to the target.
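The KEEP/DROP pattern above can be sketched as follows (Python, as an illustration of the idea rather than the poster's actual DataStage design): every failed rule emits a message record with a standard set of columns, but only DROP rules remove the row from the target. The rule codes, names, and columns are hypothetical.

```python
# Each rule: (code, name, action, check). Action decides whether a failure
# only reports (KEEP) or also removes the row from the target (DROP).
# These rules and codes are invented for illustration.
RULES = [
    ("E001", "AMOUNT_POSITIVE", "DROP", lambda r: r["amount"] > 0),
    ("W001", "REGION_KNOWN",    "KEEP", lambda r: r["region"] in {"EU", "US"}),
]

def audit(rows, job_name="load_sales"):
    """Deliver rows to the target; emit one standard message per rule failure."""
    target, messages = [], []
    for row in rows:
        deliver = True
        for code, name, action, check in RULES:
            if not check(row):
                # Standard message columns, same shape from every job.
                messages.append({"job": job_name, "code": code,
                                 "rule": name, "action": action,
                                 "key": row["key"]})
                if action == "DROP":
                    deliver = False
        if deliver:
            target.append(row)
    return target, messages
```

A row failing only a KEEP rule still reaches the target but leaves an audit trail in the message table.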
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn
At one site where I worked late last year the requirement was similar to yours. Indeed, the business rules depended on what data were available in addition to changing over time. We implemented a "late binding" approach, where a table-driven approach was used to select the appropriate business rule, which was then executed via a "rule dispatcher" routine using "indirect CALL" (which you can read about in the DataStage BASIC manual - it's the CALL @subrname(args) syntax).
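A rough Python analogue of that table-driven "late binding" dispatcher (in DataStage BASIC it used the `CALL @subrname(args)` indirect-call syntax) might look like this. The driver table, rule names, and check functions here are all invented for illustration; in the real implementation the driver table lived in the database and the subroutines were BASIC routines.

```python
# Hypothetical check subroutines; in the original these were BASIC routines.
def check_not_null(value):
    return value is not None and value != ""

def check_positive(value):
    return isinstance(value, (int, float)) and value > 0

# Registry mapping subroutine names (as stored in the driver table) to code.
SUBROUTINES = {
    "check_not_null": check_not_null,
    "check_positive": check_positive,
}

# Driver table: which rule applies to which column. In the real system this
# would be read from a database table, so rules can change without a rebuild.
DRIVER = [("cust_id", "check_not_null"), ("amount", "check_positive")]

def dispatch(row):
    """Resolve each rule by name at run time and collect failures."""
    failures = []
    for column, subr_name in DRIVER:
        subr = SUBROUTINES[subr_name]  # late binding: looked up, not hard-coded
        if not subr(row.get(column)):
            failures.append((column, subr_name))
    return failures
```

The point of the design is that adding or changing a business rule is a data change (a driver-table row plus, at most, one new routine), not an edit to every job.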
Needless to say, this was in a server job environment, not a parallel job environment. Still handling millions of rows, however.
I have never used AuditStage in-line (wasn't aware that it was easily possible), but I have used QualityStage in-line to perform data cleansing and resultant validation.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.