How to Handle Data Quality?

kishorebhulokam
Participant
Posts: 9
Joined: Thu Jul 10, 2008 6:18 pm

Post by kishorebhulokam »

Hi,

Can anyone please tell me how to handle data quality in ETL jobs?

Thanks in advance.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Initial skepticism is what I've always found to work best. Disbelieve any assurance you are given that the data quality is good, and CHECK. Profile the data to learn what's really there (ProfileStage is good here, and AuditStage is good for verifying compliance with business rules*). Based on the profile, define whatever requirements exist for cleansing and standardizing the data, and implement those requirements with DataStage/QualityStage.

* Your organization does have its business rules documented, doesn't it? The process of profiling may well uncover other, undocumented characteristics of data that turn out also to be business rules.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
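
The profile-then-verify approach described above can be illustrated without any particular tool. What follows is only a minimal Python sketch, not ProfileStage, AuditStage, or Information Analyzer functionality: the `customers` sample rows, the `postcode` column, and the "four digits" business rule are all hypothetical, chosen just to show the idea of summarizing what is actually in a column and then listing the rows that break a stated rule.

```python
from collections import Counter


def pattern(value):
    """Reduce a value to a format mask: digits -> 9, letters -> A, other characters kept."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)
    return "".join(out)


def profile(rows, column):
    """Summarize one column: row count, missing values, distinct values, format masks."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v not in (None, "")]
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "patterns": Counter(pattern(v) for v in present),
    }


def violations(rows, rule):
    """Return the rows for which the business rule (a predicate) does not hold."""
    return [row for row in rows if not rule(row)]


# Hypothetical sample data; in practice this would come from the source system.
customers = [
    {"id": "1", "postcode": "2000"},
    {"id": "2", "postcode": "20O0"},  # letter O typed instead of zero
    {"id": "3", "postcode": ""},      # missing value
    {"id": "4", "postcode": "3141"},
]

# Step 1: profile the column to see what is really there.
report = profile(customers, "postcode")

# Step 2: check a documented (here, assumed) business rule: postcode is exactly four digits.
bad = violations(
    customers,
    lambda r: r["postcode"].isdigit() and len(r["postcode"]) == 4,
)

print(report)
print([r["id"] for r in bad])
```

The format masks are what tend to expose surprises: a column that is assured to be numeric but shows a mask such as `99A9` has at least one value that needs cleansing before it is loaded.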
whenry6000
Premium Member
Posts: 129
Joined: Thu Mar 02, 2006 8:28 am

Post by whenry6000 »

Do ProfileStage and AuditStage still exist in 8.0? I thought ProfileStage was replaced by Information Analyzer, and I'm not sure what has replaced AuditStage, if anything.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

My original post was for 7.x. You are correct that ProfileStage no longer exists in 8.0, having been morphed into Information Analyzer. Elements of QualityStage and AuditStage will also migrate into Information Analyzer over the next couple of years. In the meantime, AuditStage continues to exist as a separate product.