Data cleansing


sri75
Premium Member
Posts: 132
Joined: Thu Sep 09, 2004 12:42 pm


Post by sri75 »

Hi,


I have some questions regarding Ascential Integrity Tool.

I think it is used for cleansing data. Could anybody please explain exactly how it works, and how to clean data in text files using this tool?

Thanks
Sri
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Ascential bought the company called Vality that wrote the tool called INTEGRITY. That tool has now been incorporated into the Ascential product suite and is called QualityStage.

There are essentially four phases to data cleansing:
  • investigation (find out what's there, which can be character-based, word-based or can even look at overlapping/redefined fields)
  • standardization (correctly "bucket" data according to rule sets)
  • matching (various kinds, to determine what blocks (groups) of potential duplicates there are)
  • survivorship (to yield the "best of breed" data, again you can specify how, e.g. longest string, most recent, highest information content weight)
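
To make the four phases a little more concrete, here is a rough sketch in plain Python. It is not QualityStage and does not use any of its rule sets or APIs; the sample records, the abbreviation table and the function names are all made up for illustration. In particular, the matching step here is a simple exact-key grouping on standardized values, which is far cruder than what a real matching engine does.

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample records, as they might be read from a text file.
RECORDS = [
    {"id": 1, "name": "John Smith", "addr": "12 Main St.", "updated": "2004-01-10"},
    {"id": 2, "name": "JOHN SMITH", "addr": "12 Main Street", "updated": "2004-06-02"},
    {"id": 3, "name": "Mary Jones", "addr": "7 High Rd", "updated": "2003-11-30"},
]

# Hypothetical rule set: abbreviation -> standard form.
ABBREVIATIONS = {"ST": "STREET", "RD": "ROAD"}


def pattern(value):
    """Reduce a value to a character pattern (A = letter, 9 = digit)."""
    out = re.sub(r"[A-Za-z]", "A", value)
    return re.sub(r"[0-9]", "9", out)


def investigate(records, field):
    """Investigation: count how often each character pattern occurs in a field."""
    return Counter(pattern(rec[field]) for rec in records)


def standardize(value):
    """Standardization: strip punctuation, upper-case, expand abbreviations."""
    tokens = re.sub(r"[^\w\s]", "", value).upper().split()
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in tokens)


def match(records):
    """Matching (simplified): group records whose standardized name and address agree."""
    groups = defaultdict(list)
    for rec in records:
        key = (standardize(rec["name"]), standardize(rec["addr"]))
        groups[key].append(rec)
    return list(groups.values())


def survive(group):
    """Survivorship: keep the most recent record, but the longest address string."""
    newest = max(group, key=lambda rec: rec["updated"])
    return {
        "id": newest["id"],
        "name": standardize(newest["name"]),
        "addr": max((rec["addr"] for rec in group), key=len),
        "updated": newest["updated"],
    }


if __name__ == "__main__":
    print(investigate(RECORDS, "addr"))
    for group in match(RECORDS):
        print(survive(group))
```

With the sample data above, the two John Smith records fall into one group after standardization, and the survivor takes the most recent record's id and date together with the longer address string. Which survivorship rule to apply to which field is a design choice you would make per column.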
For more information look on the Ascential web site and/or enrol in the QualityStage training class.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.