
Profile Stage Vs Quality Stage

Posted: Fri Jul 15, 2005 9:04 am
by chowdary
Hi,


Can anyone explain the difference between Profile Stage and Quality Stage, and the scenarios where they are used?

I would be very thankful for the help.


Thanks
chowdary.

Posted: Fri Jul 15, 2005 11:23 am
by ketfos
Hi,

Look at the link below for detailed information:
http://www.ascential.com/litlib/

ProfileStage is the tool acquired from MetaRecon; it's a data mining and analysis tool.

ProfileStage is more focused on discovery of the data, while AuditStage reports on the patterns present in the data.

AuditStage is the former Quality Manager.

These tools give you the means to do data profiling.

Thanks
Ketfos

Posted: Fri Jul 15, 2005 1:12 pm
by logic
There is a lot of information available in this forum as well as on the Ascential site. Basically, QualityStage is used as a cleansing tool to investigate and standardise data quality. Among its many widely used features, it can also be used for deduplication. And as ketfos has mentioned, ProfileStage is basically a data profiling tool.
Ash.

Posted: Fri Jul 15, 2005 7:35 pm
by ray.wurlod
There is substantial overlap between what ProfileStage (formerly MetaRecon) and AuditStage (formerly Quality Manager) do, but there are enough differences to warrant separate products - though "they" may merge the functionality one day. Both look at the actual data (rather than the metadata) to determine what's really out there, and to look for typical patterns (nulls, cardinality, skewed distributions, and so on).
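To make the profiling idea concrete, here is a minimal plain-Python sketch (not ProfileStage or AuditStage themselves, and `profile_column` is a hypothetical helper name) of the kind of per-column statistics these tools compute from the actual data: null rate, cardinality, and how skewed the value distribution is.

```python
# Sketch of column profiling: null rate, cardinality, distribution skew.
from collections import Counter

def profile_column(values):
    """Summarize one column of raw data values."""
    non_null = [v for v in values if v not in (None, "")]
    counts = Counter(non_null)
    total = len(values)
    return {
        # fraction of rows with no value at all
        "null_rate": (total - len(non_null)) / total if total else 0.0,
        # number of distinct non-null values
        "cardinality": len(counts),
        # share of rows taken by the single most common value;
        # a high number suggests a skewed distribution
        "top_value_share": max(counts.values()) / len(non_null) if non_null else 0.0,
    }

print(profile_column(["AVE", "AVE", "ST", None, "AVE"]))
```

Running this over every column of a table gives a rough picture of "what's really out there" before any cleansing is attempted.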

QualityStage (formerly Vality INTEGRITY) is totally different. It performs some or all (your choice) of four separate tasks:
  • investigation, at both character and word level, and free format, which means you can find data that overlap fields or are in the wrong fields

  • standardization, essentially moving data into the correct fields and generating standard forms (for example AV, AVE and AVENUE are all output as AVE, per your business rules), along with Soundex, NYSIIS and reverse Soundex forms, which are better for fuzzy matching

  • matching, which involves identifying potentially duplicate records using probabilistic (rather than deterministic) methods, "blocking" them into groups, assigning match weights, and allowing statistical cutoffs to be used to identify true matches, true non-matches and the grey area in between

  • survivorship, in which "best of breed" data survive from each block of potential duplicates, for example the most frequently occurring value, the longest string, and so on
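The list above can be illustrated with a rough plain-Python sketch (this is not QualityStage, and all the function names here are made up for illustration): standardization via a lookup of standard forms, fuzzy matching via Soundex codes as a crude stand-in for probabilistic matching, and survivorship via a simple longest-string rule.

```python
# Sketch of standardization, Soundex fuzzy matching, and survivorship.

def soundex(word):
    """Classic Soundex code: first letter plus up to three digits."""
    codes = {"bfpv": "1", "cgjkqsxz": "2", "dt": "3",
             "l": "4", "mn": "5", "r": "6"}
    word = word.upper()
    result, prev = word[0], ""
    for ch in word[1:]:
        digit = next((d for k, d in codes.items() if ch.lower() in k), "")
        if digit and digit != prev:
            result += digit
        prev = digit
    return (result + "000")[:4]

# Standardization: map variant spellings onto one standard form
# (the AV / AVE / AVENUE rule from the post above).
STANDARD_FORMS = {"AV": "AVE", "AVE": "AVE", "AVENUE": "AVE"}

def standardize(token):
    return STANDARD_FORMS.get(token.upper(), token.upper())

# Matching: treat two names as candidate duplicates when their
# Soundex codes agree (real matching would assign weights and
# apply statistical cutoffs rather than a yes/no test).
def candidate_match(a, b):
    return soundex(a) == soundex(b)

# Survivorship: from a block of potential duplicates, keep one
# "best of breed" value - here, simply the longest string.
def survive(block):
    return max(block, key=len)

print(standardize("Avenue"))
print(candidate_match("Smith", "Smyth"))
print(survive(["Bob", "Robert", "Rob"]))
```

The real product is far richer (match weights, blocking strategies, configurable survivorship rules), but the four-step pipeline is the shape of it.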