
Running a match job using the QS plug-in

Posted: Tue Jan 25, 2005 4:37 pm
by DSkkk
Hi all,

I am able to run my standardization jobs using the QS plug-in for DS.
Now I have to run my Match job using the plug-in.
The problem is that I cannot use both files as input to the QS plug-in, as it does not support reference links.
Also, the report and extract files do not contain any metadata definitions, so how do I proceed with using the plug-in? If anyone has used the plug-in
in a similar case, please guide me through the process.
Thanks.

Posted: Tue Jan 25, 2005 5:15 pm
by vmcburney
The QualityStage plugin does support reference inputs; however, you have to load one of your files into a hash file first. That way DataStage feeds the matching lookup rows to the QualityStage job via hash file lookups.

You can also feed in just the primary data and let QualityStage open the match data file directly from a flat file and not from a DataStage link.

DataStage will receive the output of the matching. You need to define the fields from the input data that will end up in the match report and manually add columns for things like match weight and match result. I don't have access to the exact columns to be added, but you can see them in the QualityStage application when you go through the matching steps.

Posted: Tue Jan 25, 2005 6:46 pm
by PilotBaha
Another way to do this is to combine the data and reference files with some kind of flag (D for data and R for reference, for instance) and let QS split the input file into two. I find this approach much quicker and less of a maintenance nightmare.
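
To make the flag idea concrete, here is a minimal sketch in Python (not DataStage or QualityStage code). It assumes both files share the same column layout, and the function names and sample rows are made up for illustration. It only shows the shape of the combined file and the split that QS would perform on its side.

```python
def combine_with_flag(data_rows, reference_rows):
    """Yield every row prefixed with a record-type flag.

    'D' marks a data (to-be-matched) row, 'R' marks a reference row.
    Assumption: both inputs have the same column layout.
    """
    for row in data_rows:
        yield ["D"] + list(row)
    for row in reference_rows:
        yield ["R"] + list(row)


def split_by_flag(combined_rows):
    """Undo the combination: return (data_rows, reference_rows)."""
    data, reference = [], []
    for flag, *fields in combined_rows:
        (data if flag == "D" else reference).append(fields)
    return data, reference


# Hypothetical sample rows: id, last name, first name
data = [["1", "SMITH", "JOHN"]]
reference = [["100", "SMYTHE", "JON"], ["101", "JONES", "MARY"]]

combined = list(combine_with_flag(data, reference))
data_again, reference_again = split_by_flag(combined)
```

The single combined stream is what would go down the one input link; the split on the flag column happens inside the QS job.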

The match result extract file will not be that much of a problem though, as it is most likely a single file which gets fed into the output link from the QS plugin.

Hi Vmcburney

Posted: Wed Jan 26, 2005 10:06 am
by DSkkk
Hi Vmcburney,

You said that I can feed in just the primary data and let QualityStage open the match data file directly from a flat file and not from a DataStage link.

I see in the documentation for the plug-in that when you run a match, the data file should be bound to a QualityStage stage link, whereas the reference file should be taken from its original location, usually <Master_Project_dir>/Data

Can you kindly explain how I do this? My understanding of what you said is that I just provide the file to be matched as the input to the plug-in, and the reference file has to be opened by QualityStage.
How can I do this?

Thanks.

Posted: Wed Jan 26, 2005 2:42 pm
by ray.wurlod
Does anyone have a preferred strategy for keeping the reference data up to date?

When there's no match, the source data must be pushed through a STAN and then added to the reference data. The problem is when? What if the caller (DataStage job) elects not to commit?

We've implemented a temporary table of new reference data, and the QS process matches against a view that is a UNION of the original reference data and the new. But it seems unwieldy.
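
For anyone trying to picture the temporary-table arrangement, here is a rough sketch using Python's built-in sqlite3 module. It is not our actual schema: the table, view and column names are invented for illustration, and a real implementation would sit in whatever database holds the reference data. New rows go into a pending table, matching reads from a UNION view, and the pending rows are either promoted or discarded once the caller decides whether to commit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical names: the permanent reference set and the pending additions
    CREATE TABLE reference_data    (rec_id TEXT, last_name TEXT, first_name TEXT);
    CREATE TABLE reference_pending (rec_id TEXT, last_name TEXT, first_name TEXT);

    -- The matching process reads from this view, so it sees both sets
    CREATE VIEW reference_all AS
        SELECT rec_id, last_name, first_name FROM reference_data
        UNION
        SELECT rec_id, last_name, first_name FROM reference_pending;
""")

conn.execute("INSERT INTO reference_data VALUES ('R1', 'SMITH', 'JOHN')")
# An unmatched, standardized source row is parked here, not in reference_data
conn.execute("INSERT INTO reference_pending VALUES ('N1', 'JONES', 'MARY')")


def candidates(conn):
    """Return the record ids visible to the matching process."""
    rows = conn.execute("SELECT rec_id FROM reference_all ORDER BY rec_id")
    return [r[0] for r in rows]


def promote_pending(conn):
    """Caller committed: move pending rows into the permanent reference set."""
    with conn:
        conn.execute("INSERT INTO reference_data SELECT * FROM reference_pending")
        conn.execute("DELETE FROM reference_pending")


def discard_pending(conn):
    """Caller did not commit: throw the pending rows away."""
    with conn:
        conn.execute("DELETE FROM reference_pending")


visible_before = candidates(conn)
promote_pending(conn)
visible_after = candidates(conn)
```

The view keeps the matching side unaware of whether a row is permanent or pending, which is the point of the design; the cost is the extra promote/discard step.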

This has to work in an RTI environment too.

The other thing we've done is to create a number of generic columns in the database table (and in the temporary table) in which the standardised values (NYSIIS of last name, first letter of first name, etc.) are stored; these are used to restrict the list of reference candidates, since there are otherwise millions of rows in the reference set.
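
As an illustration of how stored key values cut down the candidate list, here is a small Python sketch. Two loud assumptions: the key function below is a crude stand-in and is NOT NYSIIS, and an in-memory dictionary stands in for the indexed key columns in the database table. All names and sample rows are hypothetical.

```python
from collections import defaultdict


def block_key(last_name, first_name):
    """Build a candidate-restriction key for a name.

    Stand-in only, NOT NYSIIS: keeps the first letter of the last name,
    drops later vowels (and Y), and appends the first-name initial.
    """
    last = last_name.upper()
    skeleton = last[:1] + "".join(
        c for c in last[1:] if c.isalpha() and c not in "AEIOUY"
    )
    return (skeleton, first_name[:1].upper())


# Hypothetical reference rows: id, last name, first name
reference = [
    ("R1", "SMYTHE", "JON"),
    ("R2", "SMITH", "MARY"),
    ("R3", "JONES", "JOHN"),
]

# Equivalent of storing the key in a generic column and indexing it
index = defaultdict(list)
for rec in reference:
    index[block_key(rec[1], rec[2])].append(rec)

# Only rows sharing the incoming record's key are passed on for matching
incoming_key = block_key("Smith", "John")
candidates = index.get(incoming_key, [])
```

With millions of reference rows, the match then only has to score the handful of rows that share the key instead of the whole set.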