Page 1 of 1

Compare two tables with same structures in different servers

Posted: Mon May 16, 2011 12:30 am
by paranoid
Hi,

As part of our testing phase, we would like to compare two tables with identical structures in two different servers.

Please suggest any optimal way of performing this using DS server edition.

Thanks in advance
Sue

Posted: Mon May 16, 2011 6:47 am
by chulett
What kind of "comparison" are we talking about here?

Posted: Mon May 16, 2011 6:54 am
by paranoid
Hi,

Here we are loading the same data in two different tables in two different servers and we would like to compare the data in those tables and see if everything is fine and load the mismatched data into another table.

Thanks
Sue

Posted: Mon May 16, 2011 7:08 am
by chulett
Which part do you need help with? Just the comparison? I'd suggest using the CRC32 function to build a checksum over the data fields in each record and then see if the same key in the other file has the same checksum. If not, something is different between the two records.

If you are unfamiliar with checksums, an exact search here for CRC32 should help.

Posted: Tue May 17, 2011 12:53 am
by paranoid
Thanks Chulett for your response. I have already implemented CRC32 in my previous projects and have an idea on it. But we have got around 600 tables to compare and if i implement CRC32, i would be constructing 600 jobs to accomplish that. Is there any other easy way to do so?

Thanks in advance
Sue

Posted: Tue May 17, 2011 2:30 am
by BI-RMA
Hi Sue,

you could use a job with a generic change data capture and runtime column propagation. The minimum information you need are the key columns for the comparison.

You will need generic stages with a sort and a hash operator preceding the input links of the generic cdc and you will need a solution to pass the tablename and key-columns to the job (Server-job with a function setting the UserStatus-variable?).

You can then run the comparison of all your tables within a Loop or use a more sophisticated control-structure to allow running some of these (multiple instance) jobs in parallel.

Consult the Advanced Parallel Job Developer Guide for reference on generic operators.

Sorry: this is not suitable for a Server only environment, of course.

Posted: Tue May 17, 2011 6:36 am
by chulett
BI-RMA wrote:Sorry: this is not suitable for a Server only environment, of course.
Which means not suitable for Sue.
paranoid wrote:Is there any other easy way to do so?
In Server? No, not that I can think of.

Posted: Tue May 17, 2011 3:07 pm
by ray.wurlod
Use a lookup from one table to the other. Perform whatever comparisons you need in the Transformer stage.

Posted: Tue May 17, 2011 5:51 pm
by chulett
By "easier" I assumed they meant without having to create 600 jobs.

Posted: Tue May 17, 2011 5:57 pm
by ray.wurlod
Is it possible for you to use parallel job? If so there may be easier methods involving Runtime Column Propagation and change detection stage types. You may be able to create just one job.

Posted: Tue May 17, 2011 6:01 pm
by chulett
From what I recall they don't have PX licensed, hence my earlier comments.

Posted: Wed May 18, 2011 3:28 am
by paranoid
Yes. We dont have PX licensed as well.

But creating 600 jobs takes a lot of effort and time. I guess it can be done using SQL scripts using minus operator. But it will surely eat up server resources.

Thanks
Sue

Posted: Wed May 18, 2011 4:03 am
by ray.wurlod
Whatever you do - even if it's running a generic job 600 times - will eat up resources. There's no way around that.