Compare two tables with same structures in different servers

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
paranoid
Premium Member
Premium Member
Posts: 185
Joined: Tue May 29, 2007 5:50 am

Compare two tables with same structures in different servers

Post by paranoid »

Hi,

As part of our testing phase, we would like to compare two tables with identical structures in two different servers.

Please suggest any optimal way of performing this using DS server edition.

Thanks in advance
Sue
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

What kind of "comparison" are we talking about here?
-craig

"You can never have too many knives" -- Logan Nine Fingers
paranoid
Premium Member
Premium Member
Posts: 185
Joined: Tue May 29, 2007 5:50 am

Post by paranoid »

Hi,

Here we are loading the same data in two different tables in two different servers and we would like to compare the data in those tables and see if everything is fine and load the mismatched data into another table.

Thanks
Sue
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Which part do you need help with? Just the comparison? I'd suggest using the CRC32 function to build a checksum over the data fields in each record and then see if the same key in the other file has the same checksum. If not, something is different between the two records.

If you are unfamiliar with checksums, an exact search here for CRC32 should help.
-craig

"You can never have too many knives" -- Logan Nine Fingers
paranoid
Premium Member
Premium Member
Posts: 185
Joined: Tue May 29, 2007 5:50 am

Post by paranoid »

Thanks Chulett for your response. I have already implemented CRC32 in my previous projects and have an idea on it. But we have got around 600 tables to compare and if i implement CRC32, i would be constructing 600 jobs to accomplish that. Is there any other easy way to do so?

Thanks in advance
Sue
BI-RMA
Premium Member
Premium Member
Posts: 463
Joined: Sun Nov 01, 2009 3:55 pm
Location: Hamburg

Post by BI-RMA »

Hi Sue,

you could use a job with a generic change data capture and runtime column propagation. The minimum information you need are the key columns for the comparison.

You will need generic stages with a sort and a hash operator preceding the input links of the generic cdc and you will need a solution to pass the tablename and key-columns to the job (Server-job with a function setting the UserStatus-variable?).

You can then run the comparison of all your tables within a Loop or use a more sophisticated control-structure to allow running some of these (multiple instance) jobs in parallel.

Consult the Advanced Parallel Job Developer Guide for reference on generic operators.

Sorry: this is not suitable for a Server only environment, of course.
"It is not the lucky ones are grateful.
There are the grateful those are happy." Francis Bacon
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

BI-RMA wrote:Sorry: this is not suitable for a Server only environment, of course.
Which means not suitable for Sue.
paranoid wrote:Is there any other easy way to do so?
In Server? No, not that I can think of.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Use a lookup from one table to the other. Perform whatever comparisons you need in the Transformer stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

By "easier" I assumed they meant without having to create 600 jobs.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Is it possible for you to use parallel job? If so there may be easier methods involving Runtime Column Propagation and change detection stage types. You may be able to create just one job.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

From what I recall they don't have PX licensed, hence my earlier comments.
-craig

"You can never have too many knives" -- Logan Nine Fingers
paranoid
Premium Member
Premium Member
Posts: 185
Joined: Tue May 29, 2007 5:50 am

Post by paranoid »

Yes. We dont have PX licensed as well.

But creating 600 jobs takes a lot of effort and time. I guess it can be done using SQL scripts using minus operator. But it will surely eat up server resources.

Thanks
Sue
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Whatever you do - even if it's running a generic job 600 times - will eat up resources. There's no way around that.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply