Schema comparison

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

dodda
Premium Member
Posts: 244
Joined: Tue May 29, 2007 11:31 am

Schema comparison

Post by dodda »

Hello,

I have a requirement where I have to compare two databases: one is a backup database and the other is the production database. We have around 120 tables in each database schema and have to produce a delta by comparing those tables. Is there an optimal way to do this? I am thinking of the Change Capture stage, but do I need to design a single job for each pair of tables? That would be 120 jobs. Is there a way I can compare the whole schema?

Appreciate your help.

Regards
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

What database? Most come with tools, or third-party tools exist, for tasks like this. It would be cumbersome and time-consuming to accomplish this via ETL, I would think. :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
vasubabu
Participant
Posts: 153
Joined: Wed Jan 25, 2006 2:38 am

Post by vasubabu »

chulett wrote:What database? Most come with tools, or third-party tools exist, for tasks like this. It would be cumbersome and time-consuming to accomplish this via ETL, I would think. :? ...

If every table's metadata is the same, you can do this via a multiple-instance job.
Please correct me if I am wrong.

phani
VASU..
dodda
Premium Member
Posts: 244
Joined: Tue May 29, 2007 11:31 am

Post by dodda »

Hello Vasu

Thanks for your reply. Since one is the production database and the other is the backup database, they have the same metadata when comparing a given table, but we have 122 tables and each pair has different metadata.

Thanks
dodda
Premium Member
Posts: 244
Joined: Tue May 29, 2007 11:31 am

Post by dodda »

chulett wrote:What database? Most come with tools, or third-party tools exist, for tasks like this. It would be cumbersome and time-consuming to accomplish this via ETL, I would think. :?
Hello chulett

Yes, it is cumbersome; I certainly agree with that. But is there an optimal option using DataStage ETL? We can't use multiple instances, as the 122 tables have different metadata.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I don't think there will be anything "optimal" about an ETL solution to this problem. This is really a DBA task to validate "backup" databases. And as far as I know, if you really want to go down this road you will need 122 jobs for this... but at least they will all be very very (very) similar. :wink:

Set it up once and then clone, clone, clone.

ps. You still haven't mentioned the database - Oracle, DB2, ??
-craig

"You can never have too many knives" -- Logan Nine Fingers
dodda
Premium Member
Posts: 244
Joined: Tue May 29, 2007 11:31 am

Post by dodda »

chulett wrote:I don't think there will be anything "optimal" about an ETL solution to this problem. This is really a DBA task to validate "backup" databases. And as far as I know, if you really want to go down this road you will need 122 jobs for this... but at least they will all be very very (very) similar. :wink:

Set it up once and then clone, clone, clone.

ps. You still haven't mentioned the database - Oracle, DB2, ??
Hello chulett
Thanks for your reply. The delta will be done on Oracle databases.

Thanks
iDomz
Participant
Posts: 81
Joined: Wed Jul 25, 2007 5:25 am
Location: London

Post by iDomz »

How about using a generic job and schema files for metadata?
dodda
Premium Member
Posts: 244
Joined: Tue May 29, 2007 11:31 am

Post by dodda »

iDomz wrote:How about using a generic job and schema files for metadata?
Hi iDomz

Can you please elaborate on this? How do I use schema files for metadata?
Appreciate your help

thanks
iDomz
Participant
Posts: 81
Joined: Wed Jul 25, 2007 5:25 am
Location: London

Post by iDomz »

Create schema files for all tables
Create a job that does nothing but compare two sets of data (Change Capture or Difference stage)
Turn RCP on
Use a set of common columns as keys - audit columns can be candidate keys
Mark all non-key columns as value columns
Pass the schema file names as parameters to your job

I have not tried this, but in theory it should work if you have common audit keys across both databases. The experts can tell you why it will not :wink:
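For reference, a DataStage schema file is just a plain-text record definition that an RCP-enabled job can load at run time. A hypothetical one (table and column names invented for illustration) might look like:

```
record
  (ORDER_ID: int32;
   CUSTOMER_NAME: varchar[50];
   QTY: int32;
   UPDATED_BY: varchar[20];
   UPDATE_TS: timestamp;
  )
```

You would keep one such file per table and pass its path into the generic compare job as a parameter.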
keshav0307
Premium Member
Posts: 783
Joined: Mon Jan 16, 2006 10:17 pm
Location: Sydney, Australia

Post by keshav0307 »

Yes, this can be done in a single CDC job.
If you are comparing record to record, then read all columns as a single column (by concatenating them). The CDC stage will output the record differences.
dodda
Premium Member
Posts: 244
Joined: Tue May 29, 2007 11:31 am

Post by dodda »

keshav0307 wrote:Yes, this can be done in a single CDC job.
If you are comparing record to record, then read all columns as a single column (by concatenating them). The CDC stage will output the record differences.
Hello Keshav

Thanks for your reply. Can you please explain more elaborately? Since I am using two Oracle stages to compare (delta), how can I pass the schema file as a parameter, and also how do I read all the columns from Oracle as a single column?

Thanks
dodda
Premium Member
Posts: 244
Joined: Tue May 29, 2007 11:31 am

Post by dodda »

keshav0307 wrote:Yes, this can be done in a single CDC job.
If you are comparing record to record, then read all columns as a single column (by concatenating them). The CDC stage will output the record differences.
Hello Keshav,

Since we are comparing 122 tables, we have different metadata for each pair of tables. How can we do that?

Thanks
keshav0307
Premium Member
Posts: 783
Joined: Mon Jan 16, 2006 10:17 pm
Location: Sydney, Australia

Post by keshav0307 »

You didn't read it carefully.
If you are comparing record to record, then read all columns as a single column (by concatenating them). The CDC stage will output the record differences.
That means that, irrespective of the table, there will always be only one column.
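The concatenation idea can be illustrated outside DataStage. Below is a minimal Python sketch (table contents and column names are invented for illustration) of the same pattern: each row is reduced to a key plus one concatenated value string, so identical compare logic works for any table regardless of its column list.

```python
# Emulate the CDC-on-one-column idea: each row becomes (key, concatenated values),
# so one compare routine handles any table, whatever its metadata.

def to_delta_rows(rows, key_cols, delim="|"):
    """Map each row dict to {key tuple: all non-key values joined into one string}."""
    out = {}
    for row in rows:
        key = tuple(str(row[c]) for c in key_cols)
        value_cols = [c for c in sorted(row) if c not in key_cols]
        out[key] = delim.join(str(row[c]) for c in value_cols)
    return out

def delta(before, after, key_cols):
    """Return inserts, deletes, and edits, analogous to Change Capture change codes."""
    b = to_delta_rows(before, key_cols)
    a = to_delta_rows(after, key_cols)
    inserts = sorted(k for k in a if k not in b)
    deletes = sorted(k for k in b if k not in a)
    edits = sorted(k for k in a if k in b and a[k] != b[k])
    return inserts, deletes, edits

# Hypothetical sample data: backup vs. production copies of one table.
backup = [{"ID": 1, "NAME": "a", "QTY": 5}, {"ID": 2, "NAME": "b", "QTY": 7},
          {"ID": 3, "NAME": "c", "QTY": 1}]
prod = [{"ID": 1, "NAME": "a", "QTY": 5}, {"ID": 2, "NAME": "b", "QTY": 9}]

ins, dels, eds = delta(backup, prod, key_cols=["ID"])
print(ins, dels, eds)  # row 3 was dropped, row 2 changed
```

In DataStage terms, the concatenation would be done in a Transformer (or in the Oracle SELECT itself), and the compare is the Change Capture stage keyed on the key column with the concatenated column as the value.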
dodda
Premium Member
Posts: 244
Joined: Tue May 29, 2007 11:31 am

Post by dodda »

keshav0307 wrote:You didn't read it carefully.
If you are comparing record to record, then read all columns as a single column (by concatenating them). The CDC stage will output the record differences.
That means that, irrespective of the table, there will always be only one column.
Hello Keshav,

How do I do that for reading multiple tables, and how do I pass the schema file as a parameter to the Oracle stage?

Thanks
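For illustration, the usual parameterisation pattern (assuming the Oracle stage in your release exposes a Schema File option with RCP enabled - verify this in your version's stage properties; all parameter names below are hypothetical) looks like:

```
Oracle stage (source):
  Read Method   = Table
  Table         = #TableName#
  Options:
    Schema File = #SchemaDir#/#TableName#.schema

Job run (one invocation per table, e.g. driven by a sequence or script):
  TableName = CUSTOMERS
  SchemaDir = /proj/schemas
```

The driving sequence loops over the list of 122 table names, invoking the same generic compare job once per table with the matching schema file.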
Post Reply