Posted: Tue Jan 23, 2007 4:45 am
by kumar_s
In that case Log might not help.
But you would love to use WebSphere Replication Server for exactly this purpose.

Posted: Thu Jan 25, 2007 1:50 pm
by nvalia
It has 150 columns averaging 25 characters each; some of those 150 columns would be numeric.
So around 3500 characters per row.

Posted: Thu Jan 25, 2007 1:51 pm
by nvalia
Well, we are actually trying to replace the current Sybase Replication process with ETL. That is the challenge.

Posted: Fri Jan 26, 2007 3:52 pm
by nvalia
The record size could be around 500-600 characters (15 columns).
We have 293 million rows of this size, and we need to find the changed data using Change Capture.
Maybe around 450 GB of data.

So there will be 2 datasets with this many rows. We have a maximum of 8 nodes at our disposal.

So roughly how much time will we need for this exercise?
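
For readers unfamiliar with the comparison being asked about, here is a minimal Python sketch of key-based change detection between a "before" and an "after" dataset. It only illustrates the idea; it is not how the DataStage Change Capture stage is implemented, and the names `row_digest`, `change_capture`, `key_col` and `value_cols` are made up for this example. It also holds everything in memory, which would not be realistic at 293 million rows.

```python
import hashlib


def row_digest(row, value_cols):
    """Hash the compared columns so wide rows can be checked via one value."""
    joined = "\x1f".join(str(row[c]) for c in value_cols)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def change_capture(before, after, key_col, value_cols):
    """Classify keys as insert, edit, or delete by comparing two datasets.

    `before` and `after` are iterables of dicts (hypothetical row format).
    """
    # Index the "before" dataset: key -> digest of the compared columns.
    before_idx = {r[key_col]: row_digest(r, value_cols) for r in before}

    changes = {"insert": [], "edit": [], "delete": []}
    seen = set()
    for r in after:
        key = r[key_col]
        seen.add(key)
        old_digest = before_idx.get(key)
        if old_digest is None:
            changes["insert"].append(key)      # key only in "after"
        elif old_digest != row_digest(r, value_cols):
            changes["edit"].append(key)        # key in both, values differ
    # Keys that never appeared in "after" were deleted.
    changes["delete"] = [k for k in before_idx if k not in seen]
    return changes
```

On a parallel engine the same comparison would normally be done on data partitioned and sorted by key, so that each node only compares its own slice.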

Posted: Fri Jan 26, 2007 4:00 pm
by DSguru2B
Take a million rows and run your benchmark. Increase the volume by factors of 10 until you have read maybe 1/3 of your data, noting the performance change at each step. This will give you a fairly accurate approximation of how long the full run will take. It's hard to guess the time frame without knowing details about your environment.
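
One way to turn those benchmark runs into an estimate is sketched below in Python. It assumes run time follows roughly a power law in the row count (seconds = a * rows^b) and fits that on a log-log scale; that model is an assumption of this sketch, not something the tool guarantees, and the function names are hypothetical. The timings in the usage lines are placeholders, not measurements.

```python
import math


def fit_power_law(samples):
    """Least-squares fit of log(seconds) = log(a) + b * log(rows).

    `samples` is a list of (rows, seconds) pairs from benchmark runs.
    Returns (a, b) for the model seconds = a * rows ** b.
    """
    xs = [math.log(rows) for rows, _ in samples]
    ys = [math.log(secs) for _, secs in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def estimate_seconds(samples, target_rows):
    """Extrapolate the fitted curve to the full row count."""
    a, b = fit_power_law(samples)
    return a * target_rows ** b


# Placeholder timings only; replace with your own measured runs.
placeholder_runs = [(1_000_000, 10.0), (10_000_000, 100.0), (100_000_000, 1000.0)]
full_run_estimate = estimate_seconds(placeholder_runs, 293_000_000)
```

If the fitted exponent b comes out well above 1, the job is scaling worse than linearly (for example, sorts spilling to disk), and the extrapolation should be treated with extra caution.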

Posted: Fri Jan 26, 2007 6:23 pm
by ray.wurlod
Forty Two

:lol:

Posted: Fri Jan 26, 2007 7:33 pm
by DSguru2B
There you go, you wanted a number, and now you have one from Ray. :lol:

Posted: Sun Jan 28, 2007 5:52 pm
by vmcburney
I would be surprised if anything in an ETL tool could match the performance and efficiency of a native Sybase replication tool for the straight replication of data. ETL really comes into its own when you are using the "T" in ETL. If you are doing straight CDC with no transformation/consolidation/cleansing, then DataStage with the parallel CDC stage is a rather kludgy way to do it. ETL development carries a lot of overhead that you wouldn't incur with a straight replication tool.

So if you are going from Sybase to a data warehouse, I would say Sybase replication combined with DataStage is a good option for keeping data volumes down. DataStage with the CDC stage is a good option for keeping it all in one tool. If you are doing straight table copies, then Sybase Replication is hard to beat.