performance variance

Posted: Tue May 24, 2011 9:00 pm
by pandeesh
Hi,

I want to know the difference between the below designs:

The job design is simple: extract from a table and load into a dataset.

The source table contains 4 million records and we are using a 2-node configuration.

1) Oracle stage ----> Transformer ----> Dataset

2) Oracle stage ----> Copy ----> Dataset

3) Oracle stage ----> Dataset


Which of these three designs will be the most efficient?

How will their performance compare?

Thanks

Re: performance variance

Posted: Tue May 24, 2011 10:17 pm
by SURA
As far as I know, you won't find much difference in the result; it depends on the data volume.

Since you are writing to a file, I guess you won't find much.

Later on, if you need to do something with the data, the Transformer (TFM) will help.

DS User

Posted: Wed May 25, 2011 12:17 am
by ray.wurlod
(2) and (3) are identical, assuming the Force option is not used in the Copy stage. Adding a Transformer stage, even one that transfers data only, will add a small demand for resources.

Posted: Wed May 25, 2011 12:50 am
by pandeesh
What will be the difference between (2) and (3) if Force is enabled in the Copy stage?

Will there be any difference in runtime?

thanks

Posted: Wed May 25, 2011 2:25 am
by ray.wurlod
Maybe, maybe not. For 0 rows, definitely not. Times are only reported in whole seconds, so there may be no measurable difference for a moderate number of rows either. How many will depend upon how wide the rows are; you did not offer that information.

Posted: Wed May 25, 2011 11:40 pm
by chandra.shekhar@tcs.com
The Transformer (Tfr) is a heavy processing stage when you are dealing with 4 million records.
It will take more time than (2) and (3), though the difference may be only a matter of seconds.
And as everybody has said, (2) and (3) are equal, so I would use option (3).

Posted: Wed May 25, 2011 11:59 pm
by pandeesh
So, what is the importance of the Copy stage?
Where does it play a vital role?

Thanks

Posted: Thu May 26, 2011 12:03 am
by ray.wurlod
It's the cheapest stage for renaming columns, dropping columns, re-ordering columns on the link and executing implicit data type conversions.

It's particularly useful for making copies of its input when you need more than one copy.
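
To make the list above concrete, here is a minimal plain-Python sketch of what a Copy stage does to rows on its links. It is not DataStage code: `copy_stage` and the column names (`id`, `name`, `amount`, `amt`) are hypothetical names made up for illustration, and implicit data type conversion is not modelled.

```python
def copy_stage(rows, output_mappings):
    """Return one list of rows per output link.

    Each mapping is an ordered list of (input_column, output_column) pairs.
    Columns that are not listed are dropped, and the order of the pairs
    sets the column order on that output link.
    """
    outputs = []
    for mapping in output_mappings:
        outputs.append([{out: row[src] for src, out in mapping} for row in rows])
    return outputs


# Hypothetical input rows standing in for the Oracle extract.
rows = [
    {"id": 1, "name": "a", "amount": 10},
    {"id": 2, "name": "b", "amount": 20},
]

link_a, link_b = copy_stage(
    rows,
    [
        # Link A: rename amount -> amt, put it first, drop "name".
        [("amount", "amt"), ("id", "id")],
        # Link B: straight copy of every column.
        [("id", "id"), ("name", "name"), ("amount", "amount")],
    ],
)
```

Each output link gets its own independent set of rows, which is the "more than one copy" case.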

Posted: Thu May 26, 2011 12:09 am
by SURA
You can:
take more than one copy of the input data
shuffle the column order in the metadata
rename columns
drop metadata, etc.

It all depends on what you need to do and where you need to use it!

Example scenario: input data passes from a Copy stage into both an Aggregator stage and a Join stage, and an inner join then combines the data.
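
A rough plain-Python sketch of that scenario, under the assumption that the Aggregator output is joined back to the second copy of the input. This is not DataStage code; `aggregate_sum`, `inner_join` and the columns `cust`, `amount` and `total` are hypothetical names used only for illustration.

```python
from collections import defaultdict


def aggregate_sum(rows, key, value):
    """Stand-in for an Aggregator: sum `value` per `key`."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[value]
    return [{key: k, "total": t} for k, t in totals.items()]


def inner_join(left, right, key):
    """Stand-in for an inner Join: keep left rows whose key exists on the right."""
    lookup = {row[key]: row for row in right}
    return [{**l, **lookup[l[key]]} for l in left if l[key] in lookup]


# Hypothetical input rows.
rows = [
    {"cust": "A", "amount": 10},
    {"cust": "A", "amount": 5},
    {"cust": "B", "amount": 7},
]

# The Copy stage: two independent copies of the same input.
to_aggregator = list(rows)
to_join = list(rows)

totals = aggregate_sum(to_aggregator, "cust", "amount")
joined = inner_join(to_join, totals, "cust")
```

Every detail row ends up carrying the total for its group, which is something a single link could not give you without the fork.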

DS User