Transformer Stage vs Copy Stage

ds_is_fun
Premium Member
Posts: 194
Joined: Fri Jan 07, 2005 12:00 pm

Post by ds_is_fun »

I am creating multiple lookup file sets in a job from a DB2 source. I currently have Transformer stages between the source (DB2) and the targets (lookup file sets).
Since I am not performing any transformations, only copying the data, I am thinking a Copy stage in between would be preferable and would make the job run faster, since the goal of our project is to reduce extraction time.
Any input is appreciated in advance.
Thanks.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

ds_is_fun,

In your case the limiting factor is most likely going to be the database read access, not the method you use to split the data stream and write it to flat files on the server.

The only way to really answer this for your specific implementation is to test timings, and fortunately you have the Row Generator stage to help you create data.

I personally don't know which one will be faster. The Copy stage fits the design paradigm better and looks "cooler", but at the same time the Transformer stage is quite efficient; I'd try it.
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi,
In short: Transformer = less performance, so use it sparingly, but do use it if you need it.

Well, I think I have a better answer: look for the advanced guide in your client docs directory.

IHTH,
Roy R.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Roy & ds_is_fun,

Instead of bouncing questions and opinions back and forth, I decided to try the empirical approach. I tested on a 4-CPU AIX PX installation (yes, I've got a BIG toy to play with here), running simultaneous Copy and Transformer jobs on simple data produced by the Row Generator stage and writing to datasets.

After several runs the average speeds were:

COPY stage 425,000 rows per second
TRANSFORM stage 375,000 rows per second

The Transformer stage ran at about 88% of the Copy stage's speed... I guess that answers the question.
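
For anyone who wants to check the arithmetic on the two averages above, here is a minimal Python sketch. The variable and function names are made up for illustration; only the two rows-per-second figures come from the test runs. It also shows the same gap from the other direction, i.e. how much faster the Copy stage was relative to the Transformer.

```python
# Average throughput figures reported in the test above (rows per second).
copy_rows_per_sec = 425_000
transformer_rows_per_sec = 375_000


def relative_speed(slower: float, faster: float) -> float:
    """Fraction of the faster stage's throughput achieved by the slower one."""
    return slower / faster


def speedup_percent(slower: float, faster: float) -> float:
    """Percentage gain from switching from the slower stage to the faster one."""
    return (faster - slower) / slower * 100


ratio = relative_speed(transformer_rows_per_sec, copy_rows_per_sec)
gain = speedup_percent(transformer_rows_per_sec, copy_rows_per_sec)

print(f"Transformer runs at {ratio:.0%} of Copy speed")  # 88%
print(f"Copy is {gain:.1f}% faster than Transformer")    # 13.3%
```

Note that the two percentages describe the same gap with different baselines, which is worth keeping in mind when comparing figures quoted by different people.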
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

I've seen almost the same results (about 20-33% improvement) when replacing a transformer with a filter stage or a modify stage.