I am creating multiple lookup filesets in a job sourced from DB2. I currently have Transformer stages between the source (DB2) and the targets (lookup filesets). Since I am not performing any transformations, just making a straight copy, I am thinking a Copy stage in between would be preferable and would make the job run faster, since the goal of our project is to reduce extraction time.
Any inputs appreciated in advance.
Thanks.
Transformer Stage vs Copy Stage
ds_is_fun,
In your case the limiting factor is most likely going to be the database read access, not the methodology you use to split the data stream and write it to flat files on the server.
The only way to really answer this for your specific implementation is to test timings - and fortunately you have the Row Generator stage to help you create data.
I personally don't know which one will be faster; the Copy stage fits the design paradigm better and looks "cooler", but at the same time the Transformer stage is quite efficient. I'd try both.
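To illustrate the "test timings" advice outside of DataStage (a hypothetical sketch in plain Python, not DataStage code - the stage names in the function comments are only analogies), you can generate synthetic rows and time a pure pass-through against a per-row mapping:

```python
import time

def generate_rows(n):
    # Stand-in for the Row Generator stage: synthetic (key, name) rows.
    return [(i, f"name_{i}") for i in range(n)]

def copy_stage(rows):
    # Pass-through with no per-row work, analogous to a Copy stage.
    return list(rows)

def transform_stage(rows):
    # Identity mapping applied row by row, analogous to a trivial Transformer.
    return [(key, name) for key, name in rows]

rows = generate_rows(1_000_000)

t0 = time.perf_counter()
copy_stage(rows)
t_copy = time.perf_counter() - t0

t0 = time.perf_counter()
transform_stage(rows)
t_transform = time.perf_counter() - t0

print(f"copy: {t_copy:.3f}s, transform: {t_transform:.3f}s")
```

The point is the method, not the numbers: run both paths on the same generated data, average several runs, and let the timings decide.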
Hi,
In short, Transformer = less performance, so use it sparingly - but do use it if you need it.
Well, I think I have a better answer:
look for the advanced guide in your client docs directory.
IHTH,
Roy R.
Time is money but when you don't have money time is all you can afford.
Search before posting:)
Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
Roy & ds_is_fun,
Instead of bouncing questions and opinions back and forth, I decided to take the empirical approach: I tested a job on a 4-CPU (yes, I've got a BIG toy to play with here) AIX PX installation and ran simultaneous Copy and Transformer jobs on simple data, using generated Row Generator data and writing to datasets.
After several runs the average speeds were:
COPY stage 425,000 rows per second
TRANSFORM stage 375,000 rows per second
The Transformer ran at 88% of the Copy stage's speed... I guess that answers the question.
I've seen almost the same results (about 20-33% improvement) when replacing a transformer with a filter stage or a modify stage.
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn