Hi All,
In my job I am splitting the data based on some criteria, using a Filter stage.
I received this comment from the reviewer:
"It is actually more efficient to use a transformer because the transformer is compiled and the filter is interpreted and adds more overhead than the transformer".
I did not know this before.
Waiting for your comments regarding this statement.
Thanks
Girija S
Splitting data using transformer vs filter stage
Create a job with a row generator and the column you are using to filter, run it through the filter stage and into a copy stage that has no output. Use enough rows of data so your job runs at least 10 minutes.
Then replace the Filter stage with a Transformer stage and re-run with the same amount of data.
What results do you get?
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
girija - I just spent 5 minutes testing this at Version 8.
Simple job: RowGen -> Transformer/Filter -> Copy stage.
The Row Generator creates one seeded random integer column with values from 1 to 100. The Filter and the Transformer each pass on only rows with a value > 50, i.e. about a 50% filtration/constraint rate, for 10 million rows on a 1-node configuration.
Both tests were repeated several times on an otherwise unused system:
~7-8 seconds with the Filter stage.
~10-11 seconds with the Transformer stage.
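To put those timings in perspective, here is a rough back-of-the-envelope sketch in Python (not DataStage code). It only reuses the numbers reported above: 10 million rows, 7-8 seconds for the Filter stage, and 10-11 seconds for the Transformer stage. It assumes elapsed time is dominated by row processing; job start-up time is not separated out in the post, so treat the rows-per-second figures as approximate. The helper names are made up for illustration.

```python
ROWS = 10_000_000  # row count reported in the test above

# Reported elapsed-time ranges in seconds (low, high), taken from the post.
timings = {
    "Filter": (7, 8),
    "Transformer": (10, 11),
}


def throughput_range(rows, seconds_range):
    """Return (slowest, fastest) rows per second for an elapsed-time range."""
    low, high = seconds_range
    return rows / high, rows / low


def midpoint(seconds_range):
    """Midpoint of a (low, high) elapsed-time range, in seconds."""
    return sum(seconds_range) / 2


results = {stage: throughput_range(ROWS, t) for stage, t in timings.items()}
for stage, (slow, fast) in results.items():
    print(f"{stage}: {slow:,.0f} to {fast:,.0f} rows/s")

# How much longer the Transformer run took, comparing range midpoints.
ratio = midpoint(timings["Transformer"]) / midpoint(timings["Filter"])
print(f"Transformer / Filter elapsed ratio (midpoints): {ratio:.2f}")

# Expected pass rate for a uniform integer in 1..100 with the constraint value > 50.
pass_rate = sum(1 for v in range(1, 101) if v > 50) / 100
print(f"Expected pass rate: {pass_rate:.0%}")
```

On these numbers the Transformer run took somewhat longer than the Filter run in this one small test, which is the opposite of what the reviewer claimed. It is a single-column, 1-node test, so it may not carry over to wider rows or more complex constraints.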
We had an IBM trainer tell us the same thing, that a buildop or transformer would do the filter more efficiently than the filter stage. I am surprised at the results you are seeing, ArndW. That is very interesting...
Did you put a sleep command in your transformer to skew your results? j/k
Brad
It is not that I am addicted to coffee, it's just that I need it to survive.