Hi All,
My job can be done with either the Merge stage or the Join stage.
I was just wondering which one is more efficient, considering that the DataSet may have around 300 million rows.
Thanks,
Munish
Merge or Join, Which is more efficient
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Define "efficient".
They use very similar algorithms for memory management.
Do you need just a join, or do you need to be able to capture, separately, rows for which there was no "update" available? The decision is largely driven by functionality.
Why not perform some benchmarks and document your results here?
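To make the functional question concrete, here is a minimal sketch in plain Python (not DataStage). It contrasts an operation that only returns matched rows with one that also hands back the unmatched rows separately. The function names, the dict-based inputs, and the choice to collect unmatched rows from both sides are all assumptions made for illustration; they do not describe how either stage is implemented.

```python
def inner_join(master, update):
    """Return only the keys present on both sides, with fields combined."""
    return {k: {**master[k], **update[k]} for k in master if k in update}


def join_with_unmatched(master, update):
    """Return the same matches as inner_join, plus the unmatched rows.

    The unmatched rows come back as two separate collections so that a
    caller can route them elsewhere instead of silently losing them.
    """
    matched = inner_join(master, update)
    master_only = {k: v for k, v in master.items() if k not in update}
    update_only = {k: v for k, v in update.items() if k not in master}
    return matched, master_only, update_only


# Hypothetical sample data: keys A and B on one side, B and C on the other.
master = {"A": {"sum": 10}, "B": {"sum": 20}}
update = {"B": {"count": 2}, "C": {"count": 5}}

matched, master_only, update_only = join_with_unmatched(master, update)
```

If only `matched` is ever needed, the two approaches are functionally equivalent and the choice comes down to measured performance.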
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Sure, I will definitely do that.
It is one of the things to do in our SVT.
In the current job:
Source 1: Key values + Sum
Source 2: Key values + Count
Join
Output: Key values + Sum + Count
Thus, it is a very simple inner join.
We have started development with the Join stage; however, we are going to compare it with the Merge stage once we have the real 300 million row data.
Thanks Ray
Munish
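For readers who want to see the shape of the job described above, the following is a rough Python sketch of an inner join that streams over two key-sorted inputs. It assumes that keys are unique within each source and that both sources are already sorted ascending by key; the function name and the sample rows are hypothetical and are not taken from the actual job.

```python
def sorted_inner_join(sums, counts):
    """Inner-join two key-sorted iterables of (key, value) pairs.

    Assumes unique keys per input and ascending key order in both.
    Yields (key, sum, count) for every key present in both inputs.
    """
    left, right = iter(sums), iter(counts)
    l = next(left, None)
    r = next(right, None)
    while l is not None and r is not None:
        if l[0] == r[0]:
            # Key found on both sides: emit the combined row.
            yield (l[0], l[1], r[1])
            l = next(left, None)
            r = next(right, None)
        elif l[0] < r[0]:
            # Left key has no partner: skip it (inner join semantics).
            l = next(left, None)
        else:
            # Right key has no partner: skip it.
            r = next(right, None)


# Hypothetical sample rows: Source 1 carries sums, Source 2 carries counts.
source1 = [("A", 100.0), ("B", 250.0), ("D", 40.0)]
source2 = [("A", 4), ("C", 7), ("D", 2)]

result = list(sorted_inner_join(source1, source2))
```

Because each input is read once and only the current row from each side is held, memory use in this sketch does not grow with the number of rows, which is the property that matters at this volume.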
MK
I once ran a similar benchmark, and Join gave me the better result, but I don't have any of those details right now. I will still wait for your results to be published, Munish.
Make sure the CPU is idle before each test case. Also measure memory and CPU usage during the operation.
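As an illustration of recording those measurements consistently, here is a small Python harness that times a callable several times and records wall-clock time, CPU time, and peak memory for each run. It is only a sketch: the `measure` helper and the `workload` function are hypothetical, and `tracemalloc` sees only Python-level allocations, so a real comparison of the two stages would rely on operating system or job monitoring tools instead.

```python
import time
import tracemalloc


def measure(fn, *args, repeats=3):
    """Run fn(*args) several times and record timing and peak memory."""
    runs = []
    for _ in range(repeats):
        tracemalloc.start()
        wall_start = time.perf_counter()
        cpu_start = time.process_time()
        fn(*args)
        cpu_s = time.process_time() - cpu_start
        wall_s = time.perf_counter() - wall_start
        # get_traced_memory returns (current, peak) in bytes.
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        runs.append({"wall_s": wall_s, "cpu_s": cpu_s, "peak_bytes": peak_bytes})
    return runs


def workload(n):
    """Hypothetical stand-in for the operation under test."""
    return sum(i * i for i in range(n))


runs = measure(workload, 100_000, repeats=3)
```

Repeating each case and comparing the runs against each other helps reveal whether other activity on the machine distorted a measurement.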
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
kumar_s wrote:
I once ran a similar benchmark, and Join gave me the better result, but I don't have any of those details right now. I will still wait for your results to be published, Munish.
Make sure the CPU is idle before each test case. Also measure memory and CPU usage during the operation.

Any updates on this, Munish? I'm curious too, as I am also debating the use of Merge over Join in certain scenarios in our development. Your results would help me. Thanks.