Hi all,
Can you tell me which of the two is faster: the hash file lookup of Server, or the Join stage of Parallel Extender?
Suppose I do the hash file lookup on my Server canvas, put it in a shared container, and execute it on Parallel Extender. Will this make my job faster than just using a Join stage on Parallel Extender?
It would be great if you could also tell me which one is better, and why one is faster than the other.
Thanks and regards
Pavan
which is faster hash file join or parallel join stage?
-
- Premium Member
- Posts: 133
- Joined: Tue Nov 23, 2004 11:24 pm
- Location: India
Re: which is faster hash file join or parallel join stage?
I'm not that strong on PX, but I doubt you can use a hash file lookup in PX jobs. Stay within the same group of functionality, so use a Join stage. Or do the join directly in the database.
Ogmios
In theory there's no difference between theory and practice. In practice there is.
If you can use the Lookup stage (i.e. if your data is smaller than the amount of memory you have), use it! It is VERY fast, especially with multiple-node processing.
The Join stage forces a sort on your data whether you ask for it or not (it's done within the framework; see your help guide on the Join stage). However, it is plenty fast, especially on multiple nodes (hash files are "1 node", so a 16-node join would definitely be faster on a 16-CPU box).
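To make the difference concrete, here is a rough Python sketch of the two general techniques. This is only an illustration, not how DataStage implements either stage; the function names and the sample column `cust_id` are made up for the example. The lookup builds an in-memory index once and probes it per row (so the reference data must fit in memory, and unique reference keys are assumed here), while the join has to sort both inputs before it can merge them.

```python
from operator import itemgetter


def hash_lookup_join(stream, reference, key):
    """Lookup-style join: index the reference rows in memory, then probe."""
    # Build the in-memory index once; assumes reference keys are unique
    # and that the whole reference set fits in RAM.
    index = {row[key]: row for row in reference}
    # One dictionary probe per stream row; no sorting is needed.
    return [{**row, **index[row[key]]} for row in stream if row[key] in index]


def sort_merge_join(left, right, key):
    """Join-style (sort-merge) inner join: sort both inputs, then merge."""
    # The sort on both inputs is the up-front cost a sort-merge join pays.
    left = sorted(left, key=itemgetter(key))
    right = sorted(right, key=itemgetter(key))
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Emit every right row that shares this key with the left row.
            k = j
            while k < len(right) and right[k][key] == lk:
                out.append({**left[i], **right[k]})
                k += 1
            i += 1
    return out
```

Both functions return the same matched rows for an inner join; the lookup keeps the stream order, while the sort-merge output comes back ordered by key. The trade-off is memory (the lookup's index) against the cost of sorting (the join).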
-
- Participant
- Posts: 18
- Joined: Mon Jan 12, 2004 7:20 am
- Location: USA
Re: which is faster hash file join or parallel join stage?
[quote="Pavan_Yelugula"]hi All
Can you tell me which one of two is faster the hash file look up of server or the join stage of parallel extender.
Consider that i do the hash file look up on my server canvas put it in a shared container and execute on parallel extender will this make my job faster than just using a join stage on my parallel extender
it would be great if u can also tell me which one is better or why one is faster than the other
Thanks and regards
Pavan[/quote]
I think you're heading for problems with your job design: a Join stage forces a sort, which is typically a resource hog in both PX and Server jobs. The sort algorithms in DataStage are not so great.
Hi,
By definition, and given available resources, the Enterprise Edition (PX) should give better performance, as long as resources for parallel execution are available.
If you have a situation where this is not true, you probably either don't have your system configured correctly,
or have a job so small in data volume that it does not need the PX engine.
IHTH,
Roy R.
Time is money but when you don't have money time is all you can afford.
Search before posting:)
Join the DataStagers team effort at:
http://www.worldcommunitygrid.org