difference between normal and sparse lookup
Moderators: chulett, rschirm, roy
difference between normal and sparse lookup
Hi All,
I am new to datastage. I want to know the diffrence between the normal lookup and sparse lookup. In which scenarios we will use the normal lookup and sparse lookup.How internally the datastage execution will be done for these two types of lookups.
Thanks in advance,
Ramesh
I am new to datastage. I want to know the diffrence between the normal lookup and sparse lookup. In which scenarios we will use the normal lookup and sparse lookup.How internally the datastage execution will be done for these two types of lookups.
Thanks in advance,
Ramesh
-
- Charter Member
- Posts: 822
- Joined: Sat Sep 17, 2005 5:25 pm
- Location: USA
-
- Participant
- Posts: 54
- Joined: Mon May 28, 2007 12:41 am
- Location: Chennai
hi
Hi
bydefault Database takes normal looup. but for sparsh lookup if u input data is less than ur lookup data then u have to do sparslookup.
normal lookup u know....
bydefault Database takes normal looup. but for sparsh lookup if u input data is less than ur lookup data then u have to do sparslookup.
normal lookup u know....
-
- Participant
- Posts: 106
- Joined: Thu Jul 27, 2006 10:05 pm
- Location: Mumbai
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
A normal lookup is performed against a virtual Data Set associated with the reference input link. A hash table (index) is built "on the fly" so that keys can be found quickly in the virtual Data Set. Therefore a normal lookup shows as a composite operator in the score; one to create the lookup table (LUT), the other to use it. If there is a Lookup File Set on the other end of the reference input link, the hash table (LUT) has been pre-build and is stored in the Lookup File Set.
A sparse lookup executes queries directly against the database table specified in the stage producing into the reference input link. This tends to be substantially slower than a normal lookup, but is useful where the reference data set woud not fit into memory. However, depending on the circumstances, a Join stage may be preferable to a sparse Lookup.
A sparse lookup executes queries directly against the database table specified in the stage producing into the reference input link. This tends to be substantially slower than a normal lookup, but is useful where the reference data set woud not fit into memory. However, depending on the circumstances, a Join stage may be preferable to a sparse Lookup.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 612
- Joined: Thu May 03, 2007 4:59 am
- Location: Melbourne
Normal will provide data for an in-memory look up whereas Sparse will access the database directly. Normal might provide poor performance when the reference data is huge as it has to load large data into memory. In such scenarios a join stage may work better(It will put the data onto an internal dataset file for each link, and then do the join based on the data provided). If your stream data is small then a sparse lookup is suggested as a row is retrieved from the database for each input row instead of the full reference data being brought into memory. Sparse lookup sends individual SQL statement for every incoming row (If stream data is huge you can imagine the number of times it has to hit DB and hence the down side on performance). It can be used when you want to get the next sequence number from your database (Again expensive overhead on your job as noted before). Also note that sparse lookup is only available for DB2 and Oracle. Normal Lookup stage can have multiple reference link but Sparse can only have one reference link. When normal is used it is a good practice to choose less volume data as reference data (Try reversing stream and reference for this if you can).
Joshy George
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
-
- Premium Member
- Posts: 1735
- Joined: Thu Mar 01, 2007 5:44 am
- Location: Troy, MI
JoshGeorge wrote:Normal will provide data for an in-memory look up whereas Sparse will access the database directly. Normal might provide poor performance when the reference data is huge as it has to load large data into memory. In such scenarios a join stage may work better(It will put the data onto an internal dataset file for each link, and then do the join based on the data provided). If your stream data is small then a sparse lookup is suggested as a row is retrieved from the database for each input row instead of the full reference data being brought into memory. Sparse lookup sends individual SQL statement for every incoming row (If stream data is huge you can imagine the number of times it has to hit DB and hence the down side on performance). It can be used when you want to get the next sequence number from your database (Again expensive overhead on your job as noted before). Also note that sparse lookup is only available for DB2 and Oracle. Normal Lookup stage can have multiple reference link but Sparse can only have one reference link. When normal is used it is a good practice to choose less volume data as reference data (Try reversing stream and reference for this if you can).
just for clarification
use normal lookup when reference database is small to fit in the memory;
use sparse lookup when reference database is huge i.e ratio of input data and reference data is 1:100 or more
if all conditions above doesn't satisfy, its better to go for join stage
Priyadarshi Kunal
Genius may have its limitations, but stupidity is not thus handicapped.
Genius may have its limitations, but stupidity is not thus handicapped.
-
- Premium Member
- Posts: 783
- Joined: Mon Jan 16, 2006 10:17 pm
- Location: Sydney, Australia