While using the Lookup stage, I have observed that files are being created.
As far as I know (please correct me if I am wrong), the Lookup stage operates in primary memory (RAM), so why are files created on the scratch disk? Is it because the data is too large to fit in RAM?
Thanks.
Need help in understanding Look-up Functionality
Re: Need help in understanding Look-up Functionality
zulfi123786 wrote: While using the Lookup stage, I have observed that files are being created. As far as I know (please correct me if I am wrong), the Lookup stage operates in primary memory (RAM), so why are files created on the scratch disk? Is it because the data is too large to fit in RAM?
Do you want to keep the entire set of reference records in RAM at all times and slow down all other processes? It works the same way as any operating system: it keeps RAM as clear as possible without degrading performance.
It does use RAM, but managing memory is also necessary. At the time the actual lookup happens, the data is in memory.
Priyadarshi Kunal
Genius may have its limitations, but stupidity is not thus handicapped.
There are tunable limits on how much memory is used for virtual Data Sets and for buffers. (It will come as no surprise that they are tuned by setting environment variable values.) When a limit is reached, DataStage uses scratchdisk.
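The spill behaviour described above can be pictured with a small Python sketch. This is only a conceptual illustration, not how DataStage is implemented: the `SpillBuffer` class, its `max_bytes` limit and the `scratch_dir` argument are all hypothetical names invented for this example.

```python
import os
import pickle
import tempfile


class SpillBuffer:
    """Hypothetical buffer: keep rows in memory up to max_bytes, then spill to a scratch file."""

    def __init__(self, max_bytes, scratch_dir=None):
        self.max_bytes = max_bytes      # stands in for a tunable memory limit
        self.used_bytes = 0
        self.in_memory = []
        self.spilled_rows = 0
        self.scratch_dir = scratch_dir  # stands in for a scratch disk location
        self._scratch = None            # created lazily, only once the limit is hit

    def add(self, row):
        blob = pickle.dumps(row)
        # Rows stay in memory until the limit is reached for the first time.
        if self._scratch is None and self.used_bytes + len(blob) <= self.max_bytes:
            self.in_memory.append(row)
            self.used_bytes += len(blob)
            return
        # From then on every row goes to disk, which preserves arrival order.
        if self._scratch is None:
            self._scratch = tempfile.TemporaryFile(dir=self.scratch_dir)
        pickle.dump(row, self._scratch)
        self.spilled_rows += 1

    def rows(self):
        """Yield all rows in arrival order: memory first, then the scratch file."""
        yield from self.in_memory
        if self._scratch is not None:
            self._scratch.seek(0)
            for _ in range(self.spilled_rows):
                yield pickle.load(self._scratch)
            self._scratch.seek(0, os.SEEK_END)  # ready for further appends

    def close(self):
        if self._scratch is not None:
            self._scratch.close()
            self._scratch = None
```

With a small `max_bytes`, the first few rows stay in memory and the rest land in a temporary file, which is roughly why files can show up on scratch disk even though the stage works from memory.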
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
I ran a test on this a few years ago: http://it.toolbox.com/blogs/infosphere/datastage-tip-for-beginners-parallel-lookup-types-7183. For under 1000 rows, lookups were faster than joins, and it didn't matter what the lookup source was, as it fitted into RAM with ease. I also ran some tests on 3 million lookup rows and found that a Lookup stage with the reference data in a Lookup Fileset was fastest at 42 seconds, a Join stage took just over a minute, and a Lookup with source data in a database, sequential file or dataset took over 2 minutes.
So if you want performance improvements on large lookups, think about Lookup Filesets or Join stages.
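To make the lookup versus join trade-off concrete, here is a rough Python sketch of the two general techniques. It is not DataStage code: `hash_lookup` and `merge_join` are hypothetical helpers, rows are plain dicts, and the reference keys are assumed to be unique.

```python
def hash_lookup(stream, reference, key):
    """Lookup style: load the whole reference set into an in-memory table, then probe it."""
    table = {ref[key]: ref for ref in reference}  # must fit in memory
    for row in stream:
        match = table.get(row[key])
        if match is not None:
            yield {**match, **row}


def merge_join(stream, reference, key):
    """Join style: sort both inputs on the key, then walk them in step."""
    left = sorted(stream, key=lambda r: r[key])
    right = sorted(reference, key=lambda r: r[key])
    i = 0
    for row in left:
        # Skip reference rows whose key is smaller than the current stream key.
        while i < len(right) and right[i][key] < row[key]:
            i += 1
        if i < len(right) and right[i][key] == row[key]:
            yield {**right[i], **row}
```

The hash approach needs the full reference set in memory at once, which is cheap for small reference data. The merge approach only looks at one position in each sorted input at a time, so its memory use does not grow with the reference data, at the cost of sorting both inputs first.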
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn