Job failing in lookup Stage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

just4geeks
Premium Member
Posts: 644
Joined: Sat Aug 26, 2006 3:59 pm
Location: Mclean, VA

Job failing in lookup Stage

Post by just4geeks »

Hello all,

I am getting the following error in one of our jobs. The error occurs in a Lookup stage, and the job runs fine with low-volume data.
Error:
Could not map table file "/opt/IBM/InformationServer/Server/Datasets/lookuptable.20111207.hezfuxb (size 3938972368 bytes)": Cannot allocate memory

Our environment is as below:
Version: 8.1 FP1
OS: Linux 64 bit

ulimit -a command returns following:

address space limit (kbytes) (-M) unlimited
core file size (blocks) (-c) unlimited
cpu time (seconds) (-t) unlimited
data size (kbytes) (-d) unlimited
file size (blocks) (-f) unlimited
locks (-L) unlimited
locked address space (kbytes) (-l) 32
nofile (-n) 1024
nproc (-u) unlimited
pipe buffer size (bytes) (-p) 4096
resident set size (kbytes) (-m) unlimited
socket buffer size (bytes) (-b) 4096
stack size (kbytes) (-s) 8192
threads (-T) not supported
process size (kbytes) (-v) unlimited

The job appears to abort when the in-memory table reaches around 4 GB.

I searched around in this forum and found various options, but I am still not clear on a few things:

1. Is this 4 GB limit set by the application or by the operating system?

2. Does this limitation apply to the primary input data or the lookup data? Our primary source is a DB2 table with around 32 million rows; the lookup is also a DB2 table, but with very few records. The primary input is growing every day and today seems to have hit the 4 GB limit, since it was working fine with lower volume as recently as yesterday. My understanding is that the lookup data (which has very few records) is what is held in memory, so I am confused as to why it is throwing an error based on the size of the primary input.

3. Can we modify the limit, either at the application or the OS level? If yes, how, and which variable values need to be changed?

4. I read that this can sometimes happen if TMPDIR is left blank or set to /tmp. I set it to a directory with a lot of space available, but it still threw the same error.

Any help would be highly appreciated.

Thanks,
Attitude is everything....
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

How much real (physical) memory is in the machine? How much scratch space do you have? How big is your temporary directory?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
just4geeks
Premium Member
Posts: 644
Joined: Sat Aug 26, 2006 3:59 pm
Location: Mclean, VA

Post by just4geeks »

Physical memory: 32 GB
Scratch Space: 35 GB available
Temp Directory: 159 GB available

Thanks,
Attitude is everything....
just4geeks
Premium Member
Posts: 644
Joined: Sat Aug 26, 2006 3:59 pm
Location: Mclean, VA

Post by just4geeks »

I wrote:
2. Does this limitation apply to the primary input data or the lookup data? Our primary source is a DB2 table with around 32 million rows; the lookup is also a DB2 table, but with very few records. The primary input is growing every day and today seems to have hit the 4 GB limit, since it was working fine with lower volume as recently as yesterday. My understanding is that the lookup data (which has very few records) is what is held in memory, so I am confused as to why it is throwing an error based on the size of the primary input.

Please ignore this point. I did not realize that a developer had swapped the lookup and primary input links, so I thought the lookup link was the primary and vice versa! So no confusion here. But I am still looking for answers to the other points.

Thanks!!
Attitude is everything....
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

The stack size looks too low. The nofiles setting also looks too low, but stack size is more related to memory. The nofiles minimum is usually 10240. This is not apples to apples, but here is ulimit -a output from AIX:

Code:

time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         unlimited
stack(kbytes)        4194304
memory(kbytes)       unlimited
coredump(blocks)     unlimited
nofiles(descriptors) 131072
threads(per process) unlimited
processes(per user)  unlimited
Compare your settings with the minimum kernel settings listed in the Install Guide for your release.
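As a rough sketch of that comparison (the minimum values below are assumptions taken from this thread, not from your Install Guide, which remains the authority):

```shell
# Compare current soft limits against commonly suggested minimums:
# a 4 GB stack (mirroring the AIX output above) and 10240 file descriptors.
min_stack_kb=4194304
min_nofile=10240

stack=$(ulimit -S -s)
nofile=$(ulimit -S -n)

[ "$stack" = "unlimited" ] || [ "$stack" -ge "$min_stack_kb" ] || \
    echo "stack soft limit too low: ${stack} KB (want >= ${min_stack_kb})"
[ "$nofile" -ge "$min_nofile" ] || \
    echo "nofile soft limit too low: ${nofile} (want >= ${min_nofile})"

# A persistent change is usually made in /etc/security/limits.conf (as root),
# e.g. for an engine user named dsadm (the user name is an assumption):
#   dsadm  soft  stack   4194304
#   dsadm  soft  nofile  10240
```

Note the engine user has to log in again (and the job has to be re-run) before new limits take effect.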
Choose a job you love, and you will never have to work a day in your life. - Confucius
just4geeks
Premium Member
Posts: 644
Joined: Sat Aug 26, 2006 3:59 pm
Location: Mclean, VA

Post by just4geeks »

Thanks Eric!
I will look into your suggestions and will post my findings.

Take care!
Attitude is everything....
ajay.vaidyanathan
Participant
Posts: 53
Joined: Fri Apr 18, 2008 8:13 am
Location: United States

Job failing in lookup Stage

Post by ajay.vaidyanathan »

What is the volume of your lookup data? If it is really huge, then using a Lookup is not advisable. Go for a Join stage with the Left Outer Join option, then let's see whether you still get this error.
Regards
Ajay
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

I'm fairly sure your OS is limiting your ability to allocate memory via the ulimit settings.

I tried out a normal lookup (supposed to be in memory) using an 11.5 GB dataset as the reference on 2 nodes. The job finished with no errors but took a very long time. Server is 8.5 on AIX with 38 GB RAM. It created a 26 GB lookuptable.* file during the run then removed it when finished. I'm not sure why it creates a temp file when the dataset is already on local disk to begin with. I have seen these files accumulate in cases when jobs with Lookup stages abort, and then it requires manual clean-up.
Choose a job you love, and you will never have to work a day in your life. - Confucius
pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

You could try a Join and compare its performance against the Lookup.
pandeeswaran
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

Yes, Join can perform better. My point is to highlight the cause of the error and also to find the upper limit of what Lookup can handle. I can run it without error, even with data larger than 3.66 GB in this example (and larger than 4 GB). When the performance drops, it's no longer the best solution.
Choose a job you love, and you will never have to work a day in your life. - Confucius
just4geeks
Premium Member
Posts: 644
Joined: Sat Aug 26, 2006 3:59 pm
Location: Mclean, VA

Post by just4geeks »

Thanks, everyone, for the valuable inputs.
We tried the Join stage even before I posted this, and it worked.
But my objective in posting this was to get more clarity on the behaviour of the Lookup stage.
1. I am still not clear whether it is an OS-level issue or an application-level one. We have a Linux OS with 32 GB RAM.
2. As you can see, we have ulimit set to "unlimited".
3. If Eric can run a much higher volume than 4 GB with the Lookup stage, I am not sure which setting we need to change on our side.

Thanks
Attitude is everything....
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

4 GB is a limitation of 32-bit architecture. You've said your O/S is 64-bit; what about your version of DataStage?
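For reference, the map size in the original error message sits just under that ceiling. A quick shell calculation (using the byte count quoted in the error) shows how close:

```shell
# A 32-bit process can address at most 2^32 bytes of virtual memory.
limit=$((1 << 32))        # 4294967296 bytes = 4 GiB
size=3938972368           # map size from the error message
echo "$size of $limit bytes ($((size * 100 / limit))% of the 32-bit address space)"
# -> 3938972368 of 4294967296 bytes (91% of the 32-bit address space)
```

In practice a 32-bit process runs out of mappable address space well before the full 4 GiB, since code, stack, and other allocations share the same space.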
-craig

"You can never have too many knives" -- Logan Nine Fingers
just4geeks
Premium Member
Posts: 644
Joined: Sat Aug 26, 2006 3:59 pm
Location: Mclean, VA

Post by just4geeks »

Thanks Craig!
We are running DataStage 8.1 FP1.
How can we find out whether we are running 32 or 64 bit version of DataStage?

Appreciate your help!
Attitude is everything....
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I don't know how to do that just from what is installed... can you check your 'manifest' or with your vendor?
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

If you're running 8.1, you are running the 32-bit version.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply