
Job failing in Lookup stage

Posted: Wed Dec 07, 2011 2:44 pm
by just4geeks
Hello all,

I am getting the error below in one of our jobs, in the Lookup stage. The job runs fine with low-volume data.
Error:
Could not map table file "/opt/IBM/InformationServer/Server/Datasets/lookuptable.20111207.hezfuxb (size 3938972368 bytes)": Cannot allocate memory

Our environment is as below:
Version: 8.1 FP1
OS: Linux 64 bit

The ulimit -a command returns the following:

address space limit (kbytes) (-M) unlimited
core file size (blocks) (-c) unlimited
cpu time (seconds) (-t) unlimited
data size (kbytes) (-d) unlimited
file size (blocks) (-f) unlimited
locks (-L) unlimited
locked address space (kbytes) (-l) 32
nofile (-n) 1024
nproc (-u) unlimited
pipe buffer size (bytes) (-p) 4096
resident set size (kbytes) (-m) unlimited
socket buffer size (bytes) (-b) 4096
stack size (kbytes) (-s) 8192
threads (-T) not supported
process size (kbytes) (-v) unlimited

The job appears to abort when the in-memory table reaches about 4 GB.

I searched around in this forum and found various options, but I am still not clear on a few things:

1. Is this 4 GB limit set by the application or by the operating system?

2. Is the limitation on the primary input data size or on the lookup data? Our primary source is a DB2 table with around 32 million rows; the lookup is also a DB2 table, but with very few records. The primary input grows every day and seems to have hit the 4 GB limit today, since yesterday it was still working fine with a lower volume. My understanding is that the lookup data (which has very few records) is what gets held in memory, so I am confused as to why the error seems to depend on the size of the primary input.

3. Can we modify the limit at the application or OS level? If yes, how, and which variable values need to be changed?

4. I read that sometimes this can happen if TMPDIR is left blank or set to /tmp. I set it to a directory with a lot of space available, but it still threw the same error.
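For reference, the failing file size is already close to the 4 GiB ceiling of a 32-bit address space; a quick sketch of the arithmetic (byte counts taken from the error message above):

```shell
#!/bin/sh
# A 32-bit process can address at most 2^32 bytes (4 GiB).
LIMIT=$((1 << 32))        # 4294967296 bytes
FILESIZE=3938972368       # size reported in the "Could not map table file" error

echo "32-bit address-space ceiling: $LIMIT bytes"
echo "lookup table file size:       $FILESIZE bytes"

# The process also needs address space for its own code, stack, and heap,
# so the mmap of the table file fails before the full 4 GiB is reached.
echo "headroom left: $((LIMIT - FILESIZE)) bytes"
```

That leaves only about 356 MB of address space for everything else, which would explain a mapping failure even on a machine with plenty of physical RAM.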

Any help would be highly appreciated.

Thanks,

Posted: Wed Dec 07, 2011 3:10 pm
by ray.wurlod
How much real (physical) memory is in the machine? How much scratch space do you have? How big is your temporary directory?

Posted: Wed Dec 07, 2011 3:38 pm
by just4geeks
Physical memory: 32 GB
Scratch Space: 35 GB available
Temp Directory: 159 GB available

Thanks,

Posted: Wed Dec 07, 2011 4:35 pm
by just4geeks
I wrote:
2. Is the limitation on the primary input data size or on the lookup data? Our primary source is a DB2 table with around 32 million rows; the lookup is also a DB2 table, but with very few records. The primary input grows every day and seems to have hit the 4 GB limit today, since yesterday it was still working fine with a lower volume. My understanding is that the lookup data (which has very few records) is what gets held in memory, so I am confused as to why the error seems to depend on the size of the primary input.

Please ignore this point. I did not realize that a developer had swapped the lookup and primary input links, so I thought the lookup link was the primary and vice versa! So no confusion there. But I am still looking for answers to the other points.

Thanks!!

Posted: Wed Dec 07, 2011 10:24 pm
by qt_ky
The stack size looks too low. The nofiles setting also looks too low, but stack size is more related to memory. The nofiles minimum is usually 10240. This is not apples to apples, but here is ulimit -a output from AIX:

Code: Select all

time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         unlimited
stack(kbytes)        4194304
memory(kbytes)       unlimited
coredump(blocks)     unlimited
nofiles(descriptors) 131072
threads(per process) unlimited
processes(per user)  unlimited
Compare your settings with the minimum kernel settings listed in the Install Guide for your release.
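On Linux the permanent settings usually live in /etc/security/limits.conf and take effect at the next login of the engine user. A sketch only -- the dsadm user name and the values below are assumptions, so take the real minimums from the Install Guide:

```
# /etc/security/limits.conf -- illustrative entries only
dsadm  soft  stack   4194304
dsadm  hard  stack   unlimited
dsadm  soft  nofile  10240
dsadm  hard  nofile  65536
```

After editing, log the user out and back in and confirm with ulimit -s and ulimit -n.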

Posted: Thu Dec 08, 2011 9:56 am
by just4geeks
Thanks Eric!
I will look into your suggestions and will post my findings.

Take care!

Posted: Thu Dec 08, 2011 4:38 pm
by ajay.vaidyanathan
What is the volume of your lookup data? If it is really large, then using a Lookup is not advisable. Go for a Join stage with a Left Outer join option, then let's see whether you still get this error.

Posted: Sat Dec 17, 2011 4:04 pm
by qt_ky
I'm fairly sure your OS is limiting your ability to allocate memory via the ulimit settings.

I tried out a normal lookup (supposed to be in-memory) using an 11.5 GB dataset as the reference, on 2 nodes. The job finished with no errors but took a very long time. The server is 8.5 on AIX with 38 GB RAM. It created a 26 GB lookuptable.* file during the run, then removed it when finished. I'm not sure why it creates a temp file when the dataset is already on local disk to begin with. I have seen these files accumulate when jobs with Lookup stages abort, after which they require manual clean-up.

Posted: Sat Dec 17, 2011 9:41 pm
by pandeesh
Better still, you could try a Join and compare its performance against the Lookup.

Posted: Sun Dec 18, 2011 8:34 am
by qt_ky
Yes, Join can perform better. My point is to highlight the cause of the error and also to find the upper limit of what Lookup can handle. I can run it without error, even with data larger than the 3.66 GB in this example (and larger than 4 GB). Once the performance drops, it's no longer the best solution.

Posted: Mon Dec 19, 2011 8:07 am
by just4geeks
Thanks, everyone, for the valuable input.
We tried the Join stage even before I posted this, and it worked.
But my objective in posting was to get more clarity on the behaviour of the Lookup stage.
1. I am still not clear whether this is an OS-level issue or an application-level one. We have Linux with 32 GB RAM.
2. As you can see, we have ulimit set to "unlimited".
3. If Eric can run a much higher volume than 4 GB through the Lookup stage, I am not sure which setting we need to change on our side.

Thanks

Posted: Mon Dec 19, 2011 8:15 am
by chulett
4 GB is a limitation of 32-bit architecture. You've said your O/S is 64-bit; what about your version of DataStage?
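As a generic OS-level check (it reports a binary's ELF class, not the product edition, and the osh path below is an assumption about where the parallel engine lives):

```shell
#!/bin/sh
# Word size of the OS userland: prints 64 on a 64-bit system.
getconf LONG_BIT

# Bitness of a specific binary: look for "ELF 32-bit" vs "ELF 64-bit".
# The path is an assumption; point it at your actual engine binary.
OSH=/opt/IBM/InformationServer/Server/PXEngine/bin/osh
if [ -x "$OSH" ]; then
    file "$OSH"
else
    echo "osh not found at $OSH (adjust the path)"
fi
```

A 32-bit osh would explain hitting a 4 GB ceiling even on a 64-bit OS with 32 GB of RAM.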

Posted: Mon Dec 19, 2011 8:40 am
by just4geeks
Thanks Craig!
We are running DataStage 8.1 FP1.
How can we find out whether we are running the 32-bit or 64-bit version of DataStage?

Appreciate your help!

Posted: Mon Dec 19, 2011 10:59 am
by chulett
I don't know how to tell that just from what is installed... can you check your 'manifest' or with your vendor?

Posted: Mon Dec 19, 2011 3:01 pm
by ray.wurlod
If you're running 8.1, you are running the 32-bit version.