Job failing in lookup Stage

just4geeks · Post by **just4geeks** » Wed Dec 07, 2011 2:44 pm

Hello all,

I am getting this error in one of our job. This error is occurring in lookup stage. This job runs fine in low volume Data.
Error:
Could not map table file "/opt/IBM/InformationServer/Server/Datasets/lookuptable.20111207.hezfuxb (size 3938972368 bytes)": Cannot allocate memory

Our environment as as below:
Version: 8.1 FP1
OS: Linux 64 bit

ulimit -a command returns following:

address space limit (kbytes) (-M) unlimited
core file size (blocks) (-c) unlimited
cpu time (seconds) (-t) unlimited
data size (kbytes) (-d) unlimited
file size (blocks) (-f) unlimited
locks (-L) unlimited
locked address space (kbytes) (-l) 32
nofile (-n) 1024
nproc (-u) unlimited
pipe buffer size (bytes) (-p) 4096
resident set size (kbytes) (-m) unlimited
socket buffer size (bytes) (-b) 4096
stack size (kbytes) (-s) 8192
threads (-T) not supported
process size (kbytes) (-v) unlimited

Job appears to be aborting around 4 GB size of in memory table.

I searched around in this forum and got various options but I am still not clear on few things:

1. Is this 4 GB limit is set by application or Operating System?

2. Is this limitation on Primary Input Data size or Lookup data? Our primary source is DB2 table which has around 32 million rows and lookup is also a DB2 table but with very few records. Primary input is growing everyday and today seems to hit 4 GB limit because since yesterday it was working fine with low volume. My understanding is that Lookup Data ( which has very few records) will be in memory so I am confused as to why it is throwing an error based on size of primary input?

3. Can we modify the limit either at application or OS level? if yes then how and what varibale values needs to be changed?

4. I read that sometime it can happen if TMPDIR is left blank or set to /TMP. I set it to a directory with lot of space available but it still threw same error.

Any help would be highly appreciated.

Thanks,

ray.wurlod · Post by **ray.wurlod** » Wed Dec 07, 2011 3:10 pm

How much real (physical) memory is in the machine? How much scratch space do you have? How big is your temporary directory?

just4geeks · Post by **just4geeks** » Wed Dec 07, 2011 3:38 pm

Physical memory: 32 GB
Scratch Space: 35 GB available
Temp Directory: 159 GB available

Thanks,

just4geeks · Post by **just4geeks** » Wed Dec 07, 2011 4:35 pm

I wrote:

2. Is this limitation on Primary Input Data size or Lookup data? Our primary source is DB2 table which has around 32 million rows and lookup is also a DB2 table but with very few records. Primary input is growing everyday and today seems to hit 4 GB limit because since yesterday it was working fine with low volume. My understanding is that Lookup Data ( which has very few records) will be in memory so I am confused as to why it is throwing an error based on size of primary input?

Please ignore this point. I did not realize that developer changed orientation of lookup and primary input links so I thought lookup link is primary and vice versa! So no confusion here. But still looking to get answer to other points.

Thanks!!

qt_ky · Post by **qt_ky** » Wed Dec 07, 2011 10:24 pm

The stack size looks too low. The nofiles setting also looks too low, but stack size is more related to memory. The nofiles minimum is usually 10240. This is not apples to apples, but here is ulimit -a output from AIX:

Code: Select all

time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         unlimited
stack(kbytes)        4194304
memory(kbytes)       unlimited
coredump(blocks)     unlimited
nofiles(descriptors) 131072
threads(per process) unlimited
processes(per user)  unlimited

Compare your settings with the minimum kernel settings listed in the Install Guide for your release.

just4geeks · Post by **just4geeks** » Thu Dec 08, 2011 9:56 am

Thanks Eric!
I will look into your suggestions and will post my findings.

Take care!

ajay.vaidyanathan · Post by **ajay.vaidyanathan** » Thu Dec 08, 2011 4:38 pm

What is the vloume of your lookup data ??? If it is really huge then using lookup is not advisable. Go for Joiner with a Left Outer Join option. Then lets see whether you still get this error.

qt_ky · Post by **qt_ky** » Sat Dec 17, 2011 4:04 pm

I'm fairly sure your OS is limiting your ability to allocate memory via the ulmiit settings.

I tried out a normal lookup (supposed to be in memory) using an 11.5 GB dataset as the reference on 2 nodes. The job finished with no errors but took a very long time. Server is 8.5 on AIX with 38 GB RAM. It created a 26 GB lookuptable.* file during the run then removed it when finished. I'm not sure why it creates a temp file when the dataset is already on local disk to begin with. I have seen these files accumulate in cases when jobs with Lookup stages abort, and then it requires manual clean-up.

pandeesh · Post by **pandeesh** » Sat Dec 17, 2011 9:41 pm

Better you can try with Join and compare the performance vs Lookup.

qt_ky · Post by **qt_ky** » Sun Dec 18, 2011 8:34 am

Yes, Join can perform better. My point is to highlight the cause of the error and also to find the upper limit of what Lookup can handle. I can run it without error, even with data larger than 3.66 GB in this examle (and larger than 4 GB). When the performance drops, it's no longer the best solution.

just4geeks · Post by **just4geeks** » Mon Dec 19, 2011 8:07 am

Thanks everyone for valuable inputs.
We tried Join Stage even before I posted this and it worked.
But my objective to post this was to obtain more clarification about behaviour of Lookup stage.
1. I am still not clear if it OS level issue or application level. We have Linux OS with 32 GB RAM.
2. As you can see we have ulimit set up as "unlimited"
3. If Eric can run much higher volume that 4 GB with lookup stage, I am not sure which setting we need to change at our level.

Thanks

chulett · Post by **chulett** » Mon Dec 19, 2011 8:15 am

4GB is a limitation of 32bit architecture. You've said your O/S is 64bit, what about your version of DataStage?

just4geeks · Post by **just4geeks** » Mon Dec 19, 2011 8:40 am

Thanks Craig!
We are running DataStage 8.1 FP1.
How can we find out whether we are running 32 or 64 bit version of DataStage?

Appreciate your help!

chulett · Post by **chulett** » Mon Dec 19, 2011 10:59 am

I don't know how to do that just from what is installed... can you check your 'manifest' or with your vendor?

ray.wurlod · Post by **ray.wurlod** » Mon Dec 19, 2011 3:01 pm

If you're running 8.1 you are running 32-bit version.