Compile Job Hangs - consuming all memory

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.


jonathanhale
Premium Member
Posts: 33
Joined: Tue Nov 06, 2007 1:09 pm

Compile Job Hangs - consuming all memory

Post by jonathanhale »

A fairly simple 3-stage job (read from a sequential file, transform, insert into Oracle with the Oracle Connector stage) will not compile.

It's an 800-column row.

But, still...?

The compilation attempt consumes 3 GB of memory, runs for 4 hours, and never completes successfully.

Has anyone encountered anything similar?
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Re: Compile Job Hangs - consuming all memory

Post by jwiles »

Disable C++ compiler optimization...it's likely affecting the compilation of the transformer because of the large number of columns defined in the input and output links. For gcc, use -O0 in the compile options, either in the stage itself or at the project level in DS Administrator. I believe it's the same for VisualAge (AIX).
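
To make the change concrete, here is a minimal sketch in Python of what "use -O0 in the compile options" amounts to. It assumes the options are held as one gcc-style string (in DataStage this is commonly the APT_COMPILEOPT setting, but check the actual name and value in your own project). The sample value and the helper name `disable_optimization` are hypothetical, for illustration only.

```python
import re
import shlex

# Matches gcc-style optimization-level flags such as -O, -O2, -Os, -Ofast.
_OPT_FLAG = re.compile(r"^-O(\d+|s|g|z|fast)?$")


def disable_optimization(compile_opts: str) -> str:
    """Return compile_opts with every -O<level> flag replaced by a single -O0."""
    tokens = shlex.split(compile_opts)
    # Drop any existing optimization-level flag; leave all other options alone.
    kept = [t for t in tokens if not _OPT_FLAG.match(t)]
    # Put -O0 first so it is easy to spot when reviewing the setting.
    return shlex.join(["-O0"] + kept)


if __name__ == "__main__":
    sample = "-O -fPIC -Wno-deprecated -c"  # hypothetical starting value
    print(disable_optimization(sample))
```

In practice you would simply edit the option string by hand; the point is only that the existing optimization flag is swapped for -O0 and the rest of the options stay as they were.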

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Re: Compile Job Hangs - consuming all memory

Post by jwiles »

Jonathan,

Any results? Has this been resolved?

Regards,
- james wiles


jonathanhale
Premium Member
Posts: 33
Joined: Tue Nov 06, 2007 1:09 pm

Post by jonathanhale »

Hi James,

Sorry for the delayed response. I was off-site yesterday.

Yes, it worked exactly as you suggested.

Worryingly well, in fact! ;-)

Instead of failing to compile after 4 hours and consuming 4 GB of RAM, the errant job now compiles instantly with a max of 1 MB RAM consumption.

Many many thanks for your solution.

Do you know of an IBM URL/white paper that describes the implications for compiler behaviour of the various compiler optimisation options?
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

Glad it worked! I hit the same wall early in my DS career and I'm sure there are many similar stories out there :)

I haven't found any whitepapers on the optimization options. There's no real optimization standard across different compilers, so it's difficult to say that -O2 does this and this and that for gcc, VisualC, VisualAge and others. In general, tho, the more aggressive the option chosen, the more memory- and CPU-intensive the compilation is going to be. As the compiler is looking for items to optimize, it's building tables and traversing them repeatedly, and those tables grow as the number of objects, lines of code, function calls, etc., all grow. In a transformer, every defined input or output column will have at least two objects defined in the generated C++ code (I don't know if unaccessed input columns are optimized out or not, tho...maybe that's been improved over the years). So, 800*2 = 1600 columns across the input and output links, and 1600*2 = 3200 objects at a minimum, not counting stage variables, objects for input/output records, constants, intermediate variables, etc., plus the actual logic.
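
The arithmetic above can be written out as a small Python sketch. It assumes 800 columns on each of two links (one input, one output) and takes the "at least two objects per column" figure from the post as a lower bound. The function name and parameters are hypothetical, and the result is only a rough floor, not a measurement of what the code generator actually emits.

```python
def min_generated_objects(columns_per_link: int,
                          links: int = 2,
                          objects_per_column: int = 2) -> int:
    """Lower-bound estimate of column-related objects in the generated code.

    Ignores stage variables, record objects, constants and intermediate
    variables, all of which would push the real number higher.
    """
    total_columns = columns_per_link * links   # 800 * 2 = 1600
    return total_columns * objects_per_column  # 1600 * 2 = 3200


if __name__ == "__main__":
    print(min_generated_objects(800))
```

The useful takeaway is that the count scales linearly with column count and link count, while the optimizer's work over those objects can grow much faster.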

There are design practices that can help to avoid this situation by making use of Runtime Column Propagation: let the engine copy the unmodified columns instead of forcing the transformer to do it.

Regards,
- james wiles

