ODBC enterprise stage fatal error

Rohit_ben
Participant
Posts: 19
Joined: Sat Sep 22, 2007 4:55 am

Post by Rohit_ben »

One of our ETL jobs on DataStage 8.1 failed in the Live environment with the following errors:

APT_CombinedOperatorController,0: [IBM(DataDirect OEM)][ODBC SQL Server Driver]20157
odbcwrt_EX_POST_RESULTS,0: Failure during execution of operator logic.
...
main_program: APT_PMsectionLeader(1, node1), player 2 - Unexpected exit status 1.
...
orared_EX_POST_RESULTS,0: Fatal Error: Unable to allocate communication resources
...
APT_CombinedOperatorController,1: Fatal Error: waitForWriteSignal(): Premature EOF on node hkeosp08 Cannot allocate memory


The ETL job fetches data from an Oracle database and populates a table in a SQL Server database. We connect to SQL Server using the original DataDirect ODBC drivers that shipped with Information Server 8.1.

The job failed after running for 30 minutes and processing 4 million records. We re-ran it once and it failed again at the same point. We monitored resource usage during the second run and found that, after about 3.7 million records had been processed, swap usage suddenly started climbing until almost 100% of swap was in use. The job failed at that point. CPU utilization stayed between 30% and 60%.

Also, what is the significance of the error [IBM(DataDirect OEM)][ODBC SQL Server Driver]20157?

We have 35 GB of free scratch space in total and a two-node configuration. Scratch space usage did not increase significantly during the job runs.

The /tmp file system is 12 GB, and around 9 GB was available before job execution. The UAT server has the same amount of space allocated and available. However, the job has run successfully a number of times on the UAT server and fails only on the Live server.

We also found that the DSN entry in the .odbc.ini file contains the line QEWSD=39262.

DSN entry for SQL Server:

[DPServer]
Driver=/opt/IBM/InformationServer/Server/branded_odbc/lib/VMmsss23.so
Description=DataDirect SQL Server Wire Protocol
Database=
Address=
QuotedId=No
AnsiNPW=No
QEWSD=39262
DescribeAtPrepare=0

This job was originally a server job with an insert array size of 1 in the ODBC Enterprise stage. It was converted to a parallel job, and it is the parallel job that fails. In the new parallel job we raised the insert array size to 50 to boost performance. The ODBC Enterprise stage also executes in parallel mode (the default).

As a fallback, we ran the server job again and it finished successfully.
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

You can safely remove the QEWSD=39262 line. These entries get added when a DSN is tested from the command line. It will generate a warning about the license.
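If it helps, here is a minimal sketch of stripping that entry with a script rather than by hand. The function name `strip_qewsd` is made up for illustration, and the sample text is a shortened copy of the DSN posted above. It only filters text, so you would still need to back up your real .odbc.ini and write the result back yourself.

```python
def strip_qewsd(ini_text):
    """Return ini_text without any QEWSD=... lines; every other line is kept unchanged."""
    kept = []
    for line in ini_text.splitlines(keepends=True):
        # The key is whatever precedes the first '='; section headers have no '='.
        key = line.split("=", 1)[0].strip().upper()
        if key != "QEWSD":
            kept.append(line)
    return "".join(kept)


# Shortened copy of the DSN entry from this thread, used only as sample input.
sample = """[DPServer]
QuotedId=No
AnsiNPW=No
QEWSD=39262
DescribeAtPrepare=0
"""

cleaned = strip_qewsd(sample)
print(cleaned)
```

The comparison is done on the key only, so a line such as `QEWSD = 39262` with spaces around the equals sign would be dropped as well.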

For the parallel job, you may want to check the ulimit settings on your OS, as those can limit the memory available to the job's processes.
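As a rough sketch of that check, the snippet below prints the soft and hard limits the current process runs under, using Python's standard `resource` module. The choice of which limits to list is my assumption about what matters here. It would need to be run as the same OS user, and from the same environment, that starts the DataStage jobs, otherwise the numbers may not match what the job actually sees.

```python
import resource

# Limit names to report; any that this platform does not define are skipped.
LIMIT_NAMES = {
    "data segment (ulimit -d)": "RLIMIT_DATA",
    "stack (ulimit -s)": "RLIMIT_STACK",
    "address space (ulimit -v)": "RLIMIT_AS",
    "open files (ulimit -n)": "RLIMIT_NOFILE",
}


def fmt(value):
    """Render a raw limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def read_limits():
    """Return {label: (soft, hard)} for the current process."""
    limits = {}
    for label, name in LIMIT_NAMES.items():
        code = getattr(resource, name, None)
        if code is None:
            continue  # not available on this platform
        soft, hard = resource.getrlimit(code)
        limits[label] = (fmt(soft), fmt(hard))
    return limits


for label, (soft, hard) in read_limits().items():
    print(f"{label:28} soft={soft:>22} hard={hard:>22}")
```

Memory-related values are reported in bytes by `getrlimit`, whereas the shell's `ulimit` usually shows kilobytes, so the figures will differ by a factor of 1024.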
Rohit_ben
Participant
Posts: 19
Joined: Sat Sep 22, 2007 4:55 am

Post by Rohit_ben »

We raised the issue with IBM, and they asked us to lower the insert array size. We have also increased swap from 2 GB to 24 GB. We will test the job again and monitor it.
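For the monitoring during the re-run, something like the sketch below could log swap usage at intervals. It assumes a Linux server that exposes /proc/meminfo, which has not been confirmed in this thread; on other Unix flavours the OS tools would have to be used instead. The function names and the sample figures are illustrative only.

```python
import time


def swap_used_percent(meminfo_text):
    """Parse /proc/meminfo-style text and return swap used as a percentage."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])  # values are reported in kB
    total = fields.get("SwapTotal", 0)
    free = fields.get("SwapFree", 0)
    if total == 0:
        return 0.0  # no swap configured
    return 100.0 * (total - free) / total


def monitor(samples=3, interval=10, path="/proc/meminfo"):
    """Print a timestamped swap reading `samples` times, `interval` seconds apart."""
    for _ in range(samples):
        with open(path) as fh:
            pct = swap_used_percent(fh.read())
        print(f"{time.strftime('%H:%M:%S')} swap used: {pct:.1f}%")
        time.sleep(interval)


# Made-up figures: 2 GB of swap with 512 MB still free.
example = "SwapTotal: 2097152 kB\nSwapFree: 524288 kB\n"
print(f"{swap_used_percent(example):.1f}%")
```

Calling `monitor(samples=180, interval=10)` in a separate terminal would cover roughly the 30 minutes the job ran before failing.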