Error writing to pipe

Camaj · Post by **Camaj** » Mon Oct 24, 2005 9:12 am

Hello,

I am using DB2 V8 on AIX with DataStage 7.5.

When I try to LOAD a large volume of rows (more than 5 million) into my DB2 table, I get the error "Error writing to pipe".

I suspect a problem with a partition lock.

I tried using a .DS and a .SEQ file, but I got the same error.

Can anyone help me?

Thanks!

ArndW · Post by **ArndW** » Mon Oct 24, 2005 9:29 am

I am curious why you suspect a DB/2 partition lock when you get the same error message writing to a dataset or sequential file?

Does the error message refer to a stage or a player number - pipes are used extensively for interprocess communication so you would need to narrow it down. If you got an error writing to a pipe that means that the other process that is reading from the pipe has either died (usually the error message then is different) or is too slow. Your error looks like a pipe timeout issue; but this shouldn't happen when writing to a sequential file. Also, does the timeout and abort happen at the same row each time or at approximately the same runtime?
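To illustrate the first failure mode described above (the process reading from the pipe has gone away), here is a minimal sketch in Python on a POSIX system. It is not DataStage code: the function name `write_after_reader_closed` is made up for this example, and closing the read end of the pipe only stands in for a reader process that has died.

```python
import os


def write_after_reader_closed() -> str:
    """Write to a pipe whose read end has already been closed."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # stand-in for the reading process dying
    try:
        os.write(write_fd, b"row data\n")
    except BrokenPipeError as exc:
        # Python ignores SIGPIPE at startup, so the failed write
        # surfaces as an exception instead of killing the process.
        return f"write failed: {exc.strerror}"
    finally:
        os.close(write_fd)
    return "write succeeded"


print(write_after_reader_closed())
```

A reader that is merely too slow behaves differently: the writer blocks once the pipe buffer is full, so any timeout comes from the application on top of the pipe rather than from the operating system.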

Camaj · Post by **Camaj** » Mon Oct 24, 2005 12:04 pm

I suspect a DB2 deadlock because I can't find anything else.

My job either fails with the "Error writing to pipe" error or gets stuck in DataStage and never ends.

This is a strange problem.

In test, where the volume was around 500 K, I never had this problem.

Do you have any idea what could cause this problem?

ArndW · Post by **ArndW** » Mon Oct 24, 2005 12:19 pm

Camaj,

Change the job to write to a dataset or sequential file. What exactly happens, i.e. what is the full error message? Does it happen at the same row number each time? Which stage is giving the error? All this can be done without going into the details of the Orchestrate mechanism.

Usually in a case like this I start removing components of the job bit by bit until the error goes away, then concentrate on what I did for the last change. Do you have lookups or transformations? Does the error persist when you remove these steps?

Camaj · Post by **Camaj** » Mon Oct 24, 2005 12:31 pm

I did that.

First, I removed all stages and kept only a DS dataset and the DB2 table.

I got the same problem.

When I write to a DS instead of DB2, everything works fine.

The problem is related to using LOAD in the DB2 stage.

Thanks for your help!

ArndW · Post by **ArndW** » Mon Oct 24, 2005 12:55 pm

OK, so then it goes away when you don't load to DB/2. What happens if you change the stage to upsert? Is your scratch filling up? Could you look at your log file and post the actual error message, plus look a couple of entries before and after for warnings or other text that might assist in narrowing down the cause? Run just a couple of thousand rows through (put a constraint @INROWNUM < 5000 in a transform stage) and see if the data actually gets written to DB/2. Sort your incoming data stream differently (could it be related to what you are trying to write, i.e. bad data of some type?).

The answer to your problem isn't obvious from your error description, so you will need to do some more diagnosis and reporting in order to narrow it down.

Camaj · Post by **Camaj** » Mon Oct 24, 2005 1:27 pm

When I use WRITE instead of LOAD, it works fine.

I can't check before or after the error, because the problem occurs at a different row on each run.

I ran with the @INROWNUM < 5000 constraint and did not get any problem.

Thanks!

ArndW · Post by **ArndW** » Mon Oct 24, 2005 1:57 pm

The Load functionality will buffer data - can you watch your temporary and scratch areas to see if they fill up during the big run?

Camaj · Post by **Camaj** » Wed Oct 26, 2005 7:47 am

Hello,

I checked the temporary and scratch areas during the run with the Unix command df -k.

Both directories are only 20% used.

Thanks!
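A single `df -k` reading can miss a short-lived peak, so one option is to sample the two areas repeatedly while the job runs. The sketch below is only an illustration: the helper names `percent_used` and `watch` are invented here, and the paths you pass in must be your own scratch and temporary directories, which this thread does not name.

```python
import shutil
import time


def percent_used(path: str) -> float:
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def watch(paths, interval_s: float = 30.0, samples: int = 3) -> dict:
    """Sample each path `samples` times and return the peak usage seen per path."""
    peaks = {p: 0.0 for p in paths}
    for i in range(samples):
        for p in paths:
            peaks[p] = max(peaks[p], percent_used(p))
        if i < samples - 1:
            time.sleep(interval_s)  # wait before taking the next sample
    return peaks
```

For example, `watch([scratch_dir, tmp_dir], interval_s=10, samples=360)` would cover an hour of runtime, with `scratch_dir` and `tmp_dir` as placeholders for the real locations.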

ArndW · Post by **ArndW** » Wed Oct 26, 2005 8:19 am

I don't know what else it might be - if nothing shows up in your DB/2 logs then I think it is time to contact Ascential/IBM support.

kumar_s · Post by **kumar_s** » Wed Oct 26, 2005 9:01 am

Hi,
Also give this a shot.
If your project is configured for time-based monitoring, change it to size-based monitoring.

regards
kumar

Camaj · Post by **Camaj** » Wed Oct 26, 2005 9:28 am

Thanks!

How can I change the project configuration from time-based to size-based monitoring?

ArndW · Post by **ArndW** » Wed Oct 26, 2005 10:46 am

You can change this monitoring information in the $APT settings, but I am not quite sure what extra information would come out of this. Perhaps Kumar could explain.

kumar_s · Post by **kumar_s** » Wed Oct 26, 2005 9:19 pm

Hi,
Size-based monitoring always takes precedence over time-based monitoring.
Broken pipe errors have also been caused by the JobMon process, because of the monitoring frequency.
It is recommended to have some value, say 5, in APT_MONITOR_TIME.
To override this, APT_MONITOR_SIZE can be assigned a large value, say a million.
Also check your swap space.

regards
kumar

track_star · Post by **track_star** » Thu Oct 27, 2005 10:06 am

Let me clarify a little on what kumar_s said.

Size-based monitoring always takes precedence over time-based monitoring.
--> This is incorrect. As it comes straight out of the box, it is just the opposite. Time-based monitoring is the default and, unless you change it, is the preferred method.

Broken pipe errors have also been caused by the JobMon process, because of the monitoring frequency.

This is probably correct, but without more information and troubleshooting it is hard to tell. There could be an issue with DB2, but it's more likely that modifying the JobMon settings will cause the issue to disappear.

With that said, here's the skinny on JobMon.

The default value for APT_MONITOR_TIME is 5; if no value is present for APT_MONITOR_SIZE, the engine uses time-based monitoring. There have been issues with time-based monitoring, so a recommended approach is to set APT_MONITOR_SIZE to some large value (like 50000 or 100000). This forces the engine to use row-based monitoring and decreases the frequency of JobMon checks by the engine. This only works, however, if you have APT_MONITOR_TIME set to the default value (of 5). If any other value is set in APT_MONITOR_TIME, time-based monitoring is used. The only other alternative is to turn monitoring off by setting APT_NO_JOBMON=1. There are some patches that fix issues with monitoring (check with Ascential Support to see if one applies in your particular case).
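The rules in this post can be written down as a small decision function. This is only a sketch of the description above, not the engine's actual logic: the function name `monitoring_mode` is made up, the environment is passed in as a plain dictionary of strings, and it assumes that an unset APT_MONITOR_TIME behaves like the default of 5.

```python
DEFAULT_MONITOR_TIME = "5"  # default value stated in the post above


def monitoring_mode(env: dict) -> str:
    """Return "off", "size" or "time" following the rules described above."""
    # APT_NO_JOBMON=1 switches monitoring off entirely.
    if env.get("APT_NO_JOBMON") == "1":
        return "off"
    # Assumption: an unset APT_MONITOR_TIME counts as the default.
    time_value = env.get("APT_MONITOR_TIME", DEFAULT_MONITOR_TIME)
    size_value = env.get("APT_MONITOR_SIZE")
    # Size (row) based monitoring applies only when a size is set
    # and the time value is still at its default.
    if size_value and time_value == DEFAULT_MONITOR_TIME:
        return "size"
    return "time"
```

Under these rules, `monitoring_mode({"APT_MONITOR_SIZE": "100000"})` gives `"size"`, while also setting APT_MONITOR_TIME to anything other than 5 falls back to `"time"`.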