##F TFIO 000153 18:58:22(003) <input repartition(1),0> Fatal Error: Unable to allocate communication resources
##E TFPM 000000 18:58:22(001) <node_edwdev2> operator [{natural="/u001/DataStage_work/at/at_dmnd_dep_tran_final.ds", synthetic="input repartition(1)"}], partition 0 of 8, processID 8,212,494 on edwdev2, player 2 terminated unexpectedly.
##E TFPM 000338 18:58:22(001) <main_program> Unexpected exit status 1
##E TOFN 000001 18:58:22(000) <funnel,2> Failure during execution of operator logic.
##I TOFN 000163 18:58:22(001) <funnel,2> Input 0 consumed 63525 records.
##I TOFN 000163 18:58:22(002) <funnel,2> Input 1 consumed 86123 records.
##I TOFN 000094 18:58:22(003) <funnel,2> Output 0 produced 149648 records.
##F TFOR 000151 18:58:22(004) <funnel,2> Fatal Error: APT_SYSselect returned error status -1 and no inputs reached EOF.
Also, one of the input datasets is pretty large: DS1 is 48 million records, DS2 is about 500,000. However, I ran against a much smaller set (about 3.9 million and 0 records) and got the same error. This is a generic process that deals with input datasets of any size and writes to a parameterized DB2 table via db2write.
Hmmm, good point. The datasets that it is using are already partitioned, so we really don't need the hash in front of each. Let me try it without the hash and see what happens.
On one hand, hopefully it will work and I can move on. On the other hand, then we still don't really know why it failed in the first place.
Okay, I updated the program to not re-hash the datasets going into the funnel stage and that eliminated the issue. However, the fact that it works does not give me warm fuzzy feelings when I don't know or understand why it failed to begin with.
The exact same code works in our production environment just fine. Why does it fail in dev? I am guessing that it is something environmental (not necessarily code related). I have tried the process with varying input sizes and concluded that input record count does not matter - it fails with both large and small volumes.
Does anyone have suggestions about what might be causing the error? This is a generic process used by dozens of production jobs. I am loath to update a production process without a true understanding of why this failure is occurring - especially when the production process is running just fine.
Any help would be greatly appreciated!
Brad.
ps. I am NOT flagging this with a workaround. I don't know about the rest of you, but I tend to ignore entries that are resolved or marked as workarounds.
Just out of curiosity:
1. Are you running the same server version on both the Production and Development environments?
2. Are you running the same job with the same node configuration on both systems?
3. Are the environment variables under the Parallel branch exactly the same on both systems?
DataStage version is the same on dev and prod, as are environment variables/settings. The node configuration is different, but only in terms of the number of nodes. The way the nodes are configured is the same.
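Since the code is identical, diffing the engine-level settings captured on each host is a quick way to back up the "same environment variables" claim. A minimal sketch, assuming you can save the output of `env` (or a `dsenv` dump) to a file on each server; the file paths and helper names here are hypothetical, not part of any DataStage tooling:

```python
def load_env_dump(path):
    """Parse a KEY=value dump, e.g. the output of `env` captured on one host."""
    settings = {}
    with open(path) as fh:
        for line in fh:
            if "=" in line:
                key, _, value = line.rstrip("\n").partition("=")
                settings[key] = value
    return settings

def diff_env(dev, prod):
    """Report keys present on only one host, and keys whose values differ."""
    only_dev = sorted(set(dev) - set(prod))
    only_prod = sorted(set(prod) - set(dev))
    changed = sorted(k for k in set(dev) & set(prod) if dev[k] != prod[k])
    return only_dev, only_prod, changed

# Example with in-memory dumps instead of files:
dev = {"APT_CONFIG_FILE": "/opt/dev/config.apt", "APT_DUMP_SCORE": "1"}
prod = {"APT_CONFIG_FILE": "/opt/prod/config.apt", "APT_MONITOR_SIZE": "100000"}
only_dev, only_prod, changed = diff_env(dev, prod)
print("dev only:", only_dev)
print("prod only:", only_prod)
print("differing values:", changed)
```

Anything that shows up in the "differing values" list (beyond expected path differences) is a candidate for the environmental cause.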
Continuous, sort/merge or sequential Funnel? I suspect the second, and that the process that watches all inputs simultaneously to figure out which is next to preserve sorted order has taken some kind of abort. It's not totally clear why - but that's the line of investigation I'd be following, even unto re-instating the original partitioning to see whether it's reproducible. Maybe production has the "preserve partitioning" flag set to Clear but the development has "Set" or "Propagate"?
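To illustrate the failure mode described above: APT_SYSselect is presumably the engine's wrapper around the POSIX select() call that watches all funnel inputs at once, and select() returns -1 when one of the descriptors it is watching becomes invalid (e.g. the producing player dies and its pipe is torn down). A minimal Python sketch, not DataStage code, just demonstrating that behavior:

```python
import os
import select

# A pipe stands in for one of the funnel's input links.
r, w = os.pipe()

# Normal case: select() reports the descriptor once data is waiting on it.
os.write(w, b"record")
ready, _, _ = select.select([r], [], [], 1.0)
assert ready == [r]

# Failure case: if the descriptor is torn down while still being watched,
# the underlying C select() returns -1 with EBADF; Python surfaces that
# as an OSError instead of a -1 return code.
os.close(r)
os.close(w)
try:
    select.select([r], [], [], 1.0)
    failed = False
except OSError:
    failed = True
print("select failed after descriptor loss:", failed)
```

That would match the log: the player for `input repartition(1)` terminated unexpectedly first, and the funnel's select() then failed before any input had reached EOF.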
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Just a quick update. Turns out it was probably an IBM DataStage patch that caused our problems.
On development, this patch was installed in May. However, the job we are working with runs very rarely in dev, so we never hit the error until very recently and did not make the connection between the patch and our error.
On the other hand, this patch was just installed into our production environment and lo and behold the same job failed instantly with the same error message.
The patch has been backed out and we are now awaiting a fix from IBM to resolve the issue.
When I get the patch identification information and the related fix, I will post as much info as I can.
We are also having the same issue, and our environment has the same patches installed that you mentioned. Since you mentioned in the topic that you have a workaround, could you please specify what that workaround is until we get a fix for the patches?