Bus error (Coredump)

Posted: Thu Mar 04, 2004 4:41 pm
by chulett
Any help out there for tracking one of these down? What I've got is a Command stage in a Sequence that (at the end of the process) runs a Korn shell script. It and others like it have been running fine for months in a number of different jobs. Today I added it to a new job and I get the following error when the job runs:

Code:

BatchETimeMaster..JobControl (@Archive_Files): Executed: /home/dsuser/scripts/archive_files.ksh /staging/ETIME/archive ETime /home/payftp/ETIMEINPUT.TXT /staging/ETIME/load/ETime* /staging/ETIME/load/Missing*
Reply=138
Output from command ====>
SH: 15502 Bus error(coredump)
I can run this same command manually and it will work fine. I can copy it out of the log entry and run it. But actually run the job and I've gotten a coredump - three times in a row now. :?

I'll open a Ticket with Ascential (and have sent it to my SA) but was curious if anyone out there had ever seen behaviour like this or had any thoughts they were willing to share.
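
A note on the Reply=138 value in the log above: shells conventionally report a command killed by a signal as 128 plus the signal number, so 138 would correspond to signal 10. That matches the "Bus error" text on systems where SIGBUS is signal 10, but signal numbering is platform specific, so treat the mapping as an assumption to verify locally (for example with `kill -l`). A minimal sketch, where `decode_status` is a hypothetical helper name:

```shell
#!/bin/sh
# Decode a shell exit status. Values above 128 conventionally mean
# "terminated by signal (status - 128)". decode_status is a
# hypothetical helper, not part of DataStage.
decode_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        echo "killed by signal $((status - 128))"
    else
        echo "exited with status $status"
    fi
}

decode_status 138   # prints "killed by signal 10"
```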

Posted: Thu Mar 04, 2004 5:05 pm
by ariear
Sorry about the silly question, but before running it from your command line, did you source dsenv (. dsenv) and the other stuff from the ..RCxx so that you're in the same environment?
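
One way to check the "same environment" question directly is to dump the environment from both places and compare. This is only a sketch: `snapshot_env` and the file names are made up for illustration. The idea is to run the same snippet once from the Command stage and once from the login shell, writing to two different files.

```shell
#!/bin/sh
# Write a sorted snapshot of the current environment to the given file.
# snapshot_env and the file names below are hypothetical.
snapshot_env() {
    env | sort > "$1"
}

# From the login shell (use a different file name when run from the job):
snapshot_env "${TMPDIR:-/tmp}/env_interactive.txt"

# Afterwards, compare the two snapshots by hand, for example:
#   diff /tmp/env_job.txt /tmp/env_interactive.txt
```

Any variable that shows up only in the job's snapshot is a candidate for explaining why a script behaves differently under DataStage.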

Posted: Thu Mar 04, 2004 5:16 pm
by chulett
Yep! As close as possible to actually running under a job and it works fine. It actually runs just fine in many other jobs, across multiple servers and projects... there seems to be something magically wrong with this instance of it. :evil:

Maybe I'll try the old delete and re-add the Stage trick...

Posted: Fri Mar 05, 2004 10:33 am
by chulett
More information. I can't execute any scripts in any jobs on this particular server right now. Even if I create a job that does nothing other than call a script - any script - it coredumps. :evil:

Too much going on to allow me to restart DataStage right at the moment and I can certainly move on to other things. The system is coming down for a hardware upgrade tonight so perhaps things will be 'back to normal' after then. Perhaps. We shall see.

Posted: Mon Mar 08, 2004 11:37 am
by chulett
Always nice when four of the five entries so far are from the OP. :wink:

More info. Rebooting made no difference. Turns out what is happening is this - I can run any kind of normal O/S command (like "ls -l", for example) and anything using default shell (sh) syntax. What coredumps for me is when I need to 'pop' a Korn shell and run something using K shell syntax. Doesn't seem to matter if I do:

Code:

/usr/bin/ksh scriptname
or if I put the following line at the top of the script, which is the usual way I handle this:

Code:

#! /usr/bin/ksh
Either method causes a coredump on this particular server. I've got 'tickets' open with both Ascential and my local support forces, we'll see what comes of it. :?

Posted: Thu Sep 23, 2004 2:34 pm
by hgalusha
Craig,

Did you get a response from Ascential on this? I just upgraded an HPUX machine to 7.5 and I'm getting this exact behaviour. I suspect it has something to do with the new dsenv, but thus far, Ascential support hasn't come through with an answer.

Thanks in Advance,

Howard

Posted: Thu Sep 23, 2004 2:47 pm
by chulett
Not really. No fixes other than the need (for whatever reason) to change all references of:

Code:

#! /usr/bin/ksh
to:

Code:

#! /usr/bin/sh
in our scripts. Some of my servers can run either shell, but others (like Production!) core dump when specifying the korn shell. No clue what in the heck the difference is. Did not use to be a problem, and it didn't surface for us at a DataStage version change...

Posted: Thu Sep 23, 2004 5:34 pm
by kduke
Craig

How many files match the * patterns? The kernel limits how big the argument list (argv and argc) can get; argc corresponds to $# in a shell script. With enough matches you can blow out that limit. If you patched your UNIX lately, the limit may have been reset. I would wrap the call in a script that takes no arguments, something like archive_all.ksh, and try executing that. Just a guess.
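
A rough way to test this guess is to count what the patterns expand to and compare against the system limit. The sketch below uses the paths from the log earlier in the thread. `arg_report` is a hypothetical helper, and the byte count is only an approximation because the real limit also covers the environment.

```shell
#!/bin/sh
# Report how many existing files a set of expanded patterns produced
# and roughly how many bytes they occupy as arguments.
# arg_report is a hypothetical helper name.
arg_report() {
    count=0
    bytes=0
    for f in "$@"; do
        [ -e "$f" ] || continue          # skip patterns that matched nothing
        count=$((count + 1))
        bytes=$((bytes + ${#f} + 1))     # +1 for the terminating NUL
    done
    echo "$count $bytes"
}

# Prints "<count> <bytes>" for the patterns used by the failing command.
arg_report /staging/ETIME/load/ETime* /staging/ETIME/load/Missing*

# System limit on the combined size of arguments and environment.
echo "ARG_MAX: $(getconf ARG_MAX)"
```

If the byte figure is nowhere near ARG_MAX, the argument list is probably not the cause.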

Posted: Thu Sep 23, 2004 5:53 pm
by chucksmith
Have you verified the setting of the SHDISPATCH variable in the uvconfig file? Mine is set to:

Code:

SHDISPATCH /bin/sh
Is there a global .profile on your system?

Do ksh scripts still work from Unix?
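
A quick way to look at the current setting, assuming uvconfig sits in the engine directory that dsenv exports as $DSHOME (check your own install; `show_shdispatch` is a hypothetical helper name):

```shell
#!/bin/sh
# Print the SHDISPATCH line(s) from a uvconfig file.
# show_shdispatch is a hypothetical helper; the uvconfig location
# under $DSHOME is an assumption about the install layout.
show_shdispatch() {
    grep '^SHDISPATCH' "$1"
}

if [ -n "$DSHOME" ] && [ -f "$DSHOME/uvconfig" ]; then
    show_shdispatch "$DSHOME/uvconfig"
else
    echo "uvconfig not found; source dsenv first or pass the path yourself"
fi
```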

Posted: Thu Sep 23, 2004 7:25 pm
by chulett
Hmm... I'll have to check SHDISPATCH and check for a global .profile, I have no idea off the top of my head. As to the question re korn shell scripts from UNIX, yah, they all work great. It's only when they get executed by a DataStage job that they core dump - and then only on certain servers. :?

Kim, I'll try your suggestions when I get a chance. I'd pretty much forgotten about this as things are working fine for us after dropping back to the 'default' shell.

Posted: Thu Sep 23, 2004 10:22 pm
by jwhyman
LD_PRELOAD set? Not sure of your DS and HP rev so cannot advise; check with support whether you can safely unset it.

Posted: Fri Sep 24, 2004 9:38 am
by tonystark622
Craig,

jwhyman wrote:LD_PRELOAD set? Not sure of your DS and HP rev so cannot advise; check with support whether you can safely unset it.
I ran into this problem too and was told that I could comment out the LD_PRELOAD stuff in the dsenv if I wasn't using PX, I think.

I just converted our scripts to use SH rather than KSH. We weren't using any KSH features, so it was a simple change for us.

Good Luck,
Tony

Posted: Fri Sep 24, 2004 9:46 am
by chulett
tonystark622 wrote:I ran into this problem too and was told that I could comment out the LD_PRELOAD stuff in the dsenv if I wasn't using PX, I think.
Interesting. I was working on the comment in dsenv that says it must be unset on HP/UX 11.00 - and we run 11.11, which I understand is 11i.

However, I've just double-checked and dev has it commented out while production doesn't. Dev runs korn scripts, production doesn't. Hmm.... I'll comment it out in production and see what happens after the next bounce.
tonystark622 wrote:I just converted our scripts to use SH rather than KSH. We weren't using any KSH features, so it was a simple change for us.
Yah, worked well for us, too. :wink:

Posted: Fri Sep 24, 2004 9:54 am
by tonystark622
chulett wrote:Interesting. I was working on the comment in dsenv that says it must be unset on HP/UX 11.00 - and we run 11.11, which I understand is 11i.
Ya know. I just looked at our dsenv file and that comment _is_ in there. All this time and issues with Ascential and I didn't even see that comment. <sigh> I'm just busy... yeah... that's my excuse :)

Tony

Posted: Mon Sep 27, 2004 7:47 am
by hgalusha
BTW, we are HPUX 11.11 as well. Ascential recommended we comment out the LD_PRELOAD statement in the dsenv. We did it and voila, kornshell scripts work again.

Thanks again for your responses and collaboration here.
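
For anyone who wants to confirm the LD_PRELOAD theory before touching dsenv, one option is to run a single command with the variable removed from its environment. This is a sketch, not a fix recommended by Ascential: `run_without_preload` is a hypothetical helper, and whether it helps depends on LD_PRELOAD really being the cause on your server.

```shell
#!/bin/sh
# Run a command in a subshell with LD_PRELOAD removed, leaving the
# caller's environment untouched. run_without_preload is a
# hypothetical helper name.
run_without_preload() {
    (
        unset LD_PRELOAD
        "$@"
    )
}

# Show what the current process sees.
echo "LD_PRELOAD is: ${LD_PRELOAD:-<not set>}"

# Try a trivial Korn shell command without the preload, if ksh exists here.
if [ -x /usr/bin/ksh ]; then
    run_without_preload /usr/bin/ksh -c 'print "ksh started"'
fi
```

If the Korn shell starts cleanly this way from a job but coredumps when called directly, that points at the preload setting rather than at the scripts themselves.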