Debian Bug report logs - #582745
upstart: Upstart jobs get stuck when "expect daemon" or "expect fork" is wrong.

Package: upstart; Maintainer for upstart is Steve Langasek <vorlon@debian.org>; Source for upstart is src:upstart.

Reported by: David Caldwell <david@porkrind.org>

Date: Sun, 23 May 2010 10:33:01 UTC

Severity: normal

Tags: confirmed

Found in version upstart/0.6.3-1


Report forwarded to debian-bugs-dist@lists.debian.org, Michael Biebl <biebl@debian.org>:
Bug#582745; Package upstart. (Sun, 23 May 2010 10:33:04 GMT)

Acknowledgement sent to David Caldwell <david@porkrind.org>:
New Bug report received and forwarded. Copy sent to Michael Biebl <biebl@debian.org>. (Sun, 23 May 2010 10:33:04 GMT)

Message #5 received at submit@bugs.debian.org:

From: David Caldwell <david@porkrind.org>
To: Debian Bug Tracking System <submit@bugs.debian.org>
Subject: upstart: Upstart jobs get stuck when "expect daemon" or "expect fork" is wrong.
Date: Sun, 23 May 2010 03:22:11 -0700
Package: upstart
Version: 0.6.3-1
Severity: normal


While trying to write some upstart /etc/init .conf files, I noticed that if
I got the "expect daemon" or "expect fork" line wrong, init would get into a
bad state, causing any subsequent "initctl start" or "initctl stop"
commands for the job in question to hang, even after the .conf file was
fixed. The only workaround I could find was to rename the .conf file. I suspect
rebooting would work too, but I'm not in a position to reboot right now.

Clearly it was my fault that I got the "expect" line wrong, but even so I
don't think upstart should hang on every command for that job from then
on. There needs to be some way to reset it, or to keep it from getting into
that state at all.

Here's a small test case I made. I was unable to get it to hang with just an
"exec" line. I had to put it in a "script" and then call an external
program.

cat > fail.conf <<EOF
expect fork
script
        /bin/true
        sleep 10&
end script
EOF

If I try to start the job now it prints a PID that doesn't match up with the
correct PID. On my system I got:

# initctl start fail
fail start/running, process 20121
# ps aux | grep sleep
root     20122  0.0  0.0   3804   480 ?        S    03:07   0:00 sleep 10

At this point any attempt to "initctl stop fail" will hang. If I Control-C a
hung "stop" and try again it will say it's already stopped, but the state
doesn't show up correctly:

# initctl status fail
fail stop/killed, process 20121

Attempting to start it again also hangs. I couldn't find any way out of this
state except to reboot, or rename the fail.conf file to fail1.conf and
continue testing under that name.
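
The PID mismatch described above can be checked directly. The following is a
minimal sketch, assuming a Linux system with /proc mounted. The helper names
tracked_pid and pid_alive are hypothetical and are not part of upstart or
initctl. It pulls the PID out of an "initctl status" line in the format shown
above and tests whether a process with that PID still exists.

```shell
# Hypothetical helpers (not part of upstart): find out whether the PID
# that init is tracking for a job still refers to a live process.

# Print the PID from a status line such as
# "fail stop/killed, process 20121"; print nothing if the line has no PID.
tracked_pid() {
    printf '%s\n' "$1" | sed -n 's/.*, process \([0-9][0-9]*\).*/\1/p'
}

# Succeed if a process with the given PID exists (Linux /proc assumed).
# An empty argument is rejected so that "/proc/" itself never matches.
pid_alive() {
    [ -n "$1" ] && [ -d "/proc/$1" ]
}

# Example use against the job from the test case (run as root):
#   pid=$(tracked_pid "$(initctl status fail)")
#   pid_alive "$pid" || echo "init is tracking PID $pid, which no longer exists"
```

If the tracked PID no longer exists, init appears to be waiting for an exit
it will never see, which would match the hang described in this report.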

-- System Information:
Debian Release: squeeze/sid
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.33-2-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages upstart depends on:
ii  initscripts                   2.87dsf-10 scripts for initializing and shutt
ii  libc6                         2.10.2-6   Embedded GNU C Library: Shared lib
ii  libdbus-1-3                   1.2.24-1   simple interprocess messaging syst
ii  sysv-rc                       2.87dsf-10 System-V-like runlevel change mech
ii  sysvinit-utils                2.87dsf-10 System-V-like utilities

upstart recommends no packages.

upstart suggests no packages.

-- no debconf information




Information forwarded to debian-bugs-dist@lists.debian.org, Scott James Remnant <scott@netsplit.com>:
Bug#582745; Package upstart. (Mon, 06 Jun 2011 17:00:03 GMT)

Acknowledgement sent to Alan Givati <alangiv@gmail.com>:
Extra info received and forwarded to list. Copy sent to Scott James Remnant <scott@netsplit.com>. (Mon, 06 Jun 2011 17:00:03 GMT)

Message #10 received at 582745@bugs.debian.org:

From: Alan Givati <alangiv@gmail.com>
To: 582745@bugs.debian.org
Subject: I have been experiencing a similar issue
Date: Mon, 06 Jun 2011 19:57:36 +0300
If there is something bad in a *script* stanza then that job will become 
broken (freezes on start or stop commands) till reboot.  Is there a way 
to "restart" upstart?

Information forwarded to debian-bugs-dist@lists.debian.org, Scott James Remnant <scott@netsplit.com>:
Bug#582745; Package upstart. (Fri, 13 Apr 2012 20:33:04 GMT)

Acknowledgement sent to Paul Makepeace <paulm@paulm.com>:
Extra info received and forwarded to list. Copy sent to Scott James Remnant <scott@netsplit.com>. (Fri, 13 Apr 2012 20:33:04 GMT)

Message #15 received at 582745@bugs.debian.org:

From: Paul Makepeace <paulm@paulm.com>
To: 582745@bugs.debian.org
Subject: Any fix in sight?
Date: Fri, 13 Apr 2012 13:31:30 -0700
I've also just run into this issue.

I see this bug has been open for nearly two years. Is it going to get
looked at? Is there anything we can do to help?




Information forwarded to debian-bugs-dist@lists.debian.org, Scott James Remnant <scott@netsplit.com>:
Bug#582745; Package upstart. (Mon, 07 May 2012 06:33:03 GMT)

Acknowledgement sent to James Vautin <james.vautin@iplayup.com>:
Extra info received and forwarded to list. Copy sent to Scott James Remnant <scott@netsplit.com>. (Mon, 07 May 2012 06:33:03 GMT)

Message #20 received at 582745@bugs.debian.org:

From: James Vautin <james.vautin@iplayup.com>
To: 582745@bugs.debian.org
Subject: Me too...
Date: Mon, 7 May 2012 16:32:15 +1000
Throwing up my hand as well!

Upstart, with a particular job, is in limbo (hung start/stop) until I reboot the relevant hosts, which at this time is not an option if we are to maintain our 99.99% uptime!

This seems like a pretty serious bug. I'm surprised it has been open for 2 years and more people aren't screaming. Or perhaps we're all doing something wrong and there's a way to unstick these jobs without rebooting? :)

Information forwarded to debian-bugs-dist@lists.debian.org, Scott James Remnant <scott@netsplit.com>:
Bug#582745; Package upstart. (Mon, 07 May 2012 06:39:03 GMT)

Acknowledgement sent to James Vautin <james@vautin.com>:
Extra info received and forwarded to list. Copy sent to Scott James Remnant <scott@netsplit.com>. (Mon, 07 May 2012 06:39:03 GMT)

Message #25 received at 582745@bugs.debian.org:

From: James Vautin <james@vautin.com>
To: 582745@bugs.debian.org
Subject: Workaround
Date: Mon, 7 May 2012 16:36:18 +1000
I should also have mentioned: my workaround at the moment is to rename the job. I guess I can't use the old job name until we reboot :)

Information forwarded to debian-bugs-dist@lists.debian.org, Scott James Remnant <scott@netsplit.com>:
Bug#582745; Package upstart. (Wed, 09 May 2012 09:51:03 GMT)

Acknowledgement sent to Marc Cooper <marc@auxbuss.com>:
Extra info received and forwarded to list. Copy sent to Scott James Remnant <scott@netsplit.com>. (Wed, 09 May 2012 09:51:04 GMT)

Message #30 received at 582745@bugs.debian.org:

From: Marc Cooper <marc@auxbuss.com>
To: 582745@bugs.debian.org
Subject: Same problem
Date: Wed, 09 May 2012 10:37:04 +0100
Confirmed in version: 0.6.6-4

I just ran into this issue. I'm very grateful you reported it as I could
have spent hours trying to debug it.

I have exactly the symptoms you describe, and I tried your debug steps,
which I can replicate.

-- 
Best,
Marc




Added tag(s) confirmed. Request was from Steve Langasek <vorlon@debian.org> to control@bugs.debian.org. (Fri, 31 Aug 2012 20:54:12 GMT)

Information forwarded to debian-bugs-dist@lists.debian.org, Steve Langasek <vorlon@debian.org>:
Bug#582745; Package upstart. (Tue, 19 Feb 2013 19:12:03 GMT)

Acknowledgement sent to Vladimir Rutsky <rutsky.vladimir@gmail.com>:
Extra info received and forwarded to list. Copy sent to Steve Langasek <vorlon@debian.org>. (Tue, 19 Feb 2013 19:12:03 GMT)

Message #37 received at 582745@bugs.debian.org:

From: Vladimir Rutsky <rutsky.vladimir@gmail.com>
To: 582745@bugs.debian.org
Subject: Workaround: create process with expected PID
Date: Tue, 19 Feb 2013 23:07:40 +0400
Hello,

I encountered the same issue that is described in this bug report. I found
a workaround for it by Johan Kiviniemi:

https://raw.github.com/ion1/workaround-upstart-snafu/master/workaround-upstart-snafu

Usage:

$ wget https://raw.github.com/ion1/workaround-upstart-snafu/master/workaround-upstart-snafu
$ chmod +x workaround-upstart-snafu
$ ./workaround-upstart-snafu PID

where PID is the process ID that upstart expects:

$ initctl status testdaemon
testdaemon stop/killed, process 16413

This is a Ruby script and, as far as I can see, it creates lots of child
processes using fork until a process with the requested PID is created. After
the script finishes, upstart thinks that the daemon has stopped:

$ initctl status testdaemon
testdaemon stop/waiting

On my machine this script runs for a few minutes, which is long, but more
acceptable than a reboot in some situations.
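
The run time of a few minutes can be estimated with a little arithmetic. The
sketch below assumes that Linux hands out PIDs sequentially and wraps around
at kernel.pid_max. It ignores PIDs that are still in use and the low range the
kernel skips after wrapping, so it gives a rough upper bound rather than an
exact count. The function name pids_until is hypothetical.

```shell
# Hypothetical helper: estimate how many processes must be created before
# the kernel's PID counter reaches TARGET.
# Usage: pids_until TARGET LAST_PID PID_MAX
pids_until() {
    if [ "$1" -gt "$2" ]; then
        # The target is still ahead of the counter: no wraparound needed.
        echo $(( $1 - $2 ))
    else
        # The counter has passed the target: count up to PID_MAX - 1,
        # wrap around, then count up to the target.
        echo $(( ($3 - 1 - $2) + $1 ))
    fi
}
```

For example, assuming a pid_max of 32768 and a counter that has already
advanced to 16500 (both values are assumptions for illustration),
"pids_until 16413 16500 32768" prints 32680. Because the stuck PID was usually
allocated recently, the counter has normally just passed it, so nearly a full
cycle of forks is needed. Real values can be read from
/proc/sys/kernel/pid_max and, on kernels that provide it,
/proc/sys/kernel/ns_last_pid.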


Best wishes,

Vladimir Rutsky


Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Wed Apr 16 19:30:51 2014; Machine Name: buxtehude.debian.org

Debian Bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.