Debian Bug report logs -
#357140
coreutils: ln -sf is not atomic
Reported by: Roger Leigh <rleigh@debian.org>
Date: Wed, 15 Mar 2006 21:33:07 UTC
Severity: minor
Tags: wontfix
Found in version coreutils/5.94-1
Done: Michael Stone <mstone@debian.org>
Bug is archived. No further changes may be made.
Toggle useless messages
Report forwarded to debian-bugs-dist@lists.debian.org, Michael Stone <mstone@debian.org>:
Bug#357140; Package coreutils.
(full text, mbox, link).
Acknowledgement sent to Roger Leigh <rleigh@debian.org>:
New Bug report received and forwarded. Copy sent to Michael Stone <mstone@debian.org>.
(full text, mbox, link).
Message #5 received at submit@bugs.debian.org (full text, mbox, reply):
Package: coreutils
Version: 5.94-1
Severity: normal
ln -sf can fail due to two unhandled race conditions:
To reproduce:
while true; do /bin/ln -sf foo bar; done &
(repeat a few times)
while true; do rm -f bar ; sleep 1; done &
You will see intermittent errors:
/bin/ln: cannot remove `bar': No such file or directory
/bin/ln: creating symbolic link `bar' to `foo': File exists
The latter error is much more frequent than the former.
This is due to the sequence of events:
stat file
unlink file
symlink file
Normally, this is fine, but it does not cope with these factors:
1) The file can be unlinked by another process in between the stat and
the unlink, causing the unlink to fail. ln should be aware this
can happen, and not treat this as a fatal error.
2) Another process can create file between unlink and symlink, leading
to symlink creation failing. Again, ln should be aware creation
may fail. If the error is EEXIST, it should repeat the unlink and
try again.
Regards,
Roger
-- System Information:
Debian Release: testing/unstable
APT prefers unstable
APT policy: (990, 'unstable')
Architecture: powerpc (ppc)
Shell: /bin/sh linked to /bin/bash
Kernel: Linux 2.6.15.4
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Versions of packages coreutils depends on:
ii libacl1 2.2.35-1 Access control list shared library
ii libc6 2.3.6-3 GNU C Library: Shared libraries an
ii libselinux1 1.28-4 SELinux shared libraries
coreutils recommends no packages.
-- no debconf information
Information forwarded to debian-bugs-dist@lists.debian.org:
Bug#357140; Package coreutils.
(full text, mbox, link).
Acknowledgement sent to Michael Stone <mstone@debian.org>:
Extra info received and forwarded to list.
(full text, mbox, link).
Message #10 received at 357140@bugs.debian.org (full text, mbox, reply):
On Wed, Mar 15, 2006 at 09:23:08PM +0000, Roger Leigh wrote:
>1) The file can be unlinked by another process in between the stat and
> the unlink, causing the unlink to fail. ln should be aware this
> can happen, and not treat this as a fatal error.
Why?
>2) Another process can create file between unlink and symlink, leading
> to symlink creation failing. Again, ln should be aware creation
> may fail. If the error is EEXIST, it should repeat the unlink and
> try again.
Again, why?
Is there any practical purpose to changing any of this? I'm not sure I
care too much about what happens if you're running a tight endless loop
like that, because the outcome is going to be random (dependent on
execute order) anyway.
Mike Stone
Information forwarded to debian-bugs-dist@lists.debian.org, Michael Stone <mstone@debian.org>:
Bug#357140; Package coreutils.
(full text, mbox, link).
Acknowledgement sent to Roger Leigh <rleigh@whinlatter.ukfsn.org>:
Extra info received and forwarded to list. Copy sent to Michael Stone <mstone@debian.org>.
(full text, mbox, link).
Message #15 received at 357140@bugs.debian.org (full text, mbox, reply):
[Message part 1 (text/plain, inline)]
Michael Stone <mstone@debian.org> writes:
> On Wed, Mar 15, 2006 at 09:23:08PM +0000, Roger Leigh wrote:
>>1) The file can be unlinked by another process in between the stat and
>> the unlink, causing the unlink to fail. ln should be aware this
>> can happen, and not treat this as a fatal error.
>
> Why?
Because otherwise the command fails when it should not:
process1 process2
stat stat
unlink
unlink
symlink
FAIL
>>2) Another process can create file between unlink and symlink, leading
>> to symlink creation failing. Again, ln should be aware creation
>> may fail. If the error is EEXIST, it should repeat the unlink and
>> try again.
>
> Again, why?
Because otherwise the command fails when it should not:
process1 process2
stat
unlink
stat
symlink
symlink
FAIL
In both these cases, the -f switch tells ln to unlink and replace any
preexisting file, if it exists. If the file appears or disappears
during the execution of ln, this should not affect its behaviour. As
it is, ln does not behave deterministically.
> Is there any practical purpose to changing any of this? I'm not sure I
> care too much about what happens if you're running a tight endless
> loop like that, because the outcome is going to be random (dependent
> on execute order) anyway.
This example was an artificial test case. Its only purpose is to
demonstrate the problem in an efficient manner.
This *is* a real problem, which we are seeing with the sbuild part of
buildd. It has (perl):
system "/bin/ln -sf $main::pkg_logfile $conf::build_dir/current-$main::distribution";
system "/bin/ln -sf $main::pkg_logfile $conf::build_dir/current";
When buildd schedules builds concurrently, sbuild is showing the
problem intermittently.
Regards,
Roger
--
Roger Leigh
Printing on GNU/Linux? http://gutenprint.sourceforge.net/
Debian GNU/Linux http://www.debian.org/
GPG Public Key: 0x25BFB848. Please sign and encrypt your mail.
[Message part 2 (application/pgp-signature, inline)]
Information forwarded to debian-bugs-dist@lists.debian.org:
Bug#357140; Package coreutils.
(full text, mbox, link).
Acknowledgement sent to Michael Stone <mstone@debian.org>:
Extra info received and forwarded to list.
(full text, mbox, link).
Message #20 received at 357140@bugs.debian.org (full text, mbox, reply):
On Wed, Mar 15, 2006 at 11:53:54PM +0000, Roger Leigh wrote:
>In both these cases, the -f switch tells ln to unlink and replace any
>preexisting file, if it exists. If the file appears or disappears
>during the execution of ln, this should not affect its behaviour.
I'm not convinced that's true; I don't think there's anything in the
documentation that suggests that the command is atomic.
>This *is* a real problem, which we are seeing with the sbuild part of
>buildd. It has (perl):
>
> system "/bin/ln -sf $main::pkg_logfile $conf::build_dir/current-$main::distribution";
> system "/bin/ln -sf $main::pkg_logfile $conf::build_dir/current";
>
>When buildd schedules builds concurrently, sbuild is showing the
>problem intermittently.
Well, calling ln from perl via system is fugly anyway. I'd suggest using
something like:
unlink($conf::build_dir/current-$main::distribution);
if (!symlink($main::pkg_logfile, $conf::build_dir/current-$main::distribution))
{
handle_error() if (not $! =~ m/File exists/);
}
I'm not sure what you expect the end result to be if two processes try
and operate at the same time. It may be that you need a locking
mechanism for this to make sense.
Mike Stone
Information forwarded to debian-bugs-dist@lists.debian.org, Michael Stone <mstone@debian.org>:
Bug#357140; Package coreutils.
(full text, mbox, link).
Acknowledgement sent to bob@proulx.com (Bob Proulx):
Extra info received and forwarded to list. Copy sent to Michael Stone <mstone@debian.org>.
(full text, mbox, link).
Message #25 received at 357140@bugs.debian.org (full text, mbox, reply):
Roger Leigh wrote:
> Michael Stone <mstone@debian.org> writes:
> > Roger Leigh wrote:
> >>1) The file can be unlinked by another process in between the stat and
> >> the unlink, causing the unlink to fail. ln should be aware this
> >> can happen, and not treat this as a fatal error.
> >
> > Why?
>
> Because otherwise the command fails when it should not:
I don't think there ever has been a promise that these commands have
atomic behavior.
ln -sf SOURCE DESTINATION
Is basically the same as the following, only more conveniently
packaged for use as one shell command.
rm -f DESTINATION
ln -s SOURCE DESTINATION
Here is a reference to a similar discussion from upstream:
http://lists.gnu.org/archive/html/bug-coreutils/2005-05/msg00033.html
If you have suggestions as to how to improve the behavior I am sure
that would be appreciated. But I don't see how this can be made
atomic. And this does not seem like something to be fixed simply.
> This *is* a real problem, which we are seeing with the sbuild part of
> buildd. It has (perl):
>
> system "/bin/ln -sf $main::pkg_logfile $conf::build_dir/current-$main::distribution";
> system "/bin/ln -sf $main::pkg_logfile $conf::build_dir/current";
Ew, that is ugly. And why the hard coded paths?
> When buildd schedules builds concurrently, sbuild is showing the
> problem intermittently.
If you need an atomic operation then you will need to use a command
that is atomic such as rename(2) accessable from within mv. Here is
an example in shell off the top of my head but expecting to use a
better temporary name than the one I just produced for this example.
ln -sf $pkg_logfile $build_dir/current-$distribution.$$
mv $build_dir/current-$distribution.$$ $build_dir/current-$distribution
But of course you are in perl and can access rename from perl even
more directly.
Michael Stone wrote:
> Well, calling ln from perl via system is fugly anyway. I'd suggest using
> something like:
>
> unlink($conf::build_dir/current-$main::distribution);
> if (!symlink($main::pkg_logfile,
> $conf::build_dir/current-$main::distribution))
> {
> handle_error() if (not $! =~ m/File exists/);
> }
Let me suggest the following instead:
unlink($conf::build_dir/current-$main::distribution);
symlink($main::pkg_logfile,$conf::build_dir/current-$main::distribution.$$) or die;
rename($conf::build_dir/current-$main::distribution.$$,$conf::build_dir/current-$main::distribution) or die;
I tested the principle using this script.
#!/usr/bin/perl
unlink($ARGV[1].$$);
symlink($ARGV[0],$ARGV[1].$$) || die;
rename($ARGV[1].$$,$ARGV[1]) || die;
> I'm not sure what you expect the end result to be if two processes try
> and operate at the same time. It may be that you need a locking
> mechanism for this to make sense.
Agreed. But avoiding needing locking when possible is better.
Bob
Information forwarded to debian-bugs-dist@lists.debian.org:
Bug#357140; Package coreutils.
(full text, mbox, link).
Acknowledgement sent to Michael Stone <mstone@debian.org>:
Extra info received and forwarded to list.
(full text, mbox, link).
Message #30 received at 357140@bugs.debian.org (full text, mbox, reply):
On Thu, Mar 16, 2006 at 01:29:57AM -0700, Bob Proulx wrote:
>Let me suggest the following instead:
>
> unlink($conf::build_dir/current-$main::distribution);
> symlink($main::pkg_logfile,$conf::build_dir/current-$main::distribution.$$) or die;
> rename($conf::build_dir/current-$main::distribution.$$,$conf::build_dir/current-$main::distribution) or die;
I considered something like that...
>> I'm not sure what you expect the end result to be if two processes try
>> and operate at the same time. It may be that you need a locking
>> mechanism for this to make sense.
>
>Agreed. But avoiding needing locking when possible is better.
...but without knowing what the end result is, I didn't see it as an
improvement. That is, if two processes are writing at the same time,
one's output is going to get clobbered--if one gets clobbered atomically
I don't see an obvious advantage. If you have an actual locking scheme
you can determine the order of what happens; otherwise you're just
guessing, especially since it looks like there are *two* files involved
(the rename scheme can't atomically handle both of them). If you are
going to do the symlink/rename scheme I'd also use a better random
filename than $$. It's also not necessary to do the unlink in that
case--actually it's better not to, since the rename will clobber the
destination anyway.
Mike Stone
Information forwarded to debian-bugs-dist@lists.debian.org, Michael Stone <mstone@debian.org>:
Bug#357140; Package coreutils.
(full text, mbox, link).
Acknowledgement sent to bob@proulx.com (Bob Proulx):
Extra info received and forwarded to list. Copy sent to Michael Stone <mstone@debian.org>.
(full text, mbox, link).
Message #35 received at 357140@bugs.debian.org (full text, mbox, reply):
Michael Stone wrote:
> ...but without knowing what the end result is, I didn't see it as an
> improvement. That is, if two processes are writing at the same time,
> one's output is going to get clobbered--if one gets clobbered atomically
> I don't see an obvious advantage.
Sometimes it does not matter which file is in place as long as the
file in place is always syntactically correct.
But you are right, not knowing the particular case here makes it hard
to suggest an alternative. I was transferring from cases I have
typically run into in my own experience and was pursuing an
alternative which avoided spurious noise from commands. However, that
noise may have been an early warning indicator that something was
generally wrong. So making it quiet may not be a good thing after
all because it removes the early warning of other problems.
> Bob Proulx wrote:
> > unlink($conf::build_dir/current-$main::distribution);
> > symlink($main::pkg_logfile,$conf::build_dir/current-$main::distribution.$$) or die;
> > rename($conf::build_dir/current-$main::distribution.$$,$conf::build_dir/current-$main::distribution) or die;
> It's also not necessary to do the unlink in that case--actually it's
> better not to, since the rename will clobber the destination anyway.
Drat! You are correct. My test script had it right, unlinking the
symlink target. But I goofed translating that to the *real* case and
unlinked the wrong target. That should have been:
unlink($conf::build_dir/current-$main::distribution.$$);
symlink($main::pkg_logfile,$conf::build_dir/current-$main::distribution.$$) or die;
rename($conf::build_dir/current-$main::distribution.$$,$conf::build_dir/current-$main::distribution) or die;
With an unlink of the symlink target. Otherwise the symlink call may
fail if it tries to symlink to an existing target.
> I'd also use a better random filename than $$.
That would be fine. But this is a temporary file being used by
cooperating processes in a local tmp directory. It does not need to
be random. It just needs to be unique among all of the processes at
that moment in time. I don't think extreme measures are needed for
this type of simple case. But obviously it won't hurt and if the case
becomes more complicated later then the infrastructure is in place for
it. But I was trying to keep the example simple. (And as you saw I
still made a typo. :-)
Bob
Information forwarded to debian-bugs-dist@lists.debian.org, Michael Stone <mstone@debian.org>:
Bug#357140; Package coreutils.
(full text, mbox, link).
Acknowledgement sent to Roger Leigh <rleigh@whinlatter.ukfsn.org>:
Extra info received and forwarded to list. Copy sent to Michael Stone <mstone@debian.org>.
(full text, mbox, link).
Message #40 received at 357140@bugs.debian.org (full text, mbox, reply):
[Message part 1 (text/plain, inline)]
bob@proulx.com (Bob Proulx) writes:
> Roger Leigh wrote:
>> Michael Stone <mstone@debian.org> writes:
>> > Roger Leigh wrote:
>> >>1) The file can be unlinked by another process in between the stat and
>> >> the unlink, causing the unlink to fail. ln should be aware this
>> >> can happen, and not treat this as a fatal error.
>> >
>> > Why?
>>
>> Because otherwise the command fails when it should not:
>
> I don't think there ever has been a promise that these commands have
> atomic behavior.
OK.
> If you have suggestions as to how to improve the behavior I am sure
> that would be appreciated. But I don't see how this can be made
> atomic. And this does not seem like something to be fixed simply.
I'm not sure if it should or not be fixed, and if fixed, whether it
should or should not be atomic. I just wanted to report the existence
of the problem.
I'm not sure that (filesystem) atomicity is necessarily required. The
important thing (for me) is that it succeeds without error unless
there's a real failure (such as permissions issues). If that means
repeated unlinking until symlink succeeds, that's OK for me: that's
why I used the "-f" switch, because it's OK for the file to be
clobbered.
If this is undesirable, feel free to close or downgrade the severity.
I'm not sure how this would fit in with POSIX/SUS; if it doesn't fit,
it's obviously a broken idea.
>> This *is* a real problem, which we are seeing with the sbuild part of
>> buildd. It has (perl):
>>
>> system "/bin/ln -sf $main::pkg_logfile $conf::build_dir/current-$main::distribution";
>> system "/bin/ln -sf $main::pkg_logfile $conf::build_dir/current";
>
> Ew, that is ugly. And why the hard coded paths?
The hard-coded paths are just for testing (in case ln was a sh
builtin). I know it's really ugly: I didn't write it myself, and I
will be replacing it shortly with proper perl.
The reason this is done is that the "current" symlink points to the
currently running build log, so you could e.g. tail it to monitor the
current build. When more than one build is running, it points to the
most recently started one, but when they start at almost exactly the
same time, this is when the race is seen.
> If you need an atomic operation then you will need to use a command
> that is atomic such as rename(2) accessable from within mv. Here is
> an example in shell off the top of my head but expecting to use a
> better temporary name than the one I just produced for this example.
>
> ln -sf $pkg_logfile $build_dir/current-$distribution.$$
> mv $build_dir/current-$distribution.$$ $build_dir/current-$distribution
>
> But of course you are in perl and can access rename from perl even
> more directly.
Absolutely. I'll be doing just that. Or even just unlink and symlink
and continuing on error: in this case, it's not so important that it
succeeds, just that the current symlink exists and points to the
latest build. If it fails, it's because it got preempted by another
build, so that's OK.
> Let me suggest the following instead:
>
> unlink($conf::build_dir/current-$main::distribution);
> symlink($main::pkg_logfile,$conf::build_dir/current-$main::distribution.$$) or die;
> rename($conf::build_dir/current-$main::distribution.$$,$conf::build_dir/current-$main::distribution) or die;
>
> I tested the principle using this script.
>
> #!/usr/bin/perl
> unlink($ARGV[1].$$);
> symlink($ARGV[0],$ARGV[1].$$) || die;
> rename($ARGV[1].$$,$ARGV[1]) || die;
I'll probably just use || return.
Thanks for the suggestions, guys.
Regards,
Roger
--
Roger Leigh
Printing on GNU/Linux? http://gutenprint.sourceforge.net/
Debian GNU/Linux http://www.debian.org/
GPG Public Key: 0x25BFB848. Please sign and encrypt your mail.
[Message part 2 (application/pgp-signature, inline)]
Changed Bug title to `coreutils: ln -sf is not atomic' from `coreutils: ln -sf has two races which cause breakage'.
Request was from Lucas Nussbaum <lucas@lucas-nussbaum.net>
to control@bugs.debian.org.
(Wed, 23 Jan 2008 13:09:05 GMT) (full text, mbox, link).
Tags added: wontfix
Request was from Lucas Nussbaum <lucas@lucas-nussbaum.net>
to control@bugs.debian.org.
(Wed, 23 Jan 2008 13:09:05 GMT) (full text, mbox, link).
Severity set to `minor' from `normal'
Request was from Lucas Nussbaum <lucas@lucas-nussbaum.net>
to control@bugs.debian.org.
(Wed, 23 Jan 2008 13:09:06 GMT) (full text, mbox, link).
Reply sent to Michael Stone <mstone@debian.org>:
You have taken responsibility.
(full text, mbox, link).
Notification sent to Roger Leigh <rleigh@debian.org>:
Bug acknowledged by developer.
(full text, mbox, link).
Message #51 received at 357140-done@bugs.debian.org (full text, mbox, reply):
I'm going to close this since I still question the general utility of
attempting a sylink after an unlink fails, and since some alternate
approaches were suggested and seem to have been effective.
Mike Stone
Bug archived.
Request was from Debbugs Internal Request <owner@bugs.debian.org>
to internal_control@bugs.debian.org.
(Thu, 01 May 2008 07:33:06 GMT) (full text, mbox, link).
Send a report that this bug log contains spam.
Debian bug tracking system administrator <owner@bugs.debian.org>.
Last modified:
Sat Aug 26 13:27:07 2023;
Machine Name:
buxtehude
Debian Bug tracking system
Debbugs is free software and licensed under the terms of the GNU
Public License version 2. The current version can be obtained
from https://bugs.debian.org/debbugs-source/.
Copyright © 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson,
2005-2017 Don Armstrong, and many other contributors.