Debian Bug report logs - #469564
FreeBSD kernel doesn't follow x86/x86-64 ABI wrt direction flag

Package: kfreebsd-6; Maintainer for kfreebsd-6 is (unknown);

Reported by: aurel32@debian.org

Date: Sun, 2 Mar 2008 20:06:02 UTC

Severity: critical

Fixed in version kfreebsd-6/6.3-4

Done: Aurelien Jarno <aurel32@debian.org>

Bug is archived. No further changes may be made.


Report forwarded to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#469058; Package libc6. Full text and rfc822 format available.

Acknowledgement sent to Rupert Swarbrick <rswarbrick@googlemail.com>:
New Bug report received and forwarded. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>. Full text and rfc822 format available.

Message #5 received at submit@bugs.debian.org (full text, mbox):

From: Rupert Swarbrick <rswarbrick@googlemail.com>
To: submit@bugs.debian.org
Subject: libc6: New version of libc6 hangs SBCL
Date: Sun, 02 Mar 2008 20:04:46 +0000
Package: libc6
Version: 2.7-9
Severity: critical
Justification: breaks unrelated software

*** Please type your report below this line ***

After upgrading to libc6 2.7-9 in unstable, SBCL became extremely prone
to hanging at seemingly random points: after compiling 5-10 source files
of the SBCL CVS code, it hung at 100% CPU and could only be killed with
kill -s 9.

The issue was originally raised on the SBCL devel list here:

http://thread.gmane.org/gmane.lisp.steel-bank.devel/10902/focus=10905

I upgraded the system from libc6 2.7-8, which seems to have caused the
problem. Downgrading libc6 to the package in testing (libc 2.7-6) fixes
it again: I've since compiled sbcl from CVS twice to make sure.


Rupert Swarbrick


P.S. The System Information below was generated with reportbug and I've
currently got 2.7-6 installed. libgcc1 doesn't seem to have been
up/down-graded, though, so I think the below is mostly true except for
the "testing" bits, since all other libraries are at the current
versions in unstable.

-- System Information:
Debian Release: lenny/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: i386 (i686)

Kernel: Linux 2.6.24-1-686 (SMP w/2 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages libc6 depends on:
ii  libgcc1                 1:4.3-20080227-1 GCC support library

libc6 recommends no packages.

-- debconf information excluded






Information forwarded to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#469058; Package libc6. Full text and rfc822 format available.

Acknowledgement sent to Aurelien Jarno <aurelien@aurel32.net>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>. Full text and rfc822 format available.

Message #10 received at 469058@bugs.debian.org (full text, mbox):

From: Aurelien Jarno <aurelien@aurel32.net>
To: Rupert Swarbrick <rswarbrick@googlemail.com>, 469058@bugs.debian.org
Subject: Re: Bug#469058: libc6: New version of libc6 hangs SBCL
Date: Sun, 02 Mar 2008 22:58:23 +0100
Rupert Swarbrick wrote:
> Subject: libc6: New version of libc6 hangs SBCL
> Package: libc6
> Version: 2.7-9
> Severity: critical
> Justification: breaks unrelated software
> 
> *** Please type your report below this line ***
> 
> After upgrading to 2.7-9 of libc6 in unstable, SBCL became extremely
> prone to crashing 
> randomly (i.e. 5-10 source files compiled of the SBCL CVS code before a
> 100% CPU hang which 
> was only killable with -s 9.

Could you please give a way to reproduce the bug? I know nothing about 
SBCL, so a list of commands to execute or a shell script would be nice.

> The issue was originally raised on the SBCL devel list here:
> 
> http://thread.gmane.org/gmane.lisp.steel-bank.devel/10902/focus=10905
> 
> I upgraded the system from libc6 2.7-8, which seems to have caused the
> problem. Downgrading 
> libc6 to the package in testing (libc 2.7-6) fixes it again: I've since
> compiled sbcl from 
> CVS twice to make sure.

Could you please try to narrow the problem to a single libc6 version? 
Older versions are available on snapshot.debian.net.

-- 
  .''`.  Aurelien Jarno	            | GPG: 1024D/F1BCDB73
 : :' :  Debian developer           | Electrical Engineer
 `. `'   aurel32@debian.org         | aurelien@aurel32.net
   `-    people.debian.org/~aurel32 | www.aurel32.net




Information forwarded to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#469058; Package libc6. Full text and rfc822 format available.

Acknowledgement sent to Rupert Swarbrick <rswarbrick@googlemail.com>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>. Full text and rfc822 format available.

Message #15 received at 469058@bugs.debian.org (full text, mbox):

From: Rupert Swarbrick <rswarbrick@googlemail.com>
To: Aurelien Jarno <aurelien@aurel32.net>
Cc: Rupert Swarbrick <rswarbrick@googlemail.com>, 469058@bugs.debian.org
Subject: Re: Bug#469058: libc6: New version of libc6 hangs SBCL
Date: Sun, 02 Mar 2008 22:13:30 +0000
> > After upgrading to 2.7-9 of libc6 in unstable, SBCL became extremely
> > prone to crashing 
> > randomly (i.e. 5-10 source files compiled of the SBCL CVS code before a
> > 100% CPU hang which 
> > was only killable with -s 9.
> 
> Could you please give a way to reproduce the bug? I know nothing about 
> SBCL, so a list of commands to execute or a shell script would be nice.

Of course, sorry. Probably the easiest option is to get sbcl and darcs
(the version control system). It doesn't appear to matter which version
of sbcl - I've had the problem with 1.0.12 up to 1.0.14.

Then get clbuild (which is a little shell script which can among other
things get and build a new copy of sbcl, which is a project large enough
to guarantee the hang on my computer at least) using

darcs get http://common-lisp.net/project/clbuild/clbuild

Make the clbuild script inside the folder executable (chmod +x), cd
into the folder, and run

./clbuild buildsbcl

Hopefully (!) your system should hang at 100% CPU with the sbcl process
ignoring SIGTERM. Sorry about the involved duplication process - it's
just that you need to compile quite a bit of code before the seemingly
"random" bug strikes.

> Could you please try to narrow the problem to a single libc6 version? 
> Older versions are available on snapshot.debian.net.
> 

Could you let me know a way to convince dpkg to let me do that? Since
you have to up/downgrade a dependent package at the same time, and I'm a
bit scared of using --force-all on libc6 on my only computer! (Or am I
being needlessly cautious?)

Incidentally, as someone pointed out in the sbcl thread I linked to,
it's quite possible that your change in libc6 has thrown up a bug in
sbcl rather than the other way round - we're trying (and failing) to get
strace and gdb to tell us something useful about where sbcl died. However,
I'd be extremely grateful if you could suggest what change you think is
likely to be able to cause this sort of symptom - and I stand by my
categorization: the change definitely breaks an unrelated package :P

Thanks for the really prompt response!

Rupert

Information forwarded to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#469058; Package libc6. Full text and rfc822 format available.

Acknowledgement sent to "Liam Healy" <lnp@healy.washington.dc.us>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>. Full text and rfc822 format available.

Message #20 received at 469058@bugs.debian.org (full text, mbox):

From: "Liam Healy" <lnp@healy.washington.dc.us>
To: 469058@bugs.debian.org
Subject: libc6 version
Date: Sun, 2 Mar 2008 17:15:06 -0500
The problem started for me when this upgrade happened:
[UPGRADE] libc6 2.7-8 -> 2.7-9
[UPGRADE] libc6-dev 2.7-8 -> 2.7-9
and was not happening before.  Thus it is the changes in going to
2.7-9 from 2.7-8 that reveal this problem.

Liam




Information forwarded to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#469058; Package libc6. Full text and rfc822 format available.

Acknowledgement sent to Aurelien Jarno <aurelien@aurel32.net>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>. Full text and rfc822 format available.

Message #25 received at 469058@bugs.debian.org (full text, mbox):

From: Aurelien Jarno <aurelien@aurel32.net>
To: Rupert Swarbrick <rswarbrick@googlemail.com>, 469058@bugs.debian.org
Subject: Re: Bug#469058: libc6: New version of libc6 hangs SBCL
Date: Mon, 03 Mar 2008 00:01:47 +0100
Rupert Swarbrick wrote:
>>> After upgrading to 2.7-9 of libc6 in unstable, SBCL became extremely
>>> prone to crashing 
>>> randomly (i.e. 5-10 source files compiled of the SBCL CVS code before a
>>> 100% CPU hang which 
>>> was only killable with -s 9.
>> Could you please give a way to reproduce the bug? I know nothing about 
>> SBCL, so a list of commands to execute or a shell script would be nice.
> 
> Of course, sorry. Probably the easiest option is to get sbcl and darcs
> (the version control system). It doesn't appear to matter which version
> of sbcl - I've had the problem with 1.0.12 up to 1.0.14.
> 
> Then get clbuild (which is a little shell script which can among other
> things get and build a new copy of sbcl, which is a project large enough
> to guarantee the hang on my computer at least) using
> 
> darcs get http://common-lisp.net/project/clbuild/clbuild
> 
> Chmod +x the clbuild script inside the folder and cd inside and run
> 
> ./clbuild buildsbcl
> 
> Hopefully (!) your system should hang at 100% CPU with the sbcl process
> ignoring SIGTERM. Sorry about the involved duplication process - it's
> just that you need to compile quite a bit of code before the seemingly
> "random" bug strikes.

I am able to reproduce the bug, except that here the sbcl process hangs 
with 0% CPU.

>> Could you please try to narrow the problem to a single libc6 version? 
>> Older versions are available on snapshot.debian.net.
>>
> 
> Could you let me know a way to convince dpkg to let me do that? Since
> you have to up/downgrade a dependent package at the same time, and I'm a
> bit scared of using --force-all on libc6 on my only computer! (Or am I
> being needlessly cautious?)
> 

Theoretically if you upgrade/downgrade all libc6 related packages 
(libc6-dev, libc6-i686, ...) at the same time it should work.

Anyway now that I am able to reproduce the problem, I can do that myself.

-- 
  .''`.  Aurelien Jarno	            | GPG: 1024D/F1BCDB73
 : :' :  Debian developer           | Electrical Engineer
 `. `'   aurel32@debian.org         | aurelien@aurel32.net
   `-    people.debian.org/~aurel32 | www.aurel32.net




Information forwarded to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#469058; Package libc6. Full text and rfc822 format available.

Acknowledgement sent to Aurelien Jarno <aurelien@aurel32.net>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>. Full text and rfc822 format available.

Message #30 received at 469058@bugs.debian.org (full text, mbox):

From: Aurelien Jarno <aurelien@aurel32.net>
To: Liam Healy <lnp@healy.washington.dc.us>, 469058@bugs.debian.org, Rupert Swarbrick <rswarbrick@googlemail.com>
Subject: Re: Bug#469058: libc6 version
Date: Mon, 03 Mar 2008 00:03:23 +0100
Liam Healy wrote:
> The problem started for me when this upgrade happened:
> [UPGRADE] libc6 2.7-8 -> 2.7-9
> [UPGRADE] libc6-dev 2.7-8 -> 2.7-9
> and was not happening before.  Thus it is the changes in going to
> 2.7-9 from 2.7-8 that reveal this problem.

I came to the same conclusion. The only relevant change I could see 
between 2.7-8 and 2.7-9 is the switch from gcc-4.2 to gcc-4.3 to build 
glibc. I will try to confirm this is the culprit.

-- 
  .''`.  Aurelien Jarno	            | GPG: 1024D/F1BCDB73
 : :' :  Debian developer           | Electrical Engineer
 `. `'   aurel32@debian.org         | aurelien@aurel32.net
   `-    people.debian.org/~aurel32 | www.aurel32.net




Information forwarded to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#469058; Package libc6. Full text and rfc822 format available.

Acknowledgement sent to Aurelien Jarno <aurelien@aurel32.net>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>. Full text and rfc822 format available.

Message #35 received at 469058@bugs.debian.org (full text, mbox):

From: Aurelien Jarno <aurelien@aurel32.net>
To: 469058@bugs.debian.org
Cc: Liam Healy <lnp@healy.washington.dc.us>, Rupert Swarbrick <rswarbrick@googlemail.com>
Subject: Re: Bug#469058: libc6 version
Date: Mon, 03 Mar 2008 07:26:18 +0100
Aurelien Jarno wrote:
> Liam Healy wrote:
>> The problem started for me when this upgrade happened:
>> [UPGRADE] libc6 2.7-8 -> 2.7-9
>> [UPGRADE] libc6-dev 2.7-8 -> 2.7-9
>> and was not happening before.  Thus it is the changes in going to
>> 2.7-9 from 2.7-8 that reveal this problem.
> 
> I get to the same conclusion. The only relevant change I could see 
> between 2.7-8 and 2.7-9 is the switch to gcc-4.3 from gcc-4.2 to build 
> the glibc. I will try to confirm this is the culprit.
> 

I confirm that the switch from gcc-4.2 to gcc-4.3 causes the problem. The 
next step, understanding why, won't be that easy.

-- 
  .''`.  Aurelien Jarno	            | GPG: 1024D/F1BCDB73
 : :' :  Debian developer           | Electrical Engineer
 `. `'   aurel32@debian.org         | aurelien@aurel32.net
   `-    people.debian.org/~aurel32 | www.aurel32.net




Information forwarded to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#469058; Package libc6. Full text and rfc822 format available.

Acknowledgement sent to "Nikodemus Siivola" <nikodemus@random-state.net>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>. Full text and rfc822 format available.

Message #40 received at 469058@bugs.debian.org (full text, mbox):

From: "Nikodemus Siivola" <nikodemus@random-state.net>
To: 469058@bugs.debian.org, sbcl-devel <sbcl-devel@lists.sourceforge.net>
Subject: looking over GCC 4.3 release notes
Date: Tue, 4 Mar 2008 20:28:55 +0200
Looking at http://gcc.gnu.org/gcc-4.3/changes.html, this is the only
thing that really jumps out:

"GCC no longer places the cld instruction before string operations.
Both i386 and x86-64 ABI documents mandate the direction flag to be
clear at the entry of a function. It is now invalid to set the flag in
asm statement without reseting it afterward."

...but (1) SBCL _should_ be resetting the direction flag before any
calls to libc code, and (2) I would expect problems caused by this to
be more deterministic.

Cheers,

 -- Nikodemus
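
A rough sketch of what the quoted release note means in practice. It does
not touch the real CPU flag: `struct cpu_model`, `rep_stosb`,
`clear_gcc42_style` and `clear_gcc43_style` are hypothetical names, and
the loop only models how a `rep stos`-style fill steps through memory
depending on DF. The assumption is that a compiler following the old
behaviour emits cld first, while one following the new behaviour trusts
the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the x86 direction flag (DF); not real CPU state. */
struct cpu_model { int df; };

/* Model of `rep stosb`: store `value` `count` times beginning at
 * buf[start], stepping +1 when DF is clear and -1 when DF is set.
 * Stores that would fall outside buf are counted and skipped, so the
 * model itself never corrupts memory. Returns the stray-store count. */
size_t rep_stosb(const struct cpu_model *cpu, unsigned char *buf,
                 size_t len, size_t start, unsigned char value,
                 size_t count)
{
    assert(cpu != NULL && buf != NULL);
    size_t stray = 0;
    ptrdiff_t pos = (ptrdiff_t)start;
    for (size_t i = 0; i < count; i++) {
        if (pos >= 0 && (size_t)pos < len)
            buf[pos] = value;
        else
            stray++;
        pos += cpu->df ? -1 : 1;
    }
    return stray;
}

/* Old behaviour (assumed): clear DF defensively before the fill. */
size_t clear_gcc42_style(struct cpu_model *cpu, unsigned char *buf,
                         size_t len, size_t start, size_t count)
{
    cpu->df = 0; /* models the cld the compiler used to emit */
    return rep_stosb(cpu, buf, len, start, 0, count);
}

/* New behaviour (assumed): rely on the ABI promise that DF is clear. */
size_t clear_gcc43_style(struct cpu_model *cpu, unsigned char *buf,
                         size_t len, size_t start, size_t count)
{
    return rep_stosb(cpu, buf, len, start, 0, count);
}
```

Calling both variants with `cpu.df = 1` on the same buffer shows the
difference: the first zeroes the intended range, the second walks
downwards from `start` and leaves most of the intended range untouched.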




Information forwarded to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#469058; Package libc6. Full text and rfc822 format available.

Acknowledgement sent to Aurelien Jarno <aurelien@aurel32.net>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>. Full text and rfc822 format available.

Message #45 received at 469058@bugs.debian.org (full text, mbox):

From: Aurelien Jarno <aurelien@aurel32.net>
To: Nikodemus Siivola <nikodemus@random-state.net>, 469058@bugs.debian.org
Cc: sbcl-devel <sbcl-devel@lists.sourceforge.net>
Subject: Re: Bug#469058: looking over GCC 4.3 release notes
Date: Tue, 04 Mar 2008 20:21:42 +0100
Nikodemus Siivola wrote:
> Looking at http://gcc.gnu.org/gcc-4.3/changes.html, this is the only
> thing that really jumps out:
> 
> "GCC no longer places the cld instruction before string operations.
> Both i386 and x86-64 ABI documents mandate the direction flag to be
> clear at the entry of a function. It is now invalid to set the flag in
> asm statement without reseting it afterward."
> 
> ...but (1) SBCL _should_ be resetting the direction flag before any
> calls to libc code, and (2) I would expect problems caused by this to
> be more deterministic.
> 

On my side, I have made some progress. I have rebuilt a glibc with 
gcc-4.3 for all files except the signal/ directory, built with gcc-4.2 
instead. SBCL seems to work correctly in this case.


-- 
  .''`.  Aurelien Jarno	            | GPG: 1024D/F1BCDB73
 : :' :  Debian developer           | Electrical Engineer
 `. `'   aurel32@debian.org         | aurelien@aurel32.net
   `-    people.debian.org/~aurel32 | www.aurel32.net




Information forwarded to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#469058; Package libc6. Full text and rfc822 format available.

Acknowledgement sent to Aurelien Jarno <aurelien@aurel32.net>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>. Full text and rfc822 format available.

Message #50 received at 469058@bugs.debian.org (full text, mbox):

From: Aurelien Jarno <aurelien@aurel32.net>
To: Nikodemus Siivola <nikodemus@random-state.net>, 469058@bugs.debian.org
Cc: sbcl-devel <sbcl-devel@lists.sourceforge.net>, control@bugs.debian.org
Subject: Re: Bug#469058: looking over GCC 4.3 release notes
Date: Wed, 05 Mar 2008 03:12:52 +0100
reassign 469058 sbcl
retitle 469058 sbcl don't reset direction flag upon exit
thanks

Nikodemus Siivola wrote:
> Looking at http://gcc.gnu.org/gcc-4.3/changes.html, this is the only
> thing that really jumps out:
> 
> "GCC no longer places the cld instruction before string operations.
> Both i386 and x86-64 ABI documents mandate the direction flag to be
> clear at the entry of a function. It is now invalid to set the flag in
> asm statement without reseting it afterward."
> 
> ...but (1) SBCL _should_ be resetting the direction flag before any
> calls to libc code, and (2) I would expect problems caused by this to
> be more deterministic.

It actually doesn't reset it. As a result, the sigemptyset() function
from glibc does not work correctly.

I have identified the part of SBCL that is probably causing the problem,
and I am currently testing a fix.

Cheers,
Aurelien


-- 
  .''`.  Aurelien Jarno	            | GPG: 1024D/F1BCDB73
 : :' :  Debian developer           | Electrical Engineer
 `. `'   aurel32@debian.org         | aurelien@aurel32.net
   `-    people.debian.org/~aurel32 | www.aurel32.net




Bug reassigned from package `libc6' to `sbcl'. Request was from Aurelien Jarno <aurelien@aurel32.net> to control@bugs.debian.org. (Wed, 05 Mar 2008 02:15:10 GMT) Full text and rfc822 format available.

Changed Bug title to `sbcl don't reset direction flag upon exit' from `libc6: New version of libc6 hangs SBCL'. Request was from Aurelien Jarno <aurelien@aurel32.net> to control@bugs.debian.org. (Wed, 05 Mar 2008 02:15:12 GMT) Full text and rfc822 format available.

Changed Bug title to `sbcl doesn't reset direction flag upon exit' from `sbcl don't reset direction flag upon exit'. Request was from Aurelien Jarno <aurel32@debian.org> to control@bugs.debian.org. (Wed, 05 Mar 2008 02:21:03 GMT) Full text and rfc822 format available.

Information forwarded to debian-bugs-dist@lists.debian.org, Debian Common Lisp Team <pkg-common-lisp-devel@lists.alioth.debian.org>:
Bug#469058; Package sbcl. Full text and rfc822 format available.

Acknowledgement sent to Aurelien Jarno <aurelien@aurel32.net>:
Extra info received and forwarded to list. Copy sent to Debian Common Lisp Team <pkg-common-lisp-devel@lists.alioth.debian.org>. Full text and rfc822 format available.

Message #61 received at 469058@bugs.debian.org (full text, mbox):

From: Aurelien Jarno <aurelien@aurel32.net>
To: Nikodemus Siivola <nikodemus@random-state.net>, 469058@bugs.debian.org
Cc: sbcl-devel <sbcl-devel@lists.sourceforge.net>, control@bugs.debian.org
Subject: Re: Bug#469058: looking over GCC 4.3 release notes
Date: Wed, 5 Mar 2008 10:22:18 +0100
tag 469058 + patch
thanks

On Wed, Mar 05, 2008 at 03:12:52AM +0100, Aurelien Jarno wrote:
> reassign 469058 sbcl
> retitle 469058 sbcl don't reset direction flag upon exit
> thanks
> 
> Nikodemus Siivola wrote:
> > Looking at http://gcc.gnu.org/gcc-4.3/changes.html, this is the only
> > thing that really jumps out:
> > 
> > "GCC no longer places the cld instruction before string operations.
> > Both i386 and x86-64 ABI documents mandate the direction flag to be
> > clear at the entry of a function. It is now invalid to set the flag in
> > asm statement without reseting it afterward."
> > 
> > ...but (1) SBCL _should_ be resetting the direction flag before any
> > calls to libc code, and (2) I would expect problems caused by this to
> > be more deterministic.
> 
> It actually doesn't reset it. The problem causes the sigemptyset()
> function from the glibc to not work correctly.
> 
> I have identified the potential part from SBCL causing the problem, I am
> currently testing a fix.
> 

Please find below the patch to fix the problem. I only tested on amd64,
but it should work the same way on i386.

--- sbcl-1.0.14.0.orig/src/compiler/x86/call.lisp
+++ sbcl-1.0.14.0/src/compiler/x86/call.lisp
@@ -364,7 +364,8 @@
       ;; Restore EDI, and reset the stack.
       (emit-label restore-edi)
       (loadw edi-tn ebx-tn (frame-word-offset 1))
-      (inst mov esp-tn ebx-tn))))
+      (inst mov esp-tn ebx-tn)
+      (inst cld))))
   (values))
 
 ;;;; unknown values receiving
@@ -1376,7 +1377,8 @@
        (inst sub ecx 1)
        (inst jmp :nz loop)
        ;; NIL out the last cons.
-       (storew nil-value dst 1 list-pointer-lowtag))
+       (storew nil-value dst 1 list-pointer-lowtag)
+       (inst cld))
       (emit-label done))))
 
 ;;; Return the location and size of the &MORE arg glob created by
--- sbcl-1.0.14.0.orig/src/compiler/x86/values.lisp
+++ sbcl-1.0.14.0/src/compiler/x86/values.lisp
@@ -38,6 +38,7 @@
     (inst movs :dword)
     (inst cmp esp-tn esi)
     (inst jmp :be loop)
+    (inst cld)
     DONE
     (inst lea esp-tn (make-ea :dword :base edi :disp n-word-bytes))
     (inst sub edi esi)
--- sbcl-1.0.14.0.orig/src/compiler/x86/nlx.lisp
+++ sbcl-1.0.14.0/src/compiler/x86/nlx.lisp
@@ -237,6 +237,7 @@
     (inst std)
     (inst rep)
     (inst movs :dword)
+    (inst cld)
 
     DONE
     ;; Reset the CSP at last moved arg.
--- sbcl-1.0.14.0.orig/src/compiler/x86-64/call.lisp
+++ sbcl-1.0.14.0/src/compiler/x86-64/call.lisp
@@ -356,7 +356,8 @@
       ;; Restore EDI, and reset the stack.
       (emit-label restore-edi)
       (loadw rdi-tn rbx-tn (- (1+ 1)))
-      (inst mov rsp-tn rbx-tn))))
+      (inst mov rsp-tn rbx-tn)
+      (inst cld))))
   (values))
 
 ;;;; unknown values receiving
@@ -1320,7 +1321,8 @@
        (inst sub rcx 1)
        (inst jmp :nz loop)
        ;; NIL out the last cons.
-       (storew nil-value dst 1 list-pointer-lowtag))
+       (storew nil-value dst 1 list-pointer-lowtag)
+       (inst cld))
       (emit-label done))))
 
 ;;; Return the location and size of the &MORE arg glob created by
--- sbcl-1.0.14.0.orig/src/compiler/x86-64/values.lisp
+++ sbcl-1.0.14.0/src/compiler/x86-64/values.lisp
@@ -38,6 +38,7 @@
     (inst movs :qword)
     (inst cmp rsp-tn rsi)
     (inst jmp :be LOOP)
+    (inst cld)
     DONE
     (inst lea rsp-tn (make-ea :qword :base rdi :disp n-word-bytes))
     (inst sub rdi rsi)
--- sbcl-1.0.14.0.orig/src/compiler/x86-64/nlx.lisp
+++ sbcl-1.0.14.0/src/compiler/x86-64/nlx.lisp
@@ -212,6 +212,7 @@
     (inst std)
     (inst rep)
     (inst movs :qword)
+    (inst cld)
 
     DONE
     ;; Reset the CSP at last moved arg.
--- sbcl-1.0.14.0.orig/src/assembly/x86/assem-rtns.lisp
+++ sbcl-1.0.14.0/src/assembly/x86/assem-rtns.lisp
@@ -54,6 +54,7 @@
   (inst lea edi (make-ea :dword :base ebx :disp (- n-word-bytes)))
   (inst rep)
   (inst movs :dword)
+  (inst cld)                            ; restore direction bit
 
   ;; solaris requires DF being zero.
   #!+sunos (inst cld)
@@ -153,6 +154,7 @@
   (inst sub esi (fixnumize 1))
   (inst rep)
   (inst movs :dword)
+  (inst cld)                            ; restore direction bit
 
   ;; solaris requires DF being zero.
   #!+sunos (inst cld)
--- sbcl-1.0.14.0.orig/src/assembly/x86-64/assem-rtns.lisp
+++ sbcl-1.0.14.0/src/assembly/x86-64/assem-rtns.lisp
@@ -54,6 +54,7 @@
   (inst lea edi (make-ea :qword :base ebx :disp (- n-word-bytes)))
   (inst rep)
   (inst movs :qword)
+  (inst cld)                            ; restore direction bit
 
   ;; Restore the count.
   (inst mov ecx edx)
@@ -150,6 +151,7 @@
   (inst sub esi (fixnumize 1))
   (inst rep)
   (inst movs :qword)
+  (inst cld)                            ; restore direction bit
 
   ;; Load the register arguments carefully.
   (loadw edx rbp-tn -1)

-- 
  .''`.  Aurelien Jarno	            | GPG: 1024D/F1BCDB73
 : :' :  Debian developer           | Electrical Engineer
 `. `'   aurel32@debian.org         | aurelien@aurel32.net
   `-    people.debian.org/~aurel32 | www.aurel32.net
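
The pattern the patch enforces (every std followed by a cld once the
string operation is done) can be sketched in C. This is only a model
under stated assumptions: `struct cpu_model`, `rep_movs` and
`move_up_overlapping` are hypothetical names, DF is a plain struct field
rather than the real flag, and the overlapping upward move is just one
common reason for copying with DF set, not necessarily what each patched
SBCL routine does.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the direction flag: 0 = ascending, 1 = descending. */
struct cpu_model { int df; };

/* Model of `rep movs` on word-sized elements: copy `count` elements
 * from mem[src] to mem[dst], moving both indices by +1 or -1 per DF.
 * The caller is responsible for keeping all accesses inside mem. */
void rep_movs(const struct cpu_model *cpu, long *mem,
              size_t dst, size_t src, size_t count)
{
    assert(cpu != NULL && mem != NULL);
    for (size_t i = 0; i < count; i++) {
        mem[dst] = mem[src];
        if (cpu->df) { dst--; src--; } else { dst++; src++; }
    }
}

/* Shift `count` elements up from index `src` to the overlapping index
 * `dst` (dst >= src). Copying from the last element downwards keeps
 * the not-yet-copied source elements intact. */
void move_up_overlapping(struct cpu_model *cpu, long *mem,
                         size_t dst, size_t src, size_t count)
{
    assert(dst >= src);
    if (count == 0)
        return;
    cpu->df = 1; /* std */
    rep_movs(cpu, mem, dst + count - 1, src + count - 1, count);
    cpu->df = 0; /* cld: the step the patch adds after each copy */
}
```

With this discipline the window in which DF is set shrinks to the copy
itself, which makes it easier to audit but does not close it entirely.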




Tags added: patch Request was from Aurelien Jarno <aurelien@aurel32.net> to control@bugs.debian.org. (Wed, 05 Mar 2008 09:26:37 GMT) Full text and rfc822 format available.

Information forwarded to debian-bugs-dist@lists.debian.org, Debian Common Lisp Team <pkg-common-lisp-devel@lists.alioth.debian.org>:
Bug#469058; Package sbcl. Full text and rfc822 format available.

Acknowledgement sent to "Nikodemus Siivola" <nikodemus@random-state.net>:
Extra info received and forwarded to list. Copy sent to Debian Common Lisp Team <pkg-common-lisp-devel@lists.alioth.debian.org>. Full text and rfc822 format available.

Message #68 received at 469058@bugs.debian.org (full text, mbox):

From: "Nikodemus Siivola" <nikodemus@random-state.net>
To: 469058@bugs.debian.org, sbcl-devel <sbcl-devel@lists.sourceforge.net>
Cc: "Aurelien Jarno" <aurelien@aurel32.net>, "Debian Common Lisp Team" <pkg-common-lisp-devel@lists.alioth.debian.org>
Subject: DF and signal handlers
Date: Wed, 5 Mar 2008 13:39:08 +0200
On 3/5/08, Debian Bug Tracking System <owner@bugs.debian.org> wrote:

> tag 469058 + patch
>  Bug#469058: sbcl doesn't reset direction flag upon exit
>  There were no tags set.
>  Tags added: patch

Thanks for the patch, but... while I agree that it is good to change
SBCL to reset the direction flag every time it is diddled, instead of
just before calling C, I don't think SBCL is actually at fault here.

 1. SBCL does actually reset DF before any call to foreign (GCC generated) code.
    See line 236 in src/compiler/x86/c-call.lisp, and line 125 in
    src/runtime/x86-assem.S.

    (It is possible I'm missing out a call-path here, but even so, read on and
    see if my fears are unfounded or not.)

 2. If the problem was due to a foreign call, it should be deterministic.

 3. If the problem was due to _returning_ to main(), it should be deterministic.

What I suspect is actually going on (especially considering your
statement that compiling signals/ with 4.2 avoided the issue) is that
a signal handler is entered while DF is set.

If this is the case, then clearing it right after each REP loop where
SBCL uses it just makes seeing the bug much more unlikely -- but not
impossible in the presence of async signals.

If so, this may also explain some _very_ hard to reproduce faults we
have seen over the years: using a pre 4.3-GCC compiled libc, a signal
at an inopportune moment in the middle of a REP loop could clear DF!
Yikes!

I'm not sure what is The Right Thing here, though. Should SBCL (and
_any_ program that ever sets DF!) save, clear, and restore DF in its
signal handlers? Should libc/kernel do that? Should signals be blocked
before ever setting DF? Is setting DF Just A Bad Idea?

Cheers,

 -- Nikodemus
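
One of the options floated above, blocking signals around any code that
sets DF, could look roughly like the following in C. This is a sketch
using the standard POSIX sigprocmask() call, not a proposal for what
SBCL does; `run_with_signals_blocked`, `count_signal` and `raise_inside`
are hypothetical names, and the "critical section" here merely raises a
signal at itself to show that delivery is deferred.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Number of times the demo handler has run. */
volatile sig_atomic_t handler_runs = 0;

/* Demo handler: only counts invocations. */
void count_signal(int signo)
{
    (void)signo;
    handler_runs++;
}

/* Run critical(arg) with every blockable signal blocked, then restore
 * the previous mask. Returns 0 on success, -1 on failure. */
int run_with_signals_blocked(void (*critical)(void *), void *arg)
{
    sigset_t all, saved;

    assert(critical != NULL);
    if (sigfillset(&all) != 0)
        return -1;
    if (sigprocmask(SIG_BLOCK, &all, &saved) != 0)
        return -1;
    critical(arg);                 /* e.g. code that runs with DF set */
    return sigprocmask(SIG_SETMASK, &saved, NULL);
}

/* Hypothetical critical section: sends SIGUSR1 to itself and records
 * how many handler runs it has observed by the time it finishes. */
void raise_inside(void *arg)
{
    raise(SIGUSR1);
    *(int *)arg = (int)handler_runs;
}
```

After installing `count_signal` for SIGUSR1, calling
`run_with_signals_blocked(raise_inside, &seen)` leaves `seen` at 0 and
runs the handler only once the mask is restored. The price is two system
calls around every such section, which is presumably why this is raised
as a question rather than an obvious fix.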




Information forwarded to debian-bugs-dist@lists.debian.org, Debian Common Lisp Team <pkg-common-lisp-devel@lists.alioth.debian.org>:
Bug#469058; Package sbcl. Full text and rfc822 format available.

Acknowledgement sent to Aurelien Jarno <aurelien@aurel32.net>:
Extra info received and forwarded to list. Copy sent to Debian Common Lisp Team <pkg-common-lisp-devel@lists.alioth.debian.org>. Full text and rfc822 format available.

Message #73 received at 469058@bugs.debian.org (full text, mbox):

From: Aurelien Jarno <aurelien@aurel32.net>
To: Nikodemus Siivola <nikodemus@random-state.net>
Cc: 469058@bugs.debian.org, sbcl-devel <sbcl-devel@lists.sourceforge.net>, Debian Common Lisp Team <pkg-common-lisp-devel@lists.alioth.debian.org>
Subject: Re: DF and signal handlers
Date: Wed, 05 Mar 2008 14:33:08 +0100
Nikodemus Siivola wrote:
> On 3/5/08, Debian Bug Tracking System <owner@bugs.debian.org> wrote:
> 
>> tag 469058 + patch
>>  Bug#469058: sbcl doesn't reset direction flag upon exit
>>  There were no tags set.
>>  Tags added: patch
> 
> Thanks for the patch, but... while I agree that it is good to change
> SBCL to reset the direction flag every time it is diddled, instead of
> just before calling C, I don't think SBCL is actually at fault here.
> 
>  1. SBCL does actually reset DF before any call to foreign (GCC generated) code.
>     See line 236 in src/compiler/x86/c-call.lisp, and line 125 in
>     src/runtime/x86-assem.S.
> 
>     (It is possible I'm missing out a call-path here, but even so, read on and
>     see if my fears are unfounded or not.)
> 
>  2. If the problem was due to a foreign call, it should be deterministic.
> 
>  3. If the problem was due to _returning_ to main(), it should be deterministic.

Looks correct.

> What I suspect is actually going on (especially considering your
> statement that compiling signals/ with 4.2 avoided the issue) is that
> a signal handler is entered while DF is set.

What I am sure of is that sigemptyset() from glibc is called with the
direction flag set, and that should not happen.

> If this is the case, then clearing it right after each REP loop where
> SBCL uses it just makes seeing the bug much more unlikely -- but not
> impossible in the presence of async signals.

Seems correct, though I have made half a dozen builds here without
any problem.

> If so, this may also explain some _very_ hard to reproduce faults we
> have seen over the years: using a pre 4.3-GCC compiled libc, a signal
> at an inopportune moment in the middle of a REP loop could clear DF!
> Yikes!
> 
> I'm not sure what is The Right Thing here, though. Should SBCL (and
> _any_ program that ever sets DF!) save, clear, and restore DF in its
> signal handlers? Should libc/kernel do that? Should signals be blocked

I currently have no idea about that.

> before ever setting DF? Is setting DF Just A Bad Idea?

I don't think so. Not setting DF means less optimized code has to be used.

-- 
  .''`.  Aurelien Jarno	            | GPG: 1024D/F1BCDB73
 : :' :  Debian developer           | Electrical Engineer
 `. `'   aurel32@debian.org         | aurelien@aurel32.net
   `-    people.debian.org/~aurel32 | www.aurel32.net
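
For the "save, clear, and restore DF" option raised in the quoted
questions, here is a toy model of what whoever delivers the signal
(handler prologue, libc, or kernel) would have to do. Nothing here reads
or writes the real flag: `struct cpu_model`, `deliver_signal` and
`record_df` are hypothetical names, and the model says nothing about
what any particular kernel actually does.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the direction flag; not real CPU state. */
struct cpu_model { int df; };

typedef void (*handler_fn)(struct cpu_model *cpu, void *arg);

/* Model of ABI-respecting signal delivery: remember the interrupted
 * code's DF, enter the handler with DF clear, and put the saved value
 * back when the handler returns. */
void deliver_signal(struct cpu_model *cpu, handler_fn handler, void *arg)
{
    assert(cpu != NULL && handler != NULL);
    int saved_df = cpu->df; /* save: part of the interrupted context */
    cpu->df = 0;            /* clear: handler sees DF as the ABI promises */
    handler(cpu, arg);
    cpu->df = saved_df;     /* restore: interrupted REP loop continues */
}

/* Demo handler: records the DF value it was entered with. */
void record_df(struct cpu_model *cpu, void *arg)
{
    *(int *)arg = cpu->df;
}
```

If the interrupted code had `df = 1`, `record_df` observes 0 and the
interrupted code still sees 1 afterwards, so neither side is surprised.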




Information forwarded to debian-bugs-dist@lists.debian.org, Debian Common Lisp Team <pkg-common-lisp-devel@lists.alioth.debian.org>:
Bug#469058; Package sbcl. Full text and rfc822 format available.

Acknowledgement sent to "Nikodemus Siivola" <nikodemus@random-state.net>:
Extra info received and forwarded to list. Copy sent to Debian Common Lisp Team <pkg-common-lisp-devel@lists.alioth.debian.org>. Full text and rfc822 format available.

Message #78 received at 469058@bugs.debian.org (full text, mbox):

From: "Nikodemus Siivola" <nikodemus@random-state.net>
To: "Aurelien Jarno" <aurelien@aurel32.net>
Cc: 469058@bugs.debian.org, sbcl-devel <sbcl-devel@lists.sourceforge.net>, "Debian Common Lisp Team" <pkg-common-lisp-devel@lists.alioth.debian.org>
Subject: Re: DF and signal handlers
Date: Wed, 5 Mar 2008 15:48:05 +0200
On 3/5/08, Aurelien Jarno <aurelien@aurel32.net> wrote:
> Nikodemus Siivola wrote:
>
> > On 3/5/08, Debian Bug Tracking System <owner@bugs.debian.org> wrote:
>  >
>  >> tag 469058 + patch
>  >>  Bug#469058: sbcl doesn't reset direction flag upon exit
>  >>  There were no tags set.
>  >>  Tags added: patch
>  >
>  > Thanks for the patch, but... while I agree that it is good to change
>  > SBCL to reset the direction flag every time it is diddled, instead of
>  > just before calling C, I don't think SBCL is actually at fault here.
>  >
>  >  1. SBCL does actually reset DF before any call to foreign (GCC generated) code.
>  >     See line 236 in src/compiler/x86/c-call.lisp, and line 125 in
>  >     src/runtime/x86-assem.S.
>  >
>  >     (It is possible I'm missing out a call-path here, but even so, read on and
>  >     see if my fears are unfounded or not.)
>  >
>  >  2. If the problem was due to a foreign call, it should be deterministic.
>  >
>  >  3. If the problem was due to _returning_ to main(), it should be deterministic.
>
>
> Looks correct.
>
>
>  > What I suspect is actually going on (especially considering your
>  > statement that compiling signals/ with 4.2 avoided the issue) is that
>  > a signal handler is entered while DF is set.
>
>
> What I am sure of is that sigemptyset() from glibc is called with the
> direction flag set, and that should not happen.

Right.

I'm about to merge a patch to SBCL based on yours, which moves all DF
resets to the immediate vicinity of STDs for easier auditing, and removes
the then-unnecessary CLD instructions from foreign call sequences.
This will fix the symptoms, and be good for SBCL, but I think the
underlying problem is still there in signal handling. :/
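
A minimal sketch of the "reset DF right next to the STD" pattern, written in C with GCC inline assembly rather than SBCL's own assembler. The function name copy_backward is hypothetical and this is not the SBCL patch itself; it only illustrates keeping std and cld inside the same instruction sequence, so that no compiler-generated code runs with DF = 1. A signal can still arrive between the std and the cld, which is the remaining problem discussed in this thread.

```c
#include <stddef.h>
#include <string.h>

/* Copy n bytes from src to dst, walking from the highest address down.
 * DF is set only around the string instruction and cleared right after it. */
void copy_backward(void *dst, const void *src, size_t n)
{
    if (n == 0)
        return;
#if defined(__x86_64__) || defined(__i386__)
    /* Start at the last byte of each buffer, since DF = 1 decrements. */
    unsigned char *d = (unsigned char *)dst + n - 1;
    const unsigned char *s = (const unsigned char *)src + n - 1;

    __asm__ volatile("std\n\t"        /* DF = 1: string ops move downwards */
                     "rep movsb\n\t"  /* copy n bytes, highest address first */
                     "cld"            /* DF = 0 again before any other code */
                     : "+D" (d), "+S" (s), "+c" (n)
                     :
                     : "memory", "cc");
#else
    /* Portable fallback for other architectures (assumption: same result). */
    memmove(dst, src, n);
#endif
}
```

Copying downwards like this is what makes an overlapping move towards higher addresses safe, which is one reason code sets DF in the first place.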

>  > If this is the case, then clearing it right after each REP loop where
>  > SBCL uses it just makes seeing the bug much more unlikely -- but not
>  > impossible in the presence of async signals.
>
>
> Seems correct, though I have done half a dozen builds here without
> any problem.

That is not too surprising: there are normally no async signals
delivered during the build, but SIGSEGV is a regular occurrence (it is
used by the GC), so SIGSEGV handlers may have been seeing DF set.

What _is_ strange is that this appears to have been random. (At least
all the reporters seemed to characterize it as semirandom behaviour.)
Multiple builds from the same source with the same host compiler
should have essentially identical GC characteristics.

>  > If so, this may also explain some _very_ hard to reproduce faults we
>  > have seen over the years: using a pre 4.3-GCC compiled libc, a signal
>  > at an inopportune moment in the middle of a REP loop could clear DF!
>  > Yikes!
>  >
>  > I'm not sure what is The Right Thing here, though. Should SBCL (and
>  > _any_ program that ever sets DF!) save, clear, and restore DF in its
>  > signal handlers? Should libc/kernel do that? Should signals be blocked
>
>
> I currently have no idea about that.

I'll see if I can cook up a small test-case using async signals. (One
that doesn't need SBCL so that it can be passed to upstream libc /
kernel people if necessary without too much friction.)

Cheers,

 -- Nikodemus

Information forwarded to debian-bugs-dist@lists.debian.org, Debian Common Lisp Team <pkg-common-lisp-devel@lists.alioth.debian.org>:
Bug#469058; Package sbcl. Full text and rfc822 format available.

Acknowledgement sent to Aurelien Jarno <aurelien@aurel32.net>:
Extra info received and forwarded to list. Copy sent to Debian Common Lisp Team <pkg-common-lisp-devel@lists.alioth.debian.org>. Full text and rfc822 format available.

Message #83 received at 469058@bugs.debian.org (full text, mbox):

From: Aurelien Jarno <aurelien@aurel32.net>
To: Nikodemus Siivola <nikodemus@random-state.net>
Cc: 469058@bugs.debian.org, sbcl-devel <sbcl-devel@lists.sourceforge.net>, Debian Common Lisp Team <pkg-common-lisp-devel@lists.alioth.debian.org>
Subject: Re: DF and signal handlers
Date: Wed, 05 Mar 2008 14:55:38 +0100
Nikodemus Siivola wrote:
> On 3/5/08, Aurelien Jarno <aurelien@aurel32.net> wrote:
>> Nikodemus Siivola wrote:
>>
>>> On 3/5/08, Debian Bug Tracking System <owner@bugs.debian.org> wrote:
>>  >
>>  >> tag 469058 + patch
>>  >>  Bug#469058: sbcl doesn't reset direction flag upon exit
>>  >>  There were no tags set.
>>  >>  Tags added: patch
>>  >
>>  > Thanks for the patch, but... while I agree that it is good to change
>>  > SBCL to reset the direction flag every time it is diddled, instead of
>>  > just before calling C, I don't think SBCL is actually at fault here.
>>  >
>>  >  1. SBCL does actually reset DF before any call to foreign (GCC generated) code.
>>  >     See line 236 in src/compiler/x86/c-call.lisp, and line 125 in
>>  >     src/runtime/x86-assem.S.
>>  >
>>  >     (It is possible I'm missing out a call-path here, but even so, read on and
>>  >     see if my fears are unfounded or not.)
>>  >
>>  >  2. If the problem was due to a foreign call, it should be deterministic.
>>  >
>>  >  3. If the problem was due to _returning_ to main(), it should be deterministic.
>>
>>
>> Looks correct.
>>
>>
>>  > What I suspect is actually going on (especially considering your
>>  > statement that compiling signals/ with 4.2 avoided the issue) is that
>>  > a signal handler is entered while DF is set.
>>
>>
>> What I am sure of is that sigemptyset() from glibc is called with the
>> direction flag set, and that should not happen.
> 
> Right.
> 
> I'm about to merge a patch to SBCL based on yours, which moves all DF
> resets to the immediate vicinity of STDs for easier auditing, and removes
> the then-unnecessary CLD instructions from foreign call sequences.
> This will fix the symptoms, and be good for SBCL, but I think the
> underlying problem is still there in signal handling. :/
> 
>>  > If this is the case, then clearing it right after each REP loop where
>>  > SBCL uses it just makes seeing the bug much more unlikely -- but not
>>  > impossible in the presence of async signals.
>>
>>
>> Seems correct, though I have done half a dozen builds here without
>> any problem.
> 
> That is not too surprising: there are normally no async signals
> delivered during the build, but SIGSEGV is a regular occurrence (it is
> used by the GC), so SIGSEGV handlers may have been seeing DF set.
> 
> What _is_ strange is that this appears to have been random. (At least
> all the reporters seemed to characterize it as semirandom behaviour.)
> Multiple builds from the same source with the same host compiler
> should have essentially identical GC characteristics.

Well, it may depend on the kernel. On one machine, it was hanging
randomly. On another machine, I get an error from GC at the very
beginning of the build.

>>  > If so, this may also explain some _very_ hard to reproduce faults we
>>  > have seen over the years: using a pre 4.3-GCC compiled libc, a signal
>>  > at an inopportune moment in the middle of a REP loop could clear DF!
>>  > Yikes!
>>  >
>>  > I'm not sure what is The Right Thing here, though. Should SBCL (and
>>  > _any_ program that ever sets DF!) save, clear, and restore DF in its
>>  > signal handlers? Should libc/kernel do that? Should signals be blocked
>>
>>
>> I currently have no idea about that.
> 
> I'll see if I can cook up a small test-case using async signals. (One
> that doesn't need SBCL so that it can be passed to upstream libc /
> kernel people if necessary without too much friction.)
> 

A GCC developer says it's the job of the kernel. I doubt glibc can do
anything here, since it is the kernel that calls the signal handler.

-- 
  .''`.  Aurelien Jarno	            | GPG: 1024D/F1BCDB73
 : :' :  Debian developer           | Electrical Engineer
 `. `'   aurel32@debian.org         | aurelien@aurel32.net
   `-    people.debian.org/~aurel32 | www.aurel32.net




Information forwarded to debian-bugs-dist@lists.debian.org, Debian Common Lisp Team <pkg-common-lisp-devel@lists.alioth.debian.org>:
Bug#469058; Package sbcl. Full text and rfc822 format available.

Acknowledgement sent to Aurelien Jarno <aurelien@aurel32.net>:
Extra info received and forwarded to list. Copy sent to Debian Common Lisp Team <pkg-common-lisp-devel@lists.alioth.debian.org>. Full text and rfc822 format available.

Message #88 received at 469058@bugs.debian.org (full text, mbox):

From: Aurelien Jarno <aurelien@aurel32.net>
To: Nikodemus Siivola <nikodemus@random-state.net>
Cc: 469058@bugs.debian.org, sbcl-devel <sbcl-devel@lists.sourceforge.net>, Debian Common Lisp Team <pkg-common-lisp-devel@lists.alioth.debian.org>
Subject: Re: DF and signal handlers
Date: Wed, 05 Mar 2008 15:07:02 +0100
Nikodemus Siivola wrote:
> On 3/5/08, Debian Bug Tracking System <owner@bugs.debian.org> wrote:
> 
>> tag 469058 + patch
>>  Bug#469058: sbcl doesn't reset direction flag upon exit
>>  There were no tags set.
>>  Tags added: patch
> 
> Thanks for the patch, but... while I agree that it is good to change
> SBCL to reset the direction flag every time it is diddled, instead of
> just before calling C, I don't think SBCL is actually at fault here.
> 
>  1. SBCL does actually reset DF before any call to foreign (GCC generated) code.
>     See line 236 in src/compiler/x86/c-call.lisp, and line 125 in
>     src/runtime/x86-assem.S.
> 
>     (It is possible I'm missing out a call-path here, but even so, read on and
>     see if my fears are unfounded or not.)
> 
>  2. If the problem was due to a foreign call, it should be deterministic.
> 
>  3. If the problem was due to _returning_ to main(), it should be deterministic.
> 
> What I suspect is actually going on (especially considering your
> statement that compiling signals/ with 4.2 avoided the issue) is that
> a signal handler is entered while DF is set.
> 
> If this is the case, then clearing it right after each REP loop where
> SBCL uses it just makes seeing the bug much more unlikely -- but not
> impossible in the presence of async signals.
> 
> If so, this may also explain some _very_ hard to reproduce faults we
> have seen over the years: using a pre 4.3-GCC compiled libc, a signal
> at an inopportune moment in the middle of a REP loop could clear DF!
> Yikes!

I doubt this is related, as the flags register is saved by gcc upon
entry to the signal handler and restored upon exit.

-- 
  .''`.  Aurelien Jarno	            | GPG: 1024D/F1BCDB73
 : :' :  Debian developer           | Electrical Engineer
 `. `'   aurel32@debian.org         | aurelien@aurel32.net
   `-    people.debian.org/~aurel32 | www.aurel32.net




Information forwarded to debian-bugs-dist@lists.debian.org, Debian Common Lisp Team <pkg-common-lisp-devel@lists.alioth.debian.org>:
Bug#469058; Package sbcl. Full text and rfc822 format available.

Acknowledgement sent to Aurelien Jarno <aurelien@aurel32.net>:
Extra info received and forwarded to list. Copy sent to Debian Common Lisp Team <pkg-common-lisp-devel@lists.alioth.debian.org>. Full text and rfc822 format available.

Message #93 received at 469058@bugs.debian.org (full text, mbox):

From: Aurelien Jarno <aurelien@aurel32.net>
To: Nikodemus Siivola <nikodemus@random-state.net>
Cc: 469058@bugs.debian.org, sbcl-devel <sbcl-devel@lists.sourceforge.net>, Debian Common Lisp Team <pkg-common-lisp-devel@lists.alioth.debian.org>
Subject: Re: DF and signal handlers
Date: Wed, 5 Mar 2008 16:10:53 +0100
On Wed, Mar 05, 2008 at 03:48:05PM +0200, Nikodemus Siivola wrote:
> I'll see if I can cook up a small test-case using async signals. (One
> that doesn't need SBCL so that it can be passed to upstream libc /
> kernel people if necessary without too much friction.)
> 

The small program below exhibits the problem. The problem already
existed with gcc-4.2, but it was hidden because gcc-4.2 generates a cld
or std instruction before any instruction that uses the direction flag.


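#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <signal.h>

/* Report the state of the direction flag (bit 10 of RFLAGS) on entry. */
void handler(int sig) {
	uint64_t rflags;

	(void)sig;
	/* x86-64 only: copy RFLAGS into a register via the stack. */
	asm volatile("pushfq ; popq %0" : "=r" (rflags));

	/* printf() is not async-signal-safe; tolerated in this small test. */
	if (rflags & (UINT64_C(1) << 10))
		printf("DF = 1\n");
	else
		printf("DF = 0\n");
}

int main(void) {
	signal(SIGUSR1, handler);

	/* Keep DF set forever; send SIGUSR1 (kill -USR1 <pid>) to run the handler. */
	while (1)
	{
		asm volatile("std");
	}
}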


-- 
  .''`.  Aurelien Jarno	            | GPG: 1024D/F1BCDB73
 : :' :  Debian developer           | Electrical Engineer
 `. `'   aurel32@debian.org         | aurelien@aurel32.net
   `-    people.debian.org/~aurel32 | www.aurel32.net




Information forwarded to debian-bugs-dist@lists.debian.org, Debian Common Lisp Team <pkg-common-lisp-devel@lists.alioth.debian.org>:
Bug#469058; Package sbcl. Full text and rfc822 format available.

Acknowledgement sent to Aurelien Jarno <aurelien@aurel32.net>:
Extra info received and forwarded to list. Copy sent to Debian Common Lisp Team <pkg-common-lisp-devel@lists.alioth.debian.org>. Full text and rfc822 format available.

Message #98 received at 469058@bugs.debian.org (full text, mbox):

From: Aurelien Jarno <aurelien@aurel32.net>
To: Nikodemus Siivola <nikodemus@random-state.net>
Cc: 469058@bugs.debian.org, sbcl-devel <sbcl-devel@lists.sourceforge.net>, Debian Common Lisp Team <pkg-common-lisp-devel@lists.alioth.debian.org>, debian-kernel@lists.debian.org, debian-gcc@lists.debian.org
Subject: Re: DF and signal handlers
Date: Wed, 5 Mar 2008 16:49:21 +0100
reassign 469058 linux-2.6,gcc-4.3
thanks

That's definitely a kernel/gcc-4.3 problem; I have reported it
upstream: http://lkml.org/lkml/2008/3/5/207

I am therefore reassigning the bug to those packages.

-- 
  .''`.  Aurelien Jarno	            | GPG: 1024D/F1BCDB73
 : :' :  Debian developer           | Electrical Engineer
 `. `'   aurel32@debian.org         | aurelien@aurel32.net
   `-    people.debian.org/~aurel32 | www.aurel32.net




Bug reassigned from package `sbcl' to `linux-2.6,gcc-4.3'. Request was from Aurelien Jarno <aurelien@aurel32.net> to control@bugs.debian.org. (Wed, 05 Mar 2008 15:51:13 GMT) Full text and rfc822 format available.

Tags removed: patch Request was from Aurelien Jarno <aurelien@aurel32.net> to control@bugs.debian.org. (Wed, 05 Mar 2008 16:24:03 GMT) Full text and rfc822 format available.

Tags removed: patch Request was from Aurelien Jarno <aurel32@debian.org> to control@bugs.debian.org. (Wed, 05 Mar 2008 16:24:05 GMT) Full text and rfc822 format available.

Noted your statement that Bug has been forwarded to http://lkml.org/lkml/2008/3/5/207. Request was from Aurelien Jarno <aurel32@debian.org> to control@bugs.debian.org. (Wed, 05 Mar 2008 16:33:05 GMT) Full text and rfc822 format available.

Changed Bug title to `Linux doesn't follow x86/x86-64 ABI wrt direction flag' from `sbcl doesn't reset direction flag upon exit'. Request was from Aurelien Jarno <aurel32@debian.org> to control@bugs.debian.org. (Wed, 05 Mar 2008 16:36:02 GMT) Full text and rfc822 format available.

Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>, Debian GCC Maintainers <debian-gcc@lists.debian.org>:
Bug#469058; Package linux-2.6,gcc-4.3. Full text and rfc822 format available.

Acknowledgement sent to Aurelien Jarno <aurelien@aurel32.net>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>, Debian GCC Maintainers <debian-gcc@lists.debian.org>. Full text and rfc822 format available.

Message #113 received at 469058@bugs.debian.org (full text, mbox):

From: Aurelien Jarno <aurelien@aurel32.net>
To: debian-kernel@lists.debian.org
Cc: 469058@bugs.debian.org, sbcl-devel <sbcl-devel@lists.sourceforge.net>, Debian Common Lisp Team <pkg-common-lisp-devel@lists.alioth.debian.org>, Nikodemus Siivola <nikodemus@random-state.net>, debian-gcc@lists.debian.org, debian-bsd@lists.debian.org, debian-hurd@lists.debian.org, debian-glibc@lists.debian.org
Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag
Date: Wed, 5 Mar 2008 22:53:28 +0100
reassign 469058 linux-2.6
submitter 469058 aurel32@debian.org
clone 469058 -1 -2 -3 -4 -5
reassign -1 kfreebsd-6
retitle -1 FreeBSD kernel doesn't follow x86/x86-64 ABI wrt direction flag
reassign -2 kfreebsd-7
retitle -2 FreeBSD kernel doesn't follow x86/x86-64 ABI wrt direction flag
reassign -3 hurd
retitle -3 Hurd crashes when a signal handler is called with DF = 1
reassign -4 gcc-4.3
retitle -4 gcc-4.3: old behavior wrt cld/std should be restored
reassign -5 glibc
retitle -5 libc6 should build-depend on a fixed gcc 4.3 wrt cld/std
submitter -5 nikodemus@random-state.net
thanks


On Wed, Mar 05, 2008 at 04:49:21PM +0100, Aurelien Jarno wrote:
> reassign 469058 linux-2.6,gcc-4.3
> thanks
> 
> On Wed, Mar 05, 2008 at 04:10:53PM +0100, Aurelien Jarno wrote:
> That's definitely a kernel/gcc-4.3 problem; I have reported it
> upstream: http://lkml.org/lkml/2008/3/5/207
> 
> I am therefore reassigning the bug to those packages.
> 

Now that the situation is clearer, let's clone/reassign the bugs to
the right packages. While the kernels have to be fixed, gcc 4.3's
behavior wrt cld/std has to be reverted to the old behavior to
ensure an upgrade path from Etch.

So let's summarize:
- linux 2.6 has to be fixed. A patch is available on the lkml [1] for
  2.6.25-rc. It could easily be backported to 2.6.24.
- kfreebsd 6 and 7 exhibit the same behavior. They have to be fixed.
- hurd crashes in this case. It has to be fixed.
- gcc 4.3 should not strictly follow the ABI when it comes to cld/std
  and should use the old behavior, which has been a de facto ABI for
  some time.
- glibc 2.7-9 is broken with the current kernels. It has to be rebuilt
  with a fixed gcc.

[1] http://lkml.org/lkml/2008/3/5/306
[2] http://gcc.gnu.org/ml/gcc-patches/2006-12/msg00354.html
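
For readers unfamiliar with what "fixing the kernel" means here, the idea can be sketched as follows. This is a simplified illustration and not the patch from [1] or the kfreebsd one: struct saved_regs, DF_MASK and prepare_handler_entry are hypothetical names, and a real kernel deals with a full register frame. The point is only that the flags value the handler starts with has the direction flag bit masked out, while the interrupted code's own flags stay saved elsewhere and are restored on return.

```c
#include <stdint.h>

/* Direction flag: bit 10 of (E/R)FLAGS. Local definition for this sketch. */
#define DF_MASK (UINT64_C(1) << 10)

/* Hypothetical, heavily reduced view of the registers the kernel sets up
 * for the handler's first instruction. */
struct saved_regs {
    uint64_t flags;  /* flags the handler will start with */
    uint64_t ip;     /* instruction pointer the handler will start at */
};

/* Point execution at the handler and make sure it starts with DF = 0,
 * as the x86/x86-64 ABI requires on function entry. */
void prepare_handler_entry(struct saved_regs *regs, uint64_t handler_addr)
{
    regs->ip = handler_addr;
    regs->flags &= ~DF_MASK;
}
```

With this in place, user code that is interrupted in the middle of a backward string operation no longer leaks DF = 1 into the handler.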

-- 
  .''`.  Aurelien Jarno	            | GPG: 1024D/F1BCDB73
 : :' :  Debian developer           | Electrical Engineer
 `. `'   aurel32@debian.org         | aurelien@aurel32.net
   `-    people.debian.org/~aurel32 | www.aurel32.net




Bug reassigned from package `linux-2.6,gcc-4.3' to `linux-2.6'. Request was from Aurelien Jarno <aurelien@aurel32.net> to control@bugs.debian.org. (Wed, 05 Mar 2008 22:03:06 GMT) Full text and rfc822 format available.

Changed Bug submitter from Rupert Swarbrick <rswarbrick@googlemail.com> to aurel32@debian.org. Request was from Aurelien Jarno <aurelien@aurel32.net> to control@bugs.debian.org. (Wed, 05 Mar 2008 22:03:07 GMT) Full text and rfc822 format available.

Bug 469058 cloned as bugs 469564, 469565, 469566, 469567, 469568. Request was from Aurelien Jarno <aurelien@aurel32.net> to control@bugs.debian.org. (Wed, 05 Mar 2008 22:03:07 GMT) Full text and rfc822 format available.

Bug reassigned from package `linux-2.6' to `kfreebsd-6'. Request was from Aurelien Jarno <aurelien@aurel32.net> to control@bugs.debian.org. (Wed, 05 Mar 2008 22:03:16 GMT) Full text and rfc822 format available.

Changed Bug title to `FreeBSD kernel doesn't follow x86/x86-64 ABI wrt direction flag' from `Linux doesn't follow x86/x86-64 ABI wrt direction flag'. Request was from Aurelien Jarno <aurelien@aurel32.net> to control@bugs.debian.org. (Wed, 05 Mar 2008 22:03:17 GMT) Full text and rfc822 format available.

Removed annotation that Bug had been forwarded to http://lkml.org/lkml/2008/3/5/207. Request was from Aurelien Jarno <aurel32@debian.org> to control@bugs.debian.org. (Wed, 05 Mar 2008 22:06:04 GMT) Full text and rfc822 format available.

Reply sent to Aurelien Jarno <aurel32@debian.org>:
You have taken responsibility. Full text and rfc822 format available.

Notification sent to aurel32@debian.org:
Bug acknowledged by developer. Full text and rfc822 format available.

Message #130 received at 469564-close@bugs.debian.org (full text, mbox):

From: Aurelien Jarno <aurel32@debian.org>
To: 469564-close@bugs.debian.org
Subject: Bug#469564: fixed in kfreebsd-6 6.3-4
Date: Thu, 06 Mar 2008 00:17:04 +0000
Source: kfreebsd-6
Source-Version: 6.3-4

We believe that the bug you reported is fixed in the latest version of
kfreebsd-6, which is due to be installed in the Debian FTP archive:

kfreebsd-6_6.3-4.diff.gz
  to pool/main/k/kfreebsd-6/kfreebsd-6_6.3-4.diff.gz
kfreebsd-6_6.3-4.dsc
  to pool/main/k/kfreebsd-6/kfreebsd-6_6.3-4.dsc
kfreebsd-source-6.3_6.3-4_all.deb
  to pool/main/k/kfreebsd-6/kfreebsd-source-6.3_6.3-4_all.deb



A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to 469564@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Aurelien Jarno <aurel32@debian.org> (supplier of updated kfreebsd-6 package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@debian.org)


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Format: 1.7
Date: Thu, 06 Mar 2008 00:51:59 +0100
Source: kfreebsd-6
Binary: kfreebsd-source-6.3 kfreebsd-headers-6.3-1 kfreebsd-image-6.3-1-amd64-generic ndiswrapper-modules-6.3-1-amd64-generic kfreebsd-image-6-amd64-generic kfreebsd-headers-6.3-1-amd64-generic kfreebsd-headers-6-amd64-generic kfreebsd-image-6.3-1-amd64-k8 ndiswrapper-modules-6.3-1-amd64-k8 kfreebsd-image-6-amd64-k8 kfreebsd-headers-6.3-1-amd64-k8 kfreebsd-headers-6-amd64-k8 kfreebsd-image-6.3-1-amd64-k8-smp ndiswrapper-modules-6.3-1-amd64-k8-smp kfreebsd-image-6-amd64-k8-smp kfreebsd-headers-6.3-1-amd64-k8-smp kfreebsd-headers-6-amd64-k8-smp kfreebsd-image-6.3-1-em64t-p4 ndiswrapper-modules-6.3-1-em64t-p4 kfreebsd-image-6-em64t-p4 kfreebsd-headers-6.3-1-em64t-p4 kfreebsd-headers-6-em64t-p4 kfreebsd-image-6.3-1-em64t-p4-smp ndiswrapper-modules-6.3-1-em64t-p4-smp kfreebsd-image-6-em64t-p4-smp kfreebsd-headers-6.3-1-em64t-p4-smp kfreebsd-headers-6-em64t-p4-smp kfreebsd-image-6.3-1-486 ndiswrapper-modules-6.3-1-486 kfreebsd-image-6-486 kfreebsd-headers-6.3-1-486 kfreebsd-headers-6-486 kfreebsd-image-6.3-1-586 ndiswrapper-modules-6.3-1-586 kfreebsd-image-6-586 kfreebsd-headers-6.3-1-586 kfreebsd-headers-6-586 kfreebsd-image-6.3-1-586-smp ndiswrapper-modules-6.3-1-586-smp kfreebsd-image-6-586-smp kfreebsd-headers-6.3-1-586-smp kfreebsd-headers-6-586-smp kfreebsd-image-6.3-1-686 ndiswrapper-modules-6.3-1-686 kfreebsd-image-6-686 kfreebsd-headers-6.3-1-686 kfreebsd-headers-6-686 kfreebsd-image-6.3-1-686-smp ndiswrapper-modules-6.3-1-686-smp kfreebsd-image-6-686-smp kfreebsd-headers-6.3-1-686-smp kfreebsd-headers-6-686-smp
Architecture: source all
Version: 6.3-4
Distribution: unstable
Urgency: low
Maintainer: Aurelien Jarno <aurel32@debian.org>
Changed-By: Aurelien Jarno <aurel32@debian.org>
Description: 
 kfreebsd-source-6.3 - source code for kernel of FreeBSD 6.3 with Debian patches
Closes: 469564
Changes: 
 kfreebsd-6 (6.3-4) unstable; urgency=low
 .
   * 030_abi_cld.diff: new patch to clear the direction flag before calling
     a signal handler (Closes: bug#469564).
   * Fix debian/copyright.
Files: 
 72d729888bd7149d8bc3e476f15268b4 2419 devel optional kfreebsd-6_6.3-4.dsc
 f05f121cdec1d63ab070331df0773dfe 48444 devel optional kfreebsd-6_6.3-4.diff.gz
 730be4e78c5d182a9df0033ba5a0a5b1 15483692 devel optional kfreebsd-source-6.3_6.3-4_all.deb

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHzzX4w3ao2vG823MRAiHhAJ0ZEbZQ0Z/bZvLyFIbLWnKjNMN4JACfY0TW
ii5SzshXO0QVvFoyod8VDkc=
=WkM5
-----END PGP SIGNATURE-----





Bug archived. Request was from Debbugs Internal Request <owner@bugs.debian.org> to internal_control@bugs.debian.org. (Thu, 10 Apr 2008 07:33:30 GMT) Full text and rfc822 format available.


Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Thu Apr 17 07:20:37 2014; Machine Name: beach.debian.org

Debian Bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.