Debian Bug report logs -
#368991
banshee: crash when click play
Reported by: Nicholas Crespi <roundtrip@gmail.com>
Date: Fri, 26 May 2006 15:48:05 UTC
Severity: grave
Tags: fixed, patch
Merged with 369450, 369733, 370451
Found in version liboil/0.3.9-1.1
Fixed in versions 0.3.9-1.1, 0.3.9-1.2
Done: Steve Langasek <vorlon@debian.org>
Bug is archived. No further changes may be made.
Report forwarded to debian-bugs-dist@lists.debian.org, roundtrip@gmail.com, Sebastian Dröge <slomo@ubuntu.com>:
Bug#368991; Package banshee.
Acknowledgement sent to Nicholas Crespi <roundtrip@gmail.com>:
New Bug report received and forwarded. Copy sent to roundtrip@gmail.com, Sebastian Dröge <slomo@ubuntu.com>.
Message #5 received at submit@bugs.debian.org (full text, mbox, reply):
Package: banshee
Version: 0.10.10-2
Severity: grave
Justification: renders package unusable
clicking on the play icon, banshee crashes with this error:
ERROR: Caught a segmentation fault while loading plugin file:
/usr/lib/gstreamer-0.10/libgstaudioresample.so
Please either:
- remove it and restart.
- run with --gst-disable-segtrap and debug.
-- System Information:
Debian Release: testing/unstable
Architecture: i386 (i686)
Shell: /bin/sh linked to /bin/bash
Kernel: Linux 2.6.16-1-686
Locale: LANG=it_IT@euro, LC_CTYPE=it_IT@euro (charmap=ISO-8859-15)
Versions of packages banshee depends on:
ii gconf2 2.14.0-1 GNOME configuration database syste
ii gstreamer0.10-gnomevfs 0.10.7-2 GStreamer plugin for GnomeVFS
ii gstreamer0.10-plugins-base 0.10.7-2 GStreamer plugins from the "base"
ii gstreamer0.10-plugins-good 0.10.3-2 GStreamer plugins from the "good"
ii gstreamer0.10-plugins-ugly 0.10.3-1 GStreamer plugins from the "ugly"
ii libart-2.0-2 2.3.17-1 Library of functions for 2D graphi
ii libatk1.0-0 1.11.4-2 The ATK accessibility toolkit
ii libbonobo2-0 2.14.0-1 Bonobo CORBA interfaces library
ii libbonoboui2-0 2.14.0-2 The Bonobo UI library
ii libc6 2.3.6-9 GNU C Library: Shared libraries
ii libcairo2 1.0.4-2 The Cairo 2D vector graphics libra
ii libdbus-1-2 0.61-5 simple interprocess messaging syst
ii libdbus-1-cil 0.61-5 CLI binding for D-BUS interprocess
ii libdbus-glib-1-2 0.61-5 simple interprocess messaging syst
ii libfontconfig1 2.3.2-5.1 generic font configuration library
ii libgconf2-4 2.14.0-1 GNOME configuration database syste
ii libgconf2.0-cil 2.8.2-2 CLI binding for GConf 2.12
ii libglade2.0-cil 2.8.2-2 CLI binding for the Glade librarie
ii libglib2.0-0 2.10.2-2 The GLib library of C routines
ii libglib2.0-cil 2.8.2-2 CLI binding for the GLib utility l
ii libgnome-desktop-2 2.14.1.1-1 Utility library for loading .deskt
ii libgnome-keyring0 0.4.9-1 GNOME keyring services library
ii libgnome2-0 2.14.1-2 The GNOME 2 library - runtime file
ii libgnome2.0-cil 2.8.2-2 CLI binding for GNOME 2.12
ii libgnomecanvas2-0 2.14.0-2 A powerful object-oriented display
ii libgnomeui-0 2.14.1-1 The GNOME 2 libraries (User Interf
ii libgnomevfs2-0 2.14.1-2 GNOME virtual file-system (runtime
ii libgstreamer0.10-0 0.10.6-2 Core GStreamer libraries and eleme
ii libgtk2.0-0 2.8.17-2 The GTK+ graphical user interface
ii libgtk2.0-cil 2.8.2-2 CLI binding for the GTK+ toolkit 2
ii libhal1 0.5.7-2 Hardware Abstraction Layer - share
ii libice6 1:1.0.0-3 X11 Inter-Client Exchange library
ii libipoddevice0 0.4.5-2 library for retrieving information
ii libmono-corlib1.0-cil 1.1.13.6-4 Mono core library (1.0)
ii libmono-security1.0-cil 1.1.13.6-4 Mono Security library
ii libmono-sqlite1.0-cil 1.1.13.6-4 Mono Sqlite library
ii libmono-system-data1.0-cil 1.1.13.6-4 Mono System.Data library
ii libmono-system-web1.0-cil 1.1.13.6-4 Mono System.Web library
ii libmono-system1.0-cil 1.1.13.6-4 Mono System libraries (1.0)
ii libmono1.0-cil 1.1.13.6-4 Mono libraries (1.0)
ii libmusicbrainz4c2a 2.1.2-4 Second generation incarnation of t
ii libnautilus-burn2 2.12.3-2 Nautilus Burn Library - runtime ve
ii liborbit2 1:2.14.0-1 libraries for ORBit2 - a CORBA ORB
ii libpango1.0-0 1.12.1-3 Layout and rendering of internatio
ii libpopt0 1.7-5 lib for parsing cmdline parameters
ii libsm6 1:1.0.0-4 X11 Session Management library
ii libstartup-notification0 0.8-1 library for program launch feedbac
ii libx11-6 2:1.0.0-6 X11 client-side library
ii libxcursor1 1.1.5.2-5 X cursor management library
ii libxext6 1:1.0.0-4 X11 miscellaneous extension librar
ii libxfixes3 1:3.0.1.2-4 X11 miscellaneous 'fixes' extensio
ii libxi6 1:1.0.0-5 X11 Input extension library
ii libxinerama1 1:1.0.1-4 X11 Xinerama extension library
ii libxml2 2.6.24.dfsg-1 GNOME XML library
ii libxrandr2 2:1.1.0.2-4 X11 RandR extension library
ii libxrender1 1:0.9.0.2-4 X Rendering Extension client libra
ii mono-runtime 1.1.13.6-4 Mono runtime
ii zlib1g 1:1.2.3-11 compression library - runtime
banshee recommends no packages.
-- no debconf information
Information forwarded to debian-bugs-dist@lists.debian.org, Sebastian Dröge <slomo@ubuntu.com>:
Bug#368991; Package banshee.
Acknowledgement sent to Sebastian Dröge <slomo@slomosnail.de>:
Extra info received and forwarded to list. Copy sent to Sebastian Dröge <slomo@ubuntu.com>.
Message #10 received at 368991@bugs.debian.org (full text, mbox, reply):
On Friday, 26.05.2006, at 17:46 +0200, Nicholas Crespi wrote:
> Package: banshee
> Version: 0.10.10-2
> Severity: grave
> Justification: renders package unusable
>
> clicking on the play icon, banshee crashes with this error:
> ERROR: Caught a segmentation fault while loading plugin file:
> /usr/lib/gstreamer-0.10/libgstaudioresample.so
>
> Please either:
> - remove it and restart.
> - run with --gst-disable-segtrap and debug.
Which version of liboil do you have?
Am I right that you use a CPU with SSE instruction set?
Bye
Information forwarded to debian-bugs-dist@lists.debian.org, Sebastian Dröge <slomo@ubuntu.com>:
Bug#368991; Package banshee.
Acknowledgement sent to "Nicholas Crespi" <roundtrip@gmail.com>:
Extra info received and forwarded to list. Copy sent to Sebastian Dröge <slomo@ubuntu.com>.
Message #15 received at 368991@bugs.debian.org (full text, mbox, reply):
Pentium M (Banias) 1400 MHz with SSE and SSE2 (from /proc/cpuinfo).
liboil is at 0.3.9-1.
I downgraded the library to the version in testing, and now it works.
ps: should I post it also to the liboil buglist?
Information forwarded to debian-bugs-dist@lists.debian.org, Sebastian Dröge <slomo@ubuntu.com>:
Bug#368991; Package banshee.
Acknowledgement sent to Sebastian Dröge <slomo@slomosnail.de>:
Extra info received and forwarded to list. Copy sent to Sebastian Dröge <slomo@ubuntu.com>.
Message #20 received at 368991@bugs.debian.org (full text, mbox, reply):
reassign 368991 liboil
thanks
On Friday, 26.05.2006, at 18:57 +0200, Nicholas Crespi wrote:
> pentium M banias 1400Mhz with sse, sse2 (from /proc/cpuinfo)
>
> liboil is at 0.3.9-1
>
> I tried to downgrade the lib to the testing version and now it works
>
> ps: should I post it also to the liboil buglist?
Not needed, I've reassigned this bug to liboil already.
Bye
Bug reassigned from package `banshee' to `liboil'.
Request was from Sebastian Dröge <slomo@slomosnail.de>
to control@bugs.debian.org.
Information forwarded to debian-bugs-dist@lists.debian.org:
Bug#368991; Package liboil.
Acknowledgement sent to David Schleef <ds@schleef.org>:
Extra info received and forwarded to list.
Message #27 received at 368991@bugs.debian.org (full text, mbox, reply):
On Fri, May 26, 2006 at 10:33:29AM -0700, Debian Bug Tracking System wrote:
> Processing commands for control@bugs.debian.org:
>
> > reassign 368991 liboil
> Bug#368991: banshee: crash when click play
> Bug reassigned from package `banshee' to `liboil'.
Why is this a liboil bug?
dave...
--
David Schleef
Big Kitten LLC (http://www.bigkitten.com/) -- data acquisition on Linux
Information forwarded to debian-bugs-dist@lists.debian.org:
Bug#368991; Package liboil.
Acknowledgement sent to David Schleef <ds@schleef.org>:
Extra info received and forwarded to list.
Message #32 received at 368991@bugs.debian.org (full text, mbox, reply):
Submitter: please run oil-bugreport on the affected machine and
attach the result. Please also provide a banshee backtrace if
possible.
dave...
--
David Schleef
Big Kitten LLC (http://www.bigkitten.com/) -- data acquisition on Linux
Bug reassigned from package `liboil' to `liboil'.
Request was from Sebastian Dröge <slomo@slomosnail.de>
to control@bugs.debian.org.
Information forwarded to debian-bugs-dist@lists.debian.org, David Schleef <ds@schleef.org>:
Bug#368991; Package liboil.
Acknowledgement sent to Christian Aichinger <Greek0@gmx.net>:
Extra info received and forwarded to list. Copy sent to David Schleef <ds@schleef.org>.
Message #47 received at 368991@bugs.debian.org (full text, mbox, reply):
On Fri, May 26, 2006 at 06:06:03PM +0200, Sebastian Dröge wrote:
> On Friday, 26.05.2006, at 17:46 +0200, Nicholas Crespi wrote:
> > clicking on the play icon, banshee crashes with this error:
> > ERROR: Caught a segmentation fault while loading plugin file:
> > /usr/lib/gstreamer-0.10/libgstaudioresample.so
[...]
> Which version of liboil do you have?
> Am I right that you use a CPU with SSE instruction set?
Hi,
I'm able to reproduce that bug on my laptop:
| $ cat /proc/cpuinfo | egrep 'name|flags'
| model name : Intel(R) Pentium(R) M processor 1.50GHz
| flags : fpu vme de pse tsc msr mce cx8 sep mtrr pge mca
| cmov pat clflush dts acpi mmx fxsr sse sse2 ss tm
| pbe est tm2
I've attached the output of oil-bugreport.
To actually debug the problem you need to get rid of that annoying
segtrap "feature". I haven't found any sane way to do this without
recompiling gstreamer.
I've attached a patch to the gstreamer sources that permanently
disables the feature (gst-diff). When the fixed gstreamer packages
are installed you can finally debug the problem with gdb:
| $ gdb /usr/lib/banshee/banshee.exe
When pressing play in banshee it segfaults at varying addresses,
however always at <liboil0.3 + 0x2101b>:
| Program received signal SIGSEGV, Segmentation fault.
| 0xb47a401b in ?? ()
| (gdb) bt
| #0 0xb47a401b in ?? ()
| [... garbage ...]
The segfault also occurs when liboil is compiled with noopt, however
the problem is slightly different (and much dumber) then.
With optimization the segfault happens in
composite_in_argb_sse_2pix(), at the following code piece:
00020ff0 <composite_in_argb_sse_2pix>:
20ff0: 55 push %ebp
20ff1: 89 e5 mov %esp,%ebp
20ff3: 57 push %edi
20ff4: 56 push %esi
20ff5: 53 push %ebx
20ff6: 83 ec 4c sub $0x4c,%esp
20ff9: 8b 7d 14 mov 0x14(%ebp),%edi
20ffc: e8 54 05 ff ff call 11555 <__i686.get_pc_thunk.bx>
21001: 81 c3 a3 91 02 00 add $0x291a3,%ebx
21007: 8b 75 10 mov 0x10(%ebp),%esi
2100a: 83 ff 01 cmp $0x1,%edi
2100d: 0f 8e 9f 00 00 00 jle 210b2 <composite_in_argb_sse_2pix+0xc2>
21013: 66 0f 6f 83 4c 70 ff movdqa 0xffff704c(%ebx),%xmm0
2101a: ff
2101b: 66 0f 7f 45 c8 movdqa %xmm0,0xffffffc8(%ebp) <==== SEGV
21020: c7 45 e0 00 00 00 00 movl $0x0,0xffffffe0(%ebp)
21027: 8b 45 0c mov 0xc(%ebp),%eax
2102a: 66 0f ef db pxor %xmm3,%xmm3
The function gets called by a rather long chain starting at oil_init
(via oil_optimize_all).
I guess it must be some alignment problem, however I don't know
enough of mmx and sse to fix this :-/. Also I haven't been able to
reproduce this problem outside of banshee, which would have
simplified debugging.
Any ideas what might be the cause of it? If you have any hints or
questions I'd be happy to help or try patches.
Cheers,
Christian Aichinger
[oil-bug (text/plain, attachment)]
[gst-diff (text/plain, attachment)]
Information forwarded to debian-bugs-dist@lists.debian.org, David Schleef <ds@schleef.org>:
Bug#368991; Package liboil.
Acknowledgement sent to Sebastian Dröge <slomo@slomosnail.de>:
Extra info received and forwarded to list. Copy sent to David Schleef <ds@schleef.org>.
Message #52 received at 368991@bugs.debian.org (full text, mbox, reply):
Hi,
attached is the output of oil-bugreport on an affected machine (Pentium
IV) and a backtrace of banshee after the segfault. I hope this helps.
The same also happens to muine when compiled with gst0.10 support btw
Bye
[banshee-backtrace.txt (text/plain, attachment)]
[oil-bugreport.txt (text/plain, attachment)]
Information forwarded to debian-bugs-dist@lists.debian.org, David Schleef <ds@schleef.org>:
Bug#368991; Package liboil.
Acknowledgement sent to Christian Aichinger <Greek0@gmx.net>:
Extra info received and forwarded to list. Copy sent to David Schleef <ds@schleef.org>.
Message #57 received at 368991@bugs.debian.org (full text, mbox, reply):
On Wed, Jun 07, 2006 at 05:22:38PM +0200, Sebastian Dröge wrote:
> Hi,
> attached is the output of oil-bugreport on an affected machine (Pentium
> IV) and a backtrace of banshee after the segfault. I hope this helps.
>
> The same also happens to muine when compiled with gst0.10 support btw
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread -1212806848 (LWP 5271)]
> 0xb1bf6f87 in add_f32_sse_unroll2 (dest=0x890bd30, src1=0x890c0c8, src2=0x890c460, n=100) at math_sse_unroll2.c:44
> 44 xmm0 = _mm_loadu_ps(src1);
> (gdb) bt
> #0 0xb1bf6f87 in add_f32_sse_unroll2 (dest=0x890bd30, src1=0x890c0c8, src2=0x890c460, n=100) at math_sse_unroll2.c:44
I assume you got this backtrace with a liboil compiled with -O0. If
that's the case, this is _not_ the place where it crashes with -O2.
I wasted several hours at this point myself, sadly :-/
If you care, you can try it with an unmodified liboil. At the crash
`bt` won't really work, but `x/20i $pc` will show the code I posted
in my report (starting at the <==== SEGV marker). You should be able
to get the same trace as I did via `x/20i ($pc-0x2b)`.
Cheers,
Christian Aichinger
Information forwarded to debian-bugs-dist@lists.debian.org, David Schleef <ds@schleef.org>:
Bug#368991; Package liboil.
Acknowledgement sent to Christian Aichinger <Greek0@gmx.net>:
Extra info received and forwarded to list. Copy sent to David Schleef <ds@schleef.org>.
Message #62 received at 368991@bugs.debian.org (full text, mbox, reply):
I can reproduce this bug outside of banshee and mono *wheee*. Also,
I think I understand the problem too now, and what really causes it.
First of all, the repro steps:
| $ cat a.c
| #include <stdio.h>
| #include <liboil/liboil.h>
|
| int main() {
| printf("oil_init...\n");
| oil_init();
| printf("Done...\n");
| return 0;
| }
|
| $ gcc -I/usr/include/liboil-0.3 -Wall -ggdb a.c -c -o a.o -mpreferred-stack-boundary=2
| $ gcc a.o /usr/lib/liboil-0.3.so
| $ ./a.out
| oil_init...
| zsh: segmentation fault ./a.out
The segfault is caused by the optimize_all call inside of oil_init,
when it tries all possible implementations.
The crash happens here:
| 0xb453c013 <composite_in_argb_sse_2pix+35>: movdqa 0xffff704c(%ebx),%xmm0
| 0xb453c01b <composite_in_argb_sse_2pix+43>: movdqa %xmm0,0xffffffc8(%ebp) <=== SEGV
This corresponds to the following code in composite_sse_2pix.c:
| static inline __m128i
| muldiv_255_sse2(__m128i a, __m128i b)
| {
| __m128i ret;
| __m128i roundconst = MC(8x0080);
|
| ret = _mm_mullo_epi16(a, b);
| ret = _mm_adds_epu16(ret, roundconst);
The problem is that gcc somehow believes it has to copy
that MC(8x0080) constant to the stack. Gcc copies the constant
using movdqa, which requires its memory operands to be 16-byte
aligned. If an operand is not aligned, the CPU raises a #GP
exception, which the kernel translates to a SEGV [1].
Normally this is not problematic, since gcc aligns the stack
boundary to 16 bytes by default. However this doesn't seem to hold
for mono/banshee, or if one manually changes that alignment.
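The alignment rule described above can be illustrated with a small sketch. It is not taken from liboil: the helper names `is_aligned16` and `store_8x0080` are hypothetical, and the code assumes an x86 compiler with SSE2 enabled (the default on amd64, `-msse2` on i386). The aligned store intrinsic normally compiles to movdqa and the unaligned one to movdqu, so checking the address first avoids the fault:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h>

/* Returns 1 if p is a multiple of 16, the alignment movdqa requires
 * of its memory operand. (Hypothetical helper, not part of liboil.) */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 0xf) == 0;
}

/* Writes the 8x0080 rounding constant (eight 16-bit lanes of 0x0080)
 * to dst. The aligned store is used only when dst is really 16-byte
 * aligned; otherwise the slower unaligned store is used. */
static void store_8x0080(void *dst)
{
    const __m128i v = _mm_set1_epi16(0x0080);

    assert(dst != NULL);
    if (is_aligned16(dst))
        _mm_store_si128((__m128i *)dst, v);   /* typically movdqa */
    else
        _mm_storeu_si128((__m128i *)dst, v);  /* typically movdqu */
}
```

The compiler-generated copy in the report has no such check: gcc emits movdqa unconditionally because it assumes the frame is 16-byte aligned.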
Gcc can be convinced to optimize roundconst away and directly use
the MC(8x0080) constant, so that particular segfault goes away
(patches attached). There are however several other segfaults in
other places.
A fix can be found for some of them, but the problem is that you'd
have to prevent the use of any __m128 constants on the stack. This
means no local variables, no implicit copies by gcc, ...
That's quite a major PITA, if it's even possible at all.
The other possibility is to tell gcc that it's got to 16-byte align
those variables, no matter what. There's an alignment attribute for
that, which can be either applied to variables[2] or to types[3].
However when I tried it out, it didn't work, gcc(-4.0) always
generated the same faulty code that relied on the frame starting at
a multiple of 16.
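For comparison, one generic way to get a 16-byte aligned scratch slot without relying on the frame alignment or on the alignment attribute is to over-allocate and round the pointer up. This is only a sketch of that general technique, not what the attached patches do, and `align_up16` is a hypothetical helper name. It also does not help with copies that gcc places on the stack implicitly:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rounds p up to the next multiple of 16. The caller must reserve at
 * least 15 spare bytes behind p so the rounded pointer stays inside
 * the buffer. (Hypothetical helper, not part of liboil.) */
static void *align_up16(void *p)
{
    assert(p != NULL);
    return (void *)(((uintptr_t)p + 15u) & ~(uintptr_t)15u);
}
```

A caller would declare something like `unsigned char raw[16 + 15];` and use `align_up16(raw)` as the address of a 16-byte slot that is safe for aligned SSE loads and stores.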
To conclude, manually fixing up all this stuff seems impossible, and
getting gcc to solve it didn't work for me either.
So unless you have any better ideas we could ask the gcc folks if
they know a solution for this.
HTH,
Christian Aichinger
[1] http://enrico.phys.cmu.edu/QCDcluster/intel/vtune/reference/vc183.htm
[2] http://gcc.gnu.org/onlinedocs/gcc-3.3.1/gcc/Variable-Attributes.html
[3] http://gcc.gnu.org/onlinedocs/gcc-3.3.1/gcc/Type-Attributes.html
[liboil-sse-fix-sse_2pix.diff (text/plain, attachment)]
[liboil-sse-fix-sse_4pix.diff (text/plain, attachment)]
Information forwarded to debian-bugs-dist@lists.debian.org:
Bug#368991; Package liboil.
Acknowledgement sent to David Schleef <ds@schleef.org>:
Extra info received and forwarded to list.
Message #67 received at 368991@bugs.debian.org (full text, mbox, reply):
On Thu, Jun 08, 2006 at 07:24:16AM +0200, Christian Aichinger wrote:
> Normally this is not problematic, since gcc aligns the stack
> boundary to 16 bytes by default. However this doesn't seem to hold
> for mono/banshee, or if one manually changes that alignment.
This makes sense. Thanks for figuring this out.
GCC is really dumb in this area, since it often assumes things about
stack alignment that just aren't true. GCC doesn't even always follow
the rules it assumes.
In general, liboil has been able to avoid these situations on other
architectures, so I'll just fix the code here.
Thankfully, this should be easy to put into a testsuite.
dave...
--
David Schleef
Big Kitten LLC (http://www.bigkitten.com/) -- data acquisition on Linux
Information forwarded to debian-bugs-dist@lists.debian.org, David Schleef <ds@schleef.org>:
Bug#368991; Package liboil.
Acknowledgement sent to "Steinar H. Gunderson" <sgunderson@bigfoot.com>:
Extra info received and forwarded to list. Copy sent to David Schleef <ds@schleef.org>.
Message #72 received at 368991@bugs.debian.org (full text, mbox, reply):
On Thu, Jun 08, 2006 at 02:06:01PM -0700, David Schleef wrote:
> GCC is really dumb in this area, since it often assumes things about
> stack alignment that just aren't true. GCC doesn't even always follow
> the rules it assumes.
>
> In general, liboil has been able to avoid these situations on other
> architectures, so I'll just fix the code here.
Is there any progress on this? This bug is currently (indirectly) what's
holding up removal of xorg-x11 from unstable, so I guess fixing it soonish
would be a good idea, if at all possible. :-)
/* Steinar */
--
Homepage: http://www.sesse.net/
Information forwarded to debian-bugs-dist@lists.debian.org, David Schleef <ds@schleef.org>:
Bug#368991; Package liboil.
Acknowledgement sent to Loïc Minier <lool@dooz.org>:
Extra info received and forwarded to list. Copy sent to David Schleef <ds@schleef.org>.
Message #77 received at 368991@bugs.debian.org (full text, mbox, reply):
Hi Christian,
On Thu, Jun 08, 2006, Christian Aichinger wrote:
> | $ gcc -I/usr/include/liboil-0.3 -Wall -ggdb a.c -c -o a.o -mpreferred-stack-boundary=2
From the gcc man page, -mpreferred-stack-boundary flag:
[....]
To ensure proper alignment of this values on the stack, the stack
boundary must be as aligned as that required by any value stored on
the stack. Further, every function must be generated such that it
keeps the stack aligned. Thus calling a function compiled with a
higher preferred stack boundary from a function compiled with a
lower preferred stack boundary will most likely misalign the stack.
It is recommended that libraries that use callbacks always use the
default setting.
None of liboil, banshee, or even mono seem to be built with
-mpreferred-stack-boundary, yet I can imagine some of this software has
misaligned the stack. Is there a way to find out which and add stack
alignment code before external function calls?
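The argument of -mpreferred-stack-boundary is an exponent: the stack is kept aligned to 2 raised to that number of bytes, and the gcc default on x86 is 4. A minimal sketch of the arithmetic follows; the function names are hypothetical and exist only for illustration:

```c
#include <assert.h>

/* Byte alignment that -mpreferred-stack-boundary=n asks gcc to keep. */
static unsigned stack_boundary_bytes(unsigned n)
{
    assert(n < 32);
    return 1u << n;
}

/* Returns 1 if a stack kept at boundary n can hold a value that needs
 * required_bytes alignment (16 for an __m128i) without realignment. */
static int boundary_satisfies(unsigned n, unsigned required_bytes)
{
    return stack_boundary_bytes(n) % required_bytes == 0;
}
```

With n=2, as in the reproduction above, the stack is only guaranteed to be 4-byte aligned, which is not enough for a 16-byte movdqa operand. With the default n=4 it is 16-byte aligned, provided every caller in the chain keeps it that way.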
Bye,
--
Loïc Minier <lool@dooz.org>
Information forwarded to debian-bugs-dist@lists.debian.org, David Schleef <ds@schleef.org>:
Bug#368991; Package liboil.
Acknowledgement sent to Christian Aichinger <Greek0@gmx.net>:
Extra info received and forwarded to list. Copy sent to David Schleef <ds@schleef.org>.
Message #82 received at 368991@bugs.debian.org (full text, mbox, reply):
On Mon, Jun 19, 2006 at 06:28:40PM +0200, Loïc Minier wrote:
> From the gcc man page, -mpreferred-stack-boundary flag:
> To ensure proper alignment of this values on the stack, the stack
> boundary must be as aligned as that required by any value stored on
> the stack.
[...]
> None of liboil, banshee, or even mono seem to be built with
> -mpreferred-stack-boundary, yet I can imagine some of this software has
> misaligned the stack. Is there a way to find out which and add stack
> alignment code before external function calls?
After some discussion with Loïc on IRC, I've implemented something
like this, and it works for both of us. So I'm pretty sure it works
fine on i386.
Where I'm not sure it works is amd64, since I don't have access
to an amd64 machine. Michael Ablassmeier reported that it works
there, though.
The idea is to add a little wrapper function around the sse
functions to make sure the stack is aligned before they are called.
It would be nice if gcc could do this; without its support, some asm
magic does work, though.
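As a point of comparison only: GCC releases newer than the gcc-4.0 discussed in this report provide a `force_align_arg_pointer` function attribute on x86 that makes the compiler realign the stack in the function prologue, which is the same effect the hand-written wrapper aims for. The sketch below assumes such a compiler with SSE2 enabled; the function name is hypothetical and the code is not part of the attached patch, which uses inline asm instead:

```c
#include <assert.h>
#include <stdint.h>
#include <emmintrin.h>

/* The attribute asks the compiler to realign the stack on entry, so
 * 16-byte stack slots are safe even if the caller's stack was only
 * 4-byte aligned. (Hypothetical example, not liboil code.) */
__attribute__((force_align_arg_pointer))
uint16_t first_lane_of_spilled_const(void)
{
    /* volatile forces the constant into a real stack slot, similar to
     * the roundconst copy that crashed in the report */
    volatile __m128i roundconst = _mm_set1_epi16(0x0080);

    /* after realignment the slot must sit on a 16-byte boundary */
    assert(((uintptr_t)&roundconst & 0xf) == 0);

    __m128i copy = roundconst;
    return (uint16_t)_mm_extract_epi16(copy, 0);
}
```

Unlike the `_wrap` approach, this needs no second function per implementation, but it depends on compiler support that was not available to the participants in this thread.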
I've attached the patch. It would be nice if someone who knows amd64
assembly could review this, and add proper amd64 support via ifdefs
if needed.
Cheers,
Christian Aichinger
[liboil-368991-sse-segv-fix.3.diff (text/plain, attachment)]
Tags added: patch
Request was from Christian 'Greek0' Aichinger <greek0@GMX.net>
to control@bugs.debian.org.
Information forwarded to debian-bugs-dist@lists.debian.org, David Schleef <ds@schleef.org>:
Bug#368991; Package liboil.
Acknowledgement sent to Christian Aichinger <Greek0@gmx.net>:
Extra info received and forwarded to list. Copy sent to David Schleef <ds@schleef.org>.
Message #89 received at 368991@bugs.debian.org (full text, mbox, reply):
It seems liboil works fine with banshee on amd64, probably because
16 bytes is the minimum stack alignment there. So no fix is needed
on amd64, and copying stuff around on the stack would only cost
extra time.
So I've updated my patch to only adjust the stack on i386. It feels
suboptimal, since it needs that extra function, but I haven't found
a way to get rid of the _wrap functions with a preprocessor macro.
Anyway, it's better than no fix at all.
Cheers,
Christian Aichinger
[liboil-368991-sse-segv-fix.4.diff (text/plain, attachment)]
Acknowledgement sent to Loïc Minier <lool@dooz.org>:
Extra info received and filed, but not forwarded.
Message #94 received at 368991-quiet@bugs.debian.org (full text, mbox, reply):
Hi,
David, do you have any objection to the patch proposed to work around
the problem in GCC? (If you're busy, I am willing to prepare an NMU.)
Bye,
--
Loïc Minier <lool@dooz.org>
Information forwarded to debian-bugs-dist@lists.debian.org, David Schleef <ds@schleef.org>:
Bug#368991; Package liboil.
Acknowledgement sent to Andreas Barth <aba@not.so.argh.org>:
Extra info received and forwarded to list. Copy sent to David Schleef <ds@schleef.org>.
Message #99 received at 368991@bugs.debian.org (full text, mbox, reply):
Hi,
I uploaded an NMU of your package to make sure a fixed version goes into etch
(and to allow me to remove xorg-x11 (6.9) from etch).
Thanks for your work.
Cheers,
Andi
diff -Nur liboil-0.3.9/debian/changelog liboil-0.3.9~/debian/changelog
--- liboil-0.3.9/debian/changelog 2006-06-22 20:56:47.000000000 +0200
+++ liboil-0.3.9~/debian/changelog 2006-06-22 20:37:17.000000000 +0200
@@ -1,3 +1,15 @@
+liboil (0.3.9-1.1) unstable; urgency=low
+
+ * Non-maintainer upload.
+ * fix possible unalignment on i386 - this change not perfect
+ and should also contain a test suite, but is still better
+ than nothing at all. Thanks to Christian Aichinger for his
+ good work on this and the patch. Closes: #368991
+ (also keeping the patch around in the diff, so that it's
+ obvious what was changed)
+
+ -- Andreas Barth <aba@not.so.argh.org> Thu, 22 Jun 2006 19:31:26 +0200
+
liboil (0.3.9-1) unstable; urgency=low
* New upstream release.
diff -Nur liboil-0.3.9/liboil/sse/composite_sse_2pix.c liboil-0.3.9~/liboil/sse/composite_sse_2pix.c
--- liboil-0.3.9/liboil/sse/composite_sse_2pix.c 2005-12-21 02:27:54.000000000 +0100
+++ liboil-0.3.9~/liboil/sse/composite_sse_2pix.c 2006-06-22 20:36:42.000000000 +0200
@@ -32,6 +32,42 @@
#include <emmintrin.h>
#include <liboil/liboilcolorspace.h>
+/* Work around non-aligned stack frames (which causes the intristics to crash
+ * by making sure the stack frame is always aligned
+ */
+#if defined(__i386__)
+#define OIL_SSE_WRAPPER(name,ret, ...) \
+ ret name(__VA_ARGS__) __attribute__((used)); \
+ ret name ## _wrap (__VA_ARGS__) { \
+ OIL_SSE_WRAPPER_CALL(name); \
+ }
+
+#define OIL_SSE_WRAPPER_CALL(name) \
+ asm volatile( \
+ "\n\t" \
+ "subl $0x10,%%esp\n\t" \
+ "andl $0xfffffff0,%%esp\n\t" \
+ \
+ "movdqu 8(%%ebp),%%xmm0\n\t" \
+ "movdqa %%xmm0,(%%esp)\n\t" \
+ \
+ "call " #name "\n\t" \
+ "movl %%ebp,%%esp\n\t" \
+ : : \
+ : "eax","ecx","edx","xmm0")
+
+#elif defined(__amd64__)
+
+/* Needed because we call *_wrap. Should get optimized away anyway */
+#define OIL_SSE_WRAPPER(name,ret, ...) \
+ ret name ## _wrap (__VA_ARGS__) { \
+ name(__VA_ARGS__); \
+ }
+
+#else
+#error Can't use sse on !i386 and !amd64
+#endif
+
/* non-SSE2 compositing support */
#define COMPOSITE_OVER(d,s,m) ((d) + (s) - oil_muldiv_255((d),(m)))
#define COMPOSITE_ADD(d,s) oil_clamp_255((d) + (s))
@@ -41,20 +77,12 @@
* the channel value in the low byte. This means 2 pixels per pass.
*/
-union m128_int {
- __m128i m128;
- uint64_t ull[2];
-};
-
-static const struct _SSEData {
- union m128_int sse_8x00ff;
- union m128_int sse_8x0080;
-} c = {
- .sse_8x00ff.ull = {0x00ff00ff00ff00ffULL, 0x00ff00ff00ff00ffULL},
- .sse_8x0080.ull = {0x0080008000800080ULL, 0x0080008000800080ULL},
-};
+static const __m128i c_sse_8x00ff =
+ {0x00ff00ff00ff00ffULL, 0x00ff00ff00ff00ffULL};
+static const __m128i c_sse_8x0080 =
+ {0x0080008000800080ULL, 0x0080008000800080ULL};
-#define MC(x) (c.sse_##x.m128)
+#define MC(x) (c_sse_##x)
/* Shuffles the given value such that the alpha for each pixel appears in each
* channel of the pixel.
@@ -188,7 +216,11 @@
COMPOSITE_IN(oil_argb_B(*src), m));
}
}
-OIL_DEFINE_IMPL_FULL (composite_in_argb_const_src_sse_2pix,
+
+OIL_SSE_WRAPPER(composite_in_argb_const_src_sse_2pix, static void,
+ uint32_t *dest, const uint32_t *src, const uint8_t *mask, int n)
+
+OIL_DEFINE_IMPL_FULL (composite_in_argb_const_src_sse_2pix_wrap,
composite_in_argb_const_src, OIL_IMPL_FLAG_SSE2);
static void
@@ -216,7 +248,10 @@
COMPOSITE_IN(oil_argb_B(s), mask[0]));
}
}
-OIL_DEFINE_IMPL_FULL (composite_in_argb_const_mask_sse_2pix,
+OIL_SSE_WRAPPER(composite_in_argb_const_mask_sse_2pix, static void,
+ uint32_t *dest, const uint32_t *src, const uint8_t *mask, int n)
+
+OIL_DEFINE_IMPL_FULL (composite_in_argb_const_mask_sse_2pix_wrap,
composite_in_argb_const_mask, OIL_IMPL_FLAG_SSE2);
static void
@@ -272,7 +307,11 @@
*dest++ = d;
}
}
-OIL_DEFINE_IMPL_FULL (composite_over_argb_const_src_sse_2pix,
+
+OIL_SSE_WRAPPER(composite_over_argb_const_src_sse_2pix, static void,
+ uint32_t *dest, const uint32_t *src, int n)
+
+OIL_DEFINE_IMPL_FULL (composite_over_argb_const_src_sse_2pix_wrap,
composite_over_argb_const_src, OIL_IMPL_FLAG_SSE2);
static void
@@ -309,8 +348,12 @@
*dest++ = d;
}
}
-OIL_DEFINE_IMPL_FULL (composite_in_over_argb_sse_2pix, composite_in_over_argb,
- OIL_IMPL_FLAG_SSE2);
+
+OIL_SSE_WRAPPER(composite_in_over_argb_sse_2pix , static void,
+ uint32_t *dest, const uint32_t *src, const uint8_t *mask, int n)
+
+OIL_DEFINE_IMPL_FULL (composite_in_over_argb_sse_2pix_wrap,
+ composite_in_over_argb, OIL_IMPL_FLAG_SSE2);
static void
composite_in_over_argb_const_src_sse_2pix (uint32_t *dest, const uint32_t *src,
@@ -348,7 +391,11 @@
*dest++ = d;
}
}
-OIL_DEFINE_IMPL_FULL (composite_in_over_argb_const_src_sse_2pix,
+
+OIL_SSE_WRAPPER(composite_in_over_argb_const_src_sse_2pix , static void,
+ uint32_t *dest, const uint32_t *src, const uint8_t *mask, int n)
+
+OIL_DEFINE_IMPL_FULL (composite_in_over_argb_const_src_sse_2pix_wrap,
composite_in_over_argb_const_src, OIL_IMPL_FLAG_SSE2);
static void
@@ -387,7 +434,11 @@
*dest++ = d;
}
}
-OIL_DEFINE_IMPL_FULL (composite_in_over_argb_const_mask_sse_2pix,
+
+OIL_SSE_WRAPPER(composite_in_over_argb_const_mask_sse_2pix, static void,
+ uint32_t *dest, const uint32_t *src, const uint8_t *mask, int n)
+
+OIL_DEFINE_IMPL_FULL (composite_in_over_argb_const_mask_sse_2pix_wrap,
composite_in_over_argb_const_mask, OIL_IMPL_FLAG_SSE2);
static void
diff -Nur liboil-0.3.9/liboil/sse/composite_sse_4pix.c liboil-0.3.9~/liboil/sse/composite_sse_4pix.c
--- liboil-0.3.9/liboil/sse/composite_sse_4pix.c 2005-12-21 02:27:54.000000000 +0100
+++ liboil-0.3.9~/liboil/sse/composite_sse_4pix.c 2006-06-22 20:36:42.000000000 +0200
@@ -32,20 +32,49 @@
#include <emmintrin.h>
#include <liboil/liboilcolorspace.h>
-union m128_int {
- __m128i m128;
- uint64_t ull[2];
-};
-
-static const struct _SSEData {
- union m128_int sse_16xff;
- union m128_int sse_8x0080;
-} c = {
- .sse_16xff.ull = {0xffffffffffffffffULL, 0xffffffffffffffffULL},
- .sse_8x0080.ull = {0x0080008000800080ULL, 0x0080008000800080ULL},
-};
+/* Work around non-aligned stack frames (which causes the intristics to crash
+ * by making sure the stack frame is always aligned
+ */
+#if defined(__i386__)
+#define OIL_SSE_WRAPPER(name,ret, ...) \
+ ret name(__VA_ARGS__) __attribute__((used)); \
+ ret name ## _wrap (__VA_ARGS__) { \
+ OIL_SSE_WRAPPER_CALL(name); \
+ }
+
+#define OIL_SSE_WRAPPER_CALL(name) \
+ asm volatile( \
+ "\n\t" \
+ "subl $0x10,%%esp\n\t" \
+ "andl $0xfffffff0,%%esp\n\t" \
+ \
+ "movdqu 8(%%ebp),%%xmm0\n\t" \
+ "movdqa %%xmm0,(%%esp)\n\t" \
+ \
+ "call " #name "\n\t" \
+ "movl %%ebp,%%esp\n\t" \
+ : : \
+ : "eax","ecx","edx","xmm0")
+
+#elif defined(__amd64__)
+
+/* Needed because we call *_wrap. Should get optimized away anyway */
+#define OIL_SSE_WRAPPER(name,ret, ...) \
+ ret name ## _wrap (__VA_ARGS__) { \
+ name(__VA_ARGS__); \
+ }
+
+#else
+#error Can't use sse on !i386 and !amd64
+#endif
+
-#define MC(x) (c.sse_##x.m128)
+static const __m128i c_sse_16xff =
+ {0xffffffffffffffffULL, 0xffffffffffffffffULL};
+static const __m128i c_sse_8x0080 =
+ {0x0080008000800080ULL, 0x0080008000800080ULL};
+
+#define MC(x) (c_sse_##x)
/* non-SSE2 compositing support */
#define COMPOSITE_OVER(d,s,m) ((d) + (s) - oil_muldiv_255((d),(m)))
@@ -193,7 +222,11 @@
COMPOSITE_IN(oil_argb_B(s), m));
}
}
-OIL_DEFINE_IMPL_FULL (composite_in_argb_sse, composite_in_argb,
+
+OIL_SSE_WRAPPER(composite_in_argb_sse, static void,
+ uint32_t *dest, const uint32_t *src, const uint8_t *mask, int n)
+
+OIL_DEFINE_IMPL_FULL (composite_in_argb_sse_wrap, composite_in_argb,
OIL_IMPL_FLAG_SSE2);
static void
@@ -230,7 +263,11 @@
COMPOSITE_IN(oil_argb_B(*src), m));
}
}
-OIL_DEFINE_IMPL_FULL (composite_in_argb_const_src_sse,
+
+OIL_SSE_WRAPPER(composite_in_argb_const_src_sse , static void,
+ uint32_t *dest, const uint32_t *src, const uint8_t *mask, int n)
+
+OIL_DEFINE_IMPL_FULL (composite_in_argb_const_src_sse_wrap,
composite_in_argb_const_src, OIL_IMPL_FLAG_SSE2);
static void
@@ -267,7 +304,10 @@
COMPOSITE_IN(oil_argb_B(s), mask[0]));
}
}
-OIL_DEFINE_IMPL_FULL (composite_in_argb_const_mask_sse,
+OIL_SSE_WRAPPER(composite_in_argb_const_mask_sse, static void,
+ uint32_t *dest, const uint32_t *src, const uint8_t *mask, int n)
+
+OIL_DEFINE_IMPL_FULL (composite_in_argb_const_mask_sse_wrap,
composite_in_argb_const_mask, OIL_IMPL_FLAG_SSE2);
static void
@@ -339,7 +379,11 @@
*dest++ = d;
}
}
-OIL_DEFINE_IMPL_FULL (composite_over_argb_const_src_sse,
+
+OIL_SSE_WRAPPER(composite_over_argb_const_src_sse, static void,
+ uint32_t *dest, const uint32_t *src, int n)
+
+OIL_DEFINE_IMPL_FULL (composite_over_argb_const_src_sse_wrap,
composite_over_argb_const_src, OIL_IMPL_FLAG_SSE2);
static void
@@ -447,9 +491,11 @@
*dest++ = d;
}
}
-OIL_DEFINE_IMPL_FULL (composite_in_over_argb_const_src_sse,
- composite_in_over_argb_const_src, OIL_IMPL_FLAG_SSE2);
+OIL_SSE_WRAPPER(composite_in_over_argb_const_src_sse , static void,
+ uint32_t *dest, const uint32_t *src, const uint8_t *mask, int n)
+OIL_DEFINE_IMPL_FULL (composite_in_over_argb_const_src_sse_wrap,
+ composite_in_over_argb_const_src, OIL_IMPL_FLAG_SSE2);
static void
composite_in_over_argb_const_mask_sse (uint32_t *dest, const uint32_t *src,
const uint8_t *mask, int n)
@@ -502,7 +548,11 @@
*dest++ = d;
}
}
-OIL_DEFINE_IMPL_FULL (composite_in_over_argb_const_mask_sse,
+
+OIL_SSE_WRAPPER(composite_in_over_argb_const_mask_sse, static void,
+ uint32_t *dest, const uint32_t *src, const uint8_t *mask, int n)
+
+OIL_DEFINE_IMPL_FULL (composite_in_over_argb_const_mask_sse_wrap,
composite_in_over_argb_const_mask, OIL_IMPL_FLAG_SSE2);
static void
diff -Nur liboil-0.3.9/liboil/sse/sad8x8_sse.c liboil-0.3.9~/liboil/sse/sad8x8_sse.c
--- liboil-0.3.9/liboil/sse/sad8x8_sse.c 2005-12-23 22:46:25.000000000 +0100
+++ liboil-0.3.9~/liboil/sse/sad8x8_sse.c 2006-06-22 20:36:42.000000000 +0200
@@ -31,6 +31,44 @@
#include <liboil/liboilfunction.h>
#include <emmintrin.h>
+/* Work around non-aligned stack frames (which cause the intrinsics to crash)
+ * by making sure the stack frame is always aligned
+ */
+#if defined(__i386__)
+#define OIL_SSE_WRAPPER(name,ret, ...) \
+ ret name(__VA_ARGS__) __attribute__((used)); \
+ ret name ## _wrap (__VA_ARGS__) { \
+ OIL_SSE_WRAPPER_CALL(name); \
+ }
+
+#define OIL_SSE_WRAPPER_CALL(name) \
+ asm volatile( \
+ "\n\t" \
+ "subl $0x18,%%esp\n\t" \
+ "andl $0xfffffff0,%%esp\n\t" \
+ \
+ "movdqu 8(%%ebp),%%xmm0\n\t" \
+ "movdqa %%xmm0,(%%esp)\n\t" \
+ "movl 0x18(%%ebp), %%ecx\n\t" \
+ "movl %%ecx, 0x10(%%esp)\n\t" \
+ \
+ "call " #name "\n\t" \
+ "movl %%ebp,%%esp\n\t" \
+ : : \
+ : "eax","ecx","edx","xmm0")
+
+#elif defined(__amd64__)
+
+/* Needed because we call *_wrap. Should get optimized away anyway */
+#define OIL_SSE_WRAPPER(name,ret, ...) \
+ ret name ## _wrap (__VA_ARGS__) { \
+ name(__VA_ARGS__); \
+ }
+
+#else
+#error Can't use sse on !i386 and !amd64
+#endif
+
union m128_int {
__m128i m128;
uint32_t i[4];
@@ -42,7 +80,7 @@
int sstr2)
{
int i;
- __m128i sum = _mm_setzero_si128();
+ __m128i sum __attribute__ ((aligned (16))) = _mm_setzero_si128();
union m128_int sumi;
for (i = 0; i < 4; i++) {
@@ -60,4 +98,7 @@
sumi.m128 = sum;
*dest = sumi.i[0] + sumi.i[2];
}
-OIL_DEFINE_IMPL_FULL (sad8x8_u8_sse, sad8x8_u8, OIL_IMPL_FLAG_SSE2);
+
+OIL_SSE_WRAPPER(sad8x8_u8_sse, static void,
+ uint32_t *dest, uint8_t *src1, int sstr1, uint8_t *src2, int sstr2)
+OIL_DEFINE_IMPL_FULL (sad8x8_u8_sse_wrap, sad8x8_u8, OIL_IMPL_FLAG_SSE2);
diff -Nur liboil-0.3.9/liboil-368991-sse-segv-fix.4.diff liboil-0.3.9~/liboil-368991-sse-segv-fix.4.diff
--- liboil-0.3.9/liboil-368991-sse-segv-fix.4.diff 1970-01-01 01:00:00.000000000 +0100
+++ liboil-0.3.9~/liboil-368991-sse-segv-fix.4.diff 2006-06-22 20:34:44.000000000 +0200
@@ -0,0 +1,358 @@
+--- liboil-0.3.9.orig/liboil/sse/composite_sse_2pix.c 2005-12-21 02:27:54.000000000 +0100
++++ liboil-0.3.9/liboil/sse/composite_sse_2pix.c 2006-06-20 19:10:33.000000000 +0200
+@@ -32,6 +32,42 @@
+ #include <emmintrin.h>
+ #include <liboil/liboilcolorspace.h>
+
++/* Work around non-aligned stack frames (which cause the intrinsics to crash)
++ * by making sure the stack frame is always aligned
++ */
++#if defined(__i386__)
++#define OIL_SSE_WRAPPER(name,ret, ...) \
++ ret name(__VA_ARGS__) __attribute__((used)); \
++ ret name ## _wrap (__VA_ARGS__) { \
++ OIL_SSE_WRAPPER_CALL(name); \
++ }
++
++#define OIL_SSE_WRAPPER_CALL(name) \
++ asm volatile( \
++ "\n\t" \
++ "subl $0x10,%%esp\n\t" \
++ "andl $0xfffffff0,%%esp\n\t" \
++ \
++ "movdqu 8(%%ebp),%%xmm0\n\t" \
++ "movdqa %%xmm0,(%%esp)\n\t" \
++ \
++ "call " #name "\n\t" \
++ "movl %%ebp,%%esp\n\t" \
++ : : \
++ : "eax","ecx","edx","xmm0")
++
++#elif defined(__amd64__)
++
++/* Needed because we call *_wrap. Should get optimized away anyway */
++#define OIL_SSE_WRAPPER(name,ret, ...) \
++ ret name ## _wrap (__VA_ARGS__) { \
++ name(__VA_ARGS__); \
++ }
++
++#else
++#error Can't use sse on !i386 and !amd64
++#endif
++
+ /* non-SSE2 compositing support */
+ #define COMPOSITE_OVER(d,s,m) ((d) + (s) - oil_muldiv_255((d),(m)))
+ #define COMPOSITE_ADD(d,s) oil_clamp_255((d) + (s))
+@@ -41,20 +77,12 @@
+ * the channel value in the low byte. This means 2 pixels per pass.
+ */
+
+-union m128_int {
+- __m128i m128;
+- uint64_t ull[2];
+-};
+-
+-static const struct _SSEData {
+- union m128_int sse_8x00ff;
+- union m128_int sse_8x0080;
+-} c = {
+- .sse_8x00ff.ull = {0x00ff00ff00ff00ffULL, 0x00ff00ff00ff00ffULL},
+- .sse_8x0080.ull = {0x0080008000800080ULL, 0x0080008000800080ULL},
+-};
++static const __m128i c_sse_8x00ff =
++ {0x00ff00ff00ff00ffULL, 0x00ff00ff00ff00ffULL};
++static const __m128i c_sse_8x0080 =
++ {0x0080008000800080ULL, 0x0080008000800080ULL};
+
+-#define MC(x) (c.sse_##x.m128)
++#define MC(x) (c_sse_##x)
+
+ /* Shuffles the given value such that the alpha for each pixel appears in each
+ * channel of the pixel.
+@@ -188,7 +216,11 @@
+ COMPOSITE_IN(oil_argb_B(*src), m));
+ }
+ }
+-OIL_DEFINE_IMPL_FULL (composite_in_argb_const_src_sse_2pix,
++
++OIL_SSE_WRAPPER(composite_in_argb_const_src_sse_2pix, static void,
++ uint32_t *dest, const uint32_t *src, const uint8_t *mask, int n)
++
++OIL_DEFINE_IMPL_FULL (composite_in_argb_const_src_sse_2pix_wrap,
+ composite_in_argb_const_src, OIL_IMPL_FLAG_SSE2);
+
+ static void
+@@ -216,7 +248,10 @@
+ COMPOSITE_IN(oil_argb_B(s), mask[0]));
+ }
+ }
+-OIL_DEFINE_IMPL_FULL (composite_in_argb_const_mask_sse_2pix,
++OIL_SSE_WRAPPER(composite_in_argb_const_mask_sse_2pix, static void,
++ uint32_t *dest, const uint32_t *src, const uint8_t *mask, int n)
++
++OIL_DEFINE_IMPL_FULL (composite_in_argb_const_mask_sse_2pix_wrap,
+ composite_in_argb_const_mask, OIL_IMPL_FLAG_SSE2);
+
+ static void
+@@ -272,7 +307,11 @@
+ *dest++ = d;
+ }
+ }
+-OIL_DEFINE_IMPL_FULL (composite_over_argb_const_src_sse_2pix,
++
++OIL_SSE_WRAPPER(composite_over_argb_const_src_sse_2pix, static void,
++ uint32_t *dest, const uint32_t *src, int n)
++
++OIL_DEFINE_IMPL_FULL (composite_over_argb_const_src_sse_2pix_wrap,
+ composite_over_argb_const_src, OIL_IMPL_FLAG_SSE2);
+
+ static void
+@@ -309,8 +348,12 @@
+ *dest++ = d;
+ }
+ }
+-OIL_DEFINE_IMPL_FULL (composite_in_over_argb_sse_2pix, composite_in_over_argb,
+- OIL_IMPL_FLAG_SSE2);
++
++OIL_SSE_WRAPPER(composite_in_over_argb_sse_2pix , static void,
++ uint32_t *dest, const uint32_t *src, const uint8_t *mask, int n)
++
++OIL_DEFINE_IMPL_FULL (composite_in_over_argb_sse_2pix_wrap,
++ composite_in_over_argb, OIL_IMPL_FLAG_SSE2);
+
+ static void
+ composite_in_over_argb_const_src_sse_2pix (uint32_t *dest, const uint32_t *src,
+@@ -348,7 +391,11 @@
+ *dest++ = d;
+ }
+ }
+-OIL_DEFINE_IMPL_FULL (composite_in_over_argb_const_src_sse_2pix,
++
++OIL_SSE_WRAPPER(composite_in_over_argb_const_src_sse_2pix , static void,
++ uint32_t *dest, const uint32_t *src, const uint8_t *mask, int n)
++
++OIL_DEFINE_IMPL_FULL (composite_in_over_argb_const_src_sse_2pix_wrap,
+ composite_in_over_argb_const_src, OIL_IMPL_FLAG_SSE2);
+
+ static void
+@@ -387,7 +434,11 @@
+ *dest++ = d;
+ }
+ }
+-OIL_DEFINE_IMPL_FULL (composite_in_over_argb_const_mask_sse_2pix,
++
++OIL_SSE_WRAPPER(composite_in_over_argb_const_mask_sse_2pix, static void,
++ uint32_t *dest, const uint32_t *src, const uint8_t *mask, int n)
++
++OIL_DEFINE_IMPL_FULL (composite_in_over_argb_const_mask_sse_2pix_wrap,
+ composite_in_over_argb_const_mask, OIL_IMPL_FLAG_SSE2);
+
+ static void
+--- liboil-0.3.9.orig/liboil/sse/composite_sse_4pix.c 2005-12-21 02:27:54.000000000 +0100
++++ liboil-0.3.9/liboil/sse/composite_sse_4pix.c 2006-06-20 19:10:34.000000000 +0200
+@@ -32,20 +32,49 @@
+ #include <emmintrin.h>
+ #include <liboil/liboilcolorspace.h>
+
+-union m128_int {
+- __m128i m128;
+- uint64_t ull[2];
+-};
+-
+-static const struct _SSEData {
+- union m128_int sse_16xff;
+- union m128_int sse_8x0080;
+-} c = {
+- .sse_16xff.ull = {0xffffffffffffffffULL, 0xffffffffffffffffULL},
+- .sse_8x0080.ull = {0x0080008000800080ULL, 0x0080008000800080ULL},
+-};
++/* Work around non-aligned stack frames (which cause the intrinsics to crash)
++ * by making sure the stack frame is always aligned
++ */
++#if defined(__i386__)
++#define OIL_SSE_WRAPPER(name,ret, ...) \
++ ret name(__VA_ARGS__) __attribute__((used)); \
++ ret name ## _wrap (__VA_ARGS__) { \
++ OIL_SSE_WRAPPER_CALL(name); \
++ }
++
++#define OIL_SSE_WRAPPER_CALL(name) \
++ asm volatile( \
++ "\n\t" \
++ "subl $0x10,%%esp\n\t" \
++ "andl $0xfffffff0,%%esp\n\t" \
++ \
++ "movdqu 8(%%ebp),%%xmm0\n\t" \
++ "movdqa %%xmm0,(%%esp)\n\t" \
++ \
++ "call " #name "\n\t" \
++ "movl %%ebp,%%esp\n\t" \
++ : : \
++ : "eax","ecx","edx","xmm0")
++
++#elif defined(__amd64__)
++
++/* Needed because we call *_wrap. Should get optimized away anyway */
++#define OIL_SSE_WRAPPER(name,ret, ...) \
++ ret name ## _wrap (__VA_ARGS__) { \
++ name(__VA_ARGS__); \
++ }
++
++#else
++#error Can't use sse on !i386 and !amd64
++#endif
++
+
+-#define MC(x) (c.sse_##x.m128)
++static const __m128i c_sse_16xff =
++ {0xffffffffffffffffULL, 0xffffffffffffffffULL};
++static const __m128i c_sse_8x0080 =
++ {0x0080008000800080ULL, 0x0080008000800080ULL};
++
++#define MC(x) (c_sse_##x)
+
+ /* non-SSE2 compositing support */
+ #define COMPOSITE_OVER(d,s,m) ((d) + (s) - oil_muldiv_255((d),(m)))
+@@ -193,7 +222,11 @@
+ COMPOSITE_IN(oil_argb_B(s), m));
+ }
+ }
+-OIL_DEFINE_IMPL_FULL (composite_in_argb_sse, composite_in_argb,
++
++OIL_SSE_WRAPPER(composite_in_argb_sse, static void,
++ uint32_t *dest, const uint32_t *src, const uint8_t *mask, int n)
++
++OIL_DEFINE_IMPL_FULL (composite_in_argb_sse_wrap, composite_in_argb,
+ OIL_IMPL_FLAG_SSE2);
+
+ static void
+@@ -230,7 +263,11 @@
+ COMPOSITE_IN(oil_argb_B(*src), m));
+ }
+ }
+-OIL_DEFINE_IMPL_FULL (composite_in_argb_const_src_sse,
++
++OIL_SSE_WRAPPER(composite_in_argb_const_src_sse , static void,
++ uint32_t *dest, const uint32_t *src, const uint8_t *mask, int n)
++
++OIL_DEFINE_IMPL_FULL (composite_in_argb_const_src_sse_wrap,
+ composite_in_argb_const_src, OIL_IMPL_FLAG_SSE2);
+
+ static void
+@@ -267,7 +304,10 @@
+ COMPOSITE_IN(oil_argb_B(s), mask[0]));
+ }
+ }
+-OIL_DEFINE_IMPL_FULL (composite_in_argb_const_mask_sse,
++OIL_SSE_WRAPPER(composite_in_argb_const_mask_sse, static void,
++ uint32_t *dest, const uint32_t *src, const uint8_t *mask, int n)
++
++OIL_DEFINE_IMPL_FULL (composite_in_argb_const_mask_sse_wrap,
+ composite_in_argb_const_mask, OIL_IMPL_FLAG_SSE2);
+
+ static void
+@@ -339,7 +379,11 @@
+ *dest++ = d;
+ }
+ }
+-OIL_DEFINE_IMPL_FULL (composite_over_argb_const_src_sse,
++
++OIL_SSE_WRAPPER(composite_over_argb_const_src_sse, static void,
++ uint32_t *dest, const uint32_t *src, int n)
++
++OIL_DEFINE_IMPL_FULL (composite_over_argb_const_src_sse_wrap,
+ composite_over_argb_const_src, OIL_IMPL_FLAG_SSE2);
+
+ static void
+@@ -447,9 +491,11 @@
+ *dest++ = d;
+ }
+ }
+-OIL_DEFINE_IMPL_FULL (composite_in_over_argb_const_src_sse,
+- composite_in_over_argb_const_src, OIL_IMPL_FLAG_SSE2);
++OIL_SSE_WRAPPER(composite_in_over_argb_const_src_sse , static void,
++ uint32_t *dest, const uint32_t *src, const uint8_t *mask, int n)
+
++OIL_DEFINE_IMPL_FULL (composite_in_over_argb_const_src_sse_wrap,
++ composite_in_over_argb_const_src, OIL_IMPL_FLAG_SSE2);
+ static void
+ composite_in_over_argb_const_mask_sse (uint32_t *dest, const uint32_t *src,
+ const uint8_t *mask, int n)
+@@ -502,7 +548,11 @@
+ *dest++ = d;
+ }
+ }
+-OIL_DEFINE_IMPL_FULL (composite_in_over_argb_const_mask_sse,
++
++OIL_SSE_WRAPPER(composite_in_over_argb_const_mask_sse, static void,
++ uint32_t *dest, const uint32_t *src, const uint8_t *mask, int n)
++
++OIL_DEFINE_IMPL_FULL (composite_in_over_argb_const_mask_sse_wrap,
+ composite_in_over_argb_const_mask, OIL_IMPL_FLAG_SSE2);
+
+ static void
+--- liboil-0.3.9.orig/liboil/sse/sad8x8_sse.c 2005-12-23 22:46:25.000000000 +0100
++++ liboil-0.3.9/liboil/sse/sad8x8_sse.c 2006-06-20 19:10:32.000000000 +0200
+@@ -31,6 +31,44 @@
+ #include <liboil/liboilfunction.h>
+ #include <emmintrin.h>
+
++/* Work around non-aligned stack frames (which cause the intrinsics to crash)
++ * by making sure the stack frame is always aligned
++ */
++#if defined(__i386__)
++#define OIL_SSE_WRAPPER(name,ret, ...) \
++ ret name(__VA_ARGS__) __attribute__((used)); \
++ ret name ## _wrap (__VA_ARGS__) { \
++ OIL_SSE_WRAPPER_CALL(name); \
++ }
++
++#define OIL_SSE_WRAPPER_CALL(name) \
++ asm volatile( \
++ "\n\t" \
++ "subl $0x18,%%esp\n\t" \
++ "andl $0xfffffff0,%%esp\n\t" \
++ \
++ "movdqu 8(%%ebp),%%xmm0\n\t" \
++ "movdqa %%xmm0,(%%esp)\n\t" \
++ "movl 0x18(%%ebp), %%ecx\n\t" \
++ "movl %%ecx, 0x10(%%esp)\n\t" \
++ \
++ "call " #name "\n\t" \
++ "movl %%ebp,%%esp\n\t" \
++ : : \
++ : "eax","ecx","edx","xmm0")
++
++#elif defined(__amd64__)
++
++/* Needed because we call *_wrap. Should get optimized away anyway */
++#define OIL_SSE_WRAPPER(name,ret, ...) \
++ ret name ## _wrap (__VA_ARGS__) { \
++ name(__VA_ARGS__); \
++ }
++
++#else
++#error Can't use sse on !i386 and !amd64
++#endif
++
+ union m128_int {
+ __m128i m128;
+ uint32_t i[4];
+@@ -42,7 +80,7 @@
+ int sstr2)
+ {
+ int i;
+- __m128i sum = _mm_setzero_si128();
++ __m128i sum __attribute__ ((aligned (16))) = _mm_setzero_si128();
+ union m128_int sumi;
+
+ for (i = 0; i < 4; i++) {
+@@ -60,4 +98,7 @@
+ sumi.m128 = sum;
+ *dest = sumi.i[0] + sumi.i[2];
+ }
+-OIL_DEFINE_IMPL_FULL (sad8x8_u8_sse, sad8x8_u8, OIL_IMPL_FLAG_SSE2);
++
++OIL_SSE_WRAPPER(sad8x8_u8_sse, static void,
++ uint32_t *dest, uint8_t *src1, int sstr1, uint8_t *src2, int sstr2)
++OIL_DEFINE_IMPL_FULL (sad8x8_u8_sse_wrap, sad8x8_u8, OIL_IMPL_FLAG_SSE2);
--
http://home.arcor.de/andreas-barth/
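Note on the i386 wrapper in the patch above: the subl/andl pair reserves room below the stack pointer and rounds the result down to a 16-byte boundary, so that movdqa and the SSE2 code in the callee work on an aligned argument area. The sketch below only models that address arithmetic with plain integers; the function names are hypothetical and not part of liboil. It also suggests why the sad8x8 variant, which passes five 4-byte arguments (20 bytes), reserves 0x18 rather than 0x10.

```c
#include <assert.h>
#include <stdint.h>

/* Models "subl $reserve,%esp; andl $0xfffffff0,%esp": reserve space below
 * the stack pointer, then round down to a 16-byte boundary.
 * Illustrative only: esp is treated as a plain 32-bit integer. */
static uint32_t
aligned_arg_area (uint32_t esp, uint32_t reserve)
{
  return (esp - reserve) & 0xfffffff0u;
}

/* Returns 1 if [area, area + nbytes) is 16-byte aligned and lies entirely
 * below the original stack pointer, 0 otherwise. */
static int
arg_area_ok (uint32_t esp, uint32_t area, uint32_t nbytes)
{
  return (area % 16 == 0) && (area + nbytes <= esp);
}

/* Checks the two reservations used in the patch for every 4-byte-aligned
 * stack pointer within one 16-byte window. */
static void
check_wrapper_reservations (void)
{
  uint32_t esp;

  for (esp = 0x1000; esp < 0x1010; esp += 4) {
    /* four 4-byte arguments: 0x10 bytes reserved */
    assert (arg_area_ok (esp, aligned_arg_area (esp, 0x10), 16));
    /* five 4-byte arguments: 0x18 bytes reserved */
    assert (arg_area_ok (esp, aligned_arg_area (esp, 0x18), 20));
  }
}
```

Calling check_wrapper_reservations() from a test program exercises both cases. With only 0x10 reserved, a 20-byte argument block would not always fit: for esp = 0x1000 the aligned area starts at 0xff0 and would end at 0x1004, above the original stack pointer.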
Tags added: fixed
Request was from Andreas Barth <aba@not.so.argh.org>
to control@bugs.debian.org.
Information forwarded to debian-bugs-dist@lists.debian.org, David Schleef <ds@schleef.org>:
Bug#368991; Package liboil.
Acknowledgement sent to Goswin von Brederlow <brederlo@informatik.uni-tuebingen.de>:
Extra info received and forwarded to list. Copy sent to David Schleef <ds@schleef.org>.
Message #106 received at 368991@bugs.debian.org:
Hi,
the macro definitions for amd64 cannot work this way, because they include
the parameter types in the function call (see the buildd log).
The simplest fix is to call the original functions on amd64 and the
wrapper on i386. To do this, the OIL_DEFINE_IMPL_FULL macro has to be
replaced by an extended version that incorporates the wrapper code from
the old patch, like this:
#if defined(__i386__)
#define OIL_DEFINE_IMPL_FULL_WRAPPER(sse_name, name, flags, ret, ...) \
ret sse_name(__VA_ARGS__) __attribute__((used)); \
ret sse_name ## _wrap (__VA_ARGS__) { \
OIL_SSE_WRAPPER_CALL(sse_name); \
} \
OIL_DEFINE_IMPL_FULL(sse_name ## _wrap, name, flags);
#define OIL_SSE_WRAPPER_CALL(name) \
asm volatile( \
"\n\t" \
"subl $0x10,%%esp\n\t" \
"andl $0xfffffff0,%%esp\n\t" \
\
"movdqu 8(%%ebp),%%xmm0\n\t" \
"movdqa %%xmm0,(%%esp)\n\t" \
\
"call " #name "\n\t" \
"movl %%ebp,%%esp\n\t" \
: : \
: "eax","ecx","edx","xmm0")
#elif defined(__amd64__)
/* No wrapper is needed on amd64: register the original function directly */
#define OIL_DEFINE_IMPL_FULL_WRAPPER(sse_name, name, flags, ret, ...) \
OIL_DEFINE_IMPL_FULL(sse_name, name, flags);
#else
#error Can't use sse on !i386 and !amd64
#endif
The usage then changes, as the two macros get combined into a single
call. This means
OIL_SSE_WRAPPER(composite_over_argb_const_src_sse, static void,
uint32_t *dest, const uint32_t *src, int n)
OIL_DEFINE_IMPL_FULL (composite_over_argb_const_src_sse_wrap,
composite_over_argb_const_src, OIL_IMPL_FLAG_SSE2);
becomes
OIL_DEFINE_IMPL_FULL_WRAPPER (composite_over_argb_const_src_sse,
composite_over_argb_const_src, OIL_IMPL_FLAG_SSE2,
static void,
uint32_t *dest, const uint32_t *src, int n);
Is that enough for you to fix this?
Kind regards,
Goswin
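A minimal sketch of the point made in this message, using hypothetical DEMO_* macros rather than liboil's real OIL_DEFINE_IMPL_FULL (which does more than store a pointer). The variadic arguments are full parameter declarations, so they can be pasted into a prototype but not into a call: name(__VA_ARGS__) would expand to something like f(uint32_t *dest, ...), which is not a valid call expression. The combined macro therefore uses them only for the prototype and registers the function by name.

```c
#include <assert.h>
#include <stdint.h>

typedef void (*demo_copy_fn) (uint32_t *dest, const uint32_t *src, int n);

/* Hypothetical stand-in for OIL_DEFINE_IMPL_FULL: records which function
 * implements a class by storing a function pointer. */
#define DEMO_DEFINE_IMPL_FULL(func, klass) \
  static const demo_copy_fn demo_impl_##klass = func

/* Combined form for the case without a wrapper: __VA_ARGS__ appears only
 * in the prototype, never in a call expression. */
#define DEMO_DEFINE_IMPL_FULL_WRAPPER(sse_name, klass, ret, ...) \
  ret sse_name (__VA_ARGS__); \
  DEMO_DEFINE_IMPL_FULL (sse_name, klass)

/* Illustrative implementation; stands in for an SSE2 routine. */
static void
demo_copy_fast (uint32_t *dest, const uint32_t *src, int n)
{
  int i;

  for (i = 0; i < n; i++)
    dest[i] = src[i];
}

DEMO_DEFINE_IMPL_FULL_WRAPPER (demo_copy_fast, demo_copy, static void,
    uint32_t *dest, const uint32_t *src, int n);

/* Calls the registered implementation through the stored pointer. */
static void
demo_self_check (void)
{
  uint32_t src[3] = { 1, 2, 3 };
  uint32_t dst[3] = { 0, 0, 0 };

  demo_impl_demo_copy (dst, src, 3);
  assert (dst[0] == 1 && dst[1] == 2 && dst[2] == 3);
}
```

On i386 the same macro would additionally define the *_wrap function and register that instead, as in the definition quoted above.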
Information forwarded to debian-bugs-dist@lists.debian.org, Andreas Barth <aba@not.so.argh.org>, David Schleef <ds@schleef.org>:
Bug#368991; Package liboil.
Acknowledgement sent to Goswin Brederlow <brederlo@informatik.uni-tuebingen.de>:
Extra info received and forwarded to list. Copy sent to Andreas Barth <aba@not.so.argh.org>, David Schleef <ds@schleef.org>.
Message #111 received at 368991@bugs.debian.org:
Package: liboil
Followup-For: Bug #368991
Attached is a rework of the NMU patch that now compiles and runs on
both i386 and amd64.
Enjoy,
Goswin
-- System Information:
Debian Release: 3.1
APT prefers unstable
APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)
Shell: /bin/sh linked to /bin/bash
Kernel: Linux 2.6.16-rc4-xen
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)
[liboil_0.3.9-1__0.3.9-1.2.diff (text/plain, attachment)]
Information forwarded to debian-bugs-dist@lists.debian.org, David Schleef <ds@schleef.org>:
Bug#368991; Package liboil.
Acknowledgement sent to Christian Aichinger <Greek0@gmx.net>:
Extra info received and forwarded to list. Copy sent to David Schleef <ds@schleef.org>.
Message #116 received at 368991@bugs.debian.org:
Arrgh, what a dumb mistake. Sorry for screwing up.
Goswin has a patch in the works that fixes the problem on amd64 in a
rather nice manner.
Thanks for your help Goswin.
Sorry again,
Christian Aichinger
Bug marked as found in version 0.3.9-1.1.
Request was from Goswin von Brederlow <brederlo@informatik.uni-tuebingen.de>
to control@bugs.debian.org.
Bug marked as fixed in version 0.3.9-1.1, send any further explanations to Marco Cabizza <marco87@gmail.com>
Request was from Steve Langasek <vorlon@debian.org>
to control@bugs.debian.org.
Message sent on to Nicholas Crespi <roundtrip@gmail.com>:
Bug#368991.
Message #123 received at 368991-submitter@bugs.debian.org:
# Hi folks,
#
# You are receiving this mail because you are the submitter of one or more
# bugs that have been fixed in a non-maintainer upload of a Debian package,
# but not yet acknowledged by the maintainers. With version tracking in the
# Debian BTS, it is important to know which version of a package fixes each
# bug so that they can be tracked for release status in the BTS, so I'm
# closing these bugs with the relevant version number information now.
#
# It is possible that this will be the only message you receive about this
# bug being fixed, and due to the volume of affected bugs we are
# unfortunately not sending individualized explanations for each bug. If
# you have questions about the fix for your particular bug or about this
# email, please contact me directly or follow up to the bug report in the
# BTS.
close 370031 1.12-0.1
close 370147 0.3.4.cvs.20050813-2.1
close 370178 3.1.0-5.2
close 370193 1.2.2-4.3
close 370232 1.2-2.1
close 370233 4.2.22-2.1
close 370244 0.7.6-1.1
close 370438 0.3.6-2.1
close 370447 0.1.5-1.1
close 370451 0.3.9-1.1
close 370504 1.99.0-2.1
close 370519 1.0.3-1.2
close 370757 2.2-5.2
close 370784 2.4.0-4.1
close 371142 1.1.3-5.2
close 372193 1:0.7.44.20051021-2.1
close 372275 0.7.3-3.1
close 372488 0.8.0-1
close 372558 0.5.10-1.1
close 372619 1.3-0.1
close 372840 0.9.10-3.2
close 373464 1.5.3-1.1
close 373509 0.99cvs20060405-1.1
close 373559 0.0.43-0.1
close 373693 2.4-11.1
close 373953 1.9.0+20060423-3.1
close 374000 3.1.0-5.3
close 374045 1.3bbn-9.1
close 374264 0.20-1-1.3
close 374396 5.8.8-6.1
close 374487 3.5.0.20030301-1.1
close 374490 1.0.1a-2.1
close 374595 1:0.90.0.1-1
close 374730 0.6-1.1
close 374846 3.2-1.1
close 374909 3.0.9-5.1
close 374935 1.15-6.1
close 374955 1.0.3-1.2
close 375105 9.51-2.1
close 375561 1.5.1-2.1
close 375572 1.1.1-1.1
close 375612 0.3.0+beta4-1.2
close 376197 0.9.0-0.1
close 376402 0.9d-2.2
close 376421 3.0-9.2
close 376422 1.3-4.2
close 376471 1.4.52-1.1
close 376670 1.1-3.2
close 376673 15-0.1
close 376715 0.86.2-6.1
close 376875 1.3-1.1
close 376946 1:2.2-2.1
close 377080 0.9.0-1.1
close 377089 0.18-0.1
close 377248 382-iso258-1.1
close 377285 2.7.5-2sarge2
close 377445 4.1-18.3
close 377652 3.0-16.1
close 377694 2.8-2.2
close 377813 0.5.0-1.3
close 377895 251-5.1
close 377978 20060704a-2
close 377991 1:1.18-2.3
close 378026 1.81-3.1
close 378049 0.18-2.2
close 378066 0.11.4-2
close 378091 0.4.2-3.0etch1
close 378198 6.4.2-1.1
close 378253 2.5.03.2382-2
close 378296 0.96.9-12.1
close 378393 1.4.4.cvs20060709-2.1
close 378397 1.4.4.cvs20060709-2.2
close 378412 2.34-4.1
close 378447 3.6.13-3.5
close 378498 1.6-8.1
close 378586 0.0.43-0.1
close 379214 4.1.2-1.1
close 379242 0.6.6-6.2
close 379261 1.0.57-2.2
close 379275 0.7.3-1.1
close 379486 1.19-7.2
close 379537 1.02-1.1
close 379566 0.52.2-5.1
close 379584 2.01.10-30.1
close 379744 0.1-1.2
close 379813 1.1.4-3.1
close 379895 1.0.57-2.2
thanks
--
Steve Langasek Give me a lever long enough and a Free OS
Debian Developer to set it on, and I can move the world.
vorlon@debian.org http://www.debian.org/
Bug marked as fixed in version 0.3.9-1.2, send any further explanations to Nicholas Crespi <roundtrip@gmail.com>
Request was from Steve Langasek <vorlon@debian.org>
to control@bugs.debian.org.
Bug archived.
Request was from Debbugs Internal Request <owner@bugs.debian.org>
to internal_control@bugs.debian.org.
(Mon, 25 Jun 2007 04:02:51 GMT).
Debian bug tracking system administrator <owner@bugs.debian.org>.
Last modified:
Sun Jan 14 01:03:45 2024;
Machine Name:
buxtehude