Debian Bug report logs - #863672
performance critical libyuv built with Os

Package: firefox; Maintainer for firefox is Maintainers of Mozilla-related packages <pkg-mozilla-maintainers@lists.alioth.debian.org>; Source for firefox is src:firefox.

Reported by: Julian Taylor <jtaylor.debian@googlemail.com>

Date: Mon, 29 May 2017 21:18:01 UTC

Severity: normal

Tags: fixed-upstream, patch

Found in version firefox/53.0.is.52.0.2-1


Report forwarded to debian-bugs-dist@lists.debian.org, Maintainers of Mozilla-related packages <pkg-mozilla-maintainers@lists.alioth.debian.org>:
Bug#863672; Package firefox. (Mon, 29 May 2017 21:18:03 GMT).


Acknowledgement sent to Julian Taylor <jtaylor.debian@googlemail.com>:
New Bug report received and forwarded. Copy sent to Maintainers of Mozilla-related packages <pkg-mozilla-maintainers@lists.alioth.debian.org>. (Mon, 29 May 2017 21:18:03 GMT).


Message #5 received at submit@bugs.debian.org:

From: Julian Taylor <jtaylor.debian@googlemail.com>
To: Debian Bug Tracking System <submit@bugs.debian.org>
Subject: performance critical libyuv built with Os
Date: Mon, 29 May 2017 23:14:38 +0200
Package: firefox
Version:  53.0.is.52.0.2-1
Severity: normal


libyuv, a performance-critical library for Firefox, is built with
-Os, which is terrible for its performance.
This particularly affects row_common.cc, which contains the generic parts
of the color transformation code:

See:
https://buildd.debian.org/status/fetch.php?pkg=firefox&arch=amd64&ver=53.0.is.52.0.2-1&stamp=1492644908&raw=0

/usr/bin/g++ -std=gnu++11 -o row_common.o -c  ...   -fPIC
-DMOZILLA_CLIENT -include
/<<PKGBUILDDIR>>/build-browser/mozilla-config.h -MD -MP -MF
.deps/row_common.o.pp -Wdate-time -D_FORTIFY_SOURCE=2 -Wall
-Wc++11-compat -Wempty-body -Wignored-qualifiers -Woverloaded-virtual
-Wpointer-arith -Wsign-compare -Wtype-limits -Wunreachable-code
-Wwrite-strings -Wno-invalid-offsetof -Wc++14-compat
-Wno-error=maybe-uninitialized -Wno-error=deprecated-declarations
-Wno-error=array-bounds -fno-lifetime-dse -fstack-protector-strong
-Wformat -Werror=format-security -fno-schedule-insns2 -fno-lifetime-dse
-fno-delete-null-pointer-checks -fno-exceptions -fno-strict-aliasing
-fno-rtti -ffunction-sections -fdata-sections -fno-exceptions
-fno-math-errno -pthread -pipe  -g -freorder-blocks -Os
-fomit-frame-pointer
/<<PKGBUILDDIR>>/media/libyuv/source/row_common.cc


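The problematic part is the YuvPixel function, which is called in loops
and in turn calls tiny clamp functions.
-Os strongly limits inlining, and here the helpers are evidently not
inlined, so every pixel pays for several function calls. This causes
massive overhead.
This is the top of the CPU profile on sites that display videos, for example: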
  17.25%  libxul.so                   [.] YuvPixel
   6.58%  libxul.so                   [.] Clamp
   6.46%  libxul.so                   [.] clamp255
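
To illustrate the call pattern, here is a minimal sketch. It is not the
actual libyuv source: the function names ending in Sketch, the 3-byte BGR
output layout and the BT.601 integer coefficients are assumptions made for
this example. Only the shape matters: one small per-pixel function that
calls even smaller clamp helpers inside a row loop.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the tiny helpers seen in the profile.
static inline int32_t clamp0(int32_t v) { return v < 0 ? 0 : v; }
static inline int32_t clamp255(int32_t v) { return v > 255 ? 255 : v; }
static inline uint8_t Clamp(int32_t v) {
  return static_cast<uint8_t>(clamp255(clamp0(v)));
}

// Converts one pixel with the widely used BT.601 integer approximation
// (an assumption for this sketch, not necessarily what libyuv uses).
static inline void YuvPixelSketch(uint8_t y, uint8_t u, uint8_t v,
                                  uint8_t* b, uint8_t* g, uint8_t* r) {
  const int32_t c = (static_cast<int32_t>(y) - 16) * 298;
  const int32_t d = static_cast<int32_t>(u) - 128;
  const int32_t e = static_cast<int32_t>(v) - 128;
  // Division instead of >> keeps negative intermediates well defined;
  // Clamp() maps them to 0 either way.
  *b = Clamp((c + 516 * d + 128) / 256);
  *g = Clamp((c - 100 * d - 208 * e + 128) / 256);
  *r = Clamp((c + 409 * e + 128) / 256);
}

// One call to YuvPixelSketch (and three to Clamp) per pixel: if none of
// them is inlined, call overhead dominates the few arithmetic operations.
void I444ToBgrRowSketch(const uint8_t* src_y, const uint8_t* src_u,
                        const uint8_t* src_v, uint8_t* dst_bgr, int width) {
  assert(width >= 0);
  for (int x = 0; x < width; ++x) {
    YuvPixelSketch(src_y[x], src_u[x], src_v[x],
                   &dst_bgr[3 * x], &dst_bgr[3 * x + 1], &dst_bgr[3 * x + 2]);
  }
}
```

When the helpers are inlined, the loop body reduces to a handful of
multiplies, adds and compares. When they are not, each pixel costs four
function calls, which matches the picture in the profile above.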

The problem is not as bad as it looks, because this generic code is only
executed on machines that do not have SSSE3, AVX2 or NEON (see
convert_argb.cc).
But there are still plenty of useful CPUs that lack these
instruction sets and are crippled by the compiler flags used.
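
The fallback behaviour can be sketched roughly as follows. This is an
illustration of runtime dispatch in general, not the code in
convert_argb.cc: CpuFeatures, RowImpl, SelectRowImpl and DetectCpuFeatures
are hypothetical names, and the detection relies on the GCC/Clang builtin
__builtin_cpu_supports on x86 and on a compile-time check for NEON.

```cpp
#include <cstdint>

// Hypothetical feature set; a real library tracks many more flags.
struct CpuFeatures {
  bool ssse3;
  bool avx2;
  bool neon;
};

enum class RowImpl { kGeneric, kSSSE3, kAVX2, kNEON };

// Picks the fastest available row implementation. The generic C++ code
// is used only when no SIMD variant applies.
RowImpl SelectRowImpl(const CpuFeatures& f) {
  if (f.avx2) return RowImpl::kAVX2;
  if (f.ssse3) return RowImpl::kSSSE3;
  if (f.neon) return RowImpl::kNEON;
  return RowImpl::kGeneric;
}

// Queries the CPU the program is running on (GCC/Clang only on x86).
CpuFeatures DetectCpuFeatures() {
  CpuFeatures f{false, false, false};
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  f.ssse3 = __builtin_cpu_supports("ssse3") != 0;
  f.avx2 = __builtin_cpu_supports("avx2") != 0;
#elif defined(__ARM_NEON)
  f.neon = true;  // compile-time check only, an assumption of this sketch
#endif
  return f;
}
```

On a CPU that reports none of these features, SelectRowImpl returns
RowImpl::kGeneric, and that is the path whose speed depends entirely on
the optimization flags used for row_common.cc.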

Would it be possible to compile this library with -O3, so that the compiler
can vectorize it with the best available baseline instruction set (e.g. SSE2
on x86-64)?

cheers,
Julian Taylor

Information forwarded to debian-bugs-dist@lists.debian.org, Maintainers of Mozilla-related packages <pkg-mozilla-maintainers@lists.alioth.debian.org>:
Bug#863672; Package firefox. (Fri, 02 Jun 2017 15:24:12 GMT).


Acknowledgement sent to Laurent Bigonville <bigon@debian.org>:
Extra info received and forwarded to list. Copy sent to Maintainers of Mozilla-related packages <pkg-mozilla-maintainers@lists.alioth.debian.org>. (Fri, 02 Jun 2017 15:24:12 GMT).


Message #10 received at 863672@bugs.debian.org:

From: Laurent Bigonville <bigon@debian.org>
To: 863672@bugs.debian.org
Subject: Re: performance critical libyuv built with Os
Date: Fri, 2 Jun 2017 17:23:16 +0200
tag 863672 + patch fixed-upstream
thanks

FTR, this is now fixed upstream; -O2 is used by default on desktop builds:

https://hg.mozilla.org/integration/autoland/rev/8fdb9e30b6a7



Added tag(s) fixed-upstream and patch. Request was from Laurent Bigonville <bigon@debian.org> to control@bugs.debian.org. (Fri, 02 Jun 2017 15:24:13 GMT).



Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Tue Jan 9 22:23:00 2018; Machine Name: buxtehude
