Debian Bug report logs - #875990
reproducible: i/o issues with profitbricks-build2-i386 since stretch upgrade
Reported by: Vagrant Cascadian <vagrant@debian.org>
Date: Sun, 17 Sep 2017 02:51:02 UTC
Severity: normal
Done: Holger Levsen <holger@layer-acht.org>
Bug is archived. No further changes may be made.
Report forwarded
to debian-bugs-dist@lists.debian.org, Debian Jenkins Team <qa-jenkins-dev@lists.alioth.debian.org>:
Bug#875990; Package jenkins.debian.org.
(Sun, 17 Sep 2017 02:51:04 GMT) (full text, mbox, link).
Acknowledgement sent
to Vagrant Cascadian <vagrant@debian.org>:
New Bug report received and forwarded. Copy sent to Debian Jenkins Team <qa-jenkins-dev@lists.alioth.debian.org>.
(Sun, 17 Sep 2017 02:51:04 GMT) (full text, mbox, link).
Message #5 received at submit@bugs.debian.org (full text, mbox, reply):
[Message part 1 (text/plain, inline)]
Package: jenkins.debian.org
Severity: normal
It looks like after the upgrade to stretch (late June/early July), two
of the i386 builders, profitbricks-build2-i386 and
profitbricks-build12-i386, suddenly developed large i/o issues.
You can see this on the munin graphs for the year, where the blue i/o
wait spikes:
https://jenkins.debian.net/munin/debian.net/profitbricks-build2-i386.debian.net/cpu.html
https://jenkins.debian.net/munin/debian.net/profitbricks-build12-i386.debian.net/cpu.html
Compare this to the other i386 builders, where there is no huge
spike in i/o wait:
https://jenkins.debian.net/munin/debian.net/profitbricks-build6-i386.debian.net/cpu.html
https://jenkins.debian.net/munin/debian.net/profitbricks-build16-i386.debian.net/cpu.html
I suspect this is reducing the i386 builds per day significantly,
averaging only ~1200 in the last 3 months.
My *hunch* is that build2 and build12 are running a PAE kernel with more
than 8GB of ram, and affected by this kernel bug (introduced in linux
~4.2, possibly):
https://bugzilla.kernel.org/show_bug.cgi?id=196157
https://bugs.launchpad.net/ubuntu/+source/linux-hwe/+bug/1698118
Reducing the ram of the affected builders to 8GB and having more PAE
builders with lighter workloads might be a workaround that would get
better performance... while still testing 32/64-bit kernel
variation.
Alternately, switching to only amd64 kernels might also fix the issue,
though wouldn't test 32/64-bit kernel variations.
Running a linux 4.1 kernel from snapshot.debian.org might be another way
to test the issue, even if not running long-term.
live well,
vagrant
[signature.asc (application/pgp-signature, inline)]
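The hunch in the report above (a PAE kernel combined with more than 8GB of ram) could be checked on a builder with something like the following. This is only a sketch: the helper name is_pae_highmem_risk is hypothetical, the 8GB threshold is taken from the report rather than from the kernel bug itself, and it assumes Debian's "-686-pae" kernel flavour naming.

```shell
#!/bin/sh
# Sketch only: flag a host that matches the report's hunch
# (Debian 686-pae kernel flavour AND more than 8GB of RAM).
# is_pae_highmem_risk is a hypothetical helper, not part of jenkins.debian.net.

# Usage: is_pae_highmem_risk KERNEL_RELEASE MEMTOTAL_KB
# Returns 0 (true) when both conditions hold, 1 otherwise.
is_pae_highmem_risk() {
    krel=$1
    kb=$2
    limit_kb=$((8 * 1024 * 1024))   # 8GB in kB; threshold assumed from the report

    case "$krel" in
        *-686-pae) ;;               # Debian's PAE flavour of the i386 kernel
        *) return 1 ;;              # any other kernel: not the suspected setup
    esac

    [ "$kb" -gt "$limit_kb" ]
}

# Check the running system; MemTotal in /proc/meminfo is reported in kB.
release=$(uname -r)
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo 2>/dev/null)

if is_pae_highmem_risk "$release" "${mem_kb:-0}"; then
    echo "PAE kernel with more than 8GB of RAM: may match the reported i/o wait issue"
else
    echo "not a PAE kernel with more than 8GB of RAM"
fi
```

A match only says the host fits the suspected configuration; the i/o wait itself still has to be confirmed on the munin graphs.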
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian Jenkins Team <qa-jenkins-dev@lists.alioth.debian.org>:
Bug#875990; Package jenkins.debian.org.
(Sun, 17 Sep 2017 12:39:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Holger Levsen <holger@layer-acht.org>:
Extra info received and forwarded to list. Copy sent to Debian Jenkins Team <qa-jenkins-dev@lists.alioth.debian.org>.
(Sun, 17 Sep 2017 12:39:03 GMT) (full text, mbox, link).
Message #10 received at 875990@bugs.debian.org (full text, mbox, reply):
[Message part 1 (text/plain, inline)]
Hi Vagrant,
thanks for filing this bug and thus properly documenting what we had discovered,
discussed and lost on IRC already…
On Sat, Sep 16, 2017 at 07:48:42PM -0700, Vagrant Cascadian wrote:
> My *hunch* is that build2 and build12 are running a PAE kernel with more
> than 8GB of ram, and affected by this kernel bug (introduced in linux
> ~4.2, possibly):
> https://bugzilla.kernel.org/show_bug.cgi?id=196157
> https://bugs.launchpad.net/ubuntu/+source/linux-hwe/+bug/1698118
indeed!
> Reducing the ram of the affected builders to 8GB and having more PAE
> builders with lighter workloads might be a workaround that would get
> better performance... while still testing 32/64-bit kernel
> variation.
we lack the diskspace to do so.
> Alternately, switching to only amd64 kernels might also fix the issue,
> though wouldn't test 32/64-bit kernel variations.
indeed.
> Running a linux 4.1 kernel from snapshot.debian.org might be another way
> to test the issue, even if not running long-term.
yeah.
another option is to just wait. :/
--
cheers,
Holger
[signature.asc (application/pgp-signature, inline)]
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian Jenkins Team <qa-jenkins-dev@lists.alioth.debian.org>:
Bug#875990; Package jenkins.debian.org.
(Sun, 17 Sep 2017 16:30:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Vagrant Cascadian <vagrant@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian Jenkins Team <qa-jenkins-dev@lists.alioth.debian.org>.
(Sun, 17 Sep 2017 16:30:03 GMT) (full text, mbox, link).
Message #15 received at 875990@bugs.debian.org (full text, mbox, reply):
[Message part 1 (text/plain, inline)]
On 2017-09-17, Holger Levsen wrote:
> On Sat, Sep 16, 2017 at 07:48:42PM -0700, Vagrant Cascadian wrote:
>> My *hunch* is that build2 and build12 are running a PAE kernel with more
>> than 8GB of ram, and affected by this kernel bug (introduced in linux
>> ~4.2, possibly):
>> https://bugzilla.kernel.org/show_bug.cgi?id=196157
>> https://bugs.launchpad.net/ubuntu/+source/linux-hwe/+bug/1698118
>
> indeed!
Not sure if it makes sense to also file a bug in the debian bug tracker
about this...
>> Reducing the ram of the affected builders to 8GB and having more PAE
>> builders with lighter workloads might be a workaround that would get
>> better performance... while still testing 32/64-bit kernel
>> variation.
>
> we lack the diskspace to do so.
Then it might still get better performance to lower the PAE builders to
only use 8GB of ram, even if that means running fewer jobs in
parallel...
>> Running a linux 4.1 kernel from snapshot.debian.org might be another way
>> to test the issue, even if not running long-term.
>
> yeah.
Now that I think about it, switching back to a 3.16.x kernel from jessie
for the PAE builders should be viable at least as long as jessie-lts is
around...
> another option is to just wait. :/
I suspect that *might* be an infinite wait; I get the impression this is
a very low-priority issue upstream, and it would take some active
attempt to fix it upstream...
live well,
vagrant
[signature.asc (application/pgp-signature, inline)]
Added blocking bug(s) of 875990: 876035
Request was from Vagrant Cascadian <vagrant@debian.org>
to submit@bugs.debian.org.
(Sun, 17 Sep 2017 17:21:05 GMT) (full text, mbox, link).
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian Jenkins Team <qa-jenkins-dev@lists.alioth.debian.org>:
Bug#875990; Package jenkins.debian.org.
(Sun, 21 Jan 2018 22:27:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Vagrant Cascadian <vagrant@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian Jenkins Team <qa-jenkins-dev@lists.alioth.debian.org>.
(Sun, 21 Jan 2018 22:27:03 GMT) (full text, mbox, link).
Message #22 received at 875990@bugs.debian.org (full text, mbox, reply):
[Message part 1 (text/plain, inline)]
On 2017-09-17, Vagrant Cascadian wrote:
> On 2017-09-17, Holger Levsen wrote:
>> On Sat, Sep 16, 2017 at 07:48:42PM -0700, Vagrant Cascadian wrote:
>>> My *hunch* is that build2 and build12 are running a PAE kernel with more
>>> than 8GB of ram, and affected by this kernel bug (introduced in linux
>>> ~4.2, possibly):
>>> https://bugzilla.kernel.org/show_bug.cgi?id=196157
>>> https://bugs.launchpad.net/ubuntu/+source/linux-hwe/+bug/1698118
>>
>> indeed!
>
> Not sure if it makes sense to also file a bug in the debian bug tracker
> about this...
I did end up filing https://bugs.debian.org/876035 and the response so
far was only to downgrade it to minor, due to the unusual use-case of
running i386 userspace with a PAE kernel instead of an amd64 kernel these days.
>>> Reducing the ram of the affected builders to 8GB and having more PAE
>>> builders with lighter workloads might be a workaround that would get
>>> better performance... while still testing 32/64-bit kernel
>>> variation.
>>
>> we lack the diskspace to do so.
>
> Then it might still get better performance to lower the PAE builders to
> only use 8GB of ram, even if that means running fewer jobs in
> parallel...
Again, I think simply lowering the ram to 8GB might actually result in
better performance, as it's a non-linear degradation. Looking at the ram
usage patterns of the i386 builders, they infrequently use more than
8GB:
https://jenkins.debian.net/munin/debian.net/profitbricks-build12-i386.debian.net/memory.html
Or even 12GB or 16GB, though that will trigger the i/o wait issue more.
>> another option is to just wait. :/
>
> I suspect that *might* be an infinite wait; I get the impression this is
> a very low-priority issue upstream, and it would take some active
> attempt to fix it upstream...
Haven't seen any progress on this issue in Debian or upstream, several months
later...
live well,
vagrant
[signature.asc (application/pgp-signature, inline)]
Reply sent
to Holger Levsen <holger@layer-acht.org>:
You have taken responsibility.
(Mon, 10 Sep 2018 21:00:06 GMT) (full text, mbox, link).
Notification sent
to Vagrant Cascadian <vagrant@debian.org>:
Bug acknowledged by developer.
(Mon, 10 Sep 2018 21:00:06 GMT) (full text, mbox, link).
Message #27 received at 875990-done@bugs.debian.org (full text, mbox, reply):
[Message part 1 (text/plain, inline)]
----- Forwarded message from Holger Levsen <gitlab@salsa.debian.org> -----
Date: Mon, 10 Sep 2018 20:48:48 +0000
From: Holger Levsen <gitlab@salsa.debian.org>
To: qa-jenkins-scm@lists.alioth.debian.org
Subject: [Qa-jenkins-scm] [Git][qa/jenkins.debian.net][master] reproducible Debian: use amd64 kernels on all i386 nodes (Closes: #875990
List-Id: "SCM mails for the development of jenkins.debian.org" <qa-jenkins-scm.alioth-lists.debian.net>
Reply-To: noreply@salsa.debian.org
Holger Levsen pushed to branch master at Debian QA / jenkins.debian.net
Commits:
0fef9342 by Holger Levsen at 2018-09-10T20:48:28Z
reproducible Debian: use amd64 kernels on all i386 nodes (Closes: #875990
Signed-off-by: Holger Levsen <holger@layer-acht.org>
- - - - -
3 changed files:
- − hosts/profitbricks-build12-i386/etc/apt/sources.list
- − hosts/profitbricks-build2-i386/etc/apt/sources.list
- update_jdn.sh
Changes:
=====================================
hosts/profitbricks-build12-i386/etc/apt/sources.list deleted
=====================================
@@ -1,15 +0,0 @@
-deb http://deb.debian.org/debian/ stretch main contrib non-free
-#deb-src http://deb.debian.org/debian/ stretch main contrib non-free
-
-deb http://deb.debian.org/debian/ stretch-updates main contrib non-free
-#deb-src http://deb.debian.org/debian/ stretch-updates main contrib non-free
-
-deb http://security.debian.org/ stretch/updates main contrib non-free
-#deb-src http://security.debian.org/ stretch/updates main contrib non-free
-
-deb http://deb.debian.org/debian/ stretch-backports main contrib non-free
-#deb-src http://deb.debian.org/debian/ stretch-backports main contrib non-free
-
-# workaround for i386 kernel bugs #875990 + #876035
-deb http://deb.debian.org/debian-security jessie/updates main
-deb http://deb.debian.org/debian jessie main
=====================================
hosts/profitbricks-build2-i386/etc/apt/sources.list deleted
=====================================
@@ -1,15 +0,0 @@
-deb http://deb.debian.org/debian/ stretch main contrib non-free
-#deb-src http://deb.debian.org/debian/ stretch main contrib non-free
-
-deb http://deb.debian.org/debian/ stretch-updates main contrib non-free
-#deb-src http://deb.debian.org/debian/ stretch-updates main contrib non-free
-
-deb http://security.debian.org/ stretch/updates main contrib non-free
-#deb-src http://security.debian.org/ stretch/updates main contrib non-free
-
-deb http://deb.debian.org/debian/ stretch-backports main contrib non-free
-#deb-src http://deb.debian.org/debian/ stretch-backports main contrib non-free
-
-# workaround for i386 kernel bugs #875990 + #876035
-deb http://deb.debian.org/debian-security jessie/updates main
-deb http://deb.debian.org/debian jessie main
=====================================
update_jdn.sh
=====================================
@@ -469,11 +469,12 @@ if [ -f /etc/debian_version ] ; then
$UP2DATE || sudo apt-get install mock
fi
# for varying kernels:
- # - we use bpo kernels on pb-build5+15 (and the default i386 kernel on pb-build2+12-i386)
- # - we use the default amd64 kernel on pb-build1+11 (and the default amd64 kernel on pb-build6+16-i386)
+ # - we use bpo kernels on pb-build5+15 (and the default amd64 kernel on pb-build6+16-i386)
if [ "$HOSTNAME" = "profitbricks-build5-amd64" ] || [ "$HOSTNAME" = "profitbricks-build15-amd64" ] ; then
$UP2DATE || sudo apt-get install -t stretch-backports linux-image-amd64
- elif [ "$HOSTNAME" = "profitbricks-build6-i386" ] || [ "$HOSTNAME" = "profitbricks-build16-i386" ] ; then
+ elif [ "$HOSTNAME" = "profitbricks-build6-i386" ] || [ "$HOSTNAME" = "profitbricks-build16-i386" ] \
+ || [ "$HOSTNAME" = "profitbricks-build2-i386" ] || [ "$HOSTNAME" = "profitbricks-build12-i386" ] ; then
+ # we dont vary the kernel on i386 atm, see #875990 + #876035
$UP2DATE || sudo apt-get install linux-image-amd64
fi
# only needed on the main nodes
View it on GitLab: https://salsa.debian.org/qa/jenkins.debian.net/commit/0fef93423e8f485a5d2c866b86191c859c6787a4
----- End forwarded message -----
--
cheers,
Holger
-------------------------------------------------------------------------------
holger@(debian|reproducible-builds|layer-acht).org
PGP fingerprint: B8BF 5413 7B09 D35C F026 FE9D 091A B856 069A AA1C
[signature.asc (application/pgp-signature, inline)]
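The kernel selection that results from the commit above can be restated as a single lookup from hostname to apt-get arguments. This is a sketch for illustration, not the code in update_jdn.sh: kernel_install_args is a hypothetical helper, it uses "uname -n" where the script uses $HOSTNAME, and it leaves out the script's $UP2DATE guard.

```shell
#!/bin/sh
# Sketch only: condensed restatement of the hostname -> kernel package
# choice after the commit. kernel_install_args is a hypothetical helper.

# Prints the apt-get install arguments for the given hostname,
# or nothing when no kernel is installed explicitly for that host.
kernel_install_args() {
    case "$1" in
        profitbricks-build5-amd64|profitbricks-build15-amd64)
            # backports kernel, to keep some kernel variation on amd64
            echo "-t stretch-backports linux-image-amd64" ;;
        profitbricks-build2-i386|profitbricks-build12-i386|profitbricks-build6-i386|profitbricks-build16-i386)
            # all four i386 nodes now get the default amd64 kernel
            echo "linux-image-amd64" ;;
    esac
}

args=$(kernel_install_args "$(uname -n)")
if [ -n "$args" ]; then
    # Print rather than run, so the sketch has no side effects.
    echo "would run: sudo apt-get install $args"
fi
```

With all four i386 hosts in one branch, it is visible at a glance that 32/64-bit kernel variation is no longer tested on i386, which is the trade-off discussed earlier in this bug.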
Bug archived.
Request was from Debbugs Internal Request <owner@bugs.debian.org>
to internal_control@bugs.debian.org.
(Tue, 09 Oct 2018 07:32:37 GMT) (full text, mbox, link).
Last modified:
Wed May 17 10:48:48 2023;