Debian Bug report logs - #814183
openmpi 1.10.2 is broken on powerpc


Package: src:openmpi; Maintainer for src:openmpi is Alastair McKinstry <mckinstry@debian.org>;

Reported by: Matthias Klose <doko@debian.org>

Date: Mon, 8 Feb 2016 21:06:02 UTC

Severity: serious

Tags: sid, stretch

Found in version openmpi/1.10.2-5

Done: Alastair McKinstry <mckinstry@debian.org>

Bug is archived. No further changes may be made.



Report forwarded to debian-bugs-dist@lists.debian.org, Alastair McKinstry <mckinstry@debian.org>:
Bug#814183; Package src:openmpi. (Mon, 08 Feb 2016 21:06:05 GMT) (full text, mbox, link).


Acknowledgement sent to Matthias Klose <doko@debian.org>:
New Bug report received and forwarded. Copy sent to Alastair McKinstry <mckinstry@debian.org>. (Mon, 08 Feb 2016 21:06:06 GMT) (full text, mbox, link).


Message #5 received at submit@bugs.debian.org (full text, mbox, reply):

From: Matthias Klose <doko@debian.org>
To: Debian Bug Tracking System <submit@bugs.debian.org>
Subject: openmpi 1.10.2 is broken on powerpc
Date: Mon, 8 Feb 2016 22:02:52 +0100
Package: src:openmpi
Version: 1.10.2-5
Severity: serious
Tags: sid stretch

openmpi 1.10.2 is broken on powerpc.

Graham Inggs confirmed that at least aces3 and petsc fail in the same way in
Debian unstable, as soon as the MPI test program is launched.

[...]
Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 MPI process
See http://www.mcs.anl.gov/petsc/documentation/faq.html
--------------------------------------------------------------------------
A deprecated MCA variable value was specified in the environment or
on the command line.  Deprecated MCA variables should be avoided;
they may disappear in future releases.

  Deprecated variable: orte_rsh_agent
  New variable:        plm_rsh_agent
--------------------------------------------------------------------------
lid velocity = 0.0016, prandtl # = 1, grashof # = 1
Number of SNES iterations = 2

Session terminated, terminating shell... ...terminated.
make: *** [build-arch] Terminated

build logs:
https://launchpad.net/ubuntu/+source/aces3/3.0.8-5build2/+build/8974836
https://launchpad.net/ubuntu/+source/petsc/3.6.2.dfsg1-3build2/+build/8975053




Information forwarded to debian-bugs-dist@lists.debian.org, Alastair McKinstry <mckinstry@debian.org>:
Bug#814183; Package src:openmpi. (Tue, 09 Feb 2016 06:24:08 GMT) (full text, mbox, link).


Acknowledgement sent to Graham Inggs <ginggs@debian.org>:
Extra info received and forwarded to list. Copy sent to Alastair McKinstry <mckinstry@debian.org>. (Tue, 09 Feb 2016 06:24:08 GMT) (full text, mbox, link).


Message #10 received at 814183@bugs.debian.org (full text, mbox, reply):

From: Graham Inggs <ginggs@debian.org>
To: 814183@bugs.debian.org, Matthias Klose <doko@debian.org>
Subject: openmpi 1.10.2 is broken on powerpc
Date: Tue, 9 Feb 2016 08:20:18 +0200
I don't believe the warning below is related to the problem.

> A deprecated MCA variable value was specified in the environment or
> on the command line.  Deprecated MCA variables should be avoided;
> they may disappear in future releases.

It can be avoided by changing the following line in petsc's debian/rules

export OMPI_MCA_orte_rsh_agent=/bin/false

to

export OMPI_MCA_plm_rsh_agent=/bin/false
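
A minimal sketch of how a build script could carry an existing setting over to the new name, so that the deprecation warning goes away without changing behaviour. The helper name `migrate_rsh_agent` is hypothetical; only the two variable names come from the warning quoted above.

```shell
#!/bin/sh
# Hypothetical helper: copy the deprecated variable's value to the new name
# (unless the new name is already set), then drop the deprecated one.
migrate_rsh_agent() {
    if [ -n "${OMPI_MCA_orte_rsh_agent+set}" ]; then
        if [ -z "${OMPI_MCA_plm_rsh_agent+set}" ]; then
            OMPI_MCA_plm_rsh_agent=$OMPI_MCA_orte_rsh_agent
            export OMPI_MCA_plm_rsh_agent
        fi
        unset OMPI_MCA_orte_rsh_agent
    fi
}
```

Calling `migrate_rsh_agent` before launching the tests leaves an explicitly set `OMPI_MCA_plm_rsh_agent` untouched.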

Unfortunately, changing the variable does not prevent the build from ending
with the following message (as the aces3 build also does):

Build killed with signal TERM after 150 minutes of inactivity

On powerpc, running one of petsc's tests on one processor gets a
result (instantly):

$ mpiexec -n 1 ./ex19 -da_refine 3 -snes_monitor_short -pc_type mg \
    -ksp_type fgmres -pc_mg_type full
lid velocity = 0.0016, prandtl # = 1, grashof # = 1
  0 SNES Function norm 0.0406612
  1 SNES Function norm 3.33636e-06
  2 SNES Function norm 1.653e-11
Number of SNES iterations = 2

Running it on two processors never completes:

$ mpiexec -n 2 ./ex19 -da_refine 3 -snes_monitor_short -pc_type mg \
    -ksp_type fgmres -pc_mg_type full
lid velocity = 0.0016, prandtl # = 1, grashof # = 1
  0 SNES Function norm 0.0406612
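
Because the two-process run hangs rather than failing, a reproduction is easier to script with a time limit. The following is only a sketch, assuming GNU coreutils `timeout` is available; the helper name `run_with_limit` and the 60-second limit in the usage comment are illustrative and not part of this report.

```shell
#!/bin/sh
# Hypothetical helper: run a command under a time limit (in seconds) and
# report whether it failed or was killed for exceeding the limit.
# Assumes GNU coreutils `timeout`, which exits with status 124 on timeout.
run_with_limit() {
    limit=$1
    shift
    timeout "$limit" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "HANG: no result within ${limit}s" >&2
    elif [ "$status" -ne 0 ]; then
        echo "FAIL: exit status $status" >&2
    fi
    return "$status"
}

# Example use with the command from this report (not run here):
# run_with_limit 60 mpiexec -n 2 ./ex19 -da_refine 3 -snes_monitor_short \
#     -pc_type mg -ksp_type fgmres -pc_mg_type full
```

With a wrapper like this, a hang shows up as exit status 124 after the chosen limit instead of waiting for the build daemon's inactivity timeout.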



Information forwarded to debian-bugs-dist@lists.debian.org, Alastair McKinstry <mckinstry@debian.org>:
Bug#814183; Package src:openmpi. (Tue, 09 Feb 2016 19:51:07 GMT) (full text, mbox, link).


Acknowledgement sent to Graham Inggs <ginggs@debian.org>:
Extra info received and forwarded to list. Copy sent to Alastair McKinstry <mckinstry@debian.org>. (Tue, 09 Feb 2016 19:51:07 GMT) (full text, mbox, link).


Message #15 received at 814183@bugs.debian.org (full text, mbox, reply):

From: Graham Inggs <ginggs@debian.org>
To: 814183@bugs.debian.org, Matthias Klose <doko@debian.org>
Subject: Re: Bug#814183: openmpi 1.10.2 is broken on powerpc
Date: Tue, 9 Feb 2016 21:49:29 +0200
Petsc rebuilt successfully [1] a couple of hours ago on poulenc.d.o. [2].
My previous tests were done on partch.d.o. [3].  Partch has 2GB of RAM
vs Poulenc's 5GB, I don't know if this is significant.


[1] https://buildd.debian.org/status/fetch.php?pkg=petsc&arch=powerpc&ver=3.6.2.dfsg1-3%2Bb3&stamp=1455016089
[2] https://db.debian.org/machines.cgi?host=poulenc
[3] https://db.debian.org/machines.cgi?host=partch



Information forwarded to debian-bugs-dist@lists.debian.org, Alastair McKinstry <mckinstry@debian.org>:
Bug#814183; Package src:openmpi. (Thu, 11 Feb 2016 23:21:08 GMT) (full text, mbox, link).


Acknowledgement sent to Emilio Pozuelo Monfort <pochu@debian.org>:
Extra info received and forwarded to list. Copy sent to Alastair McKinstry <mckinstry@debian.org>. (Thu, 11 Feb 2016 23:21:08 GMT) (full text, mbox, link).


Message #20 received at 814183@bugs.debian.org (full text, mbox, reply):

From: Emilio Pozuelo Monfort <pochu@debian.org>
To: 814183@bugs.debian.org
Subject: Re: Bug#814183: openmpi 1.10.2 is broken on powerpc
Date: Fri, 12 Feb 2016 00:17:28 +0100
On Tue, 9 Feb 2016 21:49:29 +0200 Graham Inggs <ginggs@debian.org> wrote:
> Petsc rebuilt successfully [1] a couple of hours ago on poulenc.d.o. [2].
> My previous tests were done on partch.d.o. [3].  Partch has 2GB of RAM
> vs Poulenc's 5GB, I don't know if this is significant.

aces3 failed on powerpc-osuosl-01.

poulenc is a PPC970FX
partch is a POWER7
powerpc-osuosl-01 is a POWER8

Dunno if that is relevant.

Cheers,
Emilio



Information forwarded to debian-bugs-dist@lists.debian.org, Alastair McKinstry <mckinstry@debian.org>:
Bug#814183; Package src:openmpi. (Fri, 12 Feb 2016 07:39:08 GMT) (full text, mbox, link).


Acknowledgement sent to Graham Inggs <ginggs@debian.org>:
Extra info received and forwarded to list. Copy sent to Alastair McKinstry <mckinstry@debian.org>. (Fri, 12 Feb 2016 07:39:08 GMT) (full text, mbox, link).


Message #25 received at 814183@bugs.debian.org (full text, mbox, reply):

From: Graham Inggs <ginggs@debian.org>
To: Emilio Pozuelo Monfort <pochu@debian.org>, 814183@bugs.debian.org
Subject: Re: Bug#814183: openmpi 1.10.2 is broken on powerpc
Date: Fri, 12 Feb 2016 09:25:56 +0200
On 12 February 2016 at 01:17, Emilio Pozuelo Monfort <pochu@debian.org> wrote:
> On Tue, 9 Feb 2016 21:49:29 +0200 Graham Inggs <ginggs@debian.org> wrote:
>> Petsc rebuilt successfully [1] a couple of hours ago on poulenc.d.o. [2].
>> My previous tests were done on partch.d.o. [3].  Partch has 2GB of RAM
>> vs Poulenc's 5GB, I don't know if this is significant.
>
> aces3 failed on powerpc-osuosl-01.
>
> poulenc is a PPC970FX
> partch is a POWER7
> powerpc-osuosl-01 is a POWER8
>
> Dunno if that is relevant.

It might be, thanks!  Is there any way to arrange for aces3 to be
rebuilt on poulenc?  That should tell us something.



Information forwarded to debian-bugs-dist@lists.debian.org, Alastair McKinstry <mckinstry@debian.org>:
Bug#814183; Package src:openmpi. (Sat, 20 Feb 2016 14:45:03 GMT) (full text, mbox, link).


Acknowledgement sent to Emilio Pozuelo Monfort <pochu@debian.org>:
Extra info received and forwarded to list. Copy sent to Alastair McKinstry <mckinstry@debian.org>. (Sat, 20 Feb 2016 14:45:03 GMT) (full text, mbox, link).


Message #30 received at 814183@bugs.debian.org (full text, mbox, reply):

From: Emilio Pozuelo Monfort <pochu@debian.org>
To: 814183@bugs.debian.org, Graham Inggs <ginggs@debian.org>
Subject: Re: Bug#814183: openmpi 1.10.2 is broken on powerpc
Date: Sat, 20 Feb 2016 15:40:26 +0100
On Fri, 12 Feb 2016 09:25:56 +0200 Graham Inggs <ginggs@debian.org> wrote:
> On 12 February 2016 at 01:17, Emilio Pozuelo Monfort <pochu@debian.org> wrote:
> > On Tue, 9 Feb 2016 21:49:29 +0200 Graham Inggs <ginggs@debian.org> wrote:
> >> Petsc rebuilt successfully [1] a couple of hours ago on poulenc.d.o. [2].
> >> My previous tests were done on partch.d.o. [3].  Partch has 2GB of RAM
> >> vs Poulenc's 5GB, I don't know if this is significant.
> >
> > aces3 failed on powerpc-osuosl-01.
> >
> > poulenc is a PPC970FX
> > partch is a POWER7
> > powerpc-osuosl-01 is a POWER8
> >
> > Dunno if that is relevant.
> 
> It might be, thanks!  Is there any way to arrange for aces3 to be
> rebuilt on poulenc?  That should tell us something.

It built on poulenc and failed on powerpc-osuosl-01:

https://buildd.debian.org/status/logs.php?pkg=aces3&ver=3.0.8-5%2Bb1&arch=powerpc

Emilio



Merged 813722 814183 Request was from Graham Inggs <ginggs@debian.org> to control@bugs.debian.org. (Sun, 21 Feb 2016 08:15:08 GMT) (full text, mbox, link).


Information forwarded to debian-bugs-dist@lists.debian.org, Alastair McKinstry <mckinstry@debian.org>:
Bug#814183; Package src:openmpi. (Mon, 29 Feb 2016 17:51:10 GMT) (full text, mbox, link).


Acknowledgement sent to Graham Inggs <ginggs@debian.org>:
Extra info received and forwarded to list. Copy sent to Alastair McKinstry <mckinstry@debian.org>. (Mon, 29 Feb 2016 17:51:10 GMT) (full text, mbox, link).


Message #37 received at 814183@bugs.debian.org (full text, mbox, reply):

From: Graham Inggs <ginggs@debian.org>
To: 814183@bugs.debian.org
Subject: Re: Bug#814183: openmpi 1.10.2 is broken on powerpc
Date: Mon, 29 Feb 2016 19:48:41 +0200
I filed LP: #1550863 [1] to track the powerpc build failures in Ubuntu.


[1] https://bugs.launchpad.net/bugs/1550863



Disconnected #813722 from all other report(s). Request was from Graham Inggs <ginggs@debian.org> to control@bugs.debian.org. (Fri, 04 Mar 2016 18:45:36 GMT) (full text, mbox, link).


Information forwarded to debian-bugs-dist@lists.debian.org, Alastair McKinstry <mckinstry@debian.org>:
Bug#814183; Package src:openmpi. (Fri, 04 Mar 2016 19:12:20 GMT) (full text, mbox, link).


Acknowledgement sent to Graham Inggs <ginggs@debian.org>:
Extra info received and forwarded to list. Copy sent to Alastair McKinstry <mckinstry@debian.org>. (Fri, 04 Mar 2016 19:12:20 GMT) (full text, mbox, link).


Message #44 received at 814183@bugs.debian.org (full text, mbox, reply):

From: Graham Inggs <ginggs@debian.org>
To: 814183@bugs.debian.org, 816590@bugs.debian.org
Subject: Re: Bug#814183: openmpi 1.10.2 is broken on powerpc
Date: Fri, 4 Mar 2016 08:12:17 +0200
On 3 March 2016 at 13:47, Emilio Pozuelo Monfort <pochu@debian.org> wrote:
> Might be related to #813722 / #814183.

Definitely.

ELPA built on poulenc and praetorius, but failed on powerpc-osuosl-01:

https://buildd.debian.org/status/logs.php?pkg=elpa&arch=powerpc

I am only looking at elpa >= 2015.05.001-1 (the versions built since
openmpi 1.10), and ignoring failures that occurred in under 2.5 hours,
since those were due to bugs in packaging.



Information forwarded to debian-bugs-dist@lists.debian.org, Alastair McKinstry <mckinstry@debian.org>:
Bug#814183; Package src:openmpi. (Mon, 25 Apr 2016 19:06:06 GMT) (full text, mbox, link).


Acknowledgement sent to Graham Inggs <ginggs@debian.org>:
Extra info received and forwarded to list. Copy sent to Alastair McKinstry <mckinstry@debian.org>. (Mon, 25 Apr 2016 19:06:06 GMT) (full text, mbox, link).


Message #49 received at 814183@bugs.debian.org (full text, mbox, reply):

From: Graham Inggs <ginggs@debian.org>
To: Graham Inggs <ginggs@debian.org>, 816590@bugs.debian.org
Cc: 814183@bugs.debian.org
Subject: Re: [Debichem-devel] Bug#816590: Bug#814183: openmpi 1.10.2 is broken on powerpc
Date: Mon, 25 Apr 2016 21:02:15 +0200
Please see #816101 [1].  It seems the powerpc and mipsel issues are
closely related.
The PETSc package maintainer conditionally disabled the 2 process MPI
tests on powerpc and mipsel in order to work around the problem.


[1] https://bugs.debian.org/816101
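
As an illustration of the kind of workaround described above, a test driver could pick the number of MPI processes from the build architecture. This is a sketch, not the PETSc packaging itself: the function name `mpi_test_procs` and the default of 2 processes are assumptions, and `dpkg-architecture -qDEB_HOST_ARCH` is consulted only when no architecture is passed in.

```shell
#!/bin/sh
# Hypothetical helper: print how many MPI processes the test suite should use.
# The architectures named in this report are limited to a single process.
mpi_test_procs() {
    # Use the first argument if given, otherwise ask dpkg-architecture
    # (an empty result falls through to the default below).
    arch=${1:-$(dpkg-architecture -qDEB_HOST_ARCH 2>/dev/null || true)}
    case "$arch" in
        powerpc|mipsel) echo 1 ;;
        *)              echo 2 ;;
    esac
}

# Example use (not run here):
# mpiexec -n "$(mpi_test_procs)" ./ex19
```

Keeping the architecture list in one place makes it easy to drop the restriction again once the underlying problem is confirmed fixed.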



Information forwarded to debian-bugs-dist@lists.debian.org, Alastair McKinstry <mckinstry@debian.org>:
Bug#814183; Package src:openmpi. (Sun, 04 Sep 2016 14:45:03 GMT) (full text, mbox, link).


Acknowledgement sent to Emilio Pozuelo Monfort <pochu@debian.org>:
Extra info received and forwarded to list. Copy sent to Alastair McKinstry <mckinstry@debian.org>. (Sun, 04 Sep 2016 14:45:03 GMT) (full text, mbox, link).


Message #54 received at 814183@bugs.debian.org (full text, mbox, reply):

From: Emilio Pozuelo Monfort <pochu@debian.org>
To: 814183@bugs.debian.org
Subject: Re: Bug#814183: openmpi 1.10.2 is broken on powerpc
Date: Sun, 4 Sep 2016 16:43:40 +0200
On Fri, 12 Feb 2016 00:17:28 +0100 Emilio Pozuelo Monfort <pochu@debian.org> wrote:
> On Tue, 9 Feb 2016 21:49:29 +0200 Graham Inggs <ginggs@debian.org> wrote:
> > Petsc rebuilt successfully [1] a couple of hours ago on poulenc.d.o. [2].
> > My previous tests were done on partch.d.o. [3].  Partch has 2GB of RAM
> > vs Poulenc's 5GB, I don't know if this is significant.
> 
> aces3 failed on powerpc-osuosl-01.
> 
> poulenc is a PPC970FX
> partch is a POWER7
> powerpc-osuosl-01 is a POWER8

Any progress on this? Has this been forwarded upstream?

Emilio



Information forwarded to debian-bugs-dist@lists.debian.org, Alastair McKinstry <mckinstry@debian.org>:
Bug#814183; Package src:openmpi. (Sun, 04 Sep 2016 14:51:04 GMT) (full text, mbox, link).


Acknowledgement sent to Alastair McKinstry <alastair.mckinstry@sceal.ie>:
Extra info received and forwarded to list. Copy sent to Alastair McKinstry <mckinstry@debian.org>. (Sun, 04 Sep 2016 14:51:04 GMT) (full text, mbox, link).


Message #59 received at 814183@bugs.debian.org (full text, mbox, reply):

From: Alastair McKinstry <alastair.mckinstry@sceal.ie>
To: Emilio Pozuelo Monfort <pochu@debian.org>, 814183@bugs.debian.org
Subject: Re: Bug#814183: openmpi 1.10.2 is broken on powerpc
Date: Sun, 4 Sep 2016 15:47:41 +0100

On 04/09/2016 15:43, Emilio Pozuelo Monfort wrote:
> On Fri, 12 Feb 2016 00:17:28 +0100 Emilio Pozuelo Monfort <pochu@debian.org> wrote:
>> On Tue, 9 Feb 2016 21:49:29 +0200 Graham Inggs <ginggs@debian.org> wrote:
>>> Petsc rebuilt successfully [1] a couple of hours ago on poulenc.d.o. [2].
>>> My previous tests were done on partch.d.o. [3].  Partch has 2GB of RAM
>>> vs Poulenc's 5GB, I don't know if this is significant.
>> aces3 failed on powerpc-osuosl-01.
>>
>> poulenc is a PPC970FX
>> partch is a POWER7
>> powerpc-osuosl-01 is a POWER8
> Any progress on this? Has this been forwarded upstream?
Yes, reported upstream.
I'm testing out a new version 2.0.1 that may have a fix.

>
> Emilio
Alastair

-- 
Alastair McKinstry, <alastair@sceal.ie>, <mckinstry@debian.org>, https://diaspora.sceal.ie/u/amckinstry
Misentropy: doubting that the Universe is becoming more disordered. 




Reply sent to Alastair McKinstry <mckinstry@debian.org>:
You have taken responsibility. (Tue, 13 Sep 2016 15:15:07 GMT) (full text, mbox, link).


Notification sent to Matthias Klose <doko@debian.org>:
Bug acknowledged by developer. (Tue, 13 Sep 2016 15:15:07 GMT) (full text, mbox, link).


Message #64 received at 814183-done@bugs.debian.org (full text, mbox, reply):

From: Alastair McKinstry <mckinstry@debian.org>
To: 813722-done@bugs.debian.org, 814183-done@bugs.debian.org
Subject: closed in 2.0.1-5
Date: Tue, 13 Sep 2016 16:11:52 +0100
close 812733

close 814183

thanks


I'm closing these bugs as fixed / unreproducible in 2.0.1-5. In
particular I've rebuilt both aces3 and petsc on powerpc and mipsel (on
partch.debian.org and eller.debian.org) and they build successfully.

There have been code changes and bug fixes in the wait/lock code, and
both architectures now use standard gcc atomics, which means the
relevant code paths have changed.

Please reopen if the bug is seen again, but it is believed fixed.


Regards

Alastair


-- 
Alastair McKinstry, <alastair@sceal.ie>, <mckinstry@debian.org>, https://diaspora.sceal.ie/u/amckinstry
Misentropy: doubting that the Universe is becoming more disordered. 	



Information forwarded to debian-bugs-dist@lists.debian.org, Alastair McKinstry <mckinstry@debian.org>:
Bug#814183; Package src:openmpi. (Wed, 14 Sep 2016 16:54:07 GMT) (full text, mbox, link).


Acknowledgement sent to Graham Inggs <ginggs@debian.org>:
Extra info received and forwarded to list. Copy sent to Alastair McKinstry <mckinstry@debian.org>. (Wed, 14 Sep 2016 16:54:07 GMT) (full text, mbox, link).


Message #69 received at 814183@bugs.debian.org (full text, mbox, reply):

From: Graham Inggs <ginggs@debian.org>
To: 814183@bugs.debian.org, 813722@bugs.debian.org
Subject: Re: Bug#814183: marked as done (openmpi 1.10.2 is broken on powerpc)
Date: Wed, 14 Sep 2016 18:50:54 +0200
Hi Alastair

On 13 September 2016 at 17:15, Debian Bug Tracking System
<owner@bugs.debian.org> wrote:
> I'm closing these bugs as fixed / unreproducible in 2.0.1-5. In
> particular I've rebuilt both aces3 and petsc on powerpc and mipsel (on
> partch.debian.org and eller.debian.org) and they build successfully.

Note that many of the packages are already carrying patches to skip
tests on powerpc, or limit the number of MPI processes on powerpc
(np=1).

> There have been code changes and bug fixes in the wait/lock code, as
> well as now using standard gcc atomics on both architectures, which
> means the relevant code paths have changed.

That's good to hear.

> Please reopen if the bug is seen again, but it is believed fixed.

That's fine with me.

Regards
Graham



Bug archived. Request was from Debbugs Internal Request <owner@bugs.debian.org> to internal_control@bugs.debian.org. (Thu, 13 Oct 2016 07:31:31 GMT) (full text, mbox, link).




Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Sat Jan 6 11:03:44 2018; Machine Name: buxtehude
