Debian Bug report logs - #688711
[3.4.4 -> 3.5 regression] fails to find root device ("Unable to find LVM volume data/root")

Package: src:linux; Maintainer for src:linux is Debian Kernel Team <debian-kernel@lists.debian.org>;

Reported by: Jonathan Nieder <jrnieder@gmail.com>

Date: Mon, 24 Sep 2012 23:03:01 UTC

Severity: important

Tags: fixed-upstream, upstream

Found in versions linux/3.5.2-1~experimental.1, linux/3.5-1~experimental.1

Fixed in version 3.6.4-1~experimental.1

Done: Jonathan Nieder <jrnieder@gmail.com>

Bug is archived. No further changes may be made.

Report forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#688711; Package src:linux. (Mon, 24 Sep 2012 23:03:04 GMT)

Acknowledgement sent to Jonathan Nieder <jrnieder@gmail.com>:
New Bug report received and forwarded. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Mon, 24 Sep 2012 23:03:04 GMT)

Message #5 received at submit@bugs.debian.org:

From: Jonathan Nieder <jrnieder@gmail.com>
To: submit@bugs.debian.org
Subject: [3.4.4 -> 3.5 regression] fails to find root device ("Unable to find LVM volume data/root")
Date: Mon, 24 Sep 2012 16:00:16 -0700
Source: linux
Version: 3.5.2-1~experimental.1
Severity: important
Justification: fails to boot
Tags: upstream

Hi,

Trying to boot with the kernel from experimental, I get (typed by hand):

	[     1.949115]  sda: sda1 sda2 sda3 < sda5 >
	[     1.949751] sd 0:0:0:0: [sda] Attached SCSI disk
	  Volume group "data" not found
	  Skipping volume group data
	Unable to find LVM volume data/root
	[     1.994926] Refined TSC clocksource calibration: 2294.254 MHz.
	[     1.994977] Switching to clocksource tsc
	done.
	Begin: Waiting for root file system ... done.
	Gave up waiting for root device.

followed by some advice about how to diagnose.  At the busybox prompt,
looking in /dev/mapper, I only see "control" and "data-swap".  Similarly,

	(busybox) lvm lvscan
	  ACTIVE              '/dev/data/swap' [4.66 GiB] inherit
	[...]
	inactive              '/dev/data/root' [952.00 MiB] inherit
	inactive              '/dev/data/usr' [...]

So for some reason only the LV for my swap partition is being activated.

Reproducible with upstream kernels, too.  Bisects to v3.5-rc1~164
(Merge branch 'core-rcu-for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip, 2012-05-21),
which is puzzling.  That kernel reproducibly fails to find the root
device.  I'm retesting its second and first parent now.

Bugscript output for the (working) wheezy kernel, fstab, and lvm.conf
attached.

Ideas?
Jonathan
[bugscript-output.txt (text/plain, attachment)]
[fstab (text/plain, attachment)]
[lvm.conf (text/plain, attachment)]
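
Editor's note: the lvscan excerpt in the report above can be checked mechanically. The following is a minimal sketch in Python, assuming the output keeps the shape shown in the report (a state word, then a quoted device path). The function name inactive_volumes and the sample string are illustrative and are not part of LVM or of the original report.

```python
import re

# Matches lines such as:
#   ACTIVE              '/dev/data/swap' [4.66 GiB] inherit
#   inactive              '/dev/data/root' [952.00 MiB] inherit
# Lines that do not start with a state word (for example "[...]") are ignored.
LVSCAN_LINE = re.compile(r"^\s*(ACTIVE|inactive)\s+'([^']+)'")


def inactive_volumes(lvscan_output):
    """Return the device paths that the given lvscan output marks inactive."""
    paths = []
    for line in lvscan_output.splitlines():
        match = LVSCAN_LINE.match(line)
        if match and match.group(1) == "inactive":
            paths.append(match.group(2))
    return paths


# Sample text copied from the report, including its elided parts.
sample = """\
  ACTIVE              '/dev/data/swap' [4.66 GiB] inherit
  [...]
  inactive              '/dev/data/root' [952.00 MiB] inherit
  inactive              '/dev/data/usr' [...]
"""

print(inactive_volumes(sample))
```

On the sample this lists /dev/data/root and /dev/data/usr, the volumes that were not activated in the initramfs.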

Marked as found in versions linux/3.5-1~experimental.1. Request was from Jonathan Nieder <jrnieder@gmail.com> to control@bugs.debian.org. (Mon, 24 Sep 2012 23:09:07 GMT)

Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#688711; Package src:linux. (Tue, 25 Sep 2012 00:09:05 GMT)

Acknowledgement sent to Jonathan Nieder <jrnieder@gmail.com>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Tue, 25 Sep 2012 00:09:05 GMT)

Message #12 received at 688711@bugs.debian.org:

From: Jonathan Nieder <jrnieder@gmail.com>
To: 688711@bugs.debian.org
Subject: Re: [3.4.4 -> 3.5 regression] fails to find root device ("Unable to find LVM volume data/root")
Date: Mon, 24 Sep 2012 17:07:23 -0700
Jonathan Nieder wrote:

> 	[     1.949115]  sda: sda1 sda2 sda3 < sda5 >
> 	[     1.949751] sd 0:0:0:0: [sda] Attached SCSI disk
> 	  Volume group "data" not found
[...]
>                                           Bisects to v3.5-rc1~164
[...]
>          I'm retesting its second and first parent now.

Both parents test ok, so looks like something bad happened during that
merge.  Next step is to linearize it, unless someone has a nicer idea.



Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#688711; Package src:linux. (Tue, 25 Sep 2012 14:18:03 GMT)

Acknowledgement sent to Frederik Himpe <fhimpe@telenet.be>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Tue, 25 Sep 2012 14:18:03 GMT)

Message #17 received at 688711@bugs.debian.org:

From: Frederik Himpe <fhimpe@telenet.be>
To: 688711@bugs.debian.org
Subject: Confirmed
Date: Tue, 25 Sep 2012 16:03:17 +0200
I can confirm this bug on one of my machines. Another system using LVM
does not show the problem, though. Running vgchange -a y in the
initramfs shell activates the logical volumes, and the boot can then
continue without problems.

-- 
Frederik Himpe <fhimpe@telenet.be>




Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#688711; Package src:linux. (Wed, 26 Sep 2012 07:03:06 GMT)

Acknowledgement sent to Jonathan Nieder <jrnieder@gmail.com>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Wed, 26 Sep 2012 07:03:06 GMT)

Message #22 received at 688711@bugs.debian.org:

From: Jonathan Nieder <jrnieder@gmail.com>
To: 688711@bugs.debian.org
Subject: Re: [3.4.4 -> 3.5 regression] fails to find root device ("Unable to find LVM volume data/root")
Date: Wed, 26 Sep 2012 00:00:55 -0700
# 3.6-rc7
tags 688711 + fixed-upstream
quit

Jonathan Nieder wrote:
> Jonathan Nieder wrote:

>> 	[     1.949115]  sda: sda1 sda2 sda3 < sda5 >
>> 	[     1.949751] sd 0:0:0:0: [sda] Attached SCSI disk
>> 	  Volume group "data" not found
> [...]
>>                                           Bisects to v3.5-rc1~164
> [...]
>>          I'm retesting its second and first parent now.
>
> Both parents test ok, so looks like something bad happened during that
> merge.  Next step is to linearize it, unless someone has a nicer idea.

With all 35 core-rcu-for-linus patches[1] applied on top of

 5ec29e3149d8 Merge branch 'core-locking-for-linus'

the tree matches v3.5-rc1~164 and reliably fails to boot, with the
same message described above.

Unfortunately bisecting does not produce a clear result:

 - the kernel booted fine with the first three patches applied
   (up to and including c57afe80db4e "rcu: Make RCU_FAST_NO_HZ account
   for pauses out of idle")

 - with patch 4, the boot failure happened once, but I couldn't make
   it happen again

 - likewise with patches up to and including #7 (98248a0e2432 "rcu:
   Explicitly initialize RCU_FAST_NO_HZ per-CPU variables") --- the
   boot failure happened once, but the next time I tried, it booted
   fine

 - the kernel reliably fails to boot towards the end of the series

The bug resists investigation.  Bisection log attached.

Luckily 3.5.2-1~experimental.1 still reliably fails to boot.  Better
news: 3.6~rc7-1~experimental.1, built as described in bug#688834,
reliably boots ok (!).  So my use case is taken care of.  Let's hope
the fix sticks.

Sincerely,
Jonathan

[1]
 2fdbb31b6627 rcu: Add RCU_FAST_NO_HZ tracing for idle exit
 2ee3dc80660a rcu: Make RCU_FAST_NO_HZ use timer rather than hrtimer
 c57afe80db4e rcu: Make RCU_FAST_NO_HZ account for pauses out of idle
 79b9a75fb703 rcu: Add warning for RCU_FAST_NO_HZ timer firing
 f511fc624642 rcu: Ensure that RCU_FAST_NO_HZ timers expire on correct CPU
 21e52e156663 rcu: Make RCU_FAST_NO_HZ handle timer migration
 98248a0e2432 rcu: Explicitly initialize RCU_FAST_NO_HZ per-CPU variables
 b1420f1c8bfc rcu: Make rcu_barrier() less disruptive
 559f9badd11d rcu: List-debug variants of rcu list routines
 f88022a4f650 rcu: Replace list_first_entry_rcu() with list_first_or_null_rcu()
 c9336643e144 rcu: Clarify help text for RCU_BOOST_PRIO
 d8169d4c369e rcu: Make __kfree_rcu() less dependent on compiler choices
 8932a63d5edb rcu: Reduce cache-miss initialization latencies for large systems
 dabb8aa96020 rcu: Document kernel command-line parameters
 6d8133919bac rcu: Document why rcu_blocking_is_gp() is safe
 048a0e8f5e1d timer: Fix mod_timer_pinned() header comment
 616c310e83b8 rcu: Move PREEMPT_RCU preemption to switch_to() invocation
 9dd8fb16c361 rcu: Make exit_rcu() more precise and consolidate
 37e377d2823e rcu: Fixes to rcutorture error handling and cleanup
 fae4b54f28f0 rcu: Introduce rcutorture testing for rcu_barrier()
 cef50120b61c rcu: Direct algorithmic SRCU implementation
 4b7a3e9e3211 rcu: Remove fast check path from __synchronize_srcu()
 440253c17fc4 rcu: Increment upper bit only for srcu_read_lock()
 944ce9af4767 rcu: Flip ->completed only once per SRCU grace period
 18108ebfebe9 rcu: Improve SRCU's wait_idx() comments
 b52ce066c55a rcu: Implement a variant of Peter's SRCU algorithm
 966f58c2f6df rcu: Remove unused srcu_barrier()
 dc87917501e3 rcu: Improve srcu_readers_active_idx()'s cache locality
 d9792edd7a9a rcu: Use single value to handle expedited SRCU grace periods
 931ea9d1a6e0 rcu: Implement per-domain single-threaded call_srcu() state machine
 9059c94017f7 rcu: Add rcutorture test for call_srcu()
 9fab97876af8 rcu: Update RCU maintainership
[bisection.log (text/plain, attachment)]
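
Editor's note: the unclear bisection result in the message above (a failure seen once with patch 4, then not again) is the usual hazard when a bug does not reproduce on every boot. A single clean boot is weak evidence that a commit is good. The following is a rough sketch in Python, assuming each boot fails independently with a fixed probability. The function name boots_needed and the 50% and 20% rates are illustrative assumptions, not measurements from this report.

```python
def boots_needed(failure_rate, max_risk):
    """Return the smallest number of consecutive clean boots n such that
    a commit that fails with probability failure_rate per boot would pass
    all n boots with probability at most max_risk.

    Assumes boots are independent, which real hardware may not honour.
    """
    if not 0 < failure_rate < 1:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0 < max_risk < 1:
        raise ValueError("max_risk must be strictly between 0 and 1")
    boots = 0
    chance_all_clean = 1.0
    while chance_all_clean > max_risk:
        chance_all_clean *= 1 - failure_rate
        boots += 1
    return boots


# Hypothetical rates: a bug that shows up on half of all boots,
# and one that shows up on one boot in five.
print(boots_needed(0.5, 0.05))
print(boots_needed(0.2, 0.05))
```

Under these assumptions, a bug that fails half the time needs 5 clean boots before a "good" verdict carries at most a 5% risk of being wrong, and one that fails a fifth of the time needs 14. That may explain why the early patches in the series gave ambiguous answers.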

Added tag(s) fixed-upstream. Request was from Jonathan Nieder <jrnieder@gmail.com> to control@bugs.debian.org. (Wed, 26 Sep 2012 07:03:08 GMT)

Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#688711; Package src:linux. (Tue, 09 Oct 2012 20:39:08 GMT)

Acknowledgement sent to Jonathan Nieder <jrnieder@gmail.com>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Tue, 09 Oct 2012 20:39:08 GMT)

Message #29 received at 688711@bugs.debian.org:

From: Jonathan Nieder <jrnieder@gmail.com>
To: 688711@bugs.debian.org
Subject: Re: [3.4.4 -> 3.5 regression] fails to find root device ("Unable to find LVM volume data/root")
Date: Tue, 9 Oct 2012 13:36:50 -0700
tags 688711 + pending
quit

Jonathan Nieder wrote:

> Luckily 3.5.2-1~experimental.1 still reliably fails to boot.  Better
> news: 3.6~rc7-1~experimental.1, built as described in bug#688834,
> reliably boots ok (!).  So my use case is taken care of.  Let's hope
> the fix sticks.

I'd been using 3.6-rc7 for a while (several boots) and started using
3.6.1 yesterday (two boots).  No problems --- phew.



Added tag(s) pending. Request was from Jonathan Nieder <jrnieder@gmail.com> to control@bugs.debian.org. (Tue, 09 Oct 2012 20:39:10 GMT)

Reply sent to Jonathan Nieder <jrnieder@gmail.com>:
You have taken responsibility. (Thu, 01 Nov 2012 19:12:08 GMT)

Notification sent to Jonathan Nieder <jrnieder@gmail.com>:
Bug acknowledged by developer. (Thu, 01 Nov 2012 19:12:08 GMT)

Message #36 received at 688711-done@bugs.debian.org:

From: Jonathan Nieder <jrnieder@gmail.com>
To: 688711-done@bugs.debian.org
Subject: Re: [3.4.4 -> 3.5 regression] fails to find root device ("Unable to find LVM volume data/root")
Date: Thu, 1 Nov 2012 12:09:24 -0700
Version: 3.6.4-1~experimental.1

Jonathan Nieder wrote:
> Jonathan Nieder wrote:

>> Luckily 3.5.2-1~experimental.1 still reliably fails to boot.  Better
>> news: 3.6~rc7-1~experimental.1, built as described in bug#688834,
>> reliably boots ok (!).  So my use case is taken care of.  Let's hope
>> the fix sticks.
>
> I'd been using 3.6-rc7 for a while (several boots) and started using
> 3.6.1 yesterday (two boots).  No problems --- phew.

Closing.



Bug archived. Request was from Debbugs Internal Request <owner@bugs.debian.org> to internal_control@bugs.debian.org. (Sun, 02 Dec 2012 07:25:45 GMT)

Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Thu Apr 24 02:02:40 2014; Machine Name: buxtehude.debian.org