Debian Bug report logs - #525880
xfsprogs: xfs_repair enters livelock and makes no progress

Package: xfsprogs; Maintainer for xfsprogs is XFS Development Team <linux-xfs@vger.kernel.org>; Source for xfsprogs is src:xfsprogs.

Reported by: Marc Lehmann <debian-reportbug@plan9.de>

Date: Mon, 27 Apr 2009 16:48:01 UTC

Severity: important

Found in version xfsprogs/2.9.8-1lenny1


Report forwarded to debian-bugs-dist@lists.debian.org, Nathan Scott <nathans@debian.org>:
Bug#525880; Package xfsprogs. (Mon, 27 Apr 2009 16:48:04 GMT)


Acknowledgement sent to Marc Lehmann <debian-reportbug@plan9.de>:
New Bug report received and forwarded. Copy sent to Nathan Scott <nathans@debian.org>. (Mon, 27 Apr 2009 16:48:04 GMT)


Message #5 received at submit@bugs.debian.org:

From: Marc Lehmann <debian-reportbug@plan9.de>
To: Debian Bug Tracking System <submit@bugs.debian.org>
Subject: xfsprogs: xfs_repair enters livelock and makes no progress
Date: Mon, 27 Apr 2009 18:45:58 +0200
Package: xfsprogs
Version: 2.9.8-1lenny1
Severity: important


After a crash I ran xfs_repair on a filesystem. It did a lot of I/O, as
one would expect, but after a while:

   Phase 1 - find and verify superblock...
   Phase 2 - using internal log
           - zero log...
           - scan filesystem freespace and inode maps...
           - found root inode chunk
   Phase 3 - for each AG...
           - scan and clear agi unlinked lists...
           - process known inodes and perform inode discovery...
           - agno = 0
           - agno = 1
           - agno = 2
           - agno = 3
           - process newly discovered inodes...
   Phase 4 - check for duplicate blocks...
           - setting up duplicate extent list...
           - check for inodes claiming duplicate blocks...
           - agno = 0
           - agno = 1
           - agno = 2
           - agno = 3
   Phase 5 - rebuild AG headers and trees...
           - reset superblock...
   Phase 6 - check inode connectivity...
           - resetting contents of realtime bitmap and summary inodes
           - traversing filesystem ...

it stopped doing any I/O and just sat there, using 170-180% CPU (according to top, on a dual-core system).

strace shows xfs_repair doing nothing except mutex operations:

   http://ue.tst.eu/37906f67a8a9d081cc55a4dc6a672471.txt
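
The symptom described above (CPU time keeps accumulating while I/O stops) can be checked for mechanically. The following is a minimal Python sketch and not part of the original report: the names Sample, parse_stat, parse_io, sample and looks_livelocked are hypothetical, and the 0.9 CPU threshold is an assumption. It reads the Linux /proc/<pid>/stat and /proc/<pid>/io files, so it needs permission to read them for the target process.

```python
import os
import time
from dataclasses import dataclass


@dataclass
class Sample:
    cpu_ticks: int  # utime + stime of the whole process, in clock ticks
    io_bytes: int   # read_bytes + write_bytes
    wall: float     # monotonic timestamp in seconds


def parse_stat(text):
    """Return utime + stime (fields 14 and 15) from /proc/<pid>/stat text."""
    # The command name is parenthesised and may contain spaces, so split
    # after the last ')'. The first remaining field is field 3 (state).
    rest = text.rsplit(")", 1)[1].split()
    return int(rest[11]) + int(rest[12])


def parse_io(text):
    """Return read_bytes + write_bytes from /proc/<pid>/io text."""
    fields = dict(
        line.split(": ", 1) for line in text.splitlines() if ": " in line
    )
    return int(fields["read_bytes"]) + int(fields["write_bytes"])


def sample(pid):
    """Take one CPU/I-O sample of a running process."""
    with open(f"/proc/{pid}/stat") as f:
        ticks = parse_stat(f.read())
    with open(f"/proc/{pid}/io") as f:
        io_bytes = parse_io(f.read())
    return Sample(ticks, io_bytes, time.monotonic())


def looks_livelocked(before, after, min_cpu_fraction=0.9, ticks_per_second=None):
    """True if the process burned CPU but did no storage I/O between samples.

    This only flags the symptom; a legitimately compute-bound phase
    looks the same, so it is a hint and not a diagnosis.
    """
    elapsed = after.wall - before.wall
    if elapsed <= 0:
        return False
    if ticks_per_second is None:
        ticks_per_second = os.sysconf("SC_CLK_TCK")
    cpu_seconds = (after.cpu_ticks - before.cpu_ticks) / ticks_per_second
    no_io = after.io_bytes == before.io_bytes
    return no_io and cpu_seconds / elapsed >= min_cpu_fraction
```

A possible use is to call sample(pid) twice, some seconds apart, and pass both results to looks_livelocked. A process using 170-180% CPU yields a CPU fraction of roughly 1.7 to 1.8, well above the assumed threshold.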

After 10 minutes, I interrupted xfs_repair and ran it again under taskset 1
(pinned to a single CPU). This time it ran to completion, although a single
successful run does not prove much. It also spent a long time in the
"traversing filesystem" phase, much longer than the first run took before it
stopped making progress, so my guess is that something in that phase can
cause a livelock.
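
For reference, taskset 1 passes the hexadecimal affinity mask 0x1, so only CPU 0 is allowed. A minimal Python sketch of the same mask interpretation follows; the helper names mask_to_cpus and pin_current_process are hypothetical, and os.sched_setaffinity is the Linux-only standard-library call used to apply the result.

```python
import os


def mask_to_cpus(mask):
    """Convert a taskset-style bit mask into a set of CPU indices."""
    return {bit for bit in range(mask.bit_length()) if (mask >> bit) & 1}


def pin_current_process(mask):
    """Restrict the calling process to the CPUs in mask (Linux only).

    Returns the affinity set actually in effect afterwards.
    """
    os.sched_setaffinity(0, mask_to_cpus(mask))
    return os.sched_getaffinity(0)
```

With this helper, mask_to_cpus(1) gives {0}, which corresponds to the taskset 1 invocation. Pinning to one CPU changes how the threads interleave but does not remove the underlying locking problem.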

I should also note that xfs_repair usually finishes in under 20 seconds on
all the other filesystems on the box, but it takes a very long time (20
minutes, doing lots of I/O) on this filesystem, although the filesystems
should have similar content.

-- System Information:
Debian Release: 5.0
  APT prefers stable
  APT policy: (990, 'stable'), (500, 'unstable'), (500, 'testing'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.26-1-amd64 (SMP w/4 CPU cores)
Locale: LANG=C, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages xfsprogs depends on:
hi  libc6                         2.7-18     GNU C Library: Shared libraries
ii  libreadline5                  5.2-3.1    GNU readline and history libraries
ii  libuuid1                      1.41.3-1   universally unique id library

xfsprogs recommends no packages.

Versions of packages xfsprogs suggests:
ii  attr                          1:2.4.43-2 Utilities for manipulating filesys
pn  dvhtool                       <none>     (no description available)
ii  quota                         3.16-7     implementation of the disk quota s
ii  xfsdump                       2.2.48-1   Administrative utilities for the X

-- no debconf information




Information forwarded to debian-bugs-dist@lists.debian.org, Nathan Scott <nathans@debian.org>:
Bug#525880; Package xfsprogs. (Mon, 27 Apr 2009 17:42:03 GMT)


Acknowledgement sent to Marc Lehmann <debian-reportbug@plan9.de>:
Extra info received and forwarded to list. Copy sent to Nathan Scott <nathans@debian.org>. (Mon, 27 Apr 2009 17:42:03 GMT)


Message #10 received at 525880@bugs.debian.org:

From: Marc Lehmann <debian-reportbug@plan9.de>
To: Debian Bug Tracking System <525880@bugs.debian.org>
Subject: xfsprogs: more info
Date: Mon, 27 Apr 2009 19:40:22 +0200
Package: xfsprogs
Version: 2.9.8-1lenny1
Followup-For: Bug #525880


I re-ran xfs_repair under taskset 1 and, sure enough, the same hang
occurred. I attached gdb; here are some not-so-useful backtraces (most
frames are unresolved):


   (gdb) inf thr
     6 Thread 0x4413a950 (LWP 3537)  0x00007f1270dd2bd1 in sem_wait () from /lib/libpthread.so.0
     5 Thread 0x42937950 (LWP 3538)  0x0000000000402560 in pthread_mutex_unlock@plt ()
     4 Thread 0x43138950 (LWP 3539)  0x000000000042e96a in ?? ()
     3 Thread 0x41935950 (LWP 3540)  0x00000000004024b0 in pthread_mutex_lock@plt ()
     2 Thread 0x42136950 (LWP 3541)  0x000000000042e973 in ?? ()
     1 Thread 0x7f12715f0730 (LWP 2908)  0x00007f1270dd3384 in __lll_lock_wait () from /lib/libpthread.so.0
   (gdb) thr 1
   [Switching to thread 1 (Thread 0x7f12715f0730 (LWP 2908))]#0  0x00007f1270dd3384 in __lll_lock_wait () from /lib/libpthread.so.0
   (gdb) bt
   #0  0x00007f1270dd3384 in __lll_lock_wait () from /lib/libpthread.so.0
   #1  0x00007f1270dcebf0 in _L_lock_102 () from /lib/libpthread.so.0
   #2  0x00007f1270dce4fe in pthread_mutex_lock () from /lib/libpthread.so.0
   #3  0x000000000042f1d8 in ?? ()
   #4  0x000000000042f22d in ?? ()
   #5  0x0000000000430031 in ?? ()
   #6  0x0000000000444694 in ?? ()
   #7  0x00000000004309d4 in ?? ()
   #8  0x0000000000421dff in ?? ()
   #9  0x0000000000422b95 in ?? ()
   #10 0x000000000042b7e5 in ?? ()
   #11 0x00007f1270a911a6 in __libc_start_main () from /lib/libc.so.6
   #12 0x00000000004025c9 in ?? ()
   #13 0x00007fff79609ed8 in ?? ()
   #14 0x000000000000001c in ?? ()
   #15 0x0000000000000002 in ?? ()
   #16 0x00007fff7960b896 in ?? ()
   #17 0x00007fff7960b8a1 in ?? ()
   #18 0x0000000000000000 in ?? ()
   (gdb) thr 2
   [Switching to thread 2 (Thread 0x42136950 (LWP 3541))]#0  0x000000000042e973 in ?? ()
   (gdb) bt
   #0  0x000000000042e973 in ?? ()
   #1  0x000000000042cb3c in ?? ()
   #2  0x000000000042f1c5 in ?? ()
   #3  0x0000000000424890 in ?? ()
   #4  0x0000000000424e39 in ?? ()
   #5  0x00000000004243ab in ?? ()
   #6  0x0000000000424656 in ?? ()
   #7  0x00000000004253c6 in ?? ()
   #8  0x00000000004255ac in ?? ()
   #9  0x00007f1270dccfc7 in start_thread () from /lib/libpthread.so.0
   #10 0x00007f1270b425ad in clone () from /lib/libc.so.6
   #11 0x0000000000000000 in ?? ()
   (gdb) thr 3
   [Switching to thread 3 (Thread 0x41935950 (LWP 3540))]#0  0x00000000004024b0 in pthread_mutex_lock@plt ()
   (gdb) bt
   #0  0x00000000004024b0 in pthread_mutex_lock@plt ()
   #1  0x000000000042c634 in ?? ()
   #2  0x000000000042cccd in ?? ()
   #3  0x000000000042f1c5 in ?? ()
   #4  0x0000000000424890 in ?? ()
   #5  0x0000000000424e39 in ?? ()
   #6  0x00000000004247e0 in ?? ()
   #7  0x00000000004253c6 in ?? ()
   #8  0x00000000004255ac in ?? ()
   #9  0x00007f1270dccfc7 in start_thread () from /lib/libpthread.so.0
   #10 0x00007f1270b425ad in clone () from /lib/libc.so.6
   #11 0x0000000000000000 in ?? ()
   (gdb) thr 4
   [Switching to thread 4 (Thread 0x43138950 (LWP 3539))]#0  0x000000000042e96a in ?? ()
   (gdb) bt
   #0  0x000000000042e96a in ?? ()
   #1  0x000000000042cb3c in ?? ()
   #2  0x000000000042f1c5 in ?? ()
   #3  0x0000000000424890 in ?? ()
   #4  0x0000000000424e39 in ?? ()
   #5  0x00000000004243ab in ?? ()
   #6  0x0000000000424656 in ?? ()
   #7  0x00000000004253c6 in ?? ()
   #8  0x00000000004255ac in ?? ()
   #9  0x00007f1270dccfc7 in start_thread () from /lib/libpthread.so.0
   #10 0x00007f1270b425ad in clone () from /lib/libc.so.6
   #11 0x0000000000000000 in ?? ()
   (gdb) thr 5
   [Switching to thread 5 (Thread 0x42937950 (LWP 3538))]#0  0x0000000000402560 in pthread_mutex_unlock@plt ()
   (gdb) bt
   #0  0x0000000000402560 in pthread_mutex_unlock@plt ()
   #1  0x000000000042c7ed in ?? ()
   #2  0x000000000042cccd in ?? ()
   #3  0x000000000042f1c5 in ?? ()
   #4  0x0000000000424890 in ?? ()
   #5  0x0000000000424e39 in ?? ()
   #6  0x00000000004243ab in ?? ()
   #7  0x0000000000424656 in ?? ()
   #8  0x00000000004253c6 in ?? ()
   #9  0x00000000004255ac in ?? ()
   #10 0x00007f1270dccfc7 in start_thread () from /lib/libpthread.so.0
   #11 0x00007f1270b425ad in clone () from /lib/libc.so.6
   #12 0x0000000000000000 in ?? ()
   (gdb) thr 6
   [Switching to thread 6 (Thread 0x4413a950 (LWP 3537))]#0  0x00007f1270dd2bd1 in sem_wait () from /lib/libpthread.so.0
   (gdb) bt
   #0  0x00007f1270dd2bd1 in sem_wait () from /lib/libpthread.so.0
   #1  0x0000000000424b2e in ?? ()
   #2  0x00007f1270dccfc7 in start_thread () from /lib/libpthread.so.0
   #3  0x00007f1270b425ad in clone () from /lib/libc.so.6
   #4  0x0000000000000000 in ?? ()
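
The thread listing above can be condensed into a count of what each thread is currently executing. This is a minimal Python sketch with hypothetical names (THREAD_LINE, top_frames, summarize); the regular expression assumes the "info threads" line format shown in this report, and other gdb versions print a different layout.

```python
import re
from collections import Counter

# Matches lines such as:
#   6 Thread 0x4413a950 (LWP 3537)  0x00007f1270dd2bd1 in sem_wait () from ...
# An optional leading '*' (gdb's current-thread marker) is tolerated.
THREAD_LINE = re.compile(
    r"^\s*\*?\s*(?P<num>\d+)\s+Thread\s+0x[0-9a-f]+\s+\(LWP\s+(?P<lwp>\d+)\)"
    r"\s+0x[0-9a-f]+\s+in\s+(?P<func>\S+)\s+\(\)"
)


def top_frames(info_threads_text):
    """Map gdb thread number to the function name of its innermost frame."""
    frames = {}
    for line in info_threads_text.splitlines():
        match = THREAD_LINE.match(line)
        if match:
            frames[int(match.group("num"))] = match.group("func")
    return frames


def summarize(info_threads_text):
    """Count how many threads sit in each innermost function."""
    return Counter(top_frames(info_threads_text).values())


# The listing from this report, copied verbatim.
INFO_THREADS = """\
  6 Thread 0x4413a950 (LWP 3537)  0x00007f1270dd2bd1 in sem_wait () from /lib/libpthread.so.0
  5 Thread 0x42937950 (LWP 3538)  0x0000000000402560 in pthread_mutex_unlock@plt ()
  4 Thread 0x43138950 (LWP 3539)  0x000000000042e96a in ?? ()
  3 Thread 0x41935950 (LWP 3540)  0x00000000004024b0 in pthread_mutex_lock@plt ()
  2 Thread 0x42136950 (LWP 3541)  0x000000000042e973 in ?? ()
  1 Thread 0x7f12715f0730 (LWP 2908)  0x00007f1270dd3384 in __lll_lock_wait () from /lib/libpthread.so.0
"""
```

Applied to INFO_THREADS, summarize reports three threads inside mutex lock or unlock paths (threads 1, 3 and 5), two in unresolved frames (threads 2 and 4) and one blocked in sem_wait (thread 6), which matches the strace observation that only mutex operations are taking place.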







Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Wed Oct 11 03:50:08 2017; Machine Name: beach