Debian Bug report logs - #700975
linux-image-3.7-trunk-amd64: Marvell 88SE9230: Freaks out and drops all disks if sent SMART command during RAID rebuild

Package: src:linux; Maintainer for src:linux is Debian Kernel Team <debian-kernel@lists.debian.org>;

Reported by: Maik Zumstrull <maik@zumstrull.net>

Date: Tue, 19 Feb 2013 22:39:01 UTC

Severity: normal

Found in versions linux/3.7.8-1~experimental.1, linux/3.2.35-2, linux/3.8.2-1~experimental.1


Report forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#700975; Package src:linux. (Tue, 19 Feb 2013 22:39:04 GMT)


Acknowledgement sent to Maik Zumstrull <maik@zumstrull.net>:
New Bug report received and forwarded. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Tue, 19 Feb 2013 22:39:04 GMT)


Message #5 received at submit@bugs.debian.org:

From: Maik Zumstrull <maik@zumstrull.net>
To: Debian Bug Tracking System <submit@bugs.debian.org>
Subject: linux-image-3.7-trunk-amd64: Marvell 88SE9230: Freaks out and drops all disks if sent SMART command during RAID rebuild
Date: Tue, 19 Feb 2013 23:34:32 +0100
Package: src:linux
Version: 3.7.8-1~experimental.1
Severity: normal

The subject says most of it. These are new components: a PCIe card with
the Marvell 88SE9230 chip and 4 exposed SATA 6G ports, with one WD Red
3TB on each port. The disks are fine according to a SMART conveyance test.

Console action:

maik@antares:~/ > sudo mdadm -C linux-data -n 4 -l 5 /dev/sde /dev/sdf /dev/sdg /dev/sdh
mdadm: /dev/sde appears to be part of a raid array:
    level=raid5 devices=4 ctime=Tue Feb 19 22:47:42 2013
mdadm: /dev/sdf appears to be part of a raid array:
    level=raid5 devices=4 ctime=Tue Feb 19 22:47:42 2013
mdadm: /dev/sdg appears to be part of a raid array:
    level=raid5 devices=4 ctime=Tue Feb 19 22:47:42 2013
mdadm: /dev/sdh appears to be part of a raid array:
    level=raid5 devices=4 ctime=Tue Feb 19 22:47:42 2013
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md/linux-data started.
maik@antares:~/ > cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] 
md127 : active raid5 sdh[4] sdg[2] sdf[1] sde[0]
      8790405120 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
      [>....................]  recovery =  0.0% (679936/2930135040) finish=574.4min speed=84992K/sec
      
unused devices: <none>
maik@antares:~/ > sudo smartctl -H /dev/sdg
smartctl 5.43 2012-06-05 r3561 [x86_64-linux-3.7-trunk-amd64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

maik@antares:~/ > sudo smartctl -H /dev/sdg
smartctl 5.43 2012-06-05 r3561 [x86_64-linux-3.7-trunk-amd64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

maik@antares:~/ > sudo smartctl -H /dev/sdg
smartctl 5.43 2012-06-05 r3561 [x86_64-linux-3.7-trunk-amd64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

Error SMART Thresholds Read failed: scsi error aborted command
Smartctl: SMART Read Thresholds failed.

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: UNKNOWN!
SMART Status, Attributes and Thresholds cannot be read.


I've extended the kernel log below so it covers everything from the
moment of RAID creation, including the failing SMART command.
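
For readers following the log, the failing SMART command can be decoded from
the taskfile that libata prints on the "cmd" line. The sketch below assumes
the usual libata field order (command/feature:nsect:lbal:lbam:lbah/...), and
the function and table names are made up for illustration. It only knows the
few opcodes that appear in this report.

```python
import re

# Assumed field order of libata's "cmd" line:
#   command/feature:nsect:lbal:lbam:lbah/hob fields.../device
TASKFILE_RE = re.compile(
    r"cmd (?P<command>[0-9a-f]{2})/"
    r"(?P<feature>[0-9a-f]{2}):(?P<nsect>[0-9a-f]{2}):"
    r"(?P<lbal>[0-9a-f]{2}):(?P<lbam>[0-9a-f]{2}):(?P<lbah>[0-9a-f]{2})/"
)

# Only the ATA opcodes seen in this report's kernel log.
ATA_COMMANDS = {0xB0: "SMART", 0xEA: "FLUSH CACHE EXT", 0xEC: "IDENTIFY DEVICE"}

# A few SMART subcommands, selected through the feature register.
SMART_FEATURES = {
    0xD0: "READ DATA",
    0xD1: "READ ATTRIBUTE THRESHOLDS",
    0xDA: "RETURN STATUS",
}


def decode_taskfile(line):
    """Decode the command (and SMART subcommand) from a libata 'cmd' log line."""
    match = TASKFILE_RE.search(line)
    if match is None:
        raise ValueError("no taskfile found in line")
    regs = {name: int(value, 16) for name, value in match.groupdict().items()}
    result = {"command": ATA_COMMANDS.get(regs["command"], "unknown")}
    if regs["command"] == 0xB0:
        result["subcommand"] = SMART_FEATURES.get(regs["feature"], "unknown")
        # SMART commands carry the signature 0x4F/0xC2 in LBA mid/high.
        result["smart_signature_ok"] = (regs["lbam"], regs["lbah"]) == (0x4F, 0xC2)
    return result


smart_line = "ata9.00: cmd b0/d1:01:01:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in"
flush_line = "ata7.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0"
print(decode_taskfile(smart_line))
print(decode_taskfile(flush_line))
```

Under these assumptions, the command that timed out on ata9 is the SMART
threshold read, which matches the "SMART Read Thresholds failed" message from
smartctl above.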

The problem seems fairly repeatable: so far, every attempt at creating a
RAID has failed this way (because smartd eventually pings the disks).
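
Since the failure only shows up while a rebuild is running, a test script
might want to know whether md is currently recovering before it sends a SMART
command. The following is a rough sketch that parses text in the format of
/proc/mdstat as shown above. The function name is hypothetical, it only
handles arrays named mdN, and it is a diagnostic aid, not a fix.

```python
import re

# Matches the progress line md prints during a rebuild, e.g.
#   [>....]  recovery =  0.0% (679936/2930135040) finish=574.4min ...
PROGRESS_RE = re.compile(
    r"(?P<kind>recovery|resync|reshape|check)\s*=\s*(?P<pct>\d+(?:\.\d+)?)%"
)


def arrays_rebuilding(mdstat_text):
    """Map array name -> (operation, percent done) for arrays with a progress line."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"(md\d+) : ", line)
        if header:
            # Start of a new array stanza.
            current = header.group(1)
            continue
        progress = PROGRESS_RE.search(line)
        if progress and current is not None:
            result[current] = (progress.group("kind"), float(progress.group("pct")))
    return result


# Sample copied from the console output in this report.
SAMPLE = """\
Personalities : [raid6] [raid5] [raid4]
md127 : active raid5 sdh[4] sdg[2] sdf[1] sde[0]
      8790405120 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
      [>....................]  recovery =  0.0% (679936/2930135040) finish=574.4min speed=84992K/sec

unused devices: <none>
"""

print(arrays_rebuilding(SAMPLE))
```

On a live system the input would come from reading /proc/mdstat instead of the
embedded sample.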


-- Package-specific info:
** Version:
Linux version 3.7-trunk-amd64 (debian-kernel@lists.debian.org) (gcc version 4.7.2 (Debian 4.7.2-5) ) #1 SMP Debian 3.7.8-1~experimental.1

** Command line:
BOOT_IMAGE=/vmlinuz-3.7-trunk-amd64 root=UUID=06cc1c4c-d190-49bb-9219-0ccfa22e2661 ro acpi_os=Linux quiet

** Tainted: PO (4097)
 * Proprietary module has been loaded.
 * Out-of-tree module has been loaded.

** Kernel log:
[  122.445618] md: bind<sde>
[  122.445783] md: bind<sdf>
[  122.445952] md: bind<sdg>
[  122.456452] md: bind<sdh>
[  122.459351] async_tx: api initialized (async)
[  122.460092] xor: automatically using best checksumming function:
[  122.499742]    avx       : 11990.000 MB/sec
[  122.567597] raid6: sse2x1    4318 MB/s
[  122.635438] raid6: sse2x2    5419 MB/s
[  122.703278] raid6: sse2x4    6137 MB/s
[  122.703280] raid6: using algorithm sse2x4 (6137 MB/s)
[  122.703282] raid6: using ssse3x2 recovery algorithm
[  122.706977] md: raid6 personality registered for level 6
[  122.706981] md: raid5 personality registered for level 5
[  122.706983] md: raid4 personality registered for level 4
[  122.707309] md/raid:md127: device sdg operational as raid disk 2
[  122.707312] md/raid:md127: device sdf operational as raid disk 1
[  122.707314] md/raid:md127: device sde operational as raid disk 0
[  122.707711] md/raid:md127: allocated 4338kB
[  122.707797] md/raid:md127: raid level 5 active with 3 out of 4 devices, algorithm 2
[  122.707799] RAID conf printout:
[  122.707801]  --- level:5 rd:4 wd:3
[  122.707803]  disk 0, o:1, dev:sde
[  122.707805]  disk 1, o:1, dev:sdf
[  122.707806]  disk 2, o:1, dev:sdg
[  122.707833] md127: detected capacity change from 0 to 9001374842880
[  122.707860] RAID conf printout:
[  122.707865]  --- level:5 rd:4 wd:3
[  122.707868]  disk 0, o:1, dev:sde
[  122.707870]  disk 1, o:1, dev:sdf
[  122.707872]  disk 2, o:1, dev:sdg
[  122.707873]  disk 3, o:1, dev:sdh
[  122.707965] md: recovery of RAID array md127
[  122.707968] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[  122.707970] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
[  122.707973] md: using 128k window, over a total of 2930135040k.
[  122.740153]  md127: unknown partition table
[  180.531641] ata9.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[  180.531648] ata9.00: failed command: SMART
[  180.531655] ata9.00: cmd b0/d1:01:01:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in
[  180.531655]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  180.531658] ata9.00: status: { DRDY }
[  180.531666] ata9: hard resetting link
[  185.887433] ata9: link is slow to respond, please be patient (ready=0)
[  190.524871] ata9: COMRESET failed (errno=-16)
[  190.524877] ata9: hard resetting link
[  195.872694] ata9: link is slow to respond, please be patient (ready=0)
[  200.510134] ata9: COMRESET failed (errno=-16)
[  200.510141] ata9: hard resetting link
[  205.857925] ata9: link is slow to respond, please be patient (ready=0)
[  235.470518] ata9: COMRESET failed (errno=-16)
[  235.470526] ata9: limiting SATA link speed to 3.0 Gbps
[  235.470529] ata9: hard resetting link
[  240.483102] ata9: COMRESET failed (errno=-16)
[  240.483110] ata9: reset failed, giving up
[  240.483112] ata9.00: disabled
[  240.483134] ata9: EH complete
[  240.483185] sd 8:0:0:0: [sdg] Unhandled error code
[  240.483188] sd 8:0:0:0: [sdg]  
[  240.483191] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[  240.483193] sd 8:0:0:0: [sdg] CDB: 
[  240.483195] Read(10): 28 00 00 63 18 00 00 04 00 00
[  240.483203] end_request: I/O error, dev sdg, sector 6494208
[  240.483207] md/raid:md127: read error not correctable (sector 6494208 on sdg).
[  240.483211] md/raid:md127: Disk failure on sdg, disabling device.
[  240.483211] md/raid:md127: Operation continuing on 2 devices.
[  240.483219] md/raid:md127: read error not correctable (sector 6494216 on sdg).
[  240.483222] md/raid:md127: read error not correctable (sector 6494224 on sdg).
[  240.483224] md/raid:md127: read error not correctable (sector 6494232 on sdg).
[  240.483226] md/raid:md127: read error not correctable (sector 6494240 on sdg).
[  240.483229] md/raid:md127: read error not correctable (sector 6494248 on sdg).
[  240.483231] md/raid:md127: read error not correctable (sector 6494256 on sdg).
[  240.483233] md/raid:md127: read error not correctable (sector 6494264 on sdg).
[  240.483235] md/raid:md127: read error not correctable (sector 6494272 on sdg).
[  240.483238] md/raid:md127: read error not correctable (sector 6494280 on sdg).
[  240.483308] sd 8:0:0:0: [sdg] Unhandled error code
[  240.483310] sd 8:0:0:0: [sdg]  
[  240.483312] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[  240.483314] sd 8:0:0:0: [sdg] CDB: 
[  240.483315] Read(10): 28 00 00 63 1c 00 00 04 00 00
[  240.483322] end_request: I/O error, dev sdg, sector 6495232
[  301.216814] ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[  301.216818] ata7.00: failed command: FLUSH CACHE EXT
[  301.216821] ata7.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[  301.216821]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  301.216822] ata7.00: status: { DRDY }
[  301.216827] ata7: hard resetting link
[  301.216842] ata10.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[  301.216845] ata10.00: failed command: FLUSH CACHE EXT
[  301.216849] ata10.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[  301.216849]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  301.216851] ata10.00: status: { DRDY }
[  301.216855] ata10: hard resetting link
[  301.216861] ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[  301.216864] ata8.00: failed command: FLUSH CACHE EXT
[  301.216868] ata8.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[  301.216868]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  301.216870] ata8.00: status: { DRDY }
[  301.216874] ata8: hard resetting link
[  302.038921] ata10: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  302.038945] ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  302.038966] ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  307.027548] ata7.00: qc timeout (cmd 0xec)
[  307.027562] ata8.00: qc timeout (cmd 0xec)
[  307.027574] ata10.00: qc timeout (cmd 0xec)
[  307.530401] ata8.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[  307.530406] ata8.00: revalidation failed (errno=-5)
[  307.530413] ata8: hard resetting link
[  307.530423] ata7.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[  307.530428] ata7.00: revalidation failed (errno=-5)
[  307.530433] ata7: hard resetting link
[  307.530442] ata10.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[  307.530446] ata10.00: revalidation failed (errno=-5)
[  307.530451] ata10: hard resetting link
[  308.352538] ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  308.352559] ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  308.352580] ata10: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  318.329827] ata8.00: qc timeout (cmd 0xec)
[  318.329840] ata7.00: qc timeout (cmd 0xec)
[  318.329851] ata10.00: qc timeout (cmd 0xec)
[  318.832656] ata7.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[  318.832661] ata7.00: revalidation failed (errno=-5)
[  318.832666] ata7: limiting SATA link speed to 3.0 Gbps
[  318.832671] ata7: hard resetting link
[  318.832681] ata8.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[  318.832685] ata8.00: revalidation failed (errno=-5)
[  318.832689] ata8: limiting SATA link speed to 3.0 Gbps
[  318.832693] ata8: hard resetting link
[  318.832703] ata10.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[  318.832707] ata10.00: revalidation failed (errno=-5)
[  318.832711] ata10: limiting SATA link speed to 3.0 Gbps
[  318.832715] ata10: hard resetting link
[  319.654789] ata10: SATA link up 6.0 Gbps (SStatus 133 SControl 320)
[  319.654811] ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 320)
[  319.654831] ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 320)
[  349.586621] ata7.00: qc timeout (cmd 0xec)
[  349.586636] ata10.00: qc timeout (cmd 0xec)
[  349.586656] ata8.00: qc timeout (cmd 0xec)
[  350.089487] ata8.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[  350.089492] ata8.00: revalidation failed (errno=-5)
[  350.089496] ata8.00: disabled
[  350.089506] ata8.00: device reported invalid CHS sector 0
[  350.089517] ata7.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[  350.089521] ata7.00: revalidation failed (errno=-5)
[  350.089524] ata7.00: disabled
[  350.089532] ata7.00: device reported invalid CHS sector 0
[  350.089543] ata10.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[  350.089548] ata10.00: revalidation failed (errno=-5)
[  350.089551] ata10.00: disabled
[  350.089559] ata10.00: device reported invalid CHS sector 0
[  350.592344] ata10: hard resetting link
[  350.592359] ata7: hard resetting link
[  350.592373] ata8: hard resetting link
[  351.414474] ata10: SATA link up 6.0 Gbps (SStatus 133 SControl 320)
[  351.414496] ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 320)
[  351.414517] ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 320)
[  351.917332] ata8: EH complete
[  351.917338] ata7: EH complete
[  351.917352] ata10: EH complete
[  351.917370] sd 7:0:0:0: [sdf] Unhandled error code
[  351.917372] sd 6:0:0:0: [sde] Unhandled error code
[  351.917375] sd 6:0:0:0: [sde]  
[  351.917377] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[  351.917379] sd 6:0:0:0: [sde] CDB: 
[  351.917385] sd 9:0:0:0: [sdh] Unhandled error code
[  351.917388] sd 7:0:0:0: [sdf]  
[  351.917389] sd 9:0:0:0: [sdh]  
[  351.917391] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[  351.917392] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[  351.917394] sd 7:0:0:0: [sdf] CDB: 
[  351.917395] sd 9:0:0:0: [sdh] CDB: 
[  351.917380] Write(10): 2a
[  351.917396] Write(10)Write(10):: 2a 2a 00 00 00 00 00 00 00 00 08 08 00 00 00 00 01 01 00 00
[  351.917418] 
[  351.917420] end_request: I/O error, dev sdf, sector 8
[  351.917421] end_request: I/O error, dev sdh, sector 8
[  351.917423] end_request: I/O error, dev sdf, sector 8
[  351.917424] end_request: I/O error, dev sdh, sector 8
[  351.917426] md: super_written gets error=-5, uptodate=0
[  351.917427] md: super_written gets error=-5, uptodate=0
[  351.917430] md/raid:md127: Disk failure on sdf, disabling device.
[  351.917430] md/raid:md127: Operation continuing on 1 devices.
[  351.917432] md/raid:md127: Disk failure on sdh, disabling device.
[  351.917432] md/raid:md127: Operation continuing on 1 devices.
[  351.917444]  00 00 00 00 08 00 00 01 00
[  351.917451] end_request: I/O error, dev sde, sector 8
[  351.917454] end_request: I/O error, dev sde, sector 8
[  351.917456] md: super_written gets error=-5, uptodate=0
[  351.917459] md/raid:md127: Disk failure on sde, disabling device.
[  351.917459] md/raid:md127: Operation continuing on 0 devices.
[  351.921299] md: md127: recovery done.
[  351.921543] RAID conf printout:
[  351.921548]  --- level:5 rd:4 wd:0
[  351.921550]  disk 0, o:0, dev:sde
[  351.921552]  disk 1, o:0, dev:sdf
[  351.921553]  disk 2, o:0, dev:sdg
[  351.921555]  disk 3, o:0, dev:sdh
[  351.925329] RAID conf printout:
[  351.925333]  --- level:5 rd:4 wd:0
[  351.925336]  disk 0, o:0, dev:sde
[  351.925338]  disk 1, o:0, dev:sdf
[  351.925340]  disk 2, o:0, dev:sdg
[  351.925346] RAID conf printout:
[  351.925347]  --- level:5 rd:4 wd:0
[  351.925349]  disk 0, o:0, dev:sde
[  351.925351]  disk 1, o:0, dev:sdf
[  351.925352]  disk 2, o:0, dev:sdg
[  351.929384] RAID conf printout:
[  351.929389]  --- level:5 rd:4 wd:0
[  351.929392]  disk 0, o:0, dev:sde
[  351.929394]  disk 1, o:0, dev:sdf
[  351.929398] RAID conf printout:
[  351.929399]  --- level:5 rd:4 wd:0
[  351.929401]  disk 0, o:0, dev:sde
[  351.929403]  disk 1, o:0, dev:sdf
[  351.937283] RAID conf printout:
[  351.937288]  --- level:5 rd:4 wd:0
[  351.937291]  disk 0, o:0, dev:sde
[  351.937295] RAID conf printout:
[  351.937297]  --- level:5 rd:4 wd:0
[  351.937299]  disk 0, o:0, dev:sde
[  351.948780] RAID conf printout:
[  351.948785]  --- level:5 rd:4 wd:0
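
To make the order of events in the log above easier to see, the "disabled"
lines can be pulled out together with their timestamps. This is a small
sketch with an invented helper name. The sample lines are copied from the log
above, and the regular expression assumes the standard dmesg timestamp prefix.

```python
import re

# Matches e.g. "[  240.483112] ata9.00: disabled"
DISABLED_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<dev>ata\d+\.\d+): disabled"
)


def disabled_timeline(log_text):
    """Return (timestamp_seconds, device) tuples for each 'disabled' event, in log order."""
    return [
        (float(m.group("ts")), m.group("dev"))
        for m in DISABLED_RE.finditer(log_text)
    ]


# Lines copied from the kernel log in this report.
SAMPLE = """\
[  240.483110] ata9: reset failed, giving up
[  240.483112] ata9.00: disabled
[  350.089496] ata8.00: disabled
[  350.089524] ata7.00: disabled
[  350.089551] ata10.00: disabled
"""

events = disabled_timeline(SAMPLE)
first_ts = events[0][0]
for ts, dev in events:
    print(f"{dev}: disabled {ts - first_ts:.1f}s after the first drop")
```

The device that received the SMART command (ata9, sdg) is dropped first. The
other three ports on the controller follow roughly 110 seconds later, after
their FLUSH CACHE EXT and IDENTIFY commands time out.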

** Model information
sys_vendor: System manufacturer
product_name: System Product Name
product_version: System Version
chassis_vendor: Chassis Manufacture
chassis_version: Chassis Version
bios_vendor: American Megatrends Inc.
bios_version: 3202
board_vendor: ASUSTeK COMPUTER INC.
board_name: P8C WS
board_version: Rev 1.xx

** Loaded modules:
raid456
async_raid6_recov
async_memcpy
async_pq
raid6_pq
async_xor
xor
async_tx
fuse
parport_pc
ppdev
lp
parport
bnep
rfcomm
bluetooth
cpufreq_stats
crc16
cpufreq_userspace
cpufreq_powersave
cpufreq_conservative
uinput
nls_utf8
nls_cp437
vfat
fat
ecryptfs
md_mod
snd_hda_codec_hdmi
joydev
snd_usb_audio
snd_usbmidi_lib
snd_seq_midi
snd_seq_midi_event
uvcvideo
videobuf2_vmalloc
videobuf2_memops
snd_rawmidi
videobuf2_core
videodev
media
snd_hda_codec_realtek
iTCO_wdt
iTCO_vendor_support
eeepc_wmi
asus_wmi
sparse_keymap
rfkill
evdev
coretemp
kvm_intel
snd_hda_intel
snd_hda_codec
fglrx(PO)
psmouse
snd_hwdep
kvm
snd_pcm
snd_page_alloc
i2c_i801
snd_seq
pcspkr
serio_raw
snd_seq_device
snd_timer
lpc_ich
i2c_core
mfd_core
snd
soundcore
mei
acpi_cpufreq
mperf
wmi
video
button
processor
xfs
btrfs
libcrc32c
zlib_deflate
sha256_generic
dm_crypt
efivars
dm_mirror
dm_region_hash
dm_log
hid_cherry
hid_generic
usbhid
usb_storage
hid
dm_mod
sg
sd_mod
crc_t10dif
crc32c_intel
ghash_clmulni_intel
aesni_intel
aes_x86_64
ablk_helper
cryptd
xts
lrw
gf128mul
microcode
fan
thermal
thermal_sys
ahci
libahci
e1000e
ehci_hcd
xhci_hcd
libata
scsi_mod
usbcore
usb_common

** PCI devices:
00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v2/Ivy Bridge DRAM Controller [8086:0158] (rev 09)
	Subsystem: ASUSTeK Computer Inc. Device [1043:852f]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
	Latency: 0
	Capabilities: <access denied>

00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor PCI Express Root Port [8086:0151] (rev 09) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
	I/O behind bridge: 0000e000-0000efff
	Memory behind bridge: e0000000-f00fffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA- VGA+ MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: <access denied>
	Kernel driver in use: pcieport

00:06.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor PCI Express Root Port [8086:015d] (rev 09) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
	I/O behind bridge: 0000d000-0000dfff
	Memory behind bridge: f0300000-f03fffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: <access denied>
	Kernel driver in use: pcieport

00:14.0 USB controller [0c03]: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller [8086:1e31] (rev 04) (prog-if 30 [XHCI])
	Subsystem: ASUSTeK Computer Inc. Device [1043:852f]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 43
	Region 0: Memory at f0400000 (64-bit, non-prefetchable) [size=64K]
	Capabilities: <access denied>
	Kernel driver in use: xhci_hcd

00:16.0 Communication controller [0780]: Intel Corporation 7 Series/C210 Series Chipset Family MEI Controller #1 [8086:1e3a] (rev 04)
	Subsystem: ASUSTeK Computer Inc. Device [1043:852f]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 52
	Region 0: Memory at f041b000 (64-bit, non-prefetchable) [size=16]
	Capabilities: <access denied>
	Kernel driver in use: mei

00:1a.0 USB controller [0c03]: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #2 [8086:1e2d] (rev 04) (prog-if 20 [EHCI])
	Subsystem: ASUSTeK Computer Inc. Device [1043:852f]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 16
	Region 0: Memory at f0418000 (32-bit, non-prefetchable) [size=1K]
	Capabilities: <access denied>
	Kernel driver in use: ehci_hcd

00:1b.0 Audio device [0403]: Intel Corporation 7 Series/C210 Series Chipset Family High Definition Audio Controller [8086:1e20] (rev 04)
	Subsystem: ASUSTeK Computer Inc. Device [1043:84fb]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 53
	Region 0: Memory at f0410000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: snd_hda_intel

00:1c.0 PCI bridge [0604]: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 1 [8086:1e10] (rev c4) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: <access denied>
	Kernel driver in use: pcieport

00:1c.5 PCI bridge [0604]: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 6 [8086:1e1a] (rev c4) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Bus: primary=00, secondary=04, subordinate=04, sec-latency=0
	I/O behind bridge: 0000c000-0000cfff
	Memory behind bridge: f0200000-f02fffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: <access denied>
	Kernel driver in use: pcieport

00:1c.6 PCI bridge [0604]: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 7 [8086:1e1c] (rev c4) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Bus: primary=00, secondary=05, subordinate=05, sec-latency=0
	I/O behind bridge: 0000b000-0000bfff
	Memory behind bridge: f0100000-f01fffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: <access denied>
	Kernel driver in use: pcieport

00:1d.0 USB controller [0c03]: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #1 [8086:1e26] (rev 04) (prog-if 20 [EHCI])
	Subsystem: ASUSTeK Computer Inc. Device [1043:852f]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 23
	Region 0: Memory at f0417000 (32-bit, non-prefetchable) [size=1K]
	Capabilities: <access denied>
	Kernel driver in use: ehci_hcd

00:1e.0 PCI bridge [0604]: Intel Corporation 82801 PCI Bridge [8086:244e] (rev a4) (prog-if 01 [Subtractive decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Bus: primary=00, secondary=06, subordinate=06, sec-latency=32
	Secondary status: 66MHz- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: <access denied>

00:1f.0 ISA bridge [0601]: Intel Corporation C216 Series Chipset LPC Controller [8086:1e53] (rev 04)
	Subsystem: ASUSTeK Computer Inc. Device [1043:852f]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Capabilities: <access denied>
	Kernel driver in use: lpc_ich

00:1f.2 SATA controller [0106]: Intel Corporation 7 Series/C210 Series Chipset Family 6-port SATA Controller [AHCI mode] [8086:1e02] (rev 04) (prog-if 01 [AHCI 1.0])
	Subsystem: ASUSTeK Computer Inc. Device [1043:852f]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin B routed to IRQ 47
	Region 0: I/O ports at f070 [size=8]
	Region 1: I/O ports at f060 [size=4]
	Region 2: I/O ports at f050 [size=8]
	Region 3: I/O ports at f040 [size=4]
	Region 4: I/O ports at f020 [size=32]
	Region 5: Memory at f0416000 (32-bit, non-prefetchable) [size=2K]
	Capabilities: <access denied>
	Kernel driver in use: ahci

00:1f.3 SMBus [0c05]: Intel Corporation 7 Series/C210 Series Chipset Family SMBus Controller [8086:1e22] (rev 04)
	Subsystem: ASUSTeK Computer Inc. Device [1043:852f]
	Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin C routed to IRQ 18
	Region 0: Memory at f0415000 (64-bit, non-prefetchable) [size=256]
	Region 4: I/O ports at f000 [size=32]

01:00.0 VGA compatible controller [0300]: Advanced Micro Devices [AMD] nee ATI Cape Verde [Radeon HD 7700 Series] [1002:683d] (prog-if 00 [VGA controller])
	Subsystem: ASUSTeK Computer Inc. Device [1043:042b]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 55
	Region 0: Memory at e0000000 (64-bit, prefetchable) [size=256M]
	Region 2: Memory at f0000000 (64-bit, non-prefetchable) [size=256K]
	Region 4: I/O ports at e000 [size=256]
	Expansion ROM at f0040000 [disabled] [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: fglrx_pci

01:00.1 Audio device [0403]: Advanced Micro Devices [AMD] nee ATI Device [1002:aab0]
	Subsystem: ASUSTeK Computer Inc. Device [1043:aab0]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin B routed to IRQ 54
	Region 0: Memory at f0060000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: snd_hda_intel

02:00.0 SATA controller [0106]: Marvell Technology Group Ltd. 88SE9230 PCIe SATA 6Gb/s Controller [1b4b:9230] (rev 10) (prog-if 01 [AHCI 1.0])
	Subsystem: Marvell Technology Group Ltd. 88SE9230 PCIe SATA 6Gb/s Controller [1b4b:9230]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 48
	Region 0: I/O ports at d050 [size=8]
	Region 1: I/O ports at d040 [size=4]
	Region 2: I/O ports at d030 [size=8]
	Region 3: I/O ports at d020 [size=4]
	Region 4: I/O ports at d000 [size=32]
	Region 5: Memory at f0310000 (32-bit, non-prefetchable) [size=2K]
	Expansion ROM at f0300000 [disabled] [size=64K]
	Capabilities: <access denied>
	Kernel driver in use: ahci

04:00.0 Ethernet controller [0200]: Intel Corporation 82574L Gigabit Network Connection [8086:10d3]
	Subsystem: ASUSTeK Computer Inc. Device [1043:8369]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 17
	Region 0: Memory at f0200000 (32-bit, non-prefetchable) [size=128K]
	Region 2: I/O ports at c000 [size=32]
	Region 3: Memory at f0220000 (32-bit, non-prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: e1000e

05:00.0 Ethernet controller [0200]: Intel Corporation 82574L Gigabit Network Connection [8086:10d3]
	Subsystem: ASUSTeK Computer Inc. Device [1043:8369]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 18
	Region 0: Memory at f0100000 (32-bit, non-prefetchable) [size=128K]
	Region 2: I/O ports at b000 [size=32]
	Region 3: Memory at f0120000 (32-bit, non-prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: e1000e


** USB devices:
Bus 003 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
Bus 004 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 004 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 003 Device 003: ID 0424:2514 Standard Microsystems Corp. USB 2.0 Hub
Bus 003 Device 004: ID 051d:0002 American Power Conversion Uninterruptible Power Supply
Bus 003 Device 005: ID 090c:1000 Silicon Motion, Inc. - Taiwan (formerly Feiya Technology Corp.) 64MB QDI U2 DISK
Bus 003 Device 006: ID 2433:b111  
Bus 004 Device 003: ID 0424:2514 Standard Microsystems Corp. USB 2.0 Hub
Bus 004 Device 004: ID 046a:0023 Cherry GmbH CyMotion Master Linux Keyboard
Bus 004 Device 005: ID 046d:c03d Logitech, Inc. M-BT96a Pilot Optical Mouse
Bus 003 Device 007: ID 0424:2640 Standard Microsystems Corp. USB 2.0 Hub
Bus 003 Device 008: ID 046d:081d Logitech, Inc. HD Webcam C510
Bus 003 Device 009: ID 0424:4064 Standard Microsystems Corp. Ultra Fast Media Reader


-- System Information:
Debian Release: 7.0
  APT prefers testing
  APT policy: (500, 'testing'), (400, 'unstable'), (300, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.7-trunk-amd64 (SMP w/4 CPU cores)
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages linux-image-3.7-trunk-amd64 depends on:
ii  debconf [debconf-2.0]                   1.5.49
ii  initramfs-tools [linux-initramfs-tool]  0.109
ii  kmod                                    9-2
ii  linux-base                              3.5
ii  module-init-tools                       9-2

Versions of packages linux-image-3.7-trunk-amd64 recommends:
ii  firmware-linux-free  3.2

Versions of packages linux-image-3.7-trunk-amd64 suggests:
pn  debian-kernel-handbook     <none>
pn  grub-pc | extlinux | lilo  <none>
ii  linux-doc-3.7              3.7.8-1~experimental.1

Versions of packages linux-image-3.7-trunk-amd64 is related to:
pn  firmware-atheros        <none>
pn  firmware-bnx2           <none>
pn  firmware-bnx2x          <none>
pn  firmware-brcm80211      <none>
pn  firmware-intelwimax     <none>
pn  firmware-ipw2x00        <none>
pn  firmware-ivtv           <none>
pn  firmware-iwlwifi        <none>
pn  firmware-libertas       <none>
ii  firmware-linux          0.37
ii  firmware-linux-nonfree  0.37
pn  firmware-myricom        <none>
pn  firmware-netxen         <none>
pn  firmware-qlogic         <none>
pn  firmware-ralink         <none>
pn  firmware-realtek        <none>
pn  xen-hypervisor          <none>

-- debconf information:
  linux-image-3.7-trunk-amd64/postinst/ignoring-ramdisk:
  linux-image-3.7-trunk-amd64/postinst/depmod-error-initrd-3.7-trunk-amd64: false
  linux-image-3.7-trunk-amd64/prerm/removing-running-kernel-3.7-trunk-amd64: true
  linux-image-3.7-trunk-amd64/postinst/missing-firmware-3.7-trunk-amd64:



Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#700975; Package src:linux. (Sun, 24 Feb 2013 19:15:07 GMT) (full text, mbox, link).


Acknowledgement sent to Maik Zumstrull <maik@zumstrull.net>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Sun, 24 Feb 2013 19:15:07 GMT) (full text, mbox, link).


Message #10 received at 700975@bugs.debian.org (full text, mbox, reply):

From: Maik Zumstrull <maik@zumstrull.net>
To: 700975@bugs.debian.org
Subject: Happens with wheezy kernel
Date: Sun, 24 Feb 2013 20:10:30 +0100
Tried building an array with linux-image-3.2.0-4-amd64 / 3.2.35-2, same issue.



Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#700975; Package src:linux. (Tue, 12 Mar 2013 22:03:03 GMT) (full text, mbox, link).


Acknowledgement sent to Maik Zumstrull <maik@zumstrull.net>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Tue, 12 Mar 2013 22:03:03 GMT) (full text, mbox, link).


Message #15 received at 700975@bugs.debian.org (full text, mbox, reply):

From: Maik Zumstrull <maik@zumstrull.net>
To: 700975@bugs.debian.org
Subject: Also happens on 3.8
Date: Tue, 12 Mar 2013 23:01:14 +0100
Just tried with 3.8.2, fresh from experimental.

Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#700975; Package src:linux. (Thu, 04 Apr 2013 19:18:04 GMT) (full text, mbox, link).


Acknowledgement sent to Maik Zumstrull <maik@zumstrull.net>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Thu, 04 Apr 2013 19:18:04 GMT) (full text, mbox, link).


Message #20 received at 700975@bugs.debian.org (full text, mbox, reply):

From: Maik Zumstrull <maik@zumstrull.net>
To: 700975@bugs.debian.org
Cc: Ben Hutchings <ben@decadent.org.uk>
Subject: RAID is still barely usable
Date: Thu, 4 Apr 2013 21:16:02 +0200
Any way to get some attention on this bug? Unless I'm missing
something, it seems to make a fairly major feature (md) nearly
unusable (for me), with issues of potential data loss.

Maybe my hardware is just too exotic? But as far as I can tell, this
is a popular AHCI chip, and those are popular disks.



Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#700975; Package src:linux. (Thu, 04 Apr 2013 19:36:09 GMT) (full text, mbox, link).


Acknowledgement sent to Jonathan Nieder <jrnieder@gmail.com>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Thu, 04 Apr 2013 19:36:09 GMT) (full text, mbox, link).


Message #25 received at 700975@bugs.debian.org (full text, mbox, reply):

From: Jonathan Nieder <jrnieder@gmail.com>
To: Maik Zumstrull <maik@zumstrull.net>
Cc: 700975@bugs.debian.org, Ben Hutchings <ben@decadent.org.uk>
Subject: Re: Marvell 88SE9230: Freaks out and drops all disks if sent SMART command during RAID rebuild
Date: Thu, 4 Apr 2013 12:34:23 -0700
found 700975 linux/3.2.35-2 , linux/3.8.2-1~experimental.1
quit

Hi Maik,

Maik Zumstrull wrote:

> Any way to get some attention on this bug?

Sorry for the slow reply.

I missed your message "Also happens on 3.8", probably because the
subject didn't make it stand out in the inbox.  Sorry about that.

Please send a summary of symptoms to linux-raid@vger.kernel.org,
cc-ing either me or this bug log so we can track it.  Be sure to
mention:

 * steps to reproduce, expected result, actual result, and how the
   difference indicates a bug (should be simple enough here)

 * which kernel versions you have tried and what happens with each

 * any relevant kernel log messages and when they occur relative to
   the steps used to reproduce the problem

 * relevant hardware information (lspci output for the SATA controller,
   full "dmesg" output from booting as an attachment)
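
As a rough illustration of the hardware item above, here is a minimal
Python sketch that picks the SATA controller lines out of `lspci -nn`
output. The function name `sata_controllers` and the regular expression
are assumptions made for this example, not part of any existing tool; the
line layout is assumed to match the bracketed-ID lspci listing earlier in
this report.

```python
import re

# Matches one `lspci -nn` device line, e.g.
# "05:00.0 Ethernet controller [0200]: Intel Corporation ... [8086:10d3]"
LSPCI_LINE = re.compile(
    r"^(?P<slot>\S+) (?P<cls>.+?) \[(?P<cls_id>[0-9a-f]{4})\]: "
    r"(?P<name>.+?) \[(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})\]"
)


def sata_controllers(lspci_nn_output):
    """Return (slot, name, 'vendor:device') for each SATA controller line."""
    found = []
    for line in lspci_nn_output.splitlines():
        m = LSPCI_LINE.match(line)
        # PCI class 0106 is mass storage, SATA subclass.
        if m and m.group("cls_id") == "0106":
            ids = m.group("vendor") + ":" + m.group("device")
            found.append((m.group("slot"), m.group("name"), ids))
    return found
```

Feeding it the text of `lspci -nn` would leave only the controller entry,
including the [vendor:device] pair that makes two reports comparable.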

The message

| [  240.483211] md/raid:md127: Disk failure on sdg, disabling device.

does not make this sound good.  But hopefully the md/raid folks might
have ideas for tracking the problem down further.

Hope that helps,
Jonathan



Marked as found in versions linux/3.2.35-2 and linux/3.8.2-1~experimental.1. Request was from Jonathan Nieder <jrnieder@gmail.com> to control@bugs.debian.org. (Thu, 04 Apr 2013 19:36:12 GMT) (full text, mbox, link).


Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#700975; Package src:linux. (Thu, 04 Apr 2013 19:42:04 GMT) (full text, mbox, link).


Acknowledgement sent to Maik Zumstrull <maik@zumstrull.net>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Thu, 04 Apr 2013 19:42:04 GMT) (full text, mbox, link).


Message #32 received at 700975@bugs.debian.org (full text, mbox, reply):

From: Maik Zumstrull <maik@zumstrull.net>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: 700975@bugs.debian.org, Ben Hutchings <ben@decadent.org.uk>
Subject: Re: Marvell 88SE9230: Freaks out and drops all disks if sent SMART command during RAID rebuild
Date: Thu, 4 Apr 2013 21:39:48 +0200
On Thu, Apr 4, 2013 at 9:34 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:

> Please send a summary of symptoms to linux-raid@vger.kernel.org,
> cc-ing either me or this bug log so we can track it.

Will do. FWIW I think the underlying issue is likely in the SATA code,
with RAID just providing the tons of background I/O load needed to
trigger the issue. But I'm happy to start on that end.



Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#700975; Package src:linux. (Thu, 04 Apr 2013 19:48:04 GMT) (full text, mbox, link).


Acknowledgement sent to Jonathan Nieder <jrnieder@gmail.com>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Thu, 04 Apr 2013 19:48:04 GMT) (full text, mbox, link).


Message #37 received at 700975@bugs.debian.org (full text, mbox, reply):

From: Jonathan Nieder <jrnieder@gmail.com>
To: Maik Zumstrull <maik@zumstrull.net>
Cc: 700975@bugs.debian.org, Ben Hutchings <ben@decadent.org.uk>
Subject: Re: Marvell 88SE9230: Freaks out and drops all disks if sent SMART command during RAID rebuild
Date: Thu, 4 Apr 2013 12:44:42 -0700
Maik Zumstrull wrote:

>          FWIW I think the underlying issue is likely in the SATA code,
> with RAID just providing the tons of background I/O load needed to
> trigger the issue.

Yeah, makes sense.  Also feel free to cc linux-ide@vger.kernel.org to
give the ahci driver devs a chance to chime in.

Thanks,
Jonathan



Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#700975; Package src:linux. (Thu, 04 Apr 2013 20:15:11 GMT) (full text, mbox, link).


Acknowledgement sent to Maik Zumstrull <maik@zumstrull.net>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Thu, 04 Apr 2013 20:15:11 GMT) (full text, mbox, link).


Message #42 received at 700975@bugs.debian.org (full text, mbox, reply):

From: Maik Zumstrull <maik@zumstrull.net>
To: linux-raid@vger.kernel.org, linux-ide@vger.kernel.org
Cc: 700975@bugs.debian.org
Subject: RAID barely usable on my home machine
Date: Thu, 4 Apr 2013 22:13:05 +0200
Hello Linux RAID and ATA people,

I've managed to find a configuration on my home desktop where a
particular RAID array is barely usable.

You can find my initial report at:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=700975

In summary:

- I create an array across four disks on a Marvell AHCI controller,
which automatically goes into rebuild mode.
- Somebody (e.g. smartd or udisks2 or me, testing) sends a SMART
command to one of the disks.
- The SMART command fails.
- The ATA subsystem freaks out all over the place, until eventually
none of the disks on that controller are responsive.
- The array is dead until reboot. (Curiously, without data loss so
far. Kudos on the RAID code, I guess.)

I've found the issue to be highly reproducible so far. Things mostly
work if the array is not under heavy load (not rebuilding, no big file
copies going on) or I make completely sure nothing sends SMART
commands. I currently do keep real files on that array, but backed-up
ones, so I could wipe it for more tests if really necessary.

I've tried various kernels from Debian (3.2, 3.7, and 3.8 series) and
found them all affected.

Here are some edited excerpts from the kernel log messages found in
the Debian bug; see the unedited transcript there.

Getting our RAID on:

[  122.707833] md127: detected capacity change from 0 to 9001374842880
[  122.707860] RAID conf printout:
[  122.707865]  --- level:5 rd:4 wd:3
[  122.707868]  disk 0, o:1, dev:sde
[  122.707870]  disk 1, o:1, dev:sdf
[  122.707872]  disk 2, o:1, dev:sdg
[  122.707873]  disk 3, o:1, dev:sdh
[  122.707965] md: recovery of RAID array md127
[  122.707968] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[  122.707970] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
[  122.707973] md: using 128k window, over a total of 2930135040k.
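
The numbers in this excerpt can be cross-checked with a little
arithmetic. A minimal sketch, assuming md's "k" means KiB (1024 bytes)
and that RAID5 gives up one member's worth of space to parity; the
function name is made up for this example:

```python
def raid5_capacity_bytes(n_devices, per_device_kib):
    """Usable bytes of a RAID5 array: one member's worth of space holds parity."""
    return (n_devices - 1) * per_device_kib * 1024


# Values taken from the log above: 4 devices, 2930135040k each.
print(raid5_capacity_bytes(4, 2930135040))  # 9001374842880
```

The result matches the "detected capacity change" line, so the members
are being used at their full size. The "rd:4 wd:3" line presumably
reflects an array that starts with three working members and rebuilds
onto the fourth, which would explain why recovery begins right after
creation.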

We see a SMART we don't like:

[  180.531641] ata9.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[  180.531648] ata9.00: failed command: SMART
[  180.531655] ata9.00: cmd b0/d1:01:01:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in
[  180.531655]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  180.531658] ata9.00: status: { DRDY }
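
For readers unfamiliar with these lines: the "cmd" field is the ATA
taskfile that timed out. A minimal decoding sketch, assuming the field
order command/feature:nsect:lbal:lbam:lbah used by libata's error
messages. The lookup tables are deliberately tiny and cover only codes
relevant to this report, and `decode_cmd` is a name invented here:

```python
import re

# Field order assumed from libata's error-handler message format:
# command/feature:nsect:lbal:lbam:lbah/<five "hob" bytes>/device
CMD_RE = re.compile(
    r"cmd ([0-9a-f]{2})/([0-9a-f]{2}):([0-9a-f]{2}):([0-9a-f]{2}):"
    r"([0-9a-f]{2}):([0-9a-f]{2})/"
)

# Small lookup tables covering only what appears in this report.
COMMANDS = {0xB0: "SMART", 0xEA: "FLUSH CACHE EXT"}
SMART_FEATURES = {0xD0: "READ DATA", 0xD1: "READ THRESHOLDS"}


def decode_cmd(line):
    """Decode the first six taskfile bytes of a libata 'cmd' log line."""
    m = CMD_RE.search(line)
    if m is None:
        return None
    command, feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in m.groups())
    info = {"command": COMMANDS.get(command, hex(command)), "feature": hex(feature)}
    if command == 0xB0:
        info["feature"] = SMART_FEATURES.get(feature, hex(feature))
        # SMART commands carry the fixed signature 0x4F/0xC2 in LBA mid/high.
        info["smart_signature_ok"] = (lbam, lbah) == (0x4F, 0xC2)
    return info
```

Applied to the line above, this reports a SMART command with the READ
THRESHOLDS subcommand and a valid SMART signature, i.e. a plain
read-only query rather than anything exotic.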

Whoops, a non-critical command failed? Best shoot the controller in the
face until it stops twitching:

[  180.531666] ata9: hard resetting link
[  185.887433] ata9: link is slow to respond, please be patient (ready=0)
[  190.524871] ata9: COMRESET failed (errno=-16)
[  190.524877] ata9: hard resetting link
[  195.872694] ata9: link is slow to respond, please be patient (ready=0)
[  200.510134] ata9: COMRESET failed (errno=-16)
[  200.510141] ata9: hard resetting link
[  205.857925] ata9: link is slow to respond, please be patient (ready=0)
[  235.470518] ata9: COMRESET failed (errno=-16)
[  235.470526] ata9: limiting SATA link speed to 3.0 Gbps
[  235.470529] ata9: hard resetting link
[  240.483102] ata9: COMRESET failed (errno=-16)
[  240.483110] ata9: reset failed, giving up
[  240.483112] ata9.00: disabled
[  240.483134] ata9: EH complete

So now other stuff goes wrong:

[  301.216814] ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[  301.216818] ata7.00: failed command: FLUSH CACHE EXT
[  301.216821] ata7.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[  301.216821]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  301.216822] ata7.00: status: { DRDY }
[  301.216827] ata7: hard resetting link
[  301.216842] ata10.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[  301.216845] ata10.00: failed command: FLUSH CACHE EXT
[  301.216849] ata10.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[  301.216849]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  301.216851] ata10.00: status: { DRDY }
[  301.216855] ata10: hard resetting link
[  301.216861] ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[  301.216864] ata8.00: failed command: FLUSH CACHE EXT
[  301.216868] ata8.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[  301.216868]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  301.216870] ata8.00: status: { DRDY }

Until eventually, the patient's dead…so let's report success:

[  351.917459] md/raid:md127: Disk failure on sde, disabling device.
[  351.917459] md/raid:md127: Operation continuing on 0 devices.
[  351.921299] md: md127: recovery done.

This is on a cheapo PCIe extension board with four internal SATA3
ports. Chip is a "Marvell Technology Group Ltd. 88SE9230 PCIe SATA
6Gb/s Controller [1b4b:9230]" using the ahci driver.

It would be really good to see this fixed. I see two issues:
- That SMART command probably shouldn't fail. Weird drive firmware?
Timeout too tight?
- A failing SMART command should probably not trigger a breakdown of
the whole controller. At least, not such a messy one.

I'll make myself available, as time allows, to provide requested
additional information.



Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#700975; Package src:linux. (Fri, 05 Apr 2013 00:12:04 GMT) (full text, mbox, link).


Acknowledgement sent to Roger Heflin <rogerheflin@gmail.com>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Fri, 05 Apr 2013 00:12:04 GMT) (full text, mbox, link).


Message #47 received at 700975@bugs.debian.org (full text, mbox, reply):

From: Roger Heflin <rogerheflin@gmail.com>
To: Maik Zumstrull <maik@zumstrull.net>
Cc: Linux RAID <linux-raid@vger.kernel.org>, linux-ide@vger.kernel.org, 700975@bugs.debian.org
Subject: Re: RAID barely usable on my home machine
Date: Thu, 4 Apr 2013 19:08:13 -0500
lspci look like this for the controller:
SATA controller: Marvell Technology Group Ltd. Device 9230 (rev 10)

4pt sata3.0 6gbit or is yours a different one?

I have the issue also, I have eliminated all smart hits against the disks
and no incidents since then.

It does appear to be load related: if the controller is being hit hard and
a SMART command comes along, sometimes the controller loses its mind and
all of the disks stop responding.

I have seagate 1.5tb drives on mine that had had the issues.

I am using 3.7.10...it also has the issue.

A reboot is the only thing that clears it, and I have gotten pretty good at
forcing the RAID back online when this happens.





Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#700975; Package src:linux. (Fri, 05 Apr 2013 00:21:04 GMT) (full text, mbox, link).


Acknowledgement sent to Roger Heflin <rogerheflin@gmail.com>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Fri, 05 Apr 2013 00:21:04 GMT) (full text, mbox, link).


Message #52 received at 700975@bugs.debian.org (full text, mbox, reply):

From: Roger Heflin <rogerheflin@gmail.com>
To: Maik Zumstrull <maik@zumstrull.net>
Cc: Linux RAID <linux-raid@vger.kernel.org>, linux-ide <linux-ide@vger.kernel.org>, 700975 <700975@bugs.debian.org>
Subject: Re: RAID barely usable on my home machine
Date: Thu, 4 Apr 2013 19:16:38 -0500
trying again...gmail decided to put my response into formatted
text...so several lists rejected it.

lspci look like this for the controller:
SATA controller: Marvell Technology Group Ltd. Device 9230 (rev 10)

4pt sata3.0 6gbit or is yours a different one?

I have the issue also, I have eliminated all smart hits against the
disks and no incidents since then.

It does appear to be load related: if the controller is being hit hard
and a SMART command comes along, sometimes the controller loses its
mind and all of the disks stop responding.

I have seagate 1.5tb drives on mine that had had the issues.

I am using 3.7.10...it also has the issue.

A reboot is the only thing that clears it, and I have gotten pretty
good at forcing the RAID back online when this happens.




Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#700975; Package src:linux. (Fri, 05 Apr 2013 08:09:04 GMT) (full text, mbox, link).


Acknowledgement sent to Maik Zumstrull <maik@zumstrull.net>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Fri, 05 Apr 2013 08:09:04 GMT) (full text, mbox, link).


Message #57 received at 700975@bugs.debian.org (full text, mbox, reply):

From: Maik Zumstrull <maik@zumstrull.net>
To: Roger Heflin <rogerheflin@gmail.com>
Cc: Linux RAID <linux-raid@vger.kernel.org>, linux-ide <linux-ide@vger.kernel.org>, 700975 <700975@bugs.debian.org>
Subject: Re: RAID barely usable on my home machine
Date: Fri, 5 Apr 2013 10:06:58 +0200
On Fri, Apr 5, 2013 at 2:16 AM, Roger Heflin <rogerheflin@gmail.com> wrote:

> lspci look like this for the controller:
> SATA controller: Marvell Technology Group Ltd. Device 9230 (rev 10)
>
> 4pt sata3.0 6gbit or is yours a different one?

Mine looks slightly different (included somewhere in my mail), but
should be a device in the same family.

> I have the issue also, I have eliminated all smart hits against the
> disks and no incidents since then.

Not a great workaround as such. First of all, running SMART against
your storage is kind of recommended. Secondly, as I said, udisks2 will
also send SMART commands to your disks occasionally, and you have to
uninstall parts of GNOME to get rid of it.

> I have seagate 1.5tb drives on mine that had had the issues.

Suggests it's not something about my drives, as I have WD Red disks.
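
A less drastic variant of that workaround would be to hold back SMART
polling only while an array is rebuilding. This is a minimal sketch of
the detection half, assuming the usual /proc/mdstat progress line of the
form "recovery = 12.6% (...)"; `md_busy` is a name made up for this
example, and nothing in this thread confirms that deferring SMART this
way is enough to avoid the hang:

```python
import re

# Progress lines in /proc/mdstat look like
# "[==>.......]  recovery = 12.6% (...) finish=... speed=..."
BUSY_RE = re.compile(r"(recovery|resync|reshape|check)\s*=\s*[\d.]+%")


def md_busy(mdstat_text):
    """True if any md array shows a resync, recovery, reshape or check in progress."""
    return BUSY_RE.search(mdstat_text) is not None
```

A wrapper around smartctl or a cron job could read /proc/mdstat, call
md_busy() on its contents, and skip the SMART query when it returns
True. That does nothing about udisks2, or about heavy I/O outside a
rebuild.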



Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#700975; Package src:linux. (Fri, 05 Apr 2013 08:27:07 GMT) (full text, mbox, link).


Acknowledgement sent to Robin Hill <robin@robinhill.me.uk>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Fri, 05 Apr 2013 08:27:07 GMT) (full text, mbox, link).


Message #62 received at 700975@bugs.debian.org (full text, mbox, reply):

From: Robin Hill <robin@robinhill.me.uk>
To: Maik Zumstrull <maik@zumstrull.net>
Cc: linux-raid@vger.kernel.org, linux-ide@vger.kernel.org, 700975@bugs.debian.org
Subject: Re: RAID barely usable on my home machine
Date: Fri, 5 Apr 2013 09:23:47 +0100
On Thu Apr 04, 2013 at 10:13:05 +0200, Maik Zumstrull wrote:

> Hello Linux RAID and ATA people,
> 
> I've managed to find a configuration on my home desktop where a
> particular RAID array is barely usable.
> 
> You can find my initial report at:
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=700975
> 
> In summary:
> 
> - I create an array across four disks on a Marvell AHCI controller,
> which automatically goes into rebuild mode.
> - Somebody (e.g. smartd or udisks2 or me, testing) sends a SMART
> command to one of the disks.
> - The SMART command fails.
> - The ATA subsystem freaks out all over the place, until eventually
> none of the disks on that controller are responsive.
> - The array is dead until reboot. (Curiously, without data loss so
> far. Kudos on the RAID code, I guess.)
> 
> I've found the issue to be highly reproducible so far. Things mostly
> work if the array is not under heavy load (not rebuilding, no big file
> copies going on) or I make completely sure nothing sends SMART
> commands. I currently do keep real files on that array, but backed-up
> ones, so I could wipe it for more tests if really necessary.
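
One way to act on "make completely sure nothing sends SMART commands" while an array is rebuilding is to gate manual SMART queries on the state shown in /proc/mdstat. The following is only a minimal sketch under stated assumptions: the function names md_busy and smart_if_idle and the MDSTAT override variable are made up for illustration, and smartctl from smartmontools is assumed to be installed. It does not stop smartd or udisks2 from issuing their own commands, it cannot see heavy non-RAID load such as large file copies, and a rebuild could still start right after the check.

```shell
#!/bin/sh
# Sketch: skip SMART queries while any md array is resyncing or rebuilding.
# MDSTAT can be pointed at a saved copy of /proc/mdstat for dry runs.
MDSTAT="${MDSTAT:-/proc/mdstat}"

md_busy() {
    # Exit status 0 if the progress line of any array mentions a
    # resync, recovery, reshape or check (e.g. "recovery =  0.1%").
    grep -Eq '(resync|recovery|reshape|check) *=' "$MDSTAT"
}

smart_if_idle() {
    # $1: device node, e.g. /dev/sde
    if md_busy; then
        echo "md array busy, skipping SMART query for $1" >&2
        return 1
    fi
    smartctl -a "$1"
}
```

Usage would be something like `smart_if_idle /dev/sde`, run as root.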
> 
I used to have the same issues on one of my machines. The solution was
to buy a decent SAS/SATA HBA (I went with the Intel RS2WC080, but see
http://blog.zorinaq.com/?e=10 for a more complete list), which now works
perfectly with exactly the same drives as before.

HTH,
    Robin
-- 
     ___        
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |

Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#700975; Package src:linux. (Fri, 05 Apr 2013 09:03:04 GMT) (full text, mbox, link).


Acknowledgement sent to Peter Maloney <peter.maloney@brockmann-consult.de>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Fri, 05 Apr 2013 09:03:04 GMT) (full text, mbox, link).


Message #67 received at 700975@bugs.debian.org (full text, mbox, reply):

From: Peter Maloney <peter.maloney@brockmann-consult.de>
To: Maik Zumstrull <maik@zumstrull.net>
Cc: Roger Heflin <rogerheflin@gmail.com>, Linux RAID <linux-raid@vger.kernel.org>, linux-ide <linux-ide@vger.kernel.org>, 700975 <700975@bugs.debian.org>
Subject: Re: RAID barely usable on my home machine
Date: Fri, 05 Apr 2013 10:58:46 +0200
On 2013-04-05 10:06, Maik Zumstrull wrote:
> On Fri, Apr 5, 2013 at 2:16 AM, Roger Heflin <rogerheflin@gmail.com> wrote:
>
>> lspci look like this for the controller:
>> SATA controller: Marvell Technology Group Ltd. Device 9230 (rev 10)
>>
>> 4pt sata3.0 6gbit or is yours a different one?
> Mine looks slightly different (included somewhere in my mail), but
> should be a device in the same family.
>
>> I have the issue also, I have eliminated all smart hits against the
>> disks and no incidents since then.
> Not a great workaround as such, first of all, running SMART against
> your storage is kind of recommended, and secondly, as I said, udisks2
> will also SMART your disks occasionally, which you have to uninstall
> parts of GNOME to get rid of.
This reminds me of an old thread on freebsd-scsi, where someone with
SAS disks on a SAS2008 controller would have his disks 'lost' when he
ran smartctl routinely. He could reproduce it reliably by spamming a
disk with smartctl -a (smartctl -i and some other invocation would not
reproduce it). He found that he could solve it by setting the "disk
tags" much lower. I think "disk tags" might be equivalent to
nr_requests in Linux.
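
For anyone who wants to try the same experiment on Linux, here is a minimal sketch, not a confirmed fix for this bug. It reads the two sysfs knobs that seem closest to FreeBSD's "disk tags": queue/nr_requests (block layer request queue size) and device/queue_depth (per-device tagged command depth, which is arguably the nearer analogue). The function names and the SYSFS_ROOT override are made up for illustration, and writing the real files needs root.

```shell
#!/bin/sh
# Sketch: inspect and lower per-disk queue limits via sysfs.
# SYSFS_ROOT can be pointed at a fake tree for dry runs.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

show_queue_limits() {
    # $1: kernel device name, e.g. sde
    dev="$1"
    for f in "queue/nr_requests" "device/queue_depth"; do
        path="$SYSFS_ROOT/block/$dev/$f"
        if [ -r "$path" ]; then
            printf '%s %s=%s\n' "$dev" "$f" "$(cat "$path")"
        fi
    done
}

set_queue_depth() {
    # $1: kernel device name, $2: new depth
    # (a depth of 1 effectively turns NCQ off for that disk)
    dev="$1"; depth="$2"
    path="$SYSFS_ROOT/block/$dev/device/queue_depth"
    [ -w "$path" ] || { echo "cannot write $path" >&2; return 1; }
    printf '%s\n' "$depth" > "$path"
}
```

For example, `for d in sde sdf sdg sdh; do show_queue_limits "$d"; set_queue_depth "$d" 1; done` would apply it to the four disks from the original report. The setting does not persist across reboots.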

Here is the thread:
http://osdir.com/ml/freebsd-scsi/2011-11/msg00006.html
>> I have seagate 1.5tb drives on mine that had had the issues.
> Suggests it's not something about my drives, as I have WD Red disks.


-- 

--------------------------------------------
Peter Maloney
Brockmann Consult
Max-Planck-Str. 2
21502 Geesthacht
Germany
Tel: +49 4152 889 300
Fax: +49 4152 889 333
E-mail: peter.maloney@brockmann-consult.de
Internet: http://www.brockmann-consult.de
--------------------------------------------




Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#700975; Package src:linux. (Thu, 11 Apr 2013 12:21:04 GMT) (full text, mbox, link).


Acknowledgement sent to Maik Zumstrull <maik@zumstrull.net>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Thu, 11 Apr 2013 12:21:04 GMT) (full text, mbox, link).


Message #72 received at 700975@bugs.debian.org (full text, mbox, reply):

From: Maik Zumstrull <maik@zumstrull.net>
To: linux-raid@vger.kernel.org, linux-ide@vger.kernel.org, 700975@bugs.debian.org, robin@robinhill.me.uk
Subject: Re: RAID barely usable on my home machine
Date: Thu, 11 Apr 2013 14:18:43 +0200
On Fri, Apr 5, 2013 at 10:23 AM, Robin Hill <robin@robinhill.me.uk> wrote:
> On Thu Apr 04, 2013 at 10:13:05 +0200, Maik Zumstrull wrote:

>> I've managed to find a configuration on my home desktop where a
>> particular RAID array is barely usable.

> I used to have the same issues on one of my machines. The solution was
> to buy a decent SAS/SATA HBA (I went with the Intel RS2WC080, but see
> http://blog.zorinaq.com/?e=10 for a more complete list), which now works
> perfectly with exactly the same drives as before.

Calling that a "solution" seems generous, but since it doesn't look
like this'll get fixed any time soon, I decided to go the same way and
just ordered an LSI SAS 9211-8i.

Thanks for the list, that seems like a very useful resource.



Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#700975; Package src:linux. (Wed, 27 Nov 2013 16:21:05 GMT) (full text, mbox, link).


Acknowledgement sent to Martin Gallant <martyg@goodbit.net>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Wed, 27 Nov 2013 16:21:05 GMT) (full text, mbox, link).


Message #77 received at 700975@bugs.debian.org (full text, mbox, reply):

From: Martin Gallant <martyg@goodbit.net>
To: 700975@bugs.debian.org
Subject: Marvell 88SE9230: Freaks out - Reported upstream?
Date: Wed, 27 Nov 2013 10:16:36 -0600
Has this issue been reported upstream to the kernel developers?
I just did a search on Bugzilla, and couldn't find any record of this issue.

-- 
Marty



Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#700975; Package src:linux. (Mon, 13 Jul 2015 22:09:03 GMT) (full text, mbox, link).


Acknowledgement sent to Stefan Lesser <s@68k.ch>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Mon, 13 Jul 2015 22:09:03 GMT) (full text, mbox, link).


Message #82 received at 700975@bugs.debian.org (full text, mbox, reply):

From: Stefan Lesser <s@68k.ch>
To: 700975@bugs.debian.org
Subject: Marvell 88SE9230: Freaks out - Reported upstream?
Date: Mon, 13 Jul 2015 23:54:25 +0200
Hi folks,

Quick ping here. I haven't seen any report of this issue upstream either, but since it's a pretty hard issue to track down from its symptoms alone, it may affect a few other people as well.

Thanks,
Stefan


Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#700975; Package src:linux. (Mon, 13 Mar 2017 06:12:03 GMT) (full text, mbox, link).


Acknowledgement sent to Alexander Klimenko <a.klimenko@mnogobyte.ru>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Mon, 13 Mar 2017 06:12:03 GMT) (full text, mbox, link).


Message #87 received at 700975@bugs.debian.org (full text, mbox, reply):

From: Alexander Klimenko <a.klimenko@mnogobyte.ru>
To: 700975@bugs.debian.org
Subject: linux-image-3.7-trunk-amd64: Marvell 88SE9230: Freaks out and drops all disks if sent SMART command during RAID rebuild
Date: Mon, 13 Mar 2017 08:54:23 +0300
Looks like we are experiencing the same problem with an Espada FG-EST14A-1 (88SE9230) and the 3.16.0-4-amd64 kernel.
The main difference is that we are not using md but ZFS with an SSD cache behind this controller, which also produces high load.




Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Tue Jan 9 22:13:43 2018; Machine Name: buxtehude