Debian Bug report logs - #707178
ITP: breakin -- stress-test and hardware diagnostics tool

Package: wnpp; Maintainer for wnpp is wnpp@debian.org;

Reported by: Antoine Beaupré <anarcat@debian.org>

Date: Tue, 7 May 2013 22:57:01 UTC

Owned by: "Antoine Beaupré" <anarcat@debian.org>

Severity: wishlist


Report forwarded to debian-bugs-dist@lists.debian.org, debian-devel@lists.debian.org, taggart@riseup.net, wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>:
Bug#707178; Package wnpp. (Tue, 07 May 2013 22:57:06 GMT) Full text and rfc822 format available.

Acknowledgement sent to Antoine Beaupré <anarcat@debian.org>:
New Bug report received and forwarded. Copy sent to debian-devel@lists.debian.org, taggart@riseup.net, wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>. (Tue, 07 May 2013 22:57:06 GMT) Full text and rfc822 format available.

Message #5 received at submit@bugs.debian.org (full text, mbox):

From: Antoine Beaupré <anarcat@debian.org>
To: Debian Bug Tracking System <submit@bugs.debian.org>
Subject: ITP: breakin -- stress-test and hardware diagnostics tool
Date: Tue, 07 May 2013 18:55:21 -0400
Package: wnpp
Severity: wishlist
Owner: "Antoine Beaupré" <anarcat@debian.org>

* Package name    : breakin
  Version         : 3.20
  Upstream Author : Jason D. Clinton <me@jasonclinton.com>, Kyle Sheumaker <ksheumaker@advancedclustering.com>
* URL             : http://www.advancedclustering.com/software/breakin.html
* License         : GPL2
  Programming Lang: C
  Description     : stress-test and hardware diagnostics tool

Breakin is Advanced Clustering's stress-test and hardware diagnostics
tool. Advanced Clustering engineers developed breakin because no other
available product — commercial or open source — could pinpoint hardware
issues and component failures as well as they wanted.

= Packaging notes =

The software is split into two main parts: the "breakin" software, which
is a fairly self-contained C program with affiliated scripts which runs
on boot up, and the "bootimage" distribution, which is a custom Linux
distribution that is built from the ground up.

I believe that the "breakin" part should be made into a Debian package,
and the "bootimage" part should be discarded or made into a "debirf"
extension.

The dependencies of breakin are unclear, but the bootimage build scripts
build the following parts:

anarcat@desktop006:bootimage$ ls srcctrl/
actdmi    dropbear    htop        lvm2      parted         syslinux     wget
afio      e2fsprogs   initfs      mcelog    pciutils       tcpdump
arecacli  edac-utils  ipmitool    mdadm     rsync          terminfo
bonnie++  ethtool     kernel      mdinfo    screen         udev
breakin   hddtemp     links       mstflint  smartmontools  udpcast
busybox   hpl         lm-sensors  numactl   stream         util-linux

A lot of those are available in Debian already, except:

actdmi
arecacli
breakin
hpl
initfs
kernel
mdinfo
stream
terminfo

Specific assessment should be done on those dependencies.
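
One way to start that assessment is to script the availability check. The sketch below is only a starting point and is not part of the breakin sources: list_missing and in_debian are hypothetical helper names, and it assumes apt-cache is installed with current package lists. Component names such as kernel, initfs or terminfo do not map one-to-one to Debian package names, so the output still needs manual review.

```shell
#!/bin/sh
# Hypothetical helper: print each name that the given checker command
# does not recognise as an available package.
# Usage: list_missing CHECKER name...
list_missing() {
    checker=$1
    shift
    for name in "$@"; do
        # The checker is expected to exit non-zero for unknown names.
        if ! "$checker" "$name" >/dev/null 2>&1; then
            printf '%s\n' "$name"
        fi
    done
}

# Default checker (assumption: apt-cache is present and returns a
# non-zero status for a package name it cannot find).
in_debian() {
    apt-cache show "$1"
}

# Example, run from the bootimage checkout (not executed here):
#   list_missing in_debian $(ls srcctrl/)
```

Passing the checker as an argument keeps the loop independent of apt, so the same function can be pointed at a different lookup later.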

I am interested in packaging the "breakin" part, but my time is limited
and I would appreciate any help. I can sponsor a package too, and will try
to inform upstream of our efforts.



Information forwarded to debian-bugs-dist@lists.debian.org, wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>:
Bug#707178; Package wnpp. (Wed, 08 May 2013 00:03:09 GMT) Full text and rfc822 format available.

Acknowledgement sent to Matt Taggart <taggart@debian.org>:
Extra info received and forwarded to list. Copy sent to wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>. (Wed, 08 May 2013 00:03:09 GMT) Full text and rfc822 format available.

Message #10 received at 707178@bugs.debian.org (full text, mbox):

From: Matt Taggart <taggart@debian.org>
To: Antoine Beaupré <anarcat@debian.org>, 707178@bugs.debian.org
Subject: Re: Bug#707178: ITP: breakin -- stress-test and hardware diagnostics tool
Date: Tue, 07 May 2013 16:54:14 -0700
Hi,

I investigated some of the not-already-in-debian dependencies:

Antoine Beaupré writes:

> actdmi

http://git.advancedclustering.com/git/actdmi.git

An advancedclustering fork of dmidecode, forked sometime around version
2.5. Not sure why they forked or what they changed, a diff to upstream
of the era isn't obvious. Some more investigation will be needed here.

> arecacli

Some sort of tool for controlling areca HBAs?

http://www.areca.us/support/s_linux/cli/x86_64/cli64.zip

> breakin

The actual breakin tool

http://git.advancedclustering.com/git/breakin.git

> hpl

http://www.netlib.org/benchmark/hpl/

The hpcc package appears to depend on this too but expects you to install
it by hand from upstream.

> initfs

Refers to

http://git.advancedclustering.com/git/bootimage-initfs.git

Tools for creating the initfs the bootimage uses.

> kernel

Refers to upstream kernel source from

http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.38.2.tar.bz2

> mdinfo

Simple tool for gathering info about md raid arrays.

http://git.advancedclustering.com/git/mdinfo.git

> stream

CPU/Memory benchmark

http://www.cs.virginia.edu/stream/

the hpcc package also uses this (also expected to be installed by hand).

> terminfo

Copies things from /usr/share/terminfo and /lib/terminfo
Provided by ncurses-base and ncurses-term packages.

afio is also listed as a dependency, but is currently in non-free
(#509287). I haven't investigated what it's using it for.

-- 
Matt Taggart
taggart@debian.org



Information forwarded to debian-bugs-dist@lists.debian.org, wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>:
Bug#707178; Package wnpp. (Wed, 15 May 2013 23:42:07 GMT) Full text and rfc822 format available.

Acknowledgement sent to Antoine Beaupré <anarcat@debian.org>:
Extra info received and forwarded to list. Copy sent to wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>. (Wed, 15 May 2013 23:42:07 GMT) Full text and rfc822 format available.

Message #15 received at 707178@bugs.debian.org (full text, mbox):

From: Antoine Beaupré <anarcat@debian.org>
To: taggart@debian.org
Cc: 707178@bugs.debian.org, dkg@fifthhorseman.net, jrollins@finestructure.net
Subject: start of a debirf stress-testing extension: "stressant"
Date: Wed, 15 May 2013 19:38:50 -0400
So the guts of this idea are around this ITP:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=707178

Our requirement at Koumbit.org is to have a tool that will do a series
of tests when we receive or want to verify new hardware. This would
ideally be in a debirf image, since those are easy to build and
share. A good inspiration of this is the "breakin" software, which the
above ITP is about.

I have started working on a silly debirf rescue image. I basically took
the rescue image from 0.33 and added this file in
rescue/modules/stressant:

debirf_exec apt-get --no-install-recommends --assume-yes install bonnie++ cpuburn stress smartmontools e2fsprogs util-linux screen
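
For reference, a module file of that kind could look roughly like the sketch below. It assumes debirf provides debirf_exec at build time to run a command inside the image root. The dry-run fallback and the STRESS_PACKAGES and install_stress_packages names are hypothetical, added only so the script can be tried outside a debirf build.

```shell
#!/bin/sh
# Sketch of a debirf module such as rescue/modules/stressant.
set -e

# Assumption: debirf defines debirf_exec during a real build. Outside a
# build, fall back to a hypothetical dry-run stand-in that only prints.
if ! command -v debirf_exec >/dev/null 2>&1; then
    debirf_exec() { echo "would run in image: $*"; }
fi

# Package list taken from the one-liner above.
STRESS_PACKAGES="bonnie++ cpuburn stress smartmontools e2fsprogs util-linux screen"

install_stress_packages() {
    # STRESS_PACKAGES is left unquoted on purpose so that it splits
    # into one argument per package.
    debirf_exec apt-get --no-install-recommends --assume-yes install $STRESS_PACKAGES
}

install_stress_packages
```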

Quite trivial really. But to ease collaboration, I started a repo here:

https://redmine.koumbit.net/projects/stressant

Cheers,

A.

-- 
Power is not to be conquered, it is to be destroyed
                        - Jean-François Brient, de la servitude moderne

Information forwarded to debian-bugs-dist@lists.debian.org, wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>:
Bug#707178; Package wnpp. (Thu, 16 May 2013 14:30:23 GMT) Full text and rfc822 format available.

Acknowledgement sent to Daniel Kahn Gillmor <dkg@fifthhorseman.net>:
Extra info received and forwarded to list. Copy sent to wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>. (Thu, 16 May 2013 14:30:23 GMT) Full text and rfc822 format available.

Message #20 received at 707178@bugs.debian.org (full text, mbox):

From: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
To: Antoine Beaupré <anarcat@debian.org>
Cc: taggart@debian.org, 707178@bugs.debian.org, debirf@lists.mayfirst.org
Subject: Re: start of a debirf stress-testing extension: "stressant"
Date: Thu, 16 May 2013 10:26:13 -0400
[adding a cc: to the debirf mailing list [0]]

On 05/15/2013 07:38 PM, Antoine Beaupré wrote:
> So the guts of this idea are around this ITP:
> 
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=707178

interesting, thanks.

> I have started working on a silly debirf rescue image. I basically took
> the rescue image from 0.33 and added this file in
> rescue/modules/stressant:
> 
> debirf_exec apt-get --no-install-recommends --assume-yes install bonnie++ cpuburn stress smartmontools e2fsprogs util-linux screen

screen and smartmontools and e2fsprogs should already be part of
debirf's stock rescue.

> Quite trivial really. But to ease collaboration, I started a repo here:
> 
> https://redmine.koumbit.net/projects/stressant

we'd be happy to include these sorts of changes directly in the debirf
upstream repo, whether they're just packages you think belong in the
rescue image, or (even better in my opinion) if you want to make
"stressant" another stock example profile, alongside "minimal",
"rescue", and "xkiosk".

Feel free to clone the debirf repo [1] and send patches directly to the
mailing list.  We'd be happy to integrate them.

Regards,

	--dkg

[0] https://lists.mayfirst.org/mailman/listinfo/debirf
[1] git://finestructure.net/debirf or
git://lair.fifthhorseman.net/~dkg/debirf


Information forwarded to debian-bugs-dist@lists.debian.org, wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>:
Bug#707178; Package wnpp. (Thu, 16 May 2013 15:12:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Antoine Beaupré <anarcat@debian.org>:
Extra info received and forwarded to list. Copy sent to wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>. (Thu, 16 May 2013 15:12:04 GMT) Full text and rfc822 format available.

Message #25 received at 707178@bugs.debian.org (full text, mbox):

From: Antoine Beaupré <anarcat@debian.org>
To: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
Cc: taggart@debian.org, 707178@bugs.debian.org, debirf@lists.mayfirst.org
Subject: Re: start of a debirf stress-testing extension: "stressant"
Date: Thu, 16 May 2013 11:08:51 -0400
On 2013-05-16 10:26:13, Daniel Kahn Gillmor wrote:
> [adding a cc: to the debirf mailing list [0]]
>
> On 05/15/2013 07:38 PM, Antoine Beaupré wrote:
>> I have started working on a silly debirf rescue image. I basically took
>> the rescue image from 0.33 and added this file in
>> rescue/modules/stressant:
>> 
>> debirf_exec apt-get --no-install-recommends --assume-yes install bonnie++ cpuburn stress smartmontools e2fsprogs util-linux screen
>
> screen and smartmontools and e2fsprogs should already be part of
> debirf's stock rescue.

alright, fixin'... fixed!

i wonder if util-linux is really necessary - since it's of priority
'required', are those installed automatically?

>> Quite trivial really. But to ease collaboration, I started a repo here:
>> 
>> https://redmine.koumbit.net/projects/stressant
>
> we'd be happy to include these sorts of changes directly in the debirf
> upstream repo, whether they're just packages you think belong in the
> rescue image, or (even better in my opinion) if you want to make
> "stressant" another stock example profile, alongside "minimal",
> "rescue", and "xkiosk".

That would be great!

However, one thing I am looking at is a way to generate binary images
directly through a package, something that has been discussed in #620294
I believe, but never implemented.

Also, I wonder if it wouldn't be better to set `stressant` as an example
of how it's possible to extend debirf without having to merge patches
into the upstream project itself, some kind of "contrib" way of doing
things that could radically expand the number of debirf-built flavors
out there, as they wouldn't depend on a central piece of authority to be
distributed or accepted.

> Feel free to clone the debirf repo [1] and send patches directly to the
> mailing list.  We'd be happy to integrate them.

So far, the changes to the rescue image are really minimal, it's just a
bunch of extra packages but otherwise very minimally changed from the
rescue build.

I will wait and see what you guys think about making a third party
extension before going on with a merge, especially since the project is
far from finished...

A.
-- 
Antoine Beaupré +++ Réseau Koumbit Networks +++ +1.514.387.6262 #208
--------------------------------------------------------------------

Information forwarded to debian-bugs-dist@lists.debian.org, wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>:
Bug#707178; Package wnpp. (Thu, 16 May 2013 15:24:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Daniel Kahn Gillmor <dkg@fifthhorseman.net>:
Extra info received and forwarded to list. Copy sent to wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>. (Thu, 16 May 2013 15:24:04 GMT) Full text and rfc822 format available.

Message #30 received at 707178@bugs.debian.org (full text, mbox):

From: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
To: Antoine Beaupré <anarcat@debian.org>
Cc: taggart@debian.org, 707178@bugs.debian.org, debirf@lists.mayfirst.org
Subject: Re: start of a debirf stress-testing extension: "stressant"
Date: Thu, 16 May 2013 11:20:32 -0400
On 05/16/2013 11:08 AM, Antoine Beaupré wrote:

> i wonder if util-linux is really necessary - since it's of priority
> 'required', are those installed automatically?

util-linux is also "Essential: yes", so i think that's superfluous too.
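
A quick way to check this on an installed system is to ask dpkg for the Essential field. A minimal sketch, assuming dpkg-query is available; essential_only is a hypothetical helper name that reads "name flag" lines on stdin:

```shell
#!/bin/sh
# Print the names of packages flagged "Essential: yes".
# Expects lines of "<package> <essential-flag>" on stdin. A package is
# treated as non-essential when the flag is missing or is anything
# other than "yes".
essential_only() {
    awk '$2 == "yes" { print $1 }'
}

# Example on a Debian system (not executed here):
#   dpkg-query -W -f='${Package} ${Essential}\n' util-linux stress | essential_only
```

Any package this prints can be dropped from an explicit apt-get install line.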

> That would be great!
> 
> However, one thing I am looking at is a way to generate binary images
> directly through a package, something that has been discussed in #620294
> I believe, but never implemented.

Cool, but we should keep that as a separate issue, i think.

> Also, I wonder if it wouldn't be better to set `stressant` as an example
> of how it's possible to extend debirf without having to merge patches
> into the upstream project itself, some kind of "contrib" way of doing
> things that could radically expand the number of debirf-built flavors
> out there, as they wouldn't depend on a central piece of authority to be
> distributed or accepted.

i don't want to play a gatekeeper role, and i'm happy to encourage other
contributions.  but i actually actively want something like the proposed
stressant in debirf.  I think it would be really useful.

> So far, the changes to the rescue image are really minimal, it's just a
> bunch of extra packages but otherwise very minimally changed from the
> rescue build.

if it's a small set of packages that arguably fit in with rescue, let's
fold them in there.

But i also like the idea of a stressant profile being something you boot
into (like memtest86+) that just automatically hammers on your hardware
in a reliable and repeatable fashion while reporting its conclusions in
a standardized way.

> I will wait and see what you guys think about making a third party
> extension before going on with a merge, especially since the project is
> far from finished...

I'd personally rather integrate your suggestions with mainline debirf
than juggle it as a "contrib".

	--dkg


Information forwarded to debian-bugs-dist@lists.debian.org, wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>:
Bug#707178; Package wnpp. (Thu, 16 May 2013 17:39:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Antoine Beaupré <anarcat@debian.org>:
Extra info received and forwarded to list. Copy sent to wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>. (Thu, 16 May 2013 17:39:04 GMT) Full text and rfc822 format available.

Message #35 received at 707178@bugs.debian.org (full text, mbox):

From: Antoine Beaupré <anarcat@debian.org>
To: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
Cc: taggart@debian.org, 707178@bugs.debian.org, debirf@lists.mayfirst.org
Subject: Re: start of a debirf stress-testing extension: "stressant"
Date: Thu, 16 May 2013 13:34:52 -0400
On 2013-05-16 11:20:32, Daniel Kahn Gillmor wrote:
> On 05/16/2013 11:08 AM, Antoine Beaupré wrote:
>
>> i wonder if util-linux is really necessary - since it's of priority
>> 'required', are those installed automatically?
>
> util-linux is also "Essential: yes", so i think that's superfluous too.

fixed.

>> However, one thing I am looking at is a way to generate binary images
>> directly through a package, something that has been discussed in #620294
>> I believe, but never implemented.
>
> Cool, but we should keep that as a separate issue, i think.

Agreed.

>> Also, I wonder if it wouldn't be better to set `stressant` as an example
>> of how it's possible to extend debirf without having to merge patches
>> into the upstream project itself, some kind of "contrib" way of doing
>> things that could radically expand the number of debirf-built flavors
>> out there, as they wouldn't depend on a central piece of authority to be
>> distributed or accepted.
>
> i don't want to play a gatekeeper role, and i'm happy to encourage other
> contributions.  but i actually actively want something like the proposed
> stressant in debirf.  I think it would be really useful.

Okay well how about this: I have tons of little tweaks to do to that
package, and I prefer to keep it small and local for now. When it is
complete and makes sense, we'll merge it in.

I do think it would be helpful to have documentation on how to create
third-party extensions like I did for stressant.

>> So far, the changes to the rescue image are really minimal, it's just a
>> bunch of extra packages but otherwise very minimally changed from the
>> rescue build.
>
> if it's a small set of packages that arguably fit in with rescue, let's
> fold them in there.

Okay. Then maybe I should reset our repo to be a fork of debirf instead
of standalone?

> But i also like the idea of a stressant profile being something you boot
> into (like memtest86+) that just automatically hammers on your hardware
> in a reliable and repeatable fashion while reporting its conclusions in
> a standardized way.

yes, that is the ultimate goal behind this ITP.

A.

-- 
To march in step to military music, you don't need a brain;
a spinal cord is enough.
                        - Albert Einstein

Information forwarded to debian-bugs-dist@lists.debian.org, wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>:
Bug#707178; Package wnpp. (Thu, 16 May 2013 17:48:07 GMT) Full text and rfc822 format available.

Acknowledgement sent to Daniel Kahn Gillmor <dkg@fifthhorseman.net>:
Extra info received and forwarded to list. Copy sent to wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>. (Thu, 16 May 2013 17:48:07 GMT) Full text and rfc822 format available.

Message #40 received at 707178@bugs.debian.org (full text, mbox):

From: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
To: Antoine Beaupré <anarcat@debian.org>
Cc: taggart@debian.org, 707178@bugs.debian.org, debirf@lists.mayfirst.org
Subject: Re: start of a debirf stress-testing extension: "stressant"
Date: Thu, 16 May 2013 13:45:25 -0400
On 05/16/2013 01:34 PM, Antoine Beaupré wrote:
> Okay well how about this: I have tons of little tweaks to do to that
> package, and I prefer to keep it small and local for now. When it is
> complete and makes sense, we'll merge it in.

it's software; it will never be complete :)  but i'm fine with waiting
until it "makes sense" to you to merge it.

> I do think it would be helpful to have documentation on how to create
> third-party extensions like I did for stressant.

hm, yes, that's probably true.

> Okay. Then maybe I should reset our repo to be a fork of debirf instead
> of standalone?

yes, please, that would make it much easier to merge (in both
directions), and would also make it clearer which version of debirf
you're expecting to work from.

	--dkg


Information forwarded to debian-bugs-dist@lists.debian.org, wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>:
Bug#707178; Package wnpp. (Fri, 17 Jan 2014 09:21:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Bryan Fisher <bryanf@pinnacle.co.za>:
Extra info received and forwarded to list. Copy sent to wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>. (Fri, 17 Jan 2014 09:21:04 GMT) Full text and rfc822 format available.

Message #45 received at 707178@bugs.debian.org (full text, mbox):

From: Bryan Fisher <bryanf@pinnacle.co.za>
To: Antoine Beaupré <anarcat@debian.org>, "taggart@debian.org" <taggart@debian.org>
Cc: "707178@bugs.debian.org" <707178@bugs.debian.org>, "dkg@fifthhorseman.net" <dkg@fifthhorseman.net>, "jrollins@finestructure.net" <jrollins@finestructure.net>
Subject: Breakin - stress-test and hardware diagnostics tool - Please see if you are able to assist to an issue we are having now for more than a month on 3 servers
Date: Fri, 17 Jan 2014 09:11:22 +0000
Good day,

My name is Bryan Fisher, and I work for a company called Pinnacle Africa in South Africa, Cape Town.

I was hoping that maybe you could assist me in the issue that I am getting with server h/w please.

Attached is a screenshot of what happens when I insert a USB key to copy the Breakin log file. It also shows 'Failid - Other tests have errors, tuning on ID light..'. Could you point me in a direction to find the fault, please?

I have run memtest multiple times and it passes; I have run a burn-in test in Windows and it passes; I have run the Sandra tests and diagnostics and they pass; and while monitoring voltages and system temperatures, everything is stable.

However, the server works for a few weeks to a couple of months, then starts having issues such as reboots and freezes, and becomes unstable. We have changed mainboards and all RAM modules and tested the PSU, but the issue remains.

This is the server hardware in use:
ECC REG RAM - 8GB x4 modules
X9SRI-F - mainboard
E5-2620V2 2.1GHz 6C Ivy Bridge
1U Dual Xeon Chassis CSE-813MTQ-600CB

Please advise if you are able to assist, or even just tell me where the issue might be.

I got your e-mail address from here:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=707178


Thank you kindly in advance! Hope to hear from you ASAP

Kind regards,


Bryan Fisher | Server Specialist
Cape Town | Pinnacle Africa
Direct: +27 21 5500 357 | Fax: +27 21 551 3444



Information forwarded to debian-bugs-dist@lists.debian.org, wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>:
Bug#707178; Package wnpp. (Fri, 17 Jan 2014 09:24:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Bryan Fisher <bryanf@pinnacle.co.za>:
Extra info received and forwarded to list. Copy sent to wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>. (Fri, 17 Jan 2014 09:24:05 GMT) Full text and rfc822 format available.

Message #50 received at 707178@bugs.debian.org (full text, mbox):

From: Bryan Fisher <bryanf@pinnacle.co.za>
To: Antoine Beaupré <anarcat@debian.org>, "taggart@debian.org" <taggart@debian.org>
Cc: "707178@bugs.debian.org" <707178@bugs.debian.org>, "dkg@fifthhorseman.net" <dkg@fifthhorseman.net>, "jrollins@finestructure.net" <jrollins@finestructure.net>
Subject: RE: Breakin - stress-test and hardware diagnostics tool - Please see if you are able to assist to an issue we are having now for more than a month on 3 servers
Date: Fri, 17 Jan 2014 09:13:10 +0000
Sorry here's the attachment

[Breakin_error.jpg (image/jpeg, attachment)]

Information forwarded to debian-bugs-dist@lists.debian.org, wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>:
Bug#707178; Package wnpp. (Fri, 17 Jan 2014 18:18:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Daniel Kahn Gillmor <dkg@fifthhorseman.net>:
Extra info received and forwarded to list. Copy sent to wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>. (Fri, 17 Jan 2014 18:18:04 GMT) Full text and rfc822 format available.

Message #55 received at 707178@bugs.debian.org (full text, mbox):

From: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
To: Bryan Fisher <bryanf@pinnacle.co.za>, Antoine Beaupré <anarcat@debian.org>, "taggart@debian.org" <taggart@debian.org>
Cc: "707178@bugs.debian.org" <707178@bugs.debian.org>, "jrollins@finestructure.net" <jrollins@finestructure.net>
Subject: Re: Breakin - stress-test and hardware diagnostics tool - Please see if you are able to assist to an issue we are having now for more than a month on 3 servers
Date: Fri, 17 Jan 2014 13:14:40 -0500
Hi Bryan--

On 01/17/2014 04:13 AM, Bryan Fisher wrote:

> My name is Bryan Fisher, and I work for a company called Pinnacle Africa in South Africa, Cape Town.
> 
> I was hoping that maybe you could assist me in the issue that I am getting with server h/w please.

I think you're asking about something unrelated to what
http://bugs.debian.org/707178 is talking about.  The people that you've
e-mailed don't have anything to do with the breakin project, and we
can't support your organization's hardware at any rate.

I recommend you follow up with your hardware vendor (or retain other
local technical staff), and explain to them the errors that you're
having.  But your post is off-topic for the discussion of
http://bugs.debian.org/707178.

Regards,

	--dkg


Information forwarded to debian-bugs-dist@lists.debian.org, wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>:
Bug#707178; Package wnpp. (Fri, 17 Jan 2014 19:54:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Andreas Cadhalpun <andreas.cadhalpun@googlemail.com>:
Extra info received and forwarded to list. Copy sent to wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>. (Fri, 17 Jan 2014 19:54:04 GMT) Full text and rfc822 format available.

Message #60 received at 707178@bugs.debian.org (full text, mbox):

From: Andreas Cadhalpun <andreas.cadhalpun@googlemail.com>
To: Bryan Fisher <bryanf@pinnacle.co.za>
Cc: 707178@bugs.debian.org, 733565@bugs.debian.org, Antoine Beaupré <anarcat@debian.org>, "taggart@debian.org" <taggart@debian.org>, "dkg@fifthhorseman.net" <dkg@fifthhorseman.net>, "jrollins@finestructure.net" <jrollins@finestructure.net>
Subject: Re: Bug#707178: Breakin - stress-test and hardware diagnostics tool - Please see if you are able to assist to an issue we are having now for more than a month on 3 servers
Date: Fri, 17 Jan 2014 20:51:10 +0100
Hi Bryan,

On 17.01.2014 10:13, Bryan Fisher wrote:
> I was hoping that maybe you could assist me in the issue that I am
> getting with server h/w please.
>
> Attached is a screenshot of what happens when I insert a USB key to copy
> the Breakin log file. It also indicates that ‘Failid – Other tests have
> errors, tuning on ID light..’ would it be possible if you could point me
> in a direction to find the fault please?

The error message (repeated 3 times) I read from the screenshot is:
kernel: [ 3009.877308] sd 9:0:0:0: [sdb] No Caching mode page present
kernel: [ 3009.877311] sd 9:0:0:0: [sdb] Assuming drive cache: write through

This reminds me of bug #733565 [1], which is about a request to silence 
these error messages.
I have seen similar messages and they seem to be totally harmless and 
have nothing to do with hardware failure.
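
To check whether anything else is hiding behind these notices, they can be filtered out of a saved log. A minimal sketch; filter_cache_notices is a hypothetical helper name and the log file name in the example is a placeholder:

```shell
#!/bin/sh
# Drop the two sd-driver notices quoted above from a log read on stdin
# and pass every other line through unchanged.
filter_cache_notices() {
    # grep exits with status 1 when no lines remain; treat that as
    # success so an all-notices log is not reported as an error.
    grep -v -e 'No Caching mode page present' \
            -e 'Assuming drive cache: write through' || [ "$?" -eq 1 ]
}

# Example (not executed here; breakin.log is a placeholder name):
#   filter_cache_notices < breakin.log
```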

Best regards,
Andreas


1: http://bugs.debian.org/733565



Information forwarded to debian-bugs-dist@lists.debian.org, wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>:
Bug#707178; Package wnpp. (Mon, 20 Jan 2014 06:42:05 GMT) Full text and rfc822 format available.

Acknowledgement sent to Bryan Fisher <bryanf@pinnacle.co.za>:
Extra info received and forwarded to list. Copy sent to wnpp@debian.org, "Antoine Beaupré" <anarcat@debian.org>. (Mon, 20 Jan 2014 06:42:05 GMT) Full text and rfc822 format available.

Message #65 received at 707178@bugs.debian.org (full text, mbox):

From: Bryan Fisher <bryanf@pinnacle.co.za>
To: Andreas Cadhalpun <andreas.cadhalpun@googlemail.com>
Cc: "707178@bugs.debian.org" <707178@bugs.debian.org>, "733565@bugs.debian.org" <733565@bugs.debian.org>, Antoine Beaupré <anarcat@debian.org>, "taggart@debian.org" <taggart@debian.org>, "dkg@fifthhorseman.net" <dkg@fifthhorseman.net>, "jrollins@finestructure.net" <jrollins@finestructure.net>
Subject: RE: Bug#707178: Breakin - stress-test and hardware diagnostics tool - Please see if you are able to assist to an issue we are having now for more than a month on 3 servers
Date: Mon, 20 Jan 2014 06:35:57 +0000
Thank you very much Andreas!

Kind regards,

Bryan






Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Thu Apr 17 15:29:06 2014; Machine Name: beach.debian.org

Debian Bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.