Debian Bug report logs - #607368
Please decide how kernel ABI should be managed

Package: tech-ctte; Maintainer for tech-ctte is Technical Committee <debian-ctte@lists.debian.org>;

Reported by: Julien BLACHE <jblache@debian.org>

Date: Fri, 17 Dec 2010 14:09:02 UTC

Severity: serious

Tags: squeeze-ignore

Done: Don Armstrong <don@debian.org>

Bug is archived. No further changes may be made.

Toggle useless messages

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#607368; Package src:linux-2.6. (Fri, 17 Dec 2010 14:09:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Julien-externe BLACHE <julien-externe.blache@edf.fr>:
New Bug report received and forwarded. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Fri, 17 Dec 2010 14:09:04 GMT) Full text and rfc822 format available.

Message #5 received at submit@bugs.debian.org (full text, mbox):

From: Julien-externe BLACHE <julien-externe.blache@edf.fr>
To: submit@bugs.debian.org
Subject: linux-2.6: silent ABI change in 2.6.32.26 breaks external modules (smp_ops changes)
Date: Fri, 17 Dec 2010 14:53:27 +0100
Source: linux-2.6
Version: 2.6.32-28
Severity: serious

Hi,

smp_ops was changed in a rather incompatible way in 2.6.32.26, breaking 
the kernel ABI:

diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 4cfc908..4c2f63c 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -50,7 +50,7 @@ struct smp_ops {
        void (*smp_prepare_cpus)(unsigned max_cpus);
        void (*smp_cpus_done)(unsigned max_cpus);
 
-       void (*smp_send_stop)(void);
+       void (*stop_other_cpus)(int wait);
        void (*smp_send_reschedule)(int cpu);
 
        int (*cpu_up)(unsigned cpu);

This change was, in turn, willfully ignored (SVN rev 16598) and the kernel 
ABI remained at 5.

This breaks external modules like VMware (vmmon) that use the smp_ops 
symbol.

Please revert or bump the kernel ABI to 6 to reflect this ABI change.

Thanks,

JB.

-- 
Consultant INTM - Debian Developer - TMI Calibre
EDF - DSP - CSP IT - ITS Rhône Alpes - C4S - CCNPS
04 69 65 68 56




Ce message et toutes les pièces jointes (ci-après le 'Message') sont établis à l'intention exclusive des destinataires et les informations qui y figurent sont strictement confidentielles. Toute utilisation de ce Message non conforme à sa destination, toute diffusion ou toute publication totale ou partielle, est interdite sauf autorisation expresse.

Si vous n'êtes pas le destinataire de ce Message, il vous est interdit de le copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou partie. Si vous avez reçu ce Message par erreur, merci de le supprimer de votre système, ainsi que toutes ses copies, et de n'en garder aucune trace sur quelque support que ce soit. Nous vous remercions également d'en avertir immédiatement l'expéditeur par retour du message.

Il est impossible de garantir que les communications par messagerie électronique arrivent en temps utile, sont sécurisées ou dénuées de toute erreur ou virus.
____________________________________________________

This message and any attachments (the 'Message') are intended solely for the addressees. The information contained in this Message is confidential. Any use of information contained in this Message not in accord with its purpose, any dissemination or disclosure, either whole or partial, is prohibited except formal approval.

If you are not the addressee, you may not copy, forward, disclose or use any part of it. If you have received this message in error, please delete it and all copies from your system and notify the sender immediately by return message.

E-mail communication cannot be guaranteed to be timely secure, error or virus-free.





Reply sent to Ben Hutchings <ben@decadent.org.uk>:
You have taken responsibility. (Fri, 17 Dec 2010 15:48:15 GMT) Full text and rfc822 format available.

Notification sent to Julien-externe BLACHE <julien-externe.blache@edf.fr>:
Bug acknowledged by developer. (Fri, 17 Dec 2010 15:48:15 GMT) Full text and rfc822 format available.

Message #10 received at 607368-done@bugs.debian.org (full text, mbox):

From: Ben Hutchings <ben@decadent.org.uk>
To: 607368-done@bugs.debian.org
Subject: Re: Bug#607368: linux-2.6: silent ABI change in 2.6.32.26 breaks external modules (smp_ops changes)
Date: Fri, 17 Dec 2010 15:39:41 +0000
On Fri, Dec 17, 2010 at 02:53:27PM +0100, Julien-externe BLACHE wrote:
> Source: linux-2.6
> Version: 2.6.32-28
> Severity: serious
> 
> Hi,
> 
> smp_ops was changed in a rather incompatible way in 2.6.32.26, breaking 
> the kernel ABI:
> 
> diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
> index 4cfc908..4c2f63c 100644
> --- a/arch/x86/include/asm/smp.h
> +++ b/arch/x86/include/asm/smp.h
> @@ -50,7 +50,7 @@ struct smp_ops {
>         void (*smp_prepare_cpus)(unsigned max_cpus);
>         void (*smp_cpus_done)(unsigned max_cpus);
>  
> -       void (*smp_send_stop)(void);
> +       void (*stop_other_cpus)(int wait);
>         void (*smp_send_reschedule)(int cpu);
>  
>         int (*cpu_up)(unsigned cpu);
> 
> This change was, in turn, willfully ignored (SVN rev 16598) and the kernel 
> ABI remained at 5.
> 
> This breaks external modules like VMware (vmmon) that use the smp_ops 
> symbol.
[...]

smp_ops is exported only for use by KVM and not out-of-tree modules.
We will not bump the ABI number.

Ben.

-- 
Ben Hutchings
We get into the habit of living before acquiring the habit of thinking.
                                                              - Albert Camus




Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#607368; Package src:linux-2.6. (Sat, 18 Dec 2010 08:27:07 GMT) Full text and rfc822 format available.

Acknowledgement sent to Julien BLACHE <jblache@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Sat, 18 Dec 2010 08:27:07 GMT) Full text and rfc822 format available.

Message #15 received at 607368@bugs.debian.org (full text, mbox):

From: Julien BLACHE <jblache@debian.org>
To: 607368@bugs.debian.org
Cc: control@bugs.debian.org
Subject: Kernel ABI management
Date: Sat, 18 Dec 2010 09:26:41 +0100
reopen 607368
submitter 607368 !
thanks

Hi,

I am sorry that I have to reopen this bug, but first this is about more
than just smp_ops and second the outcome isn't satisfactory.

Whether a symbol is exported for a specific purpose or for general
usage, whether you like it or not, every symbol that is exported is part
of the ABI. If it changes, the ABI changes and it changes for everybody,
regardless of whether they're supposed to be using that symbol or not.

We would not accept that behaviour from a shared library, I don't see
any reason why we would accept it from the kernel.

As it stands, the kernel ABI number has just been rendered useless; I
can no longer trust it nor rely on it. Every kernel revision will have
to be tested to make sure all modules are still compatible with the new
ABI, given the ABI will change silently without bumping the ABI number.

Unsuspecting users will have their setup break upon reboot after
updating their kernel packages without any obvious clue as to what
caused the breakage.

This is a big deal as it puts a big question mark where the kernel ABI
number used to be. This is a problem for users, admins, ISV, vendors
higher up the chain, everybody. It's no longer possible to offer
certified modules for Debian kernels given the kernel ABI number cannot
be relied upon anymore.

Out of tree modules exist and you can't just ignore them; in some
environments they are necessary to make things work and you won't have a
way around that.

So I am asking you to reconsider your position and go back to strictly
maintaining the kernel ABI number. This situation is a big step backward
for the Debian kernel packages and I hope it'll be fixed soon.

Thanks,

JB.

-- 
 Julien BLACHE <jblache@debian.org>  |  Debian, because code matters more 
 Debian & GNU/Linux Developer        |       <http://www.debian.org>
 Public key available on <http://www.jblache.org> - KeyID: F5D6 5169 
 GPG Fingerprint : 935A 79F1 C8B3 3521 FD62 7CC7 CD61 4FD7 F5D6 5169 




Did not alter fixed versions and reopened. Request was from Debbugs Internal Request <owner@bugs.debian.org> to internal_control@bugs.debian.org. (Sat, 18 Dec 2010 08:27:08 GMT) Full text and rfc822 format available.

Changed Bug submitter to 'Julien BLACHE <jblache@debian.org>' from 'Julien-externe BLACHE <julien-externe.blache@edf.fr>' Request was from Julien BLACHE <jblache@debian.org> to control@bugs.debian.org. (Sat, 18 Dec 2010 08:27:09 GMT) Full text and rfc822 format available.

Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#607368; Package src:linux-2.6. (Sat, 18 Dec 2010 14:18:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Ben Hutchings <ben@decadent.org.uk>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Sat, 18 Dec 2010 14:18:02 GMT) Full text and rfc822 format available.

Message #24 received at 607368@bugs.debian.org (full text, mbox):

From: Ben Hutchings <ben@decadent.org.uk>
To: Julien BLACHE <jblache@debian.org>, 607368@bugs.debian.org
Subject: Re: Bug#607368: Kernel ABI management
Date: Sat, 18 Dec 2010 14:14:50 +0000
[Message part 1 (text/plain, inline)]
On Sat, 2010-12-18 at 09:26 +0100, Julien BLACHE wrote:
> reopen 607368
> submitter 607368 !
> thanks
> 
> Hi,
> 
> I am sorry that I have to reopen this bug, but first this is about more
> than just smp_ops and second the outcome isn't satisfactory.
> 
> Whether a symbol is exported for a specific purpose or for general
> usage, whether you like it or not, every symbol that is exported is part
> of the ABI. If it changes, the ABI changes and it changes for everybody,
> regardless of whether they're supposed to be using that symbol or not.

No distribution promises that all exported symbols will be unchanged.

Some distributions provide a list all exported symbols which can be
depended on not to change.  We haven't done that but we do consider
where symbols are used before deciding a change can be ignored.

(As an example, there are several sets of drivers for related hardware
in which one core module exports symbols to the specific driver modules.
Those exports should in no way be depended on by OOT modules.)

> We would not accept that behaviour from a shared library, I don't see
> any reason why we would accept it from the kernel.

This is not true; for example, the interface between libc and NSS is not
stable.

> As it stands, the kernel ABI number has just been rendered useless; I
> can no longer trust it nor rely on it. Every kernel revision will have
> to be tested to make sure all modules are still compatible with the new
> ABI, given the ABI will change silently without bumping the ABI number.
> 
> Unsuspecting users will have their setup break upon reboot after
> updating their kernel packages without any obvious clue as to what
> caused the breakage.
> 
> This is a big deal as it puts a big question mark where the kernel ABI
> number used to be. This is a problem for users, admins, ISV, vendors
> higher up the chain, everybody. It's no longer possible to offer
> certified modules for Debian kernels given the kernel ABI number cannot
> be relied upon anymore.

If someone claims to certify something about future Debian kernels
without talking to the kernel team, they are a fraud.

> Out of tree modules exist and you can't just ignore them; in some
> environments they are necessary to make things work and you won't have a
> way around that.

Example?

> So I am asking you to reconsider your position and go back to strictly
> maintaining the kernel ABI number. This situation is a big step backward
> for the Debian kernel packages and I hope it'll be fixed soon.

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#607368; Package src:linux-2.6. (Sat, 18 Dec 2010 15:21:05 GMT) Full text and rfc822 format available.

Acknowledgement sent to Julien BLACHE <jblache@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Sat, 18 Dec 2010 15:21:05 GMT) Full text and rfc822 format available.

Message #29 received at 607368@bugs.debian.org (full text, mbox):

From: Julien BLACHE <jblache@debian.org>
To: Ben Hutchings <ben@decadent.org.uk>
Cc: 607368@bugs.debian.org
Subject: Re: Bug#607368: Kernel ABI management
Date: Sat, 18 Dec 2010 16:20:21 +0100
Ben Hutchings <ben@decadent.org.uk> wrote:

Hi,

> Some distributions provide a list all exported symbols which can be
> depended on not to change.  We haven't done that but we do consider

What you're saying here is very important: you haven't done that yet,
which implies that all symbols are covered by the ABI.

This is reinforced by reading the packaging scripts and realizing they
check the whole ABI, prior to -28.

> where symbols are used before deciding a change can be ignored.

I can perfectly imagine that you weren't aware of VMware's reliance upon
this symbol before, but you are now.

No need to tell you that quite a few of our users out there will use
VMware on Squeeze and be impacted by this change.

> (As an example, there are several sets of drivers for related hardware
> in which one core module exports symbols to the specific driver modules.
> Those exports should in no way be depended on by OOT modules.)

As smp_ops is exported by the core kernel and not by the common core of
a self-contained set of drivers, I don't think this argument holds here.

Reviewing the kernel revision history, smp_ops was indeed exported to
allow building KVM as a module. The commit message certainly doesn't
claim that KVM should be the sole user of this exported symbol.

I fail to see a reason why VMware or anybody else should refrain from
using smp_ops if they need it.

>> We would not accept that behaviour from a shared library, I don't see
>> any reason why we would accept it from the kernel.
>
> This is not true; for example, the interface between libc and NSS is not
> stable.

And it's been widely recognized as a design flaw and a royal pain in the
ass for, like, forever. Not exactly an example you want to follow.

> If someone claims to certify something about future Debian kernels
> without talking to the kernel team, they are a fraud.

See the top of this mail where you state that no list of symbols covered
by the ABI was ever published for Debian kernels. It isn't unreasonable
under these circumstances to assume that all symbols are covered.

>> Out of tree modules exist and you can't just ignore them; in some
>> environments they are necessary to make things work and you won't have a
>> way around that.
>
> Example?

VMware, nVidia, various drivers and infrastructure for communications
hardware (been there, done that), ...

JB.

-- 
 Julien BLACHE <jblache@debian.org>  |  Debian, because code matters more 
 Debian & GNU/Linux Developer        |       <http://www.debian.org>
 Public key available on <http://www.jblache.org> - KeyID: F5D6 5169 
 GPG Fingerprint : 935A 79F1 C8B3 3521 FD62 7CC7 CD61 4FD7 F5D6 5169 




Added tag(s) wontfix. Request was from Ben Hutchings <ben@decadent.org.uk> to control@bugs.debian.org. (Sat, 18 Dec 2010 16:21:02 GMT) Full text and rfc822 format available.

Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#607368; Package src:linux-2.6. (Sat, 18 Dec 2010 16:51:09 GMT) Full text and rfc822 format available.

Acknowledgement sent to Ben Hutchings <ben@decadent.org.uk>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Sat, 18 Dec 2010 16:51:09 GMT) Full text and rfc822 format available.

Message #36 received at 607368@bugs.debian.org (full text, mbox):

From: Ben Hutchings <ben@decadent.org.uk>
To: Julien BLACHE <jblache@debian.org>
Cc: 607368@bugs.debian.org
Subject: Re: Bug#607368: Kernel ABI management
Date: Sat, 18 Dec 2010 16:46:00 +0000
[Message part 1 (text/plain, inline)]
On Sat, 2010-12-18 at 16:20 +0100, Julien BLACHE wrote:
> Ben Hutchings <ben@decadent.org.uk> wrote:
> 
> Hi,
> 
> > Some distributions provide a list all exported symbols which can be
> > depended on not to change.  We haven't done that but we do consider
> 
> What you're saying here is very important: you haven't done that yet,
> which implies that all symbols are covered by the ABI.
> 
> This is reinforced by reading the packaging scripts and realizing they
> check the whole ABI, prior to -28.

This is not correct.  We have ignored many changes since 2.6.32-12 when
the ABI number was bumped to 5.  In 2.6.32-27 the symbol version files
were refreshed and the ignore list was reset.

> > where symbols are used before deciding a change can be ignored.
> 
> I can perfectly imagine that you weren't aware of VMware's reliance upon
> this symbol before, but you are now.
> 
> No need to tell you that quite a few of our users out there will use
> VMware on Squeeze and be impacted by this change.
>
> > (As an example, there are several sets of drivers for related hardware
> > in which one core module exports symbols to the specific driver modules.
> > Those exports should in no way be depended on by OOT modules.)
> 
> As smp_ops is exported by the core kernel and not by the common core of
> a self-contained set of drivers, I don't think this argument holds here.
> 
> Reviewing the kernel revision history, smp_ops was indeed exported to
> allow building KVM as a module. The commit message certainly doesn't
> claim that KVM should be the sole user of this exported symbol.

The upstream policy is that symbol exports may be removed when there are
no in-tree users.  So that export could even be made conditional on
CONFIG_KVM_MODULE (or whatever it's called).

> I fail to see a reason why VMware or anybody else should refrain from
> using smp_ops if they need it.

Because it's a low-level implementation detail.

Maybe I should find a way to limit that export so OOT users won't make
this mistake.

> >> We would not accept that behaviour from a shared library, I don't see
> >> any reason why we would accept it from the kernel.
> >
> > This is not true; for example, the interface between libc and NSS is not
> > stable.
> 
> And it's been widely recognized as a design flaw and a royal pain in the
> ass for, like, forever. Not exactly an example you want to follow.
> 
> > If someone claims to certify something about future Debian kernels
> > without talking to the kernel team, they are a fraud.
> 
> See the top of this mail where you state that no list of symbols covered
> by the ABI was ever published for Debian kernels. It isn't unreasonable
> under these circumstances to assume that all symbols are covered.

It is extremely stupid.

> >> Out of tree modules exist and you can't just ignore them; in some
> >> environments they are necessary to make things work and you won't have a
> >> way around that.
> >
> > Example?
> 
> VMware, nVidia, various drivers and infrastructure for communications
> hardware (been there, done that), ...

VMware - use KVM.
nvidia - use nouveau, report a bug if it doesn't work.
random drivers - send them to the maintainer of crap (Greg K-H, for the
staging tree).

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#607368; Package src:linux-2.6. (Sat, 18 Dec 2010 17:27:07 GMT) Full text and rfc822 format available.

Acknowledgement sent to Julien BLACHE <jblache@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Sat, 18 Dec 2010 17:27:07 GMT) Full text and rfc822 format available.

Message #41 received at 607368@bugs.debian.org (full text, mbox):

From: Julien BLACHE <jblache@debian.org>
To: Ben Hutchings <ben@decadent.org.uk>
Cc: 607368@bugs.debian.org
Subject: Re: Bug#607368: Kernel ABI management
Date: Sat, 18 Dec 2010 18:23:20 +0100
Ben Hutchings <ben@decadent.org.uk> wrote:

Hi Ben,

>> This is reinforced by reading the packaging scripts and realizing they
>> check the whole ABI, prior to -28.
>
> This is not correct.  We have ignored many changes since 2.6.32-12 when
> the ABI number was bumped to 5.  In 2.6.32-27 the symbol version files
> were refreshed and the ignore list was reset.

This is even more troubling.

> The upstream policy is that symbol exports may be removed when there are
> no in-tree users.  So that export could even be made conditional on
> CONFIG_KVM_MODULE (or whatever it's called).

Upstream policy doesn't break your setup from one kernel package
revision to the other.

> Maybe I should find a way to limit that export so OOT users won't make
> this mistake.

Good luck with that, it's been tried already with EXPORT_SYMBOL_GPL()
and people still do work around that.

>> See the top of this mail where you state that no list of symbols covered
>> by the ABI was ever published for Debian kernels. It isn't unreasonable
>> under these circumstances to assume that all symbols are covered.
>
> It is extremely stupid.

We obviously disagree.

>> VMware, nVidia, various drivers and infrastructure for communications
>> hardware (been there, done that), ...
>
> VMware - use KVM.

Not possible. We require 3D pass-through that KVM doesn't offer. Windows
virtio drivers failed us on Vista/Seven (can't remember, not my
area), plain old IDE emulation is too slow to be usable. Also, issues
with moving a VM from one host to another from a Windows licensing
standpoint (still researching this one, though).

It's not that using KVM wouldn't ease our (and our user's) life
considerably, it's that we *cannot* use it, and there are real reasons
why we cannot (and I'm not even speaking of getting that solution
approved internally, it's really a detail given the above).

As you can see on my blog, I'm the one responsible for packaging
VMware. My life would be better if we could just use KVM, believe
me. Packaging VMware 7 was a nightmare.

> nvidia - use nouveau, report a bug if it doesn't work.

Doesn't work with our cards, not by a long shot. Probably won't work for
another decade or so, so not an option. We do need working and fast 3D.

Switching to AMD - oh yeah, we tried that. I have a drawer full of test
cards. Not a single one has working 3D with free drivers, and here again
it won't happen for another year or two *best case*. Not an option.

Once again: not that we wouldn't like to use free drivers, but we just
can't. And I'm the one backporting and testing the nVidia drivers, so
believe me when I tell you I'd be using Nouveau if it was an option.

We are limited by our user's requirements on the one hand and by what
hardware vendors can sell us on the other hand - and they can't sell us
yesteryear's tech forever, especially on high-end mobile workstations.

Anybody doing this type of large-scale deployment faces the same issues.

> random drivers - send them to the maintainer of crap (Greg K-H, for the
> staging tree).

:-) That being said, not every out of tree driver comes with
source. Although pure crap has made it to staging in the past, I'm
pretty sure multi-megabyte binary blobs don't stand a chance.

I'm pretty disappointed by the way you're handling this; it feels like
you have little care for what your users actually need. I find it a bit
sad, given all the very good work you've been doing with the kernel
otherwise.

As I wrote already, it's not like VMware is some obscure piece of
software that nobody knows about.

JB.

-- 
 Julien BLACHE <jblache@debian.org>  |  Debian, because code matters more 
 Debian & GNU/Linux Developer        |       <http://www.debian.org>
 Public key available on <http://www.jblache.org> - KeyID: F5D6 5169 
 GPG Fingerprint : 935A 79F1 C8B3 3521 FD62 7CC7 CD61 4FD7 F5D6 5169 




Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#607368; Package src:linux-2.6. (Sat, 18 Dec 2010 17:54:06 GMT) Full text and rfc822 format available.

Acknowledgement sent to Bastian Blank <waldi@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Sat, 18 Dec 2010 17:54:06 GMT) Full text and rfc822 format available.

Message #46 received at 607368@bugs.debian.org (full text, mbox):

From: Bastian Blank <waldi@debian.org>
To: Julien BLACHE <jblache@debian.org>, 607368@bugs.debian.org
Cc: Ben Hutchings <ben@decadent.org.uk>
Subject: Re: Bug#607368: Kernel ABI management
Date: Sat, 18 Dec 2010 18:50:11 +0100
On Sat, Dec 18, 2010 at 06:23:20PM +0100, Julien BLACHE wrote:
> Ben Hutchings <ben@decadent.org.uk> wrote:
> > This is not correct.  We have ignored many changes since 2.6.32-12 when
> > the ABI number was bumped to 5.  In 2.6.32-27 the symbol version files
> > were refreshed and the ignore list was reset.
> This is even more troubling.

It is reality. We settled for a best-effort implementation and this
suites 99% of our users. If you have problem with that, go ask the CTTE.

> > The upstream policy is that symbol exports may be removed when there are
> > no in-tree users.  So that export could even be made conditional on
> > CONFIG_KVM_MODULE (or whatever it's called).
> Upstream policy doesn't break your setup from one kernel package
> revision to the other.

Yes, it does.

> > Maybe I should find a way to limit that export so OOT users won't make
> > this mistake.
> Good luck with that, it's been tried already with EXPORT_SYMBOL_GPL()
> and people still do work around that.

And? If they do, it is clearly visible.

Bastian

-- 
You!  What PLANET is this!
		-- McCoy, "The City on the Edge of Forever", stardate 3134.0




Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#607368; Package src:linux-2.6. (Sat, 18 Dec 2010 18:51:07 GMT) Full text and rfc822 format available.

Acknowledgement sent to Ben Hutchings <ben@decadent.org.uk>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Sat, 18 Dec 2010 18:51:07 GMT) Full text and rfc822 format available.

Message #51 received at 607368@bugs.debian.org (full text, mbox):

From: Ben Hutchings <ben@decadent.org.uk>
To: Julien BLACHE <jblache@debian.org>
Cc: 607368@bugs.debian.org
Subject: Re: Bug#607368: Kernel ABI management
Date: Sat, 18 Dec 2010 18:48:31 +0000
[Message part 1 (text/plain, inline)]
On Sat, 2010-12-18 at 18:23 +0100, Julien BLACHE wrote:
> Ben Hutchings <ben@decadent.org.uk> wrote:
> 
> Hi Ben,
> 
> >> This is reinforced by reading the packaging scripts and realizing they
> >> check the whole ABI, prior to -28.
> >
> > This is not correct.  We have ignored many changes since 2.6.32-12 when
> > the ABI number was bumped to 5.  In 2.6.32-27 the symbol version files
> > were refreshed and the ignore list was reset.
> 
> This is even more troubling.
> 
> > The upstream policy is that symbol exports may be removed when there are
> > no in-tree users.  So that export could even be made conditional on
> > CONFIG_KVM_MODULE (or whatever it's called).
> 
> Upstream policy doesn't break your setup from one kernel package
> revision to the other.

Actually it does.  The ABI change was part of a stable update.

> > Maybe I should find a way to limit that export so OOT users won't make
> > this mistake.
> 
> Good luck with that, it's been tried already with EXPORT_SYMBOL_GPL()
> and people still do work around that.

That is probably copyright infringement.

> >> See the top of this mail where you state that no list of symbols covered
> >> by the ABI was ever published for Debian kernels. It isn't unreasonable
> >> under these circumstances to assume that all symbols are covered.
> >
> > It is extremely stupid.
> 
> We obviously disagree.
> 
> >> VMware, nVidia, various drivers and infrastructure for communications
> >> hardware (been there, done that), ...
> >
> > VMware - use KVM.
> 
> Not possible. We require 3D pass-through that KVM doesn't offer. Windows
> virtio drivers failed us on Vista/Seven (can't remember, not my
> area), plain old IDE emulation is too slow to be usable. Also, issues
> with moving a VM from one host to another from a Windows licensing
> standpoint (still researching this one, though).

It sounds like you should really be using ESX/vSphere on the host,
rather than VMware Server on Debian.  I mean, VMware Server is basically
demo-ware.

> > nvidia - use nouveau, report a bug if it doesn't work.
> 
> Doesn't work with our cards, not by a long shot. Probably won't work for
> another decade or so, so not an option. We do need working and fast 3D.
> 
> Switching to AMD - oh yeah, we tried that. I have a drawer full of test
> cards. Not a single one has working 3D with free drivers, and here again
> it won't happen for another year or two *best case*. Not an option.
> 
> Once again: not that we wouldn't like to use free drivers, but we just
> can't. And I'm the one backporting and testing the nVidia drivers, so
> believe me when I tell you I'd be using Nouveau if it was an option.

Where are your bug reports on nouveau?

> We are limited by our user's requirements on the one hand and by what
> hardware vendors can sell us on the other hand - and they can't sell us
> yesteryear's tech forever, especially on high-end mobile workstations.
> 
> Anybody doing this type of large-scale deployment faces the same issues.
> 
> > random drivers - send them to the maintainer of crap (Greg K-H, for the
> > staging tree).
> 
> :-) That being said, not every out of tree driver comes with
> source. Although pure crap has made it to staging in the past, I'm
> pretty sure multi-megabyte binary blobs don't stand a chance.

Binary-only drivers for Linux are generally copyright infringements.  If
we break them: good.  (I know nvidia provides a Linux-specific stub as
source and it might be an exception to this.)

> I'm pretty disappointed by the way you're handling this; it feels like
> you have little care for what your users actually need.

We do, just not all of what *you* (one of our users) want.

Ben.

> I find it a bit
> sad, given all the very good work you've been doing with the kernel
> otherwise.
> 
> As I wrote already, it's not like VMware is some obscure piece of
> software that nobody knows about.



-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
[signature.asc (application/pgp-signature, inline)]

Reply sent to maximilian attems <max@stro.at>:
You have taken responsibility. (Sat, 18 Dec 2010 20:39:06 GMT) Full text and rfc822 format available.

Notification sent to Julien BLACHE <jblache@debian.org>:
Bug acknowledged by developer. (Sat, 18 Dec 2010 20:39:06 GMT) Full text and rfc822 format available.

Message #56 received at 607368-done@bugs.debian.org (full text, mbox):

From: maximilian attems <max@stro.at>
To: 607368-done@bugs.debian.org
Subject: Re: Bug#607368: Kernel ABI management
Date: Sat, 18 Dec 2010 20:35:25 +0000
On Sat, Dec 18, 2010 at 06:23:20PM +0100, Julien BLACHE wrote:
> Ben Hutchings <ben@decadent.org.uk> wrote:
> 
> >> This is reinforced by reading the packaging scripts and realizing they
> >> check the whole ABI, prior to -28.
> >
> > This is not correct.  We have ignored many changes since 2.6.32-12 when
> > the ABI number was bumped to 5.  In 2.6.32-27 the symbol version files
> > were refreshed and the ignore list was reset.
> 
> This is even more troubling.

no it is reality, please wakeup.

We never supported oot binary crap, nor do we intend to do.
closing, as you already got all the explanations.

-- 
maks




Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#607368; Package src:linux-2.6. (Sun, 19 Dec 2010 09:45:06 GMT) Full text and rfc822 format available.

Acknowledgement sent to Julien BLACHE <jblache@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Sun, 19 Dec 2010 09:45:06 GMT) Full text and rfc822 format available.

Message #61 received at 607368@bugs.debian.org (full text, mbox):

From: Julien BLACHE <jblache@debian.org>
To: Ben Hutchings <ben@decadent.org.uk>
Cc: 607368@bugs.debian.org
Subject: Re: Bug#607368: Kernel ABI management
Date: Sun, 19 Dec 2010 10:42:38 +0100
Ben Hutchings <ben@decadent.org.uk> wrote:

Hi,

>> Good luck with that, it's been tried already with EXPORT_SYMBOL_GPL()
>> and people still do work around that.
>
> That is probably copyright infringement.

Maybe, maybe not. Nobody actually really knows.

> It sounds like you should really be using ESX/vSphere on the host,
> rather than VMware Server on Debian.  I mean, VMware Server is basically
> demo-ware.

That's VMware Player/Workstation running on workstations, not servers,
actually. So no, we don't need ESX/vSphere.

> Where are your bug reports on nouveau?

I don't think there is any value in reporting that an nVidia card that
was released last month doesn't work with nouveau. I think the nouveau
folks do know it's not supported without me telling them.

> Binary-only drivers for Linux are generally copyright infringements.  If

Says you. Again, nobody really knows. Some may very well be infringing
copyrights, sure.

> we break them: good.  (I know nvidia provides a Linux-specific stub as
> source and it might be an exception to this.)

You can be assured that I share your feelings towards binary-only
drivers.

>> I'm pretty disappointed by the way you're handling this; it feels like
>> you have little care for what your users actually need.
>
> We do, just not all of what *you* (one of our users) want.

Yeah, I'm probably the only Debian folk out there that has to support a
thousand workstations with VMware. We should definitely move to RedHat,
or something.

JB.

-- 
 Julien BLACHE <jblache@debian.org>  |  Debian, because code matters more 
 Debian & GNU/Linux Developer        |       <http://www.debian.org>
 Public key available on <http://www.jblache.org> - KeyID: F5D6 5169 
 GPG Fingerprint : 935A 79F1 C8B3 3521 FD62 7CC7 CD61 4FD7 F5D6 5169 




Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#607368; Package src:linux-2.6. (Sun, 19 Dec 2010 09:45:08 GMT) Full text and rfc822 format available.

Acknowledgement sent to Julien BLACHE <jblache@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Sun, 19 Dec 2010 09:45:08 GMT) Full text and rfc822 format available.

Message #66 received at 607368@bugs.debian.org (full text, mbox):

From: Julien BLACHE <jblache@debian.org>
To: maximilian attems <max@stro.at>
Cc: 607368@bugs.debian.org
Subject: Re: Bug#607368 closed by maximilian attems <max@stro.at> (Re: Bug#607368: Kernel ABI management)
Date: Sun, 19 Dec 2010 10:44:15 +0100
owner@bugs.debian.org (Debian Bug Tracking System) wrote:

> We never supported oot binary crap, nor do we intend to do.
> closing, as you already got all the explanations.

For the record, VMware modules come as source.

JB.

-- 
 Julien BLACHE - Debian & GNU/Linux Developer - <jblache@debian.org> 
 
 Public key available on <http://www.jblache.org> - KeyID: F5D6 5169 
 GPG Fingerprint : 935A 79F1 C8B3 3521 FD62 7CC7 CD61 4FD7 F5D6 5169 




Information forwarded to debian-bugs-dist@lists.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>:
Bug#607368; Package src:linux-2.6. (Sun, 19 Dec 2010 18:33:09 GMT) Full text and rfc822 format available.

Acknowledgement sent to Julien BLACHE <jblache@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian Kernel Team <debian-kernel@lists.debian.org>. (Sun, 19 Dec 2010 18:33:09 GMT) Full text and rfc822 format available.

Message #71 received at 607368@bugs.debian.org (full text, mbox):

From: Julien BLACHE <jblache@debian.org>
To: 607368@bugs.debian.org
Cc: control@bugs.debian.org, debian-ctte@lists.debian.org
Subject: Please decide how kernel ABI should be managed
Date: Sun, 19 Dec 2010 19:30:58 +0100
reopen 607368
tags 607368 - wontfix
reassign 607368 tech-ctte
retitle 607368 Please decide how kernel ABI should be managed
thanks

Hi,

I am hereby asking the tech-ctte to decide how the kernel ABI should be
managed.

Case in point: the kernel team decided to ignore changes to the smp_ops
symbol in 2.6.32-28 which broke external modules (vmware) without any
prior warning.

I am worried that this is going to happen again during the lifetime of
Squeeze, silently breaking working setups upon reboot after a kernel
update, even though the new kernel carries the same ABI number as the
previous one.

I do agree that it is fine to ignore changes to symbols that are only
exported and used inside a self-contained group of modules to which no
additional modules will ever need to be added.

I disagree with the kernel team's take that it is OK for them to ignore
symbol changes in all other cases, especially for symbols exported by
the core kernel (like smp_ops).

This kind of silent breakage is a nightmare from an ops standpoint and
it does have a cost for our users. The ABI number should guarantee that
upgrading from a revision of linux-image to another carrying the same
ABI number will not cause any breakage with external modules built for
this ABI.

As the kernel team made it clear that they make their decision partly
based on symbol usage, I'd like to highlight once again, for the
specific case of smp_ops, that VMware modules aren't exactly pet modules
that only a few of our users care about. There is ample proof of this on
several web forums and mailing-lists dedicated to either VMware or
Debian.

I am seeking a generic ruling by the tech-ctte to ensure that the kernel
ABI number remains meaningful and dependable.

I think it would be best if this matter would be decided upon before the
release of Squeeze, or not too long after it, so as to avoid further
breakages in early kernel updates for Squeeze.

Thanks,

JB.

-- 
 Julien BLACHE <jblache@debian.org>  |  Debian, because code matters more 
 Debian & GNU/Linux Developer        |       <http://www.debian.org>
 Public key available on <http://www.jblache.org> - KeyID: F5D6 5169 
 GPG Fingerprint : 935A 79F1 C8B3 3521 FD62 7CC7 CD61 4FD7 F5D6 5169 




Did not alter fixed versions and reopened. Request was from Debbugs Internal Request <owner@bugs.debian.org> to internal_control@bugs.debian.org. (Sun, 19 Dec 2010 18:33:10 GMT) Full text and rfc822 format available.

Removed tag(s) wontfix. Request was from Julien BLACHE <jblache@debian.org> to control@bugs.debian.org. (Sun, 19 Dec 2010 18:33:11 GMT) Full text and rfc822 format available.

Bug reassigned from package 'src:linux-2.6' to 'tech-ctte'. Request was from Julien BLACHE <jblache@debian.org> to control@bugs.debian.org. (Sun, 19 Dec 2010 18:33:12 GMT) Full text and rfc822 format available.

Bug No longer marked as found in versions linux-2.6/2.6.32-28. Request was from Julien BLACHE <jblache@debian.org> to control@bugs.debian.org. (Sun, 19 Dec 2010 18:33:13 GMT) Full text and rfc822 format available.

Changed Bug title to 'Please decide how kernel ABI should be managed' from 'linux-2.6: silent ABI change in 2.6.32.26 breaks external modules (smp_ops changes)' Request was from Julien BLACHE <jblache@debian.org> to control@bugs.debian.org. (Sun, 19 Dec 2010 18:33:13 GMT) Full text and rfc822 format available.

Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Sun, 19 Dec 2010 18:54:05 GMT) Full text and rfc822 format available.

Acknowledgement sent to Moritz Muehlenhoff <jmm@inutil.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Sun, 19 Dec 2010 18:54:05 GMT) Full text and rfc822 format available.

Message #86 received at 607368@bugs.debian.org (full text, mbox):

From: Moritz Muehlenhoff <jmm@inutil.org>
To: Julien BLACHE <jblache@debian.org>
Cc: 607368@bugs.debian.org, debian-ctte@lists.debian.org
Subject: Re: Please decide how kernel ABI should be managed
Date: Sun, 19 Dec 2010 19:51:20 +0100
On Sun, Dec 19, 2010 at 07:30:58PM +0100, Julien BLACHE wrote:
> reopen 607368
> tags 607368 - wontfix
> reassign 607368 tech-ctte
> retitle 607368 Please decide how kernel ABI should be managed
> thanks
> 
> Hi,
> 
> I am hereby asking the tech-ctte to decide how the kernel ABI should be
> managed.
> 
> Case in point: the kernel team decided to ignore changes to the smp_ops
> symbol in 2.6.32-28 which broke external modules (vmware) without any
> prior warning.

FWIW; the ABI handling has been fairly strict during the lifetime of
a stable release. I'm not aware that the same situation has occured
during the Etch or Lenny lifetime.

Cheers,
        Moritz




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Sun, 19 Dec 2010 19:21:05 GMT) Full text and rfc822 format available.

Acknowledgement sent to Stefano Zacchiroli <leader@debian.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Sun, 19 Dec 2010 19:21:05 GMT) Full text and rfc822 format available.

Message #91 received at 607368@bugs.debian.org (full text, mbox):

From: Stefano Zacchiroli <leader@debian.org>
To: 607368@bugs.debian.org
Cc: Julien BLACHE <jblache@debian.org>, Debian Kernel Team <debian-kernel@lists.debian.org>
Subject: Re: Please decide how kernel ABI should be managed
Date: Sun, 19 Dec 2010 20:19:22 +0100
[Message part 1 (text/plain, inline)]
On Sun, Dec 19, 2010 at 07:30:58PM +0100, Julien BLACHE wrote:
> I am hereby asking the tech-ctte to decide how the kernel ABI should
> be managed.

Hi Julien, from the bug log it's pretty clear that there was no
possibilities of agreement between you and the kernel team, so thanks
for bringing this issue to tech-ctte.

I've a question for the kernel team, which might help some investigation
of the tech-ctte. There seem to be two intertwined issue here:

1) the general policy of kernel ABI maintenance
2) the specific smp_ops issue

You asked ruling about (1), on which there is a clear divergence of
opinions between you (as bug reporter / user) and the kernel team (as
package maintainers). Of course ruling about (1) will also address (2),
one way or the other.

Still, (2) is more urgent, as (I agree on that) it will impact upgrade
experience of Debian users like Julien, who are forced to use VMWare. No
matter who is at fault, the choice about (2) will have an impact on a
specific class of users.

My question to the kernel team is if, no matter (2), there are
*technical* reasons for not reverting the removal of the "smp_send_stop"
symbol. I understand there are "political" reasons for *not* reverting
the change, like reinforcing the position that people should not rely on
symbols not exported for out-of-tree modules. I believe it would help
the discussion to know whether there are technical blockers to the
revert.

> I think it would be best if this matter would be decided upon before
> the release of Squeeze, or not too long after it, so as to avoid
> further breakages in early kernel updates for Squeeze.

+1


Just my 0.02€,
Cheers.

-- 
Stefano Zacchiroli -o- PhD in Computer Science \ PostDoc @ Univ. Paris 7
zack@{upsilon.cc,pps.jussieu.fr,debian.org} -<>- http://upsilon.cc/zack/
Quando anche i santi ti voltano le spalle, |  .  |. I've fans everywhere
ti resta John Fante -- V. Capossela .......| ..: |.......... -- C. Adams
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Sun, 19 Dec 2010 21:21:13 GMT) Full text and rfc822 format available.

Acknowledgement sent to Ben Hutchings <ben@decadent.org.uk>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Sun, 19 Dec 2010 21:21:14 GMT) Full text and rfc822 format available.

Message #96 received at 607368@bugs.debian.org (full text, mbox):

From: Ben Hutchings <ben@decadent.org.uk>
To: Stefano Zacchiroli <leader@debian.org>
Cc: 607368@bugs.debian.org, Julien BLACHE <jblache@debian.org>, Debian Kernel Team <debian-kernel@lists.debian.org>
Subject: Re: Please decide how kernel ABI should be managed
Date: Sun, 19 Dec 2010 21:18:50 +0000
[Message part 1 (text/plain, inline)]
On Sun, 2010-12-19 at 20:19 +0100, Stefano Zacchiroli wrote:
> On Sun, Dec 19, 2010 at 07:30:58PM +0100, Julien BLACHE wrote:
> > I am hereby asking the tech-ctte to decide how the kernel ABI should
> > be managed.
> 
> Hi Julien, from the bug log it's pretty clear that there was no
> possibilities of agreement between you and the kernel team, so thanks
> for bringing this issue to tech-ctte.
> 
> I've a question for the kernel team, which might help some investigation
> of the tech-ctte. There seem to be two intertwined issue here:
> 
> 1) the general policy of kernel ABI maintenance
> 2) the specific smp_ops issue
> 
> You asked ruling about (1), on which there is a clear divergence of
> opinions between you (as bug reporter / user) and the kernel team (as
> package maintainers). Of course ruling about (1) will also address (2),
> one way or the other.
> 
> Still, (2) is more urgent, as (I agree on that) it will impact upgrade
> experience of Debian users like Julien, who are forced to use VMWare. No
> matter who is at fault, the choice about (2) will have an impact on a
> specific class of users.
> 
> My question to the kernel team is if, no matter (2), there are
> *technical* reasons for not reverting the removal of the "smp_send_stop"
> symbol. I understand there are "political" reasons for *not* reverting
> the change, like reinforcing the position that people should not rely on
> symbols not exported for out-of-tree modules. I believe it would help
> the discussion to know whether there are technical blockers to the
> revert.
[...]

smp_send_stop was never exported in its own right.  The change to
smp_ops was made as part of this bug fix:

commit ae832c21a08514fd11d2d1d6e217c8a537764bb0
Author: Alok Kataria <akataria@vmware.com>
Date:   Mon Oct 11 14:37:08 2010 -0700

    x86, kexec: Make sure to stop all CPUs before exiting the kernel
    
    commit 76fac077db6b34e2c6383a7b4f3f4f7b7d06d8ce upstream.

    x86 smp_ops now has a new op, stop_other_cpus which takes a parameter
    "wait" this allows the caller to specify if it wants to stop until all
    the cpus have processed the stop IPI.  This is required specifically
    for the kexec case where we should wait for all the cpus to be stopped
    before starting the new kernel.  We now wait for the cpus to stop in
    all cases except for panic/kdump where we expect things to be broken
    and we are doing our best to make things work anyway.
    
    This patch fixes a legitimate regression, which was introduced during
    2.6.30, by commit id 4ef702c10b5df18ab04921fc252c26421d4d6c75.
    
    Signed-off-by: Alok N Kataria <akataria@vmware.com>
    LKML-Reference: <1286833028.1372.20.camel@ank32.eng.vmware.com>
    Cc: Eric W. Biederman <ebiederm@xmission.com>
    Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

(ooh, irony).

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Sun, 19 Dec 2010 23:39:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to maximilian attems <max@stro.at>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Sun, 19 Dec 2010 23:39:03 GMT) Full text and rfc822 format available.

Message #101 received at 607368@bugs.debian.org (full text, mbox):

From: maximilian attems <max@stro.at>
To: Stefano Zacchiroli <leader@debian.org>
Cc: 607368@bugs.debian.org, Julien BLACHE <jblache@debian.org>, Debian Kernel Team <debian-kernel@lists.debian.org>
Subject: Re: Please decide how kernel ABI should be managed
Date: Sun, 19 Dec 2010 23:38:06 +0000
On Sun, Dec 19, 2010 at 08:19:22PM +0100, Stefano Zacchiroli wrote:
> On Sun, Dec 19, 2010 at 07:30:58PM +0100, Julien BLACHE wrote:
> > I am hereby asking the tech-ctte to decide how the kernel ABI should
> > be managed.
> 
> Hi Julien, from the bug log it's pretty clear that there was no
> possibilities of agreement between you and the kernel team, so thanks
> for bringing this issue to tech-ctte.
> 
> I've a question for the kernel team, which might help some investigation
> of the tech-ctte. There seem to be two intertwined issue here:
> 
> 1) the general policy of kernel ABI maintenance

we try to avoid ABI bumps at our best.
especially in times of release the ABI is kind of frozen due to
d-i requirements. There is no way so shortly before the release
we would bump ABI.

upstream has no ABI rule best read in Documentation/stable_api_nonsense.txt
thus stable updates to indeed change ABI.


> 2) the specific smp_ops issue
> 
> You asked ruling about (1), on which there is a clear divergence of
> opinions between you (as bug reporter / user) and the kernel team (as
> package maintainers). Of course ruling about (1) will also address (2),
> one way or the other.
> 
> Still, (2) is more urgent, as (I agree on that) it will impact upgrade
> experience of Debian users like Julien, who are forced to use VMWare. No
> matter who is at fault, the choice about (2) will have an impact on a
> specific class of users.

The submitter shows a clear confusion between the requirements of a shared
lib userspace and the linux-2.6 kernel.

Furthermore it is indeed quite unclear if said company is not effectively
violating GPL and several core dev do indeed think so.

-- 
maks





Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Mon, 20 Dec 2010 17:09:51 GMT) Full text and rfc822 format available.

Acknowledgement sent to Julien BLACHE <jblache@debian.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Mon, 20 Dec 2010 17:09:51 GMT) Full text and rfc822 format available.

Message #106 received at 607368@bugs.debian.org (full text, mbox):

From: Julien BLACHE <jblache@debian.org>
To: maximilian attems <max@stro.at>
Cc: 607368@bugs.debian.org
Subject: Re: Please decide how kernel ABI should be managed
Date: Mon, 20 Dec 2010 17:59:38 +0100
maximilian attems <max@stro.at> wrote:

Hi,

> The submitter shows a clear confusion between the requirements of a shared
> lib userspace and the linux-2.6 kernel.

Be assured there is no confusion on my end on this topic.

> Furthermore it is indeed quite unclear if said company is not effectively
> violating GPL and several core dev do indeed think so.

Uh? [citation needed] please, especially given VMware modules ship as
source although I can't remember their licensing terms right now.

JB.

-- 
 Julien BLACHE <jblache@debian.org>  |  Debian, because code matters more 
 Debian & GNU/Linux Developer        |       <http://www.debian.org>
 Public key available on <http://www.jblache.org> - KeyID: F5D6 5169 
 GPG Fingerprint : 935A 79F1 C8B3 3521 FD62 7CC7 CD61 4FD7 F5D6 5169 




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Tue, 21 Dec 2010 11:27:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Julien BLACHE <jblache@debian.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Tue, 21 Dec 2010 11:27:03 GMT) Full text and rfc822 format available.

Message #111 received at 607368@bugs.debian.org (full text, mbox):

From: Julien BLACHE <jblache@debian.org>
To: 607368@bugs.debian.org
Cc: maximilian attems <max@stro.at>
Subject: Re: Bug#607368: Please decide how kernel ABI should be managed
Date: Tue, 21 Dec 2010 12:23:31 +0100
Julien BLACHE <jblache@debian.org> wrote:

Hi,

>> Furthermore it is indeed quite unclear if said company is not effectively
>> violating GPL and several core dev do indeed think so.
>
> Uh? [citation needed] please, especially given VMware modules ship as
> source although I can't remember their licensing terms right now.

I've done that now and all the modules are GPL. There goes your claim.

JB.

-- 
 Julien BLACHE - Debian & GNU/Linux Developer - <jblache@debian.org> 
 
 Public key available on <http://www.jblache.org> - KeyID: F5D6 5169 
 GPG Fingerprint : 935A 79F1 C8B3 3521 FD62 7CC7 CD61 4FD7 F5D6 5169 




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Thu, 23 Dec 2010 20:12:03 GMT) Full text and rfc822 format available.

Message #114 received at 607368@bugs.debian.org (full text, mbox):

From: Don Armstrong <don@debian.org>
To: 607368@bugs.debian.org, Julien BLACHE <jblache@debian.org>
Subject: Re: Please decide how kernel ABI should be managed
Date: Thu, 23 Dec 2010 12:08:05 -0800
On Sun, 19 Dec 2010, Julien BLACHE wrote:
> I think it would be best if this matter would be decided upon before
> the release of Squeeze, or not too long after it, so as to avoid
> further breakages in early kernel updates for Squeeze.

I have a couple of (possibly naïve) questions that would help me
understand the space of solutions here.

1) What is the kernel ABI currently used to indicate? Where do we
specify what it guarantees?

2) What are all of the options for handling this situation?
Specifically, how should a package maintainer who is maintaining a
out-of-tree module which uses symbols from the kernel handle them
through an upgrade which changes the symbols? If the symbols need to
be covered by the ABI, how can the maintainer get them covered by ABI?
What should they do in cases when they are not covered by the ABI?

My main concern is that there seems to be no way for oot modules like
the vmware modules to sanely keep in step with the kernel ABI. While
this may not be a concern for kernel upstream, it's something that we
would ideally deal with to avoid issues for our users on upgrades.


Don Armstrong

-- 
He no longer wished to be dead. At the same time, it cannot be said
that he was glad to be alive. But at least he did not resent it. He
was alive, and the stubbornness of this fact had little by little
begun to fascinate him -- as if he had managed to outlive himself, as
if he were somehow living a posthumous life.
 -- Paul Auster _City of Glass_

http://www.donarmstrong.com              http://rzlab.ucr.edu




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Fri, 24 Dec 2010 11:12:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Julien BLACHE <jblache@debian.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Fri, 24 Dec 2010 11:12:03 GMT) Full text and rfc822 format available.

Message #119 received at 607368@bugs.debian.org (full text, mbox):

From: Julien BLACHE <jblache@debian.org>
To: 607368@bugs.debian.org
Subject: Re: Please decide how kernel ABI should be managed
Date: Fri, 24 Dec 2010 12:09:38 +0100
Don Armstrong <don@debian.org> wrote:

Hi Don,

You should bounce your mail to the kernel team as they were not Cc:ed
and the questions are directed to them.

> My main concern is that there seems to be no way for oot modules like
> the vmware modules to sanely keep in step with the kernel ABI. While

Correct.

> this may not be a concern for kernel upstream, it's something that we
> would ideally deal with to avoid issues for our users on upgrades.

Upstream doesn't have a notion of kernel ABI, this is left for the
distributors to handle. This can only work if changes to the ABI don't
get ignored for convenience or any other equally bad reason.

JB.

-- 
 Julien BLACHE <jblache@debian.org>  |  Debian, because code matters more 
 Debian & GNU/Linux Developer        |       <http://www.debian.org>
 Public key available on <http://www.jblache.org> - KeyID: F5D6 5169 
 GPG Fingerprint : 935A 79F1 C8B3 3521 FD62 7CC7 CD61 4FD7 F5D6 5169 




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Sun, 26 Dec 2010 14:06:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Ben Hutchings <ben@decadent.org.uk>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Sun, 26 Dec 2010 14:06:03 GMT) Full text and rfc822 format available.

Message #124 received at 607368@bugs.debian.org (full text, mbox):

From: Ben Hutchings <ben@decadent.org.uk>
To: Don Armstrong <don@debian.org>, 607368@bugs.debian.org
Cc: Julien BLACHE <jblache@debian.org>
Subject: Re: Bug#607368: Please decide how kernel ABI should be managed
Date: Sun, 26 Dec 2010 11:32:36 +0000
[Message part 1 (text/plain, inline)]
On Thu, 2010-12-23 at 12:08 -0800, Don Armstrong wrote:
> On Sun, 19 Dec 2010, Julien BLACHE wrote:
> > I think it would be best if this matter would be decided upon before
> > the release of Squeeze, or not too long after it, so as to avoid
> > further breakages in early kernel updates for Squeeze.
> 
> I have a couple of (possibly naïve) questions that would help me
> understand the space of solutions here.
> 
> 1) What is the kernel ABI currently used to indicate?

The ABI *number* indicates a range of versions within which newer
versions are likely to remain compatible with modules built for an older
version.

> Where do we specify what it guarantees?

We don't.

> 2) What are all of the options for handling this situation?
> Specifically, how should a package maintainer who is maintaining a
> out-of-tree module which uses symbols from the kernel handle them
> through an upgrade which changes the symbols? If the symbols need to
> be covered by the ABI, how can the maintainer get them covered by ABI?
> What should they do in cases when they are not covered by the ABI?
> 
> My main concern is that there seems to be no way for oot modules like
> the vmware modules to sanely keep in step with the kernel ABI. While
> this may not be a concern for kernel upstream, it's something that we
> would ideally deal with to avoid issues for our users on upgrades.

I think I should explain at this point the trade-off we're trying to
make.

As you know, the kernel-space ABI is volatile and upstream has no
intention of maintaining it, even within a stable/long-term series.
Build configuration changes may also change the ABI in unexpected ways.
Therefore it is generally not practical to maintain ABI within a single
upstream version.

Changing the ABI number requires (1) changing the package names and (2)
rebuilding out-of-tree modules.  (1) means linux-2.6 must go through the
NEW queue and also disrupts d-i development (the latter problem may be
reduced within the wheezy release cycle).  It also requires end users
and administrators to explicitly remove old kernel image packages.  (2)
should not be a huge burden so long as the modules are packaged using
dkms, but auto- rebuilding relies on having a toolchain installed.
Therefore we do not like to change the ABI number during a stable
release or the preceding freeze.

The result of these competing pressures is that we have to fudge ABI
changes.  Where possible, we adjust upstream fixes to remain
backward-compatible.  In other cases we revert fixes or ignore the ABI
changes, based on our judgement of the costs and benefits.

---

If people don't like this compromise, then I think the only reasonable
alternative is to do what most other distributions do: set the kernel
version (as shown by uname -r) to the package version.  This means that
each new upload will have new package names (and will require an upload
of linux-latest-2.6).  APT should also be fixed to allow auto-removal of
old kernel images.

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Sun, 26 Dec 2010 20:27:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Don Armstrong <don@debian.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Sun, 26 Dec 2010 20:27:03 GMT) Full text and rfc822 format available.

Message #129 received at 607368@bugs.debian.org (full text, mbox):

From: Don Armstrong <don@debian.org>
To: 607368@bugs.debian.org, Julien BLACHE <jblache@debian.org>
Cc: Debian Kernel Team <debian-kernel@lists.debian.org>
Subject: Re: Bug#607368: Please decide how kernel ABI should be managed
Date: Sun, 26 Dec 2010 12:23:04 -0800
On Sun, 26 Dec 2010, Ben Hutchings wrote:
> On Thu, 2010-12-23 at 12:08 -0800, Don Armstrong wrote:
> > On Sun, 19 Dec 2010, Julien BLACHE wrote:
> > > I think it would be best if this matter would be decided upon before
> > > the release of Squeeze, or not too long after it, so as to avoid
> > > further breakages in early kernel updates for Squeeze.
> > 
> > I have a couple of (possibly naïve) questions that would help me
> > understand the space of solutions here.
> > 
> > 1) What is the kernel ABI currently used to indicate?
> 
> The ABI *number* indicates a range of versions within which newer
> versions are likely to remain compatible with modules built for an
> older version.

So currently there is no guarantee that a specific ABI maintains any
kind of compatibility for out of tree modules; it is a best effort
based on the kernel maintainer's understanding of what symbols have
changed and what out of tree (or even in-tree) modules are affected.

Do the kernel maintainers currently track compatibility of in-tree
modules for modules which may reasonably be loaded during the lifetime
of the install? [I'm thinking of removable device drivers, things like
KVM, etc.]

> I think I should explain at this point the trade-off we're trying to
> make.
> 
> As you know, the kernel-space ABI is volatile and upstream has no
> intention of maintaining it, even within a stable/long-term series.
> Build configuration changes may also change the ABI in unexpected
> ways. Therefore it is generally not practical to maintain ABI within
> a single upstream version.

Right.
 
> Changing the ABI number requires (1) changing the package names and
> (2) rebuilding out-of-tree modules. (1) means linux-2.6 must go
> through the NEW queue and also disrupts d-i development (the latter
> problem may be reduced within the wheezy release cycle). It also
> requires end users and administrators to explicitly remove old
> kernel image packages. (2) should not be a huge burden so long as
> the modules are packaged using dkms, but auto- rebuilding relies on
> having a toolchain installed. Therefore we do not like to change the
> ABI number during a stable release or the preceding freeze.

So from what I can see, the ideal situation is to not change the
kernel ABI number unless we absolutely have to.

What I think is missing now, is a discussion of which cases where
changing the ABI number is necessary for proper functioning, and which
cases of malfunction we feel are acceptable, and which are not.

For in tree modules, all of the problems that would occur from
upgrading a kernel where the ABI had changed (but not the number) can
be resolved by rebooting. I'm personally a bit concerned that these
errors may be a bit disconcerting to our users, but that may be
something we decide to live with and document.

For out of tree modules, these problems can either be resolved by
changing the ABI number, or possibly by using Breaks: for all of the
affected out-of-tree modules where the change wasn't wide-spread
enough to bump the ABI number. A slightly wilder alternative, is to
Provides: linux-kernel-abi-2.6.32-vmware-5 or something for
out-of-tree modules which aren't going to be covered by the main ABI,
but are important enough to require compatibility. Alternatively, we
can ignore them, and require that end-users of these out of tree
modules know that they must upgrade their out-of-tree modules in
lockstep with the kernel.

Which in-tree modules should we change the ABI number for?

Which out-of-tree modules?

How does an out-of-tree module writer know? How can they promote their
module to get a Breaks or Provides or whatever?


Don Armstrong

-- 
It has always been Debian's philosophy in the past to stick to what
makes sense, regardless of what crack the rest of the universe is
smoking.
 -- Andrew Suffield in 20030403211305.GD29698@doc.ic.ac.uk

http://www.donarmstrong.com              http://rzlab.ucr.edu




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Sun, 26 Dec 2010 20:54:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Russ Allbery <rra@debian.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Sun, 26 Dec 2010 20:54:02 GMT) Full text and rfc822 format available.

Message #134 received at 607368@bugs.debian.org (full text, mbox):

From: Russ Allbery <rra@debian.org>
To: 607368@bugs.debian.org
Cc: Julien BLACHE <jblache@debian.org>, Debian Kernel Team <debian-kernel@lists.debian.org>
Subject: Re: Bug#607368: Please decide how kernel ABI should be managed
Date: Sun, 26 Dec 2010 12:51:14 -0800
Don Armstrong <don@debian.org> writes:

> So currently there is no guarantee that a specific ABI maintains any
> kind of compatibility for out of tree modules; it is a best effort based
> on the kernel maintainer's understanding of what symbols have changed
> and what out of tree (or even in-tree) modules are affected.

I feel like I should note here that I've been maintaining a complex
out-of-tree kernel module for Debian for many years now (openafs) and am
also involved in maintaining the non-free NVIDIA modules, and I can't
remember ever having the kernel ABI break for those modules without the
ABI number changing.  It's probably happened and I just don't remember it,
but certainly not enough to be memorable.

*Upstream* has caused us all sorts of problems from time to time because
of taking public symbols and making them GPL-only (OpenAFS predates Linux
and the core of the source is licensed under a free but GPL-incompatible
license, which also affects the kernel module), but the Debian kernel
maintainers have always done a great job at maintaining ABI guarantees,
insofar as my packages are affected.  The only problem that I recall with
the ABI numbering was the unfortunate use of -trunk as an ABI version
during the squeeze development cycle, and there mostly because -trunk
sorted inappropriately after regular ABI numbers were introduced, not
because of an inherent problem with the use of that technique in unstable.

So while I do recognize that there was a problem with an out-of-tree
module that brought this particular bug to the technical committee, I have
to say that with my out-of-tree module maintainer hat on the kernel team
seems to, by and large, be doing a good job of maintaining the kernel ABI
already.  That inclines me against supporting any major change in how this
is handled.

-- 
Russ Allbery (rra@debian.org)               <http://www.eyrie.org/~eagle/>




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Sun, 26 Dec 2010 20:57:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Julien BLACHE <jblache@debian.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Sun, 26 Dec 2010 20:57:03 GMT) Full text and rfc822 format available.

Message #139 received at 607368@bugs.debian.org (full text, mbox):

From: Julien BLACHE <jblache@debian.org>
To: 607368@bugs.debian.org
Subject: Re: Bug#607368: Please decide how kernel ABI should be managed
Date: Sun, 26 Dec 2010 21:54:44 +0100
Don Armstrong <don@debian.org> wrote:

Hi,

> For out of tree modules, these problems can either be resolved by
> changing the ABI number, or possibly by using Breaks: for all of the
> affected out-of-tree modules where the change wasn't wide-spread
> enough to bump the ABI number. A slightly wilder alternative, is to
> Provides: linux-kernel-abi-2.6.32-vmware-5 or something for
> out-of-tree modules which aren't going to be covered by the main ABI,
> but are important enough to require compatibility. Alternatively, we

This doesn't work for modules packaged/installed with DKMS, which is
slowly replacing module-assistant (and is not Debian-specific, this is
important to keep in mind here).

Unless DKMS in Debian switches to building modules at boot time, which
it currently doesn't do - and that would not solve the issue for modules
needed in the initrd. Not to mention that it would lengthen the boot
time and could break the boot for any number of reasons [1].


As you noted, silently breaking the ABI opens up a window during which
modules on-disk are potentially incompatible with the running
kernel. Not ideal and not easy to diagnose if you don't have some kernel
knowledge.

JB.

[1] Like running into an endless loop while attempting to build a
module, as happened to me with blcr, which would be pretty inconvenient
at boot time.

-- 
 Julien BLACHE <jblache@debian.org>  |  Debian, because code matters more 
 Debian & GNU/Linux Developer        |       <http://www.debian.org>
 Public key available on <http://www.jblache.org> - KeyID: F5D6 5169 
 GPG Fingerprint : 935A 79F1 C8B3 3521 FD62 7CC7 CD61 4FD7 F5D6 5169 




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Sun, 26 Dec 2010 23:09:12 GMT) Full text and rfc822 format available.

Acknowledgement sent to Ben Hutchings <ben@decadent.org.uk>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Sun, 26 Dec 2010 23:09:12 GMT) Full text and rfc822 format available.

Message #144 received at 607368@bugs.debian.org (full text, mbox):

From: Ben Hutchings <ben@decadent.org.uk>
To: Don Armstrong <don@debian.org>
Cc: 607368@bugs.debian.org, Julien BLACHE <jblache@debian.org>, Debian Kernel Team <debian-kernel@lists.debian.org>
Subject: Re: Bug#607368: Please decide how kernel ABI should be managed
Date: Sun, 26 Dec 2010 23:05:16 +0000
[Message part 1 (text/plain, inline)]
On Sun, 2010-12-26 at 12:23 -0800, Don Armstrong wrote:
> On Sun, 26 Dec 2010, Ben Hutchings wrote:
> > On Thu, 2010-12-23 at 12:08 -0800, Don Armstrong wrote:
> > > On Sun, 19 Dec 2010, Julien BLACHE wrote:
> > > > I think it would be best if this matter would be decided upon before
> > > > the release of Squeeze, or not too long after it, so as to avoid
> > > > further breakages in early kernel updates for Squeeze.
> > > 
> > > I have a couple of (possibly naïve) questions that would help me
> > > understand the space of solutions here.
> > > 
> > > 1) What is the kernel ABI currently used to indicate?
> > 
> > The ABI *number* indicates a range of versions within which newer
> > versions are likely to remain compatible with modules built for an
> > older version.
> 
> So currently there is no guarantee that a specific ABI maintains any
> kind of compatibility for out of tree modules; it is a best effort
> based on the kernel maintainer's understanding of what symbols have
> changed and what out of tree (or even in-tree) modules are affected.
> 
> Do the kernel maintainers currently track compatibility of in-tree
> modules for modules which may reasonably be loaded during the lifetime
> of the install? [I'm thinking of removable device drivers, things like
> KVM, etc.]

Not specifically.  *Most* modules will remain compatible, but we expect
users to reboot shortly after a kernel upgrade.

[...]
> What I think is missing now, is a discussion of which cases where
> changing the ABI number is necessary for proper functioning, and which
> cases of malfunction we feel are acceptable, and which are not.
> 
> For in tree modules, all of the problems that would occur from
> upgrading a kernel where the ABI had changed (but not the number) can
> be resolved by rebooting. I'm personally a bit concerned that these
> errors may be a bit disconcerting to our users, but that may be
> something we decide to live with and document.
> 
> For out of tree modules, these problems can either be resolved by
> changing the ABI number,

Yes.

> or possibly by using Breaks: for all of the
> affected out-of-tree modules where the change wasn't wide-spread
> enough to bump the ABI number.

No.  Firstly, if we know that an ABI change would break an OOT module
then we try to avoid making that change.  Therefore, if an ABI change
does break an OOT module then we would not know that we should add the
Breaks relation.  Also, we now recommend that OOT module sources are
packaged using dkms, which means the module binaries are *not* packaged
and no such relation can be declared.

> A slightly wilder alternative, is to
> Provides: linux-kernel-abi-2.6.32-vmware-5 or something for
> out-of-tree modules which aren't going to be covered by the main ABI,
> but are important enough to require compatibility.
[...]

I refuse to support any specific OOT module in this way unless paid to
do so.  I expect that other kernel team members will tell you the same.

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Sun, 26 Dec 2010 23:57:07 GMT) Full text and rfc822 format available.

Acknowledgement sent to Don Armstrong <don@debian.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Sun, 26 Dec 2010 23:57:07 GMT) Full text and rfc822 format available.

Message #149 received at 607368@bugs.debian.org (full text, mbox):

From: Don Armstrong <don@debian.org>
To: 607368@bugs.debian.org, Julien BLACHE <jblache@debian.org>, Debian Kernel Team <debian-kernel@lists.debian.org>
Subject: Re: Bug#607368: Please decide how kernel ABI should be managed
Date: Sun, 26 Dec 2010 15:55:12 -0800
On Sun, 26 Dec 2010, Ben Hutchings wrote:
> On Sun, 2010-12-26 at 12:23 -0800, Don Armstrong wrote:
> > or possibly by using Breaks: for all of the affected out-of-tree
> > modules where the change wasn't wide-spread enough to bump the ABI
> > number.
> 
> No. Firstly, if we know that an ABI change would break an OOT module
> then we try to avoid making that change.

Ok. And am I correct in assuming that if the ABI change would break an
OOT module, you would normally change the ABI number?

Which OOT modules are important enough to result in ABI number
changes?

How are the symbols that those OOT modules use communicated to the
kernel team?

What does the kernel maintainer team feel should be done by the
maintainer in this case to ensure continuity of upgrades and rebuilds
of the OOT modules?

> > A slightly wilder alternative, is to Provides:
> > linux-kernel-abi-2.6.32-vmware-5 or something for out-of-tree
> > modules which aren't going to be covered by the main ABI, but are
> > important enough to require compatibility.
> 
> I refuse to support any specific OOT module in this way unless paid
> to do so. I expect that other kernel team members will tell you the
> same.

I personally don't think a Provides: solution is going to be feasible
for technical reasons, and coordination reasons. Lets restrict
ourselves to discussing the technical reasons why a solution is
infeasible, rather than possible monetary impetus required to
implement them.


Don Armstrong

-- 
No matter how many instances of white swans we may have observed, this
does not justify the conclusion that all swans are white.
 -- Sir Karl Popper _Logic of Scientific Discovery_

http://www.donarmstrong.com              http://rzlab.ucr.edu




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Mon, 27 Dec 2010 04:45:06 GMT) Full text and rfc822 format available.

Acknowledgement sent to Ben Hutchings <ben@decadent.org.uk>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Mon, 27 Dec 2010 04:45:06 GMT) Full text and rfc822 format available.

Message #154 received at 607368@bugs.debian.org (full text, mbox):

From: Ben Hutchings <ben@decadent.org.uk>
To: Don Armstrong <don@debian.org>
Cc: 607368@bugs.debian.org, Julien BLACHE <jblache@debian.org>, Debian Kernel Team <debian-kernel@lists.debian.org>
Subject: Re: Bug#607368: Please decide how kernel ABI should be managed
Date: Mon, 27 Dec 2010 04:43:14 +0000
[Message part 1 (text/plain, inline)]
On Sun, 2010-12-26 at 15:55 -0800, Don Armstrong wrote:
> On Sun, 26 Dec 2010, Ben Hutchings wrote:
> > On Sun, 2010-12-26 at 12:23 -0800, Don Armstrong wrote:
> > > or possibly by using Breaks: for all of the affected out-of-tree
> > > modules where the change wasn't wide-spread enough to bump the ABI
> > > number.
> > 
> > No. Firstly, if we know that an ABI change would break an OOT module
> > then we try to avoid making that change.
> 
> Ok. And am I correct in assuming that if the ABI change would break an
> OOT module, you would normally change the ABI number?

In the time I've been involved in the kernel team, I haven't yet seen a
case where a bug fix required an ABI change that I knew would break an
OOT module.

I understand that in the past the kernel team has deferred such bug
fixes and eventually applied such deferred changes as a batch while
changing the ABI number, after coordinating with affected people (such
as the d-i and CD teams).

> Which OOT modules are important enough to result in ABI number
> changes?

We don't have a formal policy but I think we consider OOT modules that
(1) appear to be used in production and (2) have published source code
for at least the part that directly uses kernel symbols.

Anything distributed by Debian should meet those qualifications, but
users such as Julien also care about modules from other sources.  I
normally use Google Code Search to check for OOT modules using symbols
that have changed ABI and which I think might be ignorable.

> How are the symbols that those OOT modules use communicated to the
> kernel team?

They aren't.

> What does the kernel maintainer team feel should be done by the
> maintainer in this case to ensure continuity of upgrades and rebuilds
> of the OOT modules?
[...]

We recommend that OOT module package makes use of DKMS.  DKMS includes
hook scripts to trigger rebuilding OOT modules automatically for each
new kernel ABI version, if the end user or administrator installs the
module source and the appropriate linux-headers package.  In a more
tightly controlled environment where such packages should not be
installed on production servers, the administrator must rebuild modules
elsewhere and deploy them along with the kernel upgrade.  DKMS provides
various means for this.

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Tue, 04 Jan 2011 05:00:05 GMT) Full text and rfc822 format available.

Acknowledgement sent to Don Armstrong <don@debian.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Tue, 04 Jan 2011 05:00:05 GMT) Full text and rfc822 format available.

Message #159 received at 607368@bugs.debian.org (full text, mbox):

From: Don Armstrong <don@debian.org>
To: 607368@bugs.debian.org
Cc: Julien BLACHE <jblache@debian.org>, Debian Kernel Team <debian-kernel@lists.debian.org>
Subject: Re: Bug#607368: Please decide how kernel ABI should be managed
Date: Mon, 3 Jan 2011 20:56:39 -0800
On Mon, 27 Dec 2010, Ben Hutchings wrote:
> On Sun, 2010-12-26 at 15:55 -0800, Don Armstrong wrote:
> > Ok. And am I correct in assuming that if the ABI change would
> > break an OOT module, you would normally change the ABI number?
> 
> In the time I've been involved in the kernel team, I haven't yet
> seen a case where a bug fix required an ABI change that I knew would
> break an OOT module.

So in this case, if it was clear that the change would have broken an
OOT module, the kernel team would normally either postpone the change,
or change the ABI number.

> Anything distributed by Debian should meet those qualifications, but
> users such as Julien also care about modules from other sources. I
> normally use Google Code Search to check for OOT modules using
> symbols that have changed ABI and which I think might be ignorable.

Ok. For some reason, I hadn't originally noticed that this was
concerning an OOT module which Debian itself didn't actually
distribute. [Julien: I'm correct in that, right?] But that's probably
fine.
 
> > How are the symbols that those OOT modules use communicated to the
> > kernel team?
> 
> They aren't.

Would putting the onus on OOT maintainers to maintain such a list be
of benefit to the kernel maintainer team?

> > What does the kernel maintainer team feel should be done by the
> > maintainer in this case to ensure continuity of upgrades and rebuilds
> > of the OOT modules?
> [...]
> 
> We recommend that OOT module package makes use of DKMS. DKMS
> includes hook scripts to trigger rebuilding OOT modules
> automatically for each new kernel ABI version, if the end user or
> administrator installs the module source and the appropriate
> linux-headers package. In a more tightly controlled environment
> where such packages should not be installed on production servers,
> the administrator must rebuild modules elsewhere and deploy them
> along with the kernel upgrade. DKMS provides various means for this.

Makes sense. What about this case? What should Julien do?

Julien: Are you currently shipping a kernel in production which would
be affected by this change if we don't change the ABI number? Or does
this only affect cases where you are testing squeeze? Could it be
worked around by using DKMS or similar with prebuilt binaries and
requiring exact kernel version dependencies?


Don Armstrong

-- 
I don't care how poor and inefficient a little country is; they like
to run their own business.  I know men that would make my wife a
better husband than I am; but, darn it, I'm not going to give her to
'em.
 -- The Best of Will Rogers

http://www.donarmstrong.com              http://rzlab.ucr.edu




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Tue, 04 Jan 2011 11:30:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Julien BLACHE <jblache@debian.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Tue, 04 Jan 2011 11:30:03 GMT) Full text and rfc822 format available.

Message #164 received at 607368@bugs.debian.org (full text, mbox):

From: Julien BLACHE <jblache@debian.org>
To: 607368@bugs.debian.org
Cc: Debian Kernel Team <debian-kernel@lists.debian.org>
Subject: Re: Bug#607368: Please decide how kernel ABI should be managed
Date: Tue, 04 Jan 2011 12:28:22 +0100
Don Armstrong <don@debian.org> wrote:

Hi,

> Ok. For some reason, I hadn't originally noticed that this was
> concerning an OOT module which Debian itself didn't actually
> distribute. [Julien: I'm correct in that, right?] But that's probably
> fine.

You are correct.

> Julien: Are you currently shipping a kernel in production which would
> be affected by this change if we don't change the ABI number? Or does
> this only affect cases where you are testing squeeze? Could it be

I have 30 beta-testers that are affected by this issue on the
workstations they have started using for their everyday work. Although
it's still a beta phase, at this point, these workstations are to be
considered "in production" given the users have basically made the
switch now.

Full deployment involves over a thousand workstations.

> worked around by using DKMS or similar with prebuilt binaries and
> requiring exact kernel version dependencies?

DKMS is useless if the ABI number doesn't change, in its current
form. If DKMS was changed to rebuild all modules when the kernel package
is upgraded, we'd still have issues with on-disk modules not matching
the running kernel ABI until the machine is rebooted. This can sometimes
take two or three weeks if a long-running computation is running on the
machine.

We switched to DKMS to reduce the maintenance cost associated with
prebuilt binaries. We'd rather not come back to that if we can help
it. It also adds a delay to kernel updates that we'd rather avoid.

As to using strict dependencies... it makes all of the above even
worse.

And I'll ask again: what's the point of the kernel ABI number if we have
to use strict dependencies? Seriously?

We need a kernel ABI numbering we can rely on.

JB.

-- 
 Julien BLACHE <jblache@debian.org>  |  Debian, because code matters more 
 Debian & GNU/Linux Developer        |       <http://www.debian.org>
 Public key available on <http://www.jblache.org> - KeyID: F5D6 5169 
 GPG Fingerprint : 935A 79F1 C8B3 3521 FD62 7CC7 CD61 4FD7 F5D6 5169 




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Tue, 04 Jan 2011 22:33:07 GMT) Full text and rfc822 format available.

Acknowledgement sent to Ben Hutchings <ben@decadent.org.uk>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Tue, 04 Jan 2011 22:33:07 GMT) Full text and rfc822 format available.

Message #169 received at 607368@bugs.debian.org (full text, mbox):

From: Ben Hutchings <ben@decadent.org.uk>
To: Julien BLACHE <jblache@debian.org>
Cc: 607368@bugs.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>
Subject: Re: Bug#607368: Please decide how kernel ABI should be managed
Date: Tue, 4 Jan 2011 22:30:42 +0000
On Tue, Jan 04, 2011 at 12:28:22PM +0100, Julien BLACHE wrote:
> Don Armstrong <don@debian.org> wrote:
[...]
> > worked around by using DKMS or similar with prebuilt binaries and
> > requiring exact kernel version dependencies?
> 
> DKMS is useless if the ABI number doesn't change, in its current
> form. If DKMS was changed to rebuild all modules when the kernel package
> is upgraded, we'd still have issues with on-disk modules not matching
> the running kernel ABI until the machine is rebooted. This can sometimes
> take two or three weeks if a long-running computation is running on the
> machine.
> 
> We switched to DKMS to reduce the maintenance cost associated with
> prebuilt binaries. We'd rather not come back to that if we can help
> it. It also adds a delay to kernel updates that we'd rather avoid.
> 
> As to using strict dependencies... it makes all of the above even
> worse.
> 
> And I'll ask again: what's the point of the kernel ABI number if we have
> to use strict dependencies? Seriously?
[...]
 
Do pay attention.  We were discussing the implications of changing our
current practice of trying to avoid ABI bumps during freeze and stable
updates.  We would then probably change the uname release (the ABI
identifier) in each version of the package.

Ben.

-- 
Ben Hutchings
We get into the habit of living before acquiring the habit of thinking.
                                                              - Albert Camus




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Tue, 04 Jan 2011 23:09:09 GMT) Full text and rfc822 format available.

Acknowledgement sent to Don Armstrong <don@debian.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Tue, 04 Jan 2011 23:09:09 GMT) Full text and rfc822 format available.

Message #174 received at 607368@bugs.debian.org (full text, mbox):

From: Don Armstrong <don@debian.org>
To: Julien BLACHE <jblache@debian.org>, 607368@bugs.debian.org
Cc: Debian Kernel Team <debian-kernel@lists.debian.org>
Subject: Re: Bug#607368: Please decide how kernel ABI should be managed
Date: Tue, 4 Jan 2011 15:05:10 -0800
On Tue, 04 Jan 2011, Julien BLACHE wrote:
> Don Armstrong <don@debian.org> wrote:
> > Julien: Are you currently shipping a kernel in production which
> > would be affected by this change if we don't change the ABI
> > number? Or does this only affect cases where you are testing
> > squeeze? Could it be
> 
> I have 30 beta-testers that are affected by this issue on the
> workstations they have started using for their everyday work.
> Although it's still a beta phase, at this point, these workstations
> are to be considered "in production" given the users have basically
> made the switch now.

Ok. My main concern here is what exactly would happen if we were to
ignore the ABI change for this particular issue, and then put in place
some kind of a process where the kernel team could be informed of
downstream users of the ABI.

From my current understanding, the ABI number is only meant to cover
some of the symbols which can be used externally, not all of them.
[Specifically, those that the kernel team are aware of being used
externally.]

> Full deployment involves over a thousand workstations.

But presumably they're not running a testing version affected by this.

> > worked around by using DKMS or similar with prebuilt binaries and
> > requiring exact kernel version dependencies?
> 
> DKMS is useless if the ABI number doesn't change, in its current
> form. If DKMS was changed to rebuild all modules when the kernel
> package is upgraded, we'd still have issues with on-disk modules not
> matching the running kernel ABI until the machine is rebooted. This
> can sometimes take two or three weeks if a long-running computation
> is running on the machine.

Presumably this wouldn't be much of an issue, unless users are going
to be newly loading these modules. [Which I would hope wouldn't be the
case if you were running a long-running computation.]

> As to using strict dependencies... it makes all of the above even
> worse.

Certainly; there's a cost to be born on both sides. The most important
thing to avoid from my perspective is a kernel which when booted has
modules that cannot be loaded.
 
> And I'll ask again: what's the point of the kernel ABI number if we
> have to use strict dependencies?

Some modules may need strict dependencies if they are using symbols
not covered by the ABI; this is one possible way that we can resolve
this issue.

> Seriously?

Lets restrict ourselves to discussing the technical issues and
possible solutions instead of rhetorical flourishes.


Don Armstrong

-- 
The computer allows you to make mistakes faster than any other
invention, with the possible exception of handguns and tequila
 -- Mitch Ratcliffe

http://www.donarmstrong.com              http://rzlab.ucr.edu




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Wed, 05 Jan 2011 01:27:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Russ Allbery <rra@debian.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Wed, 05 Jan 2011 01:27:03 GMT) Full text and rfc822 format available.

Message #179 received at 607368@bugs.debian.org (full text, mbox):

From: Russ Allbery <rra@debian.org>
To: Julien BLACHE <jblache@debian.org>, 607368@bugs.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>
Subject: Re: Bug#607368: Please decide how kernel ABI should be managed
Date: Tue, 04 Jan 2011 17:23:25 -0800
Ben Hutchings <ben@decadent.org.uk> writes:

> Do pay attention.  We were discussing the implications of changing our
> current practice of trying to avoid ABI bumps during freeze and stable
> updates.  We would then probably change the uname release (the ABI
> identifier) in each version of the package.

This is certainly becoming more appealing with DKMS, but with my Stanford
sysadmin hat on, I have to admit that we'd find it rather annoying if the
ABI changed in stable.  I think that may be a good way to go in unstable
and testing up to the release, but it would be very nice to not do that
after the release.

With hundreds of servers, we'd rather not install compilers and DKMS on
every one of them, and with lots of machines, the loss of reproducibility
from separately compiling the modules on every system is an increasingly
large drawback.  We currently build internal packages (from the *-source
packages provided by Debian) for those external modules that we use so
that we can deploy the same thing everywhere, and having to rebuild
modules for every kernel update and deploy those new builds with the
kernel update would be fairly annoying.  With that system, we know for
sure that if the module mysteriously fails on one system but not on
others, it's not because it's a weird build or has some other compilation
issue.

In fact, we know almost exactly how annoying it would be, since Red Hat
has this policy, and it's been a major pain.  The handling of the kernel
versioning in stable is currently one of the major selling points for
Debian over Red Hat for us.  The very few times an ABI change was forced
in Debian stable due to some security issue, we had to put a fair bit of
work into making sure that everything was upgraded properly everywhere to
the new ABI.

(So thank you very much for all the work that you put into maintaining the
ABI!)

-- 
Russ Allbery (rra@debian.org)               <http://www.eyrie.org/~eagle/>




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Wed, 05 Jan 2011 01:48:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Ben Hutchings <ben@decadent.org.uk>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Wed, 05 Jan 2011 01:48:03 GMT) Full text and rfc822 format available.

Message #184 received at 607368@bugs.debian.org (full text, mbox):

From: Ben Hutchings <ben@decadent.org.uk>
To: Russ Allbery <rra@debian.org>
Cc: Julien BLACHE <jblache@debian.org>, 607368@bugs.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>
Subject: Re: Bug#607368: Please decide how kernel ABI should be managed
Date: Wed, 05 Jan 2011 01:45:29 +0000
[Message part 1 (text/plain, inline)]
On Tue, 2011-01-04 at 17:23 -0800, Russ Allbery wrote:
> Ben Hutchings <ben@decadent.org.uk> writes:
> 
> > Do pay attention.  We were discussing the implications of changing our
> > current practice of trying to avoid ABI bumps during freeze and stable
> > updates.  We would then probably change the uname release (the ABI
> > identifier) in each version of the package.
> 
> This is certainly becoming more appealing with DKMS, but with my Stanford
> sysadmin hat on, I have to admit that we'd find it rather annoying if the
> ABI changed in stable.  I think that may be a good way to go in unstable
> and testing up to the release, but it would be very nice to not do that
> after the release.

However, the upstream policy for stable updates does not support this.

> With hundreds of servers, we'd rather not install compilers and DKMS on
> every one of them, and with lots of machines, the loss of reproducibility
> from separately compiling the modules on every system is an increasingly
> large drawback.

This is why DKMS has the facility to build packages for installation
elsewhere.

> We currently build internal packages (from the *-source
> packages provided by Debian) for those external modules that we use so
> that we can deploy the same thing everywhere, and having to rebuild
> modules for every kernel update and deploy those new builds with the
> kernel update would be fairly annoying. With that system, we know for
> sure that if the module mysteriously fails on one system but not on
> others, it's not because it's a weird build or has some other compilation
> issue.
> 
> In fact, we know almost exactly how annoying it would be, since Red Hat
> has this policy, and it's been a major pain.  The handling of the kernel
> versioning in stable is currently one of the major selling points for
> Debian over Red Hat for us.
[...]

Note that Red Hat does maintain the ABI for most functions, even though
it change the uname release.  If you package OOT modules using the 'KMP'
macros for RPM, binary modules will be sym-linked into a 'weak-updates'
subdirectory for a newer kernel if their symbol dependencies are still
met.

We could try to implement something like that in Debian.

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Wed, 05 Jan 2011 01:57:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Russ Allbery <rra@debian.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Wed, 05 Jan 2011 01:57:04 GMT) Full text and rfc822 format available.

Message #189 received at 607368@bugs.debian.org (full text, mbox):

From: Russ Allbery <rra@debian.org>
To: Ben Hutchings <ben@decadent.org.uk>
Cc: Julien BLACHE <jblache@debian.org>, 607368@bugs.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>
Subject: Re: Bug#607368: Please decide how kernel ABI should be managed
Date: Tue, 04 Jan 2011 17:55:47 -0800
Ben Hutchings <ben@decadent.org.uk> writes:
> On Tue, 2011-01-04 at 17:23 -0800, Russ Allbery wrote:

>> With hundreds of servers, we'd rather not install compilers and DKMS on
>> every one of them, and with lots of machines, the loss of
>> reproducibility from separately compiling the modules on every system
>> is an increasingly large drawback.

> This is why DKMS has the facility to build packages for installation
> elsewhere.

But there would be no purpose served in using DKMS for this.  The only
place where DKMS has an advantage over building real Debian packages for
the modules is if you're going to let every machine build its own modules.
As soon as you are distributing modules built once to multiple machines,
using DKMS to do that is vaguely absurd: you have to reinvent all the
mechanisms of a repository and package upgrade system, when we already
have a perfectly useful and reasonable one in apt repositories with
package versioning and proper dependencies.

-- 
Russ Allbery (rra@debian.org)               <http://www.eyrie.org/~eagle/>




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Wed, 05 Jan 2011 02:09:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Ben Hutchings <ben@decadent.org.uk>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Wed, 05 Jan 2011 02:09:03 GMT) Full text and rfc822 format available.

Message #194 received at 607368@bugs.debian.org (full text, mbox):

From: Ben Hutchings <ben@decadent.org.uk>
To: Russ Allbery <rra@debian.org>
Cc: Julien BLACHE <jblache@debian.org>, 607368@bugs.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>
Subject: Re: Bug#607368: Please decide how kernel ABI should be managed
Date: Wed, 05 Jan 2011 02:04:12 +0000
[Message part 1 (text/plain, inline)]
On Tue, 2011-01-04 at 17:55 -0800, Russ Allbery wrote:
> Ben Hutchings <ben@decadent.org.uk> writes:
> > On Tue, 2011-01-04 at 17:23 -0800, Russ Allbery wrote:
> 
> >> With hundreds of servers, we'd rather not install compilers and DKMS on
> >> every one of them, and with lots of machines, the loss of
> >> reproducibility from separately compiling the modules on every system
> >> is an increasingly large drawback.
> 
> > This is why DKMS has the facility to build packages for installation
> > elsewhere.
> 
> But there would be no purpose served in using DKMS for this.  The only
> place where DKMS has an advantage over building real Debian packages for
> the modules is if you're going to let every machine build its own modules.
[...]

DKMS does build real Debian packages.  And that means that OOT module
sources do not need to be packaged differently depending on where the
modules will be built.

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Wed, 05 Jan 2011 02:12:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Russ Allbery <rra@debian.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Wed, 05 Jan 2011 02:12:03 GMT) Full text and rfc822 format available.

Message #199 received at 607368@bugs.debian.org (full text, mbox):

From: Russ Allbery <rra@debian.org>
To: Ben Hutchings <ben@decadent.org.uk>
Cc: Julien BLACHE <jblache@debian.org>, 607368@bugs.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>
Subject: Re: Bug#607368: Please decide how kernel ABI should be managed
Date: Tue, 04 Jan 2011 18:07:52 -0800
Ben Hutchings <ben@decadent.org.uk> writes:

> DKMS does build real Debian packages.  And that means that OOT module
> sources do not need to be packaged differently depending on where the
> modules will be built.

Oh, huh, I hadn't noticed that.  Thanks for the pointer!  I'll have to
play with that; I'd only previously seen the tarball distribution and
installation mechanism.

The work of providing both the -dkms and the traditional -source package
is fairly trivial and not much of a drain on the packager's time once the
original -source rules have been written.  I'm doing it right now for
multiple packages.  But writing the original -source package rules file is
arcane and very under-documented, so this is potentially a long-term
improvement.

-- 
Russ Allbery (rra@debian.org)               <http://www.eyrie.org/~eagle/>




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Wed, 05 Jan 2011 11:33:06 GMT) Full text and rfc822 format available.

Acknowledgement sent to Julien BLACHE <jblache@debian.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Wed, 05 Jan 2011 11:33:06 GMT) Full text and rfc822 format available.

Message #204 received at 607368@bugs.debian.org (full text, mbox):

From: Julien BLACHE <jblache@debian.org>
To: Don Armstrong <don@debian.org>
Cc: 607368@bugs.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>
Subject: Re: Bug#607368: Please decide how kernel ABI should be managed
Date: Wed, 05 Jan 2011 12:29:22 +0100
Don Armstrong <don@debian.org> wrote:

Hi,

> Ok. My main concern here is what exactly would happen if we were to
> ignore the ABI change for this particular issue, and then put in place
> some kind of a process where the kernel team could be informed of
> downstream users of the ABI.

The harm is done now, reverting or bumping the ABI at this point only
makes things worse.

>> Full deployment involves over a thousand workstations.
>
> But presumably they're not running a testing version affected by this.

At this time I have no assurance that this issue or a similar issue with
another symbol won't happen again during the Squeeze lifetime, so they
are potentially affected until proven otherwise as far as I'm concerned.

To the thousand machines given above, you can add several hundred
machines part of several HPC clusters; the nodes use external InfiniBand
drivers from ofa-kernel 1.5.2 in the pkg-ofed repository. Having the
cluster fail to come online after a kernel upgrade would be interesting.

We also have servers using the Brocade FC HBA/CNA drivers from Brocade,
due to the 2.6.32 drivers being way out of date (2.6.32->2.6.37 is
ca. 100 commits and needs new firmware files with new names, if anyone
is interested).

>> package is upgraded, we'd still have issues with on-disk modules not
>> matching the running kernel ABI until the machine is rebooted. This
>> can sometimes take two or three weeks if a long-running computation
>> is running on the machine.
>
> Presumably this wouldn't be much of an issue, unless users are going
> to be newly loading these modules. [Which I would hope wouldn't be the
> case if you were running a long-running computation.]

Modules get loaded automatically pretty much all the time on a
workstation: filesystem modules for a USB key or when upgrading grub,
drivers for USB devices, you name it.

>> And I'll ask again: what's the point of the kernel ABI number if we
>> have to use strict dependencies?
>
> Some modules may need strict dependencies if they are using symbols
> not covered by the ABI; this is one possible way that we can resolve
> this issue.

The issue I have with that, other than the fact that it is just plain
wrong, is that all the module packaging tools were built on the premise
that changes to the kernel ABI are reflected by the ABI number. None of
the tools work if that premise doesn't hold true.

JB.

-- 
 Julien BLACHE <jblache@debian.org>  |  Debian, because code matters more 
 Debian & GNU/Linux Developer        |       <http://www.debian.org>
 Public key available on <http://www.jblache.org> - KeyID: F5D6 5169 
 GPG Fingerprint : 935A 79F1 C8B3 3521 FD62 7CC7 CD61 4FD7 F5D6 5169 




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Wed, 26 Jan 2011 17:30:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Julien Cristau <jcristau@debian.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Wed, 26 Jan 2011 17:30:03 GMT) Full text and rfc822 format available.

Message #209 received at 607368@bugs.debian.org (full text, mbox):

From: Julien Cristau <jcristau@debian.org>
To: Julien BLACHE <jblache@debian.org>, 607368@bugs.debian.org
Cc: debian-kernel@lists.debian.org, debian-ctte@lists.debian.org
Subject: Re: Bug#607368: Please decide how kernel ABI should be managed
Date: Wed, 26 Jan 2011 18:26:25 +0100
[Message part 1 (text/plain, inline)]
On Sun, Dec 19, 2010 at 19:30:58 +0100, Julien BLACHE wrote:

> I think it would be best if this matter would be decided upon before the
> release of Squeeze, or not too long after it, so as to avoid further
> breakages in early kernel updates for Squeeze.
> 
We're getting close to the squeeze release.  Is the technical committee
going to reach a decision on this?

Cheers,
Julien
[signature.asc (application/pgp-signature, inline)]

Added tag(s) squeeze-ignore. Request was from Julien Cristau <jcristau@debian.org> to control@bugs.debian.org. (Thu, 03 Feb 2011 13:54:12 GMT) Full text and rfc822 format available.

Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Sat, 28 May 2011 05:00:06 GMT) Full text and rfc822 format available.

Acknowledgement sent to "Brett L. Trotter" <brett@silcon.com>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Sat, 28 May 2011 05:00:06 GMT) Full text and rfc822 format available.

Message #216 received at 607368@bugs.debian.org (full text, mbox):

From: "Brett L. Trotter" <brett@silcon.com>
To: 607368@bugs.debian.org
Subject: Kernel issue
Date: Fri, 27 May 2011 23:49:24 -0500
I run RHEL 6, rebooted after an update to find VMWare workstation does
not work anymore. I'm firmly pissed at the ambivalence I hear from the
linux kernel people.




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Sat, 28 May 2011 05:18:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Russ Allbery <rra@debian.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Sat, 28 May 2011 05:18:03 GMT) Full text and rfc822 format available.

Message #221 received at 607368@bugs.debian.org (full text, mbox):

From: Russ Allbery <rra@debian.org>
To: "Brett L. Trotter" <brett@silcon.com>, 607368@bugs.debian.org
Subject: Re: Bug#607368: Kernel issue
Date: Fri, 27 May 2011 22:14:53 -0700
Well, if you're going to report a problem you had with Red Hat Enterprise
Linux to Debian, I'll make the obvious response: you should really switch
to Debian.  It's a much better distribution.  :)  And KVM or Xen are nice,
free alternatives to VMware.

For a more helpful response, you may want to report this problem to Red
Hat and/or VMware, since there's not a lot we can do about either product.

-- 
Russ Allbery (rra@debian.org)               <http://www.eyrie.org/~eagle/>




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Sat, 28 May 2011 05:24:12 GMT) Full text and rfc822 format available.

Acknowledgement sent to Russ Allbery <rra@debian.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Sat, 28 May 2011 05:24:12 GMT) Full text and rfc822 format available.

Message #226 received at 607368@bugs.debian.org (full text, mbox):

From: Russ Allbery <rra@debian.org>
To: 607368@bugs.debian.org
Subject: Re: Bug#607368: Please decide how kernel ABI should be managed
Date: Fri, 27 May 2011 22:23:07 -0700
So, since something drew my attention to this bug again....

We made a decision by default to not override the kernel team for squeeze
already.  Reviewing the thread, it seems to me like the kernel team both
has good reasons for their decisions and has a reasonable grasp of the
issues, and is evaluating possible alternative solutions going forward.  I
don't see a need for the technical committee to override their decisions
here.

Would anyone like to put forward any alternative proposed actions besides
declining to override the kernel team?  Should we have a vote?

-- 
Russ Allbery (rra@debian.org)               <http://www.eyrie.org/~eagle/>




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Tue, 21 Feb 2012 21:03:07 GMT) Full text and rfc822 format available.

Acknowledgement sent to Don Armstrong <don@debian.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Tue, 21 Feb 2012 21:03:07 GMT) Full text and rfc822 format available.

Message #231 received at 607368@bugs.debian.org (full text, mbox):

From: Don Armstrong <don@debian.org>
To: 607368@bugs.debian.org
Subject: Ballot for kernel ABI decision
Date: Tue, 21 Feb 2012 13:00:45 -0800
From the bug log, the current consensus appears to be to not override
the kernel maintainer's policy with regards to ABI numbering. As such,
I suggest the following ballot:

A) The technical committee declines to override the kernel maintenance
team's ABI numbering policy.

B) Further discussion

I'll call for a vote in a few days if there aren't any objections.


Don Armstrong

-- 
[On a trip back from collecting grass seeds in tropical bird stomachs
and being thought by the customs agents to be transporting Marijuana.]
"Anyone so square as to tell you they are transporting grass seeds is
bound to be OK"
 -- Peter K. Klopfer _Seeds of Doubt_ Science 134:177 10 April 2009

http://www.donarmstrong.com              http://rzlab.ucr.edu




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Fri, 24 Feb 2012 20:09:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Don Armstrong <don@debian.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Fri, 24 Feb 2012 20:09:03 GMT) Full text and rfc822 format available.

Message #236 received at 607368@bugs.debian.org (full text, mbox):

From: Don Armstrong <don@debian.org>
To: 607368@bugs.debian.org
Subject: Call for Vote: Kernel ABI numbering policy
Date: Fri, 24 Feb 2012 12:04:55 -0800
I call for a vote on the kernel ABI numbering policy bug with the
following ballot:

A) The technical committee declines to override the kernel maintenance
team's ABI numbering policy.

B) Further discussion

END.


Don Armstrong

-- 
Who is thinking this?
I am.
 -- Greg Egan _Diaspora_ p38

http://www.donarmstrong.com              http://rzlab.ucr.edu




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Fri, 24 Feb 2012 20:57:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Don Armstrong <don@debian.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Fri, 24 Feb 2012 20:57:04 GMT) Full text and rfc822 format available.

Message #241 received at 607368@bugs.debian.org (full text, mbox):

From: Don Armstrong <don@debian.org>
To: 607368@bugs.debian.org
Subject: Re: Bug#607368: Call for Vote: Kernel ABI numbering policy
Date: Fri, 24 Feb 2012 12:55:04 -0800
On Fri, 24 Feb 2012, Don Armstrong wrote:
> I call for a vote on the kernel ABI numbering policy bug with the
> following ballot:
> 
> A) The technical committee declines to override the kernel maintenance
> team's ABI numbering policy.
> 
> B) Further discussion
> 
> END.

I vote AB.


Don Armstrong

-- 
"She decided what she wished to happen and then assumed that reality
would bend to her wishes." [...] "Reality doesn't indulge wishes."
 -- Terry Goodkind _Phantom_ p133

http://www.donarmstrong.com              http://rzlab.ucr.edu




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Fri, 24 Feb 2012 23:45:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Bdale Garbee <bdale@gag.com>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Fri, 24 Feb 2012 23:45:03 GMT) Full text and rfc822 format available.

Message #246 received at 607368@bugs.debian.org (full text, mbox):

From: Bdale Garbee <bdale@gag.com>
To: Don Armstrong <don@debian.org>, 607368@bugs.debian.org
Subject: Re: Bug#607368: Call for Vote: Kernel ABI numbering policy
Date: Fri, 24 Feb 2012 16:41:40 -0700
<#part sign=pgpmime>
On Fri, 24 Feb 2012 12:04:55 -0800, Don Armstrong <don@debian.org> wrote:
> I call for a vote on the kernel ABI numbering policy bug with the
> following ballot:
> 
> A) The technical committee declines to override the kernel maintenance
> team's ABI numbering policy.
> 
> B) Further discussion
> 
> END.

I vote AB.

Bdale




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Sat, 25 Feb 2012 10:12:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Andreas Barth <aba@ayous.org>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Sat, 25 Feb 2012 10:12:05 GMT) Full text and rfc822 format available.

Message #251 received at 607368@bugs.debian.org (full text, mbox):

From: Andreas Barth <aba@ayous.org>
To: 607368@bugs.debian.org
Subject: Re: Bug#607368: Call for Vote: Kernel ABI numbering policy
Date: Sat, 25 Feb 2012 10:38:17 +0100
* Don Armstrong (don@debian.org) [120224 21:09]:
> I call for a vote on the kernel ABI numbering policy bug with the
> following ballot:
> 
> A) The technical committee declines to override the kernel maintenance
> team's ABI numbering policy.
> 
> B) Further discussion

Voting AB


Andi




Information forwarded to debian-bugs-dist@lists.debian.org, Technical Committee <debian-ctte@lists.debian.org>:
Bug#607368; Package tech-ctte. (Mon, 27 Feb 2012 12:09:06 GMT) Full text and rfc822 format available.

Acknowledgement sent to Ian Jackson <ijackson@chiark.greenend.org.uk>:
Extra info received and forwarded to list. Copy sent to Technical Committee <debian-ctte@lists.debian.org>. (Mon, 27 Feb 2012 12:09:17 GMT) Full text and rfc822 format available.

Message #256 received at 607368@bugs.debian.org (full text, mbox):

From: Ian Jackson <ijackson@chiark.greenend.org.uk>
To: 607368@bugs.debian.org
Subject: Re: Bug#607368: Call for Vote: Kernel ABI numbering policy
Date: Mon, 27 Feb 2012 12:04:06 +0000
Don Armstrong writes ("Bug#607368: Call for Vote: Kernel ABI numbering policy"):
> I call for a vote on the kernel ABI numbering policy bug with the
> following ballot:
> 
> A) The technical committee declines to override the kernel maintenance
> team's ABI numbering policy.
> 
> B) Further discussion

I vote AB.

Ian.




Reply sent to Don Armstrong <don@debian.org>:
You have taken responsibility. (Mon, 27 Feb 2012 17:36:06 GMT) Full text and rfc822 format available.

Notification sent to Julien BLACHE <jblache@debian.org>:
Bug acknowledged by developer. (Mon, 27 Feb 2012 17:36:06 GMT) Full text and rfc822 format available.

Message #261 received at 607368-done@bugs.debian.org (full text, mbox):

From: Don Armstrong <don@debian.org>
To: 607368-done@bugs.debian.org, Debian Kernel Team <debian-kernel@lists.debian.org>
Subject: Re: Bug#607368: Call for Vote: Kernel ABI numbering policy
Date: Mon, 27 Feb 2012 09:32:58 -0800
On Mon, 27 Feb 2012, Ian Jackson wrote:
> Don Armstrong writes ("Bug#607368: Call for Vote: Kernel ABI numbering policy"):
> > I call for a vote on the kernel ABI numbering policy bug with the
> > following ballot:
> > 
> > A) The technical committee declines to override the kernel maintenance
> > team's ABI numbering policy.
> > 
> > B) Further discussion
> 
> I vote AB.

With Ian's vote, the outcome is no longer in doubt, and the decision is:

A) The technical committee declines to override the kernel maintenance
team's ABI numbering policy.


Don Armstrong

-- 
You could say she lived on the edge... Well, maybe not exactly on the edge,
just close enough to watch other people fall off.
  -- hugh macleod http://www.gapingvoid.com/Moveable_Type/archives/000309.html

http://www.donarmstrong.com              http://rzlab.ucr.edu




Bug archived. Request was from Debbugs Internal Request <owner@bugs.debian.org> to internal_control@bugs.debian.org. (Tue, 27 Mar 2012 07:38:50 GMT) Full text and rfc822 format available.

Send a report that this bug log contains spam.


Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Sun Apr 20 06:57:28 2014; Machine Name: buxtehude.debian.org

Debian Bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.