Debian Bug report logs - #888234
dpkg: packages not fully upgraded, but dpkg doesn't notice
Report forwarded
to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#888234; Package dpkg.
(Wed, 24 Jan 2018 05:09:05 GMT) (full text, mbox, link).
Acknowledgement sent
to Christoph Anton Mitterer <calestyo@scientia.net>:
New Bug report received and forwarded. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>.
(Wed, 24 Jan 2018 05:09:05 GMT) (full text, mbox, link).
Message #5 received at submit@bugs.debian.org (full text, mbox, reply):
[Message part 1 (text/plain, inline)]
Package: dpkg
Version: 1.19.0.5
Severity: serious
Hi.
Not sure if the following is expected to happen, but I think
dpkg doesn't notice when some packages aren't actually fully upgraded.
Here's the situation:
For a while now I have seen sporadic freezes of my system (quite often actually
while I do upgrades via aptitude and dpkg runs through the packages).
Until now I thought it was my new notebook, but I just went back to
my old one (swapped the SSD) and it happened there as well, so it must be
something else in Debian (kernel, GPU drivers, whatever).
Once it's frozen, only powering off helps (no magic SysRq). After the reboot, when
I check the logs, apt's termlog doesn't show the end message, and
dpkg's log also seems to show that packages haven't been completely
installed/configured, etc.
(see attached log)
And I noticed that my locales were broken, because of the recent new libc
packages in sid... and "locales" wasn't configured.
Normally dpkg -C would show this then, but it doesn't.
Neither does dpkg --configure -a do anything.
This has already happened quite a few times now, and my system probably has
many packages in a not fully installed state, while dpkg thinks everything
is fine.
Interestingly: debsums -asc doesn't find problems.
I have no idea how to debug this any further... please tell me if you need
anything.
btw: this is on btrfs
Cheers,
Chris
-- System Information:
Debian Release: buster/sid
APT prefers unstable-debug
APT policy: (500, 'unstable-debug'), (500, 'unstable')
Architecture: amd64 (x86_64)
Kernel: Linux 4.14.0-3-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_DE.UTF-8, LC_CTYPE=en_DE.UTF-8 (charmap=UTF-8), LANGUAGE=en_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
Versions of packages dpkg depends on:
ii libbz2-1.0 1.0.6-8.1
ii libc6 2.26-5
ii liblzma5 5.2.2-1.3
ii libselinux1 2.7-2
ii tar 1.29b-2
ii zlib1g 1:1.2.8.dfsg-5
dpkg recommends no packages.
Versions of packages dpkg suggests:
ii apt 1.6~alpha7
ii debsig-verify 0.18
-- no debconf information
[dpkg.log (text/plain, attachment)]
Information forwarded
to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#888234; Package dpkg.
(Wed, 24 Jan 2018 08:39:05 GMT) (full text, mbox, link).
Acknowledgement sent
to Julien Patriarca <leatherface@debian.org>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>.
(Wed, 24 Jan 2018 08:39:05 GMT) (full text, mbox, link).
Message #10 received at 888234@bugs.debian.org (full text, mbox, reply):
Package: dpkg
Version: 1.19.0.5
Followup-For: Bug #888234
I hit this bug in about 1 out of 2 upgrades on my laptop. Apt downloads the
packages, then dpkg kicks in to install them, and the laptop
freezes completely. I have to power-cycle it.
Once back into the system, I have to run dpkg --configure -a and apt
install --fix-broken.
Each time I get this message from apt: the package <package-name> needs to be reinstalled, but I can't find an archive for it.
I have to rm -rf /var/lib/apt/lists, then run the apt install
--fix-broken and finally upgrade if necessary.
Please tell me if you wish me to provide some more details, logs or
whatever to investigate this.
-- System Information:
Debian Release: buster/sid
APT prefers testing
APT policy: (900, 'testing'), (90, 'unstable')
Architecture: amd64 (x86_64)
Kernel: Linux 4.14.0-3-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US:en (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
Versions of packages dpkg depends on:
ii libbz2-1.0 1.0.6-8.1
ii libc6 2.26-4
ii liblzma5 5.2.2-1.3
ii libselinux1 2.7-2
ii tar 1.29b-2
ii zlib1g 1:1.2.8.dfsg-5
dpkg recommends no packages.
Versions of packages dpkg suggests:
ii apt 1.6~alpha6
pn debsig-verify <none>
-- no debconf information
Information forwarded
to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#888234; Package dpkg.
(Thu, 25 Jan 2018 23:36:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Guillem Jover <guillem@debian.org>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>.
(Thu, 25 Jan 2018 23:36:03 GMT) (full text, mbox, link).
Message #15 received at 888234@bugs.debian.org (full text, mbox, reply):
On Wed, 2018-01-24 at 09:32:13 +0100, Julien Patriarca wrote:
> Package: dpkg
> Version: 1.19.0.5
> Followup-For: Bug #888234
> I hit this bug in about 1 out of 2 upgrades on my laptop. Apt downloads the
> packages, then dpkg kicks in to install them, and the laptop
> freezes completely. I have to power-cycle it.
I don't think this bug has much to do with the reported one TBH.
> Once back into the system, I have to run dpkg --configure -a and apt
> install --fix-broken.
> Each time I get this message from apt: the package <package-name>
> needs to be reinstalled, but I can't find an archive for it.
This means apt and dpkg are aware something is broken, which
apparently does not happen with the reported bug.
> I have to rm -rf /var/lib/apt/lists, then run the apt install
> --fix-broken and finally upgrade if necessary.
Removing the lists should not be needed? Perhaps you just need to
run «apt update» because the archive does not contain the packages
listed in your local metadata.
> Please tell me if you wish me to provide some more details, logs or
> whatever to investigate this.
I'm not sure whether this is really a bug at all; if it is, it might
be in apt though.
Thanks,
Guillem
Information forwarded
to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#888234; Package dpkg.
(Thu, 25 Jan 2018 23:45:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Guillem Jover <guillem@debian.org>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>.
(Thu, 25 Jan 2018 23:45:03 GMT) (full text, mbox, link).
Message #20 received at 888234@bugs.debian.org (full text, mbox, reply):
Control: severity -1 important
[ The severity might deserve to be even lower though, depending on the
analysis of the bug. ]
Hi!
On Wed, 2018-01-24 at 06:05:14 +0100, Christoph Anton Mitterer wrote:
> Package: dpkg
> Version: 1.19.0.5
> Severity: serious
> Not sure if the following is expected to happen, but I think
> dpkg doesn't notice when some packages aren't actually fully upgraded.
Are you sure they are not fully upgraded? What makes you think so?
Just the dpkg.log below?
> Here's the situation:
> For a while now I have seen sporadic freezes of my system (quite often actually
> while I do upgrades via aptitude and dpkg runs through the packages).
>
> Until now I thought it was my new notebook, but I just went back to
> my old one (swapped the SSD) and it happened there as well, so it must be
> something else in Debian (kernel, GPU drivers, whatever).
Intel microcode?
> Once it's frozen, only powering off helps (no magic SysRq). After the reboot, when
> I check the logs, apt's termlog doesn't show the end message, and
> dpkg's log also seems to show that packages haven't been completely
> installed/configured, etc.
> (see attached log)
That might well be because the log file is not fsync()ed to disk, so
the operations might have finished without the log having been written.
> And I noticed that my locales were broken, because of the recent new libc
> packages in sid... and "locales" wasn't configured.
>
>
> Normally dpkg -C would show this then, but it doesn't.
> Neither does dpkg --configure -a do anything.
And there are no packages in the status file with a Status less than
installed? And no lingering files under «/var/lib/dpkg/updates»?
> This has already happened quite a few times now, and my system probably has
> many packages in a not fully installed state, while dpkg thinks everything
> is fine.
dpkg is very careful about how it handles its database. If it thinks
they are installed, and there are no update journal entries in the
above directory, then this might indicate something more severe, like
a very broken filesystem (on-disk or implementation), a hardware failure,
or similar.
> Interestingly: debsums -asc doesn't find problems.
That to me would indicate that the packages are either the old
versions or the new ones, but they match.
> I have no idea how to debug this any further... please tell me if you need
> anything.
> btw: this is on btrfs
That alone seems very suspect IMO.
Thanks,
Guillem
Severity set to 'important' from 'serious'
Request was from Guillem Jover <guillem@debian.org>
to 888234-submit@bugs.debian.org.
(Thu, 25 Jan 2018 23:45:03 GMT) (full text, mbox, link).
Information forwarded
to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#888234; Package dpkg.
(Fri, 26 Jan 2018 00:15:04 GMT) (full text, mbox, link).
Acknowledgement sent
to Christoph Anton Mitterer <calestyo@scientia.net>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>.
(Fri, 26 Jan 2018 00:15:04 GMT) (full text, mbox, link).
Message #27 received at 888234@bugs.debian.org (full text, mbox, reply):
On Fri, 2018-01-26 at 00:42 +0100, Guillem Jover wrote:
> Are you sure they are not fully upgraded? What makes you think so?
> Just the dpkg.log below?
No, I ignored it at first because I thought it not so unlikely that
the log simply didn't get flushed out before the freeze.
But then there was recently an upgrade to glibc (which includes locale
re-generation). A crash happened and afterwards e.g. gnome-terminal
didn't even start anymore (with some locale related errors, when
started from xterm).
Once I regenerated the locales, gnome-terminal worked fine again.
Of course it could simply be, that the locales didn't get flushed out
in time (respectively no commit was made in btrfs),... but then dpkg
shouldn't think it would be configured, right?
Also, when I was actually looking at the upgrade process in aptitude...
I had several times such a freeze, and it didn't even reach the
"Setting up..." phase.
So intuitively I'd guess it couldn't finish that after the freeze (at
least the HDD LED didn't show any flashing).
> Intel microcode?
Unlikely, cause I've seen these freezes long before the
spectre/meltdown upgrades, and at least once again after the microcode
was withdrawn since.
> > Normally dpkg -C would show this then, but it doesn't.
> > Neither does dpkg --configure -a do anything.
>
> And there are no packages in the status file with a Status less than
> installed? And no lingering files under «/var/lib/dpkg/updates»?
/var/lib/dpkg/updates/ is empty (well at least right now... not sure if
it would have gotten cleaned up somehow else in the meantime).
/var/lib/dpkg# grep ^Status status | sort -u
Status: deinstall ok config-files
Status: hold ok installed
Status: install ok installed
but these are all expected (i.e. the deinstall config-files are
deleted/not-purged packages)
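The grep above can also be scripted. What follows is a minimal Python sketch, not part of dpkg: it counts the distinct Status lines in status-file text and reports any whose final word is not a settled state. The function names, the sample text (including the package names oldpkg and libfoo) and the choice of which states count as settled are assumptions made for illustration.

```python
from collections import Counter

# Assumption: these final states mean dpkg has no pending work for the package.
SETTLED = {"installed", "config-files", "not-installed"}


def summarize_status(status_text):
    """Count each distinct 'Status:' value found in dpkg status-file text."""
    counts = Counter()
    for line in status_text.splitlines():
        if line.startswith("Status:"):
            counts[line[len("Status:"):].strip()] += 1
    return counts


def unfinished_states(counts):
    """Return the Status values whose last word is not a settled state."""
    result = []
    for state in counts:
        words = state.split()
        if words and words[-1] not in SETTLED:
            result.append(state)
    return sorted(result)


# Hypothetical sample; a real run would read /var/lib/dpkg/status instead.
sample = """\
Package: locales
Status: install ok installed

Package: oldpkg
Status: deinstall ok config-files

Package: libfoo
Status: install ok unpacked
"""

counts = summarize_status(sample)
print(unfinished_states(counts))  # prints ['install ok unpacked']
```

With the three Status values listed above, unfinished_states would return an empty list, which matches what dpkg -C reports.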
> > This has already happened quite a few times now, and my system probably has
> > many packages in a not fully installed state, while dpkg thinks everything
> > is fine.
>
> dpkg is very careful about how it handles its database. If it thinks
> they are installed, and there are no update journal entries in the
> above directory, then this might indicate something more severe, like
> a very broken filesystem (on-disk or implementation), a hardware failure,
> or similar.
Arguably, btrfs isn't perfect, but so far I never found any real
corruptions in case of any freezes/crashes/etc.
The only thing I ever found was that something wasn't committed
yet, and got completely removed, but dpkg should in turn protect
against that, AFAIU (with syncs at the appropriate places).
As for the hardware: I think it could then only be the SSD, cause this
is the only common thing, after I switched the notebook now.
> > Interestingly: debsums -asc doesn't find problems.
>
> That to me would indicate that the packages are either the old
> versions or the new ones, but they match.
Is there any easy way to check that (i.e. whether the files are all
still old, but dpkg thinks the upgrade was performed and the new
version is in place)?
I took a random sample and compared one file each from the libc6 and locales
packages on my system with the one in the .deb,... but of course I
may have just picked the wrong one that still matches.
Could it be, that they always got unpacked, but not configured and that
only this information would have been somehow lost?
Cause that could explain why the locales haven't been regenerated.
> > I have no idea how to debug this any further... please tell me if
> > you need
> > anything.
> > btw: this is on btrfs
>
> That alone seems very suspect IMO.
;-)
Well it's much better IMO than its reputation. Of course when one uses
not-yet-stable features like qgroups or raid56, things get easily
problematic,... but other than that the design of btrfs with CoW should
in principle prevent any corruptions and just allow for either-old-or-
new.
And as I've said, so far I never found a really corrupt file.
Thanks :)
Information forwarded
to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#888234; Package dpkg.
(Fri, 26 Jan 2018 02:06:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Guillem Jover <guillem@debian.org>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>.
(Fri, 26 Jan 2018 02:06:03 GMT) (full text, mbox, link).
Message #32 received at 888234@bugs.debian.org (full text, mbox, reply):
Hi!
On Fri, 2018-01-26 at 01:04:08 +0100, Christoph Anton Mitterer wrote:
> On Fri, 2018-01-26 at 00:42 +0100, Guillem Jover wrote:
> > Are you sure they are not fully upgraded? What makes you think so?
> > Just the dpkg.log below?
>
> No, I ignored it at first because I thought it not so unlikely that
> the log simply didn't get flushed out before the freeze.
Yeah, I guess it's never been considered a file important enough to preserve
at all costs, including performance degradation. I'll have to ponder, or
benchmark, whether fdatasync(2)ing the log file would penalize performance too much.
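To make the trade-off concrete, here is a rough Python sketch of what syncing a log on every write means. It is not dpkg's code (dpkg is written in C); append_log_line, its durable parameter and the log path are hypothetical names used only for illustration. It falls back to fsync on platforms where fdatasync is unavailable.

```python
import os


def append_log_line(path, line, durable=True):
    """Append one line to a log file, optionally forcing it to stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        data = (line.rstrip("\n") + "\n").encode("utf-8")
        # Loop because os.write may write fewer bytes than requested.
        while data:
            written = os.write(fd, data)
            data = data[written:]
        if durable:
            # fdatasync flushes the file data (and the metadata needed to
            # read it back); it blocks until the device acknowledges.
            sync = getattr(os, "fdatasync", os.fsync)
            sync(fd)
    finally:
        os.close(fd)
```

Calling it with durable=True once per log line is the costly variant discussed here: every call waits for the storage device, which is why benchmarking would be needed before adopting it.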
> But then there was recently an upgrade to glibc (which includes locale
> re-generation). A crash happened and afterwards e.g. gnome-terminal
> didn't even start anymore (with some locale related errors, when
> started from xterm).
> Once I regenerated the locales, gnome-terminal worked fine again.
>
> Of course it could simply be, that the locales didn't get flushed out
> in time (respectively no commit was made in btrfs),... but then dpkg
> shouldn't think it would be configured, right?
dpkg does not and cannot control what and how things are done in
packages' maintainer scripts. So it can happen that dpkg syncs all
its databases and all the extracted files to disk, but the maintainer
scripts do not call the equivalent fsync(2), and thus their outputs linger
around in memory and get lost on a crash. I'd expect most maintainer
scripts not to be abrupt-crash-safe, or even many applications TBH,
as not many things do the rename(2)/fsync(2) dance or similar.
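For readers unfamiliar with the rename(2)/fsync(2) dance, a minimal Python sketch of the pattern follows. It illustrates the general technique on a POSIX system and is not taken from dpkg or from any maintainer script; atomic_write is a hypothetical helper name. Note that the temporary file is created with mkstemp's restrictive default permissions, which a real tool would probably want to adjust.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` so a crash leaves either the old or new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory, so the rename
    # stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # force the new contents to stable storage
        os.replace(tmp_path, path)  # atomic rename over the target
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Sync the containing directory so the rename itself survives a crash.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A generator that writes its output this way would, after an abrupt crash, leave either the complete old file or the complete new one, never a truncated or missing file.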
> > > Normally dpkg -C would show this then, but it doesn't.
> > > Neither does dpkg --configure -a do anything.
> >
> > And there are no packages in the status file with a Status less than
> > installed? And no lingering files under «/var/lib/dpkg/updates»?
>
> /var/lib/dpkg/updates/ is empty (well at least right now... not sure if
> it would have gotten cleaned up somehow else in the meantime).
Any subsequent write action would have incorporated the database
journal entries.
> > > This has already happened quite a few times now, and my system probably has
> > > many packages in a not fully installed state, while dpkg thinks everything
> > > is fine.
> >
> > dpkg is very careful about how it handles its database. If it thinks
> > they are installed, and there are no update journal entries in the
> > above directory, then this might indicate something more severe, like
> > a very broken filesystem (on-disk or implementation), a hardware failure,
> > or similar.
>
> Arguably, btrfs isn't perfect, but so far I never found any real
> corruptions in case of any freezes/crashes/etc.
> The only thing I ever found was that something wasn't committed
> yet, and got completely removed, but dpkg should in turn protect
> against that, AFAIU (with syncs at the appropriate places).
dpkg can only protect what it does itself.
> > > Interestingly: debsums -asc doesn't find problems.
> >
> > That to me would indicate that the packages are either the old
> > versions or the new ones, but they match.
>
> Is there any easy way to check that (i.e. whether the files are all
> still old, but dpkg thinks the upgrade was performed and the new
> version is in place)?
> I took a random sample and compared one file each from the libc6 and locales
> packages on my system with the one in the .deb,... but of course I
> may have just picked the wrong one that still matches.
The easiest is probably to download the .debs matching the versions in
the system, and compare their md5sums with the ones in the dpkg db. If
«dpkg -V» then says there's no problem, then that should mean the
unpacked files are fine.
This of course does not cover any files generated by maintainer
scripts.
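The comparison of on-disk files against recorded md5sums can be sketched in Python as follows. This is an illustration of the idea, not a replacement for «dpkg -V» or debsums: verify_md5sums and file_md5 are hypothetical names, and the sketch assumes each listing line has the usual form of a hex checksum, whitespace, then a path relative to the root directory, as in the .md5sums lists dpkg keeps under /var/lib/dpkg/info.

```python
import hashlib
import os


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5sums(md5sums_text, root="/"):
    """Check files under `root` against md5sums-style text.

    Returns a list of (relative_path, problem) tuples; an empty list
    means every listed file exists and matches its checksum.
    """
    problems = []
    for line in md5sums_text.splitlines():
        if not line.strip():
            continue
        expected, rel_path = line.split(None, 1)
        full_path = os.path.join(root, rel_path.lstrip("/"))
        if not os.path.isfile(full_path):
            problems.append((rel_path, "missing"))
        elif file_md5(full_path) != expected:
            problems.append((rel_path, "checksum mismatch"))
    return problems
```

To answer the old-versus-new question, the listing would be taken from the downloaded .deb of the version dpkg believes is installed, and root left at "/". As noted above, this says nothing about files produced by maintainer scripts.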
> Could it be, that they always got unpacked, but not configured and that
> only this information would have been somehow lost?
> Cause that could explain why the locales haven't been regenerated.
If they are unpacked but not configured the new files would be on
disk, and the new md5sums as well, and the db would contain an
appropriate status. The package status does not progress until the
current stage has been finished, and those get properly synced to
disk. So in principle no, that should never happen.
I assume though that the locales had been generated but not flushed
to disk. I don't see any fsync(2)/fdatasync(2) in the glibc source
for the locale generators (one is a shell script, the other is a perl
script, and the last is a C program).
So, if the above checks look fine, I'd say the only thing that can
be done is perhaps to consider syncing the log file, but that might
be too much. And perhaps clone and reassign to glibc to make its
maintainer scripts more robust against abrupt crashes. But take
into account this will be an uphill battle, as mentioned above
most maintscripts and even most programs and applications are not
abrupt-crash safe anyway…
Thanks,
Guillem
Reply sent
to Guillem Jover <guillem@debian.org>:
You have taken responsibility.
(Fri, 03 Aug 2018 02:03:03 GMT) (full text, mbox, link).
Notification sent
to Christoph Anton Mitterer <calestyo@scientia.net>:
Bug acknowledged by developer.
(Fri, 03 Aug 2018 02:03:03 GMT) (full text, mbox, link).
Message #37 received at 888234-done@bugs.debian.org (full text, mbox, reply):
Hi!
On Fri, 2018-01-26 at 03:03:32 +0100, Guillem Jover wrote:
> On Fri, 2018-01-26 at 01:04:08 +0100, Christoph Anton Mitterer wrote:
> > Could it be, that they always got unpacked, but not configured and that
> > only this information would have been somehow lost?
> > Cause that could explain why the locales haven't been regenerated.
>
> If they are unpacked but not configured the new files would be on
> disk, and the new md5sums as well, and the db would contain an
> appropriate status. The package status does not progress until the
> current stage has been finished, and those get properly synced to
> disk. So in principle no, that should never happen.
>
> I assume though that the locales had been generated but not flushed
> to disk. I don't see any fsync(2)/fdatasync(2) in the glibc source
> for the locale generators (one is a shell script, the other is a perl
> script, and the last is a C program).
>
> So, if the above checks look fine, I'd say the only thing that can
> be done is perhaps to consider syncing the log file, but that might
> be too much. And perhaps clone and reassign to glibc to make its
> maintainer scripts more robust against abrupt crashes. But take
> into account this will be an uphill battle, as mentioned above
> most maintscripts and even most programs and applications are not
> abrupt-crash safe anyway…
So, I don't think this is really actionable from the dpkg side. And
fsync()ing the log file seems too much. So I'm closing it now.
I'd recommend either switching filesystems, or trying to file bug reports
against packages that might not be doing fsync()+rename() on anything
called by maintainer scripts, but I'm not sure how that will be
received by the various maintainers and, more importantly, upstreams.
Thanks,
Guillem
Bug archived.
Request was from Debbugs Internal Request <owner@bugs.debian.org>
to internal_control@bugs.debian.org.
(Fri, 31 Aug 2018 07:27:25 GMT) (full text, mbox, link).
Last modified:
Fri Jul 24 00:44:35 2020;