Debian Bug report logs -
#613428
dpkg --force-unsafe-io still calls fsync()
Reported by: Mike Hommey <mh+reportbug@glandium.org>
Date: Mon, 14 Feb 2011 20:03:02 UTC
Severity: normal
Tags: wontfix
Found in version dpkg/1.15.8.10
Done: Guillem Jover <guillem@debian.org>
Bug is archived. No further changes may be made.
Report forwarded
to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#613428; Package dpkg.
(Mon, 14 Feb 2011 20:03:05 GMT) (full text, mbox, link).
Acknowledgement sent
to Mike Hommey <mh+reportbug@glandium.org>:
New Bug report received and forwarded. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>.
(Mon, 14 Feb 2011 20:03:05 GMT) (full text, mbox, link).
Message #5 received at submit@bugs.debian.org (full text, mbox, reply):
Package: dpkg
Version: 1.15.8.10
Severity: normal
The manual page says:
unsafe-io: Do not perform safe I/O operations when unpacking. Currently
this implies not performing file system syncs before file renames, (...)
While this is strictly true, there are still two fsync()s occurring on each
package unpack, making the whole thing still slow when installing many
packages at a time.
These happen for /var/lib/dpkg/updates and /var/lib/dpkg/tmp.ci.
Mike
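One way to check which sync calls remain is to trace dpkg with strace and count the sync-family system calls. The sketch below assumes strace is installed; the helper name count_sync_calls and the file names trace.log and some-package.deb are illustrative and do not come from the report.

```sh
#!/bin/sh
# Count fsync()/fdatasync() lines in strace output read from stdin.
# count_sync_calls is a hypothetical helper name used for this sketch.
count_sync_calls() {
    # Match the syscall name at the start of a line or after a
    # non-identifier character, so "[pid 42] fsync(3)" also counts.
    # grep -c exits non-zero when the count is 0, hence "|| true".
    grep -cE '(^|[^A-Za-z_])(fsync|fdatasync)\(' || true
}

# Example (run as root; some-package.deb is a placeholder):
#   strace -f -e trace=fsync,fdatasync -o trace.log \
#       dpkg --force-unsafe-io --unpack some-package.deb
#   count_sync_calls < trace.log
```

Comparing the count with and without --force-unsafe-io should show which calls the option skips and which remain.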
-- System Information:
Debian Release: wheezy/sid
APT prefers unstable
APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.32-5-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.utf8, LC_CTYPE=en_US.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Versions of packages dpkg depends on:
ii coreutils 8.5-1 GNU core utilities
ii libbz2-1.0 1.0.5-6 high-quality block-sorting file co
ii libc6 2.11.2-11 Embedded GNU C Library: Shared lib
ii libselinux1 2.0.96-1 SELinux runtime shared libraries
ii xz-utils 5.0.0-2 XZ-format compression utilities
ii zlib1g 1:1.2.3.4.dfsg-3 compression library - runtime
dpkg recommends no packages.
Versions of packages dpkg suggests:
ii apt 0.8.11.1 Advanced front-end for dpkg
-- no debconf information
Information forwarded
to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#613428; Package dpkg.
(Tue, 15 Feb 2011 07:06:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Raphael Hertzog <hertzog@debian.org>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>.
(Tue, 15 Feb 2011 07:06:03 GMT) (full text, mbox, link).
Message #10 received at 613428@bugs.debian.org (full text, mbox, reply):
Hi,
On Mon, 14 Feb 2011, Mike Hommey wrote:
> The manual page says:
> unsafe-io: Do not perform safe I/O operations when unpacking. Currently
> this implies not performing file system syncs before file renames, (...)
>
> While this is strictly true, there are still two fsync()s occurring on each
> package unpack, making the whole thing still slow when installing many
> packages at a time.
>
> These happen for /var/lib/dpkg/updates and /var/lib/dpkg/tmp.ci.
This is on purpose.
The status database has always been synced IIRC. There might be one
fsync() more on the directory containing status information but that
should not make it much slower than what it used to be... at least not
compared to many fsync() for unpacked files.
We want to always ensure the consistency of the internal database.
Use eatmydata if you really want no fsync(), but we don't want to make
it too easy for users to corrupt the database.
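For reference, eatmydata is used as a command prefix: it preloads a library that turns fsync() and related calls into no-ops for that process, and the preload is normally inherited by child processes. A minimal sketch, assuming the eatmydata package is installed; the run_unsynced name and the fallback behaviour are choices made for this example only:

```sh
#!/bin/sh
# Run a command under eatmydata when it is available, otherwise run it
# unchanged. run_unsynced is a hypothetical helper, not part of dpkg.
run_unsynced() {
    if command -v eatmydata >/dev/null 2>&1; then
        eatmydata "$@"
    else
        "$@"
    fi
}

# Example (throwaway chroot only; foo.deb is a placeholder):
#   run_unsynced dpkg -i foo.deb
```

This only makes sense where losing the whole installation on a crash is acceptable, since the dpkg database is no longer synced either.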
Cheers,
--
Raphaël Hertzog ◈ Debian Developer
Follow my Debian News ▶ http://RaphaelHertzog.com (English)
▶ http://RaphaelHertzog.fr (Français)
Added tag(s) wontfix.
Request was from Raphaël Hertzog <hertzog@debian.org>
to control@bugs.debian.org.
(Sat, 26 Mar 2011 14:57:17 GMT) (full text, mbox, link).
Information forwarded
to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#613428; Package dpkg.
(Mon, 02 Jan 2012 09:48:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Petter Reinholdtsen <pere@hungry.com>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>.
(Mon, 02 Jan 2012 09:48:04 GMT) (full text, mbox, link).
Message #17 received at 613428@bugs.debian.org (full text, mbox, reply):
[ Mike Hommey ]
> While this is strictly true, there are still two fsync()s occurring on each
> package unpack, making the whole thing still slow when installing many
> packages at a time.
>
> These happen for /var/lib/dpkg/updates and /var/lib/dpkg/tmp.ci.
[ Raphael Hertzog ]
> This is on purpose.
Can you explain in which situations where --force-unsafe-io is used
you believe these fsync()s to be an advantage? I've tried to
come up with such scenarios without any luck so far.
The users of --force-unsafe-io seem to be those who know that if
something goes wrong during installation, they scratch everything and
start again, and finishing quickly is more important than handling
power outages. One example is building live CDs. Another is
installing the system for the first time. A third is creating chroots
for testing. In all these cases any power outage or other system
failure will make one erase everything and start over, and there is no
need for dpkg to keep a consistent state.
I would love my distro upgrade test to become faster and thus hope
these fsyncs can go away too. I install stable in a chroot,
dist-upgrade to testing, and check the result. If something goes
wrong I remove the chroot and start over, so there is no need for any
sync to disk. :)
--
Happy hacking
Petter Reinholdtsen
Information forwarded
to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#613428; Package dpkg.
(Mon, 02 Jan 2012 10:09:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Jonathan Nieder <jrnieder@gmail.com>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>.
(Mon, 02 Jan 2012 10:09:07 GMT) (full text, mbox, link).
Message #22 received at 613428@bugs.debian.org (full text, mbox, reply):
Petter Reinholdtsen wrote:
> The users of --force-unsafe-io seem to be those that
[...]
In retrospect, introducing --force-unsafe-io was probably a mistake.
Making sure to always call a wrapper function that behaves just like
fsync() but can be disabled would be a maintenance burden for almost
no benefit, given that eatmydata exists.
The current semantics are at least distinct from eatmydata, though
it's not obvious to me that it is a very useful distinction. (I guess
the idea is that it is for situations in which you can easily detect
corruption of individual installed packages and reinstall them, while
the internal database is still precious.)
Hope that clarifies a little,
Jonathan
Information forwarded
to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#613428; Package dpkg.
(Mon, 02 Jan 2012 10:18:27 GMT) (full text, mbox, link).
Acknowledgement sent
to Raphael Hertzog <hertzog@debian.org>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>.
(Mon, 02 Jan 2012 10:18:28 GMT) (full text, mbox, link).
Message #27 received at 613428@bugs.debian.org (full text, mbox, reply):
Hi,
On Mon, 02 Jan 2012, Petter Reinholdtsen wrote:
> [ Mike Hommey ]
> > While this is strictly true, there are still two fsync()s occurring on each
> > package unpack, making the whole thing still slow when installing many
> > packages at a time.
> >
> > These happen for /var/lib/dpkg/updates and /var/lib/dpkg/tmp.ci.
>
> [ Raphael Hertzog ]
> > This is on purpose.
>
> Can you explain in which situations where --force-unsafe-io is used
> you believe these fsync()s to be an advantage? I've tried to
> come up with such scenarios without any luck so far.
This is an option that we wish did not exist.
> The users of --force-unsafe-io seem to be those who know that if
> something goes wrong during installation, they scratch everything and
> start again, and finishing quickly is more important than handling
> power outages. One example is building live CDs. Another is
> installing the system for the first time. A third is creating chroots
> for testing. In all these cases any power outage or other system
> failure will make one erase everything and start over, and there is no
> need for dpkg to keep a consistent state.
>
> I would love my distro upgrade test to become faster and thus hope
> these fsyncs can go away too. I install stable in a chroot,
> dist-upgrade to testing, and check the result. If something goes
> wrong I remove the chroot and start over, so there is no need for any
> sync to disk. :)
The proper approach is to enhance your testing tools to use "eatmydata"
to really disable all fsync() and not only those of dpkg.
--force-unsafe-io was not meant for those use cases at all; it was
meant for some users to gain back some performance lost to the supplementary
fsync()s that have been added to dpkg. It was not meant to disable all
fsync()s, and in particular not those on the database.
Cheers,
--
Raphaël Hertzog ◈ Debian Developer
Pre-order a copy of the Debian Administrator's Handbook and help
liberate it: http://debian-handbook.info/liberation/
Information forwarded
to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#613428; Package dpkg.
(Mon, 02 Jan 2012 10:36:13 GMT) (full text, mbox, link).
Acknowledgement sent
to Petter Reinholdtsen <pere@hungry.com>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>.
(Mon, 02 Jan 2012 10:36:24 GMT) (full text, mbox, link).
Message #32 received at 613428@bugs.debian.org (full text, mbox, reply):
Thank you for the quick reply. I wish you a happy new year. :)
[Raphael Hertzog]
> This is an option that we wish did not exist.
OK. That still does not explain to me in what situation or use case it is
useful to drop fsync() for the package files while still using fsync() on
/var/lib/dpkg/updates and /var/lib/dpkg/tmp.ci. I assume there is
such a use case, given that the option is working the way it is.
> The proper approach is to enhance your testing tools to use
> "eatmydata" to really disable all fsync() and not only those of
> dpkg.
It is not really possible to do this without rewriting all of Debian
to allow it, while adding a file to gain a similar effect is
possible. I tried to get eatmydata to work, but there are just too
many packages that would need to change for it to have the desired
effect.
> --force-unsafe-io was not meant for those use cases at all; it was
> meant for some users to gain back some performance lost to the supplementary
> fsync()s that have been added to dpkg. It was not meant to disable all
> fsync()s, and in particular not those on the database.
I would expect these users to also want the extra performance gained
by dropping the left-behind fsync()s? Why should this use case want
the remaining fsync()s in place?
--
Happy hacking
Petter Reinholdtsen
Information forwarded
to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#613428; Package dpkg.
(Mon, 02 Jan 2012 11:18:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Raphael Hertzog <hertzog@debian.org>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>.
(Mon, 02 Jan 2012 11:18:05 GMT) (full text, mbox, link).
Message #37 received at 613428@bugs.debian.org (full text, mbox, reply):
On Mon, 02 Jan 2012, Petter Reinholdtsen wrote:
> I would expect these users to also want the extra performance gained
> by dropping the left-behind fsync()s? Why should this use case want
> the remaining fsync()s in place?
Because they care about the integrity of their system? We do not want to
make it easy to corrupt your dpkg database.
Cheers,
--
Raphaël Hertzog ◈ Debian Developer
Pre-order a copy of the Debian Administrator's Handbook and help
liberate it: http://debian-handbook.info/liberation/
Information forwarded
to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#613428; Package dpkg.
(Mon, 02 Jan 2012 12:15:05 GMT) (full text, mbox, link).
Acknowledgement sent
to Petter Reinholdtsen <pere@hungry.com>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>.
(Mon, 02 Jan 2012 12:17:03 GMT) (full text, mbox, link).
Message #42 received at 613428@bugs.debian.org (full text, mbox, reply):
[Raphael Hertzog]
> Because they care about the integrity of their system? We do not
> want to make it easy to corrupt your dpkg database.
Your comment does not make sense to me. I fail to understand how those
caring about the integrity of their system during the dpkg run would
use --force-unsafe-io. At least I only use it if I care about speed,
and would be happy to start again if something went wrong half-way
through the installation. When I use it, I do not care about the
integrity, and neither do those building live CDs. :)
Anyway, hopefully someone some time in the future will come up with a
use case that could make me understand when it is useful to not fsync
the package files and only fsync the dpkg database.
--
Happy hacking
Petter Reinholdtsen
Information forwarded
to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#613428; Package dpkg.
(Sun, 19 May 2013 13:30:04 GMT) (full text, mbox, link).
Acknowledgement sent
to Christoph Biedl <debian.axhn@manchmal.in-ulm.de>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>.
(Sun, 19 May 2013 13:30:04 GMT) (full text, mbox, link).
Message #47 received at 613428@bugs.debian.org (full text, mbox, reply):
[Message part 1 (text/plain, inline)]
Raphael Hertzog wrote...
> The proper approach is to enhance your testing tools to use "eatmydata"
> to really disable all fsync() and not only those of dpkg.
Not a good idea. eatmydata introduces new bugs, #667965 is one that
hit me. It causes some pain in multiarch installations, at least a
lot of noise from ldd. And I cannot think of a good reason why the
actual build scripts should do fsync, at least to a degree where this
visibly degrades performance.
Therefore, having something eatmydata-ish inside dpkg was really
helpful in corner cases, buildd being one.
Now, to propose a workaround for build daemons:
Create a small program that wraps the dpkg call into eatmydata. I did
so and called it dpkg-eatmydata. It is written in C for performance reasons,
but is equivalent to just
#!/bin/sh
/usr/bin/eatmydata /usr/bin/dpkg "$@"
A configuration snippet for apt
$ cat /etc/apt/apt.conf.d/dpkg-eatmydata
Dir::Bin::dpkg "/usr/bin/dpkg-eatmydata";
will cause the build dependency resolver to call that
eatmydata-wrapped dpkg, resulting in the desired speed gain.
Downside: Some binaries called from the maintainer scripts still might
emit ldd warnings. I can live with that.
> --force-unsafe-io was not meant for those use cases at all; it was
> meant for some users to gain back some performance lost to the supplementary
> fsync()s that have been added to dpkg. It was not meant to disable all
> fsync()s, and in particular not those on the database.
Since that option doesn't really do what it promises, in my humble
opinion it should be retired some day.
Christoph
[signature.asc (application/pgp-signature, inline)]
Information forwarded
to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#613428; Package dpkg.
(Wed, 17 Sep 2014 09:15:05 GMT) (full text, mbox, link).
Acknowledgement sent
to Petter Reinholdtsen <pere@hungry.com>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>.
(Wed, 17 Sep 2014 09:15:05 GMT) (full text, mbox, link).
Message #52 received at 613428@bugs.debian.org (full text, mbox, reply):
Hi.
I did some testing installing using eatmydata to see how much it could
reduce the installation time. I used the enclosed test script to
compare the installation time for three test setups. One is the normal
one, another uses dpkg-divert to divert apt-get, aptitude and
dpkg, while the third uses the Dir::Bin::dpkg setting to use a dpkg
wrapper with eatmydata enabled.
The installation was done with a 100Mbit/s connection to the Debian
mirror, so most of the time is spent unpacking. I tried using two
package sets, kde-standard and kde-full, picked to get a fairly large
number of packages installed.
This was the result. The number is in seconds.
Installing kde-standard
Wed Sep 17 09:54:50 CEST 2014 used: 357 divert
Wed Sep 17 10:00:51 CEST 2014 used: 359 dpkg_conf
Wed Sep 17 10:09:38 CEST 2014 used: 525 default
Installing kde-full with policy-rc.d in place.
Wed Sep 17 10:29:33 CEST 2014 used: 424 divert
Wed Sep 17 10:36:29 CEST 2014 used: 413 dpkg_conf
Wed Sep 17 10:45:35 CEST 2014 used: 543 default
As you can see, the reduction in installation time is in the range
21-32 percent of the current default. It is not obvious to me why the
Dir::Bin::dpkg approach can be quicker than the divert approach. This
might be caused by other issues, as the last run was done just after
boot. Perhaps the order in which these tests are executed matters?
Anyway, just wanted to share with you this data point comparing the
different ways to speed up package installation in Debian.
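The 21-32 percent range can be rechecked from the timings listed above. A small sketch, assuming awk is available; reduction_pct is a hypothetical helper name and takes the default time and the faster time in seconds:

```sh
#!/bin/sh
# Print the percentage of time saved relative to the default run.
# reduction_pct is a hypothetical helper; arguments are seconds.
reduction_pct() {
    awk -v base="$1" -v fast="$2" \
        'BEGIN { printf "%.1f\n", (base - fast) * 100 / base }'
}

reduction_pct 525 357   # kde-standard, divert
reduction_pct 525 359   # kde-standard, dpkg_conf
reduction_pct 543 424   # kde-full, divert
reduction_pct 543 413   # kde-full, dpkg_conf
```

The four calls print 32.0, 31.6, 21.9 and 23.9, which matches the stated range.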
------------------------ test-install-speed ----------------------------
#!/bin/sh
suite=testing
chroot=chroot-testing
#mirror=http://http.debian.net/debian
mirror=http://ftp.uio.no/debian
unset TMP TMPDIR
# Never wait for input
DEBIAN_FRONTEND=noninteractive
export DEBIAN_FRONTEND
# Logging helpers used by test_divert
info() { echo info: "$@" ; }
error() { echo error: "$@" ; }
make_chroot() {
debootstrap $suite $chroot $mirror
printf "#!/bin/sh\nexit 101\n" > $chroot/usr/sbin/policy-rc.d
chmod a+rx $chroot/usr/sbin/policy-rc.d
chroot $chroot apt-get install -y eatmydata
}
install_chroot_pkgs() {
chroot $chroot apt-get install -y kde-full
}
test_default() {
make_chroot
install_chroot_pkgs
}
test_divert() {
make_chroot
for bin in dpkg apt-get aptitude tasksel ; do
file=/usr/bin/$bin
# Test that the file exist and have not been diverted already.
if [ -f $chroot/usr/bin/$bin ] ; then
info "diverting /usr/bin/$bin using eatmydata"
printf "#!/bin/sh\neatmydata $bin.distrib \"\$@\"\n" \
> $chroot/usr/bin/$bin.edu
chmod 755 $chroot/usr/bin/$bin.edu
chroot $chroot dpkg-divert --package debian-edu-config \
--rename --quiet --add /usr/bin/$bin
ln -sf ./$bin.edu $chroot/usr/bin/$bin
else
error "unable to divert /usr/bin/$bin, as it is missing."
fi
done
install_chroot_pkgs
}
# https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=613428
test_dpkg_conf() {
make_chroot
cat > $chroot/usr/bin/dpkg-eatmydata <<'EOF'
#!/bin/sh
/usr/bin/eatmydata /usr/bin/dpkg "$@"
EOF
chmod a+rx $chroot/usr/bin/dpkg-eatmydata
cat > $chroot/etc/apt/apt.conf.d/dpkg-eatmydata <<EOF
Dir::Bin::dpkg "/usr/bin/dpkg-eatmydata";
EOF
install_chroot_pkgs
}
for f in divert dpkg_conf default ; do
rm -rf "$chroot"
start=$(date +%s)
test_$f
end=$(date +%s)
(LC_ALL=C date; echo "used: $(($end - $start)) $f" ) >> test.log
done
echo
tail test.log
------------------------------------------------------------------------
--
Happy hacking
Petter Reinholdtsen
Information forwarded
to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#613428; Package dpkg.
(Wed, 17 Sep 2014 13:18:04 GMT) (full text, mbox, link).
Acknowledgement sent
to Guillem Jover <guillem@debian.org>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>.
(Wed, 17 Sep 2014 13:18:04 GMT) (full text, mbox, link).
Message #57 received at 613428@bugs.debian.org (full text, mbox, reply):
[Message part 1 (text/plain, inline)]
Hi!
On Wed, 2014-09-17 at 11:11:30 +0200, Petter Reinholdtsen wrote:
> I did some testing installing using eatmydata to see how much it could
> reduce the installation time. I used the enclosed test script to
> compare the installation time for three test setups. One is the normal
> one, another uses dpkg-divert to divert apt-get, aptitude and
> dpkg, while the third uses the Dir::Bin::dpkg setting to use a dpkg
> wrapper with eatmydata enabled.
You asked in the past why the current implementation is the way it
is. A quick summary would be that dpkg has always done fsync() on
the database (not on its directories, but still), and gained support for
performing fsync() on the unpacked files because new filesystems
are very unsafe without it. People using those filesystems, which
force a choice between safety and speed, then saw a substantial performance
loss, so --force-unsafe-io was added to let them decide which they
preferred and go back to the previous (broken) behavior.
Of course, because these same filesystems show a very poor performance
with applications that are doing proper safe file handling, some
people also started to use the --force-unsafe-io option when doing
throwaway installations or upgrades, like on first install, or on
buildds and similar.
In any case, the point of the option is that, even if you get your
unpacked files corrupted with the 0-length issue, or similar, on your
day-to-day system, you should always be able to restore it from
recovery media, as you might only need to reinstall damaged packages,
and you know which ones those might be, because the database would be
in a sane state.
Part of this is explained in
<https://wiki.debian.org/Teams/Dpkg/FAQ#Q:_Why_is_dpkg_so_slow_when_using_new_filesystems_such_as_btrfs_or_ext4.3F>
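The "safe file handling" mentioned above is the usual write, sync, then rename sequence. If the sync is skipped, some filesystems may commit the rename before the file data, which is how a crash can leave a zero-length file. A rough shell sketch of the pattern follows; safe_replace is a hypothetical helper for illustration only, and it assumes a sync(1) that either accepts a file argument or can fall back to a global sync:

```sh
#!/bin/sh
# Replace a file's contents "safely": write to a temporary file, flush it
# to disk, then rename it over the target. safe_replace is a hypothetical
# helper for illustration; it reads the new contents from stdin.
safe_replace() {
    target=$1
    tmp="$target.tmp.$$"
    cat > "$tmp" || return 1
    # Flush the new data before the rename. Recent coreutils accept a
    # file argument; fall back to a global sync if that fails.
    sync "$tmp" 2>/dev/null || sync
    mv -f "$tmp" "$target"
}

# Example (the path is a placeholder):
#   printf 'new contents\n' | safe_replace /tmp/example.conf
```

Dropping the sync line gives the faster but unsafe variant, which is roughly the trade-off that --force-unsafe-io exposes for unpacked files.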
> This was the result. The number is in seconds.
>
> Installing kde-standard
>
> Wed Sep 17 09:54:50 CEST 2014 used: 357 divert
> Wed Sep 17 10:00:51 CEST 2014 used: 359 dpkg_conf
> Wed Sep 17 10:09:38 CEST 2014 used: 525 default
>
> Installing kde-full with policy-rc.d in place.
> Wed Sep 17 10:29:33 CEST 2014 used: 424 divert
> Wed Sep 17 10:36:29 CEST 2014 used: 413 dpkg_conf
> Wed Sep 17 10:45:35 CEST 2014 used: 543 default
>
> As you can see, the reduction in installation time is in the range
> 21-32 percent of the current default. It is not obvious to me why the
> Dir::Bin::dpkg approach can be quicker than the divert approach. This
> might be caused by other issues, as the last run was done just after
> boot. Perhaps the order in which these tests are executed matters?
You should either clear the kernel cache or reboot on each iteration
to try to get a similar initial state. The former can be done with
something like:
sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
You might also want to try with the attached dpkg patch which should
disable all fsync() calls in the main dpkg program, to see how the
rest of the system affects your performance, besides dpkg itself.
Thanks,
Guillem
[dpkg-no-fsync.patch (text/x-diff, attachment)]
Information forwarded
to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#613428; Package dpkg.
(Sat, 20 Sep 2014 02:45:05 GMT) (full text, mbox, link).
Acknowledgement sent
to Petter Reinholdtsen <pere@hungry.com>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>.
(Sat, 20 Sep 2014 02:45:05 GMT) (full text, mbox, link).
Message #62 received at 613428@bugs.debian.org (full text, mbox, reply):
[Guillem Jover]
> Hi!
Hi. :)
> You asked in the past why the current implementation is the way it
> is. A quick summary would be that, [...]
Thank you for the explanation. :)
> You should either clear the kernel cache or reboot on each iteration
> to try to get a similar initial state. The former can be done with
> something like:
>
> sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
Good point. I did some more testing with such flushing in place, and
tried a bit with different ordering too, and got this result
installing the kde-full package and dependencies (around 1700
packages):
Fri Sep 19 10:39:48 CEST 2014 used: 576 dpkg_conf
Fri Sep 19 10:49:09 CEST 2014 used: 558 divert
Fri Sep 19 11:01:33 CEST 2014 used: 741 default
Fri Sep 19 11:14:08 CEST 2014 used: 556 dpkg_conf
Fri Sep 19 11:23:28 CEST 2014 used: 557 divert
Fri Sep 19 11:37:57 CEST 2014 used: 866 default
Fri Sep 19 12:11:27 CEST 2014 used: 930 default
Fri Sep 19 12:20:54 CEST 2014 used: 564 divert
Fri Sep 19 12:30:21 CEST 2014 used: 564 dpkg_conf
Fri Sep 19 15:24:50 CEST 2014 used: 805 default
Fri Sep 19 15:34:15 CEST 2014 used: 562 divert
Fri Sep 19 15:43:40 CEST 2014 used: 562 dpkg_conf
The machine had 100Mbit/s to the mirror, so most of the time is spent
unpacking. The speedup seems to be significant, in the range of roughly
three to six minutes (165 to 366 seconds) for this set of packages.
Also, this indicates that using eatmydata for dpkg is enough to get
most of the advantage, and that using it for apt-get does not gain much
extra speedup. Adding a wrapper and configuring Dir::Bin::dpkg to
call it might be seen as less intrusive than the dpkg-divert method.
> You might also want to try with the attached dpkg patch which should
> disable all fsync() calls in the main dpkg program, to see how the
> rest of the system affects your performance, besides dpkg itself.
Will try to find time to do this later.:)
This is the script I use for testing now.
----------------------------------------------------------------------------------------
#!/bin/sh
set -e
suite=testing
chroot=chroot-testing
#mirror=http://http.debian.net/debian
mirror=http://ftp.uio.no/debian
unset TMP TMPDIR TEMP TEMPDIR
# Never wait for input
DEBIAN_FRONTEND=noninteractive
export DEBIAN_FRONTEND
info() { echo info: "$@" ; }
error() { echo error: "$@" ; }
make_chroot() {
debootstrap $suite $chroot $mirror
printf "#!/bin/sh\nexit 101\n" > $chroot/usr/sbin/policy-rc.d
chmod a+rx $chroot/usr/sbin/policy-rc.d
chroot $chroot apt-get install -y eatmydata
}
install_chroot_pkgs() {
chroot $chroot apt-get install -o Acquire::Retries=3 -y kde-full
}
test_default() {
make_chroot
install_chroot_pkgs
}
test_divert() {
make_chroot
for bin in dpkg apt-get aptitude tasksel ; do
file=/usr/bin/$bin
# Test that the file exist and have not been diverted already.
if [ -f $chroot/usr/bin/$bin ] ; then
info "diverting /usr/bin/$bin using eatmydata"
printf "#!/bin/sh\neatmydata $bin.distrib \"\$@\"\n" \
> $chroot/usr/bin/$bin.edu
chmod 755 $chroot/usr/bin/$bin.edu
chroot $chroot dpkg-divert --package debian-edu-config \
--rename --quiet --add /usr/bin/$bin
ln -sf ./$bin.edu $chroot/usr/bin/$bin
else
error "unable to divert /usr/bin/$bin, as it is missing."
fi
done
install_chroot_pkgs
}
# https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=613428
test_dpkg_conf() {
make_chroot
cat > $chroot/usr/bin/dpkg-eatmydata <<'EOF'
#!/bin/sh
/usr/bin/eatmydata /usr/bin/dpkg "$@"
EOF
chmod a+rx $chroot/usr/bin/dpkg-eatmydata
cat > $chroot/etc/apt/apt.conf.d/dpkg-eatmydata <<EOF
Dir::Bin::dpkg "/usr/bin/dpkg-eatmydata";
EOF
install_chroot_pkgs
}
info "Logging to run-$suite.log"
exec < /dev/null > run-$suite.log 2>&1
for f in \
default \
divert \
dpkg_conf
do
rm -rf "$chroot"
echo
echo "Testing $f"
echo
sync && echo 3 > /proc/sys/vm/drop_caches
start=$(date +%s)
test_$f
end=$(date +%s)
(LC_ALL=C date; echo "used: $(($end - $start)) $f" ) >> test.log
done
echo
tail test.log
----------------------------------------------------------------------------------------
--
Happy hacking
Petter Reinholdtsen
Information forwarded
to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#613428; Package dpkg.
(Sun, 21 Sep 2014 13:21:04 GMT) (full text, mbox, link).
Acknowledgement sent
to Petter Reinholdtsen <pere@hungry.com>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>.
(Sun, 21 Sep 2014 13:21:05 GMT) (full text, mbox, link).
Message #67 received at 613428@bugs.debian.org (full text, mbox, reply):
I've now tested using the dpkg patch disabling fsync(), and ran each
test three times, first comparing the normal dpkg with the
Dir::Bin::dpkg wrapper, and next comparing the patched dpkg with the
patched dpkg and the Dir::Bin::dpkg wrapper:
Sun Sep 21 09:21:28 CEST 2014 used: 750 default
Sun Sep 21 09:30:53 CEST 2014 used: 562 dpkg_conf
Sun Sep 21 09:43:32 CEST 2014 used: 756 default
Sun Sep 21 09:53:02 CEST 2014 used: 567 dpkg_conf
Sun Sep 21 10:06:25 CEST 2014 used: 800 default
Sun Sep 21 10:15:47 CEST 2014 used: 559 dpkg_conf
The 'default' average is 769+-32 seconds, the 'dpkg_conf' average is
563+-5 seconds.
Sun Sep 21 10:33:15 CEST 2014 used: 772 dpkg_nofsync
Sun Sep 21 10:42:38 CEST 2014 used: 560 dpkg_conf
Sun Sep 21 10:55:19 CEST 2014 used: 758 dpkg_nofsync
Sun Sep 21 11:04:43 CEST 2014 used: 561 dpkg_conf
Sun Sep 21 11:17:23 CEST 2014 used: 757 dpkg_nofsync
Sun Sep 21 11:26:45 CEST 2014 used: 559 dpkg_conf
The 'dpkg_nofsync' average is 762+-10 seconds, the 'dpkg_conf' average
is 560+-1 seconds. So the advantage of disabling fsync() in dpkg
itself seems negligible, while the advantage of using eatmydata is
significant, also with a patched dpkg.
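The averages above can be reproduced from the listed timings. The "+-" figures are consistent with the largest deviation from the mean, rounded up; that reading is an inference from the numbers, not something stated in the message. A sketch using awk, with mean_dev as a hypothetical helper name:

```sh
#!/bin/sh
# Print the rounded mean and the largest deviation from the mean (rounded
# up) for a list of timings in seconds. mean_dev is a hypothetical helper.
mean_dev() {
    printf '%s\n' "$@" | awk '
        BEGIN { max = 0 }
        { v[NR] = $1; sum += $1 }
        END {
            mean = sum / NR
            for (i = 1; i <= NR; i++) {
                d = v[i] - mean
                if (d < 0) d = -d
                if (d > max) max = d
            }
            # Round the largest deviation up to a whole second.
            up = int(max)
            if (up < max) up++
            printf "%.0f+-%d\n", mean, up
        }'
}

mean_dev 750 756 800   # default
mean_dev 562 567 559   # dpkg_conf, first series
mean_dev 772 758 757   # dpkg_nofsync
mean_dev 560 561 559   # dpkg_conf, second series
```

The four calls print 769+-32, 563+-5, 762+-10 and 560+-1, matching the figures quoted in the message.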
--
Happy hacking
Petter Reinholdtsen
Information forwarded
to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#613428; Package dpkg.
(Mon, 22 Sep 2014 08:39:04 GMT) (full text, mbox, link).
Acknowledgement sent
to Guillem Jover <guillem@debian.org>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>.
(Mon, 22 Sep 2014 08:39:04 GMT) (full text, mbox, link).
Message #72 received at 613428@bugs.debian.org (full text, mbox, reply):
Hi!
On Sun, 2013-05-19 at 15:27:32 +0200, Christoph Biedl wrote:
> Raphael Hertzog wrote...
> > The proper approach is to enhance your testing tools to use "eatmydata"
> > to really disable all fsync() and not only those of dpkg.
>
> Not a good idea. eatmydata introduces new bugs, #667965 is one that
> hit me. It causes some pain in multiarch installations, at least a
> lot of noise from ldd. And I cannot think of a good reason why the
> actual build scripts should do fsync, at least to a degree where this
> visibly degrades performance.
If you don't want fsync()s on your buildd then you need to run
something like eatmydata, nothing else will give you the performance
that you want. More so because the filesystems that require pervasive
fsync()s are the ones that will be suffering the most, and they are
forcing people to sprinkle fsync()s in their programs, to avoid the
common data loss scenarios that they introduced to get the speed boost
in exchange for data safety. On those you cannot have both.
> Therefore, having something eatmydata-ish inside dpkg was really
> helpful in corner cases like buildd is one.
Where to plug eatmydata is the buildd admin's choice, or part of the
buildd framework, not something that should be provided by dpkg.
And as I've suspected all along and as shown by Petter, the database
fsync()s are insignificant.
> > --force-unsafe-io was not meant for those use cases at all; it was
> > meant for some users to gain back some performance lost to the supplementary
> > fsync()s that have been added to dpkg. It was not meant to disable all
> > fsync()s, and in particular not those on the database.
>
> Since that option doesn't really do what it promises, in my humble
> opinion it should be retired some day.
It does. Perhaps the man page is not clear enough, but it was
implemented this way on purpose. I'll be updating the man page to
make it crystal clear though.
Thanks,
Guillem
Reply sent
to Guillem Jover <guillem@debian.org>:
You have taken responsibility.
(Mon, 22 Sep 2014 08:45:08 GMT) (full text, mbox, link).
Notification sent
to Mike Hommey <mh+reportbug@glandium.org>:
Bug acknowledged by developer.
(Mon, 22 Sep 2014 08:45:08 GMT) (full text, mbox, link).
Message #77 received at 613428-done@bugs.debian.org (full text, mbox, reply):
Hi!
On Sun, 2014-09-21 at 15:20:25 +0200, Petter Reinholdtsen wrote:
> I've now tested using the dpkg patch disabling fsync(), and ran each
> test three times, first comparing the normal dpkg with the
> Dir::Bin::dpkg wrapper, and next comparing the patched dpkg with the
> patched dpkg and the Dir::Bin::dpkg wrapper:
Thanks a bunch for doing these tests!
> Sun Sep 21 09:21:28 CEST 2014 used: 750 default
> Sun Sep 21 09:30:53 CEST 2014 used: 562 dpkg_conf
> Sun Sep 21 09:43:32 CEST 2014 used: 756 default
> Sun Sep 21 09:53:02 CEST 2014 used: 567 dpkg_conf
> Sun Sep 21 10:06:25 CEST 2014 used: 800 default
> Sun Sep 21 10:15:47 CEST 2014 used: 559 dpkg_conf
> The 'default' average is 769+-32 seconds, the 'dpkg_conf' average is
> 563+-5 seconds.
>
> Sun Sep 21 10:33:15 CEST 2014 used: 772 dpkg_nofsync
> Sun Sep 21 10:42:38 CEST 2014 used: 560 dpkg_conf
> Sun Sep 21 10:55:19 CEST 2014 used: 758 dpkg_nofsync
> Sun Sep 21 11:04:43 CEST 2014 used: 561 dpkg_conf
> Sun Sep 21 11:17:23 CEST 2014 used: 757 dpkg_nofsync
> Sun Sep 21 11:26:45 CEST 2014 used: 559 dpkg_conf
>
> The 'dpkg_nofsync' average is 762+-10 seconds, the 'dpkg_conf' average
> is 560+-1 seconds. So the advantage of disabling fsync() in dpkg
> itself seems negligible, while the advantage of using eatmydata is
> significant, also with a patched dpkg.
I've suspected this would be the case all along, but as I don't use
one of the new filesystems (due to issues like this), I never
bothered to test it.
Given that it now should be clear that any significant performance
loss is not coming from the pre-existing db fsync()s, but from other
commands, I'm going to close this bug report. If someone can perform
more tests showing otherwise, please feel free to reopen and I'll
consider extending or adding a new option for a full-unsafe-io mode,
but as it stands, I don't think that makes any sense, when what you
might want is to just use eatmydata or similar.
Thanks,
Guillem
Information forwarded
to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#613428; Package dpkg.
(Mon, 22 Sep 2014 17:45:10 GMT) (full text, mbox, link).
Acknowledgement sent
to Petter Reinholdtsen <pere@hungry.com>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>.
(Mon, 22 Sep 2014 17:45:10 GMT) (full text, mbox, link).
Message #82 received at 613428@bugs.debian.org (full text, mbox, reply):
[Guillem Jover]
> I've suspected this would be the case all along, but as I don't use
> one of the new filesystems (due to issues like this), I never
> bothered to test it.
Note, I only tested on ext4. I have not tested on any of the new file
systems. :)
--
Happy hacking
Petter Reinholdtsen
Bug archived.
Request was from Debbugs Internal Request <owner@bugs.debian.org>
to internal_control@bugs.debian.org.
(Tue, 21 Oct 2014 07:28:29 GMT) (full text, mbox, link).
Debian bug tracking system administrator <owner@bugs.debian.org>.
Last modified:
Fri Jul 24 07:00:41 2020;