Debian Bug report logs - #588339
sync() causes significant dpkg unpack performance degradation on tmpfs (pbuilder, piuparts, ...)

Package: dpkg; Maintainer for dpkg is Dpkg Developers <debian-dpkg@lists.debian.org>; Source for dpkg is src:dpkg.

Reported by: Andreas Beckmann <debian@abeckmann.de>

Date: Wed, 7 Jul 2010 14:03:02 UTC

Severity: normal

Found in version dpkg/1.15.7.2

Fixed in version dpkg/1.15.8.6

Done: Guillem Jover <guillem@debian.org>

Bug is archived. No further changes may be made.


Report forwarded to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#588339; Package dpkg. (Wed, 07 Jul 2010 14:03:04 GMT)

Acknowledgement sent to Andreas Beckmann <debian@abeckmann.de>:
New Bug report received and forwarded. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>. (Wed, 07 Jul 2010 14:03:04 GMT)

Message #5 received at submit@bugs.debian.org:

From: Andreas Beckmann <debian@abeckmann.de>
To: Debian Bug Tracking System <submit@bugs.debian.org>
Subject: sync() causes significant dpkg unpack performance degradation on tmpfs (pbuilder, piuparts, ...)
Date: Wed, 07 Jul 2010 16:01:26 +0200
Package: dpkg
Version: 1.15.7.2
Severity: normal

Hi,

the recent fsync()/sync() changes cause a significant performance
loss in dpkg when it is used in chroot environments living on tmpfs.

For short-lived chroots that see a lot of dpkg activity (e.g.
pbuilder environments while building, piuparts while testing, ...) I
prefer to have them on tmpfs in order to reduce the amount of I/O which
hits the disk - and get a significant speed boost that way.

Unfortunately since the last dpkg changes concerning sync()/fsync() this
no longer works out well - the continuous sync() from a "virtual" chroot
on tmpfs hits the physical system really hard, causing speed loss
factors between 3-5, probably more if multiple pbuilder builds/piuparts
tests are run in parallel.

Is there a possibility to disable the syncing when dpkg runs on tmpfs?


Andreas

-- System Information:
Debian Release: squeeze/sid
  APT prefers stable
  APT policy: (800, 'stable'), (700, 'testing'), (600, 'unstable'), (130, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.32-5-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.utf8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages dpkg depends on:
ii  coreutils         8.5-1                  GNU core utilities
ii  libbz2-1.0        1.0.5-4                high-quality block-sorting file co
ii  libc6             2.11.2-2               Embedded GNU C Library: Shared lib
ii  libselinux1       2.0.94-1               SELinux runtime shared libraries
ii  xz-utils          4.999.9beta+20100527-1 XZ-format compression utilities
ii  zlib1g            1:1.2.3.4.dfsg-3       compression library - runtime

dpkg recommends no packages.

Versions of packages dpkg suggests:
ii  apt                           0.7.25.3   Advanced front-end for dpkg

-- no debconf information




Information forwarded to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#588339; Package dpkg. (Wed, 07 Jul 2010 16:57:03 GMT)

Acknowledgement sent to Jonathan Nieder <jrnieder@gmail.com>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>. (Wed, 07 Jul 2010 16:57:03 GMT)

Message #10 received at 588339@bugs.debian.org:

From: Jonathan Nieder <jrnieder@gmail.com>
To: linux-2.6@packages.debian.org
Cc: Andreas Beckmann <debian@abeckmann.de>, 588339@bugs.debian.org
Subject: Re: Bug#588339: sync() causes significant dpkg unpack performance degradation on tmpfs (pbuilder, piuparts, ...)
Date: Wed, 7 Jul 2010 11:55:34 -0500
Hi kernel team,

Andreas Beckmann wrote:

> Unfortunately since the last dpkg changes concerning sync()/fsync() this
> no longer works out well - the continuous sync() from a "virtual" chroot
> on tmpfs hits the physical system really hard, causing speed loss
> factors between 3-5, probably more if multiple pbuilder builds/piuparts
> tests are run in parallel.

Why would sync() do anything on tmpfs?  The s_bdi field from its
superblock is never set to non-NULL in mm/shmem.c, so that’s not it.
Ah, but sync_filesystems() iterates over all filesystems, not just
those accessible from the chroot.

This sucks.  To recap:

1. On ext4 with certain mount options, using rename() without first
   calling fsync() to get the data on disk has an unfortunate risk of
   clearing out a file[1].

2. On ext4 with certain mount options, using fsync() instead of sync()
   to sync a collection of newly installed files is unacceptably
   slow[2].

3. sync() obviously does way more than we want it to, since it
   touches files and filesystems that have nothing to do with
   dpkg’s work.

So what should we do?  Dear kernel, we will happily provide a list
of files we want to be renamed in place.  Can you make sure they
have the right data without _repeatedly_ incurring the penalty of
fsync()?

Jonathan

[1] http://bugs.debian.org/567089
[2] http://bugs.debian.org/578635
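
The write, fsync(), rename() sequence from point 1 can be sketched as
follows. This is a minimal Python illustration under stated assumptions,
not dpkg's actual C code; the helper name replace_file_safely is
hypothetical. It flushes only the one file being replaced, so it avoids
the cross-filesystem cost of sync() from point 3, at the price of one
fsync() per file (point 2).

```python
import os
import tempfile


def replace_file_safely(path: str, data: bytes) -> None:
    """Write data to a temporary sibling file, fsync it, then rename it over path.

    Hypothetical helper for illustration only; it is not taken from dpkg.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise rename() cannot replace the target atomically.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Flush this one file to stable storage. Unlike sync(), this
            # does not touch unrelated files or filesystems.
            os.fsync(tmp.fileno())
        os.rename(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Note that mkstemp() creates the file with restrictive permissions, so a
real installer would also have to set the final mode and ownership before
the rename.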




Information forwarded to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#588339; Package dpkg. (Wed, 20 Oct 2010 21:15:02 GMT)

Acknowledgement sent to "Chanoch (Ken) Bloom" <kbloom@gmail.com>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>. (Wed, 20 Oct 2010 21:15:02 GMT)

Message #15 received at 588339@bugs.debian.org:

From: "Chanoch (Ken) Bloom" <kbloom@gmail.com>
To: 588339@bugs.debian.org
Subject: sync/fsync in dpkg
Date: Wed, 20 Oct 2010 16:11:05 -0500
> Why would sync() do anything on tmpfs?  The s_bdi field from its
> superblock is never set to non-NULL in mm/shmem.c, so that’s not it.
> Ah, but sync_filesystems() iterates over all filesystems, not just
> those accessible from the chroot.
> 
> This sucks.  To recap:
> 
> 1. On ext4 with certain mount options, using rename() without first
>    calling fsync() to get the data on disk has an unfortunate risk of
>    clearing out a file[1].

This issue was current at the beginning of 2010, around the time Bug
#567089 was filed and discussed. It's been fixed in the kernel since
then. See http://lwn.net/Articles/322823/, and
http://lwn.net/Articles/326471/

Does it still affect the shipping Debian kernel?

> 2. On ext4 with certain mount options, using fsync() instead of sync()
>    to sync a collection of newly installed files is unacceptably
>    slow[2].

The problem here was "data=ordered". ext3 also suffered from this
problem, since its default was "data=ordered".
In brief, ONE fsync() call cost about as much as ONE sync() call.
The solution was "don't use data=ordered" (and Linus patched the
kernel to change the default) then fsync() will be suitably faster.

The bug you cite here was also around April/May when this problem was
being sorted out by the Linux kernel community.

Though this may still affect the shipping Debian kernel for
"data=ordered" mounts (I don't actually know whether they've managed
to fix data=ordered), it should no longer affect default mount
options. Is that right?

See http://lwn.net/Articles/328363/

> 3. sync() obviously does way more than we want it to, since it
>    touches files and filesystems that have nothing to do with
>    dpkg’s work.
> 
> So what should we do?  Dear kernel, we will happily provide a list
> of files we want to be renamed in place.  Can you make sure they
> have the right data without _repeatedly_ incurring the penalty of
> fsync()?

Is a solution of "mount your hard drive in a way that fsync() doesn't
hurt" a good solution? I think that was the upstream kernel
developers' decision on how to handle this.

If not, maybe postponing sync() calls further is the solution.
I.e. instead of doing it after every package, do it every 10 packages,
or just do it once at the end of an apt-get dist-upgrade.


Here is a benchmark of performance with sync() versus without sync(). This test
was done on ext4 in cowbuilder chroots, with all of the packages pre-cached by
apt-cacher-ng.

# time eatmydata apt-get install --no-install-recommends openoffice.org
0 upgraded, 142 newly installed, 0 to remove and 0 not upgraded.
...
real 0m57.682s
user 0m37.030s
sys 0m7.220s

# time apt-get install --no-install-recommends openoffice.org
0 upgraded, 142 newly installed, 0 to remove and 0 not upgraded.
...
real 3m17.158s
user 0m37.186s
sys 0m11.057s
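
For reference, the wall-clock ratio between the two runs above can be
worked out as in the short sketch below. The helper name parse_time is
hypothetical; the two input strings are the "real" values reported above.
The ratio comes out at roughly 3.4, while user time is almost identical
in both runs.

```python
import re


def parse_time(value: str) -> float:
    """Convert a shell `time` value such as '3m17.158s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognised time value: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# "real" times from the two runs above.
without_sync = parse_time("0m57.682s")   # run under eatmydata
with_sync = parse_time("3m17.158s")      # plain apt-get run

slowdown = with_sync / without_sync
print(f"wall-clock slowdown: {slowdown:.2f}x")
```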




-- 
Chanoch (Ken) Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.
http://www.iit.edu/~kbloom1/




Information forwarded to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#588339; Package dpkg. (Wed, 20 Oct 2010 22:06:06 GMT)

Acknowledgement sent to Modestas Vainius <modestas@vainius.eu>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>. (Wed, 20 Oct 2010 22:06:06 GMT)

Message #20 received at 588339@bugs.debian.org:

From: Modestas Vainius <modestas@vainius.eu>
To: 588339@bugs.debian.org, "Chanoch (Ken) Bloom" <kbloom@gmail.com>
Subject: Re: Bug#588339: sync/fsync in dpkg
Date: Thu, 21 Oct 2010 01:04:26 +0300
Hello,

On Thursday 21 October 2010 00:11:05 Chanoch (Ken) Bloom wrote:
> This issue was current at the beginning of 2010, around the time Bug
> #567089 was filed and discussed. It's been fixed in the kernel since
> then. See http://lwn.net/Articles/322823/, and
> http://lwn.net/Articles/326471/
> 
> Does it still affect the shipping Debian kernel?

ext3 is not a very big problem from my experience unless there is heavy I/O in 
the background. I use 2.6.35 kernel.

> The problem here was "data=ordered". ext3 also suffered from this
> problem, since its default was "data=ordered".
> In brief, ONE fsync() call cost about as much as ONE sync() call.
> The solution was "don't use data=ordered" (and Linus patched the
> kernel to change the default) then fsync() will be suitably faster.

ext4 and especially btrfs take a huge (in 10x-60x range) performance hit due 
to those repetitive fsync() or sync() calls. But since dpkg keeps calling 
sync(), performance suffers even if dpkg is not writing to ext4/btrfs file 
system directly.

My benchmarks are here:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=588254

-- 
Modestas Vainius <modestas@vainius.eu>

Information forwarded to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#588339; Package dpkg. (Thu, 21 Oct 2010 00:09:08 GMT)

Acknowledgement sent to "Chanoch (Ken) Bloom" <kbloom@gmail.com>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>. (Thu, 21 Oct 2010 00:09:08 GMT)

Message #25 received at 588339@bugs.debian.org:

From: "Chanoch (Ken) Bloom" <kbloom@gmail.com>
To: Modestas Vainius <modestas@vainius.eu>
Cc: 588339@bugs.debian.org
Subject: Re: Bug#588339: sync/fsync in dpkg
Date: Wed, 20 Oct 2010 19:05:32 -0500
On Thu, 2010-10-21 at 01:04 +0300, Modestas Vainius wrote:
> Hello,
> 
> On Thursday 21 October 2010 00:11:05 Chanoch (Ken) Bloom wrote:
> > This issue was current at the beginning of 2010, around the time Bug
> > #567089 was filed and discussed. It's been fixed in the kernel since
> > then. See http://lwn.net/Articles/322823/, and
> > http://lwn.net/Articles/326471/
> > 
> > Does it still affect the shipping Debian kernel?
> 
> ext3 is not a very big problem from my experience unless there is heavy I/O in 
> the background. I use 2.6.35 kernel.
> 
> > The problem here was "data=ordered". ext3 also suffered from this
> > problem, since its default was "data=ordered".
> > In brief, ONE fsync() call cost about as much as ONE sync() call.
> > The solution was "don't use data=ordered" (and Linus patched the
> > kernel to change the default) then fsync() will be suitably faster.
> 
> ext4 and especially btrfs take a huge (in 10x-60x range) performance hit due 
> to those repetitive fsync() or sync() calls. But since dpkg keeps calling 
> sync(), performance suffers even if dpkg is not writing to ext4/btrfs file 
> system directly.
> 
> My benchmarks are here:
> 
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=588254

The version you're using in Bug #588254 is the first version of dpkg
that uses sync(). What I'd like to establish is whether Linus and
friends have fixed fsync() so that it's significantly faster than
sync(). It sounded like that was their goal.

Now, looking through the kernel git I'm not so sure, because commit
aa32a796389bedbcf1c7714385b18714a0743810 switched ext3 back to
data=ordered. See also commit 6d41807614151829ae17a3a58bff8572af5e407e
which changed the Kconfig option to discuss the tradeoff.

Probably the best thing to do is to take some benchmarks of both options
(one version using fsync(), one version using sync(), one version
without fsync or sync) on several filesystems with different mount
options.

Or just ask on lkml.org.








Information forwarded to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#588339; Package dpkg. (Thu, 21 Oct 2010 16:39:04 GMT)

Acknowledgement sent to Sven Joachim <svenjoac@gmx.de>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>. (Thu, 21 Oct 2010 16:39:04 GMT)

Message #30 received at 588339@bugs.debian.org:

From: Sven Joachim <svenjoac@gmx.de>
To: "Chanoch \(Ken\) Bloom" <kbloom@gmail.com>
Cc: 588339@bugs.debian.org, Modestas Vainius <modestas@vainius.eu>
Subject: Re: Bug#588339: sync/fsync in dpkg
Date: Thu, 21 Oct 2010 18:36:34 +0200
On 2010-10-21 02:05 +0200, Chanoch (Ken) Bloom wrote:

> The version you're using in Bug #588254 is the first version of dpkg
> that uses sync(). What I'd like to establish is whether Linus and
> friends have fixed fsync() so that it's significantly faster than
> sync(). It sounded like that was their goal.

If so, they did not quite succeed at least on ext4 in 2.6.36, because in
dpkg 1.15.7 (which uses fsync() rather than sync()) I'm still seeing the
massive slowdown mentioned in http://bugs.debian.org/578635.

Sven




Information forwarded to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#588339; Package dpkg. (Thu, 21 Oct 2010 17:15:05 GMT)

Acknowledgement sent to Ken Bloom <kbloom@gmail.com>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>. (Thu, 21 Oct 2010 17:15:05 GMT)

Message #35 received at 588339@bugs.debian.org:

From: Ken Bloom <kbloom@gmail.com>
To: Sven Joachim <svenjoac@gmx.de>
Cc: 588339@bugs.debian.org, Modestas Vainius <modestas@vainius.eu>
Subject: Re: Bug#588339: sync/fsync in dpkg
Date: Thu, 21 Oct 2010 12:12:51 -0500
On Thu, 2010-10-21 at 18:36 +0200, Sven Joachim wrote:
> On 2010-10-21 02:05 +0200, Chanoch (Ken) Bloom wrote:
> 
> > The version you're using in Bug #588254 is the first version of dpkg
> > that uses sync(). What I'd like to establish is whether Linus and
> > friends have fixed fsync() so that it's significantly faster than
> > sync(). It sounded like that was their goal.
> 
> If so, they did not quite succeed at least on ext4 in 2.6.36, because in
> dpkg 1.15.7 (which uses fsync() rather than sync()) I'm still seeing the
> massive slowdown mentioned in http://bugs.debian.org/578635.

And what mount options are you using? If you're using
defaults, /etc/mtab (and therefore the mount command) won't know what
the default values are, but you can check /proc/mounts which will
include the data= mount option.

--Ken
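
The /proc/mounts check described above can be sketched like this. It is a
minimal illustration: the function name data_mode and the sample text are
hypothetical, and whether a data= option is listed at all depends on the
kernel and filesystem in use. On a live system the text would come from
reading /proc/mounts instead of the sample string.

```python
from typing import Optional

# Hypothetical sample in /proc/mounts format:
# device, mount point, filesystem type, options, dump, pass.
SAMPLE_MOUNTS = (
    "/dev/sda1 / ext4 rw,relatime,data=ordered 0 0\n"
    "tmpfs /tmp tmpfs rw,nosuid,nodev 0 0\n"
)


def data_mode(mount_point: str, mounts_text: str) -> Optional[str]:
    """Return the value of the data= option for mount_point, or None."""
    mode = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        # A later entry for the same mount point overrides an earlier one.
        mode = None
        for option in fields[3].split(","):
            if option.startswith("data="):
                mode = option[len("data="):]
    return mode


print(data_mode("/", SAMPLE_MOUNTS))
```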





Information forwarded to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#588339; Package dpkg. (Thu, 21 Oct 2010 17:21:04 GMT)

Acknowledgement sent to Jonathan Nieder <jrnieder@gmail.com>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>. (Thu, 21 Oct 2010 17:21:04 GMT)

Message #40 received at 588339@bugs.debian.org:

From: Jonathan Nieder <jrnieder@gmail.com>
To: Ken Bloom <kbloom@gmail.com>, 588339@bugs.debian.org
Cc: Sven Joachim <svenjoac@gmx.de>, Modestas Vainius <modestas@vainius.eu>, debian-kernel@lists.debian.org
Subject: Re: Bug#588339: sync/fsync in dpkg
Date: Thu, 21 Oct 2010 12:14:41 -0500
(+cc: debian-kernel)

Ken Bloom wrote:

> And what mount options are you using? If you're using
> defaults, /etc/mtab (and therefore the mount command) won't know what
> the default values are, but you can check /proc/mounts which will
> include the data= mount option.

data=ordered.  That's the default for ext4.

As I mentioned in my rant before[*], what we really need is a way
to supply a list of paths to sync.  Until we have that, a single sync
that can be disabled in the cases where it doesn't matter is probably
the best we can do.

[*] Sorry about that, by the way.
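
A user-space approximation of "supply a list of paths to sync" might look
like the sketch below. It is only an illustration under stated
assumptions: the function names fsync_paths and install_batch are
hypothetical, this is not a kernel interface, and nothing here is claimed
to be what dpkg does. It still issues one fsync() per file; the only
change is that all files are written first, then flushed, then renamed.
It also assumes a platform such as Linux where fsync() on a read-only
descriptor is accepted.

```python
import os
from typing import Iterable, Tuple


def fsync_paths(paths: Iterable[str]) -> int:
    """fsync() each path in turn and return how many files were flushed."""
    count = 0
    for path in paths:
        fd = os.open(path, os.O_RDONLY)
        try:
            # Flushes only this file, not every mounted filesystem.
            os.fsync(fd)
        finally:
            os.close(fd)
        count += 1
    return count


def install_batch(pairs: Iterable[Tuple[str, str]]) -> None:
    """Flush every temporary file, then rename each one into place.

    pairs holds (temporary_path, final_path) tuples for files that have
    already been written.
    """
    pairs = list(pairs)
    fsync_paths(tmp for tmp, _ in pairs)
    for tmp, final in pairs:
        os.rename(tmp, final)
```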




Information forwarded to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#588339; Package dpkg. (Fri, 22 Oct 2010 09:06:03 GMT)

Acknowledgement sent to Guillem Jover <guillem@debian.org>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>. (Fri, 22 Oct 2010 09:06:03 GMT)

Message #45 received at 588339@bugs.debian.org:

From: Guillem Jover <guillem@debian.org>
To: "Chanoch (Ken) Bloom" <kbloom@gmail.com>, 588339@bugs.debian.org
Subject: Re: Bug#588339: sync/fsync in dpkg
Date: Fri, 22 Oct 2010 11:03:21 +0200
Hi!

On Wed, 2010-10-20 at 16:11:05 -0500, Chanoch (Ken) Bloom wrote:
> > 1. On ext4 with certain mount options, using rename() without first
> >    calling fsync() to get the data on disk has an unfortunate risk of
> >    clearing out a file[1].
> 
> This issue was current at the beginning of 2010, around the time Bug
> #567089 was filed and discussed. It's been fixed in the kernel since
> then. See http://lwn.net/Articles/322823/, and
> http://lwn.net/Articles/326471/
> 
> Does it still affect the shipping Debian kernel?

Some of the problems might have been patched over, but AFAIK it still
affects latest upstream kernels:

  <https://bugzilla.kernel.org/show_bug.cgi?id=15910>

> > 2. On ext4 with certain mount options, using fsync() instead of sync()
> >    to sync a collection of newly installed files is unacceptably
> >    slow[2].
> 
> The problem here was "data=ordered". ext3 also suffered from this
> problem, since its default was "data=ordered".
> In brief, ONE fsync() call cost about as much as ONE sync() call.
> The solution was "don't use data=ordered" (and Linus patched the
> kernel to change the default) then fsync() will be suitably faster.
> 
> The bug you cite here was also around April/May when this problem was
> being sorted out by the Linux kernel community.

AFAIR, benchmarks run while fixing the fsync() slowdown bug in dpkg
showed that ext3 didn't suffer a significant slowdown, while ext4 did.

The biggest problem with using sync() is that it affects *all* mount
points, not just the one where the file might be stored. So background
I/O might cause way more load than necessary.

> > 3. sync() obviously does way more than we want it to, since it
> >    touches files and filesystems that have nothing to do with
> >    dpkg’s work.
> > 
> > So what should we do?  Dear kernel, we will happily provide a list
> > of files we want to be renamed in place.  Can you make sure they
> > have the right data without _repeatedly_ incurring the penalty of
> > fsync()?
> 
> Is a solution of "mount your hard drive in a way that fsync() doesn't
> hurt" a good solution? I think that was the upstream kernel
> developers' decision on how to handle this.

Well, fsync() is the correct solution for this problem, if the file
system cannot handle it, then I'd say the file system is the problem.

> If not, maybe postponing sync() calls further is the solution.
> I.e. instead of doing it after every package, do it every 10 packages,
> or just do it once at the end of an apt-get dist-upgrade.

Postponing fsync() or sync() calls gives the same guarantees as not
doing them at all in the presence of an abrupt system crash/shutdown.

regards,
guillem




Information forwarded to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#588339; Package dpkg. (Tue, 26 Oct 2010 08:39:03 GMT)

Acknowledgement sent to Sven Joachim <svenjoac@gmx.de>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>. (Tue, 26 Oct 2010 08:39:03 GMT)

Message #50 received at 588339@bugs.debian.org:

From: Sven Joachim <svenjoac@gmx.de>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: Ken Bloom <kbloom@gmail.com>, 588339@bugs.debian.org, Modestas Vainius <modestas@vainius.eu>, debian-kernel@lists.debian.org
Subject: Re: Bug#588339: sync/fsync in dpkg
Date: Tue, 26 Oct 2010 10:36:11 +0200
On 2010-10-21 19:14 +0200, Jonathan Nieder wrote:

> Ken Bloom wrote:
>
>> And what mount options are you using? If you're using
>> defaults, /etc/mtab (and therefore the mount command) won't know what
>> the default values are, but you can check /proc/mounts which will
>> include the data= mount option.
>
> data=ordered.  That's the default for ext4.

Yes, and data=writeback does not make much of a difference.  However,
using the "nodelalloc" mount option does wonders, increasing unpacking
speed in dpkg 1.15.7 by a factor of ~8.

Sven






Information forwarded to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#588339; Package dpkg. (Tue, 26 Oct 2010 13:45:08 GMT)

Acknowledgement sent to "Chanoch (Ken) Bloom" <kbloom@gmail.com>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>. (Tue, 26 Oct 2010 13:45:08 GMT)

Message #55 received at 588339@bugs.debian.org:

From: "Chanoch (Ken) Bloom" <kbloom@gmail.com>
To: Sven Joachim <svenjoac@gmx.de>, 588339@bugs.debian.org, Modestas Vainius <modestas@vainius.eu>, debian-kernel@lists.debian.org
Subject: Re: Bug#588339: sync/fsync in dpkg
Date: Tue, 26 Oct 2010 08:39:31 -0500
On Tue, 2010-10-26 at 10:36 +0200, Sven Joachim wrote:
> On 2010-10-21 19:14 +0200, Jonathan Nieder wrote:
> 
> > Ken Bloom wrote:
> >
> >> And what mount options are you using? If you're using
> >> defaults, /etc/mtab (and therefore the mount command) won't know what
> >> the default values are, but you can check /proc/mounts which will
> >> include the data= mount option.
> >
> > data=ordered.  That's the default for ext4.
> 
> Yes, and data=writeback does not make much of a difference.  However,
> using the "nodelalloc" mount option does wonders, increasing unpacking
> speed in dpkg 1.15.7 by a factor of ~8.
> 
> Sven

So here's the upshot:

Delayed allocation is supposed to make your filesystem blazingly fast,
but in order to get data safety you *need* to call fsync() to make sure
your data's on disk[1]. However, fsync() causes a real performance hit,
made worse by the fact that you can only call fsync() on one file at a
time, so to fsync() a whole unpacked .deb, you have to block on fsync()
possibly hundreds of times. And it seems that fsync() is only writing
one file to disk at a time, as it should be.

If you turn off delayed allocation, then filesystem operations slow down
in general, but fsync() gets faster somehow (I'm not sure why), and you
get greater data safety in the first place.

I think the proper thing to do at this point is come up with a concise
summary of the different options you've tried, and their performance,
and send that to the Linux Kernel Mailing List
(linux-kernel@vger.kernel.org)[2] and ask them what your options are,
what their performance characteristics are, and what their safety
characteristics are. And also the tradeoffs of using various mount
options. Ask about ext3, ext4, and btrfs. We can decide later whether
ext3 really concerns us. (If you email LKML, then please don't CC: any
of us, and don't CC: the bug report. LKML threads can be very high
traffic. You can look up the message you sent in the LKML archives at
lkml.org, and send the URL of your message to the BTS.)

--Ken

[1] However, since 2.6.30, ext4 will cause any delayed allocation
blocks of a file to be allocated immediately when the file is
replaced, which I think is dpkg's use case. I think that means you
should be able to get the safety you seek without calling fsync() at
all. See
http://thunk.org/tytso/blog/2009/03/12/delayed-allocation-and-the-zero-length-file-problem/

[2] I don't expect debian-kernel has the expertise to answer these
questions.




Added tag(s) pending. Request was from Guillem Jover <guillem@debian.org> to control@bugs.debian.org. (Thu, 25 Nov 2010 06:54:03 GMT)

Message sent on to Andreas Beckmann <debian@abeckmann.de>:
Bug#588339. (Thu, 25 Nov 2010 06:54:14 GMT)

Message #60 received at 588339-submitter@bugs.debian.org:

From: Guillem Jover <guillem@debian.org>
To: 588339-submitter@bugs.debian.org
Subject: Bug#588339 marked as pending
Date: Thu, 25 Nov 2010 06:50:20 +0000
tag 588339 pending
thanks

Hello,

Bug #588339 reported by you has been fixed in the Git repository. You can
see the changelog below, and you can check the diff of the fix at:

    http://git.debian.org/?p=dpkg/dpkg.git;a=commitdiff;h=5ee4e4e

---
commit 5ee4e4e0458088cde1625ddb5a3d736f31a335d3
Author: Guillem Jover <guillem@debian.org>
Date:   Thu Jul 29 09:11:02 2010 +0200

    build: Disable usage of synchronous sync(2) by default
    
    It causes undesired I/O on unrelated file systems. It also makes the
    code behave differently on Linux systems.
    
    Allow the possibility to enable it again for the benefit of downstreams,
    which might want to use it regardless of its problems. Although this
    code path will most probably be removed in the near future.
    
    Closes: #588339, #595927, #600075

diff --git a/debian/changelog b/debian/changelog
index f66daa9..4fcd521 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -11,6 +11,10 @@ dpkg (1.15.8.6) UNRELEASED; urgency=low
     correct series file if the source package provides vendor specific patch
     sets.
 
+  [ Guillem Jover ]
+  * Disable by default usage of synchronous sync(2), as it causes undesired
+    I/O on unrelated file systems. Closes: #588339, #595927, #600075
+
   [ Updated man page translations ]
   * French (Christian Perrier). Including a typo fix
     and a typographical change reported by Vincent Danjean




Reply sent to Guillem Jover <guillem@debian.org>:
You have taken responsibility. (Thu, 25 Nov 2010 07:03:06 GMT)

Notification sent to Andreas Beckmann <debian@abeckmann.de>:
Bug acknowledged by developer. (Thu, 25 Nov 2010 07:03:06 GMT)

Message #65 received at 588339-close@bugs.debian.org:

From: Guillem Jover <guillem@debian.org>
To: 588339-close@bugs.debian.org
Subject: Bug#588339: fixed in dpkg 1.15.8.6
Date: Thu, 25 Nov 2010 07:02:13 +0000
Source: dpkg
Source-Version: 1.15.8.6

We believe that the bug you reported is fixed in the latest version of
dpkg, which is due to be installed in the Debian FTP archive:

dpkg-dev_1.15.8.6_all.deb
  to main/d/dpkg/dpkg-dev_1.15.8.6_all.deb
dpkg_1.15.8.6.dsc
  to main/d/dpkg/dpkg_1.15.8.6.dsc
dpkg_1.15.8.6.tar.bz2
  to main/d/dpkg/dpkg_1.15.8.6.tar.bz2
dpkg_1.15.8.6_amd64.deb
  to main/d/dpkg/dpkg_1.15.8.6_amd64.deb
dselect_1.15.8.6_amd64.deb
  to main/d/dpkg/dselect_1.15.8.6_amd64.deb
libdpkg-dev_1.15.8.6_amd64.deb
  to main/d/dpkg/libdpkg-dev_1.15.8.6_amd64.deb
libdpkg-perl_1.15.8.6_all.deb
  to main/d/dpkg/libdpkg-perl_1.15.8.6_all.deb



A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to 588339@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Guillem Jover <guillem@debian.org> (supplier of updated dpkg package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@debian.org)


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Format: 1.8
Date: Thu, 25 Nov 2010 07:10:48 +0100
Source: dpkg
Binary: libdpkg-dev dpkg dpkg-dev libdpkg-perl dselect
Architecture: source amd64 all
Version: 1.15.8.6
Distribution: unstable
Urgency: low
Maintainer: Dpkg Developers <debian-dpkg@lists.debian.org>
Changed-By: Guillem Jover <guillem@debian.org>
Description: 
 dpkg       - Debian package management system
 dpkg-dev   - Debian package development tools
 dselect    - Debian package management front-end
 libdpkg-dev - Debian package management static library
 libdpkg-perl - Dpkg perl modules
Closes: 584254 588339 595455 595927 596168 596519 597023 597651 598473 599923 600075 600240 601852 602518 604769
Changes: 
 dpkg (1.15.8.6) unstable; urgency=low
 .
   [ Raphaël Hertzog ]
   * Ensure debian/source/local-options is always excluded from the source
     package even if the user provides customized -i or -I options.
     Closes: #597023
   * Fix Dpkg::Version's handling of version with a debian revision but an
     empty version (e.g. "-0.1"). Thanks to James Vega <jamessan@debian.org>
     for the patch. Closes: #597651
   * With "3.0 (quilt)" source package, create .pc/.quilt_series with the
     correct series file if the source package provides vendor specific patch
     sets.
 .
   [ Guillem Jover ]
   * Disable by default usage of synchronous sync(2), as it causes undesired
     I/O on unrelated file systems. Closes: #588339, #595927, #600075
   * Add new --force-unsafe-io to disable safe I/O operations on unpack.
     Closes: #584254
 .
   [ Updated man page translations ]
   * French (Christian Perrier). Including a typo fix and a typographical
     change reported by Vincent Danjean. Closes: #601852
   * Spanish (Omar Campagne). Closes: #596519
 .
   [ Updated programs translations ]
   * Basque (Iñaki Larrañaga Murgoitio). Closes: #599923
   * Catalan (Jordi Mallach).
   * Danish (Ask Hjorth Larsen). Closes: #600240
   * German (Sven Joachim). Improved by Holger Wansing.
   * Italian (Pietro Battiston). Fix translation of "however". Closes: #602518
   * Portuguese (Miguel Figueiredo). Closes: #596168
   * Romanian (Andrei Popescu). Closes: #604769
   * Russian (Yuri Kozlov). Closes: #595455
   * Vietnamese (Clytie Siddall). Closes: #598473
 .
   [ Updated scripts translations ]
   * Catalan (Jordi Mallach).
   * German (Sven Joachim).
 .
   [ Updated dselect translations ]
   * Catalan (Jordi Mallach).
   * German (Sven Joachim).
Checksums-Sha1: 
 0ac67a10e335d0ec5375ac523998546403525ff9 1208 dpkg_1.15.8.6.dsc
 ebc9a6087f8f8c56c973f26f9bdb17ef1c570f0c 5222815 dpkg_1.15.8.6.tar.bz2
 7a6e46450c5d8d89746a5ba16a6ed7b9306b2a07 426852 libdpkg-dev_1.15.8.6_amd64.deb
 ba234426fb283f292efc9a0b202a09b5c579b504 2338026 dpkg_1.15.8.6_amd64.deb
 196702535a3a609f134375d5d3c39bb26663f4ca 894426 dselect_1.15.8.6_amd64.deb
 e6affdd25dcdc0b9c73e5059bc95f34531ed52e4 801736 dpkg-dev_1.15.8.6_all.deb
 47cb193e3096afb8035632a1b6bae0da9771c226 682848 libdpkg-perl_1.15.8.6_all.deb
Checksums-Sha256: 
 a4355f87fa1466edcefff224182ae7824d1469d1ddc28126d00cd361c611e0a9 1208 dpkg_1.15.8.6.dsc
 b319621a4d0f9fa7b356b4def978bad0b18b944405f2be9eace5e2713b5f1f49 5222815 dpkg_1.15.8.6.tar.bz2
 eedd636f39cb03a28758558cd6f4b700a2168c1adb9e4c8a59ab4ca07549bad5 426852 libdpkg-dev_1.15.8.6_amd64.deb
 6d3265e9aa6ef2d7ec341687df80e55da5996daa5d581e1864e9466bc7c36321 2338026 dpkg_1.15.8.6_amd64.deb
 b4a073849a3944c4f7b13b67cc6db8a78e8edd38ca50d86134f7aa1977c40f4b 894426 dselect_1.15.8.6_amd64.deb
 3368e9efe1206c720b92c69b89f3f43e8a0a4d13949aa24deceb02b2452ef756 801736 dpkg-dev_1.15.8.6_all.deb
 6a05978ee576c2848aa0f748cd0208547c0f39856a782ba227db331b7cf24bd3 682848 libdpkg-perl_1.15.8.6_all.deb
Files: 
 6d69ce9fd47b97aef50f5ba1209c4b24 1208 admin required dpkg_1.15.8.6.dsc
 4102648a08a4416bfc3e4f4275a438e4 5222815 admin required dpkg_1.15.8.6.tar.bz2
 1c45d231edef121f36a575a6322e928b 426852 libdevel optional libdpkg-dev_1.15.8.6_amd64.deb
 4525021598d6810dcb3fa114e5433ba1 2338026 admin required dpkg_1.15.8.6_amd64.deb
 b5662cfacfb969c3db3668cab230729b 894426 admin optional dselect_1.15.8.6_amd64.deb
 85db60da1da45941fd1789d156bdbd01 801736 utils optional dpkg-dev_1.15.8.6_all.deb
 00816223970f7700379ef13d2110878b 682848 perl optional libdpkg-perl_1.15.8.6_all.deb

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)

iEYEARECAAYFAkzuA9IACgkQuW9ciZ2SjJtWeACgw/Em7Z5+xaFqj6lJk3DpESzc
y/EAnA0oig96OBJD4pOI1fM62qXmzP00
=ugvO
-----END PGP SIGNATURE-----





Bug archived. Request was from Debbugs Internal Request <owner@bugs.debian.org> to internal_control@bugs.debian.org. (Wed, 05 Jan 2011 07:34:07 GMT)

Bug unarchived. Request was from Ken Bloom <kbloom@gmail.com> to control@bugs.debian.org. (Wed, 13 Apr 2011 18:51:08 GMT)

Bug archived. Request was from Ken Bloom <kbloom@gmail.com> to control@bugs.debian.org. (Wed, 13 Apr 2011 18:51:09 GMT)


Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Mon Apr 21 06:21:20 2014; Machine Name: buxtehude.debian.org