Debian Bug report logs - #337827
rsync should handle new sparse files better with -S

Package: rsync; Maintainer for rsync is Paul Slootman <paul@debian.org>; Source for rsync is src:rsync.

Reported by: Goswin von Brederlow <goswin-v-b@web.de>

Date: Sun, 6 Nov 2005 20:03:04 UTC

Severity: wishlist

Found in versions rsync/2.6.4-6, rsync/2.6.8-2

Forwarded to https://bugzilla.samba.org/show_bug.cgi?id=5801


Report forwarded to debian-bugs-dist@lists.debian.org, Paul Slootman <paul@debian.org>:
Bug#337827; Package rsync.

Acknowledgement sent to Goswin Brederlow <brederlo@informatik.uni-tuebingen.de>:
New Bug report received and forwarded. Copy sent to Paul Slootman <paul@debian.org>.

Message #5 received at submit@bugs.debian.org:

From: Goswin Brederlow <brederlo@informatik.uni-tuebingen.de>
To: Debian Bug Tracking System <submit@bugs.debian.org>
Subject: rsync should handle new sparse files better with -S
Date: Sun, 06 Nov 2005 19:39:59 +0100
Package: rsync
Version: 2.6.4-6
Severity: wishlist

Hi,

I noticed that rsync's sparse file handling could use some
improvements when transmitting a new file. It seems to me that rsync
copies the zeros of a sparse file verbatim over the connection,
limiting the transfer to the network bandwidth (or CPU power with
-z). The transfer speed can be greatly increased (a factor of 10 here)
by running "dd if=/dev/zero of=file bs=1 count=1 seek=1000000" on the
destination beforehand, because then rsync will transmit block matches
for every block of zeroes instead of verbatim data.
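
The effect described above can be illustrated with a much simplified
sketch. It assumes fixed, block-aligned matching and uses SHA-256 as a
stand-in for rsync's real rolling and strong checksums; the names
block_sums, delta and literal_bytes are hypothetical and not part of
rsync.

```python
import hashlib

BLOCK = 4096  # illustrative block size; real rsync derives it from the file size


def block_sums(basis: bytes) -> dict:
    """Map the digest of each full block of the receiver's file to its index."""
    sums = {}
    for i in range(0, len(basis) - BLOCK + 1, BLOCK):
        sums.setdefault(hashlib.sha256(basis[i:i + BLOCK]).digest(), i // BLOCK)
    return sums


def delta(source: bytes, sums: dict) -> list:
    """Encode source as ("match", block_index) or ("literal", bytes) tokens."""
    tokens = []
    for i in range(0, len(source), BLOCK):
        chunk = source[i:i + BLOCK]
        idx = sums.get(hashlib.sha256(chunk).digest())
        tokens.append(("match", idx) if idx is not None else ("literal", chunk))
    return tokens


def literal_bytes(tokens: list) -> int:
    """Count the bytes that would have to travel as literal data."""
    return sum(len(data) for kind, data in tokens if kind == "literal")


source = bytes(1000 * BLOCK)                       # stands in for the holes
new_file = delta(source, block_sums(b""))          # destination does not exist
seeded = delta(source, block_sums(bytes(BLOCK)))   # destination has one zero block

print(literal_bytes(new_file))  # 4096000
print(literal_bytes(seeded))    # 0
```

With an empty destination every zero travels as literal data; a single
zero block on the receiving side is enough for every source block to
become a block match.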

My suggestion would be to have an implicit block of zeroes (block
number ~0 or -1), either always or when the -S option is given (needs
client+server support), or to insert a fake block at the end of the
file when generating checksums (only needs client changes).
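
A minimal sketch of the first variant, assuming a reserved block
number of -1 and leaving out ordinary block matches for brevity.
ZERO_BLOCK, encode and decode are hypothetical names; nothing here
reflects rsync's actual wire protocol.

```python
import os
import tempfile

BLOCK = 4096
ZERO_BLOCK = -1          # hypothetical reserved number for "a block of zeroes"
ZEROES = bytes(BLOCK)


def encode(source: bytes) -> list:
    """Sender side: replace every all-zero block with the ZERO_BLOCK token."""
    tokens = []
    for i in range(0, len(source), BLOCK):
        chunk = source[i:i + BLOCK]
        tokens.append(ZERO_BLOCK if chunk == ZEROES else chunk)
    return tokens


def decode(tokens: list, out) -> None:
    """Receiver side: seek over zero blocks so the filesystem can keep holes."""
    for tok in tokens:
        if isinstance(tok, bytes):
            out.write(tok)
        else:
            out.seek(BLOCK, os.SEEK_CUR)   # skip instead of writing zeroes
    out.truncate(out.tell())               # fix the length if the file ends in a hole


source = b"head".ljust(BLOCK, b"x") + bytes(998 * BLOCK) + b"tail".ljust(BLOCK, b"y")
tokens = encode(source)
with tempfile.TemporaryFile() as out:
    decode(tokens, out)
    out.seek(0)
    rebuilt = out.read()

print(rebuilt == source)          # True
print(tokens.count(ZERO_BLOCK))   # 998
```

Only the two data blocks are carried as literals; the 998 zero blocks
cost one small token each, and the receiver never writes them.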


On a grander scale the rsync protocol could be extended to cover
repetitive blocks and send a block match to the destination file
instead of the source file. One bit in the block number could be
reused for this.
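
One way to picture this extension, as a sketch only: a flag bit in the
block number marks a reference to a block that has already been
written to the file being rebuilt. The flag value IN_OUTPUT and the
functions encode and decode are assumptions made for illustration, and
SHA-256 again stands in for rsync's checksums.

```python
import hashlib

BLOCK = 4096
IN_OUTPUT = 1 << 31   # hypothetical flag bit: index refers to the rebuilt file


def encode(source: bytes) -> list:
    """Emit a literal the first time a block is seen, a flagged index afterwards."""
    seen = {}      # digest -> index of the first block with that content
    tokens = []
    for n, i in enumerate(range(0, len(source), BLOCK)):
        chunk = source[i:i + BLOCK]
        key = hashlib.sha256(chunk).digest()
        if key in seen:
            tokens.append(IN_OUTPUT | seen[key])
        else:
            seen[key] = n
            tokens.append(chunk)
    return tokens


def decode(tokens: list) -> bytes:
    """Rebuild the file, copying flagged blocks from the output written so far."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, bytes):
            out += tok
        else:
            start = (tok & ~IN_OUTPUT) * BLOCK
            out += out[start:start + BLOCK]
    return bytes(out)


source = (b"A" * BLOCK + b"B" * BLOCK) * 500
tokens = encode(source)
literals = [t for t in tokens if isinstance(t, bytes)]

print(decode(tokens) == source)   # True
print(len(literals))              # 2
```

A run of zero blocks is just one case of repetition here: the first
zero block is sent once and every later one refers back to it.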

Regards,
	Goswin

-- System Information:
Debian Release: 3.1
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.8-frosties-2
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)

Versions of packages rsync depends on:
ii  libc6                       2.3.2.ds1-22 GNU C Library: Shared libraries an
ii  libpopt0                    1.7-5        lib for parsing cmdline parameters

-- no debconf information



Information forwarded to debian-bugs-dist@lists.debian.org, Paul Slootman <paul@debian.org>:
Bug#337827; Package rsync.

Acknowledgement sent to Fabrice Lorrain <Fabrice.Lorrain@free.fr>:
Extra info received and forwarded to list. Copy sent to Paul Slootman <paul@debian.org>.

Message #10 received at 337827@bugs.debian.org:

From: Fabrice Lorrain <Fabrice.Lorrain@free.fr>
To: Debian Bug Tracking System <337827@bugs.debian.org>
Subject: Re: rsync should handle new sparse files better with -S
Date: Sat, 14 Oct 2006 21:29:21 +0200
Package: rsync
Version: 2.6.8-2
Followup-For: Bug #337827


Hello,

Any progress on this bug?

The way rsync handles sparse files is suboptimal. It leaves any
backup policy based on rsync open to a trivial DoS with things like the following:

dd if=/dev/zero of=bigfake bs=1k count=1 seek=2000000000

rsync -e ssh -avS bigfake user@localhost:/tmp

At that point you wait for 2 TB of useless zeros to be transferred
between the src-server and the backup_server... Annoying.
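
The 2 TB figure follows directly from the dd parameters. The short
calculation below reproduces it; the 100 Mbit/s link speed is an
assumption chosen for illustration and does not come from the report.

```python
BS = 1024                # dd bs=1k
SEEK = 2_000_000_000     # dd seek=2000000000, counted in units of bs
COUNT = 1                # dd count=1

apparent_size = (SEEK + COUNT) * BS   # bytes a reader sees, holes included
written = COUNT * BS                  # bytes dd actually writes to disk

LINK_MBIT = 100                       # assumed link speed, not from the report
seconds = apparent_size * 8 / (LINK_MBIT * 1_000_000)

print(apparent_size)              # 2048000001024
print(written)                    # 1024
print(round(seconds / 3600, 1))   # 45.5
```

So a file holding 1 KiB of real data would keep such a link busy for
nearly two days if every zero is sent as literal data.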

I've been bitten by this feature twice already. Students borking some
seek/lseek maths while writing to files... We got several 100GB files
to transfer during the backup at night...

See you,
	Fab



-- System Information:
Debian Release: testing/unstable
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: i386 (i686)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.18-1-k7
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)

Versions of packages rsync depends on:
ii  libc6                        2.3.6.ds1-6 GNU C Library: Shared libraries
ii  libpopt0                     1.10-3      lib for parsing cmdline parameters

rsync recommends no packages.

-- no debconf information



Information forwarded to debian-bugs-dist@lists.debian.org:
Bug#337827; Package rsync.

Acknowledgement sent to Paul Slootman <paul@debian.org>:
Extra info received and forwarded to list.

Message #15 received at 337827@bugs.debian.org:

From: Paul Slootman <paul@debian.org>
To: Fabrice Lorrain <Fabrice.Lorrain@free.fr>, 337827@bugs.debian.org
Subject: Re: Bug#337827: rsync should handle new sparse files better with -S
Date: Sat, 14 Oct 2006 22:38:56 +0200

On Sat 14 Oct 2006, Fabrice Lorrain wrote:

> Any progress on this bug?

I'm afraid not...
I'll talk to the upstream maintainer to see what possibilities there are
for extending the protocol to handle this.

> The way rsync handles sparse files is suboptimal. It leaves any
> backup policy based on rsync open to a trivial DoS with things like the following:
> 
> dd if=/dev/zero of=bigfake bs=1k count=1 seek=2000000000
> 
> rsync -e ssh -avS bigfake user@localhost:/tmp
> 
> At that point you wait for 2 TB of useless zeros to be transferred
> between the src-server and the backup_server... Annoying.

I understand...

> I've been bitten by this feature twice already. Students borking some
> seek/lseek maths while writing to files... We got several 100GB files
> to transfer during the backup at night...

Using -z will speed things up quite a lot, as the zeroes compress well.
However, perhaps a better workaround in the meantime is to exclude
(student) files that are larger than a reasonable amount via the
--max-size option.
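
How well zeroes compress can be checked with a short sketch. It
assumes the compressor behaves like zlib's deflate at its default
level; the variable names are illustrative only.

```python
import zlib

MIB = 1024 * 1024
zeros = bytes(MIB)               # 1 MiB of zeroes, as read from a hole
packed = zlib.compress(zeros)    # default compression level
ratio = len(zeros) / len(packed)

print(len(packed), round(ratio))
```

The compressed stream is a tiny fraction of the input, so far less
data crosses the network. The sender still has to read and compress
every zero, which is why the transfer then becomes bound by CPU power
instead of bandwidth.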


Paul Slootman



Information forwarded to debian-bugs-dist@lists.debian.org, Paul Slootman <paul@debian.org>:
Bug#337827; Package rsync.

Acknowledgement sent to Fabrice Lorrain <Fabrice.Lorrain@free.fr>:
Extra info received and forwarded to list. Copy sent to Paul Slootman <paul@debian.org>.

Message #20 received at 337827@bugs.debian.org:

From: Fabrice Lorrain <Fabrice.Lorrain@free.fr>
To: Paul Slootman <paul@debian.org>
Cc: 337827@bugs.debian.org
Subject: Re: Bug#337827: rsync should handle new sparse files better with -S
Date: Sun, 15 Oct 2006 10:42:41 +0200

Paul Slootman wrote:
> On Sat 14 Oct 2006, Fabrice Lorrain wrote:
> 
> 
>>Any progress on this bug?
> 
> 
> I'm afraid not...
> I'll talk to the upstream maintainer to see what possibilities there are
> for extending the protocol to handle this.

Thanks.

>>The way rsync handles sparse files is suboptimal. It leaves any
>>backup policy based on rsync open to a trivial DoS with things like the following:
>>
>>dd if=/dev/zero of=bigfake bs=1k count=1 seek=2000000000
>>
>>rsync -e ssh -avS bigfake user@localhost:/tmp
>>
>>At that point you wait for 2 TB of useless zeros to be transferred
>>between the src-server and the backup_server... Annoying.
> 
> 
> I understand...
> 
> 
>>I've been bitten by this feature twice already. Students borking some
>>seek/lseek maths while writing to files... We got several 100GB files
>>to transfer during the backup at night...
> 
> 
> Using -z will speed things up quite a lot, as the zeroes compress well.

Yep, if I can apply this option only to sparse files. It would slow down
our backups quite a bit if we used it by default.
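
One possible way to single out sparse files, sketched under the
assumption of a Unix system where st_blocks counts 512-byte units. The
helpers looks_sparse and is_sparse are hypothetical, and filesystems
with transparent compression may report fewer blocks for ordinary
files too.

```python
import os


def looks_sparse(size: int, blocks: int) -> bool:
    """True when fewer bytes are allocated on disk than the apparent size."""
    return blocks * 512 < size


def is_sparse(path: str) -> bool:
    """Apply the check to a real file (st_blocks is only available on Unix)."""
    st = os.stat(path)
    return looks_sparse(st.st_size, st.st_blocks)


# Numbers modelled on the bigfake example: about 2 TB apparent, 4 KiB allocated.
print(looks_sparse(2_048_000_001_024, 8))   # True
# An ordinary 4 KiB file occupying 4 KiB on disk.
print(looks_sparse(4096, 8))                # False
```

A backup script could split its file list with such a check and pass
-z only to the rsync run that handles the sparse files.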

> However, perhaps a better workaround in the meantime is to exclude
> (student) files that are larger than a reasonable amount via the
> --max-size option.

Or ask the student to clean up his/her mess. Thanks for the tip nonetheless.

See you,

	Fab



Set Bug forwarded-to-address to 'https://bugzilla.samba.org/show_bug.cgi?id=5801'. Request was from Matt McCutchen <matt@mattmccutchen.net> to control@bugs.debian.org. (Wed, 17 Mar 2010 03:33:12 GMT)

Changed Bug submitter to 'Goswin von Brederlow <goswin-v-b@web.de>' from 'Goswin Brederlow <brederlo@informatik.uni-tuebingen.de>'. Request was from Goswin von Brederlow <goswin-v-b@web.de> to control@bugs.debian.org. (Wed, 14 Apr 2010 12:12:27 GMT)


Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Thu Apr 17 15:38:45 2014; Machine Name: beach.debian.org
