Debian Bug report logs - #708705
publican: ships many copies of common resources


Package: publican; Maintainer for publican is Mikhail Gusarov <dottedmag@debian.org>; Source for publican is src:publican.

Reported by: "Aaron M. Ucko" <ucko@debian.org>

Date: Fri, 17 May 2013 21:24:01 UTC

Severity: minor

Found in version publican/3.1.5-2

Fixed in version publican/3.1.5-3

Done: Raphaël Hertzog <hertzog@debian.org>

Bug is archived. No further changes may be made.

Forwarded to https://bugzilla.redhat.com/show_bug.cgi?id=966143




Report forwarded to debian-bugs-dist@lists.debian.org, ucko@debian.org, Mikhail Gusarov <dottedmag@debian.org>:
Bug#708705; Package publican. (Fri, 17 May 2013 21:24:06 GMT)

Acknowledgement sent to "Aaron M. Ucko" <ucko@debian.org>:
New Bug report received and forwarded. Copy sent to ucko@debian.org, Mikhail Gusarov <dottedmag@debian.org>. (Fri, 17 May 2013 21:24:06 GMT)

Message #5 received at submit@bugs.debian.org:

From: "Aaron M. Ucko" <ucko@debian.org>
To: Debian Bug Tracking System <submit@bugs.debian.org>
Subject: publican: ships many copies of common resources
Date: Fri, 17 May 2013 16:37:09 -0400

Package: publican
Version: 3.1.5-2
Severity: minor

Per http://dedup.debian.net/compare/publican/publican, publican ships
many copies of common resources (images, CSS files, etc.) under
/usr/share/publican/Common_Content and /usr/share/publican/doc,
accounting for most of its massive size increase from 2.8-3.  (It's
gone up from 6.4 MiB to 50.6 MiB.)

Could you please arrange to ship only one copy of each duplicated
file, at least within /usr/share/publican/Common_Content?

Thanks!



Information forwarded to debian-bugs-dist@lists.debian.org, Mikhail Gusarov <dottedmag@debian.org>:
Bug#708705; Package publican. (Sun, 19 May 2013 12:27:14 GMT)

Acknowledgement sent to Raphael Hertzog <hertzog@debian.org>:
Extra info received and forwarded to list. Copy sent to Mikhail Gusarov <dottedmag@debian.org>. (Sun, 19 May 2013 12:27:14 GMT)

Message #10 received at 708705@bugs.debian.org:

From: Raphael Hertzog <hertzog@debian.org>
To: "Aaron M. Ucko" <ucko@debian.org>, 708705@bugs.debian.org
Subject: Re: Bug#708705: publican: ships many copies of common resources
Date: Sun, 19 May 2013 14:24:30 +0200

Hi,

On Fri, 17 May 2013, Aaron M. Ucko wrote:
> Per http://dedup.debian.net/compare/publican/publican, publican ships
> many copies of common resources (images, CSS files, etc.) under
> /usr/share/publican/Common_Content and /usr/share/publican/doc,
> accounting for most of its massive size increase from 2.8-3.  (It's
> gone up from 6.4 MiB to 50.6 MiB.)
> 
> Could you please arrange to ship only one copy of each duplicated
> file, at least within /usr/share/publican/Common_Content?

It doesn't look trivial. Each set of language files ought to be
self-contained so that any generated document is independent. So replacing
with symlinks is not satisfactory (unless we modify the publican build
logic to replace symlinks with the corresponding file).

Replacing with hardlinks is better but is quite uncommon in Debian
packages (there's a lintian warning suggesting it's a bad idea).

Last but not least, I'm not going to manually deduplicate all those
files so someone should really create a helper script that would
deduplicate a sub-directory.

Cheers,
-- 
Raphaël Hertzog ◈ Debian Developer

Get the Debian Administrator's Handbook:
→ http://debian-handbook.info/get/



Information forwarded to debian-bugs-dist@lists.debian.org, Mikhail Gusarov <dottedmag@debian.org>:
Bug#708705; Package publican. (Mon, 20 May 2013 22:57:04 GMT)

Acknowledgement sent to ucko@debian.org (Aaron M. Ucko):
Extra info received and forwarded to list. Copy sent to Mikhail Gusarov <dottedmag@debian.org>. (Mon, 20 May 2013 22:57:04 GMT)

Message #15 received at 708705@bugs.debian.org:

From: ucko@debian.org (Aaron M. Ucko)
To: Raphael Hertzog <hertzog@debian.org>
Cc: "Aaron M. Ucko" <ucko@debian.org>, 708705@bugs.debian.org
Subject: Re: Bug#708705: publican: ships many copies of common resources
Date: Mon, 20 May 2013 18:53:15 -0400

Raphael Hertzog <hertzog@debian.org> writes:

> It doesn't look trivial.

Understood, but the extra usage isn't so trivial either.  I had
envisioned shipping (and taking care to follow) symlinks, or perhaps
automatically taking contents from some new common directory when no
locale-specific variant shadows them.

Thanks for considering this suggestion!

-- 
Aaron M. Ucko, KB1CJC (amu at alum.mit.edu, ucko at debian.org)
http://www.mit.edu/~amu/ | http://stuff.mit.edu/cgi/finger/?amu@monk.mit.edu
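
The second idea in the message above (take a file from a shared directory unless a locale-specific variant shadows it) can be sketched as a small lookup. This is only an illustration, not publican's actual logic: the function name `resolve_resource`, the `common` directory name, and the directory layout are all assumptions made for the example.

```python
from pathlib import Path
from typing import Optional


def resolve_resource(root: Path, locale: str, relpath: str,
                     common: str = "common") -> Optional[Path]:
    """Return the locale-specific file if it exists, else the shared copy.

    Hypothetical layout: <root>/<locale>/<relpath> shadows
    <root>/<common>/<relpath>. Returns None when neither exists.
    """
    for base in (root / locale, root / common):
        candidate = base / relpath
        if candidate.is_file():
            return candidate
    return None
```

With such a lookup, only files that genuinely differ per locale would need to be shipped in the locale directories; everything else would live once in the shared directory.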



Information forwarded to debian-bugs-dist@lists.debian.org, Mikhail Gusarov <dottedmag@debian.org>:
Bug#708705; Package publican. (Wed, 22 May 2013 15:15:11 GMT)

Acknowledgement sent to Raphael Hertzog <hertzog@debian.org>:
Extra info received and forwarded to list. Copy sent to Mikhail Gusarov <dottedmag@debian.org>. (Wed, 22 May 2013 15:15:11 GMT)

Message #20 received at 708705@bugs.debian.org:

From: Raphael Hertzog <hertzog@debian.org>
To: "Aaron M. Ucko" <ucko@debian.org>
Cc: 708705@bugs.debian.org
Subject: Re: Bug#708705: publican: ships many copies of common resources
Date: Wed, 22 May 2013 17:10:55 +0200

Control: forwarded -1 https://bugzilla.redhat.com/show_bug.cgi?id=966143

On Mon, 20 May 2013, Aaron M. Ucko wrote:
> Understood, but the extra usage isn't so trivial either.  I had
> envisioned shipping (and taking care to follow) symlinks, or perhaps
> automatically taking contents from some new common directory when no
> locale-specific variant shadows them.

Yes, it definitely makes sense. I opened an upstream bug report asking for
this.

Cheers,
-- 
Raphaël Hertzog ◈ Debian Developer

Get the Debian Administrator's Handbook:
→ http://debian-handbook.info/get/



Set Bug forwarded-to-address to 'https://bugzilla.redhat.com/show_bug.cgi?id=966143'. Request was from Raphael Hertzog <hertzog@debian.org> to 708705-submit@bugs.debian.org. (Wed, 22 May 2013 15:15:11 GMT)

Information forwarded to debian-bugs-dist@lists.debian.org, Mikhail Gusarov <dottedmag@debian.org>:
Bug#708705; Package publican. (Wed, 12 Jun 2013 11:42:14 GMT)

Acknowledgement sent to Helmut Grohne <helmut@subdivi.de>:
Extra info received and forwarded to list. Copy sent to Mikhail Gusarov <dottedmag@debian.org>. (Wed, 12 Jun 2013 11:42:14 GMT)

Message #27 received at 708705@bugs.debian.org:

From: Helmut Grohne <helmut@subdivi.de>
To: Raphael Hertzog <hertzog@debian.org>, 708705@bugs.debian.org
Cc: "Aaron M. Ucko" <ucko@debian.org>
Subject: Re: Bug#708705: publican: ships many copies of common resources
Date: Wed, 12 Jun 2013 13:38:08 +0200

On Sun, May 19, 2013 at 02:24:30PM +0200, Raphael Hertzog wrote:
> On Fri, 17 May 2013, Aaron M. Ucko wrote:
> > Per http://dedup.debian.net/compare/publican/publican, publican ships
> > many copies of common resources (images, CSS files, etc.) under
> > /usr/share/publican/Common_Content and /usr/share/publican/doc,
> > accounting for most of its massive size increase from 2.8-3.  (It's
> > gone up from 6.4 MiB to 50.6 MiB.)
> > 
> > Could you please arrange to ship only one copy of each duplicated
> > file, at least within /usr/share/publican/Common_Content?

Wow. People now reporting bugs based on dedup.d.n. That's what I wrote
it for! \o/ Next time, please include a link to
http://wiki.debian.org/dedup.debian.net, because it includes useful
information for the maintainer.

> It doesn't look trivial. Each set of language files ought to be
> self-contained so that any generated document is independent. So replacing
> with symlinks is not satisfactory (unless we modify the publican build
> logic to replace symlinks with the corresponding file).

I would advise against any manual solution. It just causes work at
little benefit.

> Replacing with hardlinks is better but is quite uncommon in Debian
> packages (there's a lintian warning suggesting it's a bad idea).

I have discussed the question about hard link usage a number of times
now. Conclusions so far:

 * Hard links to files in the same directory (not subdirectory) are
   always ok. (Example: bzip2)
 * When you have many small files, hard links reduce the installation
   size over symlinks due to savings in inodes.
 * Hard links that cross directories should be ok if the hierarchy is
   completely owned by the package in question. This includes
   /usr/lib/$package and /usr/share/$package. Of course this does not
   cover hard links from /usr/lib/$package/foo to
   /usr/share/$package/bar. As a rule of thumb: If a package is the only
   package to create a directory, you can use hard links therein.

> Last but not least, I'm not going to manually deduplicate all those
> files so someone should really create a helper script that would
> deduplicate a sub-directory.

The wiki page above gives some explanations on how to achieve this using
rdfind and symlinks. A helper utility does not exist.

In your case I'd suggest the following line as part of the build
process.

rdfind -makehardlinks true -outputname /dev/null debian/publican/usr/share/publican

Should you have any questions, just ask. In any case feedback on the
usability and documentation of dedup.d.n is welcome.

Helmut
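
To make concrete what the suggested rdfind invocation achieves, here is a rough Python sketch of hard-link deduplication of one directory tree. It is not rdfind's implementation: the names `file_digest` and `hardlink_duplicates` are made up for this example, files are treated as identical when their size and SHA-256 digest match, and differences in ownership or permissions are ignored.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def hardlink_duplicates(root):
    """Replace identical regular files under root with hard links.

    Returns the number of files that were relinked. Symlinks are skipped.
    """
    by_content = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            key = (os.stat(path).st_size, file_digest(path))
            by_content[key].append(path)

    relinked = 0
    for paths in by_content.values():
        keeper = paths[0]
        for dup in paths[1:]:
            if os.path.samefile(keeper, dup):
                continue  # already the same inode
            # Link under a temporary name, then rename over the duplicate,
            # so the duplicate's path never goes missing.
            tmp = dup + ".dedup-tmp"
            os.link(keeper, tmp)
            os.replace(tmp, dup)
            relinked += 1
    return relinked
```

Because every path still resolves to a regular file, each language directory stays self-contained from the point of view of tools that copy it, which is the property symlinks would not give.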



Information forwarded to debian-bugs-dist@lists.debian.org, Mikhail Gusarov <dottedmag@debian.org>:
Bug#708705; Package publican. (Fri, 14 Jun 2013 22:18:04 GMT)

Acknowledgement sent to ucko@debian.org (Aaron M. Ucko):
Extra info received and forwarded to list. Copy sent to Mikhail Gusarov <dottedmag@debian.org>. (Fri, 14 Jun 2013 22:18:04 GMT)

Message #32 received at 708705@bugs.debian.org:

From: ucko@debian.org (Aaron M. Ucko)
To: Helmut Grohne <helmut@subdivi.de>
Cc: Raphael Hertzog <hertzog@debian.org>, 708705@bugs.debian.org, "Aaron M. Ucko" <ucko@debian.org>
Subject: Re: Bug#708705: publican: ships many copies of common resources
Date: Fri, 14 Jun 2013 18:14:55 -0400

Helmut Grohne <helmut@subdivi.de> writes:

> Wow. People now reporting bugs based on dedup.d.n. That's what I wrote
> it for! \o/

Thanks for establishing it! :-)

> Next time, please include a link to
> http://wiki.debian.org/dedup.debian.net, because it includes useful
> information for the maintainer.

OK, thanks.  Sorry for missing it earlier.

>  * Hard links that cross directories should be ok if the hierarchy is
>    completely owned by the package in question. This includes

FWIW, I've dealt with a filesystem (OpenAFS) that supports no
cross-directory hard links whatsoever, perhaps because its ACLs are
per-directory rather than per-file.  However, it's a network filesystem
and generally somewhat idiosyncratic, so supporting package installation
into OpenAFS isn't necessarily critical.

-- 
Aaron M. Ucko, KB1CJC (amu at alum.mit.edu, ucko at debian.org)
http://www.mit.edu/~amu/ | http://stuff.mit.edu/cgi/finger/?amu@monk.mit.edu
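
Whether a given target filesystem accepts cross-directory hard links can be probed empirically. The sketch below is an assumption-laden illustration, not part of any Debian tooling: the function name is hypothetical, and it assumes that a filesystem refusing such links reports the refusal as an `OSError` from the `link` call.

```python
import os
import tempfile


def supports_cross_directory_hardlinks(base_dir):
    """Return True if a file in one subdirectory of base_dir can be
    hard-linked into a sibling subdirectory, False if the link fails."""
    with tempfile.TemporaryDirectory(dir=base_dir) as probe:
        src_dir = os.path.join(probe, "a")
        dst_dir = os.path.join(probe, "b")
        os.mkdir(src_dir)
        os.mkdir(dst_dir)
        src = os.path.join(src_dir, "probe")
        dst = os.path.join(dst_dir, "probe")
        with open(src, "wb") as fh:
            fh.write(b"x")
        try:
            os.link(src, dst)
        except OSError:
            return False
        # Both names should now refer to the same inode.
        return os.path.samefile(src, dst)
```

The probe directory is removed again on exit, so the check leaves nothing behind in `base_dir`.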



Reply sent to Raphaël Hertzog <hertzog@debian.org>:
You have taken responsibility. (Wed, 19 Jun 2013 22:24:05 GMT)

Notification sent to "Aaron M. Ucko" <ucko@debian.org>:
Bug acknowledged by developer. (Wed, 19 Jun 2013 22:24:05 GMT)

Message #37 received at 708705-close@bugs.debian.org:

From: Raphaël Hertzog <hertzog@debian.org>
To: 708705-close@bugs.debian.org
Subject: Bug#708705: fixed in publican 3.1.5-3
Date: Wed, 19 Jun 2013 22:20:59 +0000

Source: publican
Source-Version: 3.1.5-3

We believe that the bug you reported is fixed in the latest version of
publican, which is due to be installed in the Debian FTP archive.

A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to 708705@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Raphaël Hertzog <hertzog@debian.org> (supplier of updated publican package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@ftp-master.debian.org)


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Format: 1.8
Date: Wed, 19 Jun 2013 23:22:59 +0200
Source: publican
Binary: publican
Architecture: source all
Version: 3.1.5-3
Distribution: unstable
Urgency: low
Maintainer: Mikhail Gusarov <dottedmag@debian.org>
Changed-By: Raphaël Hertzog <hertzog@debian.org>
Description: 
 publican   - Tool for publishing material authored in DocBook XML
Closes: 708705
Changes: 
 publican (3.1.5-3) unstable; urgency=low
 .
   * Depend on docbook-xsl >= 1.77 and drop debian/patches/revert-footnote-change
     whose sole purpose was to make publican work with an older docbook-xsl.
   * Deduplicate files in /usr/share/publican. We use rdfind to create
     hardlinks between all identical files. Thanks to Helmut Grohne for
     the tool suggestion and to Aaron M. Ucko for the report. Closes: #708705
   * Update Standards-Version to 3.9.4.
   * Update Vcs related URL from git.debian.org to anonscm.debian.org.
Checksums-Sha1: 
 49ade6a034b567799e5cd82aed4965f617ee7a04 2995 publican_3.1.5-3.dsc
 ef0deb35e34626c917604cf7d6dbfc0433ba7644 7036 publican_3.1.5-3.debian.tar.gz
 6aee1ed4a9e956a874e4d14b173e6319c409b7de 7469712 publican_3.1.5-3_all.deb
Checksums-Sha256: 
 94bab7024391377f7268e89d96f1d7b0ada51d39012858565a2bd7e1b1d4cf4a 2995 publican_3.1.5-3.dsc
 9807a5b11963f90863a36f268fe2291ac89f053203a0193cb271c194444e022b 7036 publican_3.1.5-3.debian.tar.gz
 53a04f74208b304cd6dfca7e4ecd28de9d5dda88a1e11e3d8e7357bbbd793e47 7469712 publican_3.1.5-3_all.deb
Files: 
 6a6e1b7a60cc847c03d1cbb242bcd193 2995 perl optional publican_3.1.5-3.dsc
 72efcda5f75011689ceb8d72deb605ac 7036 perl optional publican_3.1.5-3.debian.tar.gz
 86fdbd5284c58b74fc29e5699606cb0e 7469712 perl optional publican_3.1.5-3_all.deb

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
Comment: Signed by Raphael Hertzog

iQIcBAEBCAAGBQJRwif2AAoJEOYZBF3yrHKajrIP/A+A/hjaTNiYUueaCKSzDUVJ
h2zcqF1clkw9oPb67J5pGfllv6vulq2vaR4nAexp08aY+z44vpC4//HrSowuEgo+
r1I25jN1tmZQhgclwi7au4h4IjpHcZLnlbTAN+lqIlTkdukLUFzqeo/TeXX4mPQa
Y5qEwHV9AM90WdObppVa5gXD/q+Oi9c5coi6PMAw7oRdICpy2rnkJFL/odtjoQI1
WJGeMc+FQtMvhoZhpKlTSeSjaIAO41majtjolfK2ioa6uJ/9AiPPblQGREGvxDKP
tDu/yU2usLcph5Bj2feACpaODyA2OCZ/lYFFfN+o0rB/DoJIYmAS5pMnGueukfCf
RJkPaSLbXdVarIPw/wH041L6vSqvam/NxW1T76j9R6hZKpyOZk4bz9SfSFJJRqzQ
jYCUtExWpmmtZxeds1BykbxTk03g29yaIWre1drA8xvWyCd5UZPsnHtSbJ3EvzsU
oKnfxC96fdZx3PaWRynG8l5GHFcyO42554XG/S9tgrzUAM3lCxnzE8Fy7s7erFeo
G0+aQnOj0teI6M05Xfq9Gff4oQYtHgu7q79lISj+gq3DqrY0jBY5WwjZEQfVSYkI
aNKhzw9R+CZgHJvLzwBSVUB5bPqlNwnjmqStSGfGU9Uk4caDHa7n8u/vRPzKztlW
EY7FN8ZcxTw9+XrIoLrn
=lYqv
-----END PGP SIGNATURE-----




Bug archived. Request was from Debbugs Internal Request <owner@bugs.debian.org> to internal_control@bugs.debian.org. (Sun, 28 Jul 2013 07:25:27 GMT)



Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Sun Apr 20 20:01:08 2014; Machine Name: buxtehude.debian.org

Debian Bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.