Debian Bug report logs - #593442
dpkg-genchanges can produce broken UTF-8 in Description

Package: dpkg; Maintainer for dpkg is Dpkg Developers <debian-dpkg@lists.debian.org>; Source for dpkg is src:dpkg.

Reported by: Colin Watson <cjwatson@ubuntu.com>

Date: Wed, 18 Aug 2010 09:27:01 UTC

Severity: normal

Found in version dpkg/1.15.8.4

Fixed in version dpkg/1.15.8.5

Done: Guillem Jover <guillem@debian.org>

Bug is archived. No further changes may be made.


Report forwarded to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#593442; Package dpkg. (Wed, 18 Aug 2010 09:27:04 GMT).


Acknowledgement sent to Colin Watson <cjwatson@ubuntu.com>:
New Bug report received and forwarded. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>. (Wed, 18 Aug 2010 09:27:04 GMT).


Message #5 received at submit@bugs.debian.org:

From: Colin Watson <cjwatson@ubuntu.com>
To: submit@bugs.debian.org
Subject: dpkg-genchanges can produce broken UTF-8 in Description
Date: Wed, 18 Aug 2010 10:23:58 +0100

Package: dpkg
Version: 1.15.8.4
Severity: normal
User: ubuntu-devel@lists.ubuntu.com
Usertags: origin-ubuntu maverick

dpkg-genchanges truncates the Description field to 65 bytes rather than
65 characters.  When given a package with a Description field in UTF-8
and containing non-ASCII characters, this means that it can produce
invalid UTF-8 due to truncating in the middle of a character.  For a
real-world example, see:

  http://launchpadlibrarian.net/51958054/language-pack-kde-nb-base_10.04%2B20100714_source.changes

A fix for this is to insert:

  binmode($fh, ":encoding(UTF-8)");

... just before the call to $self->parse in
Dpkg::Interface::Storable::load(); this causes Perl to assume character
semantics on the data read from that file, which causes sprintf %.65s to
yield 65 characters rather than 65 bytes.  However, this will require
every file read through that interface to be valid UTF-8, and I don't
know whether that is appropriate, so I leave this to your judgement.

Thanks,

-- 
Colin Watson                                       [cjwatson@ubuntu.com]
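
To make the byte-versus-character distinction in the report concrete, here is a minimal Perl sketch. It is not taken from the dpkg sources: the sample string, the 3-character limit (standing in for 65) and the read_all helper are all hypothetical, and an in-memory filehandle stands in for the file that Dpkg::Interface::Storable::load() would read.

```perl
use strict;
use warnings;

# Hypothetical sample: "blåbær" written as raw UTF-8 bytes (8 bytes,
# 6 characters), so the script does not depend on its own source encoding.
my $bytes = "bl\xc3\xa5b\xc3\xa6r";

# Hypothetical helper: slurp $data through an in-memory filehandle,
# optionally applying an I/O layer first, as the proposed binmode() would.
sub read_all {
    my ($data, $layer) = @_;
    open(my $fh, '<', \$data) or die "cannot open in-memory handle: $!";
    binmode($fh, $layer) if defined $layer;
    local $/;                      # slurp mode
    my $content = <$fh>;
    close($fh);
    return $content;
}

my $raw     = read_all($bytes);                       # byte semantics
my $decoded = read_all($bytes, ":encoding(UTF-8)");   # character semantics

# The %s precision counts whatever units the string is made of.
my $cut_raw     = sprintf("%.3s", $raw);       # cuts inside the 2-byte 'å'
my $cut_decoded = sprintf("%.3s", $decoded);   # keeps 'å' whole

printf "raw:     length %d, last unit after truncation 0x%02x\n",
    length($raw), ord(substr($cut_raw, -1));
printf "decoded: length %d, last unit after truncation U+%04X\n",
    length($decoded), ord(substr($cut_decoded, -1));
```

In the raw case the truncated string ends in a lone 0xc3 lead byte, which is the kind of invalid UTF-8 seen in the .changes file above. With the encoding layer applied, the same format string keeps the character intact.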




Added tag(s) pending. Request was from Raphaël Hertzog <hertzog@debian.org> to control@bugs.debian.org. (Sat, 21 Aug 2010 14:42:02 GMT).


Message sent on to Colin Watson <cjwatson@ubuntu.com>:
Bug#593442. (Sat, 21 Aug 2010 14:42:07 GMT).


Message #10 received at 593442-submitter@bugs.debian.org:

From: Raphaël Hertzog <hertzog@debian.org>
To: 593442-submitter@bugs.debian.org
Subject: Bug#593442 marked as pending
Date: Sat, 21 Aug 2010 14:40:06 +0000

tag 593442 pending
thanks

Hello,

Bug #593442 reported by you has been fixed in the Git repository. You can
see the changelog below, and you can check the diff of the fix at:

    http://git.debian.org/?p=dpkg/dpkg.git;a=commitdiff;h=f42344b

---
commit f42344b5fb3fda487eb1b7583bd1bd2ec84f2334
Author: Raphaël Hertzog <hertzog@debian.org>
Date:   Sat Aug 21 16:28:03 2010 +0200

    dpkg-genchanges: correctly truncate descriptions with multibyte characters
    
    Ensure the scalar used to truncate the description is character-based
    and not byte-based. But switch it back to a byte-based scalar afterwards
    to avoid a bad conversion to latin1 when output to a filehandle without
    any explicit encoding.
    
    This should really be fixed in Dpkg::Control but that would be an invasive
    change at this point of the squeeze release.
    
    Reported-by: Colin Watson <cjwatson@ubuntu.com>

diff --git a/debian/changelog b/debian/changelog
index ae724cf..f95a2ed 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -9,6 +9,10 @@ dpkg (1.15.8.5) UNRELEASED; urgency=low
   * Clarify effect of “dpkg --purge” on homedir files in dpkg(1).
     Thanks to The Fungi <fungi@yuggoth.org>. Closes: #593628
 
+  [ Raphaël Hertzog ]
+  * Fix dpkg-genchanges to not split the short description in the middle of a
+    UTF8 character. Closes: #593442
+
   [ Updated programs translations ]
   * Italian (Milo Casagrande). Closes: #592953
 




Information forwarded to debian-bugs-dist@lists.debian.org, Dpkg Developers <debian-dpkg@lists.debian.org>:
Bug#593442; Package dpkg. (Sat, 21 Aug 2010 14:48:06 GMT).


Acknowledgement sent to Raphael Hertzog <hertzog@debian.org>:
Extra info received and forwarded to list. Copy sent to Dpkg Developers <debian-dpkg@lists.debian.org>. (Sat, 21 Aug 2010 14:48:06 GMT).


Message #15 received at 593442@bugs.debian.org:

From: Raphael Hertzog <hertzog@debian.org>
To: Colin Watson <cjwatson@ubuntu.com>, 593442@bugs.debian.org
Subject: Re: Bug#593442: dpkg-genchanges can produce broken UTF-8 in Description
Date: Sat, 21 Aug 2010 16:45:23 +0200

Hi,

On Wed, 18 Aug 2010, Colin Watson wrote:
> A fix for this is to insert:
> 
>   binmode($fh, ":encoding(UTF-8)");
> 
> ... just before the call to $self->parse in
> Dpkg::Interface::Storable::load(); this causes Perl to assume character
> semantics on the data read from that file, which causes sprintf %.65s to
> yield 65 characters rather than 65 bytes.  However, this will require
> every file read through that interface to be valid UTF-8, and I don't
> know whether that is appropriate, so I leave this to your judgement.

It's definitely not the right place for such a fix IMO. Ideally it should
be in Dpkg::Control... but that is a high-impact change that needs careful
preparation.

For now I have pushed a non-invasive fix for squeeze that corrects
dpkg-genchanges only.

Cheers,
-- 
Raphaël Hertzog ◈ Debian Developer ◈ [Flattr=20693]

Follow my Debian News ▶ http://RaphaelHertzog.com (English)
                      ▶ http://RaphaelHertzog.fr (Français)
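
The commit itself is not reproduced in this log, so the following is only a sketch of the approach its message describes: decode the description to characters, truncate, then encode back to bytes before output. The truncate_description helper and the sample text are hypothetical; decode_utf8 and encode_utf8 come from Perl's core Encode module.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8);

# Hypothetical helper, not taken from the dpkg sources.
sub truncate_description {
    my ($desc_bytes, $max_chars) = @_;

    # Bytes -> characters, so the precision below counts characters.
    my $chars = decode_utf8($desc_bytes);
    my $short = sprintf("%.*s", $max_chars, $chars);

    # Characters -> bytes again. Printing a character string to a
    # filehandle with no encoding layer would write code points below
    # 256 as single latin1 bytes instead of UTF-8.
    return encode_utf8($short);
}

# Hypothetical sample: "Språkpakke for KDE" as raw UTF-8 bytes.
my $desc = "Spr\xc3\xa5kpakke for KDE";
my $out  = truncate_description($desc, 4);    # 4 stands in for 65

printf "kept %d bytes for 4 characters\n", length($out);
```

Keeping the decode and encode steps local to the truncation is what makes this non-invasive: the rest of the program continues to handle the field as bytes, which matches the reply's point that a broader change belongs in Dpkg::Control.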




Reply sent to Guillem Jover <guillem@debian.org>:
You have taken responsibility. (Tue, 14 Sep 2010 00:21:07 GMT).


Notification sent to Colin Watson <cjwatson@ubuntu.com>:
Bug acknowledged by developer. (Tue, 14 Sep 2010 00:21:07 GMT).


Message #20 received at 593442-close@bugs.debian.org:

From: Guillem Jover <guillem@debian.org>
To: 593442-close@bugs.debian.org
Subject: Bug#593442: fixed in dpkg 1.15.8.5
Date: Tue, 14 Sep 2010 00:17:15 +0000

Source: dpkg
Source-Version: 1.15.8.5

We believe that the bug you reported is fixed in the latest version of
dpkg, which is due to be installed in the Debian FTP archive:

dpkg-dev_1.15.8.5_all.deb
  to main/d/dpkg/dpkg-dev_1.15.8.5_all.deb
dpkg_1.15.8.5.dsc
  to main/d/dpkg/dpkg_1.15.8.5.dsc
dpkg_1.15.8.5.tar.bz2
  to main/d/dpkg/dpkg_1.15.8.5.tar.bz2
dpkg_1.15.8.5_amd64.deb
  to main/d/dpkg/dpkg_1.15.8.5_amd64.deb
dselect_1.15.8.5_amd64.deb
  to main/d/dpkg/dselect_1.15.8.5_amd64.deb
libdpkg-dev_1.15.8.5_amd64.deb
  to main/d/dpkg/libdpkg-dev_1.15.8.5_amd64.deb
libdpkg-perl_1.15.8.5_all.deb
  to main/d/dpkg/libdpkg-perl_1.15.8.5_all.deb



A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to 593442@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Guillem Jover <guillem@debian.org> (supplier of updated dpkg package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@debian.org)


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Format: 1.8
Date: Tue, 14 Sep 2010 01:26:21 +0200
Source: dpkg
Binary: libdpkg-dev dpkg dpkg-dev libdpkg-perl dselect
Architecture: source amd64 all
Version: 1.15.8.5
Distribution: unstable
Urgency: low
Maintainer: Dpkg Developers <debian-dpkg@lists.debian.org>
Changed-By: Guillem Jover <guillem@debian.org>
Description: 
 dpkg       - Debian package management system
 dpkg-dev   - Debian package development tools
 dselect    - Debian package management front-end
 libdpkg-dev - Debian package management static library
 libdpkg-perl - Dpkg perl modules
Closes: 592953 593442 593628 594011 594167 594218 594440 594513 595175 595208 595299 595468 595556 595615 595643 595968 596173 596333 596417 596518 596657
Changes: 
 dpkg (1.15.8.5) unstable; urgency=low
 .
   [ Guillem Jover ]
   * Do not print a warning when parsing status or status log files on
     half-installed packages w/o a Description or Maintainer field, as
     this happens normally when the package was never installed before.
     Closes: #594167
   * Improve git format documentation in dpkg-source(1).
     Thanks to Joey Hess, based on a patch by Tanguy Ortolo.
   * Clarify effect of “dpkg --purge” on homedir files in dpkg(1).
     Thanks to The Fungi <fungi@yuggoth.org>. Closes: #593628
   * Add gettext plurals infrastructure support.
   * Add gettext messages for plural forms. Closes: #594218
   * Fix possible but improbable segfault in update-alternatives in case
     the master file name contains a format string specifier. Reported by
     Sandro Cazzaniga.
   * Fix realloc usage on compat scandir() implementation.
 .
   [ Raphaël Hertzog ]
   * Fix dpkg-genchanges to not split the short description in the middle of a
     UTF8 character. Closes: #593442
   * Drop -k parameter from the tar call used by dpkg-source to extract
     tarballs. Upstream binary files modified by the packager were not properly
     installed due to this. Thanks to James Westby for the report.
     Closes: #594440
   * Make dpkg Breaks: dpkg-dev (<< 1.15.8) so that older versions of dpkg-dev
     that did not depend on libdpkg-perl must be upgraded together with dpkg.
     Closes: #596417
 .
   [ Helge Kreutzmann ]
   * Fix encoding of German addendum. Closes: #595643
 .
   [ Updated programs translations ]
   * Esperanto (Felipe Castro). Closes: #596173
   * French (Christian Perrier).
   * German (Sven Joachim).
   * Indonesian (Arief S Fitrianto). Closes: #596657
   * Italian (Milo Casagrande). Closes: #592953, #595615
   * Japanese (Kenshi Muto). Closes: #595468
   * Korean (Changwoo Ryu). Closes: #595556
   * Norwegian Bokmål (Hans Nordhaug). Closes: #595208
   * Simplified Chinese (Aron Xu). Closes: #594513
   * Slovak (Ivan Masár). Closes: #595968
   * Swedish (Peter Krefting).
   * Thai (Theppitak Karoonboonyanan). Closes: #594011
 .
   [ Updated man page translations ]
   * French (Christian Perrier).
   * German (Helge Kreutzmann).
   * Swedish (Peter Krefting).
 .
   [ Updated scripts translations ]
   * French (Christian Perrier). Includes a fix to a specific
     message translation that was imprecise. Closes: #596333
   * German (Helge Kreutzmann). Improved by Holger Wansing.
   * Norwegian Bokmål (Hans Fredrik Nordhaug). Closes: #595299
   * Spanish (Omar Campagne).  Closes: #596518
   * Swedish (Peter Krefting).
   * Russian (Yuri Kozlov). Closes: #595175
Checksums-Sha1: 
 6d2df4c9a1a33334ed55c493fa7a108bdc99dfda 1208 dpkg_1.15.8.5.dsc
 a520d5454da0af80a1dd10b3d58c5d1b17f5a042 5174315 dpkg_1.15.8.5.tar.bz2
 cc0fcd3cd9d5dbcf13c6f12ab9eaea27ac83dba3 421182 libdpkg-dev_1.15.8.5_amd64.deb
 6a4e7df15b5d2ef8244860dd02142ed6206397d4 2254946 dpkg_1.15.8.5_amd64.deb
 30726fb7e02eaa7e2ff36229666711d006daa29e 886392 dselect_1.15.8.5_amd64.deb
 fd09e3e701bcdd0a8a9fa9ab8ac8def51d33b31e 773296 dpkg-dev_1.15.8.5_all.deb
 d67089cc4be0967ac1345c6e2e365f4dee0e6510 671750 libdpkg-perl_1.15.8.5_all.deb
Checksums-Sha256: 
 27158640588c7fa93ad628a8d6bbc327753d934695e97c80efed48098f8481ea 1208 dpkg_1.15.8.5.dsc
 2ef55e8eb6c1e8c3dfb54c8ccc9a883fec7540b705c5179ca7a198bebe2f18bc 5174315 dpkg_1.15.8.5.tar.bz2
 883444f1e7c400ddbf53b4c8e416c157bed6a464add8a1c34105359d6f533a94 421182 libdpkg-dev_1.15.8.5_amd64.deb
 69fbb0e9734c585a19861ca89c59ebb76be51c3a805f3fecfd84317c978bd646 2254946 dpkg_1.15.8.5_amd64.deb
 7d26a5506e48614b9ad6305b52758a7cb7970b9073242742506a39d052c21ac7 886392 dselect_1.15.8.5_amd64.deb
 59cafac0a264746deb8a10c0ca220e37f34c70972024d2fed6109e7ac1951633 773296 dpkg-dev_1.15.8.5_all.deb
 0c12562d2729b3e6b64c2d3dd0d28954c1f6b101498700a6dbdd7ed5329e5e60 671750 libdpkg-perl_1.15.8.5_all.deb
Files: 
 23e01c303dea963c6f67ab4249c85b36 1208 admin required dpkg_1.15.8.5.dsc
 b9b817389e655ec2c12465de5c619011 5174315 admin required dpkg_1.15.8.5.tar.bz2
 a834d92dda104b9d16e55f1b7896125b 421182 libdevel optional libdpkg-dev_1.15.8.5_amd64.deb
 b04a507ce35e2773436cfc9b0cba9de8 2254946 admin required dpkg_1.15.8.5_amd64.deb
 318800d0eba360529762cbe54df708ce 886392 admin optional dselect_1.15.8.5_amd64.deb
 d1d97371ab155681f6afcde50fe18df8 773296 utils optional dpkg-dev_1.15.8.5_all.deb
 2d804999b3a361c3c9ee2dac93bb16d9 671750 perl optional libdpkg-perl_1.15.8.5_all.deb

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)

iEYEARECAAYFAkyOudwACgkQuW9ciZ2SjJvGLwCfYmMMl1CPlS1JOXNPdLr7nu3c
J3gAoIhBEtCWFUzekK/+oZ6Ta1c6gtOS
=O1+J
-----END PGP SIGNATURE-----





Bug archived. Request was from Debbugs Internal Request <owner@bugs.debian.org> to internal_control@bugs.debian.org. (Wed, 27 Oct 2010 07:33:52 GMT).



Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Sun Jan 7 07:02:50 2018; Machine Name: beach
