Debian Bug report logs - #395007
aptitude --show-deps is broken for multibyte descriptive text
Reported by: Kobayashi Noritada <nori1@dolphin.c.u-tokyo.ac.jp>
Date: Tue, 24 Oct 2006 11:18:49 UTC
Severity: normal
Tags: l10n, patch
Found in version aptitude/0.4.3-1
Fixed in version aptitude/0.4.4-1
Done: Daniel Burrows <dburrows@debian.org>
Bug is archived. No further changes may be made.
Report forwarded to debian-bugs-dist@lists.debian.org, Daniel Burrows <dburrows@debian.org>:
Bug#395007; Package aptitude.
Acknowledgement sent to Kobayashi Noritada <nori1@dolphin.c.u-tokyo.ac.jp>:
New Bug report received and forwarded. Copy sent to Daniel Burrows <dburrows@debian.org>.
Message #5 received at submit@bugs.debian.org:
Package: aptitude
Version: 0.4.3-1
Severity: normal
Tags: patch l10n
Hi,
akira yamada (akira) reported in his book[1] on deb packages and packaging
that Japanese descriptive text printed by `aptitude --show-deps' is broken.
The cause is the following code in reason_string_list in
src/cmdline/cmdline_prompt.cc, which takes only the first byte of the
dependency type string and so cannot handle multibyte characters:
s+=const_cast<pkgCache::DepIterator &>(why->dep).DepType()[0];
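To illustrate why indexing with [0] breaks, here is a minimal sketch. It is not aptitude code: the function name abbreviate_by_byte and the sample string are hypothetical, and the sample is encoded as UTF-8 only for the sake of the example (the report itself concerns an EUC-JP locale, where the same effect occurs).

```cpp
#include <string>

// Mimics the abbreviation logic quoted above: only the first *byte* of the
// dependency type label is appended, followed by ": ".
std::string abbreviate_by_byte(const char *dep_type) {
    std::string s;
    s += dep_type[0];  // one char == one byte, not one character
    s += ": ";
    return s;
}
```

For an ASCII label such as "Depends" this gives "D: ". For a label whose first character occupies several bytes, the result starts with a lone lead byte, which a terminal cannot render as a character.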
Attached are two patches that solve this problem: one is a version I wrote
with reference to sample code[2] from Junichi Uekawa (dancer), and the other
is a version sent to me privately by Kouhei Sutou. I rebuilt aptitude with
each patch and confirmed that both fix the problem. However, although I
basically understand what these patches do, my C++ skills are limited, so I
cannot judge which is the better solution, nor write a better, more compact
patch that is free of side effects.
Could you please choose one, or create a better solution to fix this bug? :-)
[1] http://www.gihyo.co.jp/books/4-7741-2768-X (in Japanese)
[2] http://lists.debian.or.jp/debian-devel/200610/msg00016.html (in Japanese)
Thanks,
-nori
-- System Information:
Debian Release: 3.1
Architecture: i386 (i686)
Kernel: Linux 2.6.8-3-686
Locale: LANG=ja_JP.eucJP, LC_CTYPE=ja_JP.eucJP (charmap=EUC-JP)
Versions of packages aptitude depends on:
ii apt [libapt-pkg-libc6 0.5.28.6 Advanced front-end for dpkg
ii libc6 2.3.2.ds1-22sarge4 GNU C Library: Shared libraries an
ii libgcc1 1:3.4.3-13sarge1 GCC support library
ii libncurses5 5.4-4 Shared libraries for terminal hand
ii libsigc++-1.2-5c102 1.2.5-4 type-safe Signal Framework for C++
ii libstdc++5 1:3.3.5-13 The GNU Standard C++ Library v3
-- no debconf information
[showdeps-1.diff (text/x-c, attachment)]
[showdeps-2.diff (text/x-c, attachment)]
Information forwarded to debian-bugs-dist@lists.debian.org:
Bug#395007; Package aptitude.
Acknowledgement sent to Daniel Burrows <dburrows@debian.org>:
Extra info received and forwarded to list.
Message #10 received at 395007@bugs.debian.org:
On Tue, Oct 24, 2006 at 08:16:31PM +0900, Kobayashi Noritada <nori1@dolphin.c.u-tokyo.ac.jp> was heard to say:
> akira yamada (akira) reported on his book[1] about deb package and packaging
> that Japanese descriptive text from `aptitude --show-deps' is broken.
> Actually, following code at reason_string_list in src/cmdline/cmdline_prompt.cc
> cannot handle multibyte characters:
>
> s+=const_cast<pkgCache::DepIterator &>(why->dep).DepType()[0];
>
> Here attached two patches to solve this problem: one is a version I made by
> referring to sample code[2] from Junichi Uekawa (dancer), and the other is a
> version privately presented from Kouhei Sutou. I rebuilt with each of these
> patches and checked that both can work as a solution. However, although I can
> understand basically what these patches are doing, I don't have a good C++
> skill, and cannot determine which is a better solution nor create a better and
> compact patch free of side effect.
>
> Could you please choose one or create a better solution to fix this bug? :-)
It seems to me like this does the job:
diff -rN -u old-head/src/cmdline/cmdline_prompt.cc new-head/src/cmdline/cmdline_prompt.cc
--- old-head/src/cmdline/cmdline_prompt.cc 2006-10-25 16:54:54.000000000 -0700
+++ new-head/src/cmdline/cmdline_prompt.cc 2006-10-25 16:54:54.000000000 -0700
@@ -19,6 +19,7 @@
 
 #include <vscreen/fragment.h>
 #include <vscreen/vscreen.h>
+#include <vscreen/transcode.h>
 
 #include <apt-pkg/algorithms.h>
 #include <apt-pkg/dpkgpm.h>
@@ -83,7 +84,8 @@
  first=false;
  }
 
- s+=const_cast<pkgCache::DepIterator &>(why->dep).DepType()[0];
+ wstring dep_name = transcode(const_cast<pkgCache::DepIterator &>(why->dep).DepType());
+ s += transcode(dep_name.substr(0, 1));
  s+=": ";
  s+=why->pkg.Name();
  }
The main drawback is that it doesn't do error-checking, which means
that a broken translation file will result in all the dependency types
turning into "?". This leads me to a broader question, though: is the
first character really a suitable abbreviation for the dependency type
in all languages? I wonder, for instance, whether just the first
Chinese character will be understood by Chinese speakers as a shortening
of the dependency type string. Probably I should eventually add a
special set of "dependency abbreviation" translations, but right now
Christian will kill me if I do that. ;-)
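For readers without the aptitude tree at hand, the same idea (convert to wide characters, keep the first one, convert back) can be sketched with standard library calls only. This is an assumption-laden illustration, not aptitude's transcode(): the name first_char_via_wide is hypothetical, std::mbstowcs and std::wcrtomb stand in for the vscreen helpers, and a failed conversion is mapped to "?" to mirror the behaviour described above.

```cpp
#include <climits>
#include <cstdlib>
#include <cstring>
#include <cwchar>
#include <string>

// Returns the first character of `text` (as encoded in the current LC_CTYPE
// locale), or "?" if the text cannot be converted.
std::string first_char_via_wide(const std::string &text) {
    if (text.empty())
        return std::string();

    // A multibyte string never decodes to more wide characters than bytes.
    std::wstring wide(text.size(), L'\0');
    const std::size_t count =
        std::mbstowcs(&wide[0], text.c_str(), wide.size());
    if (count == static_cast<std::size_t>(-1) || count == 0)
        return "?";
    wide.resize(count);

    // Same step as dep_name.substr(0, 1) in the patch above.
    const std::wstring first = wide.substr(0, 1);

    // Convert that single wide character back to the locale's encoding.
    char buf[MB_LEN_MAX];
    std::mbstate_t state;
    std::memset(&state, 0, sizeof state);
    const std::size_t len = std::wcrtomb(buf, first[0], &state);
    if (len == static_cast<std::size_t>(-1))
        return "?";
    return std::string(buf, len);
}
```

The result depends on the active locale, so a program using this would normally call std::setlocale(LC_ALL, "") once at startup; without that, only ASCII input is guaranteed to convert.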
> --- src/cmdline/cmdline_prompt.cc.orig 2006-10-24 03:16:42.000000000 +0900
> +++ src/cmdline/cmdline_prompt.cc 2006-10-24 18:01:46.000000000 +0900
> @@ -83,7 +83,13 @@
> first=false;
> }
>
> - s+=const_cast<pkgCache::DepIterator &>(why->dep).DepType()[0];
> + mbstate_t mbstate;
> + size_t len;
> + char *dep_type=strdup(const_cast<pkgCache::DepIterator &>(why->dep).DepType());
> + memset(&mbstate, 0, sizeof(mbstate));
> + len=mbrlen(dep_type, strlen(dep_type), &mbstate);
> + dep_type[len]=0;
> + s+=dep_type;
> s+=": ";
> s+=why->pkg.Name();
> }
I assume this is pretty efficient, but it's also not consistent with
the rest of the aptitude codebase (and I doubt that efficiency matters
here).
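As a side note on the mbrlen approach quoted above: the patch as posted does not check the error returns of mbrlen ((size_t)-1 for an invalid sequence, (size_t)-2 for an incomplete one) before using the result as an index, and the buffer from strdup is never freed. A hedged sketch of the same technique with those two points addressed might look like this; the function name first_multibyte_char and the "?" fallback are assumptions made for the example, not part of either patch.

```cpp
#include <cstring>
#include <cwchar>
#include <string>

// Returns the bytes of the first multibyte character of `text` in the
// current LC_CTYPE locale, or "?" if the leading bytes are invalid or
// incomplete. An empty input yields an empty string.
std::string first_multibyte_char(const char *text) {
    if (*text == '\0')
        return std::string();

    std::mbstate_t state;
    std::memset(&state, 0, sizeof state);  // start in the initial shift state

    const std::size_t len = std::mbrlen(text, std::strlen(text), &state);
    if (len == static_cast<std::size_t>(-1) ||
        len == static_cast<std::size_t>(-2))
        return "?";

    // Copy just the first character; no strdup, so nothing to free.
    return std::string(text, len);
}
```

Like the patch, this relies on the locale having been set with setlocale; under the default "C" locale only ASCII input is guaranteed to be accepted.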
> --- src/cmdline/cmdline_prompt.cc.orig 2006-10-24 03:16:42.000000000 +0900
> +++ src/cmdline/cmdline_prompt.cc 2006-10-24 18:36:45.000000000 +0900
> @@ -19,6 +19,7 @@
>
> #include <vscreen/fragment.h>
> #include <vscreen/vscreen.h>
> +#include <vscreen/transcode.h>
>
> #include <apt-pkg/algorithms.h>
> #include <apt-pkg/dpkgpm.h>
> @@ -83,7 +84,27 @@
> first=false;
> }
>
> - s+=const_cast<pkgCache::DepIterator &>(why->dep).DepType()[0];
> +
> + bool converting_success = false;
> + std::string dep_type;
> + std::wstring w_dep_type;
> +
> + dep_type = const_cast<pkgCache::DepIterator &>(why->dep).DepType();
> + if (transcode(dep_type, w_dep_type))
> + {
> + std::string dep_type_first_char;
> + std::wstring w_dep_type_first_char;
> + w_dep_type_first_char = w_dep_type.substr(0, 1);
> + if (transcode(w_dep_type_first_char, dep_type_first_char))
> + {
> + s+=dep_type_first_char;
> + converting_success = true;
> + }
> + }
> +
> + if (!converting_success)
> + s+=dep_type[0];
> +
> s+=": ";
> s+=why->pkg.Name();
> }
This uses aptitude's conventions for transcoding strings, but the
verbosity is a bit awkward. Moreover, if the transcoding fails, falling
back to the first byte of the original string won't help: a failed
transcoding probably means that the string is in a charset that can't be
displayed!
I'd lean in favor of the simple approach as a short-term solution, and
using a proper separate translation in the long term.
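One way to picture the long-term idea of separately translated abbreviations is a table in which each dependency type has its own short form instead of a prefix of the full label. Everything in this sketch is hypothetical (the DepKind enum, the function names, and the particular abbreviations); it does not reflect how aptitude or apt represent dependency types, and the step of marking each string for translation is left out.

```cpp
#include <string>

// Hypothetical dependency kinds, standing in for apt's own representation.
enum class DepKind { Depends, PreDepends, Recommends, Suggests, Conflicts };

// Each abbreviation is an independent string, so a translator could supply
// whatever short form reads naturally in the target language.
std::string dep_abbreviation(DepKind kind) {
    switch (kind) {
        case DepKind::Depends:    return "D";
        case DepKind::PreDepends: return "PD";
        case DepKind::Recommends: return "R";
        case DepKind::Suggests:   return "S";
        case DepKind::Conflicts:  return "C";
    }
    return "?";  // unknown kind
}

// Builds one "abbreviation: package" entry of the kind the patched code
// assembles by hand.
std::string reason_entry(DepKind kind, const std::string &pkg) {
    return dep_abbreviation(kind) + ": " + pkg;
}
```

With this shape no byte or character slicing is needed at all, at the cost of extra strings for translators to maintain.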
Daniel
Information forwarded to debian-bugs-dist@lists.debian.org, Daniel Burrows <dburrows@debian.org>:
Bug#395007; Package aptitude.
Acknowledgement sent to Kobayashi Noritada <nori1@dolphin.c.u-tokyo.ac.jp>:
Extra info received and forwarded to list. Copy sent to Daniel Burrows <dburrows@debian.org>.
Message #15 received at 395007@bugs.debian.org:
Hi,
From: Daniel Burrows
Subject: Re: Bug#395007: aptitude --show-deps is broken for multibyte descriptive text
Date: Wed, 25 Oct 2006 17:08:26 -0700
> On Tue, Oct 24, 2006 at 08:16:31PM +0900, Kobayashi Noritada <nori1@dolphin.c.u-tokyo.ac.jp> was heard to say:
> > Actually, following code at reason_string_list in src/cmdline/cmdline_prompt.cc
> > cannot handle multibyte characters:
> >
> > s+=const_cast<pkgCache::DepIterator &>(why->dep).DepType()[0];
> It seems to me like this does the job:
>
> diff -rN -u old-head/src/cmdline/cmdline_prompt.cc new-head/src/cmdline/cmdline_prompt.cc
> --- old-head/src/cmdline/cmdline_prompt.cc 2006-10-25 16:54:54.000000000 -0700
> +++ new-head/src/cmdline/cmdline_prompt.cc 2006-10-25 16:54:54.000000000 -0700
> @@ -19,6 +19,7 @@
>
> #include <vscreen/fragment.h>
> #include <vscreen/vscreen.h>
> +#include <vscreen/transcode.h>
>
> #include <apt-pkg/algorithms.h>
> #include <apt-pkg/dpkgpm.h>
> @@ -83,7 +84,8 @@
> first=false;
> }
>
> - s+=const_cast<pkgCache::DepIterator &>(why->dep).DepType()[0];
> + wstring dep_name = transcode(const_cast<pkgCache::DepIterator &>(why->dep).DepType());
> + s += transcode(dep_name.substr(0, 1));
> s+=": ";
> s+=why->pkg.Name();
> }
I've rebuilt and checked that this fix works.
> The main drawback is that it doesn't do error-checking, which means
> that a broken translation file will result in all the dependency types
> turning into "?". This leads me to a broader question, though: is the
> first character really a suitable abbreviation for the dependency type
> in all languages? I wonder, for instance, whether just the first
> Chinese character will be understood by Chinese speakers as a shortening
> of the dependency type string. Probably I should eventually add a
> special set of "dependency abbreviation" translations, but right now
> Christian will kill me if I do that. ;-)
Yes, you are right. Fortunately, at least in the Japanese translation,
which expresses dependency types in Chinese characters and is completely
different from the English text, the first character of the dependency
type is a valid abbreviation, but I cannot say that the same holds for
every language. As you say, this should be treated as a quick hack.
> > - s+=const_cast<pkgCache::DepIterator &>(why->dep).DepType()[0];
> > + mbstate_t mbstate;
> > + size_t len;
> > + char *dep_type=strdup(const_cast<pkgCache::DepIterator &>(why->dep).DepType());
> > + memset(&mbstate, 0, sizeof(mbstate));
> > + len=mbrlen(dep_type, strlen(dep_type), &mbstate);
> > + dep_type[len]=0;
> > + s+=dep_type;
> I assume this is pretty efficient, but it's also not consistent with
> the rest of the aptitude codebase (and I doubt that efficiency matters
> here).
Exactly. The inconsistency of this patch with the rest of the code bothered me as well...
> I'd lean in favor of the simple approach as a short-term solution, and
> using a proper separate translation in the long term.
Thank you. I agree with that policy.
Many thanks,
-nori
Reply sent to Daniel Burrows <dburrows@debian.org>:
You have taken responsibility.
Notification sent to Kobayashi Noritada <nori1@dolphin.c.u-tokyo.ac.jp>:
Bug acknowledged by developer.
Message #20 received at 395007-close@bugs.debian.org:
Source: aptitude
Source-Version: 0.4.4-1
We believe that the bug you reported is fixed in the latest version of
aptitude, which is due to be installed in the Debian FTP archive:
aptitude-doc-cs_0.4.4-1_all.deb
to pool/main/a/aptitude/aptitude-doc-cs_0.4.4-1_all.deb
aptitude-doc-en_0.4.4-1_all.deb
to pool/main/a/aptitude/aptitude-doc-en_0.4.4-1_all.deb
aptitude-doc-fi_0.4.4-1_all.deb
to pool/main/a/aptitude/aptitude-doc-fi_0.4.4-1_all.deb
aptitude-doc-fr_0.4.4-1_all.deb
to pool/main/a/aptitude/aptitude-doc-fr_0.4.4-1_all.deb
aptitude_0.4.4-1.diff.gz
to pool/main/a/aptitude/aptitude_0.4.4-1.diff.gz
aptitude_0.4.4-1.dsc
to pool/main/a/aptitude/aptitude_0.4.4-1.dsc
aptitude_0.4.4-1_i386.deb
to pool/main/a/aptitude/aptitude_0.4.4-1_i386.deb
aptitude_0.4.4.orig.tar.gz
to pool/main/a/aptitude/aptitude_0.4.4.orig.tar.gz
A summary of the changes between this version and the previous one is
attached.
Thank you for reporting the bug, which will now be closed. If you
have further comments please address them to 395007@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.
Debian distribution maintenance software
pp.
Daniel Burrows <dburrows@debian.org> (supplier of updated aptitude package)
(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@debian.org)
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Format: 1.7
Date: Thu, 26 Oct 2006 21:02:00 -0700
Source: aptitude
Binary: aptitude-doc-cs aptitude-doc-fr aptitude-doc-fi aptitude-doc-en aptitude
Architecture: source all i386
Version: 0.4.4-1
Distribution: unstable
Urgency: low
Maintainer: Daniel Burrows <dburrows@debian.org>
Changed-By: Daniel Burrows <dburrows@debian.org>
Description:
aptitude - terminal-based apt frontend
aptitude-doc-cs - Czech manual for aptitude, a terminal-based apt frontend
aptitude-doc-en - English manual for aptitude, a terminal-based apt frontend
aptitude-doc-fi - Finnish manual for aptitude, a terminal-based apt frontend
aptitude-doc-fr - French manual for aptitude, a terminal-based apt frontend
Closes: 38973 351531 351531 361050 374919 381481 386307 386852 387336 387537 387579 387734 387803 388045 388401 388552 388552 388594 389581 389583 389763 389942 390736 390971 391061 391531 391663 391684 392305 392305 392305 392870 392903 392924 393070 393643 394696 395007 395201
Changes:
aptitude (0.4.4-1) unstable; urgency=low
.
* New upstream release.
.
- Bulleting has been fixed and re-enabled by default.
(Closes: #388594)
.
- Change the default settings to leave unused Linux kernel
images on the system. (Closes: #386307)
.
- Produce more useful errors for corrupted or unverifiable downloads
(Closes: #387537).
.
- Make minibuffer messages disappear when a key is pressed again.
(Closes: #395201)
.
- Remove an assertion about the timing behavior of timed mutex locks,
which apparently behave differently in virtual machines.
(Closes: #381481)
.
- Document the "unhold" command-line action. (Closes: #387336)
.
- Make the package selected by a search appear at the top of the
screen, so that it's visible underneath the search dialog.
(Closes: #389763)
.
- Make the progress indicator less visually distracting by eliminating
the yellow "progress" effect (which on many systems just produces a
distracting yellow and blue flashing) (Closes: #390971).
.
- Unblock all signals (particularly WINCH) before running dpkg, so
that processes spawned by dpkg don't end up with weird signal masks.
(Closes: #392870)
.
- Use the first *character*, not the first byte, when abbreviating
dependency names in the command-line preview (Closes: #395007).
.
- Documentation fixes from Kobayashi Noritada (Closes: #389942).
.
- Translation updates:
* Basque (Closes: #38973)
* Brazilian (Closes: #387734)
* Catalan
* Chinese (Simplified) (Closes: #392305)
* Chinese (Traditional)
* Czech (Closes: #361050)
* Danish
* Dutch (Closes: #393643)
* Dzongkha (Closes: #388045)
* Finnish (Closes: #351531)
* French (Closes: #388552, #351531)
* Galician (Closes: #387579)
* German
* Hungarian
* Italian
* Japanese (Closes: #389581, #389583, #390736, #391061)
* Khmer (Closes: #374919)
* Kurdish (Closes: #387803)
* Norwegian Bokmal (Closes: #391684)
* Portuguese (Closes: #393070)
* Romanian (Closes: #388401)
* Russian (Closes: #392305)
* Slovak (Closes: #386852, #394696)
* Spanish (Closes: #391663)
* Swedish (Closes: #391531)
* Turkish (Closes: #392305)
* Vietnamese (Closes: #388552, #392903, #392924)
Files:
d2a2a8265e452836399698f36802217e 802 admin - aptitude_0.4.4-1.dsc
cdb1ffb692ba17376859dc51a361ac94 5281245 admin - aptitude_0.4.4.orig.tar.gz
081c5886ef0c7a0ff679a78dee228e96 24721 admin - aptitude_0.4.4-1.diff.gz
3f5b5b06765f081a1bbe29649c92509c 338182 doc optional aptitude-doc-cs_0.4.4-1_all.deb
0a15f3e63e070453b734923e56193fb1 324392 doc optional aptitude-doc-en_0.4.4-1_all.deb
fc67aadeda25e39cb0be8d49585d1b9e 256578 doc optional aptitude-doc-fi_0.4.4-1_all.deb
2a78f29545279874cd75cb9e7b61ea53 266760 doc optional aptitude-doc-fr_0.4.4-1_all.deb
0cbb6dd4a3131175d0756faf82959660 2869878 admin important aptitude_0.4.4-1_i386.deb
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
iD8DBQFFQZFYch6xsM7kSXgRAmHlAKD1O/qpaPXqDBKPbvl3idPsJ1O4kgCg1R9q
BCpZimErZ3TqM662jouDYjk=
=wuOG
-----END PGP SIGNATURE-----
Bug archived.
Request was from Debbugs Internal Request <owner@bugs.debian.org>
to internal_control@bugs.debian.org.
(Sun, 24 Jun 2007 12:25:31 GMT).