Debian Bug report logs -
#555331
[col] improperly fails with Invalid or incomplete multibyte or wide character
Reported by: Raphael Hertzog <hertzog@debian.org>
Date: Mon, 9 Nov 2009 12:03:02 UTC
Severity: serious
Tags: fixed-upstream
Found in version man-db/2.5.6-3
Fixed in version man-db/2.5.6-4
Done: Colin Watson <cjwatson@debian.org>
Bug is archived. No further changes may be made.
Report forwarded
to debian-bugs-dist@lists.debian.org, lintian@packages.debian.org, man-db@packages.debian.org, Debian Bsdmainutils Team <pkg-bsdmainutils@teams.debian.net>:
Bug#555331; Package bsdmainutils.
(Mon, 09 Nov 2009 12:03:05 GMT) (full text, mbox, link).
Acknowledgement sent
to Raphael Hertzog <hertzog@debian.org>:
New Bug report received and forwarded. Copy sent to lintian@packages.debian.org, man-db@packages.debian.org, Debian Bsdmainutils Team <pkg-bsdmainutils@teams.debian.net>.
(Mon, 09 Nov 2009 12:03:05 GMT) (full text, mbox, link).
Message #5 received at submit@bugs.debian.org (full text, mbox, reply):
Package: bsdmainutils
Version: 8.0.1
Severity: serious
Since today I get lots of lintian warnings (manpage-has-errors-from-man)
on my dpkg builds because col fails with:
col: Invalid or incomplete multibyte or wide character
You can reproduce it by doing this:
LANG=C man --warnings -E UTF-8 -l /usr/share/man/man8/update-alternatives.8.gz >/dev/null
I don't know if it's col's fault or if it's man-db that does not use col
properly but since col changed recently (and not man-db), I filed the bug
against col. Note that dropping LANG=C makes the warning go away so it's
most certainly locale related. Using any other locale seems to work, even
one that is not UTF-8.
Severity serious to avoid propagation to testing until we know more on the
nature of the problem.
Cheers,
-- System Information:
Debian Release: squeeze/sid
APT prefers unstable
APT policy: (500, 'unstable'), (500, 'testing'), (500, 'stable'), (150, 'experimental')
Architecture: i386 (x86_64)
Kernel: Linux 2.6.30-2-amd64 (SMP w/2 CPU cores)
Locale: LANG=fr_FR.UTF-8, LC_CTYPE=fr_FR.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Versions of packages bsdmainutils depends on:
ii  bsdutils     1:2.16.1-4      Basic utilities from 4.4BSD-Lite
ii  debianutils  3.2.1           Miscellaneous utilities specific t
ii  libc6        2.10.1-5        GNU C Library: Shared libraries
ii  libncurses5  5.7+20090803-2  shared libraries for terminal hand
bsdmainutils recommends no packages.
Versions of packages bsdmainutils suggests:
ii  cpp                   4:4.3.4-1  The GNU C preprocessor (cpp)
pn  vacation              <none>     (no description available)
ii  wamerican [wordlist]  6-3        American English dictionary words
ii  wfrench [wordlist]    1.2.3-7    French dictionary words for /usr/s
ii  whois                 4.7.36     an intelligent whois client
-- no debconf information
--
Raphaël Hertzog
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian Bsdmainutils Team <pkg-bsdmainutils@teams.debian.net>:
Bug#555331; Package bsdmainutils.
(Mon, 09 Nov 2009 14:33:09 GMT) (full text, mbox, link).
Acknowledgement sent
to Michael Meskes <meskes@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian Bsdmainutils Team <pkg-bsdmainutils@teams.debian.net>.
(Mon, 09 Nov 2009 14:33:09 GMT) (full text, mbox, link).
Message #10 received at 555331@bugs.debian.org (full text, mbox, reply):
On Mon, Nov 09, 2009 at 12:48:03PM +0100, Raphael Hertzog wrote:
> I don't know if it's col's fault or if it's man-db that does not use col
> properly but since col changed recently (and not man-db), I filed the bug
> against col. Note that dropping LANG=C makes the warning go away so it's
> most certainly locale related. Using any other locale seems to work, even
> one that is not UTF-8.
Please see #555330 for some details as I already saw the same thing. What
happens is that col sets the locale accordingly (to C) and then reads the
document using getwchar(). This operation returns the error you mentioned upon
reading the UTF-8 hyphen (e2 80 90). To me this doesn't look like a bug in col,
but an incorrect call from lintian as man is asked to produce UTF-8 encoding
while col isn't switched to it. Apparently the C locale does not define the
encoding.
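For illustration only, here is a minimal Python sketch of why those three bytes are a problem. It does not call col or getwchar(); it assumes ASCII as a stand-in for the character set of the C locale and simply shows that e2 80 90 is one valid character in UTF-8 but is rejected at the first byte under ASCII.

```python
# The three bytes cited in the message above: e2 80 90.
hyphen_bytes = bytes.fromhex("e28090")

# Interpreted as UTF-8, they form a single character, U+2010 HYPHEN.
decoded = hyphen_bytes.decode("utf-8")
print(f"UTF-8: U+{ord(decoded):04X}")

# Interpreted as ASCII (an assumption standing in for the C locale's
# character set), the very first byte is already out of range.
try:
    hyphen_bytes.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc
print("ASCII:", ascii_error)
```

A reader that decodes its input according to the locale, as col is described as doing here, hits the same kind of failure when the locale's character set cannot represent the incoming bytes.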
Michael
--
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
ICQ: 179140304, AIM/Yahoo/Skype: michaelmeskes, Jabber: meskes@jabber.org
VfL Borussia! Forca Barca! Go SF 49ers! Use: Debian GNU/Linux, PostgreSQL
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian Bsdmainutils Team <pkg-bsdmainutils@teams.debian.net>:
Bug#555331; Package bsdmainutils.
(Mon, 09 Nov 2009 15:36:05 GMT) (full text, mbox, link).
Acknowledgement sent
to Colin Watson <cjwatson@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian Bsdmainutils Team <pkg-bsdmainutils@teams.debian.net>.
(Mon, 09 Nov 2009 15:36:06 GMT) (full text, mbox, link).
Message #15 received at 555331@bugs.debian.org (full text, mbox, reply):
On Mon, Nov 09, 2009 at 12:48:03PM +0100, Raphael Hertzog wrote:
> Package: bsdmainutils
> Version: 8.0.1
> Severity: serious
>
> Since today I get lots of lintian warnings (manpage-has-errors-from-man)
> on my dpkg builds because col fails with:
> col: Invalid or incomplete multibyte or wide character
>
> You can reproduce it by doing this:
> LANG=C man --warnings -E UTF-8 -l /usr/share/man/man8/update-alternatives.8.gz >/dev/null
>
> I don't know if it's col's fault or if it's man-db that does not use col
> properly but since col changed recently (and not man-db), I filed the bug
> against col. Note that dropping LANG=C makes the warning go away so it's
> most certainly locale related. Using any other locale seems to work, even
> one that is not UTF-8.
>
> Severity serious to avoid propagation to testing until we know more on the
> nature of the problem.
This bug is somewhere in the intersection of bsdmainutils, man-db,
lintian, and locales. Have fun. :-)
The proximate cause is that man uses -Tutf8 and thus outputs UTF-8
hyphens even under LANG=C (compare #547695), and that confuses col now
that it knows about the encoding of its input data.
However, the upstream patch referred to in #547695 is not sufficient
here. lintian uses the '-E UTF-8' option, which forces man to use UTF-8,
overriding the default. This used to work fine when col was dumb; now
that it's smart, things are a bit more problematic. The reason that
lintian does this is that it needs to force UTF-8 output somehow or else
CJK manual pages tend not to work properly, but there is no UTF-8 locale
that's guaranteed to be available on all systems.
In the short term, I think the best approach would be for man to set
LC_CTYPE to some appropriate locale that matches the encoding requested
by -E while running col. I'll see if I can arrange for this. However,
such a locale is not actually guaranteed to exist. Perhaps lintian needs
to generate a UTF-8 locale if it can't find one otherwise, a bit like
the hack in installation-locale; or perhaps we should just make sure
that there's always a C.UTF-8 locale on the system, which could be used
to get UTF-8 character type semantics without implying a particular
language or country.
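As a rough sketch of the short-term idea (not man-db's code), the Python below models the standard precedence of the locale environment variables for the character-type category: LC_ALL first, then LC_CTYPE, then LANG, else "C". The helper names effective_ctype and env_for_col are hypothetical, and "en_US.UTF-8" is only an example locale that may not exist on a given system.

```python
def effective_ctype(env):
    """Return the locale name that governs LC_CTYPE for this environment.

    Follows the usual precedence: LC_ALL, then LC_CTYPE, then LANG.
    Unset or empty variables are skipped; the fallback is "C".
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"


def env_for_col(parent_env, ctype_locale):
    """Copy the parent environment and override only LC_CTYPE (hypothetical helper)."""
    child = dict(parent_env)
    child["LC_CTYPE"] = ctype_locale
    return child


parent = {"LANG": "C"}
print(effective_ctype(parent))
print(effective_ctype(env_for_col(parent, "en_US.UTF-8")))
```

Overriding only LC_CTYPE leaves messages and other categories under LANG=C. Note that if LC_ALL is set in the parent environment it still wins, so a real implementation would have to account for that as well.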
--
Colin Watson [cjwatson@debian.org]
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian Bsdmainutils Team <pkg-bsdmainutils@teams.debian.net>:
Bug#555331; Package bsdmainutils.
(Mon, 09 Nov 2009 16:42:10 GMT) (full text, mbox, link).
Acknowledgement sent
to Colin Watson <cjwatson@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian Bsdmainutils Team <pkg-bsdmainutils@teams.debian.net>.
(Mon, 09 Nov 2009 16:42:10 GMT) (full text, mbox, link).
Message #20 received at 555331@bugs.debian.org (full text, mbox, reply):
reassign 555331 man-db 2.5.6-3
user man-db@packages.debian.org
usertags 555331 target-2.5.7
tags 555331 fixed-upstream
clone 555331 -1
reassign -1 lintian 2.2.17
retitle -1 lintian: ensure that there's always a UTF-8 locale for use when running man?
severity -1 wishlist
thanks
On Mon, Nov 09, 2009 at 03:15:02PM +0000, Colin Watson wrote:
> In the short term, I think the best approach would be for man to set
> LC_CTYPE to some appropriate locale that matches the encoding requested
> by -E while running col. I'll see if I can arrange for this.
Fixed upstream, so I'm going to claim this as a man-db bug:
Mon Nov 9 16:27:44 GMT 2009 Colin Watson <cjwatson@debian.org>
* src/encodings.c (find_charset_locale): New function.
* src/encodings.h (find_charset_locale): Add prototype.
* src/man.c (make_roff_command): When invoking col, ensure that
LC_CTYPE is set to an appropriate locale for the selected
character set (Debian bug #555331).
* NEWS: Document this.
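The changelog names find_charset_locale but does not show it. Purely as an illustration of the general idea, and not a rendering of the actual man-db function, the Python sketch below picks the first locale whose character set matches the requested one from text in the "locale charset" two-column format used by /usr/share/i18n/SUPPORTED. The function names and the sample data are assumptions.

```python
def normalize_charset(name):
    """Compare charset names loosely, so "UTF-8" and "utf8" match."""
    return name.replace("-", "").replace("_", "").lower()


def find_locale_for_charset(supported_text, charset):
    """Return the first locale listed with the given charset, or None.

    supported_text is expected to hold lines of the form
    "<locale> <charset>"; blank lines and '#' comments are skipped.
    """
    wanted = normalize_charset(charset)
    for line in supported_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 2:
            continue
        locale_name, locale_charset = fields
        if normalize_charset(locale_charset) == wanted:
            return locale_name
    return None


# Sample data in the assumed format, not read from a real system.
sample = "aa_DJ ISO-8859-1\naa_DJ.UTF-8 UTF-8\n"
print(find_locale_for_charset(sample, "UTF-8"))
```

A locale being listed as supported does not mean it has been generated on the system, so a lookup like this can only suggest candidates; the chosen locale still has to be usable at run time.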
> However, such a locale is not actually guaranteed to exist. Perhaps
> lintian needs to generate a UTF-8 locale if it can't find one
> otherwise, a bit like the hack in installation-locale; or perhaps we
> should just make sure that there's always a C.UTF-8 locale on the
> system, which could be used to get UTF-8 character type semantics
> without implying a particular language or country.
I've cloned a bug for this.
--
Colin Watson [cjwatson@debian.org]
Bug reassigned from package 'bsdmainutils' to 'man-db'.
Request was from Colin Watson <cjwatson@debian.org>
to control@bugs.debian.org.
(Mon, 09 Nov 2009 16:57:13 GMT) (full text, mbox, link).
Bug No longer marked as found in versions bsdmainutils/8.0.1.
Request was from Colin Watson <cjwatson@debian.org>
to control@bugs.debian.org.
(Mon, 09 Nov 2009 16:57:13 GMT) (full text, mbox, link).
Bug Marked as found in versions man-db/2.5.6-3.
Request was from Colin Watson <cjwatson@debian.org>
to control@bugs.debian.org.
(Mon, 09 Nov 2009 16:57:14 GMT) (full text, mbox, link).
Added tag(s) fixed-upstream.
Request was from Colin Watson <cjwatson@debian.org>
to control@bugs.debian.org.
(Mon, 09 Nov 2009 16:57:15 GMT) (full text, mbox, link).
Bug 555331 cloned as bug 555408.
Request was from Colin Watson <cjwatson@debian.org>
to control@bugs.debian.org.
(Mon, 09 Nov 2009 16:57:16 GMT) (full text, mbox, link).
Reply sent
to Colin Watson <cjwatson@debian.org>:
You have taken responsibility.
(Tue, 10 Nov 2009 12:36:12 GMT) (full text, mbox, link).
Notification sent
to Raphael Hertzog <hertzog@debian.org>:
Bug acknowledged by developer.
(Tue, 10 Nov 2009 12:36:12 GMT) (full text, mbox, link).
Message #35 received at 555331-close@bugs.debian.org (full text, mbox, reply):
Source: man-db
Source-Version: 2.5.6-4
We believe that the bug you reported is fixed in the latest version of
man-db, which is due to be installed in the Debian FTP archive:
man-db_2.5.6-4.diff.gz
to main/m/man-db/man-db_2.5.6-4.diff.gz
man-db_2.5.6-4.dsc
to main/m/man-db/man-db_2.5.6-4.dsc
man-db_2.5.6-4_i386.deb
to main/m/man-db/man-db_2.5.6-4_i386.deb
A summary of the changes between this version and the previous one is
attached.
Thank you for reporting the bug, which will now be closed. If you
have further comments please address them to 555331@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.
Debian distribution maintenance software
pp.
Colin Watson <cjwatson@debian.org> (supplier of updated man-db package)
(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@debian.org)
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Format: 1.8
Date: Tue, 10 Nov 2009 11:58:25 +0000
Source: man-db
Binary: man-db
Architecture: source i386
Version: 2.5.6-4
Distribution: unstable
Urgency: low
Maintainer: Colin Watson <cjwatson@debian.org>
Changed-By: Colin Watson <cjwatson@debian.org>
Description:
man-db - on-line manual pager
Closes: 547695 553623 554914 555331
Changes:
man-db (2.5.6-4) unstable; urgency=low
.
* Backport from trunk:
- If the locale encoding is ASCII, then use the ascii device even if
preconv is available; it will do a better job than producing UTF-8
output and then recoding that to ASCII (closes: #547695).
- Include <unistd.h> in src/encodings.c for dup and STDIN_FILENO
(closes: #553623).
- When invoking col, ensure that LC_CTYPE is set to an appropriate
locale for the selected character set (closes: #555331).
* Add man-db/auto-update debconf template, which may be preseeded to false
to disable rebuilding the database when man-db is triggered (closes:
#554914).
Checksums-Sha1:
908e668f6580e03e10af58c8e22fe98b5e6ce05c 1090 man-db_2.5.6-4.dsc
f80c80d65f5286188222807f0aa79b9916dee98e 67315 man-db_2.5.6-4.diff.gz
9795b9780522a5a23e17105ecc66dc2c609f5d68 1176396 man-db_2.5.6-4_i386.deb
Checksums-Sha256:
a2baa707bb6296e94ede4adc4fd556051fab07831fa5ab28b65ebc9f790271aa 1090 man-db_2.5.6-4.dsc
0f2d7d9492d0dcd308b2f3f346cfbb8b9eef68cdb0f52203cac833b9f83e383f 67315 man-db_2.5.6-4.diff.gz
a563ab65f8a635df85a0cd9a93a37aa68b58c25c7d00611021998180f9dc548e 1176396 man-db_2.5.6-4_i386.deb
Files:
d5bf3146bade6d031fd6d245b7383312 1090 doc important man-db_2.5.6-4.dsc
86cf07f2efb8528d3a65ac73d43663dc 67315 doc important man-db_2.5.6-4.diff.gz
ece6139ef95a3461fce49e9e1ac4113d 1176396 doc important man-db_2.5.6-4_i386.deb
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Colin Watson <cjwatson@debian.org> -- Debian developer
iD8DBQFK+VY89t0zAhD6TNERAttLAJ9loPO1pnWEcTqrrgnbLAtrxJ1L8wCfR0VP
ttFRsYCG9eHFO0wfswY+d3Q=
=f8p/
-----END PGP SIGNATURE-----
Information forwarded
to debian-bugs-dist@lists.debian.org, Colin Watson <cjwatson@debian.org>:
Bug#555331; Package man-db.
(Sat, 14 Nov 2009 13:57:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Paul Wise <pabs@debian.org>:
Extra info received and forwarded to list. Copy sent to Colin Watson <cjwatson@debian.org>.
(Sat, 14 Nov 2009 13:57:03 GMT) (full text, mbox, link).
Message #40 received at 555331@bugs.debian.org (full text, mbox, reply):
usertag 555331 + bittenby
found 555331 man-db/2.5.6-4
thanks
In an up-to-date cowbuilder chroot I still get this issue:
(cowbuilder)root@chianamo:~# LANG=C man --warnings -E UTF-8 -l /usr/share/man/man8/update-alternatives.8.gz >/dev/null
col: Invalid or incomplete multibyte or wide character
(cowbuilder)root@chianamo:~# apt-cache policy man-db
man-db:
  Installed: 2.5.6-4
  Candidate: 2.5.6-4
  Version table:
 *** 2.5.6-4 0
        500 ftp://xxxxxxxxxxxxxxx sid/main Packages
        100 /var/lib/dpkg/status
Looking at the patch, I thought it would be because the locales package
is not installed and thus /usr/share/i18n/SUPPORTED is not available.
Unfortunately, installing locales does not silence the warning:
(cowbuilder)root@chianamo:~# apt-get install locales
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
locales
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 4749kB of archives.
After this operation, 12.9MB of additional disk space will be used.
Get:1 ftp://xxxxxxxxxxxxxxxxx sid/main locales 2.10.1-7 [4749kB]
Fetched 4749kB in 35s (133kB/s)
debconf: delaying package configuration, since apt-utils is not installed
Selecting previously deselected package locales.
(Reading database ... 15286 files and directories currently installed.)
Unpacking locales (from .../locales_2.10.1-7_all.deb) ...
Processing triggers for man-db ...
Setting up locales (2.10.1-7) ...
Generating locales (this might take a while)...
Generation complete.
(cowbuilder)root@chianamo:~# LANG=C man --warnings -E UTF-8 -l /usr/share/man/man8/update-alternatives.8.gz >/dev/null
col: Invalid or incomplete multibyte or wide character
This is on amd64 in a sid cowbuilder chroot.
--
bye,
pabs
http://wiki.debian.org/PaulWise
Bug Marked as found in versions man-db/2.5.6-4; no longer marked as fixed in versions man-db/2.5.6-4 and reopened.
Request was from Paul Wise <pabs@debian.org>
to control@bugs.debian.org.
(Sat, 14 Nov 2009 13:57:04 GMT) (full text, mbox, link).
Reply sent
to Colin Watson <cjwatson@debian.org>:
You have taken responsibility.
(Sun, 15 Nov 2009 12:57:06 GMT) (full text, mbox, link).
Notification sent
to Raphael Hertzog <hertzog@debian.org>:
Bug acknowledged by developer.
(Sun, 15 Nov 2009 12:57:06 GMT) (full text, mbox, link).
Message #47 received at 555331-done@bugs.debian.org (full text, mbox, reply):
Source: man-db
Source-Version: 2.5.6-4
On Sat, Nov 14, 2009 at 09:53:37PM +0800, Paul Wise wrote:
> In an up-to-date cowbuilder chroot I still get this issue:
>
> (cowbuilder)root@chianamo:~# LANG=C man --warnings -E UTF-8 -l /usr/share/man/man8/update-alternatives.8.gz >/dev/null
> col: Invalid or incomplete multibyte or wide character
> (cowbuilder)root@chianamo:~# apt-cache policy man-db
> man-db:
>   Installed: 2.5.6-4
>   Candidate: 2.5.6-4
>   Version table:
>  *** 2.5.6-4 0
>         500 ftp://xxxxxxxxxxxxxxx sid/main Packages
>         100 /var/lib/dpkg/status
>
> Looking at the patch, I thought it would be because the locales package
> is not installed and thus /usr/share/i18n/SUPPORTED is not available.
> Unfortunately, installing locales does not silence the warning:
You've hit the corner case which I already cloned as bug 555408. I don't
think we need to keep this bug open for that as well.
As I said in a previous message:
However, such a locale is not actually guaranteed to exist. Perhaps
lintian needs to generate a UTF-8 locale if it can't find one
otherwise, a bit like the hack in installation-locale; or perhaps we
should just make sure that there's always a C.UTF-8 locale on the
system, which could be used to get UTF-8 character type semantics
without implying a particular language or country.
If you generate some random UTF-8 locale (uncomment it in
/etc/locale.gen and run 'sudo locale-gen'), then that will work around
the problem for you.
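To make the "uncomment it" step concrete, here is a small Python sketch that works on text in the /etc/locale.gen style, where disabled entries look like "# en_US.UTF-8 UTF-8". It only transforms a string; it does not touch the real file or run locale-gen. The function name enable_locale and the sample entries are assumptions for illustration.

```python
import re


def enable_locale(locale_gen_text, locale_name):
    """Return the text with the commented-out entry for locale_name enabled.

    Only lines of the form "# <locale_name> <charset>" are changed;
    every other line is passed through untouched.
    """
    pattern = re.compile(r"^#\s*(" + re.escape(locale_name) + r"\s+\S+)\s*$")
    lines = []
    for line in locale_gen_text.splitlines():
        match = pattern.match(line)
        lines.append(match.group(1) if match else line)
    return "\n".join(lines) + "\n"


# Sample text in the assumed format, not copied from a real system.
sample_gen = "# en_GB.UTF-8 UTF-8\n# en_US ISO-8859-1\n# en_US.UTF-8 UTF-8\n"
print(enable_locale(sample_gen, "en_US.UTF-8"), end="")
```

After the real file has been edited this way, running 'sudo locale-gen' as described above is what actually builds the locale.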
Regards,
--
Colin Watson [cjwatson@debian.org]
Bug archived.
Request was from Debbugs Internal Request <owner@bugs.debian.org>
to internal_control@bugs.debian.org.
(Tue, 15 Dec 2009 07:36:26 GMT) (full text, mbox, link).
Last modified:
Tue Jan 9 17:04:17 2018;