Debian Bug report logs - #195674
dialog: doesn't support multibyte characters

Package: dialog; Maintainer for dialog is Santiago Vila <sanvila@debian.org>; Source for dialog is src:dialog.

Reported by: Tomohiro KUBOTA <debian@tmail.plala.or.jp>

Date: Sun, 1 Jun 2003 13:48:02 UTC

Severity: normal

Tags: upstream

Found in version 0.9b-20030308-1

Fixed in version dialog/0.9b-20030910-1

Done: Santiago Vila <sanvila@debian.org>

Bug is archived. No further changes may be made.

Forwarded to Thomas E. Dickey <dickey@invisible-island.net>


Report forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to Tomohiro KUBOTA <debian@tmail.plala.or.jp>:
New Bug report received and forwarded. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #5 received at submit@bugs.debian.org (full text, mbox):

From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
To: submit@bugs.debian.org
Subject: dialog: doesn't support multibyte characters
Date: Sun, 01 Jun 2003 22:36:09 +0900 (JST)
Package: dialog
Version: 0.9b-20030308-1

"dialog" doesn't support multibyte characters such as UTF-8 and EUC-JP.
Please see the following screenshots.

http://www.debian.or.jp/~kubota/mojibake/debconf-dialog-dialog-ja-eucjp.png
http://www.debian.or.jp/~kubota/mojibake/debconf-dialog-dialog-ja-utf8.png
http://www.debian.or.jp/~kubota/mojibake/debconf-dialog-dialog-ru-koi8r.png
http://www.debian.or.jp/~kubota/mojibake/debconf-dialog-dialog-ru-utf8.png

These are the results of "dpkg-reconfigure debconf" in various locales
with "dialog".  Note that these screens should show five selection items.
(Note that I used an improved version of debconf which I will discuss
in debian-i18n or other mailing list soon.)

You may think that the first screenshot looks almost fine apart from a
tiny problem near the right edge.  However, the screen has a severe
problem: only two items, "Readline" and "Gnome", are shown.  The other
three items, which are translated into Japanese, are not displayed.

The two UTF-8 screenshots, in Japanese and Russian, are obviously ugly.  This
is probably because "dialog" assumes that the width of a string on screen
is the same as the number of bytes in the string, which is not true in
multibyte encodings.  They also lack selection items.
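
A minimal Python sketch of the suspected mismatch between byte counts and
screen columns.  The sample strings and the display_width helper are
assumptions made for illustration (they are not taken from dialog or from
the screenshots), and the helper only approximates what a terminal does.

```python
import unicodedata


def display_width(text):
    """Approximate the number of terminal columns needed for text."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks occupy no column of their own
        # Wide (W) and Fullwidth (F) characters take two columns; everything
        # else, including ambiguous-width Cyrillic, is counted as one here.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


# Hypothetical sample strings, not the text shown in the screenshots.
japanese = "あいうえお"
russian = "Книга"

for label, text, encoding in [
    ("ja UTF-8", japanese, "utf-8"),
    ("ru UTF-8", russian, "utf-8"),
    ("ru KOI8-R", russian, "koi8_r"),
]:
    byte_count = len(text.encode(encoding))
    print(label, "bytes:", byte_count, "columns:", display_width(text))
```

Only in the 8-bit KOI8-R case is every character both one byte and one
column, which is consistent with that screenshot being the one that looks
right.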

The KOI8-R screenshot looks fine, which shows that "dialog" works well
for 8-bit legacy character encodings.  Note that the legacy character
encodings for East Asian languages are multibyte (not 8-bit), so
East Asian users cannot use "dialog" at all.

I think you will have to consult with the upstream developer(s).

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/





Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@herndon4.his.com:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #10 received at submit@bugs.debian.org (full text, mbox):

From: Thomas Dickey <dickey@herndon4.his.com>
To: Tomohiro KUBOTA <debian@tmail.plala.or.jp>, 195674@bugs.debian.org
Cc: submit@bugs.debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sun, 1 Jun 2003 10:04:10 -0400
On Sun, Jun 01, 2003 at 10:36:09PM +0900, Tomohiro KUBOTA wrote:
> Package: dialog
> Version: 0.9b-20030308-1
> 
> "dialog" doesn't support multibyte characters such as UTF-8 and EUC-JP.
> Please see the following screenshots.
> 
> http://www.debian.or.jp/~kubota/mojibake/debconf-dialog-dialog-ja-eucjp.png
> http://www.debian.or.jp/~kubota/mojibake/debconf-dialog-dialog-ja-utf8.png
> http://www.debian.or.jp/~kubota/mojibake/debconf-dialog-dialog-ru-koi8r.png
> http://www.debian.or.jp/~kubota/mojibake/debconf-dialog-dialog-ru-utf8.png
> 
> These are the results of "dpkg-reconfigure debconf" in various locales
> with "dialog".  Note that these screens should show five selection items.
> (Note that I used an improved version of debconf which I will discuss
> in debian-i18n or other mailing list soon.)

did you try linking dialog with libncursesw?  (none of your comments indicate
that you did).

-- 
Thomas E. Dickey <dickey@invisible-island.net>
http://invisible-island.net
ftp://invisible-island.net



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to Santiago Vila <sanvila@unex.es>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #20 received at 195674@bugs.debian.org (full text, mbox):

From: Santiago Vila <sanvila@unex.es>
To: dickey@herndon4.his.com, 195674@bugs.debian.org
Cc: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sun, 1 Jun 2003 16:19:32 +0200 (CEST)
On Sun, 1 Jun 2003, Thomas Dickey wrote:

> did you try linking dialog with libncursesw?  (none of your comments
> indicate that you did).

The report refers to the Debian dialog package, which is not linked
with libncursesw. Should it be?



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@herndon4.his.com:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #25 received at 195674@bugs.debian.org (full text, mbox):

From: Thomas Dickey <dickey@herndon4.his.com>
To: Santiago Vila <sanvila@unex.es>
Cc: dickey@herndon4.his.com, 195674@bugs.debian.org, Tomohiro KUBOTA <debian@tmail.plala.or.jp>
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sun, 1 Jun 2003 10:32:09 -0400
On Sun, Jun 01, 2003 at 04:19:32PM +0200, Santiago Vila wrote:
> On Sun, 1 Jun 2003, Thomas Dickey wrote:
> 
> > did you try linking dialog with libncursesw?  (none of your comments
> > indicate that you did).
> 
> The report refers to the Debian dialog package, which is not linked
> with libncursesw. Should it be?

Essentially that's what he's complaining about.  Debian's package does
what it's designed to do:
	libncursesw handles UTF-8,
	libncurses won't.

However, note that my most recent fix for libncursesw which affects Linux
console is only a few weeks ago; Debian's package hasn't caught up to that, the
last time I noticed.  dialog + libncursesw should work well enough in X to
demonstrate the point though.

-- 
Thomas E. Dickey <dickey@invisible-island.net>
http://invisible-island.net
ftp://invisible-island.net



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to Tomohiro KUBOTA <debian@tmail.plala.or.jp>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #30 received at 195674@bugs.debian.org (full text, mbox):

From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
To: dickey@herndon4.his.com
Cc: 195674@bugs.debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sun, 01 Jun 2003 23:41:34 +0900 (JST)
Hello,

From: Thomas Dickey <dickey@herndon4.his.com>
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sun, 1 Jun 2003 10:04:10 -0400

> did you try linking dialog with libncursesw?  (none of your comments indicate
> that you did).

When I wrote the last mail, I had not.  (I used Debian Sid's "dialog"
package.)  Now I have (by recompiling the package).  However, the
situation got even worse.  No Japanese characters are readable
in either the ja_JP.eucJP or the ja_JP.UTF-8 locale.

Please check:

http://www.debian.or.jp/~kubota/mojibake/debconf-dialog-dialogw-ja-eucjp.png
http://www.debian.or.jp/~kubota/mojibake/debconf-dialog-dialogw-ja-utf8.png
http://www.debian.or.jp/~kubota/mojibake/dialogw-ja-utf8.png

Version of libncursesw: 5.3.20030510

BTW, why not replace libncurses with libncursesw?

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/





Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@herndon4.his.com:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #35 received at 195674@bugs.debian.org (full text, mbox):

From: Thomas Dickey <dickey@herndon4.his.com>
To: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
Cc: dickey@herndon4.his.com, 195674@bugs.debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sun, 1 Jun 2003 10:49:46 -0400
On Sun, Jun 01, 2003 at 11:41:34PM +0900, Tomohiro KUBOTA wrote:
> Hello,
> 
> From: Thomas Dickey <dickey@herndon4.his.com>
> Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
> Date: Sun, 1 Jun 2003 10:04:10 -0400
> 
> > did you try linking dialog with libncursesw?  (none of your comments indicate
> > that you did).
> 
> When I wrote the last mail, I have not.  (I used Debian Sid's "dialog"
> package).  Now I did (by recompiling the package).  However, the
> situation got even worse.  There are no readable Japanese characters
> in both of ja_JP.eucJP and ja_JP.UTF-8 locales.
> 
> Please check:
> 
> http://www.debian.or.jp/~kubota/mojibake/debconf-dialog-dialogw-ja-eucjp.png
> http://www.debian.or.jp/~kubota/mojibake/debconf-dialog-dialogw-ja-utf8.png
> http://www.debian.or.jp/~kubota/mojibake/dialogw-ja-utf8.png
> 
> Version of libncursesw: 5.3.20030510
> 
> BTW, why not replace libncurses with libncursesw?

it's slower and larger, does not address the same target environment.

why don't you drive a 10-ton truck to work?  (I wouldn't).

-- 
Thomas E. Dickey <dickey@invisible-island.net>
http://invisible-island.net
ftp://invisible-island.net



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to Tomohiro KUBOTA <debian@tmail.plala.or.jp>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #40 received at 195674@bugs.debian.org (full text, mbox):

From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
To: dickey@herndon4.his.com
Cc: sanvila@unex.es, 195674@bugs.debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Tue, 03 Jun 2003 08:20:38 +0900 (JST)
Hi,

From: Thomas Dickey <dickey@herndon4.his.com>
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sun, 1 Jun 2003 10:32:09 -0400

> However, note that my most recent fix for libncursesw which affects Linux
> console is only a few weeks ago; Debian's package hasn't caught up to that, the
> last time I noticed.  dialog + libncursesw should work well enough in X to
> demonstrate the point though.

I tested with libncursesw 5.3 (20030531).  However, all of my tests
failed: no Japanese characters are displayed at all.  I feel
that some drastic improvement is needed.


For example,

    http://www.debian.or.jp/~kubota/mojibake/dialogw-ja-utf8-2.png

This is an image where I tried to use Japanese (UTF-8) in '--msgbox';
the text is shown in the window title.  I suspect that 'dialog'
assumes the input (the parameter of the --msgbox option) is ISO-8859-1 and
only the output (sent to the terminal) is UTF-8.

Of course, in UTF-8 locales, both input and output must be treated
as UTF-8.  Moreover, a character may be fullwidth or zero-width.



    http://www.debian.or.jp/~kubota/mojibake/dialogw-ja-eucjp-2.png
    http://www.debian.or.jp/~kubota/mojibake/dialogw-ja-eucjp-3.png
    http://www.debian.or.jp/~kubota/mojibake/dialogw-ja-eucjp-4.png

This is in an EUC-JP locale.  The same Japanese text was supplied in EUC-JP.
The result changes drastically depending on the terminal emulator, which
suggests that the stream sent to the terminal is invalid.  I don't
understand what is happening.  dialog and libncursesw should check the
current locale and follow it, at least for major locales such as
ISO-8859-*, EUC-*, and KOI8-*, as much other software does.

In the case of xterm+luit, use of DEC line-drawing characters will always
cause mojibake in characters in G1 in the ISO-2022 sense (for example,
JIS X 0208 in EUC-JP).  This is caused by inappropriate definitions
of enacs, rmacs, and smacs in terminfo, as I pointed out a few
years ago.  Since this is off-topic here, I would like to discuss
this point elsewhere.



    http://www.debian.or.jp/~kubota/mojibake/dialog-ja-eucjp.png

This is a test using libncurses (not libncursesw) in an EUC-JP locale.
At this level, it works very well.  This is because double-byte
characters are fullwidth, so a 1:1 ratio of bytes to columns is kept
by chance.

However, since libncurses is aware of neither multibyte characters nor
fullwidth characters, it can easily break.  This image is
prepared just to demonstrate how libncursesw+dialog should look.
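
A sketch of why the 1:1 ratio is only a coincidence, assuming Python's
euc_jp codec and an approximate width rule based on
unicodedata.east_asian_width (the columns helper is hypothetical).
JIS X 0208 characters take two bytes and two columns, but halfwidth
katakana take two bytes in EUC-JP and only one column:

```python
import unicodedata


def columns(text):
    """Approximate terminal columns: 2 for wide/fullwidth characters, else 1."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


hiragana = "\u3042"   # HIRAGANA LETTER A, from JIS X 0208
halfwidth = "\uff71"  # HALFWIDTH KATAKANA LETTER A, from JIS X 0201

for ch in (hiragana, halfwidth):
    encoded = ch.encode("euc_jp")
    print(unicodedata.name(ch), "bytes:", len(encoded), "columns:", columns(ch))
```

A program that counts bytes would reserve two columns for the halfwidth
character and misplace everything that follows it on the line.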

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/





Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@herndon4.his.com:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #45 received at 195674@bugs.debian.org (full text, mbox):

From: Thomas Dickey <dickey@herndon4.his.com>
To: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
Cc: dickey@herndon4.his.com, sanvila@unex.es, 195674@bugs.debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Mon, 2 Jun 2003 19:35:33 -0400
On Tue, Jun 03, 2003 at 08:20:38AM +0900, Tomohiro KUBOTA wrote:
> Hi,
> 
> From: Thomas Dickey <dickey@herndon4.his.com>
> Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
> Date: Sun, 1 Jun 2003 10:32:09 -0400
> 
> > However, note that my most recent fix for libncursesw which affects Linux
> > console is only a few weeks ago; Debian's package hasn't caught up to that, the
> > last time I noticed.  dialog + libncursesw should work well enough in X to
> > demonstrate the point though.
> 
> I tested with libncursesw 5.3 (20030531).  However, all of my tests
> failed.  None of Japanese characters are displayed at all.  I felt
> that some drastic improvement is needed.
> 
> 
> For example,
> 
>     http://www.debian.or.jp/~kubota/mojibake/dialogw-ja-utf8-2.png
> 
> This is an image where I tried to use Japanese (UTF-8) in '--msgbox',
> which is shown in the window title.  I imagine it is likely that 'dialog'
> assume the input (parameter for --msgbox option) is ISO-8859-1 and
> only the output (sent to the terminal) is UTF-8.

something like that (see below)
 
> In case of xterm+luit, usage of DEC character (lines) will always
> cause Mojibake in characters in G1 of ISO-2022 meaning (for example,
> JIS X 0208 in EUC-JP).  This is caused by inappropriate definition
> of enacs, rmacs, and smacs in terminfo, as I pointed out this a few
> years ago.  Since this is offtopic here, I would like to discuss
> on this point elsewhere.

ok (I haven't looked at EUC-JP locales)
 
>     http://www.debian.or.jp/~kubota/mojibake/dialog-ja-eucjp.png
> 
> This is a test using libncurses (not libncursesw) in EUC-JP locale.
> In this level, it works very well.  This is because doublebyte
> characters are fullwidth and 1:1 ratio of bytes:columns is kept
> by chance.
> 
> However, since libncurses is not aware of multibyte characters nor
> fullwidth characters, it can be easily broken.  This image is 
> prepared just to demonstrate how libncursesw+dialog should be.

I have a current bug-report for waddstr() not treating multibyte characters
properly.  Perhaps once that's resolved, it will fix your issues.  (Though I
did notice before 5.3 some references to waddstr() being expected to do this,
it got lost among other to-do items).

-- 
Thomas E. Dickey <dickey@invisible-island.net>
http://invisible-island.net
ftp://invisible-island.net



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@herndon4.his.com:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #50 received at 195674@bugs.debian.org (full text, mbox):

From: Thomas Dickey <dickey@herndon4.his.com>
To: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
Cc: dickey@herndon4.his.com, sanvila@unex.es, 195674@bugs.debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sat, 14 Jun 2003 19:06:35 -0400
On Tue, Jun 03, 2003 at 08:20:38AM +0900, Tomohiro KUBOTA wrote:
 
> For example,
> 
>     http://www.debian.or.jp/~kubota/mojibake/dialogw-ja-utf8-2.png

it would be nice to have a copy of the script that you used for the demo,
since I don't read Japanese, but can check for several types of problems
without knowing that much.

-- 
Thomas E. Dickey <dickey@invisible-island.net>
http://invisible-island.net
ftp://invisible-island.net



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to Tomohiro KUBOTA <debian@tmail.plala.or.jp>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #55 received at 195674@bugs.debian.org (full text, mbox):

From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
To: dickey@herndon4.his.com
Cc: sanvila@unex.es, 195674@bugs.debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sun, 15 Jun 2003 09:51:51 +0900 (JST)
Hi,

> >     http://www.debian.or.jp/~kubota/mojibake/dialogw-ja-utf8-2.png
> 
> it would be nice to have a copy of the script that you used for the demo,
> since I don't read Japanese, but can check for several types of problems
> without knowing that much.

It is shown in the window title.  To input the exact Japanese,
you can use:

    dialog --msgbox `printf "\xe3\x81\x82\xe3\x81\x84\xe3\x81\x86\xe3\x81\x88\xe3\x81\x8a"` 10 20

in UTF-8 locale (like "xterm -u8") or

    dialog --msgbox `printf "\xa4\xa2\xa4\xa4\xa4\xa6\xa4\xa8\xa4\xaa"` 10 20

in EUC-JP locale (like "kterm -km euc" or "mlterm -E EUC-JP").
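
As a quick cross-check, the two byte sequences can be decoded to confirm
that they encode the same five hiragana.  This is a sketch using Python's
standard codecs ("euc_jp" is Python's codec name, not a locale name):

```python
utf8_bytes = b"\xe3\x81\x82\xe3\x81\x84\xe3\x81\x86\xe3\x81\x88\xe3\x81\x8a"
eucjp_bytes = b"\xa4\xa2\xa4\xa4\xa4\xa6\xa4\xa8\xa4\xaa"

# Decode each byte sequence with the encoding it was written in.
from_utf8 = utf8_bytes.decode("utf-8")
from_eucjp = eucjp_bytes.decode("euc_jp")

print("same text:", from_utf8 == from_eucjp)
print("characters:", len(from_utf8))
print("UTF-8 bytes:", len(utf8_bytes), "EUC-JP bytes:", len(eucjp_bytes))
```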

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/





Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to Tomohiro KUBOTA <debian@tmail.plala.or.jp>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #60 received at 195674@bugs.debian.org (full text, mbox):

From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
To: dickey@herndon4.his.com
Cc: sanvila@unex.es, 195674@bugs.debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sun, 15 Jun 2003 13:52:40 +0900 (JST)
Hi,

Sorry, I found that my old screenshots are not valid because
I had failed to compile "dialog" with libncursesw5.

I prepared new screenshots using "dialog" command with the newest
libncursesw5 (20030614 patch).

http://www.debian.or.jp/~kubota/mojibake/dialogw-de-latin1-101.png
http://www.debian.or.jp/~kubota/mojibake/dialogw-de-utf8-101.png
http://www.debian.or.jp/~kubota/mojibake/dialogw-ja-eucjp-101.png
http://www.debian.or.jp/~kubota/mojibake/dialogw-ja-utf8-101.png
http://www.debian.or.jp/~kubota/mojibake/dialogw-ru-koi8r-101.png
http://www.debian.or.jp/~kubota/mojibake/dialogw-ru-utf8-101.png

For "de-latin1" and "de-utf8", I intended to display
&auml;&ouml;&uuml;&szlig; in SGML entity notation
(0xe4 0xf6 0xfc 0xdf in ISO-8859-1, 0xc3 0xa4 0xc3 0xb6
0xc3 0xbc 0xc3 0x9f in UTF-8).  "de-latin1" seems to work
well.  "de-utf8" shows that the input string is interpreted
as ISO-8859-1, while the output side works well.
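
The "de" observation can be modelled in a few lines of Python.  This is
only a sketch of the hypothesis (input bytes read as ISO-8859-1, output
converted correctly), not a description of dialog's actual code paths:

```python
intended = "äöüß"

latin1_bytes = intended.encode("latin-1")  # e4 f6 fc df
utf8_bytes = intended.encode("utf-8")      # c3 a4 c3 b6 c3 bc c3 9f

# Reading each input byte as one ISO-8859-1 character is harmless for
# Latin-1 input, but it splits every two-byte UTF-8 character in two.
read_latin1 = latin1_bytes.decode("latin-1")
read_utf8 = utf8_bytes.decode("latin-1")

print("latin1 input preserved:", read_latin1 == intended)
print("utf8 input preserved:", read_utf8 == intended)
print("characters intended:", len(intended), "characters shown:", len(read_utf8))
```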

For "ja-eucjp" and "ja-utf8", I intended to display the string
shown in the window titles of the screenshots (HIRAGANA LETTER A,
HIRAGANA LETTER I, HIRAGANA LETTER U, HIRAGANA LETTER E, HIRAGANA
LETTER O; 0xa4 0xa2 0xa4 0xa4 0xa4 0xa6 0xa4 0xa8 0xa4 0xaa in EUC-JP,
0xe3 0x81 0x82 0xe3 0x81 0x84 0xe3 0x81 0x86 0xe3 0x81 0x88 0xe3
0x81 0x8a in UTF-8).  Both of them are unreadable.  The screenshots
show that the input strings are interpreted as ISO-8859-1 and the
output works almost correctly (although the line folding of fullwidth
characters in "ja-eucjp" is not pretty).

For "ru-koi8r" and "ru-utf8", I intended to display the text
shown in
http://www.debian.or.jp/~kubota/mojibake/dialogw-ru-koi8r-101a.png .
The Russian part of the string is 0xeb 0xce 0xc9 0xc7 0xc1
in KOI8-R and 0xd0 0x9a 0xd0 0xbd 0xd0 0xb8 0xd0 0xb3 0xd0 0xb0
in UTF-8.  Again, these screenshots can be explained by assuming
that the input string is interpreted as ISO-8859-1 and the output
is properly processed.  That is, in "ru-koi8r", the input string is
interpreted as ISO-8859-1 characters, but none of those characters
can be displayed in a KOI8-R locale, which results in blanks.  In
"ru-utf8", the input string is interpreted as ISO-8859-1 characters
and properly converted into UTF-8 on output.
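
The same hypothesis can be checked against the Russian byte sequences.
In this sketch, Python's errors="replace" stands in for whatever the
terminal output path does with characters that the locale encoding
cannot represent; that substitution is an assumption of the model:

```python
koi8_bytes = bytes.fromhex("ebcec9c7c1")
utf8_bytes = bytes.fromhex("d09ad0bdd0b8d0b3d0b0")

# What the user meant: both sequences are the same Russian word.
intended = koi8_bytes.decode("koi8_r")
print("same word:", intended == utf8_bytes.decode("utf-8"))

# KOI8-R locale: bytes read as ISO-8859-1 give accented Latin letters,
# none of which exist in KOI8-R, so every one is replaced on output.
misread_koi8 = koi8_bytes.decode("latin-1")
print(misread_koi8.encode("koi8_r", errors="replace"))

# UTF-8 locale: every misread character is encodable, so the output is
# valid UTF-8 but shows ten Latin-1 characters instead of five Cyrillic.
misread_utf8 = utf8_bytes.decode("latin-1")
print(len(misread_utf8), len(misread_utf8.encode("utf-8")))
```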

From these screenshots, it is clear that the input string is (still)
interpreted as ISO-8859-1 regardless of locale.  However, the output
stream seems to be (automatically) converted from wchar_t to the locale
encoding.  Fullwidth characters seem to be treated as fullwidth,
because the text is folded near the correct position, not near twice
that position ("ja-eucjp").

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/





Noted your statement that Bug has been forwarded to Thomas E. Dickey <dickey@invisible-island.net>. Request was from Santiago Vila <sanvila@unex.es> to control@bugs.debian.org. Full text and rfc822 format available.

Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@herndon4.his.com:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #67 received at 195674@bugs.debian.org (full text, mbox):

From: Thomas Dickey <dickey@herndon4.his.com>
To: Tomohiro KUBOTA <debian@tmail.plala.or.jp>, 195674@bugs.debian.org
Cc: dickey@herndon4.his.com, sanvila@unex.es
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sun, 15 Jun 2003 12:46:00 -0400
On Sun, Jun 15, 2003 at 09:51:51AM +0900, Tomohiro KUBOTA wrote:
> Hi,
> 
> > >     http://www.debian.or.jp/~kubota/mojibake/dialogw-ja-utf8-2.png
> > 
> > it would be nice to have a copy of the script that you used for the demo,
> > since I don't read Japanese, but can check for several types of problems
> > without knowing that much.
> 
> It is written at the window title.  To input the exact Japanese,
> you can use:
> 
>     dialog --msgbox `printf "\xe3\x81\x82\xe3\x81\x84\xe3\x81\x86\xe3\x81\x88\xe3\x81\x8a"` 10 20
> 
> in UTF-8 locale (like "xterm -u8") or

thanks - I think I can do something with this.  Running the printf alone, I
see 5 symbols.  Running dialog linked against my current ncursesw, I see only
some Latin-1 symbols.  So I have some more work to do.
 
>     dialog --msgbox `printf "\xa4\xa2\xa4\xa4\xa4\xa6\xa4\xa8\xa4\xaa"` 10 20
> 
> in EUC-JP locale (like "kterm -km euc" or "mlterm -E EUC-JP").

I will try this later (once I get the UTF-8 configuration working properly).

-- 
Thomas E. Dickey <dickey@invisible-island.net>
http://invisible-island.net
ftp://invisible-island.net



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@herndon4.his.com:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #72 received at 195674@bugs.debian.org (full text, mbox):

From: Thomas Dickey <dickey@herndon4.his.com>
To: Thomas Dickey <dickey@herndon4.his.com>
Cc: Tomohiro KUBOTA <debian@tmail.plala.or.jp>, 195674@bugs.debian.org, sanvila@unex.es
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sun, 15 Jun 2003 18:08:16 -0400
On Sun, Jun 15, 2003 at 12:46:00PM -0400, Thomas Dickey wrote:
> On Sun, Jun 15, 2003 at 09:51:51AM +0900, Tomohiro KUBOTA wrote:
> > Hi,
> > 
> > > >     http://www.debian.or.jp/~kubota/mojibake/dialogw-ja-utf8-2.png
> > > 
> > > it would be nice to have a copy of the script that you used for the demo,
> > > since I don't read Japanese, but can check for several types of problems
> > > without knowing that much.
> > 
> > It is written at the window title.  To input the exact Japanese,
> > you can use:
> > 
> >     dialog --msgbox `printf "\xe3\x81\x82\xe3\x81\x84\xe3\x81\x86\xe3\x81\x88\xe3\x81\x8a"` 10 20
> > 
> > in UTF-8 locale (like "xterm -u8") or

I've gotten partly through this, may have it completed for the next patch.
See

	ftp://invisible-island.net/temp/dialog.png

-- 
Thomas E. Dickey <dickey@invisible-island.net>
http://invisible-island.net
ftp://invisible-island.net



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to Tomohiro KUBOTA <debian@tmail.plala.or.jp>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #77 received at 195674@bugs.debian.org (full text, mbox):

From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
To: dickey@herndon4.his.com
Cc: 195674@bugs.debian.org, sanvila@unex.es
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Mon, 16 Jun 2003 08:25:35 +0900 (JST)
Hi,

From: Thomas Dickey <dickey@herndon4.his.com>
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sun, 15 Jun 2003 18:08:16 -0400

> I've gotten partly through this, may have it completed for the next patch.
> See
> 
> 	ftp://invisible-island.net/temp/dialog.png

Perfect!  I expect that the problem has also been fixed for EUC-JP.
(Well, you can also test it with other non-ISO-8859-1 encodings.  For
example, using a Euro symbol in ISO-8859-15 or UTF-8 would be a test.)

Another test is needed for line folding of fullwidth characters,
using a smaller window size or a longer Japanese string (for example,
just repeating the string).  When the window width is changed one column
at a time, the line-folding position changes only on every second step,
since a fullwidth character occupies two columns.
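
A small sketch of that folding behaviour.  The char_width and fold
helpers are hypothetical and use an approximate width rule; they are not
how dialog or ncurses wrap text:

```python
import unicodedata


def char_width(ch):
    """Columns for one character: 0 for combining marks, 2 for wide, else 1."""
    if unicodedata.combining(ch):
        return 0
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def fold(text, columns):
    """Break text into lines that each fit within the given column count."""
    lines, current, used = [], "", 0
    for ch in text:
        width = char_width(ch)
        if current and used + width > columns:
            lines.append(current)
            current, used = "", 0
        current += ch
        used += width
    if current:
        lines.append(current)
    return lines


text = "あいうえお" * 2  # ten fullwidth characters, twenty columns
for columns in range(4, 9):
    print(columns, fold(text, columns))
```

Widths 4 and 5 fold identically, as do 6 and 7, because an odd spare
column cannot hold half of a fullwidth character.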

A further test would be a combining character, though I don't know if it
is very important.  (Thai and Vietnamese users need it.)  For example,
on UTF-8 terminals,

   printf "aa\xcc\x81"

will output "aa" followed by U+0301.  U+0301 is COMBINING ACUTE ACCENT, which
adds an acute accent to the second "a".  Please use xterm or mlterm
in UTF-8 mode.
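
For reference, the three ways of counting this test string disagree.
This is a sketch using Python's unicodedata module; the zero-width
treatment of combining marks is the usual terminal convention, assumed
here rather than taken from any particular terminal:

```python
import unicodedata

data = b"aa\xcc\x81"
text = data.decode("utf-8")

# Count columns by skipping combining marks, which attach to the
# preceding base character instead of taking a cell of their own.
columns = sum(0 if unicodedata.combining(ch) else 1 for ch in text)

print("bytes:", len(data))
print("code points:", len(text))
print("columns:", columns)
print("last code point:", unicodedata.name(text[-1]))
```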

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/





Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@herndon4.his.com:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #82 received at 195674@bugs.debian.org (full text, mbox):

From: Thomas Dickey <dickey@herndon4.his.com>
To: Tomohiro KUBOTA <debian@tmail.plala.or.jp>, 195674@bugs.debian.org
Cc: dickey@herndon4.his.com, sanvila@unex.es
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sun, 15 Jun 2003 19:59:05 -0400
On Mon, Jun 16, 2003 at 08:25:35AM +0900, Tomohiro KUBOTA wrote:
> Hi,
> 
> From: Thomas Dickey <dickey@herndon4.his.com>
> Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
> Date: Sun, 15 Jun 2003 18:08:16 -0400
> 
> > I've gotten partly through this, may have it completed for the next patch.
> > See
> > 
> > 	ftp://invisible-island.net/temp/dialog.png
> 
> Perfect!  I expect that the problem has been fixed also for EUC-JP.

perhaps (I'll test that once I'm done polishing this change).  The difference
between this and the others was that dialog calls addch() one byte at
a time, bypassing addstr().

> (Well, you can also test it with other non-ISO-8859-1 encodings.  For
> example, usage of an Euro symbol in ISO-8859-15 or UTF-8 will be a test.)
> 
> Another test would be needed on line-folding of fullwidth characters,
> by using smaller window size or longer Japanese string (for example,
> just repeating the string).  When the window width is continuously
> changed, line-folding position changes once per twice, since a 
> fullwidth character occupies two columns.

The case I was working on yesterday does something like that.
 
> Further test would be a combining character, though I don't know if it
> is very important.  (Thai and Vietnamese people need it).  For example,
> on UTF-8 terminals,
> 
>    printf "aa\xcc\x81"
> 
> will display "aa" U+0301.  U+0301 is COMBINING ACUTE ACCENT which
> adds an acute accent on the second "a".  Please use xterm or mlterm
> in UTF-8.

ok (again, this "should" work, but depending on the combinations that
haven't been tested, it may not yet).

-- 
Thomas E. Dickey <dickey@invisible-island.net>
http://invisible-island.net
ftp://invisible-island.net



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to Tomohiro KUBOTA <debian@tmail.plala.or.jp>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #87 received at 195674@bugs.debian.org (full text, mbox):

From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
To: dickey@herndon4.his.com
Cc: 195674@bugs.debian.org, sanvila@unex.es
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Tue, 17 Jun 2003 07:37:09 +0900 (JST)
Hi,

From: Thomas Dickey <dickey@herndon4.his.com>
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sun, 15 Jun 2003 19:59:05 -0400

> perhaps (I'll test that once I'm done polishing this change).  The difference
> between this and the others was that dialog calls addch() one byte at
> a time, bypassing addstr().

Well, do you mean that you confirmed that addstr() is responsible,
or that dialog was using libncursesw in the wrong way?

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/





Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@herndon4.his.com:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #92 received at 195674@bugs.debian.org (full text, mbox):

From: Thomas Dickey <dickey@herndon4.his.com>
To: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
Cc: dickey@herndon4.his.com, 195674@bugs.debian.org, sanvila@unex.es
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Mon, 16 Jun 2003 19:10:41 -0400
On Tue, Jun 17, 2003 at 07:37:09AM +0900, Tomohiro KUBOTA wrote:
> Hi,
> 
> From: Thomas Dickey <dickey@herndon4.his.com>
> Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
> Date: Sun, 15 Jun 2003 19:59:05 -0400
> 
> > perhaps (I'll test that once I'm done polishing this change).  The difference
> > between this and the others was that dialog calls addch() one byte at
> > a time, bypassing addstr().
> 
> Well, do you mean that you confirmed that addstr() is responsible,
> or that dialog was using libncursesw in the wrong way?

neither.  dialog was using libncursesw correctly, but in an area that I had
overlooked.  It is calling waddch() to simulate the effect of addstr.  But
addstr in libncursesw must (unlike in libncurses) check for multibyte
sequences.  I had noticed that this was implied by the X/Open documentation,
written a to-do note about it last September, but had forgotten to implement
it.  There was a recent bug report which dealt with addstr, that I finished on
Saturday.  This is another side of it.  (Offhand I don't know of other areas
that might break, but expect that we'll run into them sometime).
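
A minimal sketch of what handling a multibyte character "one byte at a
time" involves, assuming UTF-8 input only.  The type utf8_accum and the
function utf8_feed() are hypothetical names for illustration and are not
part of ncurses; a locale-aware implementation would more likely rely on
mbrtowc() with an mbstate_t.

```c
/* Hypothetical helper type (not part of ncurses): collects the bytes of
 * one UTF-8 character as they arrive one at a time. */
struct utf8_accum {
    unsigned long value;   /* code point being assembled */
    int remaining;         /* continuation bytes still expected */
};

/* Feed one byte.  Returns 1 when a complete character is available in
 * acc->value, 0 when more bytes are needed, -1 on a malformed sequence.
 * Only the byte structure is checked; overlong forms are not rejected. */
int utf8_feed(struct utf8_accum *acc, unsigned char byte)
{
    if (acc->remaining == 0) {
        if (byte < 0x80) {                  /* ASCII: complete at once */
            acc->value = byte;
            return 1;
        } else if ((byte & 0xE0) == 0xC0) { /* lead byte of a 2-byte form */
            acc->value = byte & 0x1F;
            acc->remaining = 1;
        } else if ((byte & 0xF0) == 0xE0) { /* lead byte of a 3-byte form */
            acc->value = byte & 0x0F;
            acc->remaining = 2;
        } else if ((byte & 0xF8) == 0xF0) { /* lead byte of a 4-byte form */
            acc->value = byte & 0x07;
            acc->remaining = 3;
        } else {
            return -1;      /* stray continuation or invalid lead byte */
        }
        return 0;
    }
    if ((byte & 0xC0) != 0x80) {            /* expected a continuation byte */
        acc->remaining = 0;
        return -1;
    }
    acc->value = (acc->value << 6) | (byte & 0x3F);
    acc->remaining--;
    return acc->remaining == 0;
}
```

Feeding the three bytes 0xE3 0x81 0x82 yields 0, 0, 1, and the assembled
value is U+3042 (hiragana "a"); a caller that draws on every byte instead
of waiting for the 1 would put partial characters on the screen.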

Right now I'm looking at a different report which deals with the input side,
not sure if it is covered by one of these cases.  But to make it easier to
test, am starting to modify dialog so I can paste multibyte characters into the
inputbox widget - more useful than the test-cases that I've been doing within
ncurses.  (I also had a to-do item to implement this in lynx, but lynx is more
complicated to change than dialog).

If you're impatient, I put a copy of yesterday's incremental patch at
	ftp://invisible-island.net/temp/

-- 
Thomas E. Dickey <dickey@invisible-island.net>
http://invisible-island.net
ftp://invisible-island.net



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to Tomohiro KUBOTA <debian@tmail.plala.or.jp>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #97 received at 195674@bugs.debian.org (full text, mbox):

From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
To: dickey@herndon4.his.com
Cc: 195674@bugs.debian.org, sanvila@unex.es
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Wed, 18 Jun 2003 12:28:42 +0900 (JST)
Hi,

From: Thomas Dickey <dickey@herndon4.his.com>
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Mon, 16 Jun 2003 19:10:41 -0400

> If you're impatient, I put a copy of yesterday's incremental patch at
> 	ftp://invisible-island.net/temp/

I tested the patch 20030614a and found that it basically works
in UTF-8, EUC-JP, and KOI8-R locales.

Line folding doesn't work well: it is processed in a byte-oriented
(not character-oriented) way, and wcwidth() is not taken into account
either, though I think you already know about this problem.
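
A minimal sketch of column-aware folding, assuming UTF-8 input only.
The names utf8_next(), cell_width() and fold_bytes() are hypothetical,
and cell_width() is a deliberately simplified stand-in for wcwidth()
that covers only a few ranges; this is not the code used in dialog.

```c
#include <stddef.h>

/* Decode one UTF-8 character at s; store its code point in *cp and
 * return its length in bytes (0 at end of string, 1 for a malformed byte). */
static size_t utf8_next(const char *s, unsigned long *cp)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t len, i;

    if (p[0] == 0) return 0;
    if (p[0] < 0x80) { *cp = p[0]; return 1; }
    if ((p[0] & 0xE0) == 0xC0) { len = 2; *cp = p[0] & 0x1F; }
    else if ((p[0] & 0xF0) == 0xE0) { len = 3; *cp = p[0] & 0x0F; }
    else if ((p[0] & 0xF8) == 0xF0) { len = 4; *cp = p[0] & 0x07; }
    else { *cp = p[0]; return 1; }
    for (i = 1; i < len; i++) {
        /* also stops at a terminating NUL, so we never read past the end */
        if ((p[i] & 0xC0) != 0x80) { *cp = p[0]; return 1; }
        *cp = (*cp << 6) | (p[i] & 0x3F);
    }
    return len;
}

/* Simplified stand-in for wcwidth(): 0 for combining diacritical marks,
 * 2 for a few East Asian fullwidth ranges, 1 otherwise. */
static int cell_width(unsigned long cp)
{
    if (cp >= 0x0300 && cp <= 0x036F) return 0;
    if ((cp >= 0x1100 && cp <= 0x115F) ||
        (cp >= 0x2E80 && cp <= 0xA4CF) ||
        (cp >= 0xAC00 && cp <= 0xD7A3) ||
        (cp >= 0xF900 && cp <= 0xFAFF) ||
        (cp >= 0xFF00 && cp <= 0xFF60))
        return 2;
    return 1;
}

/* Return the number of bytes of s that fit in max_cols screen columns,
 * breaking only on character boundaries. */
size_t fold_bytes(const char *s, int max_cols)
{
    size_t off = 0, len;
    int cols = 0;
    unsigned long cp;

    while ((len = utf8_next(s + off, &cp)) > 0) {
        int w = cell_width(cp);
        if (cols + w > max_cols) break;   /* next character would not fit */
        cols += w;
        off += len;
    }
    return off;
}
```

For the string "ab" followed by two hiragana, fold_bytes() returns 2 for
a 3-column limit and 5 for a 4-column limit: the break position advances
only on every second column once fullwidth characters are reached.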

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/





Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@herndon4.his.com:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #102 received at 195674@bugs.debian.org (full text, mbox):

From: Thomas Dickey <dickey@herndon4.his.com>
To: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
Cc: dickey@herndon4.his.com, 195674@bugs.debian.org, sanvila@unex.es
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Wed, 18 Jun 2003 06:06:41 -0400
On Wed, Jun 18, 2003 at 12:28:42PM +0900, Tomohiro KUBOTA wrote:
> Hi,
> 
> From: Thomas Dickey <dickey@herndon4.his.com>
> Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
> Date: Mon, 16 Jun 2003 19:10:41 -0400
> 
> > If you're impatient, I put a copy of yesterday's incremental patch at
> > 	ftp://invisible-island.net/temp/
> 
> I tested the patch 20030614a and found that it basically works
> in UTF-8, EUC-JP, and KOI8-R locales.
> 
> Line folding doesn't work well: it is processed in a byte-oriented
> (not character-oriented) way, and wcwidth() is not taken into account
> either, though I think you already know about this problem.

I guess so.  The code is supposed to be breaking on a character boundary,
though it's possible it is not working properly.  (I'm working on several
things concurrently, but will probably have the multibyte-input working
with dialog available this week).

-- 
Thomas E. Dickey <dickey@invisible-island.net>
http://invisible-island.net
ftp://invisible-island.net



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@herndon4.his.com:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #107 received at 195674@bugs.debian.org (full text, mbox):

From: Thomas Dickey <dickey@herndon4.his.com>
To: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
Cc: dickey@herndon4.his.com, sanvila@unex.es, 195674@bugs.debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sat, 21 Jun 2003 20:04:22 -0400
I'm not done, but tonight's patch for ncurses has all of the fixes that I know
that I'll need for dialog.  Much of today has gone into flushing out a couple
of small bugs from ncurses, and letting me work on rewriting the text-entry
logic used in dialog's inputbox.  (Currently that part only handles 8-bit
inputs, though it wasn't mentioned in the bug report).  I'll continue on dialog
until that part looks ok, and then revisit your comments about this.

-- 
Thomas E. Dickey <dickey@invisible-island.net>
http://invisible-island.net
ftp://invisible-island.net



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to Tomohiro KUBOTA <debian@tmail.plala.or.jp>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #112 received at 195674@bugs.debian.org (full text, mbox):

From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
To: dickey@herndon4.his.com
Cc: sanvila@unex.es, 195674@bugs.debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sat, 28 Jun 2003 13:52:26 +0900 (JST)
Hi,

From: Thomas Dickey <dickey@herndon4.his.com>
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sat, 21 Jun 2003 20:04:22 -0400

> I'm not done, but tonight's patch for ncurses has all of the fixes that I know
> that I'll need for dialog.  Much of today has gone into flushing out a couple
> of small bugs from ncurses, and letting me work on rewriting the text-entry
> logic used in dialog's inputbox.  (Currently that part only handles 8-bit
> inputs, though it wasn't mentioned in the bug report).  I'll continue on dialog
> until that part looks ok, and then revisit your comments about this.

I don't know whether this helps you, but I found a possible design flaw
in ncursesw.

At first, I found that addch() works with multibyte characters while
addstr() doesn't.  Strictly speaking, addstr() doesn't work for
characters which are equivalent to Unicode characters above U+0100.
For example, Latin-1 characters in UTF-8 are 2 bytes each but work well,
while Cyrillic characters in KOI8-R are 1 byte each but don't work.

I checked the incoming value (CharOf(ch)) for waddch_literal() in
ncurses/base/lib_addch.c.  When I called addch() with a UTF-8 (or EUC-JP)
string like:

  char * string = "some string including Japanese in UTF-8 (or EUC-JP)";
  int j, len;
  len = strlen(string);
  for (j=0; j<len; j++) addch(string[j]);

then the incoming value is the raw 8-bit value of the UTF-8 (or EUC-JP) string, while
when I called addstr() like:

  char * string = "some string including Japanese in UTF-8 (or EUC-JP)";
  addstr(string);

then the incoming value is the UCS-4 value of the UTF-8 (or EUC-JP) characters.
Note that my test platform is GNU Libc, whose wchar_t is UCS-4.

The value must be *either* a raw 8-bit value *or* a wchar_t.  It is impossible
for waddch_literal() to support *both* of them.

The incoming value of waddch_literal() is NCURSES_CH_T, which is cchar_t
in wide mode, and cchar_t.chars (CharOf(ch)) is wchar_t.  Thus I think
CharOf(ch) must be wchar_t (UCS-4 in GNU Libc).  On the other hand, it
works only when CharOf(ch) is a raw 8-bit value....

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/





Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to Tomohiro KUBOTA <debian@tmail.plala.or.jp>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #117 received at 195674@bugs.debian.org (full text, mbox):

From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
To: dickey@herndon4.his.com
Cc: sanvila@unex.es, 195674@bugs.debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sat, 28 Jun 2003 15:18:31 +0900 (JST)
Hi,

Additional information.

From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sat, 28 Jun 2003 13:52:26 +0900 (JST)

> At first, I found that addch() works with multibyte characters while
> addstr() doesn't work.

When I modified 

    if_WIDEC({
	if (Charable(ch)) {

into

    if_WIDEC({
	if (Charable(ch) || 1) {

around line 144 of ncurses/base/lib_addch.c, I was able to use
addstr() with Japanese (both in UTF-8 and EUC-JP).  Of course, in this
case, addch() doesn't work.

As you know, this part is from your 20030621 patch.  The problem is
that it treats a wchar_t value as if it were an ordinary char value (locale
encoding).

Since the encoding for the API of libncursesw (such as addstr()) must be
the locale encoding (not wchar_t), there must be a conversion from the locale
encoding to wchar_t at a certain point in libncursesw.  I think SetChar
is responsible for this.  I thought so because the first parameter of
SetChar() is NCURSES_CH_T, which includes wchar_t in wide mode.
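
A minimal sketch of such a conversion step, using only the standard
mbstowcs().  The wrapper name locale_to_wide() is hypothetical and does
not exist in ncurses; where the conversion really belongs inside
libncursesw is the open question of this message.

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: convert a string in the current locale's encoding
 * to wide characters.  Returns the number of wide characters written
 * (excluding the terminator), or (size_t)-1 on an invalid sequence
 * or if cap is 0. */
size_t locale_to_wide(const char *src, wchar_t *dst, size_t cap)
{
    size_t n;

    if (cap == 0) return (size_t)-1;
    n = mbstowcs(dst, src, cap - 1);    /* leave room for the terminator */
    if (n == (size_t)-1) return n;      /* invalid multibyte sequence */
    dst[n] = L'\0';     /* mbstowcs does not terminate when the buffer fills */
    return n;
}
```

The result depends on the LC_CTYPE locale, so a program has to call
setlocale(LC_ALL, "") before anything other than ASCII converts as
expected.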

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/





Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@herndon4.his.com:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #122 received at 195674@bugs.debian.org (full text, mbox):

From: Thomas Dickey <dickey@herndon4.his.com>
To: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
Cc: dickey@herndon4.his.com, sanvila@unex.es, 195674@bugs.debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sat, 28 Jun 2003 06:45:56 -0400
On Sat, Jun 28, 2003 at 01:52:26PM +0900, Tomohiro KUBOTA wrote:
> I don't know whether this helps you, but I found a possible design flaw
> in ncursesw.
> 
> At first, I found that addch() works with multibyte characters while
> addstr() doesn't.  Strictly speaking, addstr() doesn't work for
> characters which are equivalent to Unicode characters above U+0100.
> For example, Latin-1 characters in UTF-8 are 2 bytes each but work well,
> while Cyrillic characters in KOI8-R are 1 byte each but don't work.
...
> The value must be *either* a raw 8-bit value *or* a wchar_t.  It is impossible
> for waddch_literal() to support *both* of them.

yes - it isn't obvious (I should add some comments on that one, explaining
what the data are at that point).
 
> The incoming value of waddch_literal() is NCURSES_CH_T, which is cchar_t
> in wide mode, and cchar_t.chars (CharOf(ch)) is wchar_t.  Thus I think
> CharOf(ch) must be wchar_t (UCS-4 in GNU Libc).  On the other hand, it
> works only when CharOf(ch) is a raw 8-bit value....

That sounds right.  addch() has to handle multibyte characters, one byte at
a time, while addstr handles more than one.  In either case, the data are
interpreted according to the locale.  Since addstr() calls waddch_literal(),
it has to reorganize the information.  I thought I had it working consistently,
but will check to see what you're telling me.

This is different from the problem in dialog - without rewriting it to use
wchar_t's, I have to add some logic to translate to/from columns and index
into the edited string.  So (unless I find other problems in ncursesw),
I'd anticipated no other changes to ncursesw (except for bug reports ;-).

thanks.

-- 
Thomas E. Dickey <dickey@invisible-island.net>
http://invisible-island.net
ftp://invisible-island.net



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@herndon4.his.com:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #127 received at 195674@bugs.debian.org (full text, mbox):

From: Thomas Dickey <dickey@herndon4.his.com>
To: Tomohiro KUBOTA <debian@tmail.plala.or.jp>, 195674@bugs.debian.org
Cc: dickey@herndon4.his.com, sanvila@unex.es
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sat, 28 Jun 2003 06:48:59 -0400
On Sat, Jun 28, 2003 at 03:18:31PM +0900, Tomohiro KUBOTA wrote:
> Hi,
> 
> Additional information.
> 
> From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
> Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
> Date: Sat, 28 Jun 2003 13:52:26 +0900 (JST)
> 
> > At first, I found that addch() works with multibyte characters while
> > addstr() doesn't work.
> 
> When I modified 
> 
>     if_WIDEC({
> 	if (Charable(ch)) {
> 
> into
> 
>     if_WIDEC({
> 	if (Charable(ch) || 1) {
> 
> around line 144 of ncurses/base/lib_addch.c, I was able to use
> addstr() with Japanese (both in UTF-8 and EUC-JP).  Of course, in this
> case, addch() doesn't work.

since addch() is lower-level than addstr() - though they both call the
same functions - I think the fix will be in addstr().
 
> As you know, this part is from your 20030621 patch.  The problem is
> that it treats a wchar_t value as if it were an ordinary char value (locale
> encoding).
> 
> Since the encoding for the API of libncursesw (such as addstr()) must be
> the locale encoding (not wchar_t), there must be a conversion from the locale
> encoding to wchar_t at a certain point in libncursesw.  I think SetChar
> is responsible for this.  I thought so because the first parameter of
> SetChar() is NCURSES_CH_T, which includes wchar_t in wide mode.

-- 
Thomas E. Dickey <dickey@invisible-island.net>
http://invisible-island.net
ftp://invisible-island.net



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to Tomohiro KUBOTA <debian@tmail.plala.or.jp>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #132 received at 195674@bugs.debian.org (full text, mbox):

From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
To: dickey@herndon4.his.com
Cc: sanvila@unex.es, 195674@bugs.debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sat, 28 Jun 2003 20:19:56 +0900 (JST)
[Message part 1 (text/plain, inline)]
Hi,

From: Thomas Dickey <dickey@herndon4.his.com>
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sat, 28 Jun 2003 06:45:56 -0400

> That sounds right.  addch() has to handle multibyte characters, one byte at
> a time, while addstr handles more than one.
...

I wrote a test patch.  It is written only to prove that this idea works, and
should not be used as a real implementation.  My patch:

 - overrides the Charable(ch) test around line 144 of ncurses/base/lib_addch.c
   and
 - converts from the locale encoding to wchar_t at SetChar() in waddch().

With this, both addch() and addstr() work.  My patch is deliberately
incomplete, for the sake of simplicity and clarity, so that you can easily
see what is going on.  (My patch cannot be compiled in non-wide mode!)

I am also sending you a small test program.


> This is different from the problem in dialog - without rewriting it to use
> wchar_t's, I have to add some logic to translate to/from columns and index
> into the edited string.  So (unless I find other problems in ncursesw),
> I'd anticipated no other changes to ncursesw (except for bug reports ;-).

I found one more problem in dialog.  When a multibyte character is
used as the first character of the <tag> string of --menu, the multibyte
character is split between its first byte and its second byte.
The code is around line 67 in menubox.c.  Can you fix this without
using wchar_t or mblen()?

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/

[ncurses-5.3-20030621-kubota.patch.gz (application/octet-stream, attachment)]
[wtest.c (text/plain, inline)]
/* TEST OF NCURSES(W)                                       */
/* 2003-06-28 Tomohiro KUBOTA                               */
/* Compilation in Debian:                                   */
/*   cc wtest.c -I/usr/include/ncursesw -o wtest -lncursesw */

/* Some ncursesw headers declare addwstr() only when this is defined. */
#define _XOPEN_SOURCE_EXTENDED 1

#include <curses.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>	/* strstr(), strlen() */
#include <wchar.h>

struct localelist {
    const char *locale, *string;
} llist[] = {
     /* a b c alpha beta gamma hiragana-a hiragana-i hiragana-u x y z */
    {"UTF-8",
     "abc\xce\xb1\xce\xb2\xce\xb3\xe3\x81\x82\xe3\x81\x84\xe3\x81\x86xyz"},

     /* hiragana-{a i u e o} (first five hiragana's) in EUC-JP */
    {"ja_JP",
     "abc\xa4\xa2\xa4\xa4\xa4\xa6\xa4\xa8\xa4\xaaxyz"},

     /* cyrillic-{R u s s k i short-i} (means Russian) in KOI8-R */
    {"ru_RU.KOI8-R",
     "abc\xf2\xd5\xd3\xd3\xcb\xc9\xcaxyz"},

     /* sentinel entry; its string is the fallback for other locales */
    {NULL,
     "espa\xf1ol"}
};


int main(int argc, char **argv)
{
    const char *str, *loc;
    size_t j, l;
    wchar_t wstr[1000];

    loc = setlocale(LC_ALL, "");

    if (argc >= 2) str = argv[1];
    else
    {
	struct localelist *p;
	str = NULL;
	/* pick the sample whose locale name occurs in the current locale */
	for (p=llist; p->locale; p++)
	    if (loc && strstr(loc, p->locale)) {str = p->string; break;}
	/* no match: p now points at the sentinel entry */
	if (!str) str = p->string;
    }

    /* initialize screen */
    initscr();

    /* write to screen (1): whole string at once */
    move(1, 10);
    addstr(str);

    /* write to screen (2): one byte at a time, without sign extension */
    move(3, 10);
    l = strlen(str);
    for (j=0; j<l; j++) addch((unsigned char)str[j]);

    /* write to screen (3): converted to wide characters first */
    move(5, 10);
    if (mbstowcs(wstr, str, 999) == (size_t)-1) wstr[0] = 0;
    wstr[999] = 0;
    addwstr(wstr);

    /* update screen */
    refresh();

    getch();
    endwin();
    return 0;
}

Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@herndon4.his.com:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #137 received at 195674@bugs.debian.org (full text, mbox):

From: Thomas Dickey <dickey@herndon4.his.com>
To: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
Cc: dickey@herndon4.his.com, sanvila@unex.es, 195674@bugs.debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sat, 28 Jun 2003 07:48:43 -0400
On Sat, Jun 28, 2003 at 08:19:56PM +0900, Tomohiro KUBOTA wrote:
> Hi,
> 
> From: Thomas Dickey <dickey@herndon4.his.com>
> Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
> Date: Sat, 28 Jun 2003 06:45:56 -0400
> 
> > That sounds right.  addch() has to handle multibyte characters, one byte at
> > a time, while addstr handles more than one.
> ...
> 
> I wrote a test patch.  It is written only to prove that this idea works, and
> should not be used as a real implementation.  My patch:
> 
>  - overrides the Charable(ch) test around line 144 of ncurses/base/lib_addch.c
>    and
>  - converts from the locale encoding to wchar_t at SetChar() in waddch().
> 
> With this, both addch() and addstr() work.  My patch is deliberately
> incomplete, for the sake of simplicity and clarity, so that you can easily
> see what is going on.  (My patch cannot be compiled in non-wide mode!)
> 
> I am also sending you a small test program.

thanks (I'll start by testing this).

> > This is different from the problem in dialog - without rewriting it to use
> > wchar_t's, I have to add some logic to translate to/from columns and index
> > into the edited string.  So (unless I find other problems in ncursesw),
> > I'd anticipated no other changes to ncursesw (except for bug reports ;-).
> 
> I found one more problem in dialog.  When a multibyte character is
> used as the first character of the <tag> string of --menu, the multibyte
> character is split between its first byte and its second byte.
> The code is around line 67 in menubox.c.  Can you fix this without
> using wchar_t or mblen()?

Possibly - it sounds like the same sort of problem that I was working on
in inputstr.c - working around implicit assumptions that one byte is one
character.  There are several places in dialog (such as text-justification)
where this is assumed.
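
A minimal sketch of how the byte length of a string's first character
could be found, so that a highlight covers the whole character rather
than its first byte.  It uses the standard mblen(), which is one of the
options named in the question above; the function name
first_char_bytes() is hypothetical, and this is not necessarily how
dialog ended up handling it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: number of bytes occupied by the first character
 * of s in the current locale's encoding (0 for an empty string). */
size_t first_char_bytes(const char *s)
{
    int n;

    if (*s == '\0') return 0;
    mblen(NULL, 0);                 /* reset the conversion state */
    n = mblen(s, strlen(s));
    return n > 0 ? (size_t)n : 1;   /* treat an invalid byte as one character */
}
```

After setlocale(LC_ALL, "") in a UTF-8 locale, a tag that begins with a
hiragana character would give 3 here, whereas code that assumes one byte
per character would stop after the first byte.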
 
-- 
Thomas E. Dickey <dickey@invisible-island.net>
http://invisible-island.net
ftp://invisible-island.net



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to Tomohiro KUBOTA <debian@tmail.plala.or.jp>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #142 received at 195674@bugs.debian.org (full text, mbox):

From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
To: dickey@herndon4.his.com
Cc: 195674@bugs.debian.org, sanvila@unex.es
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Mon, 30 Jun 2003 08:32:09 +0900 (JST)
Hi,

From: Thomas Dickey <dickey@herndon4.his.com>
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sat, 28 Jun 2003 06:48:59 -0400

> since addch() is lower-level than addstr() - though they both call the
> same functions - I think the fix will be in addstr().

I checked the new incremental diff.  I tested the program which I sent you
in a UTF-8 locale.  addstr() and addch() worked well.  However, addwstr()
didn't work very well --- non-Latin1 characters are written in reverse.
I imagine there is still some confusion between char (or char|attributes
in high bits) and wchar_t.

wchar_t seems to be sometimes used to store non-wchar_t values such
as char or char|attribute.  (For example, SetChar().)  Though this in itself
isn't a bug (in the sense that it doesn't cause problematic behavior, much like
using size_t or time_t for a char), I think it is not a good idea, and
it makes maintenance/bugfixes/contributions/improvements/etc difficult.

Anyway, I have not found any cases where this bug affects dialog.

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/





Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@herndon4.his.com:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #147 received at 195674@bugs.debian.org (full text, mbox):

From: "Thomas E. Dickey" <dickey@herndon4.his.com>
To: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
Cc: dickey@herndon4.his.com, 195674@bugs.debian.org, sanvila@unex.es
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sun, 29 Jun 2003 19:52:54 -0400 (EDT)
On Mon, 30 Jun 2003, Tomohiro KUBOTA wrote:

> Hi,
>
> From: Thomas Dickey <dickey@herndon4.his.com>
> Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
> Date: Sat, 28 Jun 2003 06:48:59 -0400
>
> > since addch() is lower-level than addstr() - though they both call the
> > same functions - I think the fix will be in addstr().
>
> I checked the new incremental diff.  I tested the program which I sent you
> in a UTF-8 locale.  addstr() and addch() worked well.  However, addwstr()
> didn't work very well --- non-Latin1 characters are written in reverse.
> I imagine there is still some confusion between char (or char|attributes
> in high bits) and wchar_t.

I rewrote that last night, but don't have a good test case for it yet
(essentially had to modify that to work with the cleanup of
waddch_literal(), and put on my to-do list to make a test case since I was
short on time).

I noticed (this might be the same case) that the last line of your wtest.c
example had some of the characters highlighted. I noticed it mainly
because xterm did not display the characters - only some underlining,
while mlterm (using Debian stable) displayed the reverse, underline and
characters.

(But it was late - I've been putting together a new development machine,
which is complicated since I install several systems which do not like
to be on the same box).

> wchar_t seems to be sometimes used to store non-wchar_t values such
> as char or char|attribute.  (For example, SetChar().)  Though this in itself
> isn't a bug (in the sense that it doesn't cause problematic behavior, much like
> using size_t or time_t for a char), I think it is not a good idea, and
> it makes maintenance/bugfixes/contributions/improvements/etc difficult.

SetChar() doesn't do that - see line 597 of curses.priv.h

More likely (reading the code) a sign-extension in lib_addch.c line 411:
                if ((code = waddch(win, PUTC_buf[n])) == ERR) {
                                        ^^^(char)

waddch_literal is rather ugly since it uses a cchar_t in wide-character
configuration just to be able to use the same macros.  I should rewrite
it (but not late at night ;-)
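
The suspected sign-extension can be sketched outside ncurses.  In this
illustration signed char stands in for plain char on platforms where
char is signed, and unsigned long stands in for the wider type that
waddch() receives; both stand-ins and the function names are
assumptions, not ncurses code.

```c
/* Widening a signed char directly sign-extends bytes >= 0x80,
 * so the high bits of the result are all set. */
unsigned long widen_naive(signed char c)
{
    return (unsigned long)c;
}

/* Going through unsigned char first keeps the value in 0..255. */
unsigned long widen_safe(signed char c)
{
    return (unsigned long)(unsigned char)c;
}
```

For the byte 0xE3 (stored as -29 in a signed char), widen_safe() gives
0xE3, while widen_naive() gives a value whose upper bits are all ones;
in a type that also carries attribute bits, those extra bits could be
read as attributes.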

> Anyway, I have not found any cases where this bug affects dialog.

-- 
T.E.Dickey <dickey@herndon4.his.com>
http://invisible-island.net
ftp://invisible-island.net



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to Tomohiro KUBOTA <debian@tmail.plala.or.jp>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #152 received at 195674@bugs.debian.org (full text, mbox):

From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
To: sanvila@unex.es
Cc: 195674@bugs.debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Tue, 22 Jul 2003 11:07:19 +0900 (JST)
From: Santiago Vila <sanvila@unex.es>
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Sun, 1 Jun 2003 16:19:32 +0200 (CEST)

> On Sun, 1 Jun 2003, Thomas Dickey wrote:
> 
> > did you try linking dialog with libncursesw?  (none of your comments
> > indicate that you did).
> 
> The report refers to the Debian dialog package, which is not linked
> with libncursesw. Should it be?

A new upstream version of libncurses was recently installed into Debian,
which I hope fixes its bugs related to multibyte and fullwidth characters.

So please now link dialog with libncursesw, not libncurses.  Though
Thomas has not fixed all of dialog's bugs, some of the problems should
be fixed.

* Improved (I discussed with the upstream):
  - panel width is properly calculated when multibyte or fullwidth
    characters are used.
  - multibyte and fullwidth characters can be entered into --inputbox.
* Not yet fixed (problems which I am aware of):
  - line-folding in --msgbox is not handled properly.
  - tags in --menu cannot handle multibyte characters.

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/





Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@herndon4.his.com:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #157 received at 195674@bugs.debian.org (full text, mbox):

From: "Thomas E. Dickey" <dickey@herndon4.his.com>
To: Tomohiro KUBOTA <debian@tmail.plala.or.jp>, 195674@bugs.debian.org
Cc: sanvila@unex.es, Santiago Vila <sanvila@debian.org>
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Tue, 22 Jul 2003 06:25:44 -0400 (EDT)
On Tue, 22 Jul 2003, Tomohiro KUBOTA wrote:

> From: Santiago Vila <sanvila@unex.es>
> Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
> Date: Sun, 1 Jun 2003 16:19:32 +0200 (CEST)
>
> > On Sun, 1 Jun 2003, Thomas Dickey wrote:
> >
> > > did you try linking dialog with libncursesw?  (none of your comments
> > > indicate that you did).
> >
> > The report refers to the Debian dialog package, which is not linked
> > with libncursesw. Should it be?
>
> A new upstream version of libncurses was recently installed into Debian,
> which I hope fixes its bugs related to multibyte and fullwidth characters.
>
> So please now link dialog with libncursesw, not libncurses.  Though
> Thomas has not fixed all of dialog's bugs, some of the problems should
> be fixed.

Actually there should be two packages (dialog and dialog-wide).  That's
because ncursesw is not part of the base.

-- 
T.E.Dickey <dickey@herndon4.his.com>
http://invisible-island.net
ftp://invisible-island.net



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to Tomohiro KUBOTA <debian@tmail.plala.or.jp>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #162 received at 195674@bugs.debian.org (full text, mbox):

From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
To: dickey@herndon4.his.com
Cc: 195674@bugs.debian.org, sanvila@unex.es, sanvila@debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Tue, 22 Jul 2003 22:08:40 +0900 (JST)
Hi,

From: "Thomas E. Dickey" <dickey@herndon4.his.com>
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Tue, 22 Jul 2003 06:25:44 -0400 (EDT)

> Actually there should be two packages (dialog and dialog-wide).  That's
> because ncursesw is not part of the base.

The "dialog" package is Priority: optional, so there is no problem with it
depending (Depends:) on libncursesw5, which is also Priority: optional.

If there are to be two packages, they should be an internationalized "dialog"
for standard use (for equality between languages) and a "dialog-8bit" for
people who care about disk space or speed.

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/





Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@herndon4.his.com:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #167 received at 195674@bugs.debian.org (full text, mbox):

From: "Thomas E. Dickey" <dickey@herndon4.his.com>
To: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
Cc: dickey@herndon4.his.com, 195674@bugs.debian.org, sanvila@unex.es, sanvila@debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Tue, 22 Jul 2003 09:32:54 -0400 (EDT)
On Tue, 22 Jul 2003, Tomohiro KUBOTA wrote:

> Hi,
>
> From: "Thomas E. Dickey" <dickey@herndon4.his.com>
> Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
> Date: Tue, 22 Jul 2003 06:25:44 -0400 (EDT)
>
> > Actually there should be two packages (dialog and dialog-wide).  That's
> > because ncursesw is not part of the base.
>
> The "dialog" package is Priority:optional, so there is no problem if it
> Depends: on libncursesw5, which is also Priority:optional.
>
> If there are to be two packages, they should be an internationalized "dialog"
> for standard use (for equality between languages) and "dialog-8bit" for
> people who care about disk space or speed.

There's no simple answer.  A copy of dialog built with ncursesw could be
installed with the same executable name as with ncurses, and would handle
all of the scripts that the latter does.  (My intent is also to keep the
library interfaces the same between both configurations, though I have
made changes to allow this, e.g., adding the fkey parameter to several
functions).

-- 
T.E.Dickey <dickey@herndon4.his.com>
http://invisible-island.net
ftp://invisible-island.net



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to Tomohiro KUBOTA <debian@tmail.plala.or.jp>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #172 received at 195674@bugs.debian.org (full text, mbox):

From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
To: dickey@herndon4.his.com
Cc: 195674@bugs.debian.org, sanvila@unex.es, sanvila@debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Wed, 23 Jul 2003 08:15:07 +0900 (JST)
Hi,

From: "Thomas E. Dickey" <dickey@herndon4.his.com>
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Tue, 22 Jul 2003 09:32:54 -0400 (EDT)

> There's no simple answer.  A copy of dialog built with ncursesw could be
> installed with the same executable name as with ncurses, and would handle
> all of the scripts that the latter does.

Yes, it should.  Is it not true (at least so far)?


> (My intent is also to keep the
> library interfaces the same between both configurations, though I have
> made changes to allow this, e.g., adding the fkey parameter to several
> functions).

Well, do you mean compatibility between ncurses and ncursesw, or between
dialog linked with ncurses and dialog linked with ncursesw?  I think
both versions of dialog have the same user interface (while you said that
ncursesw is not exactly compatible with ncurses).  Thus, I think dialog
with ncursesw can substitute for dialog with ncurses.  Am I wrong?

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/





Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@herndon4.his.com:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #177 received at 195674@bugs.debian.org (full text, mbox):

From: "Thomas E. Dickey" <dickey@herndon4.his.com>
To: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
Cc: dickey@herndon4.his.com, 195674@bugs.debian.org, sanvila@unex.es, sanvila@debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Tue, 22 Jul 2003 20:11:17 -0400 (EDT)
On Wed, 23 Jul 2003, Tomohiro KUBOTA wrote:

> Hi,
>
> From: "Thomas E. Dickey" <dickey@herndon4.his.com>
> Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
> Date: Tue, 22 Jul 2003 09:32:54 -0400 (EDT)
>
> > There's no simple answer.  A copy of dialog built with ncursesw could be
> > installed with the same executable name as with ncurses, and would handle
> > all of the scripts that the latter does.
>
> Yes, it should.  Is it not true (at least so far)?

yes - the point of my comment is that although the resulting program
would have different dependencies, it can be used interchangeably in
the current applications.  (the ncursesw-version of course provides
additional capabilities).
>
>
> > (My intent is also to keep the
> > library interfaces the same between both configurations, though I have
> > made changes to allow this, e.g., adding the fkey parameter to several
> > functions).
>
> Well, do you mean compatibility between ncurses and ncursesw, or between
> dialog linked with ncurses and dialog linked with ncursesw?  I think
> both versions of dialog have the same user interface (while you said that
> ncursesw is not exactly compatible with ncurses).  Thus, I think dialog
> with ncursesw can substitute for dialog with ncurses.  Am I wrong?

I was commenting about the dialog library, which is also installed
by a Debian package.  The interface of that library is (so far) the
same, whether it is compiled with ncurses or ncursesw.

-- 
T.E.Dickey <dickey@herndon4.his.com>
http://invisible-island.net
ftp://invisible-island.net



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to Tomohiro KUBOTA <debian@tmail.plala.or.jp>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #182 received at 195674@bugs.debian.org (full text, mbox):

From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
To: dickey@herndon4.his.com
Cc: 195674@bugs.debian.org, sanvila@unex.es, sanvila@debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Wed, 23 Jul 2003 09:31:03 +0900 (JST)
Hi,

From: "Thomas E. Dickey" <dickey@herndon4.his.com>
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Tue, 22 Jul 2003 20:11:17 -0400 (EDT)


> yes - the point of my comment is that although the resulting program
> would have different dependencies, it can be used interchangeably in
> the current applications.  (the ncursesw-version of course provides
> additional capabilities).

> I was commenting about the dialog library, which is also installed
> by a Debian package.  The interface of that library is (so far) the
> same, whether it is compiled with ncurses or ncursesw.

I see.  Then I think it is not a problem.  The only problem is the disk
space for libncursesw5, but this disk space will be needed anyway once more
packages with Depends: on libncurses5 are changed to Depends: on
libncursesw5 (otherwise they cannot support multibyte encodings and are
useless for East Asian users).  libncurses5 will become a legacy package for
software which needs strict compatibility or must be very fast.

Are there any more problems than disk space when dialog Depends: on
libncursesw5 instead of libncurses5?

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/
 





Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@herndon4.his.com:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #187 received at 195674@bugs.debian.org (full text, mbox):

From: "Thomas E. Dickey" <dickey@herndon4.his.com>
To: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
Cc: dickey@herndon4.his.com, 195674@bugs.debian.org, sanvila@unex.es, sanvila@debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Tue, 22 Jul 2003 20:55:36 -0400 (EDT)
On Wed, 23 Jul 2003, Tomohiro KUBOTA wrote:

> Are there any more problems than disk space when dialog Depends: on
> libncursesw5 instead of libncurses5?

speed, space and of course that ncursesw is newer than ncurses, noting
that it's the development version which we're using rather than the 5.3
release.

-- 
T.E.Dickey <dickey@herndon4.his.com>
http://invisible-island.net
ftp://invisible-island.net



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to Tomohiro KUBOTA <debian@tmail.plala.or.jp>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #192 received at 195674@bugs.debian.org (full text, mbox):

From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
To: dickey@herndon4.his.com
Cc: 195674@bugs.debian.org, sanvila@unex.es, sanvila@debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Wed, 23 Jul 2003 21:07:43 +0900 (JST)
Hi,

From: "Thomas E. Dickey" <dickey@herndon4.his.com>
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Tue, 22 Jul 2003 20:55:36 -0400 (EDT)


> that it's the development version which we're using rather than the 5.3
> release.

As I wrote, I don't think speed and space are a problem for the Debian
dialog package.  Also, the Debian dialog package itself is based on a very
new development version, so it doesn't matter if it uses (a development
version of) ncursesw.

Do the speed, space, and development-versionness really matter for Debian
dialog package?  (Even for special cases where they are really important,
"dialog-8bit" package can be an alternative.)

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/





Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@herndon4.his.com:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #197 received at 195674@bugs.debian.org (full text, mbox):

From: Thomas Dickey <dickey@herndon4.his.com>
To: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
Cc: dickey@herndon4.his.com, 195674@bugs.debian.org, sanvila@unex.es, sanvila@debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Wed, 23 Jul 2003 15:32:35 -0400
On Wed, Jul 23, 2003 at 09:07:43PM +0900, Tomohiro KUBOTA wrote:

> Do the speed, space, and development-versionness really matter for Debian
> dialog package?  (Even for special cases where they are really important,
> "dialog-8bit" package can be an alternative.)

That's really up to the package maintainer.

-- 
Thomas E. Dickey <dickey@invisible-island.net>
http://invisible-island.net
ftp://invisible-island.net



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to Tomohiro KUBOTA <debian@tmail.plala.or.jp>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #202 received at 195674@bugs.debian.org (full text, mbox):

From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
To: dickey@herndon4.his.com
Cc: 195674@bugs.debian.org, sanvila@unex.es, sanvila@debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Thu, 24 Jul 2003 08:47:58 +0900 (JST)
Hi,

From: Thomas Dickey <dickey@herndon4.his.com>
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Wed, 23 Jul 2003 15:32:35 -0400

> > Do the speed, space, and development-versionness really matter for Debian
> > dialog package?  (Even for special cases where they are really important,
> > "dialog-8bit" package can be an alternative.)
> 
> That's really up to the package maintainer.

Sure, and this is why I sent the original mail [1] only to the package
maintainer and Debian BTS.

  [1] 20030722.110719.57973636.debian@tmail.plala.or.jp
      or http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=195674&msg=90

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/





Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to Santiago Vila <sanvila@unex.es>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #207 received at 195674@bugs.debian.org (full text, mbox):

From: Santiago Vila <sanvila@unex.es>
To: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
Cc: 195674@bugs.debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Tue, 19 Aug 2003 18:12:08 +0200 (CEST)
Hi.

I've just uploaded dialog_20030818-1, which is now linked against
ncursesw as you requested.

I have intentionally left this bug open because I believe it's not
completely fixed yet (there may be glitches here and there).



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to Tomohiro KUBOTA <debian@tmail.plala.or.jp>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #212 received at 195674@bugs.debian.org (full text, mbox):

From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
To: sanvila@unex.es
Cc: 195674@bugs.debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Wed, 20 Aug 2003 09:55:52 +0900 (JST)
Hi,

> I've just uploaded dialog_20030818-1, which is now linked against
> ncursesw as you requested.

Thank you.

> I have intentionally left this bug open because I believe it's not
> completely fixed yet (there may be glitches here and there).

A good decision.  This version is not yet bug-free enough to be
suitable as debconf's frontend with East Asian translations.
(For example, translated items for the selection box are not shown.)

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/





Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@herndon4.his.com:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #217 received at 195674@bugs.debian.org (full text, mbox):

From: "Thomas E. Dickey" <dickey@herndon4.his.com>
To: Tomohiro KUBOTA <debian@tmail.plala.or.jp>, 195674@bugs.debian.org
Cc: debian-bugs-dist@lists.debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Tue, 19 Aug 2003 21:08:52 -0400 (EDT)
On Wed, 20 Aug 2003, Tomohiro KUBOTA wrote:

> Hi,
>
> > I've just uploaded dialog_20030818-1, which is now linked against
> > ncursesw as you requested.
>
> Thank you.
>
> > I have intentionally left this bug open because I believe it's not
> > completely fixed yet (there may be glitches here and there).
>
> A good decision.  This version is not yet bug-free enough to be
> suitable as debconf's frontend with East Asian translations.
> (For example, translated items for the selection box are not shown.)

Which one is "selection box"?  (I modified checkbox.c, menubox.c,
did not see anything obvious to change in fselect.c, etc.).

-- 
T.E.Dickey <dickey@herndon4.his.com>
http://invisible-island.net
ftp://invisible-island.net



Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to Tomohiro KUBOTA <debian@tmail.plala.or.jp>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #222 received at 195674@bugs.debian.org (full text, mbox):

From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
To: dickey@herndon4.his.com
Cc: 195674@bugs.debian.org, debian-bugs-dist@lists.debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Wed, 20 Aug 2003 15:02:26 +0900 (JST)
Hi,

From: "Thomas E. Dickey" <dickey@herndon4.his.com>
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Tue, 19 Aug 2003 21:08:52 -0400 (EDT)

> Which one is "selection box"?  (I modified checkbox.c, menubox.c,
> did not see anything obvious to change in fselect.c, etc.).

Well, sorry, I tested again and found that the major problems which
I reported (multibyte/fullwidth tag in --menu, line wrapping with
fullwidth characters) are fixed!

Thank you very much!

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/





Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@herndon4.his.com:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #227 received at 195674@bugs.debian.org (full text, mbox):

From: "Thomas E. Dickey" <dickey@herndon4.his.com>
To: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
Cc: 195674@bugs.debian.org
Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
Date: Wed, 20 Aug 2003 05:58:33 -0400 (EDT)
On Wed, 20 Aug 2003, Tomohiro KUBOTA wrote:

> Hi,
>
> From: "Thomas E. Dickey" <dickey@herndon4.his.com>
> Subject: Re: Bug#195674: dialog: doesn't support multibyte characters
> Date: Tue, 19 Aug 2003 21:08:52 -0400 (EDT)
>
> > Which one is "selection box"?  (I modified checkbox.c, menubox.c,
> > did not see anything obvious to change in fselect.c, etc.).
>
> Well, sorry, I tested again and found that major problems which
> I reported (multibyte/fullwidth tag in --menu, line wrapping with
> fullwidth characters) are fixed!
>
> Thank you very much!

no problem.  I expect there are minor problems (as well as parts of the
code that I have not considered).

-- 
T.E.Dickey <dickey@herndon4.his.com>
http://invisible-island.net
ftp://invisible-island.net



Tags added: upstream Request was from Santiago Vila <sanvila@unex.es> to control@bugs.debian.org. Full text and rfc822 format available.

Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#195674; Package dialog. Full text and rfc822 format available.

Acknowledgement sent to dickey@his.com (Thomas Dickey):
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>. Full text and rfc822 format available.

Message #234 received at 195674@bugs.debian.org (full text, mbox):

From: dickey@his.com (Thomas Dickey)
To: 195674@bugs.debian.org
Cc: dickey@his.com (Thomas Dickey)
Subject: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=195674
Date: Wed, 10 Sep 2003 19:30:19 -0400
>                       Debian Bug report logs - #195674
>                 dialog: doesn't support multibyte characters

I fixed the last part of this in tonight's snapshot (dialog-0.9b-20030910).
-- 
Thomas E. Dickey <dickey@invisible-island.net>
http://invisible-island.net
ftp://invisible-island.net



Reply sent to Santiago Vila <sanvila@debian.org>:
You have taken responsibility. Full text and rfc822 format available.

Notification sent to Tomohiro KUBOTA <debian@tmail.plala.or.jp>:
Bug acknowledged by developer. Full text and rfc822 format available.

Message #239 received at 195674-close@bugs.debian.org (full text, mbox):

From: Santiago Vila <sanvila@debian.org>
To: 195674-close@bugs.debian.org
Subject: Bug#195674: fixed in dialog 0.9b-20030910-1
Date: Thu, 11 Sep 2003 05:47:08 -0400
Source: dialog
Source-Version: 0.9b-20030910-1

We believe that the bug you reported is fixed in the latest version of
dialog, which is due to be installed in the Debian FTP archive:

dialog_0.9b-20030910-1.diff.gz
  to pool/main/d/dialog/dialog_0.9b-20030910-1.diff.gz
dialog_0.9b-20030910-1.dsc
  to pool/main/d/dialog/dialog_0.9b-20030910-1.dsc
dialog_0.9b-20030910-1_i386.deb
  to pool/main/d/dialog/dialog_0.9b-20030910-1_i386.deb
dialog_0.9b-20030910.orig.tar.gz
  to pool/main/d/dialog/dialog_0.9b-20030910.orig.tar.gz



A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to 195674@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Santiago Vila <sanvila@debian.org> (supplier of updated dialog package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@debian.org)


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Format: 1.7
Date: Thu, 11 Sep 2003 11:31:14 +0200
Source: dialog
Binary: dialog
Architecture: source i386
Version: 0.9b-20030910-1
Distribution: unstable
Urgency: low
Maintainer: Santiago Vila <sanvila@debian.org>
Changed-By: Santiago Vila <sanvila@debian.org>
Description: 
 dialog     - Displays user-friendly dialog boxes from shell scripts
Closes: 195674 209336
Changes: 
 dialog (0.9b-20030910-1) unstable; urgency=low
 .
   * New upstream release.
   * Support for multibyte characters should now be complete (Closes: #195674).
   * Fixed "RENAMED" result from inputmenu widget (Closes: #209336).
Files: 
 fba7286d5d0f6147b6f7bb3e7dca812b 596 misc optional dialog_0.9b-20030910-1.dsc
 b2cbe9eaaa8e355e8c14b958c4089e4e 235404 misc optional dialog_0.9b-20030910.orig.tar.gz
 a0c060221d605a042f3462264683cbb7 6535 misc optional dialog_0.9b-20030910-1.diff.gz
 1f97d8c951ef2a864ba1b00199aca5ba 142588 misc optional dialog_0.9b-20030910-1_i386.deb

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/YEH0d9Uuvj7yPNYRAnzsAJ95XigZd5aP6lBZtX9An4PeLYD7tACglC5a
WRC7RfZN5xCbXtfZWiykIRI=
=deNP
-----END PGP SIGNATURE-----






Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Wed Apr 23 14:30:09 2014; Machine Name: buxtehude.debian.org
