Debian Bug report logs - #769748
gpm cannot distinguish between Russian and English "a" if a font is terminus unicode bold with console-cyrillic

Package: gpm; Maintainer for gpm is Axel Beckert <abe@debian.org>; Source for gpm is src:gpm.

Reported by: Askar Safin <safinaskar@gmail.com>

Date: Sun, 16 Nov 2014 04:09:02 UTC

Severity: normal

Tags: l10n, upstream

Found in version gpm/1.20.4-6


Report forwarded to debian-bugs-dist@lists.debian.org, Peter Samuelson <peter@p12n.org>:
Bug#769748; Package gpm. (Sun, 16 Nov 2014 04:09:07 GMT)


Acknowledgement sent to Askar Safin <safinaskar@mail.ru>:
New Bug report received and forwarded. Copy sent to Peter Samuelson <peter@p12n.org>. (Sun, 16 Nov 2014 04:09:07 GMT)


Message #5 received at submit@bugs.debian.org:

From: Askar Safin <safinaskar@mail.ru>
To: submit@bugs.debian.org
Cc: gpm@lists.linux.it, zinoviev@debian.org
Subject: gpm cannot distinguish between Russian and English "a" if a font is terminus unicode bold with console-cyrillic
Date: Sun, 16 Nov 2014 07:06:19 +0300
Package: gpm
Version: 1.20.4-6
Severity: normal
Tags: upstream l10n

Steps to reproduce:
* Install gpm and console-cyrillic
* Configure console-cyrillic and choose terminus unicode bold as a font
* Restart console-cyrillic
* Type Russian letters "а" and "о" (which look exactly same as English "a" and "o")
* Copy them to clipboard using gpm
* Paste

What I get:
* I get English "a" and "o"

What I expected to get:
* Russian "а" and "о"

This bug is really bad. In the past, I didn't notice the bug and happily copied-and-pasted my Russian text from one text file to another.
This introduced a lot of words with mixed Russian and English letters such as "cлoвo". But one day I noticed that two files look exactly same,
but "diff" says they are different. Only then did I finally understand that this bug exists and that my files are full of such words with mixed letters.
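
Files that may already contain such mixed words can be checked mechanically. The following is a minimal sketch, not part of the original report: it assumes that only the Latin and Cyrillic scripts matter and classifies each letter by its Unicode character name. The function names are hypothetical.

```python
import re
import unicodedata

WORD_RE = re.compile(r"\w+")

def scripts_in(word):
    """Return the set of scripts (LATIN, CYRILLIC) used by the letters of word."""
    found = set()
    for ch in word:
        if not ch.isalpha():
            continue
        # Character names start with the script, e.g. "CYRILLIC SMALL LETTER A".
        name = unicodedata.name(ch, "")
        for script in ("LATIN", "CYRILLIC"):
            if name.startswith(script):
                found.add(script)
    return found

def mixed_script_words(text):
    """Return the words of text that mix Latin and Cyrillic letters."""
    return [w for w in WORD_RE.findall(text) if len(scripts_in(w)) > 1]
```

Running mixed_script_words() over the contents of each affected file lists the words that need to be retyped.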

This bug doesn't reproduce if I use unicyr font.

I also CC'd gpm@lists.linux.it (gpm mailing list) and zinoviev@debian.org (console-cyrillic maintainer)

-- System Information:
Debian Release: 7.7
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.2.0-4-amd64 (SMP w/8 CPU cores)
Locale: LANG=ru_RU.UTF-8, LC_CTYPE=ru_RU.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages gpm depends on:
ii  debconf [debconf-2.0]  1.5.49
ii  dpkg                   1.16.15
ii  install-info           4.13a.dfsg.1-10
ii  libc6                  2.18-4
ii  libgpm2                1.20.4-6
ii  lsb-base               4.1+Debian8+deb7u1
ii  ucf                    3.0025+nmu3

gpm recommends no packages.

gpm suggests no packages.

-- debconf information:
  gpm/responsiveness:
  gpm/repeat_type: none
  gpm/append:
  gpm/restart: false
  gpm/sample_rate:
  gpm/device: /dev/input/mice
  gpm/type: exps2


==
Askar Safin
http://vk.com/safinaskar
Kazan, Russia

Changed Bug submitter to 'Askar Safin <safinaskar@gmail.com>' from 'Askar Safin <safinaskar@mail.ru>'. Request was from Askar Safin <safinaskar@gmail.com> to control@bugs.debian.org. (Tue, 02 Jan 2024 04:00:05 GMT)


Information forwarded to debian-bugs-dist@lists.debian.org, Axel Beckert <abe@debian.org>:
Bug#769748; Package gpm. (Wed, 17 Jan 2024 19:42:02 GMT)


Message #10 received at 769748@bugs.debian.org:

From: Jakub Wilk <jwilk@jwilk.net>
To: Askar Safin <safinaskar@gmail.com>, 769748@bugs.debian.org
Cc: console-cyrillic@packages.debian.org
Subject: Re: Bug#769748: gpm cannot distinguish between Russian and English "a" if a font is terminus unicode bold with console-cyrillic
Date: Wed, 17 Jan 2024 20:39:17 +0100
* Askar Safin <safinaskar@mail.ru>, 2014-11-16 07:06:
>Steps to reproduce:
>* Install gpm and console-cyrillic
>* Configure console-cyrillic and choose terminus unicode bold as a font
>* Restart console-cyrillic
>* Type Russian letters "а" and "о" (which look exactly same as English "a" and "o")
>* Copy them to clipboard using gpm
>* Paste
>
>What I get:
>* I get English "a" and "o"
>
>What I expected to get:
>* Russian "а" and "о"

This is not a gpm bug. Gpm is not deeply involved in the copying and 
pasting operation: it just tells the kernel what part of the screen to 
copy and when to paste.
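
As a rough illustration of how little a selection daemon has to say to the kernel, the sketch below builds the arguments for the TIOCLINUX ioctl described in ioctl_console(2). The numeric constants are taken from the Linux headers as an assumption about common architectures, the helper names are hypothetical, and this is not gpm's actual code.

```python
import struct

# Assumed values from <asm-generic/ioctls.h> and <linux/tiocl.h>.
TIOCLINUX = 0x541C
TIOCL_SETSEL = 2      # set the selection
TIOCL_PASTESEL = 3    # paste the current selection
TIOCL_SELCHAR = 0     # select character by character

def setsel_payload(xs, ys, xe, ye, mode=TIOCL_SELCHAR):
    """Subcode byte followed by five unsigned shorts (struct tiocl_selection).

    Coordinates are 1-based screen columns and rows; "=" disables padding.
    """
    return struct.pack("=BHHHHH", TIOCL_SETSEL, xs, ys, xe, ye, mode)

def pastesel_payload():
    """Paste takes only the subcode byte."""
    return struct.pack("=B", TIOCL_PASTESEL)

# On a real console, with sufficient privileges, the payloads would be
# passed as: fcntl.ioctl(console_fd, TIOCLINUX, setsel_payload(1, 1, 10, 1))
```

The payload carries only screen coordinates; which characters end up in the selection is decided entirely by the kernel.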

What you're seeing is caused by limitations of the Linux kernel and a font 
design choice:

1. The kernel limits the number of glyphs in the font to 256 (or 512 if 
you're OK with a reduced number of available colors).

2. To work around this limit, and to fit as many characters as possible 
into a single font, the Terminus fonts reuse the same glyphs for 
characters that look the same (or almost the same), such as Latin "a" 
and Cyrillic "а".

3. The kernel didn't track which Unicode _characters_ appeared on 
screen, only which _glyphs_ did. (So the distinction between Latin "a" 
and Cyrillic "а" got lost.)
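
The three points above can be sketched as a toy model. The font map and glyph numbers are invented for illustration, and the kernel's real reverse translation differs in detail; the sketch only shows why a screen that stores glyph indices cannot give back the original characters.

```python
# Hypothetical font map: Unicode character -> glyph index.
# Look-alike letters deliberately share one glyph, as in point 2.
FONT_MAP = {
    "a": 0, "\u0430": 0,   # Latin a and Cyrillic a (U+0430)
    "o": 1, "\u043e": 1,   # Latin o and Cyrillic o (U+043E)
    "\u043b": 2,           # Cyrillic el (U+043B), no Latin look-alike
}

def build_reverse_map(font_map):
    """Map each glyph back to one character; the first one listed wins."""
    reverse = {}
    for ch, glyph in font_map.items():
        reverse.setdefault(glyph, ch)
    return reverse

def screen_round_trip(text, font_map=FONT_MAP):
    """Store text as glyph indices (point 3), then read it back."""
    glyphs = [font_map[ch] for ch in text]
    reverse = build_reverse_map(font_map)
    return "".join(reverse[g] for g in glyphs)
```

In this model the Cyrillic letters U+0430 and U+043E come back as Latin "a" and "o", while U+043B survives because its glyph is not shared.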

Fortunately, the last point has been partially fixed since v4.19:
if you read from a /dev/vcsuN device, the kernel will start to keep 
track of Unicode characters on screen, and they should be properly 
copy-pastable.
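
A minimal sketch of reading such a device follows. It assumes, per vcs(4), that /dev/vcsuN holds one native-endian 32-bit Unicode code point per character cell without attributes, and that the first two bytes of /dev/vcsaN give the number of lines and columns. The function names are hypothetical, and reading these devices usually requires root or membership in a privileged group.

```python
import struct

def decode_vcsu(raw, columns):
    """Split a raw /dev/vcsuN image into text lines of the given width."""
    count = len(raw) // 4
    code_points = struct.unpack(f"={count}I", raw[:count * 4])
    chars = [chr(cp) for cp in code_points]
    return ["".join(chars[i:i + columns]).rstrip()
            for i in range(0, count, columns)]

def read_screen(n=1):
    """Read virtual console n as Unicode text (needs kernel v4.19 or later)."""
    with open(f"/dev/vcsa{n}", "rb") as f:
        _lines, columns = f.read(2)   # header bytes: lines, columns
    with open(f"/dev/vcsu{n}", "rb") as f:
        raw = f.read()
    return decode_vcsu(raw, columns)
```

Calling read_screen(1) on a sufficiently new kernel should return the screen lines with Cyrillic letters intact, at least for text written after the device was first read.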

See the following kernel commits:
https://git.kernel.org/linus/9bfdc2611d417be453c3deb7a7ef2ffc718febfa
https://git.kernel.org/linus/d21b0be246bf3bbf569e6e239f56abb529c7154e

-- 
Jakub Wilk




Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Sun Oct 26 16:59:25 2025; Machine Name: berlioz
