Package: gettext; Maintainer for gettext is Santiago Vila <sanvila@debian.org>; Source for gettext is src:gettext (PTS, buildd, popcon).
Reported by: Neil Williams <codehelp@debian.org>
Date: Wed, 27 Feb 2008 18:24:02 UTC
Severity: wishlist
Found in version gettext/0.17-2
Done: Santiago Vila <sanvila@unex.es>
Bug is archived. No further changes may be made.
Report forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#468209; Package gettext.
Acknowledgement sent to Neil Williams <codehelp@debian.org>:
New Bug report received and forwarded. Copy sent to Santiago Vila <sanvila@debian.org>.
Message #5 received at submit@bugs.debian.org (full text, mbox, reply):
Package: gettext
Version: 0.17-2
Severity: wishlist
msgfmt does support --endianness {big|little} from the source in
gettext-tools/src/msgfmt.c
neil@dwarf:po$ file messages.mo
messages.mo: GNU message catalog (little endian), revision 0, 14 messages
neil@dwarf:po$ msgfmt --endianness big cs.po
neil@dwarf:po$ file messages.mo
messages.mo: GNU message catalog (big endian), revision 0, 14 messages
neil@dwarf:po$
Please document this in the manpage and ask upstream if it can also be
output in the --help output.
(endianness is important when crossbuilding packages containing PO
files.)
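
For readers who want to check a catalog without the `file` utility: a .mo file begins with the magic number 0x950412de, written in the file's own byte order, so the first four bytes are enough to tell the two variants apart. The following is a minimal Python sketch using only the standard library. The function name `mo_byte_order` is made up for this illustration and is not part of gettext.

```python
import struct

# Magic number that opens every GNU .mo catalog, stored in the
# byte order of the file itself.
MO_MAGIC = 0x950412DE


def mo_byte_order(header: bytes) -> str:
    """Return 'little' or 'big' for the first bytes of a .mo file."""
    if len(header) < 4:
        raise ValueError("need at least 4 bytes")
    if struct.unpack("<I", header[:4])[0] == MO_MAGIC:
        return "little"
    if struct.unpack(">I", header[:4])[0] == MO_MAGIC:
        return "big"
    raise ValueError("not a GNU message catalog")
```

It could be called as `mo_byte_order(open("messages.mo", "rb").read(4))` on the file from the session above.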
-- System Information:
Debian Release: lenny/sid
APT prefers unstable
APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.24-1-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Versions of packages gettext depends on:
ii gettext-base 0.17-2 GNU Internationalization utilities
ii libc6 2.7-8 GNU C Library: Shared libraries
ii libgomp1 4.3-20080219-1 GCC OpenMP (GOMP) support library
Versions of packages gettext recommends:
ii lynx 2.8.6-2 Text-mode WWW Browser
ii wget 1.10.2-3 retrieves files from the web
-- no debconf information
Reply sent to Santiago Vila <sanvila@unex.es>:
You have taken responsibility.
Notification sent to Neil Williams <codehelp@debian.org>:
Bug acknowledged by developer.
Message #10 received at 468209-done@bugs.debian.org (full text, mbox, reply):
On Wed, 27 Feb 2008, Neil Williams wrote:
> Package: gettext
> Version: 0.17-2
> Severity: wishlist
>
> msgfmt does support --endianness {big|little} from the source in
> gettext-tools/src/msgfmt.c
>
> neil@dwarf:po$ file messages.mo
> messages.mo: GNU message catalog (little endian), revision 0, 14
> messages
> neil@dwarf:po$ msgfmt --endianness big cs.po
> neil@dwarf:po$ file messages.mo
> messages.mo: GNU message catalog (big endian), revision 0, 14 messages
> neil@dwarf:po$
>
> Please document this in the manpage and ask upstream if it can also be
> output in the --help output.
>
> (endianness is important when crossbuilding packages containing PO
> files.)
No, it is not. Binary .mo files as used by libc and gettext are always
little endian, regardless of the machine architecture. Otherwise it
would be impossible for us to have "Architecture: all" packages
like util-linux-locales containing just binary .mo files.
Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#468209; Package gettext.
Acknowledgement sent to Neil Williams <codehelp@debian.org>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>.
Message #15 received at 468209@bugs.debian.org (full text, mbox, reply):
reopen 468209
quit
On Thu, 28 Feb 2008 12:10:23 +0100 (CET)
Santiago Vila <sanvila@unex.es> wrote:
> On Wed, 27 Feb 2008, Neil Williams wrote:
> > msgfmt does support --endianness {big|little} from the source in
> > gettext-tools/src/msgfmt.c
> >
> > neil@dwarf:po$ file messages.mo
> > messages.mo: GNU message catalog (little endian), revision 0, 14
> > messages
> > neil@dwarf:po$ msgfmt --endianness big cs.po
> > neil@dwarf:po$ file messages.mo
> > messages.mo: GNU message catalog (big endian), revision 0, 14 messages
> > neil@dwarf:po$
> >
> > Please document this in the manpage and ask upstream if it can also be
> > output in the --help output.
> >
> > (endianness is important when crossbuilding packages containing PO
> > files.)
>
> No, it is not. Binary .mo files as used by libc and gettext are always
> little endian, regardless of the machine architecture.
Not necessarily. That is just how gettext currently works but it is not
how gettext *must* work, hence the --endianness option in the source
code.
> Otherwise it
> would be impossible for us to have "Architecture: all" packages
> like util-linux-locales containing just binary .mo files.
>
Actually, endianness *IS* important because we should not be *having*
Architecture:all packages that contain .mo files - that was
demonstrated during my talk on TDebs at Fosdem.
On big endian systems, the CPU wastes time converting the endianness at
loadtime which is important for embedded devices.
All packages containing .mo files should be Arch:any and this is
something I will be fixing during the course of TDeb development in
Debian. All the other questions that arise from this (increased package
numbers, extra builds, repository implications, userspace controls and
cache sizes) have all got solutions that are currently working in
Emdebian and which are due to be applied to Debian (probably after
Lenny).
TDebs might also need to drop the hash table in the .mo, again
discussed at Fosdem, but I'm currently working on whether that is
necessary and whether it has positive or negative consequences on the
use of .mo files on embedded devices.
I started working on TDebs for Emdebian thinking exactly the same way,
that .mo files were immune to other problems of endianness etc. (The
slides at Fosdem claimed that TDeb packages would be Arch:all until
the question and answer section of the talk). They are not Arch:all. It
is just that msgfmt defaults to little unless --endianness is specified,
irrespective of the build machine architecture.
In some ways, this is a bug but documenting the --endianness option
allows others to not make the same mistake again.
.mo files *are* architecture dependent and should be handled as such.
Just because 'it happens to work' right now does not mean it is the
correct way to handle .mo files.
Source packages that put .mo files into an Arch:all binary are buggy -
implementing that fix will involve lots of work in Debian to handle the
increase in package numbers but leave that to me - I'll sort out the
mass bug filing(s) when other TDeb support is implemented elsewhere in
Debian after Lenny. (In fact, these bugs will simply disappear anyway
because the implementation of TDebs means that *no* other package in
Debian would contain any .mo files, all .mo files would only exist in
TDeb packages which will be Architecture:any.)
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Bug reopened, originator not changed.
Request was from Neil Williams <codehelp@debian.org>
to control@bugs.debian.org.
(Fri, 29 Feb 2008 07:36:08 GMT) (full text, mbox, link).
Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#468209; Package gettext.
Acknowledgement sent to Santiago Vila <sanvila@unex.es>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>.
Message #22 received at 468209@bugs.debian.org (full text, mbox, reply):
On Fri, 29 Feb 2008, Neil Williams wrote:
> .mo files *are* architecture dependent and should be handled as such.
> Just because 'it happens to work' right now does not mean it is the
> correct way to handle .mo files.
I'm curious: Do you plan to do the same with PCM .wav files?
(They are "always" little-endian, like .mo files).
I can agree that "it works" is not always a good reason to do things
in a certain way, but what you are proposing is a change in something
which is a de facto standard, for very little benefit (saving some cpu
cycles). When the cost of something is very high and the benefit is
very small, the natural thing to do is to keep things as they are.
Anyway, I can forward your suggestion upstream if you insist, but I don't
plan to deviate from upstream gettext if the authors reject your suggestion.
Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#468209; Package gettext.
Acknowledgement sent to Neil Williams <codehelp@debian.org>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>.
Message #27 received at 468209@bugs.debian.org (full text, mbox, reply):
On Fri, 29 Feb 2008 11:51:19 +0100 (CET)
Santiago Vila <sanvila@unex.es> wrote:
> On Fri, 29 Feb 2008, Neil Williams wrote:
>
> > .mo files *are* architecture dependent and should be handled as such.
> > Just because 'it happens to work' right now does not mean it is the
> > correct way to handle .mo files.
>
> I'm curious: Do you plan to do the same with PCM .wav files?
Not for Emdebian - we are unlikely to be handling .wav files due to
storage constraints.
However, if there are any other situations where Arch:all files have
endian problems, I will be pursuing those to provide an Arch:any
mechanism if there is a role for such files on embedded devices. .wav
is simply too large for embedded, .gsm is more likely or .ogg.
So the answer to your question, really, is YES. I intend to seek
correct endianness support for any binary format in Debian that does
not already implement it and for which there is a logical reason to
need that format on an embedded device where endianness conversions are
a significant issue.
Music isn't that much of a problem - a user having to wait a few
hundred clock cycles more to hear a music track is not a big issue.
What does matter is if a user has to wait several dozen clock cycles
*every time any application loads* merely to get the .mo content into
the correct presentation. This is a *MAJOR* usability issue. It will
make the entire OS appear slow in any translated locale.
If Debian wants to be able to support embedded and low resource
machines, Debian has to accept that there will be changes needed to
enable such support.
> (They are "always" little-endian, like .mo files).
>
> I can agree that "it works" is not always a good reason to do things
> in a certain way, but what you are proposing is a change in something
> which is a de facto standard, for very little benefit (saving some cpu
> cycles).
Please remember this is for embedded devices. It works now only because
Debian really isn't the Universal OS - lots of parts of Debian are
simply wrong for embedded usage which is why it is taking so long for
Emdebian to make progress.
Nobody cares about a few dozen clock cycles on a dual core GHz amd64 -
but it becomes a quite noticeable delay on an iPAQ. The delay is
repeated every single time any application is started outside the C
locale.
All I'm asking for here is *documentation* that this is how it needs to
be done for certain situations so that others do not make the same
mistake that both you and I have done - assuming that the current
Debian method of Arch:all for .mo is acceptable. It is not, sadly.
> When the cost of something is very high and the benefit is
> very small, the natural thing to do is to keep things as they are.
The cost of this change is irrelevant because the cost of implementing
TDebs is already high and this change does not make that any higher.
Whether TDebs are Arch:all or Arch:any makes no difference to the
amount of work required to implement TDebs in Debian. It merely adds a
tiny amount of work for the buildd network.
The benefits of TDebs (installation sizes, separate translator uploads,
faster translation updates etc.) far, far outweigh the temporary work
of getting them implemented in Debian. It is ludicrous that Debian
insists on installing over 250MB of *unused and unusable* .mo files
in a default GNOME installation when any one locale needs just 9MB or
less. (Check the size of your /usr/share/locale/ directory and compare
that with the collected size of the few languages that you actually
speak. Granted that will be a few more than me but I doubt anyone can
speak/read all of the 90+ languages installed by default in Debian.
I'd be surprised if anyone would require more than half a dozen.)
> Anyway, I can forward your suggestion upstream if you insist,
No, I don't see that this needs to be forwarded upstream, this is an
issue within Debian - primarily within the manpage as far as this bug
report is concerned.
All I want is for the manpage of msgfmt to explain that --endianness
{big|little} *is* supported, *why* it is supported and why it is
important for certain situations.
> but I don't
> plan to deviate from upstream gettext if the authors reject your suggestion.
The upstream authors already *explicitly support* endianness because the
option is part of the source code for msgfmt!
(gettext-tools/src/msgfmt.c)
I think upstream may have a better grasp of the issue than you may
imagine because other non-Debian embedded developments can use
--endianness.
Whether or not anything in gettext-Debian changes, I will implement
TDebs using --endianness in calls to msgfmt. All I'm asking is that the
reasons that I have set out for this are clearly explained in the
manpage in Debian so that other developers do not waste time believing
that the current method used for powerful desktop machines is in any
way appropriate for low resource units.
The option exists, it works and the manpage should document it - just
as with any application in Debian. The fact that Debian does not
currently use that option for desktop systems should be mentioned as
long as it is clear that this is not suitable for all devices and that
future support of translations in Debian is likely to use the
--endianness option to ensure compatibility with all supported devices,
not just desktop etc.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#468209; Package gettext.
Acknowledgement sent to Michelle Konzack <linux4michelle@freenet.de>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>.
Message #32 received at 468209@bugs.debian.org (full text, mbox, reply):
Hello Neil,
On 2008-02-29 07:27:47, Neil Williams wrote:
> Source packages that put .mo files into an Arch:all binary are buggy -
> implementing that fix will involve lots of work in Debian to handle the
> increase in package numbers but leave that to me - I'll sort out the
> mass bug filing(s) when other TDeb support is implemented elsewhere in
> Debian after Lenny. (In fact, these bugs will simply disappear anyway
> because the implementation of TDebs means that *no* other package in
> Debian would contain any .mo files, all .mo files would only exist in
> TDeb packages which will be Architecture:any.)
Does this mean that if I have written a program (Bash script + Xdialog)
which is currently "Arch:all", I would have to build it for all 12
architectures under your proposal?
This would mean that my 15 MByte source would produce 12 binary packages
of 15 MByte each, where ONLY the 40 kByte .mo file would be different...
I do not know whether this is really desirable, even if I understand
your problem quite well (I am currently fighting with ARM, MIPS and SH CPUs).
Thanks, Greetings and nice Day
Michelle Konzack
Systemadministrator
24V Electronic Engineer
Tamay Dogan Network
Debian GNU/Linux Consultant
--
Linux-User #280138 with the Linux Counter, http://counter.li.org/
##################### Debian GNU/Linux Consultant #####################
Michelle Konzack Apt. 917 ICQ #328449886
+49/177/9351947 50, rue de Soultz MSN LinuxMichi
+33/6/61925193 67100 Strasbourg/France IRC #Debian (irc.icq.com)
Information forwarded to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#468209; Package gettext.
Acknowledgement sent to Bruno Haible <bruno@clisp.org>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>.
Message #37 received at 468209@bugs.debian.org (full text, mbox, reply):
Hi,
While reading http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=468209
I have to agree with most of what Santiago said.
Neil Williams said:
> On big endian systems, the CPU wastes time converting the endianness at
> loadtime which is important for embedded devices.
Do you have any figures? In my opinion, the non-native endianness costs a
few CPU cycles at every non-cached gettext() invocation, but nothing at load
time. The thing that costs at load time is when the locale encoding and the
PO file encoding don't match: e.g. if the PO file was in ISO-8859-2 and the
.mo file is used in an UTF-8 locale.
> Please document this in the manpage and ask upstream if it can also be
> output in the --help output.
> ...
> No, I don't see that this needs to be forwarded upstream, this is an
> issue within Debian - primarily within the manpage as far as this bug
> report is concerned.
It would be wrong to document some option in Debian that is not documented
upstream. The upstream maintainer can at any moment withdraw this option,
change its syntax, make it dump core etc., without notice (no word about it
in the NEWS file).
Bruno
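
To make the per-lookup cost described above concrete, here is a rough Python sketch of a catalog lookup that works for either byte order. It is an illustration only and is not the libintl implementation: the function name `lookup` is hypothetical, the hash table is ignored, and the sketch assumes the original strings are sorted, as the .mo format specifies. The only place where byte order matters is the decoding of a handful of 32-bit integers (header fields, then one length/offset pair per probe); the string data itself is never converted.

```python
import struct

MO_MAGIC = 0x950412DE


def lookup(catalog: bytes, msgid: bytes):
    """Return the translation for msgid, or None if it is absent."""
    # Detect the byte order of the file from its magic number.
    if struct.unpack_from("<I", catalog, 0)[0] == MO_MAGIC:
        order = "<"
    elif struct.unpack_from(">I", catalog, 0)[0] == MO_MAGIC:
        order = ">"
    else:
        raise ValueError("not a GNU message catalog")

    # Header: revision, string count, offsets of the two string tables.
    _rev, count, orig_off, trans_off = struct.unpack_from(
        order + "4I", catalog, 4)

    # Binary search over the sorted table of original strings.
    lo, hi = 0, count
    while lo < hi:
        mid = (lo + hi) // 2
        # One (length, offset) pair is decoded per probe; this is
        # where a non-native catalog costs a byte swap.
        length, offset = struct.unpack_from(
            order + "2I", catalog, orig_off + 8 * mid)
        candidate = catalog[offset:offset + length]
        if candidate == msgid:
            tlen, toff = struct.unpack_from(
                order + "2I", catalog, trans_off + 8 * mid)
            return catalog[toff:toff + tlen]
        if candidate < msgid:
            lo = mid + 1
        else:
            hi = mid
    return None
```

Whether these extra integer swaps are measurable on a given device is exactly the question of figures raised above; the sketch only shows where they would occur.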
Reply sent
to Santiago Vila <sanvila@unex.es>:
You have taken responsibility.
(Sun, 16 May 2010 11:15:06 GMT) (full text, mbox, link).
Notification sent
to Neil Williams <codehelp@debian.org>:
Bug acknowledged by developer.
(Sun, 16 May 2010 11:15:06 GMT) (full text, mbox, link).
Message #42 received at 468209-done@bugs.debian.org (full text, mbox, reply):
On Sun, 3 Aug 2008, Bruno Haible wrote:
> Hi,
>
> While reading
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=468209
> I have to agree with most of what Santiago said.
>
> Neil Williams said:
> > On big endian systems, the CPU wastes time converting the endianness at
> > loadtime which is important for embedded devices.
>
> Do you have any figures? In my opinion, the non-native endianness costs a
> few CPU cycles at every non-cached gettext() invocation, but nothing at load
> time. The thing that costs at load time is when the locale encoding and the
> PO file encoding don't match: e.g. if the PO file was in ISO-8859-2 and the
> .mo file is used in an UTF-8 locale.
You have not explained why you have to convert the .mo file at loadtime,
as opposed to let gettext do its job and do it at every gettext()
invocation for each individual string that has to be shown.
If you still plan to convert the .mo file at loadtime for fun, it's up
to you, but I don't feel myself concerned about it as debian gettext
maintainer.
> > Please document this in the manpage and ask upstream if it can also be
> > output in the --help output.
> > ...
> > No, I don't see that this needs to be forwarded upstream, this is an
> > issue within Debian - primarily within the manpage as far as this bug
> > report is concerned.
>
> It would be wrong to document some option in Debian that is not documented
> upstream. The upstream maintainer can at any moment withdraw this option,
> change its syntax, make it dump core etc., without notice (no word about it
> in the NEWS file).
I would even say that the fact that something is not documented
probably means that it's an option that should not be used. In such
case, documenting it might make more harm than good.
By documenting such option, I would be giving some sort of "bless" (so
to speak) to Architecture dependent packages containing .mo files,
something that I consider a waste of time, as the standard for .mo
files is to be little endian regardless of the native architecture.
So no, I will not document an option which I feel nobody should use.
As I have never been a fan of the wontfix tag, I'm closing this.
If you still want to discuss about this, please show us any figures at
the very minimum, as Bruno suggested.
Thanks.
Bug archived.
Request was from Debbugs Internal Request <owner@bugs.debian.org>
to internal_control@bugs.debian.org.
(Mon, 14 Jun 2010 07:31:09 GMT) (full text, mbox, link).
Bug unarchived.
Request was from Jakub Wilk <jwilk@debian.org>
to control@bugs.debian.org.
(Thu, 03 Nov 2011 15:03:03 GMT) (full text, mbox, link).
Information forwarded
to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#468209; Package gettext.
(Thu, 03 Nov 2011 15:27:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Jakub Wilk <jwilk@debian.org>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>.
(Thu, 03 Nov 2011 15:27:03 GMT) (full text, mbox, link).
Message #51 received at 468209@bugs.debian.org (full text, mbox, reply):
* Santiago Vila <sanvila@unex.es>, 2008-02-28, 12:10:
>>msgfmt does support --endianness {big|little} from the source in
>>gettext-tools/src/msgfmt.c
>>
>>neil@dwarf:po$ file messages.mo
>>messages.mo: GNU message catalog (little endian), revision 0, 14
>>messages
>>neil@dwarf:po$ msgfmt --endianness big cs.po
>>neil@dwarf:po$ file messages.mo
>>messages.mo: GNU message catalog (big endian), revision 0, 14 messages
>>neil@dwarf:po$
>>
>>Please document this in the manpage and ask upstream if it can also be
>>output in the --help output.
>>
>>(endianness is important when crossbuilding packages containing PO
>>files.)
>
>No, it is not. Binary .mo files as used by libc and gettext are always
>little endian, regardless of the machine architecture.
Hmm, this doesn't seem to be the case (anymore?). As far as I can see,
msgfmt produces files with native endianness.
With the advent of multi-arch, such behavior has become a problem. If a
package is marked as "Multi-Arch: same" all the files (including *.mo)
have to be identical across all architectures.
Either the --endianness option should be documented (so that M-A:same
packages could use when needed), or msgfmt should produce little-endian
files even on big-endian architectures. Please tell if I should reopen
this bug, or rather file a new one requesting using little-endian
everywhere.
--
Jakub Wilk
Information forwarded
to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#468209; Package gettext.
(Thu, 03 Nov 2011 17:39:08 GMT) (full text, mbox, link).
Acknowledgement sent
to Santiago Vila <sanvila@unex.es>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>.
(Thu, 03 Nov 2011 17:39:08 GMT) (full text, mbox, link).
Message #56 received at 468209@bugs.debian.org (full text, mbox, reply):
On 03/11/11 16:24, Jakub Wilk wrote:
> * Santiago Vila <sanvila@unex.es>, 2008-02-28, 12:10:
>>> msgfmt does support --endianness {big|little} from the source in
>>> gettext-tools/src/msgfmt.c
>>>
>>> neil@dwarf:po$ file messages.mo
>>> messages.mo: GNU message catalog (little endian), revision 0, 14
>>> messages
>>> neil@dwarf:po$ msgfmt --endianness big cs.po
>>> neil@dwarf:po$ file messages.mo
>>> messages.mo: GNU message catalog (big endian), revision 0, 14 messages
>>> neil@dwarf:po$
>>>
>>> Please document this in the manpage and ask upstream if it can also
>>> be output in the --help output.
>>>
>>> (endianness is important when crossbuilding packages containing PO
>>> files.)
>>
>> No, it is not. Binary .mo files as used by libc and gettext are always
>> little endian, regardless of the machine architecture.
>
> Hmm, this doesn't seem to be the case (anymore?). As far as I can see,
> msgfmt produces files with native endianness.
I didn't know but yes, that seems to be the case now. I've just checked
by running "file *" on a locale directory in my old powerpc.
> With the advent of multi-arch, such behavior has become a problem. If a
> package is marked as "Multi-Arch: same" all the files (including *.mo)
> have to be identical across all architectures.
Hmm, why do they have to be identical?
Is it not enough that both types of systems (big and little endian) are
able to read and use both types of .mo files, as seems to be the case?
If .mo files are usable everywhere, regardless of their endianness, I
would say that the multi-arch requirement is not reasonable.
> Either the --endianness option should be documented (so that M-A:same
> packages could use when needed), or msgfmt should produce little-endian
> files even on big-endian architectures. Please tell if I should reopen
> this bug, or rather file a new one requesting using little-endian
> everywhere.
I would prefer a new bug because the rationale for considering it a
bug would be quite different. Previously the argument was about
performance, but figures for that were never shown.
However, I'm happy to discuss this in this old report first, at
least until I really understand the nature of the new bug.
Information forwarded
to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#468209; Package gettext.
(Thu, 10 Nov 2011 11:36:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Jakub Wilk <jwilk@debian.org>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>.
(Thu, 10 Nov 2011 11:36:08 GMT) (full text, mbox, link).
Message #61 received at 468209@bugs.debian.org (full text, mbox, reply):
* Santiago Vila <sanvila@unex.es>, 2011-11-03, 18:38:
>>With the advent of multi-arch, such behavior has become a problem. If
>>a package is marked as "Multi-Arch: same" all the files (including
>>*.mo) have to be identical across all architectures.
>
>Hmm, why do they have to be identical?
>
>Is it not enough that both types of systems (big and little endian) are
>able to read and use both types of .mo files, as seems to be the
>case?
>
>If .mo files are usable everywhere, regardless of their endianness, I
>would say that the multi-arch requirement is not reasonable.
"Multi-Arch: same" makes it possible for users to install a package for
more than one architecture at the same time. If files with same name are
not identical across architectures, package manager has to resolve the
conflict somehow, and it does it by simply aborting the installation,
e.g. like that:
| # apt-get install -qq libavahi-common-data:powerpc
| (Reading database ... 59644 files and directories currently installed.)
| Unpacking libavahi-common-data:powerpc (from .../libavahi-common-data_0.6.30-5_powerpc.deb) ...
| dpkg: error processing /var/cache/apt/archives/libavahi-common-data_0.6.30-5_powerpc.deb (--unpack):
| './usr/share/locale/he/LC_MESSAGES/avahi.mo' is different from the same file on the system
| configured to not write apport reports
| dpkg-deb: error: subprocess paste was killed by signal (Broken pipe)
| Errors were encountered while processing:
| /var/cache/apt/archives/libavahi-common-data_0.6.30-5_powerpc.deb
| E: Sub-process /usr/bin/dpkg returned an error code (1)
Does it make things clear?
Note that the problem would affect only tiny minority of packages:
"Multi-Arch: same" is useful mainly for shared libraries and they rarely
come with translations.
--
Jakub Wilk
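
The mismatch behind the dpkg error can be reproduced without dpkg or msgfmt. The Python sketch below is an illustration under stated assumptions, not a replacement for msgfmt: the helper `build_mo` is hypothetical and writes a minimal revision-0 catalog with no hash table and no header entry, restricted to ASCII messages, and the sample messages are invented. It writes the same messages in both byte orders; the two results differ byte for byte, which is what "Multi-Arch: same" cannot tolerate, while Python's `gettext.GNUTranslations` reads either one.

```python
import gettext
import io
import struct

MO_MAGIC = 0x950412DE


def build_mo(messages: dict, byteorder: str) -> bytes:
    """Build a minimal .mo catalog in the given byte order."""
    fmt = "<" if byteorder == "little" else ">"
    items = sorted(messages.items())      # originals must be sorted
    count = len(items)
    orig_table = 7 * 4                    # right after the header
    trans_table = orig_table + 8 * count
    data_start = trans_table + 8 * count

    blob = b""
    entries = []
    # Original strings first, then translations, each NUL-terminated.
    for text in [k for k, _ in items] + [v for _, v in items]:
        raw = text.encode("ascii")
        entries.append((len(raw), data_start + len(blob)))
        blob += raw + b"\0"

    # Header: magic, revision, count, table offsets, empty hash table.
    out = struct.pack(fmt + "7I", MO_MAGIC, 0, count,
                      orig_table, trans_table, 0, 0)
    for length, offset in entries:
        out += struct.pack(fmt + "2I", length, offset)
    return out + blob


messages = {"hello": "ahoj", "goodbye": "nashledanou"}
little = build_mo(messages, "little")
big = build_mo(messages, "big")

# Same content, same size, but not the same bytes on disk.
files_identical = little == big

# Both variants load and translate the same way.
from_little = gettext.GNUTranslations(io.BytesIO(little)).gettext("hello")
from_big = gettext.GNUTranslations(io.BytesIO(big)).gettext("hello")
```

Here `files_identical` is false even though `from_little` and `from_big` are equal, which matches the situation described above: the catalogs are interchangeable at run time but not for a byte-wise file comparison.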
Information forwarded
to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#468209; Package gettext.
(Thu, 10 Nov 2011 18:39:06 GMT) (full text, mbox, link).
Acknowledgement sent
to Santiago Vila <sanvila@unex.es>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>.
(Thu, 10 Nov 2011 18:39:07 GMT) (full text, mbox, link).
Message #66 received at 468209@bugs.debian.org (full text, mbox, reply):
On 10/11/11 12:34, Jakub Wilk wrote:
> * Santiago Vila <sanvila@unex.es>, 2011-11-03, 18:38:
>>> With the advent of multi-arch, such behavior has become a problem. If
>>> a package is marked as "Multi-Arch: same" all the files (including
>>> *.mo) have to be identical across all architectures.
>>
>> Hmm, why do they have to be identical?
>>
>> Is it not enough that both types of systems (big and little endian)
>> are able to read and use both types of .mo files, as seems to be
>> the case?
>>
>> If .mo files are usable everywhere, regardless of their endianness, I
>> would say that the multi-arch requirement is not reasonable.
>
> "Multi-Arch: same" makes it possible for users to install a package for
> more than one architecture at the same time. If files with same name are
> not identical across architectures, package manager has to resolve the
> conflict somehow, and it does it by simply aborting the installation,
> e.g. like that:
> | # apt-get install -qq libavahi-common-data:powerpc
> | (Reading database ... 59644 files and directories currently installed.)
> | Unpacking libavahi-common-data:powerpc (from
> .../libavahi-common-data_0.6.30-5_powerpc.deb) ...
> | dpkg: error processing
> /var/cache/apt/archives/libavahi-common-data_0.6.30-5_powerpc.deb
> (--unpack):
> | './usr/share/locale/he/LC_MESSAGES/avahi.mo' is different from the
> same file on the system
> | configured to not write apport reports
> | dpkg-deb: error: subprocess paste was killed by signal (Broken pipe)
> | Errors were encountered while processing:
> | /var/cache/apt/archives/libavahi-common-data_0.6.30-5_powerpc.deb
> | E: Sub-process /usr/bin/dpkg returned an error code (1)
>
> Does it make things clear?
Yes, I now see what the problem is, but I don't see that making every
.mo file to be always little endian again is the best solution. We could
also tell dpkg somehow that different files in /usr/share/locale are ok
in this case.
> Note that the problem would affect only tiny minority of packages:
> "Multi-Arch: same" is useful mainly for shared libraries and they rarely
> come with translations.
In such case, making those packages to depend on another "Arch: all"
package containing just the translations would solve the issue, would it
not?
(For the record, I happen to maintain a library containing translations,
and I have always seen it as an "anomaly", this would force me to do
what I feel is the "right thing").
Information forwarded
to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#468209; Package gettext.
(Sat, 19 Nov 2011 10:57:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Neil Williams <codehelp@debian.org>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>.
(Sat, 19 Nov 2011 10:57:11 GMT) (full text, mbox, link).
Message #71 received at 468209@bugs.debian.org (full text, mbox, reply):
On Fri, 18 Nov 2011 23:39:20 -0600 Peter Samuelson <peter@p12n.org> wrote: > > [Jakub Wilk] > > The most common reasons for cross-architecture differences appear to > > be (in random order): > > > - Compiling GNU message catalogs with gettext, which uses native > > endianness (see bug #468209). (Hmm, I did get v.carried away earlier in this bug report didn't I? Sorry, Santiago. I genuinely thought this was a lot more important at the time. All that cross-building was driving me scatty.) > Having read that bug log, it's not clear to me whether there's a > consensus about what to do about these. Neil thinks we need native > endian .mo (which is problematic for multiarch) Native isn't quite the right meaning - same endianness as the DEB_HOST_ARCH platform is what I intended to implement in Emdebian based on the --endianness option to msgfmt. (Native could mean DEB_BUILD_ARCH which is worse for cross-building than just picking one endianness and sticking with it.) >, others think we need > .mo to be Arch: all and "dont-care"-endian. Has any consensus emerged? After more work on exactly what's going on with endianness and .mo files, Emdebian switched to making our TDebs Arch:all and letting gettext deal with the endianness before the Squeeze release, by which time the cross-built version of Emdebian was already inoperable. I should have followed that up to the bug log - it was closed and archived at that point but I forgot to check it. Sorry. http://www.emdebian.org/emdebian/tdebs.html As long as the behaviour is always *consistent* across native builds and cross-builds, I would be happy with having all .mo files with the same endianness. By preference, little endian. Like Santiago (468209#66), I maintain a library which has translations. For that package I have put the .mo files into a -data package (qof-data) precisely because that allows the Architecture:any libqof2 to depend on the Arch:all qof-data, thereby solving the MultiArch problem. 

This is also compatible with the TDeb proposal for Arch:all TDebs which
contain .mo files (as well as the more difficult problem of the
translated debconf templates file) because packages like qof-data can
easily be processed as TDebs once those mechanisms are available.

> And is it worth splitting out a -l10n or -data package from a library
> just so the library itself can be M-A: same? (I suppose a side benefit
> is you can use Recommends and cut down a little on the size of your
> strict Dependency closure.)

Yes, it is worth having -l10n, -tdeb, -data packages, precisely to get
the library M-A: same. MultiArch wasn't a consideration when #468209
was opened or when the TDeb proposal was created in 2008.

There are other reasons to split out the .mo files:

1. Updates to translations should not require source NMU's.

2. Translation data should not be distributed in architecture-dependent
packages.

3. Translators should have a common interface for getting updates into
Debian (possibly with automated TDeb generation after i18n team review).

However, the TDeb proposal for Debian is stalled due to disagreement
with the dpkg maintainers over the implementation mechanism. I want to
see the arrival of a Multiarch-aware dpkg in unstable before raising
that discussion again.

In the meantime, I'm happy for all libraries packaging translations to
add -l10n, -tdeb, -data or -common Arch:all packages in order to meet
the higher priority of implementing MultiArch. Indirectly, this would
help the eventual implementation of TDebs.

If there's to be a release goal for that, I'm happy to help with it.

Santiago wrote:
> In such case, making those packages to depend on another "Arch: all"
> package containing just the translations would solve the issue, would it
> not?
>
> (For the record, I happen to maintain a library containing translations,
> and I have always seen it as an "anomaly", this would force me to do
> what I feel is the "right thing").

I agree with this completely and, as above, have fixed the anomaly that
way for one library which uses gettext translations.

There remains some disagreement:

Santiago wrote:
> Yes, I now see what the problem is, but I don't see that making every
> .mo file to be always little endian again is the best solution. We could
> also tell dpkg somehow that different files in /usr/share/locale are ok
> in this case.

Having at first put a lot of time into generating .mo files which have
matching endianness to the DEB_HOST_ARCH, I have changed my mind on
exactly how this should work. Emdebian has tried .mo files which differ
between architectures and it isn't worth the effort. Santiago was right
the first time, I was wrong: let gettext deal with the load at runtime
and don't fuss about the endianness of the file in /usr/share.
*However*, I think that this means that we *should* make every .mo file
to always be little endian.

I'm sorry if this seems like an about turn but what I originally wanted
from #468209 was merely the documentation of the option so that
Emdebian could use it outside of Debian builds to deterministically set
the endianness and do the tests. If that had meant changing stuff in
Debian, I was happy to do that. Things turned out differently.

I did not intend #468209 to cause a change in Debian behaviour by
implementing or not implementing a change in how gettext operated. From
reading the log, I don't see where any change of gettext behaviour
within Debian was mentioned as being implemented in gettext itself. It
was all about whether the --endianness option should be documented and
how it could be used *outside* Debian in the cross-built version of
Emdebian which itself is to be completely redesigned once MultiArch is
in place.

(Nothing here affects the binary-compatible "Grip" version of Emdebian
which I'm currently working on integrating into Debian and which
already uses Arch:all TDebs.)

--
Neil Williams
=============
http://www.linux.codehelp.co.uk/
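
For anyone wanting to check which byte order an existing catalog uses
without relying on file(1): a .mo file begins with the magic number
0x950412de written in the producer's byte order, so the first four bytes
are enough. The following is only a minimal sketch and not something
posted in this thread; mo_endianness is a hypothetical helper name, and
it assumes an od that accepts -An, -tx1 and -N.

```shell
#!/bin/sh
# Sketch: report the byte order of a GNU .mo catalog from its magic number.
# The magic value 0x950412de is stored in the writer's byte order, so the
# file starts with "de 12 04 95" (little endian) or "95 04 12 de" (big endian).
# mo_endianness is a hypothetical helper name, not part of gettext.
mo_endianness() {
    # Dump the first four bytes as hex pairs and strip the whitespace.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    case "$magic" in
        de120495) echo little ;;
        950412de) echo big ;;
        *)        echo unknown ;;
    esac
}

# Report on every file named on the command line (does nothing without arguments).
for f in "$@"; do
    printf '%s: %s\n' "$f" "$(mo_endianness "$f")"
done
```

Saved as a script and given a list of .mo files, this prints one line per
file, which makes it easy to spot catalogs that differ between two builds
of the same package.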
[Message part 2 (application/pgp-signature, inline)]
Information forwarded
to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#468209; Package gettext.
(Sat, 19 Nov 2011 15:39:03 GMT) (full text, mbox, link).
Acknowledgement sent
to johnandsara2@cox.net:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>.
(Sat, 19 Nov 2011 15:39:03 GMT) (full text, mbox, link).
Message #76 received at 468209@bugs.debian.org (full text, mbox, reply):
not that i'm multi-lingual (i use google to translate!)

don't we get all .mo avail. in packages already? (i hope)

# locate "*.mo" | wc
13341 13341 648305

... and if build/tar admins say "choose another method" why not try
asking them what?

can't i delete .mo locally if i'm bitwise desperate for disk space?

have fun! John Hendrickson

Neil Williams wrote:
> On Fri, 18 Nov 2011 23:39:20 -0600
> Peter Samuelson <peter@p12n.org> wrote:
>
>> [Jakub Wilk]
>>> The most common reasons for cross-architecture differences appear to
>>> be (in random order):
>>> - Compiling GNU message catalogs with gettext, which uses native
>>> endianness (see bug #468209).
>
> (Hmm, I did get v.carried away earlier in this bug report didn't I?
> Sorry, Santiago. I genuinely thought this was a lot more important at
> the time. All that cross-building was driving me scatty.)
>
>> Having read that bug log, it's not clear to me whether there's a
>> consensus about what to do about these. Neil thinks we need native
>> endian .mo (which is problematic for multiarch)
>
> Native isn't quite the right meaning - same endianness as the
> DEB_HOST_ARCH platform is what I intended to implement in Emdebian
> based on the --endianness option to msgfmt. (Native could mean
> DEB_BUILD_ARCH which is worse for cross-building than just picking one
> endianness and sticking with it.)
>
>> , others think we need
>> .mo to be Arch: all and "dont-care"-endian. Has any consensus emerged?
>
> After more work on exactly what's going on with endianness and .mo
> files, Emdebian switched to making our TDebs Arch:all and letting
> gettext deal with the endianness before the Squeeze release, by which
> time the cross-built version of Emdebian was already inoperable. I
> should have followed that up to the bug log - it was closed and
> archived at that point but I forgot to check it. Sorry.
Information forwarded
to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#468209; Package gettext.
(Sat, 26 Nov 2011 02:42:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Steve Langasek <vorlon@debian.org>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>.
(Sat, 26 Nov 2011 02:42:03 GMT) (full text, mbox, link).
Message #81 received at 468209@bugs.debian.org (full text, mbox, reply):
[Message part 1 (text/plain, inline)]
On Sat, Nov 19, 2011 at 10:55:53AM +0000, Neil Williams wrote:
> As long as the behaviour is always *consistent* across native builds and
> cross-builds, I would be happy with having all .mo files with the same
> endianness. By preference, little endian.

I agree; little-endian is both the most common endianness for current
hardware, and the endianness used by our lowest-end release-supported
hardware, so little-endian would make the most sense. But the cost of using
big-endian is probably not so high that it would be worth arguing about
either.

> > And is it worth splitting out a -l10n or -data package from a library
> > just so the library itself can be M-A: same? (I suppose a side benefit
> > is you can use Recommends and cut down a little on the size of your
> > strict Dependency closure.)

> Yes, it is worth having -l10n, -tdeb, -data packages, precisely to get
> the library M-A: same. MultiArch wasn't a consideration when #468209
> was opened or when the TDeb proposal was created in 2008.

No, having to split this data out into separate packages is a significant
cost for maintainers and on the archive and simply the wrong way to do it.
Automatic package splits for the likes of tdebs are fine, but we should not
be forced to split binary packages in the archive for data files such as .mo
files that could readily be made architecture-independent.

> Having at first put a lot of time into generating .mo files which have
> matching endianness to the DEB_HOST_ARCH, I have changed my mind on
> exactly how this should work. Emdebian has tried .mo files which differ
> between architectures and it isn't worth the effort. Santiago was right
> the first time, I was wrong: let gettext deal with the load at runtime
> and don't fuss about the endianness of the file in /usr/share.
> *However*, I think that this means that we *should* make every .mo file
> to always be little endian.

If the .mo files are always little-endian, then there's no need at all to
split the package.

--
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
Ubuntu Developer                                    http://www.debian.org/
slangasek@ubuntu.com                                     vorlon@debian.org
[signature.asc (application/pgp-signature, inline)]
Information forwarded
to debian-bugs-dist@lists.debian.org, Santiago Vila <sanvila@debian.org>:
Bug#468209; Package gettext.
(Sun, 27 Nov 2011 17:51:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Santiago Vila <sanvila@unex.es>:
Extra info received and forwarded to list. Copy sent to Santiago Vila <sanvila@debian.org>.
(Sun, 27 Nov 2011 17:51:03 GMT) (full text, mbox, link).
Message #86 received at 468209@bugs.debian.org (full text, mbox, reply):
On Fri, 25 Nov 2011, Steve Langasek wrote:

> No, having to split this data out into separate packages is a significant
> cost for maintainers and on the archive and simply the wrong way to do it.
> Automatic package splits for the likes of tdebs are fine, but we should not
> be forced to split binary packages in the archive for data files such as .mo
> files that could readily be made architecture-independent.

Ok, I'm not 100% convinced, but after reading this, I'm willing to ask
gettext authors that they make little-endian the default again, or at
least they tell me how we could achieve that (configure options,
environment variables, whatever).

Steve: Could you please report this as a *new* bug and summarize the
problems that native endianness in gettext create to multi-arch?
(I would prefer to keep this issue as a different one from the old bug).

A Subject like "msgfmt creates mo files in native endianness" would be fine.

Thanks.
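
Until the default changes, a package build can already pin the byte order
per invocation with the --endianness option shown in the original report.
What follows is a rough sketch of that idea, not a recommendation from
anyone in this thread: compile_catalogs is a hypothetical helper, and the
po/<lang>.po layout and the <lang>.mo output names are assumptions.

```shell
#!/bin/sh
# Sketch: compile every PO file in a directory with a fixed byte order.
# Assumes a <podir>/<lang>.po layout and writes <podir>/<lang>.mo next to
# each input; both the layout and the helper name are hypothetical.
# --endianness is the msgfmt option this bug report asks to have documented.
compile_catalogs() {
    podir=$1
    order=${2:-little}                  # default to little endian
    for po in "$podir"/*.po; do
        [ -e "$po" ] || continue        # glob did not match: nothing to do
        msgfmt --endianness "$order" -o "${po%.po}.mo" "$po" || return 1
    done
}
```

Calling `compile_catalogs po` would then produce little-endian catalogs
on any build architecture, and `compile_catalogs po big` the opposite,
so native builds and cross-builds give identical files.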
Bug archived.
Request was from Debbugs Internal Request <owner@bugs.debian.org>
to internal_control@bugs.debian.org.
(Mon, 26 Dec 2011 07:32:31 GMT) (full text, mbox, link).
Debbugs is free software and licensed under the terms of the GNU Public License version 2. The current version can be obtained from https://bugs.debian.org/debbugs-source/.
Copyright © 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson, 2005-2017 Don Armstrong, and many other contributors.