Debian Bug report logs - #959474
Issues with Chinese language (all variants) when building some pages in buster
Report forwarded
to debian-bugs-dist@lists.debian.org, debian-i18n@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#959474; Package www.debian.org.
(Sat, 02 May 2020 18:45:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Laura Arjona Reina <larjona@debian.org>:
New Bug report received and forwarded. Copy sent to debian-i18n@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>.
(Sat, 02 May 2020 18:45:03 GMT) (full text, mbox, link).
Message #5 received at submit@bugs.debian.org (full text, mbox, reply):
Package: www.debian.org
Severity: normal
User: www.debian.org@packages.debian.org
Usertags: scripts
X-Debbugs-CC: debian-l10n-chinese@lists.debian.org
X-Debbugs-CC: debian-i18n@lists.debian.org
Hi all,
TL;DR
There are some issues with some Chinese pages when they are built in a
buster machine.
We need to fix those issues (at least the "Malformed UTF-8 character
[...] at ../../bin/tocn.pl [...]" ones) so DSA can upgrade the
www-master machine to buster. See the summary of the log at the bottom
to know which files produce this error.
I have no idea how to fix the issues, so any help from the Chinese
team or web team mates is greatly appreciated.
Additional issues may arise (e.g. I have not yet tested the release-notes
or doc-manual). Any help with testing is welcome too; please create bug
reports for each different issue or update the existing ones. Thanks!
LONG VERSION
I've done a test build of the /english and /chinese subdirs in a buster
machine, and I have noticed some warnings/errors related to the Chinese
pages (some, not all of them).
It would be desirable to upgrade the www-master machine to buster as soon as
possible, so any help with this (from website or Chinese team members)
is much appreciated.
Below you can find an extract of the build log, including only the
files for which I got some error or warning message.
After the build, I have compared the problematic HTML files of a build
in stretch and a build in buster with a diff tool, to see if there were
significant changes in the html output due to these issues.
Here are my results:
* For the messages of the type ", [zh_TW]Invalid UTF8: " when building,
I could not see any difference between the output of a stretch build and
the output of a buster build.
I would say this is not a blocker for the buster upgrade of www-master.
* For the messages of the type "Malformed UTF-8 character [...] at
../../bin/tocn.pl [...]" I have seen important changes in the HTML diff,
I think the output in the stretch build is totally broken (fortunately,
there are not many files in that situation).
I would say this is a blocker for the buster upgrade of www-master, but
I would prefer somebody of the Chinese team to confirm (try to build
those files in a buster machine, and review the output).
Additional notes:
* I have only tested the wml build, not the rest of the cron scripts
that run on www-master. I will try to do it in the following days, but
if you already know of any that work well (e.g. release-notes,
doc-manuals...) just tell me so I can skip them.
* When I build files on my machines, something is wrong in my
environment and the .po files are not always integrated, so for
example the Chinese pages I build show the menus and footnote in
English. Therefore, if there is any issue with the encoding of the .po
files themselves, I guess I cannot detect it until I fix my particular
issue :/
* The local build that I make uses the SAMPLE_FILES that are needed in
some folders; so additional issues may arise when we use the actual
files that are generated at runtime in the often and lessoften cron jobs.
That's all for now, I think. Thanks for your patience reading and for
your help!
Kind regards,
--
Laura Arjona Reina
https://wiki.debian.org/LauraArjona
--- extract of the build log file
/chinese
Processing
donations.wml:
[zh_CN]Invalid UTF8:
ïŒç¹å»âæ·»å å°èŽç©èœŠâïŒç¶å宿å©äœè¿çšã
, [zh_TW]Invalid UTF8:
ïŒç¹å»âæ·»å å°èŽç©èœŠâïŒç¶å宿å©äœè¿çšã
, [zh_HK]Invalid UTF8:
ïŒç¹å»âæ·»å å°èŽç©èœŠâïŒç¶å宿å©äœè¿çšã
.
make[1]: Entering directory '/webwml/chinese/Bugs'
Processing Reporting.wml: [zh_CN]Invalid UTF8:
°äžæ¬¡ç€ºäŸäŒè¯çè¿çšã</li>
, [zh_TW]Invalid UTF8: °äžæ¬¡ç€ºäŸäŒè¯çè¿çšã</li>
, [zh_HK]Invalid UTF8: °äžæ¬¡ç€ºäŸäŒè¯çè¿çšã</li>
.
make[2]: Entering directory '/webwml/chinese/News/2000'
Processing 20000815.wml:
[zh_CN]Invalid UTF8: µ·å€æåçéŒååå©ïŒå
æ¬ïŒ
, [zh_TW]Invalid UTF8: µ·å€æåçéŒååå©ïŒå
æ¬ïŒ
, [zh_HK]Invalid UTF8: µ·å€æåçéŒååå©ïŒå
æ¬ïŒ
.
make[2]: Entering directory '/webwml/chinese/News/2009'
Processing 20090214.wml: [zh_CN]Invalid UTF8: Sun SPARC (sparc)ã
, [zh_TW]Invalid UTF8: Sun SPARC (sparc)ã
, [zh_HK]Invalid UTF8: Sun SPARC (sparc)ã
.
make[2]: Entering directory '/webwml/chinese/News/weekly'
copying index.zh-cn.html to ../../../../www/News/weekly/./2002/48
Processing index.wml: [zh_CN]Malformed UTF-8 character (unexpected end
of string) in substitution (s///) at ../../bin/tocn.pl line 13, <> line 146.
Malformed UTF-8 character (unexpected end of string) in substitution
(s///) at ../../bin/tocn.pl line 15, <> line 146.
panic: do_trans_simple_utf8 line 362 at ../../bin/tocn.pl line 20, <>
line 146.
, [zh_TW]Invalid UTF8: å
, [zh_HK]Invalid UTF8: å
.
copying index.zh-cn.html to ../../../../www/News/weekly/./2002/49
copying index.zh-cn.html to ../../../../www/News/weekly/./2003/09
Processing index.wml: [zh_CN]Invalid UTF8: æªæè¿°äºåŸå®è£
, [zh_TW]Invalid UTF8: ä»¶æè¿°äºåŸå®è£
, [zh_HK]Invalid UTF8: ä»¶æè¿°äºåŸå®è£
.
copying index.zh-cn.html to ../../../../www/News/weekly/./2003/10
Processing index.wml: [zh_CN]Invalid UTF8: 们ç<a
href="../../../../events/talks">æŒè®²é¡µé¢</a>æ¥åïŒ
, [zh_TW]Invalid UTF8: 们ç<a
href="../../../../events/talks">æŒè®²é¡µé¢</a>æ¥åïŒ
, [zh_HK]Invalid UTF8: 们ç<a
href="../../../../events/talks">æŒè®²é¡µé¢</a>æ¥åïŒ
.
copying index.zh-cn.html to ../../../../www/News/weekly/./2012/15
make[1]: Entering directory '/webwml/chinese/devel'
Processing
testing.wml:
[zh_CN],
[zh_TW]Invalid
UTF8: °äº 4
åäžæç®æŽæ°çè»ä»¶å
ïŒå ç²å®åæç Žå£äŸè³Žã<q>(0)</q> æ¯ç¡
, [zh_HK]Invalid
UTF8: °äº 4
åäžæç®æŽæ°çè»ä»¶å
ïŒå ç²å®åæç Žå£äŸè³Žã<q>(0)</q> æ¯ç¡
.
make[2]: Entering directory '/webwml/chinese/devel/join'
Processing index.wml: [zh_CN]Malformed UTF-8 character: \xe9\x98\x0a
(unexpected non-continuation byte 0x0a, 2 bytes after start byte 0xe9;
need 3 bytes, got 2) in substitution (s///) at ../../bin/tocn.pl line
108, <> line 52.
, [zh_TW], [zh_HK].
copying index.zh-cn.html to ../../../../www/devel/join
copying index.zh-hk.html to ../../../../www/devel/join
copying index.zh-tw.html to ../../../../www/devel/join
make[1]: Entering directory '/webwml/chinese/international'
Processing index.wml: [zh_CN]Malformed UTF-8 character: \xe9\x98\x0a
(unexpected non-continuation byte 0x0a, 2 bytes after start byte 0xe9;
need 3 bytes, got 2) in substitution (s///) at ../bin/tocn.pl line 108,
<> line 89.
, [zh_TW]Invalid UTF8:
çšåº
, [zh_HK]Invalid UTF8:
çšåº
.
make[2]: Entering directory '/webwml/chinese/international/Chinese'
Processing thanks.wml: [zh_CN]Invalid UTF8: «é»çæå
, [zh_TW]Invalid UTF8: «é»çæå
, [zh_HK]Invalid UTF8: «é»çæå
.
make[1]: Entering directory '/webwml/chinese/intro'
Processing about.wml: [zh_CN], [zh_TW], [zh_HK]panic: swash_fetch got
swatch of unexpected bit width, slen=512, needents=64 at ../bin/tohk.pl
line 131, <> line 95.
.
make -C legal install
make[1]: Entering directory '/webwml/chinese/legal'
Processing index.wml: [zh_CN]Malformed UTF-8 character: \xe9\x98\x0a
(unexpected non-continuation byte 0x0a, 2 bytes after start byte 0xe9;
need 3 bytes, got 2) in substitution (s///) at ../bin/tocn.pl line 108,
<> line 68.
, [zh_TW], [zh_HK].
copying index.zh-cn.html to ../../../www/legal
copying index.zh-hk.html to ../../../www/legal
copying index.zh-tw.html to ../../../www/legal
make[1]: Entering directory '/webwml/chinese/releases'
Processing proposed-updates.wml: [zh_CN],
[zh_TW]Invalid UTF8: èœæçµå°é proposed-updates
, [zh_HK]Invalid UTF8: èœæçµå°é proposed-updates
.
make[2]: Entering directory '/webwml/chinese/releases/hamm'
Processing HOWTO.upgrade.wml: [zh_CN], [zh_TW]Malformed UTF-8 character:
\xe5\x8c\x0a (unexpected non-continuation byte 0x0a, 2 bytes after start
byte 0xe5; need 3 bytes, got 2) in substitution (s///) at
../../bin/totw.pl line 111, <> line 71.
, [zh_HK].
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#959474; Package www.debian.org.
(Sun, 03 May 2020 21:00:04 GMT) (full text, mbox, link).
Acknowledgement sent
to Holger Wansing <hwansing@mailbox.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>.
(Sun, 03 May 2020 21:00:04 GMT) (full text, mbox, link).
Message #10 received at 959474@bugs.debian.org (full text, mbox, reply):
[Message part 1 (text/plain, inline)]
Hi,
Laura Arjona Reina <larjona@debian.org> wrote:
> There are some issues with some Chinese pages when they are built in a
> buster machine.
> We need to fix those issues (at least the "Malformed UTF-8 character
> [...] at ../../bin/tocn.pl [...]" ones) so DSA can upgrade the
> www-master machine to buster. See the summary of the log at the bottom
> to know which files produce this error.
> I have no idea of how to fix the issues, so any help from the Chinese
> team or web team mates is greatly appreciated..
> Additional issues may arise (e.g. I still didn't test the release-notes
> or doc-manual), any help testing is welcome too, please create bug
> reports for each different issue or update the existing ones. Thanks!
>
> LONG VERSION
>
> I've done a test build of the /english and /chinese subdirs in a buster
> machine, and I have noticed some warnings/errors related to the Chinese
> pages (some, not all of them).
>
> It would be desirable to upgrade www-master machine to buster as soon as
> possible, so any help with this (from website or Chinese team members)
> is very appreciated.
>
> Below you can find an extract of the build log, including only the the
> files for which I got some error or warning message.
>
> After the build, I have compared the problematic HTML files of a build
> in stretch and a build in buster with a diff tool, to see if there were
> significant changes in the html output due to these issues.
>
> Here are my results:
>
> * For the messages of the type ", [zh_TW]Invalid UTF8: " when building,
> I couldn't note any difference between the output of a stretch build and
> the output of a buster build.
>
> I would say this is not a blocker for the buster upgrade of www-master.
I don't know what I did differently from Laura, but here some of the built HTML files
with "Invalid UTF8: ... " messages lack much of their content compared
to the ones currently on www-master.
So maybe those are serious as well.
> * For the messages of the type "Malformed UTF-8 character [...] at
> ../../bin/tocn.pl [...]" I have seen important changes in the HTML diff,
> I think the output in the stretch build is totally broken (fortunately,
> there are not many files in that situation).
>
> I would say this is a blocker for the buster upgrade of www-master, but
> I would prefer somebody of the Chinese team to confirm (try to build
> those files in a buster machine, and review the output).
Maybe someone from the Chinese team can solve this, but if not, I want
to propose a possible (temporary) solution:
If I delete the files below from the webwml/chinese tree, I can build
chinese without any errors. So we could probably go with a workaround like this:
delete these files to get the upgrade blockers out of the way, upgrade
wolkenstein to buster, and then try to re-add the files step by step, maybe
with some modifications at some point, to get the original situation back.
Holger
--
Holger Wansing <hwansing@mailbox.org>
PGP-Fingerprint: 496A C6E8 1442 4B34 8508 3529 59F1 87CA 156E B076
[files-deleted-from-chinese.txt (text/plain, attachment)]
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#959474; Package www.debian.org.
(Tue, 05 May 2020 00:33:07 GMT) (full text, mbox, link).
Acknowledgement sent
to Boyuan Yang <byang@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>.
(Tue, 05 May 2020 00:33:07 GMT) (full text, mbox, link).
Message #15 received at 959474@bugs.debian.org (full text, mbox, reply):
[Message part 1 (text/plain, inline)]
Hi all,
(with my Debian Chinese Team hat on)
(see bottom...)
On Sunday, 2020-05-03 at 22:57 +0200, Holger Wansing wrote:
> Hi,
>
> Laura Arjona Reina <larjona@debian.org> wrote:
> > There are some issues with some Chinese pages when they are built in a
> > buster machine.
> > We need to fix those issues (at least the "Malformed UTF-8 character
> > [...] at ../../bin/tocn.pl [...]" ones) so DSA can upgrade the
> > www-master machine to buster. See the summary of the log at the bottom
> > to know which files produce this error.
> > I have no idea of how to fix the issues, so any help from the Chinese
> > team or web team mates is greatly appreciated..
> > Additional issues may arise (e.g. I still didn't test the release-notes
> > or doc-manual), any help testing is welcome too, please create bug
> > reports for each different issue or update the existing ones. Thanks!
> >
> > LONG VERSION
> >
> > I've done a test build of the /english and /chinese subdirs in a buster
> > machine, and I have noticed some warnings/errors related to the Chinese
> > pages (some, not all of them).
> >
> > It would be desirable to upgrade www-master machine to buster as soon as
> > possible, so any help with this (from website or Chinese team members)
> > is very appreciated.
> >
> > Below you can find an extract of the build log, including only the the
> > files for which I got some error or warning message.
> >
> > After the build, I have compared the problematic HTML files of a build
> > in stretch and a build in buster with a diff tool, to see if there were
> > significant changes in the html output due to these issues.
> >
> > Here are my results:
> >
> > * For the messages of the type ", [zh_TW]Invalid UTF8: " when building,
> > I couldn't note any difference between the output of a stretch build and
> > the output of a buster build.
> >
> > I would say this is not a blocker for the buster upgrade of www-master.
>
> Don't know what I did different than Laura, but here some of the built html
> files
> with "Invalid UTF8: ... " messages are lacking much of the content, compared
> to the one currently at www-master.
> So maybe they are also serious.
>
> > * For the messages of the type "Malformed UTF-8 character [...] at
> > ../../bin/tocn.pl [...]" I have seen important changes in the HTML diff,
> > I think the output in the stretch build is totally broken (fortunately,
> > there are not many files in that situation).
> >
> > I would say this is a blocker for the buster upgrade of www-master, but
> > I would prefer somebody of the Chinese team to confirm (try to build
> > those files in a buster machine, and review the output).
>
> Maybe someone from the chinese people can solve this, but if not, I want
> to propose a possible (temporary) solution:
>
> If I delete the files below from the webwml/chinese tree, I can build
> chinese without any errors. So, probably we can go with a workaround like
> this:
> delete this files, to remove these upgrade blockers out of the way, upgrade
> wolkenstein to buster, and then try to re-add the files step-by-step, maybe
> with some modifications at some point, to get the original situation back.
Thanks for raising this issue. These build errors might have multiple causes,
but I stripped the issue down to a (possible) regression of wml. Let's fix
this issue first before talking about others.
=======================================
$ wml --version
This is WML Version 2.12.2
Copyright (c) 1996-2001 Ralf S. Engelschall.
Copyright (c) 1999-2001 Denis Barbier.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
$ cat /etc/issue
Debian GNU/Linux bullseye/sid \n \l
$ cat a.wml
<p>
包
</p>
$ hexdump -C a.wml
00000000 3c 70 3e 0a e5 8c 85 0a 3c 2f 70 3e 0a |<p>.....</p>.|
0000000d
$ wml a.wml > test.txt
$ cat test.txt
<p>
�
</p>
$ hexdump -C test.txt
00000000 3c 70 3e 0a e5 8c 0a 3c 2f 70 3e 0a |<p>....</p>.|
0000000c
$
==================================================
The single character in the a.wml above is U+5305 [1], namely "CJK Unified
Ideograph-5305", a commonly-used Chinese character. Its UTF-8 encoding is
"0xE5 0x8C 0x85". However after wml transformation, only "0xE5 0x8C" was kept
and the "0x85" was dropped. That's surely a regression.
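The byte-level effect described above can be checked independently of wml. The following is a minimal sketch in Python (not part of the original session; the helper names `hex_bytes` and `is_valid_utf8` are made up for illustration). It shows that U+5305 encodes to three bytes and that the sequence left behind once 0x85 is dropped is no longer valid UTF-8, which matches the "Malformed UTF-8 character" messages in the build log.

```python
# Illustration only: Python is used to inspect the bytes; wml itself is written in Perl.
char = "\u5305"  # 包, CJK Unified Ideograph-5305

encoded = char.encode("utf-8")
# Same bytes as in the hexdump of test.txt: the trailing 0x85 is gone.
truncated = encoded[:-1] + b"\n"


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated lowercase hex (hypothetical helper)."""
    return " ".join(f"{b:02x}" for b in data)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(hex_bytes(encoded), is_valid_utf8(encoded))      # e5 8c 85 True
print(hex_bytes(truncated), is_valid_utf8(truncated))  # e5 8c 0a False
```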
I am using Debian Unstable but similar things also happen in Buster.
I cc-ed the wml maintainer in Debian. Axel, is there any possibility to solve
this regression in both Sid/Testing and Stable?
--
Regards,
Boyuan Yang
[1] https://www.compart.com/en/unicode/U+5305
[signature.asc (application/pgp-signature, inline)]
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#959474; Package www.debian.org.
(Tue, 05 May 2020 01:03:02 GMT) (full text, mbox, link).
Acknowledgement sent
to Axel Beckert <abe@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>.
(Tue, 05 May 2020 01:03:02 GMT) (full text, mbox, link).
Message #20 received at 959474@bugs.debian.org (full text, mbox, reply):
[Message part 1 (text/plain, inline)]
Control: clone -1 -2
Control: reassign -2 wml 2.12.2~ds1-2
Control: retitle -2 wml: Regression in "htmlstrip -O2" (default) with Chinese language
Hi,
Boyuan Yang wrote:
> Thanks for raising this issue.
Thanks from me, too. I wasn't aware of such a regression, sorry.
> These build errors might have multiple causes,
> but I stripped the issue down to a (possible) regression of wml. Let's fix
> this issue first before talking about others.
>
> =======================================
> $ wml --version
> This is WML Version 2.12.2
> Copyright (c) 1996-2001 Ralf S. Engelschall.
> Copyright (c) 1999-2001 Denis Barbier.
>
> This program is distributed in the hope that it will be useful,
> but WITHOUT ANY WARRANTY; without even the implied warranty of
> MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> GNU General Public License for more details.
> $ cat /etc/issue
> Debian GNU/Linux bullseye/sid \n \l
>
> $ cat a.wml
> <p>
> 包
> </p>
> $ hexdump -C a.wml
> 00000000 3c 70 3e 0a e5 8c 85 0a 3c 2f 70 3e 0a |<p>.....</p>.|
> 0000000d
> $ wml a.wml > test.txt
> $ cat test.txt
> <p>
> �
> </p>
> $ hexdump -C test.txt
> 00000000 3c 70 3e 0a e5 8c 0a 3c 2f 70 3e 0a |<p>....</p>.|
> 0000000c
> $
[…]
> I am using Debian Unstable but similar things also happen in Buster.
Can confirm that this is a regression between Stretch and Buster. :-(
> The single character in the a.wml above is U+5305 [1], namely "CJK Unified
> Ideograph-5305", a commonly-used Chinese character. Its UTF-8 encoding is
> "0xE5 0x8C 0x85". However after wml transformation, only "0xE5 0x8C" was kept
> and the "0x85" was dropped. That's surely a regression.
Ack. Figured out that it's pass 8 of 9 passes in WML:
→ cat a.wml | wml -p1-8
<p>
�
</p>
→ cat a.wml | wml -p1-7
<p>
包
</p>
→ cat a.wml | wml -p1-7,9
<p>
包
</p>
→ echo 包 | /usr/share/wml/exec/wml_p8_htmlstrip
�
→
Pass 8 is htmlstrip, something similar to uglifyjs, but for HTML.
Since that pass should only be there for delivery performance and disk space
reasons, it can likely be left out easily.
So I see multiple ways to more or less quickly fix this issue in the
Debian web:
* Always call wml with "-p1-7,9".
* Call wml with "-p1-7,9" if any of the affected languages is built.
* Add <nostrip>…</nostrip> containers in the header and footer
templates for the affected languages.
To be more precise, it's the optimisation level 2 of htmlstrip:
→ echo 包 | /usr/share/wml/exec/wml_p8_htmlstrip -O 0
包
→ echo 包 | /usr/share/wml/exec/wml_p8_htmlstrip -O 1
包
→ echo 包 | /usr/share/wml/exec/wml_p8_htmlstrip -O 2
�
→
The man page says:
Level 2:
Good stripping: Same as level 1 plus compression of
multiple whitespaces (more then one in sequence) to single
whitespaces [txt,tag] and stripping of trailing whitespaces
at the of of a line [txt,tag,pre].
This level is the default because while providing good
optimization the HTML markup is not destroyed and remains
human readable.
So instead of skipping htmlstrip completely, "-O1" could be passed to wml
everywhere I suggested passing "-p1-7,9", as wml hands this option on to
htmlstrip:
→ cat a.wml | wml -O1
<p>
包
</p>
> I cc-ed the wml maintainer in Debian. Axel, is there any possibility to solve
> this regression in both Sid/Testing and Stable?
I think the above is a good first workaround on buster. With this
mail, I clone the bug report and will try to figure out what change in
htmlstrip caused the regression and/or how it can be fixed.
However, I currently have issues building more recent upstream versions
of WML, which is the reason why wml in Unstable hasn't seen an update
yet. A more recent version is in git, but IIRC there was another
release or two recently, which I haven't looked at yet.
Regards, Axel
--
,''`. | Axel Beckert <abe@debian.org>, https://people.debian.org/~abe/
: :' : | Debian Developer, ftp.ch.debian.org Admin
`. `' | 4096R: 2517 B724 C5F6 CA99 5329 6E61 2FF9 CD59 6126 16B5
`- | 1024D: F067 EA27 26B9 C3FC 1486 202E C09E 1D89 9593 0EDE
[signature.asc (application/pgp-signature, inline)]
Bug 959474 cloned as bug 959761
Request was from Axel Beckert <abe@debian.org>
to 959474-submit@bugs.debian.org.
(Tue, 05 May 2020 01:03:02 GMT) (full text, mbox, link).
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#959474; Package www.debian.org.
(Tue, 05 May 2020 01:39:02 GMT) (full text, mbox, link).
Acknowledgement sent
to Axel Beckert <abe@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>.
(Tue, 05 May 2020 01:39:02 GMT) (full text, mbox, link).
Message #27 received at 959474@bugs.debian.org (full text, mbox, reply):
[Message part 1 (text/plain, inline)]
Hi,
I found the culprit quicker than expected. However, I'm no longer sure whether
it's really a WML issue or whether it sits even deeper:
Axel Beckert wrote:
> → echo 包 | /usr/share/wml/exec/wml_p8_htmlstrip -O 1
> 包
> → echo 包 | /usr/share/wml/exec/wml_p8_htmlstrip -O 2
> �
Level 2 actually only consists of these two regular expressions being
applied:
* s|(\S+)[ \t]{2,}|$1 |sg
* s|\s+\n|\n|sg
It's the latter one (a really simple regexp) which causes the
breakage. But not always. It depends on which Perl version
compatibility level is used:
→ echo 包 | perl -pe 's|\s+\n|\n|sg;'
包
→ echo 包 | perl -pE 's|\s+\n|\n|sg;'
�
"-E" instead of "-e" means "use the most recent Perl version feature
set"; for this bug it is equivalent to "use 5.014;", as that's what is
used in htmlstrip.
From some point of view, we're lucky, because the feature set of Perl
5.14 wasn't that big: "say state switch unicode_strings".
It's obvious that neither say, state nor switch are causing this. So
it seems as if "use feature unicode_strings" is the culprit. Proof:
→ echo 包 | perl -pe 's|\s+\n|\n|sg;'
包
→ echo 包 | perl -M"feature unicode_strings" -pe 's|\s+\n|\n|sg;'
�
Which kinda sounds like a Perl bug. Cc'ing the maintainers of Debian's
perl package (not the whole Debian Perl Team), maybe they have some
insight what actually goes wrong here and if that's indeed a Perl bug.
I'm leaving #959761 open in wml as I now have an idea how to fix this
there (adding "no feature unicode_strings" to htmlstrip in the hope
that this doesn't do any collateral damage):
→ echo 包 | perl -pE 'no feature unicode_strings; s|\s+\n|\n|sg;'
包
Regards, Axel
--
,''`. | Axel Beckert <abe@debian.org>, https://people.debian.org/~abe/
: :' : | Debian Developer, ftp.ch.debian.org Admin
`. `' | 4096R: 2517 B724 C5F6 CA99 5329 6E61 2FF9 CD59 6126 16B5
`- | 1024D: F067 EA27 26B9 C3FC 1486 202E C09E 1D89 9593 0EDE
[signature.asc (application/pgp-signature, inline)]
Information forwarded
to debian-bugs-dist@lists.debian.org, perl@packages.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#959474; Package www.debian.org.
(Tue, 05 May 2020 02:24:03 GMT) (full text, mbox, link).
Acknowledgement sent
to "Yao Wei (魏銘廷)" <mwei@lxde.org>:
Extra info received and forwarded to list. Copy sent to perl@packages.debian.org, Debian WWW Team <debian-www@lists.debian.org>.
(Tue, 05 May 2020 02:24:03 GMT) (full text, mbox, link).
Message #32 received at 959474@bugs.debian.org (full text, mbox, reply):
[Message part 1 (text/plain, inline)]
Package: www.debian.org
Followup-For: Bug #959474
Hi,
After a bit of investigation of the Perl source code (5.31.11 downloaded
from upstream), I found that it handles whitespace oddly when
`feature unicode_strings` is turned on. I am not a Perl person and I
haven't executed the source code yet, so my interpretation might be
wrong.
When `unicode_strings` is on, `in_uni_8_bit` should be true internally, and
in three places (pp.c:6040, pp.c:6076, pp.c:6114) `isSPACE_L1` is
called to check whether the character being examined is whitespace, which
includes checking whether it is 0x85 or 0xA0 (handy.h:1611). In the
case of the character 包, the last byte of its 3-byte UTF-8 encoding is 0x85,
hence the problem.
-- System Information:
Debian Release: bullseye/sid
APT prefers unstable
APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)
Kernel: Linux 5.6.0-1-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
[signature.asc (application/pgp-signature, inline)]
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#959474; Package www.debian.org.
(Tue, 05 May 2020 02:24:04 GMT) (full text, mbox, link).
Acknowledgement sent
to Boyuan Yang <byang@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>.
(Tue, 05 May 2020 02:24:04 GMT) (full text, mbox, link).
Message #37 received at 959474@bugs.debian.org (full text, mbox, reply):
[Message part 1 (text/plain, inline)]
Hi,
On Tuesday, 2020-05-05 at 03:34 +0200, Axel Beckert wrote:
> → echo 包 | perl -pe 's|\s+\n|\n|sg;'
> 包
> → echo 包 | perl -M"feature unicode_strings" -pe 's|\s+\n|\n|sg;'
> �
>
> Which kinda sounds like a Perl bug. Cc'ing the maintainers of Debian's
> perl package (not the whole Debian Perl Team), maybe they have some
> insight what actually goes wrong here and if that's indeed a Perl bug.
I guess it is a Perl bug. I am listing more Chinese characters other than "包"
here that can trigger the problem:
% echo 包 | perl -M"feature unicode_strings" -pe 's|\s+\n|\n|sg;'
�
% echo 赠 | perl -M"feature unicode_strings" -pe 's|\s+\n|\n|sg;'
�
% echo 传 | perl -M"feature unicode_strings" -pe 's|\s+\n|\n|sg;'
�
% echo 阅 | perl -M"feature unicode_strings" -pe 's|\s+\n|\n|sg;'
�
% echo 加 | perl -M"feature unicode_strings" -pe 's|\s+\n|\n|sg;'
�
% echo 者 | perl -M"feature unicode_strings" -pe 's|\s+\n|\n|sg;'
�
% echo -n 赠 | hexdump -C
00000000 e8 b5 a0
% echo -n 传 | hexdump -C
00000000 e4 bc a0
% echo -n 包 | hexdump -C
00000000 e5 8c 85
% echo -n 阅 | hexdump -C
00000000 e9 98 85
% echo -n 加 | hexdump -C
00000000 e5 8a a0
% echo -n 者 | hexdump -C
00000000 e8 80 85
(Note the 0xA0 and 0x85 at the end of each sequence.)
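To make the pattern explicit, here is a rough sketch in Python rather than Perl (an illustration under assumptions, not what htmlstrip actually runs; the function name `strip_bytewise` is hypothetical). It treats every byte as one Latin-1 character, which is roughly how the thread describes the undecoded input being handled, and applies the same `\s+\n` substitution. In Python's `re`, `\s` on `str` input also matches U+0085 and U+00A0, so the final byte of each listed character is removed.

```python
import re

# Characters reported in this thread as triggering the problem.
chars = ["包", "赠", "传", "阅", "加", "者"]


def strip_bytewise(data: bytes) -> bytes:
    """Strip whitespace before a newline, treating each byte as a Latin-1 character."""
    text = data.decode("latin-1")        # one code point per byte, never fails
    text = re.sub(r"\s+\n", "\n", text)  # \s also matches U+0085 (NEL) and U+00A0 (NBSP)
    return text.encode("latin-1")


for c in chars:
    raw = (c + "\n").encode("utf-8")
    # Print the original bytes and the bytes after the byte-wise substitution.
    print(c, raw.hex(), "->", strip_bytewise(raw).hex())
```

Every character in the list loses its third byte, because that byte is either 0x85 or 0xA0.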
Mwei (https://nm.debian.org/person/mwei/) just talked to me, saying that it
could be a bug with the isSPACE_L1 macro used in perl's pp.c. He will be
replying to this email soon.
--
Thanks,
Boyuan Yang
[signature.asc (application/pgp-signature, inline)]
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#959474; Package www.debian.org.
(Tue, 05 May 2020 03:48:02 GMT) (full text, mbox, link).
Acknowledgement sent
to Yao Wei <mwei@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>.
(Tue, 05 May 2020 03:48:02 GMT) (full text, mbox, link).
Message #42 received at 959474@bugs.debian.org (full text, mbox, reply):
[Message part 1 (text/plain, inline)]
On Mon, May 04, 2020 at 10:19:02PM -0400, Boyuan Yang wrote:
> Mwei (https://nm.debian.org/person/mwei/) just talked to me saying that it
> could be a bug with isSPACE_L1 macro in perl's pp.c. He will be replying the
> email soon.
>
Hi,
(I used reportbug to handle reply of this thread, and I missed a lot of
recipients here. This is a resend of reply in #959474. Sorry for the
noise.)
After a bit of investigation of the Perl source code (5.31.11 downloaded
from upstream), I found that it handles whitespace oddly when
`feature unicode_strings` is turned on. I am not a Perl person and I
haven't executed the source code yet, so my interpretation might be
wrong.
When `unicode_strings` is on, `in_uni_8_bit` should be true internally, and
in three places (pp.c:6040, pp.c:6076, pp.c:6114) `isSPACE_L1` is
called to check whether the character being examined is whitespace, which
includes checking whether it is 0x85 or 0xA0 (handy.h:1611). In the
case of the character 包, the last byte of its 3-byte UTF-8 encoding is 0x85,
hence the problem.
[signature.asc (application/pgp-signature, inline)]
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#959474; Package www.debian.org.
(Tue, 05 May 2020 05:57:02 GMT) (full text, mbox, link).
Message #45 received at 959474@bugs.debian.org (full text, mbox, reply):
(not a Perl maintainer here)
-=| Axel Beckert, 05.05.2020 03:34:28 +0200 |=-
> → echo 包 | perl -pe 's|\s+\n|\n|sg;'
> 包
> → echo 包 | perl -M"feature unicode_strings" -pe 's|\s+\n|\n|sg;'
> �
>
> Which kinda sounds like a Perl bug. Cc'ing the maintainers of Debian's
> perl package (not the whole Debian Perl Team), maybe they have some
> insight what actually goes wrong here and if that's indeed a Perl
> bug.
Seems like a user (wml) bug to me (improper handling of UTF-8 encoded data):
→ echo 包赠传阅加者 | perl -CS -M"feature unicode_strings" -pe 's|\s+\n|\n|sg;'
包赠传阅加者
From perlrun(1):
-C [number/list]
The -C flag controls some of the Perl Unicode features.
As of 5.8.1, the -C can be followed either by a number or a list
of option letters. The letters, their numeric values, and effects
are as follows; listing the letters is equal to summing the
numbers.
I 1 STDIN is assumed to be in UTF-8
O 2 STDOUT will be in UTF-8
E 4 STDERR will be in UTF-8
S 7 I + O + E
Perhaps the strings in wml need to be decoded from UTF-8 so that they
aren't treated as a sequence of independent bytes?
U+0085 is "Next line (NEL)", which seems to be treated as "\n".
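The decode-first idea can be sketched as follows. This is a minimal Python analogue of the approach, not a patch for wml, and the function name `strip_trailing_ws` is an assumption made for illustration. The input is decoded from UTF-8 before the substitution, so the regular expression sees whole characters instead of individual bytes, and the result is encoded again on the way out.

```python
import re


def strip_trailing_ws(data: bytes) -> bytes:
    """Decode UTF-8 first, strip whitespace before newlines, then re-encode."""
    text = data.decode("utf-8")          # 包 becomes one character, not three bytes
    text = re.sub(r"\s+\n", "\n", text)  # same pattern as htmlstrip level 2
    return text.encode("utf-8")


sample = "包赠传阅加者 \t\n".encode("utf-8")
print(strip_trailing_ws(sample).decode("utf-8"), end="")
```

With the input decoded, only the real trailing space and tab are removed and all six characters keep their full three-byte encodings.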
(
Strangely, replacing -CS with a call to STDIN->binmode("UTF-8")
doesn't help:
echo 包 | perl -E 'STDIN->binmode("UTF-8"); while(<>) { s|\s+\n|\n|sg; print }'
�
Explicitly using Encode helps:
echo 包 | perl -E 'use Encode qw(decode_utf8); while(<>) { $_ = decode_utf8($_); s|\s+\n|\n|sg; print }'
Wide character in print at -e line 1, <> line 1.
包
(the wide character warning is expected, because STDOUT is not instructed how to encode Unicode characters)
)
-- dam
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#959474; Package www.debian.org.
(Tue, 05 May 2020 07:45:02 GMT) (full text, mbox, link).
Message #48 received at 959474@bugs.debian.org (full text, mbox, reply):
* Damyan Ivanov <dmn@debian.org>, 2020-05-05, 08:45:
>Strangely, replacing -CS with a call to STDIN->binmode("UTF-8")
>doesn't help:
>
> echo 包 | perl -E 'STDIN->binmode("UTF-8"); while(<>) { s|\s+\n|\n|sg; print }'
> �
That's because "UTF-8" is not a valid argument for binmode().
You want:
$ echo 包 | perl -E 'STDIN->binmode(":encoding(UTF-8)") or die; while(<>) { s|\s+\n|\n|sg; print }'
Wide character in print at -e line 1, <> line 1.
包
or:
$ echo 包 | perl -E 'STDIN->binmode(":utf8") or die; while(<>) { s|\s+\n|\n|sg; print }'
Wide character in print at -e line 1, <> line 1.
包
--
Jakub Wilk
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#959474; Package www.debian.org.
(Tue, 05 May 2020 08:57:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Axel Beckert <abe@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>.
(Tue, 05 May 2020 08:57:03 GMT) (full text, mbox, link).
Message #53 received at 959474@bugs.debian.org (full text, mbox, reply):
Hi Damyan,
Damyan Ivanov wrote:
> (not a Perl maintainer here)
Did help nevertheless. Just didn't want to spam the whole Perl Team
with potential Perl bugs. ;-)
> -=| Axel Beckert, 05.05.2020 03:34:28 +0200 |=-
> > → echo 包 | perl -pe 's|\s+\n|\n|sg;'
> > 包
> > → echo 包 | perl -M"feature unicode_strings" -pe 's|\s+\n|\n|sg;'
> > �
> >
> > Which kinda sounds like a Perl bug. Cc'ing the maintainers of Debian's
> > perl package (not the whole Debian Perl Team), maybe they have some
> > insight what actually goes wrong here and if that's indeed a Perl
> > bug.
>
> Seems like a user (wml) bug to me (improper handling of UTF-8 encoded data):
>
> → echo 包赠传阅加者 | perl -CS -M"feature unicode_strings" -pe 's|\s+\n|\n|sg;'
> 包赠传阅加者
>
> From perlrun(1):
>
> -C [number/list]
> The -C flag controls some of the Perl Unicode features.
>
> As of 5.8.1, the -C can be followed either by a number or a list
> of option letters. The letters, their numeric values, and effects
> are as follows; listing the letters is equal to summing the
> numbers.
>
> I 1 STDIN is assumed to be in UTF-8
> O 2 STDOUT will be in UTF-8
> E 4 STDERR will be in UTF-8
> S 7 I + O + E
Thanks! I was not aware of the -C option...
> Perhaps the strings in wml need to be decoded from UTF-8 so that they
> aren't treated as a sequence of independent bytes?
... and I would have expected "use feature unicode_strings;" to already
activate all of this.
> U+0085 is "Next line (NEL)", which seems to be treated as "\n".
I see.
> Strangely, replacing -CS with a call to STDIN->binmode("UTF-8")
> doesn't help:
>
> echo 包 | perl -E 'STDIN->binmode("UTF-8"); while(<>) { s|\s+\n|\n|sg; print }'
> �
>
> Explicitly using Encode helps:
>
> echo 包 | perl -E 'use Encode qw(decode_utf8); while(<>) { $_ = decode_utf8($_); s|\s+\n|\n|sg; print }'
> Wide character in print at -e line 1, <> line 1.
> 包
Thanks, will try to use whatever works from these.
Regards, Axel
--
,''`. | Axel Beckert <abe@debian.org>, https://people.debian.org/~abe/
: :' : | Debian Developer, ftp.ch.debian.org Admin
`. `' | 4096R: 2517 B724 C5F6 CA99 5329 6E61 2FF9 CD59 6126 16B5
`- | 1024D: F067 EA27 26B9 C3FC 1486 202E C09E 1D89 9593 0EDE
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#959474; Package www.debian.org.
(Tue, 05 May 2020 10:18:02 GMT) (full text, mbox, link).
Acknowledgement sent
to gregor herrmann <gregoa@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>.
(Tue, 05 May 2020 10:18:02 GMT) (full text, mbox, link).
Message #58 received at 959474@bugs.debian.org (full text, mbox, reply):
On Tue, 05 May 2020 10:53:29 +0200, Axel Beckert wrote:
> > Perhaps the strings in wml need to be decoded from UTF-8 so that they
> > aren't treated as a sequence of independent bytes?
> ... and would have expect "use feature unicode_strings;" already
> activates all of this.
(I haven't read the thread in detail …).
Personally I often use "use utf8::all" (from libutf8-all-perl) if I'm
reasonably sure that the input is not weird and I want to output
utf-8. It is sometimes a bit slow but handles all the en/decoding in
my experience.
> > Explicitly using Encode helps:
> >
> > echo 包 | perl -E 'use Encode qw(decode_utf8); while(<>) { $_ = decode_utf8($_); s|\s+\n|\n|sg; print }'
> > Wide character in print at -e line 1, <> line 1.
> > 包
% time echo 包 | perl -E 'use Encode qw(decode_utf8); while(<>) { $_ = decode_utf8($_); s|\s+\n|\n|sg; print }'
Wide character in print at -e line 1, <> line 1.
包
echo 包 0.00s user 0.00s system 42% cpu 0.002 total
perl -E 0.03s user 0.01s system 97% cpu 0.034 total
% time echo 包 | perl -Mutf8::all -E ' while(<>) { s|\s+\n|\n|sg; print }'
包
echo 包 0.00s user 0.00s system 63% cpu 0.002 total
perl -Mutf8::all -E ' while(<>) { s|\s+\n|\n|sg; print }' 0.04s user 0.01s system 98% cpu 0.050 total
% time echo 包 | perl -CS -E 'while(<>) { s|\s+\n|\n|sg; print }'
包
echo 包 0.00s user 0.00s system 60% cpu 0.002 total
perl -CS -E 'while(<>) { s|\s+\n|\n|sg; print }' 0.00s user 0.00s system 83% cpu 0.005 total
Cheers,
gregor
--
.''`. https://info.comodo.priv.at -- Debian Developer https://www.debian.org
: :' : OpenPGP fingerprint D1E1 316E 93A7 60A8 104D 85FA BB3A 6801 8649 AA06
`. `' Member VIBE!AT & SPI Inc. -- Supporter Free Software Foundation Europe
`- BOFH excuse #378: Operators killed by year 2000 bug bite.
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#959474; Package www.debian.org.
(Wed, 06 May 2020 02:57:02 GMT) (full text, mbox, link).
Acknowledgement sent
to Boyuan Yang <byang@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>.
(Wed, 06 May 2020 02:57:02 GMT) (full text, mbox, link).
Message #63 received at 959474@bugs.debian.org (full text, mbox, reply):
Hi Axel,
I just tested the new wml 2.12.2~ds1-3 on the Chinese translations of the website
(webwml). It looks like the previous bug has been properly fixed.
Since the webmaster team is trying to upgrade the machine from Debian 9 to
Debian 10, it would be better to have this fix pushed into stable soon.
Can you make a stable update for package wml with this fix?
--
Thanks,
Boyuan Yang
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#959474; Package www.debian.org.
(Wed, 06 May 2020 08:33:06 GMT) (full text, mbox, link).
Acknowledgement sent
to Axel Beckert <abe@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>.
(Wed, 06 May 2020 08:33:07 GMT) (full text, mbox, link).
Message #68 received at 959474@bugs.debian.org (full text, mbox, reply):
Hi Boyuan,
Boyuan Yang wrote:
> I just tested the new wml 2.12.2~ds1-3 on Chinese translations for website
> (webwml). It looks like the previous bug has been properly fixed.
Thanks a lot for testing and verifying!
> Since the webmaster team is trying to upgrade the machine from Debian 9 to
> Debian 10, it should be better if we have this fix pushed into stable soon.
> Can you make a stable update for package wml with this fix?
As mentioned on IRC (not sure if you're on #debian-www, probably not),
this is my plan.
I will, though, have to wait until wml 2.12.2~ds1-3 migrates to
testing. That should happen within 2 or 3 days, once autopkgtest has
run and passed.
Laura, though, said on IRC that the webmasters might not want to wait
until the next stable update.
But maybe I can get it to stable-proposed-updates soon and they can
use it from there, so that wouldn't cause much of a lag.
(While I was writing this mail, on #debian-www it was decided that
they will use one of the workarounds, likely the "-O1" one.)
Regards, Axel
--
,''`. | Axel Beckert <abe@debian.org>, https://people.debian.org/~abe/
: :' : | Debian Developer, ftp.ch.debian.org Admin
`. `' | 4096R: 2517 B724 C5F6 CA99 5329 6E61 2FF9 CD59 6126 16B5
`- | 1024D: F067 EA27 26B9 C3FC 1486 202E C09E 1D89 9593 0EDE
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#959474; Package www.debian.org.
(Sun, 07 Jun 2020 13:45:02 GMT) (full text, mbox, link).
Acknowledgement sent
to Laura Arjona Reina <larjona@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>.
(Sun, 07 Jun 2020 13:45:02 GMT) (full text, mbox, link).
Message #73 received at 959474@bugs.debian.org (full text, mbox, reply):
Hi all
As a workaround for the Debian website, until wml 2.12.2~ds1-3 or higher
reaches stable, I have added "-O1" to the options passed
to wml for Chinese, in the /chinese/Make.lang file:
+# Add "-O1" to wml to be passed to htmlstrip, to avoid malformed UTF-8
+# see bug #959474
+# This option needs to be kept in Chinese until wml 2.12.2~ds1-3 or higher
+# arrives to Debian stable
+
+WMLOPTIONSZH = -O1
WMLOUTPUT = -o UNDEFuZH@uCNuCNHKuCNTW:$(*F).zh-cn.html.tmp@g+w \
-o UNDEFuZH@uHKuCNHKuHKTWuTWHK:$(*F).zh-hk.html.tmp@g+w \
@@ -54,7 +60,7 @@ WMLPROLOG = --prolog=$(FORMAT_ZH)
# Remove initial blank line due "[ZH::]" in $(TEMPLDIR)/common_tags.wml,
# an unfortunate but necessary workaround of a bug in slice < 1.3.9
WMLEPILOG = --epilog=$(STRIP_INITIAL_BLANK_LINE)
-WML = wml $(WMLOPTIONS) $(WMLOUTPUT) $(WMLPROLOG) $(WMLEPILOG)
+WML = wml $(WMLOPTIONS) $(WMLOPTIONSZH) $(WMLOUTPUT) $(WMLPROLOG) $(WMLEPILOG)
I have compared the results of builds in stretch and buster both with
and without the option, and there are no changes in stretch, and the
UTF-8 issues are fixed in buster with the option (by the way, thanks
Boyuan for the additional fixes you did to mitigate the error).
So, I think that Bug#959474 can be closed, but I'll leave it open until
we effectively migrate to Buster and see the results in www.debian.org
"live" :-)
Thanks everybody for your work!
Kind regards,
--
Laura Arjona Reina
https://wiki.debian.org/LauraArjona
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#959474; Package www.debian.org.
(Sun, 07 Jun 2020 14:06:02 GMT) (full text, mbox, link).
Acknowledgement sent
to Axel Beckert <abe@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>.
(Sun, 07 Jun 2020 14:06:02 GMT) (full text, mbox, link).
Message #78 received at 959474@bugs.debian.org (full text, mbox, reply):
Hi,
Laura Arjona Reina wrote:
> I have compared the results of builds in stretch and buster both with
> and without the option, and there are no changes in stretch, and the
> UTF-8 issues are fixed in buster with the option
Thanks for these tests.
> So, I think that Bug#959474 can be closed, but I'll leave it open until
> we effectively migrate to Buster and see the results in www.debian.org
> "live" :-)
Just to be sure: I should still provide a stable update for buster,
right?
(Sorry, was a bit busy IRL and nearly forgot about this open "to do"
item. So thanks for the reminder.)
Regards, Axel
--
,''`. | Axel Beckert <abe@debian.org>, https://people.debian.org/~abe/
: :' : | Debian Developer, ftp.ch.debian.org Admin
`. `' | 4096R: 2517 B724 C5F6 CA99 5329 6E61 2FF9 CD59 6126 16B5
`- | 1024D: F067 EA27 26B9 C3FC 1486 202E C09E 1D89 9593 0EDE
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#959474; Package www.debian.org.
(Sun, 07 Jun 2020 19:27:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Laura Arjona Reina <larjona@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>.
(Sun, 07 Jun 2020 19:27:03 GMT) (full text, mbox, link).
Message #83 received at 959474@bugs.debian.org (full text, mbox, reply):
Hi
On 7/6/20 at 16:02, Axel Beckert wrote:
> Just to be sure: I should still provide a stable update for buster,
> right?
>
I don't know if the type of bug qualifies for a stable update.
For www.debian.org, we'll be using the -O1 workaround for building the
Chinese pages, and that's about optimization, we don't lose any
functionality, so I think we can wait for bullseye.
Boyuan, please correct me if I am wrong...
Kind regards,
--
Laura Arjona Reina
https://wiki.debian.org/LauraArjona
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#959474; Package www.debian.org.
(Wed, 10 Jun 2020 00:48:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Boyuan Yang <byang@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>.
(Wed, 10 Jun 2020 00:48:03 GMT) (full text, mbox, link).
Message #88 received at 959474@bugs.debian.org (full text, mbox, reply):
On Sunday, 2020-06-07 at 21:23 +0200, Laura Arjona Reina wrote:
> Hi
>
> On 7/6/20 at 16:02, Axel Beckert wrote:
>
> > Just to be sure: I should still provide a stable update for buster,
> > right?
> >
>
> I don't know if the type of bug qualifies for a stable update.
If I were the maintainer, I would give it a try to make the stable
update. (Why not?)
> For www.debian.org, we'll be using the -O1 workaround for building the
> Chinese pages, and that's about optimization, we don't lose any
> functionality, so I think we can wait for bullseye.
>
> Boyuan, please correct me if I am wrong...
If we have the workaround applied, website building with Chinese
contents should not be an issue anymore.
--
Thanks,
Boyuan Yang
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#959474; Package www.debian.org.
(Thu, 28 Jan 2021 02:03:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Changwoo Ryu <cwryu@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>.
(Thu, 28 Jan 2021 02:03:03 GMT) (full text, mbox, link).
Message #93 received at 959474@bugs.debian.org (full text, mbox, reply):
Korean is affected too, so I have also added the "-O1" option workaround for Korean.
Debian bug tracking system administrator <owner@bugs.debian.org>.
Last modified:
Sun Jun 4 07:03:37 2023;