Debian Bug report logs - #567781
www.debian.org: converting the website to UTF-8

Package: www.debian.org; Maintainer for www.debian.org is Debian WWW Team <debian-www@lists.debian.org>;

Reported by: Simon Paillard <simon.paillard@resel.enst-bretagne.fr>

Date: Sun, 31 Jan 2010 12:21:02 UTC

Severity: wishlist

Tags: confirmed

Done: David Prévot <taffit@debian.org>

Bug is archived. No further changes may be made.

Summary: Hi,

Toggle useless messages

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to debian-bugs-dist@lists.debian.org, pabs@debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Sun, 31 Jan 2010 12:21:05 GMT) Full text and rfc822 format available.

Acknowledgement sent to Simon Paillard <simon.paillard@resel.enst-bretagne.fr>:
New Bug report received and forwarded. Copy sent to pabs@debian.org, Debian WWW Team <debian-www@lists.debian.org>. (Sun, 31 Jan 2010 12:21:05 GMT) Full text and rfc822 format available.

Message #5 received at submit@bugs.debian.org (full text, mbox):

From: Simon Paillard <simon.paillard@resel.enst-bretagne.fr>
To: submit@bugs.debian.org
Subject: www.debian.org: converting the website to UTF-8
Date: Sun, 31 Jan 2010 13:19:31 +0100
Package: www.debian.org
Severity: wishlist

On Wed, Nov 04, 2009 at 11:38:07AM +0800, Paul Wise wrote:
> The English parts of the website currently seem to be encoded in
> iso-8859-1. This causes issues in some cases where UTF-8 is used instead
> of HTML symbol entities. An example of this is the draft financial
> partners page:
> 
> http://www.debian.org/partners/financial_partners.en.html
> 
> Would it not be better to encode the website in UTF-8?

Now UTF8 is a defacto charset standard in Debian and other
distributions, moving the pages to UTF8 would avoid encoding
limitations and conversion mistakes.

Out of the 49 languages available on the website, 19 don't use UTF8:
$ grep CHARSET */.wmlrc |grep -vi utf | cut -d "/" -f1 | sort
catalan
chinese
croatian
czech
-> english
greek
hungarian
indonesian
italian
korean
lithuanian
norwegian
polish
portuguese
romanian
romanian
russian
spanish
swedish

The process used and issues met during migration of french translation 
is available at:
http://lists.debian.org/debian-www/2009/07/msg00230.html

The main specific issue of moving the english version over other
languages will be updating the po files of all languages.

-- 
Simon Paillard




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Wed, 03 Feb 2010 22:33:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Andrei Popescu <andreimpopescu@gmail.com>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Wed, 03 Feb 2010 22:33:03 GMT) Full text and rfc822 format available.

Message #10 received at 567781@bugs.debian.org (full text, mbox):

From: Andrei Popescu <andreimpopescu@gmail.com>
To: 567781@bugs.debian.org
Subject: Re: Bug#567781: www.debian.org: converting the website to UTF-8
Date: Thu, 4 Feb 2010 00:28:02 +0200
[Message part 1 (text/plain, inline)]
On Sun,31.Jan.10, 13:19:31, Simon Paillard wrote:

> Out of the 49 languages available on the website, 19 don't use UTF8:
> $ grep CHARSET */.wmlrc |grep -vi utf | cut -d "/" -f1 | sort
...
> romanian
> romanian

False positive:

,----[ grep CHARSET webwml/romanian/.wmlrc ]
| #-D CHARSET=iso-8859-2
| -D CHARSET=utf-8
| #-D CHARSET=iso-8859-16
`----

(I did the conversion myself ;)

Regards,
Andrei
-- 
Offtopic discussions among Debian users and developers:
http://lists.alioth.debian.org/mailman/listinfo/d-community-offtopic
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Sun, 09 May 2010 21:09:09 GMT) Full text and rfc822 format available.

Acknowledgement sent to Hector Oron <hector.oron@gmail.com>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Sun, 09 May 2010 21:09:09 GMT) Full text and rfc822 format available.

Message #15 received at 567781@bugs.debian.org (full text, mbox):

From: Hector Oron <hector.oron@gmail.com>
To: 567781@bugs.debian.org
Cc: debian-l10n-catalan@lists.debian.org
Subject: Re: webwml: iso -> utf-8?
Date: Sun, 9 May 2010 23:05:34 +0200
Hello,

2010/5/9 Guillem Jover <guillem@debian.org>:
> Ei!
>
> On Thu, 2010-04-22 at 12:22:45 +0200, Hector Oron wrote:
>> ¿Seria convenient aplicar?
>
>> --- .wmlrc      2010-04-22 12:21:06.000000000 +0200
>> +++ .wmlrc.new  2010-04-22 12:21:33.000000000 +0200
>> @@ -1,7 +1,7 @@
>>  -D CUR_LANG=Catalan
>>  -D CUR_ISO_LANG=ca
>>  -D CUR_LOCALE=ca_ES
>> --D CHARSET=iso-8859-1
>> +-D CHARSET=utf-8
>>  -D HOME~.
>>  -D INTRO~intro
>>  -D DEVEL~devel
>>
>> Reference: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=567781
>
> Bé, al patch li falta canviar CUR_LOCALE i l'encoding dels fitxers que
> estaven en iso-8859-1. Així que vaig fer tots aquests canvis l'altre
> dia i ara esta tot en UTF-8.

  Guillem has looked into doing catalonian conversion to UTF-8.

Thanks,
-- 
 Héctor Orón

"Our Sun unleashes tremendous flares expelling hot gas into the Solar
System, which one day will disconnect us."




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Tue, 01 Jun 2010 20:27:07 GMT) Full text and rfc822 format available.

Acknowledgement sent to Simon Paillard <simon.paillard@resel.enst-bretagne.fr>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Tue, 01 Jun 2010 20:27:07 GMT) Full text and rfc822 format available.

Message #20 received at 567781@bugs.debian.org (full text, mbox):

From: Simon Paillard <simon.paillard@resel.enst-bretagne.fr>
To: Alexander Reichle-Schmehl <tolimar@debian.org>, 567781@bugs.debian.org
Subject: Re: Converting English wml files to utf-8?
Date: Tue, 1 Jun 2010 22:24:47 +0200
Hi,

On Tue, Jun 01, 2010 at 04:26:28PM +0200, Bas Zoetekouw wrote:
> > When adding new files (in the News or events area, but I guess it's
> > similar for others) I often run into encoding problems.  As Debian
> > systems use utf-8 as default these days I often forget to convert the
> > files I add (or sometimes I do remember and "iconv -f utf-8 -t latin1"
> > doesn't work).
> > Would it make sense to convert the English wml files to utf-8 as it has
> > been done with some other languages, too?
> 
> Yes, I think that's a very good idea.  Everyone should be able to handle
> utf8 nowadays, and using a consistent charset over as many translations
> as possible would definately reduce work, I would say.

FTR, already tracked by #567781

Specific care to translation-headers is needed, to avoid translators to
bump the headers of all up to date translations.

-- 
Simon Paillard




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Wed, 02 Jun 2010 08:57:06 GMT) Full text and rfc822 format available.

Acknowledgement sent to Alexander Reichle-Schmehl <tolimar@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Wed, 02 Jun 2010 08:57:06 GMT) Full text and rfc822 format available.

Message #25 received at 567781@bugs.debian.org (full text, mbox):

From: Alexander Reichle-Schmehl <tolimar@debian.org>
To: Gerfried Fuchs <rhonda@deb.at>, 567781@bugs.debian.org
Subject: Re: Converting English wml files to utf-8?
Date: Wed, 02 Jun 2010 10:56:17 +0200
Am 01.06.2010 17:05, schrieb Gerfried Fuchs:

>> Would it make sense to convert the English wml files to utf-8 as it has
>> been done with some other languages, too?
>  Like written in #debian-www, if people are still aware that in the
> special areas for news entries and similar where data gets incorporated
> into other languages entities still needs to get used (this won't change
> before _all_ languages are converted to utf8!) then I am all for it and
> see it as a step in the right direction.

Which parts of the news entries would that be?  I only know some RSS
feeds created from wml files, but I guess it would be possible to solve
that problem by telling the RSS creation script that the created RSS
feed is utf-8 encoded.


Best regards,
  Alexander




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Wed, 02 Jun 2010 09:27:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Alexander Reichle-Schmehl <alexander@schmehl.info>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Wed, 02 Jun 2010 09:27:03 GMT) Full text and rfc822 format available.

Message #30 received at 567781@bugs.debian.org (full text, mbox):

From: Alexander Reichle-Schmehl <alexander@schmehl.info>
To: Alexander Reichle-Schmehl <tolimar@debian.org>, 567781@bugs.debian.org
Subject: Re: Bug#567781: Converting English wml files to utf-8?
Date: Wed, 02 Jun 2010 11:24:25 +0200
Hi!

Am 02.06.2010 10:56, schrieb Alexander Reichle-Schmehl:

> Which parts of the news entries would that be?  I only know some RSS
> feeds created from wml files, but I guess it would be possible to solve
> that problem by telling the RSS creation script that the created RSS
> feed is utf-8 encoded.

Got answer via irc:  All content, that is added to some overview pages
(as the English text is taken for translated overview pages, if the
document hasn't been translated, yet).

That would be for

english/News/:  All <define-tag pagetitle>foo</define-tag>

english/News/weekly/:  All the SUMMARY="foo" of the "#use
wml::debian::weeklynews::header" headers (respectively ::projectnews::).


I'll try to look over them ASAP and report back.



Best regards,
  Alexander






Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Wed, 02 Jun 2010 09:30:05 GMT) Full text and rfc822 format available.

Acknowledgement sent to Gerfried Fuchs <rhonda@deb.at>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Wed, 02 Jun 2010 09:30:05 GMT) Full text and rfc822 format available.

Message #35 received at 567781@bugs.debian.org (full text, mbox):

From: Gerfried Fuchs <rhonda@deb.at>
To: Alexander Reichle-Schmehl <tolimar@debian.org>, 567781@bugs.debian.org
Subject: Re: Bug#567781: Converting English wml files to utf-8?
Date: Wed, 2 Jun 2010 11:28:04 +0200
	Hi!

* Alexander Reichle-Schmehl <tolimar@debian.org> [2010-06-02 10:56:17 CEST]:
> Am 01.06.2010 17:05, schrieb Gerfried Fuchs:
> >  Like written in #debian-www, if people are still aware that in the
> > special areas for news entries and similar where data gets incorporated
> > into other languages entities still needs to get used (this won't change
> > before _all_ languages are converted to utf8!) then I am all for it and
> > see it as a step in the right direction.
> 
> Which parts of the news entries would that be?

 For the News entries, the <define-tag pagetitle> (similarly in other
news parts like e.g. d-i), for the DPN the SUMMARY= part of the
wml::debian::projectnews::header, the DSAs have their <define-tag
description>, and then there are the regular .data files for various
other means.

 These areas are taken verbatim into translated pages and thus have to
stay 8bit clean, i.e. everything outside of the ascii range has to use
entities. This limitation has to stick around and to get remembered as
long as we don't settle on utf8 for everything in the CVS.

> I only know some RSS feeds created from wml files, but I guess it
> would be possible to solve that problem by telling the RSS creation
> script that the created RSS feed is utf-8 encoded.

 That's the easy part. :P  Though, thinking about it, having 8bit
characters in the english files in RSS feed aggregated parts might cause
troubles: The encoding of the language for which the rss feed is
generated might use a different encoding than utf8 and thus receive
b0rked characters for parts which they haven't translated yet.

 I think we need to limit ourself on entities for parts that got pulled
into RSS feeds, too. Unless of course the rss feed generator code is as
good as being able to pick up the encoding of the english subtree and
the encoding of the language itself and encoding-convert visa versa. But
that still depends on the assumption that every 8bit character used in
the english files is representable in the encoding of the specific
language.

 So long,
Rhonda




Added tag(s) confirmed. Request was from Gerfried Fuchs <rhonda@deb.at> to control@bugs.debian.org. (Sun, 19 Dec 2010 20:36:05 GMT) Full text and rfc822 format available.

Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Mon, 03 Jan 2011 22:39:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Simon Paillard <spaillard@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Mon, 03 Jan 2011 22:39:03 GMT) Full text and rfc822 format available.

Message #42 received at 567781@bugs.debian.org (full text, mbox):

From: Simon Paillard <spaillard@debian.org>
To: The7up <the7up@gmail.com>
Cc: debian-www@lists.debian.org, 567781@bugs.debian.org
Subject: Re: move to utf8
Date: Mon, 3 Jan 2011 23:36:33 +0100
Hi,

On Mon, Jan 03, 2011 at 09:27:30PM +0100, The7up wrote:
> May someone can help me?
> 
> The following steps are necessary to move Hungarian translation to utf-8?

See http://lists.debian.org/debian-www/2009/07/msg00233.html
 
> 1.
> for i in $(find -type f); do \
>  recode iso-8859-2..utf8 $i; \
> done

You should exclude png and pdf files files (or first run the command on *.wml
files to see easily remaining files).
 
> 2.
> update .wmlrc (-D CHARSET=utf-8)

Optionnal:

3. Recode po files to UTF8. 
4. Replace html entities to UTF8.


-- 
Simon Paillard




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Wed, 05 Jan 2011 20:27:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to The7up <the7up@gmail.com>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Wed, 05 Jan 2011 20:27:03 GMT) Full text and rfc822 format available.

Message #47 received at 567781@bugs.debian.org (full text, mbox):

From: The7up <the7up@gmail.com>
To: The7up <the7up@gmail.com>, debian-www@lists.debian.org, 567781@bugs.debian.org
Subject: Re: move to utf8
Date: Wed, 5 Jan 2011 21:23:01 +0100
[Message part 1 (text/plain, inline)]
Hi,

Something goes wrong...

After recoding wml and po (files says: UTF-8 Unicode English text), tried to
commit this files, but cvs cannot found differences, so do nothing :(

Any help? :S

Thnx,

Szabolcs Siebenhofer

On Mon, Jan 3, 2011 at 11:36 PM, Simon Paillard <spaillard@debian.org>wrote:

> Hi,
>
> On Mon, Jan 03, 2011 at 09:27:30PM +0100, The7up wrote:
> > May someone can help me?
> >
> > The following steps are necessary to move Hungarian translation to utf-8?
>
> See http://lists.debian.org/debian-www/2009/07/msg00233.html
>
> > 1.
> > for i in $(find -type f); do \
> >  recode iso-8859-2..utf8 $i; \
> > done
>
> You should exclude png and pdf files files (or first run the command on
> *.wml
> files to see easily remaining files).
>
> > 2.
> > update .wmlrc (-D CHARSET=utf-8)
>
> Optionnal:
>
> 3. Recode po files to UTF8.
> 4. Replace html entities to UTF8.
>
>
> --
> Simon Paillard
>
[Message part 2 (text/html, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Thu, 06 Jan 2011 17:09:06 GMT) Full text and rfc822 format available.

Acknowledgement sent to Osamu Aoki <osamu@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Thu, 06 Jan 2011 17:09:06 GMT) Full text and rfc822 format available.

Message #52 received at 567781@bugs.debian.org (full text, mbox):

From: Osamu Aoki <osamu@debian.org>
To: The7up <the7up@gmail.com>, 567781@bugs.debian.org
Subject: Re: Bug#567781: move to utf8
Date: Fri, 7 Jan 2011 02:03:55 +0900
On Wed, Jan 05, 2011 at 09:23:01PM +0100, The7up wrote:
> Hi,
> 
> Something goes wrong...
> 
> After recoding wml and po (files says: UTF-8 Unicode English text), tried to
> commit this files, but cvs cannot found differences, so do nothing :(

Which file are you looking.
 
> Any help? :S
> > >  recode iso-8859-2..utf8 $i; \

I did this on hungarian/po/bugs.hu.po

It converted nicely from ISO to UTF-8
$ file bugs.hu.po* 
bugs.hu.po:      UTF-8 Unicode PO (gettext message catalogue) text
bugs.hu.po.orig: ISO-8859 PO (gettext message catalogue) text

Second guessing is difficult.  Just try one by one to understand and
check it yourself.

Osamu




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Thu, 06 Jan 2011 17:33:07 GMT) Full text and rfc822 format available.

Acknowledgement sent to The7up <the7up@gmail.com>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Thu, 06 Jan 2011 17:33:07 GMT) Full text and rfc822 format available.

Message #57 received at 567781@bugs.debian.org (full text, mbox):

From: The7up <the7up@gmail.com>
To: Osamu Aoki <osamu@debian.org>
Cc: 567781@bugs.debian.org
Subject: Re: Bug#567781: move to utf8
Date: Thu, 6 Jan 2011 18:30:30 +0100
[Message part 1 (text/plain, inline)]
Thanks for your help.
File recoding doesn't change time stamp of the file. So, after recode, touch
needed.

After using touch everything seems to be OK.

Thanks again :)

On Thu, Jan 6, 2011 at 6:03 PM, Osamu Aoki <osamu@debian.org> wrote:

> On Wed, Jan 05, 2011 at 09:23:01PM +0100, The7up wrote:
> > Hi,
> >
> > Something goes wrong...
> >
> > After recoding wml and po (files says: UTF-8 Unicode English text), tried
> to
> > commit this files, but cvs cannot found differences, so do nothing :(
>
> Which file are you looking.
>
> > Any help? :S
> > > >  recode iso-8859-2..utf8 $i; \
>
> I did this on hungarian/po/bugs.hu.po
>
> It converted nicely from ISO to UTF-8
> $ file bugs.hu.po*
> bugs.hu.po:      UTF-8 Unicode PO (gettext message catalogue) text
> bugs.hu.po.orig: ISO-8859 PO (gettext message catalogue) text
>
> Second guessing is difficult.  Just try one by one to understand and
> check it yourself.
>
> Osamu
>
[Message part 2 (text/html, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Mon, 10 Jan 2011 22:00:06 GMT) Full text and rfc822 format available.

Acknowledgement sent to Simon Paillard <spaillard@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Mon, 10 Jan 2011 22:00:06 GMT) Full text and rfc822 format available.

Message #62 received at 567781@bugs.debian.org (full text, mbox):

From: Simon Paillard <spaillard@debian.org>
To: The7up <the7up@gmail.com>, debian-www@lists.debian.org, 567781@bugs.debian.org
Subject: Re: move to utf8: po charset vs header
Date: Mon, 10 Jan 2011 22:57:49 +0100
On Mon, Jan 03, 2011 at 11:36:33PM +0100, Simon Paillard wrote:
> On Mon, Jan 03, 2011 at 09:27:30PM +0100, The7up wrote:
> > May someone can help me?
> > 
> > The following steps are necessary to move Hungarian translation to utf-8?
> 
> See http://lists.debian.org/debian-www/2009/07/msg00233.html

As mentionned there, po headers must be consistent with the actual content
regarding the charset.

I've updated the headers accordingly.

You can have a look at some remaining issues:
http://www-master.debian.org/build-logs/validate/hu
 
> > 1.
> > for i in $(find -type f); do \
> >  recode iso-8859-2..utf8 $i; \
> > done
> 
> You should exclude png and pdf files files (or first run the command on *.wml
> files to see easily remaining files).
>  
> > 2.
> > update .wmlrc (-D CHARSET=utf-8)
> 
> Optionnal:
> 
> 3. Recode po files to UTF8. 
> 4. Replace html entities to UTF8.
 

-- 
Simon Paillard




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Mon, 07 Feb 2011 22:57:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Francesca Ciceri <madamezou@yahoo.it>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Mon, 07 Feb 2011 22:57:04 GMT) Full text and rfc822 format available.

Message #67 received at 567781@bugs.debian.org (full text, mbox):

From: Francesca Ciceri <madamezou@yahoo.it>
To: 567781@bugs.debian.org
Subject: update on the utf-8 conversion
Date: Mon, 7 Feb 2011 23:56:09 +0100
[Message part 1 (text/plain, inline)]
Hi,
I just want to report that two more languages have realized the conversion to
UTF-8: greek and italian.

It would be my intention to take care of the conversion to UTF-8 of the unmaintained
translations.

Cheers,
Francesca
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Tue, 08 Feb 2011 20:39:06 GMT) Full text and rfc822 format available.

Acknowledgement sent to Simon Paillard <spaillard@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Tue, 08 Feb 2011 20:39:06 GMT) Full text and rfc822 format available.

Message #72 received at 567781@bugs.debian.org (full text, mbox):

From: Simon Paillard <spaillard@debian.org>
To: 567781@bugs.debian.org
Subject: Re: Bug#567781: update on the utf-8 conversion
Date: Tue, 8 Feb 2011 21:35:48 +0100
Hi,

On Mon, Feb 07, 2011 at 11:56:09PM +0100, Francesca Ciceri wrote:
> Hi,
> I just want to report that two more languages have realized the conversion to
> UTF-8: greek and italian.

2 less :)
 
> It would be my intention to take care of the conversion to UTF-8 of the unmaintained
> translations.

Be careful about moving to UTF8 languages you cannot read, IMO it's up to
native speakers to take care of that, to avoid any potential issue.

If there is no maintainer for that language, then there is no interest for the
moment to move the encoding ?

-- 
Simon Paillard




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Fri, 11 Feb 2011 00:09:06 GMT) Full text and rfc822 format available.

Acknowledgement sent to Charles Plessy <plessy@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Fri, 11 Feb 2011 00:09:06 GMT) Full text and rfc822 format available.

Message #77 received at 567781@bugs.debian.org (full text, mbox):

From: Charles Plessy <plessy@debian.org>
To: 567781@bugs.debian.org
Subject: Re: Debian WWW team IRC meeting, 15 February 21:30 UTC in #debian-www
Date: Fri, 11 Feb 2011 09:05:07 +0900
Le Thu, Feb 10, 2011 at 09:02:24PM +0100, Kåre Thor Olsen a écrit :
> On Tuesday the 15th of February at 21:30 UTC, the Debian WWW team will 
> hold a meeting in the #debian-www channel on the OFTC network.
> 
> The agenda includes several topics related to the new website layout, 
> e.g. a template and documentation for other teams wanting to re-use 
> the layout.
> 
> The entire agenda is being prepared on the wiki:
> 
> http://wiki.debian.org/Teams/Webmaster/TODO

Dear all,

I will not attend the meeting, but I would like to remind that I volunteer to
migrate the english pages to UTF-8. Just let me know if you are interested by
what I proposed.

http://lists.debian.org/debian-www/2011/01/msg00234.html

Apologises for not having appropriately posted through #567781 earlier.

Have a nice day,

-- 
Charles Plessy
Debian Med packaging team,
http://www.debian.org/devel/debian-med
Tsurumi, Kanagawa, Japan




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Wed, 16 Feb 2011 00:24:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Francesca Ciceri <madamezou@yahoo.it>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Wed, 16 Feb 2011 00:24:02 GMT) Full text and rfc822 format available.

Message #82 received at 567781@bugs.debian.org (full text, mbox):

From: Francesca Ciceri <madamezou@yahoo.it>
To: 567781@bugs.debian.org
Subject: Re: Bug#567781: update on the utf-8 conversion
Date: Wed, 16 Feb 2011 01:22:01 +0100
[Message part 1 (text/plain, inline)]
On Tue, 8 Feb 2011 21:35:48 +0100, Simon Paillard wrote:
>
>Be careful about moving to UTF8 languages you cannot read, IMO it's up to
>native speakers to take care of that, to avoid any potential issue.
>

You're right. I'll avoid any changes if there won't a translator of the
related language with which collaborate. ;)

BTW, in the meanwhile also russian, has migrated to UTF-8.
So now languages that don't use UTF-8 are:

$ grep CHARSET */.wmlrc |grep -vi utf | cut -d "/" -f1 | sort
chinese
croatian
czech
english
indonesian
korean
lithuanian
polish
portuguese
spanish
swedish

(I've manually deleted romanian which is a false positive)

Only 11 languages left.
We knows that chinese needs squeeze upgrade (on wolkenstein?) to convert to utf-8.



Cheers,
Francesca
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Wed, 16 Mar 2011 22:36:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Kåre Thor Olsen <kaare@nightcall.dk>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Wed, 16 Mar 2011 22:36:04 GMT) Full text and rfc822 format available.

Message #87 received at 567781@bugs.debian.org (full text, mbox):

From: Kåre Thor Olsen <kaare@nightcall.dk>
To: debian-www@lists.debian.org, 567781@bugs.debian.org
Subject: Re: Bug#567781: update on the utf-8 conversion
Date: Wed, 16 Mar 2011 23:25:50 +0100
On Wednesday 16 February 2011 01:22:01 Francesca Ciceri wrote:
> Only 11 languages left.

Swedish now done by agreement with the Swedish translation coordiantor.

-- 
Regards, Kaare




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Tue, 17 May 2011 00:06:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Charles Plessy <plessy@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Tue, 17 May 2011 00:06:03 GMT) Full text and rfc822 format available.

Message #92 received at 567781@bugs.debian.org (full text, mbox):

From: Charles Plessy <plessy@debian.org>
To: 567781@bugs.debian.org
Subject: Re: Conversion of english pages to Unicode, via HTML entities.
Date: Tue, 17 May 2011 09:03:42 +0900
Le Mon, May 16, 2011 at 07:34:59PM +0200, Simon Paillard a écrit :
> On Sun, May 15, 2011 at 10:24:48PM +0900, Charles Plessy wrote:
> > 
> > would it be welcome if I would start to replace iso-8859-1 characters
> > by HTML entities using smart-change for the english language, in order
> > to ease conversion to Unicode ?  As of today, there would be this
> > number of files changed in the following directories.
> [..]
> 
> No, I would even advice the other: remaining entities -> to the coding used by
> each language.

Entities can be removed after the conversion, and I can help for this as well.

I would like the English pages to be converted to Unicode, and offered my help
a couple of monthes ago.  I proposed to first go to the common denominator of
iso-8859-1 and Unicode, which is ASCII plus entities, and then to switch
encoding, and then to remove the entities.

I sent this to http://bugs.debian.org/567781#77 and I thought it was accepted
by the WWW team after discussion on IRC:

http://meetbot.debian.net/debian-www/2011/debian-www.2011-02-15-21.30.html 

The advantage if this proposition is that the work can be distributed over
time and people.

What are the other plans ?  If it is to have a massive overnight transition,
given my timezone, you can probably count me out…

Have a nice day,

-- 
Charles Plessy
Debian Med packaging team,
http://www.debian.org/devel/debian-med
Tsurumi, Kanagawa, Japan




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Thu, 19 May 2011 08:06:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Gerfried Fuchs <rhonda@deb.at>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Thu, 19 May 2011 08:06:03 GMT) Full text and rfc822 format available.

Message #97 received at 567781@bugs.debian.org (full text, mbox):

From: Gerfried Fuchs <rhonda@deb.at>
To: Charles Plessy <plessy@debian.org>, 567781@bugs.debian.org
Subject: Re: Bug#567781: Conversion of english pages to Unicode, via HTML entities.
Date: Thu, 19 May 2011 10:03:31 +0200
* Charles Plessy <plessy@debian.org> [2011-05-17 02:03:42 CEST]:
> Le Mon, May 16, 2011 at 07:34:59PM +0200, Simon Paillard a écrit :
> > On Sun, May 15, 2011 at 10:24:48PM +0900, Charles Plessy wrote:
> > > 
> > > would it be welcome if I would start to replace iso-8859-1 characters
> > > by HTML entities using smart-change for the english language, in order
> > > to ease conversion to Unicode ?  As of today, there would be this
> > > number of files changed in the following directories.
> > [..]
> > 
> > No, I would even advice the other: remaining entities -> to the
> > coding used by each language.
> 
> Entities can be removed after the conversion, and I can help for this
> as well.

 Why entity in the first place and then switch it back? That would mean
an additional required bump of translation-check headers and whatsnot. I
don't see the benefit in this? Like Simon pointed out, it would make
e.g. proof reading approaches unneeded complicated. There is no use for
this in the aereas that are not already using entities.

 Also, "can be removed after the conversion" would be after the
conversion of _all_ languages because otherwise you would catch the
entities that are currently still needed.

> I would like the English pages to be converted to Unicode, and offered
> my help a couple of monthes ago.  I proposed to first go to the common
> denominator of iso-8859-1 and Unicode, which is ASCII plus entities,
> and then to switch encoding, and then to remove the entities.
> 
> I sent this to http://bugs.debian.org/567781#77 and I thought it was accepted
> by the WWW team after discussion on IRC:
> 
> http://meetbot.debian.net/debian-www/2011/debian-www.2011-02-15-21.30.html 

 In theory yes, help is appreciated and you are invited to help, but
please try to understand our reasoning on why we consider that
translating the 8bit characters to entities now, bumping all
translation-check headers, putting default for english to utf8, removing
entities and *again* bumping all translation-check headers, is not the
most useful approach.

 For pages not uptodate that means being left behind for two more
"updates", which might result in bigger warning, and also requires
additional care after the second conversion to not replace an entity
that isn't meant to be a direct utf8 character (yet).

> What are the other plans ?  If it is to have a massive overnight transition,
> given my timezone, you can probably count me out…

 One of the plans might be to do it in a work session during debcamp,
which is only two months away. If you like to help, please coordinate
with the people that already have done a conversion, and try to
understand their concerns.

 Enjoy,
Rhonda
-- 
Fühlst du dich mutlos, fass endlich Mut, los      |
Fühlst du dich hilflos, geh raus und hilf, los    | Wir sind Helden
Fühlst du dich machtlos, geh raus und mach, los   | 23.55: Alles auf Anfang
Fühlst du dich haltlos, such Halt und lass los    |




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Thu, 19 May 2011 10:40:32 GMT) Full text and rfc822 format available.

Acknowledgement sent to Charles Plessy <plessy@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Thu, 19 May 2011 10:40:40 GMT) Full text and rfc822 format available.

Message #102 received at 567781@bugs.debian.org (full text, mbox):

From: Charles Plessy <plessy@debian.org>
To: 567781@bugs.debian.org
Subject: Re: Bug#567781: Conversion of english pages to Unicode, via HTML entities.
Date: Thu, 19 May 2011 19:29:43 +0900
Dear Gerfried and WWW team,

my proposition is the following:

1) Bring the English pages into a state where the files are the same
   regardless whether the encoding is iso-8859-1, ascii or utf8.

2) Make the English pages served as utf8 instead of iso-8859-1.

3) If necessary, convert entities to accented characters.

Apart from 2) I am proposing to do up to 100 % of the work, according to how
much others would like to participate.

I think that by large, in the English pages, the characters that are currently
accented are in the persons or location names, as most of the pages that need
to be converted are in the directories users, events, vote, security and News.
I think that the three-step conversion I propose will not interfere with the
possibility of spellchecking the pages that are actively worked on.  Note that
anyway the pages in vote, security and News usually do not have new content
added. 

I propose to use smart-change in the steps 1) and 3), so that the translators
are not disturbed.  I already made a test in February for one page in
devel/debian-med, and it worked – see commit ID 2rdf5isFrcBQZ66v.  Please note
that I am the contributor of the English version and of the only translation of
that page: I took great care of not disturbing other's work.

It is true that in 1) and 3) there is a risk of side effects. I will look for
them and revert them. 

I would like to repeat that I did my best to think about the translators, and
never ever proposed something that would bump their translation-check headers,
because I propose to use smart-change.

At Debcamp, technically, how do you intend to convert the English pages to
unicode without bumping translation-check headers ?

Have a nice day,

-- 
Charles Plessy
Debian Med packaging team,
http://www.debian.org/devel/debian-med
Tsurumi, Kanagawa, Japan




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Thu, 26 May 2011 08:33:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Gerfried Fuchs <rhonda@deb.at>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Thu, 26 May 2011 08:33:04 GMT) Full text and rfc822 format available.

Message #107 received at 567781@bugs.debian.org (full text, mbox):

From: Gerfried Fuchs <rhonda@deb.at>
To: Charles Plessy <plessy@debian.org>, 567781@bugs.debian.org
Subject: Re: Bug#567781: Conversion of english pages to Unicode, via HTML entities.
Date: Thu, 26 May 2011 10:31:47 +0200
  Hi again.

* Charles Plessy <plessy@debian.org> [2011-05-19 12:29:43 CEST]:
> Dear Gerfried and WWW team,
> 
> my proposition is the following:
> 
> 1) Bring the English pages into a state where the files are the same
>    regardless whether the encoding is iso-8859-1, ascii or utf8.
> 
> 2) Make the English pages served as utf8 instead of iso-8859-1.
> 
> 3) If necessary, convert entities to accented characters.

 I am aware of that and directly responded to it, there is no need to
re-iterate the same statement over and over again ...

> I propose to use smart-change in the steps 1) and 3), so that the
> translators are not disturbed.

 If a translation is outdated already, it will disturb translators, and
one bump only instead of two is far more convenient in that respect.

> It is true that in 1) and 3) there is a risk of side effects. I will
> look for them and revert them. 

 The thing is, I still don't understand the need and gain or benefit for
buying that side effect risks.

> At Debcamp, technically, how do you intend to convert the English
> pages to unicode without bumping translation-check headers ?

 I never stated that it would be doable without translation-check bumps,
I have no clue where you picked that one up. All I was saying that it
would be possible to do it with a single bump, and that when sitting
together at debcamp more people can watch and think along and check
things so that the risk of overlooking stuff is reduced to a minimum,
without IMHO totally unnecessary commits for converting to entities and
back that leave the gut feeling of overengineering and overcomplicating
things.

 Enjoy,
Rhonda
-- 
Fühlst du dich mutlos, fass endlich Mut, los      |
Fühlst du dich hilflos, geh raus und hilf, los    | Wir sind Helden
Fühlst du dich machtlos, geh raus und mach, los   | 23.55: Alles auf Anfang
Fühlst du dich haltlos, such Halt und lass los    |




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Thu, 26 May 2011 09:24:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Charles Plessy <plessy@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Thu, 26 May 2011 09:24:06 GMT) Full text and rfc822 format available.

Message #112 received at 567781@bugs.debian.org (full text, mbox):

From: Charles Plessy <plessy@debian.org>
To: 567781@bugs.debian.org
Subject: Re: Bug#567781: Conversion of english pages to Unicode, via HTML entities.
Date: Thu, 26 May 2011 18:21:39 +0900
Le Thu, May 26, 2011 at 10:31:47AM +0200, Gerfried Fuchs a écrit :
> 
> > I propose to use smart-change in the steps 1) and 3), so that the
> > translators are not disturbed.
> 
>  If a translation is outdated already, it will disturb translators

Hi Gerfried,

I did not realise that smart_change was not bumping version of outdated
translations, and this clarifies a big misunderstanding.  This said, I could
probably modify smart_change locally if needed.

If you plan to do the migration at Debcamp that is great: this is more free
time for me.  If after Debconf the migration is not done, I will come back with
my proposal.


Cheers,

-- 
Charles Plessy
Tsurumi, Kanagawa, Japan




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Thu, 26 May 2011 09:36:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Gerfried Fuchs <rhonda@deb.at>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Thu, 26 May 2011 09:36:08 GMT) Full text and rfc822 format available.

Message #117 received at 567781@bugs.debian.org (full text, mbox):

From: Gerfried Fuchs <rhonda@deb.at>
To: Charles Plessy <plessy@debian.org>, 567781@bugs.debian.org
Subject: Re: Bug#567781: Conversion of english pages to Unicode, via HTML entities.
Date: Thu, 26 May 2011 11:32:47 +0200
* Charles Plessy <plessy@debian.org> [2011-05-26 11:21:39 CEST]:
> Le Thu, May 26, 2011 at 10:31:47AM +0200, Gerfried Fuchs a écrit :
> > > I propose to use smart-change in the steps 1) and 3), so that the
> > > translators are not disturbed.
> > 
> >  If a translation is outdated already, it will disturb translators
> 
> I did not realise that smart_change was not bumping version of outdated
> translations, and this clarifies a big misunderstanding.  This said, I could
> probably modify smart_change locally if needed.

 Erm, smart_change can't bump versions of outdated translations - how
should translators otherwise be aware that they have to update their
translation? It makes me wonder in what way you would like to modify
smart_change locally.

> If you plan to do the migration at Debcamp that is great: this is more
> free time for me.  If after Debconf the migration is not done, I will
> come back with my proposal.

 Don't let this stop you from doing it - but please (and this will be my
last response on that track) explain to me why you would want to switch
to entities and later remove the entities again instead of directly
switching the encoding with one run? It seems to me you are avoiding
responding to that (you mentioned once that it could get spread over
longer time - but I don't see how it would take a long time to do that,
or need that?). And with which propsal exactly do you want to come back?

 Thanks,
Rhonda
-- 
Fühlst du dich mutlos, fass endlich Mut, los      |
Fühlst du dich hilflos, geh raus und hilf, los    | Wir sind Helden
Fühlst du dich machtlos, geh raus und mach, los   | 23.55: Alles auf Anfang
Fühlst du dich haltlos, such Halt und lass los    |




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Thu, 26 May 2011 10:57:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Charles Plessy <plessy@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Thu, 26 May 2011 10:57:06 GMT) Full text and rfc822 format available.

Message #122 received at 567781@bugs.debian.org (full text, mbox):

From: Charles Plessy <plessy@debian.org>
To: 567781@bugs.debian.org
Subject: Re: Bug#567781: Conversion of english pages to Unicode, via HTML entities.
Date: Thu, 26 May 2011 19:54:27 +0900
Le Thu, May 26, 2011 at 11:32:47AM +0200, Gerfried Fuchs a écrit :
> smart_change can't bump versions of outdated translations - how
> should translators otherwise be aware that they have to update their
> translation?

Good point.  I thought about keeping the outdate count constant during the
operation, but of course the information of which is the last translated
version would be corrupted.

I do not think that I will ever have a large enough time window to work on a
single-run switch by myself.  I would like the Unicode transition to be done,
and if nobody else does it I propose to do it myself or together, step by step,
in an asynchronous manner.  The time I dedicate to Debian is usually very
fragmented.

If I come back with a proposal, I will

 - list the pages that need a conversion,
 - report if they have outdated translations,
 - report which translations would be likely to be modified by a smart-change,
 - detail how I propose to solve that problem.

Cheers,

-- 
Charles




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Thu, 26 May 2011 10:57:09 GMT) Full text and rfc822 format available.

Acknowledgement sent to Gerfried Fuchs <rhonda@deb.at>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Thu, 26 May 2011 10:57:13 GMT) Full text and rfc822 format available.

Message #127 received at 567781@bugs.debian.org (full text, mbox):

From: Gerfried Fuchs <rhonda@deb.at>
To: Charles Plessy <plessy@debian.org>, 567781@bugs.debian.org
Subject: Re: Bug#567781: Conversion of english pages to Unicode, via HTML entities.
Date: Thu, 26 May 2011 12:56:06 +0200
   Hi!

 Just for the record, to cut the discussion short and instead do
something than being side-tracked any further by comments that ignore
the actual questions instead of trying to get things done, I converted
the 445 files in the english/ part that were latin1 encoded to utf8,
changed the .wmlrc and did call smart_change with a
stupidpatternthatdoesntexist to bump all the translations.

 So from now on, commits to english/ that are containing non-ascii
characters MUST be utf-8 encoded. We'll see what might (or might not)
break after the next build which is expected in half an hour.

 Enjoy,
Rhonda
-- 
Fühlst du dich mutlos, fass endlich Mut, los      |
Fühlst du dich hilflos, geh raus und hilf, los    | Wir sind Helden
Fühlst du dich machtlos, geh raus und mach, los   | 23.55: Alles auf Anfang
Fühlst du dich haltlos, such Halt und lass los    |




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Fri, 27 May 2011 17:45:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Francesca Ciceri <madamezou@yahoo.it>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Fri, 27 May 2011 17:45:03 GMT) Full text and rfc822 format available.

Message #132 received at 567781@bugs.debian.org (full text, mbox):

From: Francesca Ciceri <madamezou@yahoo.it>
To: 567781@bugs.debian.org
Cc: debian-www@lists.debian.org
Subject: Re: Bug#567781: update on the utf-8 conversion
Date: Fri, 27 May 2011 19:40:50 +0200
[Message part 1 (text/plain, inline)]
Hi all,
portuguese has been converted today, by agreement with
debian-l10n-portuguese (and with their help).

So, thanks also Rhonda's great work on /english/ pages, the situation is:

$ grep CHARSET */.wmlrc |grep -vi utf | cut -d "/" -f1 | sort
croatian
czech
indonesian
korean
lithuanian
polish
spanish

i.e. Only 7 languages left! \o/ 
My next target would be spanish, if I'll find a spanish translator to
proofread the result (hint, hint!).

Cheers,
Francesca
-- 
<taffit> eof: when I want something | "Convince people with results,
done quickly, I don't wait for      |  rather than words"
others ;)			    |  Enrico Zini
				

[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Thu, 02 Jun 2011 19:09:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Gerfried Fuchs <rhonda@deb.at>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Thu, 02 Jun 2011 19:09:03 GMT) Full text and rfc822 format available.

Message #137 received at 567781@bugs.debian.org (full text, mbox):

From: Gerfried Fuchs <rhonda@deb.at>
To: 567781@bugs.debian.org, debian-www@lists.debian.org
Subject: Re: Bug#567781: update on the utf-8 conversion
Date: Thu, 2 Jun 2011 21:07:16 +0200
  Hi,

 another one down, Joy gave me green light for converting Croatian.

* Francesca Ciceri <madamezou@yahoo.it> [2011-05-27 19:40:50 CEST]:
> $ grep CHARSET */.wmlrc |grep -vi utf | cut -d "/" -f1 | sort
> czech
> indonesian
> korean
> lithuanian
> polish
> spanish

 Anyone from these language teams around who doesn't want to be the last
one sticking behind?

 Enjoy!
Rhonda
-- 
Fühlst du dich mutlos, fass endlich Mut, los      |
Fühlst du dich hilflos, geh raus und hilf, los    | Wir sind Helden
Fühlst du dich machtlos, geh raus und mach, los   | 23.55: Alles auf Anfang
Fühlst du dich haltlos, such Halt und lass los    |




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Wed, 08 Jun 2011 13:27:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Osamu Aoki <osamu@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Wed, 08 Jun 2011 13:27:03 GMT) Full text and rfc822 format available.

Message #142 received at 567781@bugs.debian.org (full text, mbox):

From: Osamu Aoki <osamu@debian.org>
To: Gerfried Fuchs <rhonda@deb.at>, 567781@bugs.debian.org
Cc: Charles Plessy <plessy@debian.org>
Subject: Re: Bug#567781: Conversion of english pages to Unicode, via HTML entities.
Date: Wed, 8 Jun 2011 22:21:30 +0900
Hi,

On Thu, May 26, 2011 at 12:56:06PM +0200, Gerfried Fuchs wrote:
>    Hi!
> 
>  Just for the record, to cut the discussion short and instead do
> something than being side-tracked any further by comments that ignore
> the actual questions instead of trying to get things done, I converted
> the 445 files in the english/ part that were latin1 encoded to utf8,
> changed the .wmlrc and did call smart_change with a
> stupidpatternthatdoesntexist to bump all the translations.

Thanks.

>  So from now on, commits to english/ that are containing non-ascii
> characters MUST be utf-8 encoded. We'll see what might (or might not)
> break after the next build which is expected in half an hour.

File is in UTF-8 encoded but this conversion did not convert funkey
entities to more readable UTF-8 text.

For example:
---
authors "Osamu Aoki (&#38738;&#26408; &#20462;)">
<maintainer "Osamu Aoki (&#38738;&#26408; &#20462;)">
---

Can I make them into more readable UTF-8 strings:
---
authors "Osamu Aoki (青木 修)">
<maintainer "Osamu Aoki (青木 修)">
---

If noone object, I will...

Regards,

Osamu






Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Wed, 08 Jun 2011 14:18:06 GMT) Full text and rfc822 format available.

Acknowledgement sent to David Prévot <taffit@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Wed, 08 Jun 2011 14:18:06 GMT) Full text and rfc822 format available.

Message #147 received at 567781@bugs.debian.org (full text, mbox):

From: David Prévot <taffit@debian.org>
To: 567781@bugs.debian.org
Subject: Re: Bug#567781: Conversion of english pages to Unicode, via HTML entities.
Date: Wed, 08 Jun 2011 10:15:41 -0400
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Hi,

Le 08/06/2011 09:21, Osamu Aoki a écrit :

> File is in UTF-8 encoded but this conversion did not convert funkey
> entities to more readable UTF-8 text.
> 
> For example:
> ---
> authors "Osamu Aoki (&#38738;&#26408; &#20462;)">
> <maintainer "Osamu Aoki (&#38738;&#26408; &#20462;)">
> ---
> 
> Can I make them into more readable UTF-8 strings:
> ---
> authors "Osamu Aoki (青木 修)">
> <maintainer "Osamu Aoki (青木 修)">
> ---
> 
> If noone object, I will...

If you wish, but please, don't touch any file that code may be used
verbatim in other language (all of them are not in UTF-8 yet) since it
will brake some of them (e.g. don't touch the code that is used to
generate POT files, and most of the *.src or *.def files), and do update
translation check of every translation of English pages you will be
editing (./smart_change.pl could be handy for that), I personally don't
care if you do the same changes in translated language, but as you seem
to care, feel free to change them too if they are UTF-8 ready.

For example, to fix your name in http://www.debian.org/doc/user-manuals
source, only in English, run :

  $ ./smart_change.pl -s english/doc/user-manuals.wml

after your change, or directly

  $ ./smart_change.pl -s 's/&#38738;&#26408; &#20462;/青木 修/g' \
                      english/doc/user-manuals.wml

to change every language. Note that the latter command won't work as
expected since you will be editing spanish/doc/user-manuals.wml for
example, which is not yet UTF-8 converted, so you will have to take
extra care of discarding such changes in non UTF-8 language, and bump
translation check anyway if possible since the content didn't change.

On the other hand, you could also wait for every language to be UTF-8
converted, in order to avoid possible mistake here.

Regards

David

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iQIcBAEBCAAGBQJN74QLAAoJELgqIXr9/gnyiAMQAKBxKiaVNb67Yqh4NUuwqZR4
uUwrblVbBYBdOHnp4QrxbGxMu4Li2mhYmf8SEVd2iwiEKLY1BSDUptyVmei1WoSW
pQ2K0VNXDcNZhcrkP5HpFJQg6TblQmFuaQUmwZRJ+xuHho16dN0ysSncQPEDWHD9
AYRWfaSM00NtUwpd4K2qLAGfBk1z+pwb/W7Hwwfy9ApVzd5c2CjW8mBywa5x9xZS
cQJEkAGICb2B35YUDhfJJmlOhTQQ3lNBfs1m5ebwXxfSU5bpkVCzD/qdwKWtTe+d
BeQvPsIUy6dFzJ7E+1P9ZV1ryJ+E1vs36gEMpEPyZtuUqL0A/il1JeMc/MtVdCnf
PWpK+21BmWsFDaisI8V3R/IxfWH5L2fAW5LJG/IxryCUksQctZtDyaCqkbCDmycS
VeMCyPsK+8po9k5Xl9EHfO73QUE/iM3KtNoD6SjRDCHSYwmkKyLiriXKTcVjezKr
i+ioIzXAFplR/dKpi064uGmaaaHEKDZNAQVlhHGGvh76bTzY1PcHcSSWpMYfImnO
cTMUbIEpbRNkQ4Swzo0iopsWU+M31Wahf2tNM8KQ52M86OTlkLVViTR+19yZ+k8E
cMuB227nQIk8S6eUXTcdbpJ97INUayXMg3OJnGeI9Wso25bhPMCs2EyMXlKlNF6r
W4VA/9A9UZeNRcb/Jvc6
=EQl2
-----END PGP SIGNATURE-----




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Wed, 08 Jun 2011 14:33:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Gerfried Fuchs <rhonda@deb.at>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Wed, 08 Jun 2011 14:33:03 GMT) Full text and rfc822 format available.

Message #152 received at 567781@bugs.debian.org (full text, mbox):

From: Gerfried Fuchs <rhonda@deb.at>
To: Osamu Aoki <osamu@debian.org>, 567781@bugs.debian.org
Subject: Re: Bug#567781: Conversion of english pages to Unicode, via HTML entities.
Date: Wed, 8 Jun 2011 16:31:22 +0200
   Hey,

* Osamu Aoki <osamu@debian.org> [2011-06-08 15:21:30 CEST]:
> On Thu, May 26, 2011 at 12:56:06PM +0200, Gerfried Fuchs wrote:
> >  So from now on, commits to english/ that are containing non-ascii
> > characters MUST be utf-8 encoded. We'll see what might (or might not)
> > break after the next build which is expected in half an hour.
> 
> File is in UTF-8 encoded but this conversion did not convert funkey
> entities to more readable UTF-8 text.

 Where there were entities before they should still be left as entities
until all languages are converted into utf8. There are still a few left,
and the places we made use of entities in the english files they are
usually included verbatim into otherwise translated pages, so changing
them from entities to utf8 at this stage of the transition would break
the pages for those languages.

 So the same caveat still applies: In places that are extracted verbatim
into other language pages we still have to stick to plain ascii with
entities. This especially holds true for the DPN footers and the doc
data files.

 Thanks,
Rhonda
-- 
Fühlst du dich mutlos, fass endlich Mut, los      |
Fühlst du dich hilflos, geh raus und hilf, los    | Wir sind Helden
Fühlst du dich machtlos, geh raus und mach, los   | 23.55: Alles auf Anfang
Fühlst du dich haltlos, such Halt und lass los    |




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Wed, 08 Jun 2011 19:57:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Damyan Ivanov <dmn@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Wed, 08 Jun 2011 19:57:03 GMT) Full text and rfc822 format available.

Message #157 received at 567781@bugs.debian.org (full text, mbox):

From: Damyan Ivanov <dmn@debian.org>
To: debian-www@lists.debian.org
Cc: 567781@bugs.debian.org
Subject: Re: Bug#567781: Conversion of english pages to Unicode, via HTML entities.
Date: Wed, 8 Jun 2011 22:53:34 +0300
[Message part 1 (text/plain, inline)]
-=| victory, Thu, Jun 09, 2011 at 03:54:51AM +0900 |=-
> On Wed, 08 Jun 2011 10:15:41 -0400
> David Prévot wrote:
> 
> > > For example:
> > > authors "Osamu Aoki (&#38738;&#26408; &#20462;)">
> > > <maintainer "Osamu Aoki (&#38738;&#26408; &#20462;)">
> > > Can I make them into more readable UTF-8 strings:
> > > authors "Osamu Aoki (青木 修)">
> > > <maintainer "Osamu Aoki (青木 修)">
> > > If noone object, I will...
> 
> > If you wish, but please, don't touch any file that code may be used
> > verbatim in other language (all of them are not in UTF-8 yet) since it
> > will brake some of them (e.g. don't touch the code that is used to
> > generate POT files, and most of the *.src or *.def files), and do update
> > translation check of every translation of English pages you will be
> > editing (./smart_change.pl could be handy for that), I personally don't
> > care if you do the same changes in translated language, but as you 
> > seem
> > to care, feel free to change them too if they are UTF-8 ready.
> 
> -1;
> NOT all editors handle utf8 files correctly,

Is that so? Please file bugs (and use another editor in the mean 
time).

> so it's NOT good to change all translations.
> apparently it will get into some trouble.
> - the editor i'm using does not break most of langs but does break some..

Which one is that? Would be nice to know to avoid it :)

> - well, assumed that the editor does break nothing,
>   but please think how to edit those.
>   I don't want to edit files which have strings I can't read/input

So you can read &#38738; but not 青? To me the first is a complete 
enigma, while the second is a far-eastern hieroglyph (which meaning is 
still unknown to me, but hey, one can't know everything).

>   such as accent'ed characters and russian, arabian, etc.

You shouldn't have to edit them. If you need to change a file 
containing unknown characters, just don't change the text in foreign 
writing.
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Sat, 13 Aug 2011 22:15:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to "Fernando C. Estrada" <fcestrada@fcestrada.com>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Sat, 13 Aug 2011 22:15:03 GMT) Full text and rfc822 format available.

Message #162 received at 567781@bugs.debian.org (full text, mbox):

From: "Fernando C. Estrada" <fcestrada@fcestrada.com>
To: 567781@bugs.debian.org
Cc: debian-l10n-spanish@lists.debian.org
Subject: Re: www.debian.org: converting the website to UTF-8
Date: Sat, 13 Aug 2011 17:11:30 -0500
Hi,

Thanks to Francesca the d-l10n-spanish pages are now in UTF-8.

$ grep CHARSET */.wmlrc |grep -vi utf | cut -d "/" -f1 | sort
czech
indonesian
korean
lithuanian
polish

Regards,
-- 
Fernando C. Estrada

We apologize for the inconvenience, but we'd still like yout to test out
this kernel.
		-- Linus Torvalds, announcing another kernel patch




Summary recorded from message bug 567781 message 162 Request was from Simon Paillard <spaillard@debian.org> to control@bugs.debian.org. (Sat, 27 Aug 2011 17:00:03 GMT) Full text and rfc822 format available.

Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Thu, 08 Dec 2011 13:21:16 GMT) Full text and rfc822 format available.

Acknowledgement sent to David Prévot <taffit@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Thu, 08 Dec 2011 13:21:24 GMT) Full text and rfc822 format available.

Message #169 received at 567781@bugs.debian.org (full text, mbox):

From: David Prévot <taffit@debian.org>
To: Marcin Owsiany <porridge@debian.org>
Cc: 567781@bugs.debian.org
Subject: Re: CVS webwml/english/distrib
Date: Thu, 08 Dec 2011 09:18:45 -0400
[Message part 1 (text/plain, inline)]
Hi Marcin,

Le 08/12/2011 07:56, CVS User porridge a écrit :
> Update of /cvs/webwml/webwml/english/distrib
[…]
> +# NOTE: ONLY USE ASCII CHARACTERS IN THIS FILE.
> +# DATA FROM THIS FILE IS INCLUDED IN SOME TRANSLATIONS WHICH ARE NOT UTF-8.

Why not addressing the root problem and fix #567781 once and for all?

$ grep CHARSET */.wmlrc |grep -vi utf | cut -d "/" -f1
czech
indonesian
korean
lithuanian
polish

If you need help to convert the Polish translation to UTF-8, please say
so: it's the last “big” (over 8% of translated pages) and active
language that is stopping us from moving on, we can even prepare the
conversion and let you check that we didn't broke stuff.

The other four languages have not been updated for a long time (and are
less than 3% translated). If volunteers pop up, we would be happy help
them. If not, I'd be in favor of deactivating them until someone is
ready to take care of the conversion (i.e reviewing it to check if we
didn't break it) and the maintenance.

Regards

David

[signature.asc (application/pgp-signature, attachment)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Thu, 08 Dec 2011 14:42:17 GMT) Full text and rfc822 format available.

Acknowledgement sent to Marcin Owsiany <porridge@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Thu, 08 Dec 2011 14:42:17 GMT) Full text and rfc822 format available.

Message #174 received at 567781@bugs.debian.org (full text, mbox):

From: Marcin Owsiany <porridge@debian.org>
To: David Prévot <taffit@debian.org>
Cc: 567781@bugs.debian.org
Subject: Re: CVS webwml/english/distrib
Date: Thu, 8 Dec 2011 15:41:24 +0100
On Thu, Dec 08, 2011 at 09:18:45AM -0400, David Prévot wrote:
> Hi Marcin,
> 
> Le 08/12/2011 07:56, CVS User porridge a écrit :
> > Update of /cvs/webwml/webwml/english/distrib
> […]
> > +# NOTE: ONLY USE ASCII CHARACTERS IN THIS FILE.
> > +# DATA FROM THIS FILE IS INCLUDED IN SOME TRANSLATIONS WHICH ARE NOT UTF-8.
> 
> Why not addressing the root problem and fix #567781 once and for all?

That would be ideal, but I don't have the time to do it - I imagine
doing it properly would require setting up a test website installation
to ensure sanity, and then and synchronizing with the build schedule to
ensure consistency, given that CVS does not support atomic commits.

> $ grep CHARSET */.wmlrc |grep -vi utf | cut -d "/" -f1
> czech
> indonesian
> korean
> lithuanian
> polish
> 
> If you need help to convert the Polish translation to UTF-8, please say
> so: it's the last “big” (over 8% of translated pages) and active
> language that is stopping us from moving on, we can even prepare the
> conversion and let you check that we didn't broke stuff.

I'm going to have intermittent internet access for ~the next month. I'd
rather do that in January, unless someone else steps up to do the
verification.

-- 
Marcin Owsiany <porridge@debian.org>             http://marcin.owsiany.pl/
GnuPG: 1024D/60F41216  FE67 DA2D 0ACA FC5E 3F75  D6F6 3A0D 8AA0 60F4 1216




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Thu, 08 Dec 2011 16:36:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to David Prévot <taffit@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Thu, 08 Dec 2011 16:36:03 GMT) Full text and rfc822 format available.

Message #179 received at 567781@bugs.debian.org (full text, mbox):

From: David Prévot <taffit@debian.org>
To: Marcin Owsiany <porridge@debian.org>, 567781@bugs.debian.org
Subject: Re: Bug#567781: CVS webwml/english/distrib
Date: Thu, 08 Dec 2011 12:32:40 -0400
[Message part 1 (text/plain, inline)]
Le 08/12/2011 10:41, Marcin Owsiany a écrit :
> On Thu, Dec 08, 2011 at 09:18:45AM -0400, David Prévot wrote:

>> Why not addressing the root problem and fix #567781 once and for all?
> 
> That would be ideal, but I don't have the time to do it - I imagine
> doing it properly would require setting up a test website installation
> to ensure sanity, and then and synchronizing with the build schedule to
> ensure consistency, given that CVS does not support atomic commits.

About the sync, the build happens at (3,7,11,15,19,23):24 UTC, the CVS
update is generally finished 5 minutes later, so it offers 6 windows of
3h55 a day if you really don't want to mess up during the build. If you
are very unlucky (and commit during the “cvs up”), CVS is clever enough
to re-update the files that are being committed at the end of the
update, but if you manage to not push all files in one single commit, at
worst, the Polish web will be partly ugly during 4 hours (until the next
rebuild), but you may be lucky (that happens too ;-) and then find a
webmaster available at this moment to fix the stuff and trigger a manual
rebuild of the Polish part followed by a mirror sync.

>> $ grep CHARSET */.wmlrc |grep -vi utf | cut -d "/" -f1
>> czech
>> indonesian
>> korean
>> lithuanian
>> polish
>>
>> If you need help to convert the Polish translation to UTF-8, please say
>> so: it's the last “big” (over 8% of translated pages) and active
>> language that is stopping us from moving on, we can even prepare the
>> conversion and let you check that we didn't broke stuff.
> 
> I'm going to have intermittent internet access for ~the next month. I'd
> rather do that in January, unless someone else steps up to do the
> verification.

I just prepared it in case you or someone you trust is able to check if
something went wrong : http://tilapin.org/debian/

I also prepared the four other conversions on this test website, if
someone want to check nothing obvious went wrong for those too (well, I
have a poor bandwidth, if it's too painful, I'll push a test website
elsewhere).

There are sometime a few glitches after an UTF-8 conversion we are able
to spot thanks to the validation and tidy run, and even fix without the
help of translators BTW.

Regards

David

[signature.asc (application/pgp-signature, attachment)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Sat, 10 Dec 2011 21:03:07 GMT) Full text and rfc822 format available.

Acknowledgement sent to Marcin Owsiany <porridge@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Sat, 10 Dec 2011 21:03:07 GMT) Full text and rfc822 format available.

Message #184 received at 567781@bugs.debian.org (full text, mbox):

From: Marcin Owsiany <porridge@debian.org>
To: David Prévot <taffit@debian.org>
Cc: 567781@bugs.debian.org
Subject: Re: Bug#567781: CVS webwml/english/distrib
Date: Sat, 10 Dec 2011 21:58:35 +0100
On Thu, Dec 08, 2011 at 12:32:40PM -0400, David Prévot wrote:
> Le 08/12/2011 10:41, Marcin Owsiany a écrit :
> > On Thu, Dec 08, 2011 at 09:18:45AM -0400, David Prévot wrote:
> 
> >> Why not addressing the root problem and fix #567781 once and for all?
> > 
> > That would be ideal, but I don't have the time to do it - I imagine
> > doing it properly would require setting up a test website installation
> > to ensure sanity, and then and synchronizing with the build schedule to
> > ensure consistency, given that CVS does not support atomic commits.
> 
> About the sync, the build happens at (3,7,11,15,19,23):24 UTC, the CVS
> update is generally finished 5 minutes later, so it offers 6 windows of
> 3h55 a day if you really don't want to mess up during the build. If you
> are very unlucky (and commit during the “cvs up”), CVS is clever enough
> to re-update the files that are being committed at the end of the
> update, but if you manage to not push all files in one single commit, at
> worst, the Polish web will be partly ugly during 4 hours (until the next
> rebuild), but you may be lucky (that happens too ;-) and then find a
> webmaster available at this moment to fix the stuff and trigger a manual
> rebuild of the Polish part followed by a mirror sync.
> 
> >> $ grep CHARSET */.wmlrc |grep -vi utf | cut -d "/" -f1
> >> czech
> >> indonesian
> >> korean
> >> lithuanian
> >> polish
> >>
> >> If you need help to convert the Polish translation to UTF-8, please say
> >> so: it's the last “big” (over 8% of translated pages) and active
> >> language that is stopping us from moving on, we can even prepare the
> >> conversion and let you check that we didn't broke stuff.
> > 
> > I'm going to have intermittent internet access for ~the next month. I'd
> > rather do that in January, unless someone else steps up to do the
> > verification.
> 
> I just prepared it in case you or someone you trust is able to check if
> something went wrong : http://tilapin.org/debian/
> 
> I also prepared the four other conversions on this test website, if
> someone want to check nothing obvious went wrong for those too (well, I
> have a poor bandwidth, if it's too painful, I'll push a test website
> elsewhere).

I had a look at a few pages and it looks OK.
Please go ahead.

-- 
Marcin Owsiany <porridge@debian.org>             http://marcin.owsiany.pl/
GnuPG: 1024D/60F41216  FE67 DA2D 0ACA FC5E 3F75  D6F6 3A0D 8AA0 60F4 1216




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#567781; Package www.debian.org. (Sat, 10 Dec 2011 22:09:07 GMT) Full text and rfc822 format available.

Acknowledgement sent to David Prévot <taffit@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Sat, 10 Dec 2011 22:09:07 GMT) Full text and rfc822 format available.

Message #189 received at 567781@bugs.debian.org (full text, mbox):

From: David Prévot <taffit@debian.org>
To: Marcin Owsiany <porridge@debian.org>, Miroslav Kure <kurem@upcase.inf.upol.cz>, Juraj Kubelka <Juraj.Kubelka@email.cz>, Mahyuddin Susanto <udienz@debian-id.org>, Junaedi Kartawijaya <milisdebian@yahoo.com>, Izharul Haq <atoz@debian-id.org>, "Woo-il Song (송우일)" <wooil@debian.or.kr>, "Seongtae Yoo (유성태)" <alloying@nownuri.net>, "Hyun-Gwan Seo (서현관)" <westporch@gmail.com>, Martynas Sklizmantas <saint@ghost.lt>
Cc: 567781@bugs.debian.org, debian-l10n-czech@lists.debian.org, debian-l10n-indonesian@lists.debian.org, debian-l10n-korean@lists.debian.org
Subject: Moving the website to UTF-8
Date: Sat, 10 Dec 2011 18:04:45 -0400
[Message part 1 (text/plain, inline)]
Le 10/12/2011 16:58, Marcin Owsiany a écrit :
> On Thu, Dec 08, 2011 at 12:32:40PM -0400, David Prévot wrote:
>> Le 08/12/2011 10:41, Marcin Owsiany a écrit :
>>> On Thu, Dec 08, 2011 at 09:18:45AM -0400, David Prévot wrote:

>> I just prepared it in case you or someone you trust is able to check if
>> something went wrong : http://tilapin.org/debian/

> I had a look at a few pages and it looks OK.
> Please go ahead.

Thanks, it's done, wait a few hours for the full build to complete on
www-master. Next files should of course be committed in UTF-8 only ;-).

Now only four languages are missing to close this bug:

$ grep CHARSET */.wmlrc |grep -vi utf | cut -d "/" -f1
czech
indonesian
korean
lithuanian

Coordinators of translations (and localizations team CCed), could you
please consider moving your website translation to UTF-8?

If you don't feel comfortable to do it yourself, could you please
confirm that your translated pages are fine on my test website:

	http://tilapin.org/debian/

Regards

David

[signature.asc (application/pgp-signature, attachment)]

Reply sent to David Prévot <taffit@debian.org>:
You have taken responsibility. (Mon, 02 Jan 2012 20:36:11 GMT) Full text and rfc822 format available.

Notification sent to Simon Paillard <simon.paillard@resel.enst-bretagne.fr>:
Bug acknowledged by developer. (Mon, 02 Jan 2012 20:36:11 GMT) Full text and rfc822 format available.

Message #194 received at 567781-done@bugs.debian.org (full text, mbox):

From: David Prévot <taffit@debian.org>
To: 567781-done@bugs.debian.org
Cc: Miroslav Kure <kurem@upcase.inf.upol.cz>, Juraj Kubelka <Juraj.Kubelka@email.cz>, "Woo-il Song (송우일)" <wooil@debian.or.kr>, "Seongtae Yoo (유성태)" <alloying@nownuri.net>, "Hyun-Gwan Seo (서현관)" <westporch@gmail.com>, Martynas Sklizmantas <saint@ghost.lt>, debian-l10n-czech@lists.debian.org, debian-l10n-korean@lists.debian.org
Subject: Re: Bug#567781: Moving the website to UTF-8
Date: Mon, 02 Jan 2012 16:32:32 -0400
[Message part 1 (text/plain, inline)]
Le 10/12/2011 18:04, David Prévot a écrit :

> Now only four languages are missing to close this bug:

Three actually: Indonesian has been taken care of in the mean time,
thanks to Izharul Haq.

> $ grep CHARSET */.wmlrc |grep -vi utf | cut -d "/" -f1
> czech
> korean
> lithuanian
> 
> Coordinators of translations (and localizations team CCed), could you
> please consider moving your website translation to UTF-8?

Actually, with the move to isoquery to handle country name translations,
non-UTF-8 languages were partly broken, so I just took care of the
remaining ones. Translators, if you notice any issue (after the next
rebuild), please get in touch with us if you need help fixing
(hopefully, nothing will go wrong, but well…).

Regards

David

--

13:43 < taffit> Maybe next year could be an UTF-8 one […]
13:45 < MadameZou>  \o/

[signature.asc (application/pgp-signature, attachment)]

Bug archived. Request was from Debbugs Internal Request <owner@bugs.debian.org> to internal_control@bugs.debian.org. (Tue, 31 Jan 2012 07:44:05 GMT) Full text and rfc822 format available.

Send a report that this bug log contains spam.


Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Sat Apr 19 18:06:59 2014; Machine Name: buxtehude.debian.org

Debian Bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.