Debian Bug report logs - #657557
packages.debian.org: Missing long descriptions

Package: www.debian.org; Maintainer for www.debian.org is Debian WWW Team <debian-www@lists.debian.org>;

Reported by: "Michał Kułach" <michalkulach@gmail.com>

Date: Thu, 26 Jan 2012 23:51:02 UTC

Severity: serious

Tags: confirmed, patch

Merged with 660961

Done: "Michał Kułach" <michalkulach@gmail.com>

Bug is archived. No further changes may be made.



Report forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#657557; Package www.debian.org. (Thu, 26 Jan 2012 23:51:05 GMT) Full text and rfc822 format available.

Acknowledgement sent to "Michał Kułach" <michalkulach@gmail.com>:
New Bug report received and forwarded. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Thu, 26 Jan 2012 23:51:05 GMT) Full text and rfc822 format available.

Message #5 received at submit@bugs.debian.org (full text, mbox):

From: "Michał Kułach" <michalkulach@gmail.com>
To: "submit@bugs.debian.org" <submit@bugs.debian.org>
Subject: packages.debian.org: Missing long descriptions
Date: Fri, 27 Jan 2012 00:48:47 +0100
Package: www.debian.org
Severity: serious

All long descriptions from testing, unstable and experimental are missing
from packages.debian.org. Sorry if I overestimated the severity.

-- 
Michał Kułach (pl DDTP team)




Added tag(s) confirmed. Request was from Simon Paillard <spaillard@debian.org> to control@bugs.debian.org. (Sun, 29 Jan 2012 20:39:02 GMT) Full text and rfc822 format available.

Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#657557; Package www.debian.org. (Tue, 07 Feb 2012 18:54:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Filipus Klutiero <chealer@gmail.com>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Tue, 07 Feb 2012 18:54:03 GMT) Full text and rfc822 format available.

Message #12 received at 657557@bugs.debian.org (full text, mbox):

From: Filipus Klutiero <chealer@gmail.com>
To: 657557@bugs.debian.org, 657557-subscribe@bugs.debian.org
Subject: Cause
Date: Tue, 07 Feb 2012 13:50:05 -0500
This was caused by the transition to description-less Packages indices 
(see http://lists.debian.org/debian-devel/2012/02/msg00149.html ).




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#657557; Package www.debian.org. (Tue, 21 Feb 2012 01:45:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Cyril Brulebois <kibi@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Tue, 21 Feb 2012 01:45:03 GMT) Full text and rfc822 format available.

Message #17 received at 657557@bugs.debian.org (full text, mbox):

From: Cyril Brulebois <kibi@debian.org>
To: Michał Kułach <michalkulach@gmail.com>, 657557@bugs.debian.org
Cc: control@bugs.debian.org
Subject: Re: Bug#657557: packages.debian.org: Missing long descriptions
Date: Tue, 21 Feb 2012 02:44:01 +0100
tag 657557 patch pending
thanks

Michał Kułach <michalkulach@gmail.com> (27/01/2012):
> Package: www.debian.org
> Severity: serious
> 
> All long descriptions from testing, unstable and experimental are
> missing from packages.debian.org.

I think my patches should solve that. I'll send them as follow-ups to
this mail through git send-email.

> Sorry if I overestimate severity.

As you might have seen, Simon tagged the bug “confirmed”, meaning you
didn't. ;)

Mraw,
KiBi.

Added tag(s) pending and patch. Request was from Cyril Brulebois <kibi@debian.org> to control@bugs.debian.org. (Tue, 21 Feb 2012 01:45:04 GMT) Full text and rfc822 format available.

Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#657557; Package www.debian.org. (Tue, 21 Feb 2012 02:09:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Cyril Brulebois <kibi@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Tue, 21 Feb 2012 02:09:03 GMT) Full text and rfc822 format available.

Message #24 received at 657557@bugs.debian.org (full text, mbox):

From: Cyril Brulebois <kibi@debian.org>
To: 657557@bugs.debian.org
Subject: Fix for #657557
Date: Tue, 21 Feb 2012 03:06:04 +0100
***
*** DISCLAIMER:
***   I don't run a whole instance; I only played with a few
***   languages and a few suites, checking that I could perform
***   packages → long descriptions (and translations) lookups
***   again once the patches were applied and run-parts was done.
***


Tiny walkthrough for those patches against debian-master:

[PATCH 1/4] Download English translations files.
 → Already applied on the Debian instance according to Rhonda.

[PATCH 2/4] Add support for long descriptions in suites above squeeze.
 → Restores long descriptions, but there's a translation→package
   dependency (when translations are updated) which disappears with
   the next two patches.

[PATCH 3/4] Add support for --english-only to parse-translations.
[PATCH 4/4] Extract English translations before processing packages.
 → I think the commit messages really say what they do: they make
   processing work in a single run.


If troubles arise from applying those patches, I could be tricked
into helping to fix it up.

Mraw,
KiBi.




Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#657557; Package www.debian.org. (Tue, 21 Feb 2012 02:09:05 GMT) Full text and rfc822 format available.

Acknowledgement sent to Cyril Brulebois <kibi@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Tue, 21 Feb 2012 02:09:05 GMT) Full text and rfc822 format available.

Message #29 received at 657557@bugs.debian.org (full text, mbox):

From: Cyril Brulebois <kibi@debian.org>
To: 657557@bugs.debian.org
Cc: Cyril Brulebois <kibi@debian.org>
Subject: [PATCH 1/4] Download English translations files.
Date: Tue, 21 Feb 2012 03:06:05 +0100
This will be required by the next commits.
---
 config.sh.sed.in |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/config.sh.sed.in b/config.sh.sed.in
index 9cd4c3f..ea9b355 100644
--- a/config.sh.sed.in
+++ b/config.sh.sed.in
@@ -41,7 +41,7 @@ search_url="/search"
 # Architectures
 # FIXME: unhardcode archs and suites
 polangs="bg de fi fr hu ja nl pl ru sk sv uk zh-cn zh-tw"
-ddtplangs="ca cs da de eo es eu fi fr hu it ja km ko nl pl pt pt-br ru sk sv uk zh zh-cn zh-tw"
+ddtplangs="ca cs da de en eo es eu fi fr hu it ja km ko nl pl pt pt-br ru sk sv uk zh zh-cn zh-tw"
 archives="us security debports backports volatile"
 sections="main contrib non-free"
 parts="$sections"
-- 
1.7.2.5





Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#657557; Package www.debian.org. (Tue, 21 Feb 2012 02:09:07 GMT) Full text and rfc822 format available.

Acknowledgement sent to Cyril Brulebois <kibi@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Tue, 21 Feb 2012 02:09:07 GMT) Full text and rfc822 format available.

Message #34 received at 657557@bugs.debian.org (full text, mbox):

From: Cyril Brulebois <kibi@debian.org>
To: 657557@bugs.debian.org
Cc: Cyril Brulebois <kibi@debian.org>
Subject: [PATCH 2/4] Add support for long descriptions in suites above squeeze.
Date: Tue, 21 Feb 2012 03:06:06 +0100
Use description-md5 from English translations (when available) to
restore the long descriptions during package processing.

Fix for:
  http://bugs.debian.org/657557

Archive change:
  http://lists.debian.org/debian-devel-announce/2012/01/msg00004.html
---
 bin/parse-packages |   23 +++++++++++++++++++++++
 1 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/bin/parse-packages b/bin/parse-packages
index a1c8d98..b926982 100755
--- a/bin/parse-packages
+++ b/bin/parse-packages
@@ -59,6 +59,10 @@ $/ = "";
 -d "$DBDIR/xapian.old" && rmtree("$DBDIR/xapian.old");
 mkpath( "$DBDIR/xapian.new" );
 
+# Needed to compensate removal of long descriptions from Packages files:
+my %descriptions_translated_db;
+tie %descriptions_translated_db, "DB_File", "files/db/descriptions_translated.db", O_RDONLY, 0666, $DB_BTREE;
+
 for my $suite (@SUITES) {
     my %package_names_suite = ();
     my %packages_all_db;
@@ -129,6 +133,23 @@ for my $suite (@SUITES) {
 		    $data{'tag'} = join ", ", @tags;
 		}
 
+		# If description-md5 is present, use a lookup, thanks
+		# to the English translation which got processed right
+		# before:
+		if ($data{'description-md5'}) {
+		    # The short description is a nice fallback:
+		    my $description = $data{'description'};
+		    my $lookup = $descriptions_translated_db{$data{'description-md5'}};
+		    if ($lookup) {
+			while ($lookup =~ /([^\001]*)\001([^\000]*)\000/g) {
+			    my ($language, $translated_description) = ($1, $2);
+			    $description = $translated_description
+				if $language eq 'en';
+			}
+			$data{'description'} = $description;
+		    }
+		}
+
 		# we add some additional data here
 		my $descr = "$data{'description'}\000$data{'package'}\000"
 		    .($data{'tag'}||'');
@@ -194,6 +215,8 @@ for my $suite (@SUITES) {
     untie %packages_all_db;
 }
 
+untie %descriptions_translated_db;
+
 print "Writing databases...\n";
 my %packages_small_db;
 tie %packages_small_db, "DB_File", "$DBDIR/packages_small.db.new",
-- 
1.7.2.5
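
The lookup added by this patch can be sketched in Python for readers who do not read Perl. This is a minimal illustration, not the project's code. The record layout (language, byte 0x01, description, byte 0x00, repeated) is inferred from the regular expression in the patch, and `pick_english` and the sample record are hypothetical names and data introduced here.

```python
import re

# Assumed record layout, inferred from the regex in the patch:
# each entry is "<language>\x01<description>\x00", concatenated.
ENTRY = re.compile(r"([^\x01]*)\x01([^\x00]*)\x00")


def pick_english(packed, fallback):
    """Return the 'en' description from a packed record, or the fallback."""
    description = fallback  # the short description is the fallback
    for language, text in ENTRY.findall(packed or ""):
        if language == "en":
            description = text
    return description


# Hypothetical record with two languages, for illustration only.
record = "de\x01Der GNU-C-Compiler\x00en\x01GNU C compiler\n long text\x00"
print(pick_english(record, "GNU C compiler"))
```

As in the patch, a record without an 'en' entry, or a missing record, leaves the short description in place.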





Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#657557; Package www.debian.org. (Tue, 21 Feb 2012 02:09:09 GMT) Full text and rfc822 format available.

Acknowledgement sent to Cyril Brulebois <kibi@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Tue, 21 Feb 2012 02:09:09 GMT) Full text and rfc822 format available.

Message #39 received at 657557@bugs.debian.org (full text, mbox):

From: Cyril Brulebois <kibi@debian.org>
To: 657557@bugs.debian.org
Cc: Cyril Brulebois <kibi@debian.org>
Subject: [PATCH 3/4] Add support for --english-only to parse-translations.
Date: Tue, 21 Feb 2012 03:06:07 +0100
That makes it possible to extract the long descriptions from English
translations since they got dropped from Packages files in some suites.
---
 bin/parse-translations |   19 +++++++++++++++----
 1 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/bin/parse-translations b/bin/parse-translations
index d079b7f..00611a9 100755
--- a/bin/parse-translations
+++ b/bin/parse-translations
@@ -39,6 +39,17 @@ use Packages::Config qw( $TOPDIR $DBDIR @DDTP_LANGUAGES );
 &Packages::Config::init( './' );
 my %descriptions = ();
 
+# Make it possible to deal with either only English translations (to
+# get long descriptions back), or all of them (the default). Since a
+# single option needs to be supported, don't bother with getopt:
+my @langs = @DDTP_LANGUAGES;
+my $output = 'descriptions_translated';
+my $argument = shift @ARGV;
+if ($argument and $argument eq '--english-only') {
+    $output = 'descriptions_translated_english_only';
+    @langs = ('en');
+}
+
 $/ = "";
 
 -d $DBDIR || mkpath( $DBDIR );
@@ -50,7 +61,7 @@ my $fixja = Text::Iconv->new("EUC-JP", "UTF-8");
 # FIXME: unhardcode dists name
 my @dists = ('sid', 'wheezy', 'squeeze', 'lenny');
 
-foreach my $lang (@DDTP_LANGUAGES) {
+foreach my $lang (@langs) {
     (my $locale = $lang) =~ s/^([a-z]{2})-([a-z]{2})$/"$1_".uc($2)/e;
     print "Reading Translations for $lang ($locale)...";
     my $count = 0;
@@ -87,7 +98,7 @@ close PKG;
 
 print "Writing database (".scalar(keys %descriptions)." unique descriptions)...\n";
 my %descriptions_db;
-tie %descriptions_db, "DB_File", "$DBDIR/descriptions_translated.db.new",
+tie %descriptions_db, "DB_File", "$DBDIR/$output.db.new",
 	O_RDWR|O_CREAT, 0666, $DB_BTREE
 	or die "Error creating DB: $!";
 while (my ($md5, $v) = each(%descriptions)) {
@@ -104,5 +115,5 @@ while (my ($md5, $v) = each(%descriptions)) {
 }
 untie %descriptions_db;
 
-rename("$DBDIR/descriptions_translated.db.new",
-       "$DBDIR/descriptions_translated.db");
+rename("$DBDIR/$output.db.new",
+       "$DBDIR/$output.db");
-- 
1.7.2.5
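
The option handling and the language-to-locale mapping touched by this patch can be sketched in Python as follows. It is only an illustration under stated assumptions: `DDTP_LANGUAGES` is a hypothetical stand-in for the configured list, and `select_run` and `locale_for` are names introduced here, not functions of bin/parse-translations.

```python
import re

# Hypothetical stand-in for @DDTP_LANGUAGES from the site configuration.
DDTP_LANGUAGES = ["de", "en", "fr", "pt-br", "zh-cn"]


def select_run(argv):
    """Return (languages, output database basename) for a given argv."""
    if argv and argv[0] == "--english-only":
        return ["en"], "descriptions_translated_english_only"
    return list(DDTP_LANGUAGES), "descriptions_translated"


def locale_for(lang):
    """Map 'pt-br' to 'pt_BR'; leave two-letter codes such as 'en' as-is."""
    return re.sub(
        r"^([a-z]{2})-([a-z]{2})$",
        lambda m: m.group(1) + "_" + m.group(2).upper(),
        lang,
    )


langs, output = select_run(["--english-only"])
print(langs, output + ".db", [locale_for(l) for l in DDTP_LANGUAGES])
```

Writing the English-only run to its own database file keeps it from overwriting the database produced by the full run.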





Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#657557; Package www.debian.org. (Tue, 21 Feb 2012 02:09:11 GMT) Full text and rfc822 format available.

Acknowledgement sent to Cyril Brulebois <kibi@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Tue, 21 Feb 2012 02:09:11 GMT) Full text and rfc822 format available.

Message #44 received at 657557@bugs.debian.org (full text, mbox):

From: Cyril Brulebois <kibi@debian.org>
To: 657557@bugs.debian.org
Cc: Cyril Brulebois <kibi@debian.org>
Subject: [PATCH 4/4] Extract English translations before processing packages.
Date: Tue, 21 Feb 2012 03:06:08 +0100
Since long descriptions are determined using English translations (for
some suites only, right now), the translations need to be processed
before the Packages files.
---
 bin/parse-packages        |   14 ++++++++------
 cron.d/200process_archive |    2 ++
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/bin/parse-packages b/bin/parse-packages
index b926982..e75a500 100755
--- a/bin/parse-packages
+++ b/bin/parse-packages
@@ -60,8 +60,8 @@ $/ = "";
 mkpath( "$DBDIR/xapian.new" );
 
 # Needed to compensate removal of long descriptions from Packages files:
-my %descriptions_translated_db;
-tie %descriptions_translated_db, "DB_File", "files/db/descriptions_translated.db", O_RDONLY, 0666, $DB_BTREE;
+my %descriptions_english_db;
+tie %descriptions_english_db, "DB_File", "files/db/descriptions_translated_english_only.db", O_RDONLY, 0666, $DB_BTREE;
 
 for my $suite (@SUITES) {
     my %package_names_suite = ();
@@ -139,11 +139,13 @@ for my $suite (@SUITES) {
 		if ($data{'description-md5'}) {
 		    # The short description is a nice fallback:
 		    my $description = $data{'description'};
-		    my $lookup = $descriptions_translated_db{$data{'description-md5'}};
+		    my $lookup = $descriptions_english_db{$data{'description-md5'}};
 		    if ($lookup) {
+			# There should only be an English translation
+			# in there, but let's make sure:
 			while ($lookup =~ /([^\001]*)\001([^\000]*)\000/g) {
-			    my ($language, $translated_description) = ($1, $2);
-			    $description = $translated_description
+			    my ($language, $english_description) = ($1, $2);
+			    $description = $english_description
 				if $language eq 'en';
 			}
 			$data{'description'} = $description;
@@ -215,7 +217,7 @@ for my $suite (@SUITES) {
     untie %packages_all_db;
 }
 
-untie %descriptions_translated_db;
+untie %descriptions_english_db;
 
 print "Writing databases...\n";
 my %packages_small_db;
diff --git a/cron.d/200process_archive b/cron.d/200process_archive
index b8f6a6d..29a7385 100755
--- a/cron.d/200process_archive
+++ b/cron.d/200process_archive
@@ -5,6 +5,8 @@
 cd "$topdir"
 
 date
+./bin/parse-translations --english-only
+date
 ./bin/parse-packages
 date
 ./bin/parse-sources
-- 
1.7.2.5





Information forwarded to debian-bugs-dist@lists.debian.org, kibi@debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#657557; Package www.debian.org. (Wed, 22 Feb 2012 00:57:07 GMT) Full text and rfc822 format available.

Acknowledgement sent to Goswin von Brederlow <goswin-v-b@web.de>:
Extra info received and forwarded to list. Copy sent to kibi@debian.org, Debian WWW Team <debian-www@lists.debian.org>. (Wed, 22 Feb 2012 00:57:07 GMT) Full text and rfc822 format available.

Message #49 received at 657557@bugs.debian.org (full text, mbox):

From: Goswin von Brederlow <goswin-v-b@web.de>
To: Debian Bug Tracking System <657557@bugs.debian.org>
Subject: [PATCH] Alternate patch for missing long descriptions
Date: Wed, 22 Feb 2012 01:54:18 +0100
Package: www.debian.org
Followup-For: Bug #657557

Hi,

Cyril and I disagree about the cause of the missing descriptions and
the fix for it. So could someone impartial please look over both our
patches and see which makes more sense. In both cases the English
translations must be added to ddtplangs ([PATCH 1/4] from Cyril [1]);
I didn't want to repost that.

Here is how I see the problem:

1) Originally Packages.bz2 contained the long description and no
   description-md5 field (it still does for older releases).
2) Translations are indexed by the md5 of the original long description.
3) bin/parse-packages computes description-md5 and stores it in the
   packages database.
4) When generating the webpages the description-md5 is used to lookup
   the translated description with fallback to english and the
   original description.

But in sid the Packages.bz2 files now contain a description-md5 field
and only the short description instead of the short and long
description. This causes the computation in (3) above to compute the
md5sum of only the short description, and then the translation lookup
in (4) fails to find a translation and falls back to the original
description, which is only the short description.



Now my patch (attached) fixes this problem by only computing the
description-md5 field if it is missing in Packages.bz2. A simple
one-line fix. The rest of the code already does all the right things and
looks up the translation correctly, including falling back to 'en' as
needed. E.g.:

----------------- http://localhost/sid/gcc ---------------------------
Package: gcc (4:4.6.2-4) 

GNU C compiler

This is the GNU C compiler, a fairly portable optimizing compiler for C.

This is a dependency package providing the default GNU C compiler.

Tags: Software Development: Compiler, C Development, User Interface: interface::commandline, role::metapackage, Role: Program, Application Suite: suite::gnu, works-with::software:source
Other Packages Related to gcc
...
----------------- http://localhost/de/sid/gcc ------------------------
Paket: gcc (4:4.6.2-4) 

Der GNU-C-Compiler

Dies ist der GNU-C-Compiler, ein recht portabler, optimierender C-Compiler.

Dies ist ein Abhängigkeitspaket, welches den Standard-GNU-C-Compiler zur Verfügung stellt.

Markierungen: Software-Entwicklung: Compiler, C-Entwicklung, Benutzer-Schnittstellen: interface::commandline, role::metapackage, Rolle: Programm, Anwendungs-Suite: suite::gnu, works-with::software:source
Andere Pakete mit Bezug zu gcc
...
------------------ http://localhost/de/sid/3depict ------------------
Paket: 3depict (0.0.9-1 und andere) 

visualisation and analysis for single valued point data

This program provides a graphical interface for the scientific analysis of real valued point data (x,y,z,value). This is primarily targeted towards Atom probe tomography applications, but may prove useful to other applications as well.

Markierungen: Benutzer-Schnittstellen: X-Window-System, Rolle: Programm, GUI-Baukasten: wxWidgets, Zweck: use::analysing, x11::application
Andere Pakete mit Bezug zu 3depict
----------------------------------------------------------------------

3depict has no German translation so the English "translation" is used,
as it should be.
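
The key mismatch described above, and the "compute only if missing" approach, can be sketched in Python. This is a minimal illustration, not the attached patch. It assumes the digest is taken over the description text followed by a newline, which may not match the archive exactly, and the stanzas and function names are hypothetical.

```python
import hashlib


def description_md5(description):
    # Assumption: the digest covers the description text plus a trailing
    # newline; the exact input used by the archive may differ.
    return hashlib.md5((description + "\n").encode("utf-8")).hexdigest()


def key_for(stanza):
    """Prefer the archive-supplied digest; compute one only if absent."""
    existing = stanza.get("description-md5")
    if existing:
        return existing
    return description_md5(stanza["description"])


# Hypothetical stanzas, for illustration only.
full = "GNU C compiler\n This is the GNU C compiler."
short = "GNU C compiler"
old_style = {"description": full}
new_style = {"description": short, "description-md5": description_md5(full)}

# Recomputing from the short text alone gives a different key,
# so a lookup in the translations database would miss.
assert description_md5(short) != description_md5(full)
# Reusing the supplied field yields the same key for both stanza styles.
assert key_for(old_style) == key_for(new_style)
```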




Correct me if I'm wrong but here is how I understand Cyril's patch: It
works by fixing the symptom instead of the problem. In [PATCH 2/4] it
checks if the Packages.bz2 file contains a description-md5 field. If
so it looks up the English translation for the package and replaces
the description with the English translation, thereby restoring the
long description for the package (line 146 with his patch). And now
that the long description has been restored, the computation of
description-md5 a few lines later yields the right value, the one
that is already present. And with the right description-md5 value the
translation lookup when generating the pages works again. [PATCH
3/4] and [PATCH 4/4] then tidy things up and make it possible to do an
English-only run of bin/parse-translations. The reason for this is
that with his patch bin/parse-packages requires an in-sync translation
database to function.


MfG
	Goswin

[1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=657557#29

-- System Information:
Debian Release: wheezy/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)

Kernel: Linux 3.1.0-1-amd64 (SMP w/4 CPU cores)
Locale: LANG=C, LC_CTYPE=de_DE (charmap=ISO-8859-1)
Shell: /bin/sh linked to /bin/dash
[translations.patch (text/x-diff, attachment)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#657557; Package www.debian.org. (Wed, 22 Feb 2012 01:18:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Cyril Brulebois <kibi@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Wed, 22 Feb 2012 01:18:03 GMT) Full text and rfc822 format available.

Message #54 received at 657557@bugs.debian.org (full text, mbox):

From: Cyril Brulebois <kibi@debian.org>
To: Goswin von Brederlow <goswin-v-b@web.de>, 657557@bugs.debian.org
Subject: Re: Bug#657557: [PATCH] Alternate patch for missing long descriptions
Date: Wed, 22 Feb 2012 02:14:35 +0100
Goswin von Brederlow <goswin-v-b@web.de> (22/02/2012):
> Cyril and I disagree about the cause for the missing description and
> the fix for it.

Wrong, again. Please stop harassing me, pretty fucking please.

KiBi.

Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#657557; Package www.debian.org. (Wed, 22 Feb 2012 01:54:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Cyril Brulebois <kibi@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Wed, 22 Feb 2012 01:54:03 GMT) Full text and rfc822 format available.

Message #59 received at 657557@bugs.debian.org (full text, mbox):

From: Cyril Brulebois <kibi@debian.org>
To: Goswin von Brederlow <goswin-v-b@web.de>, 657557@bugs.debian.org
Subject: Re: Bug#657557: [PATCH] Alternate patch for missing long descriptions
Date: Wed, 22 Feb 2012 02:51:55 +0100
For the sake of getting my fellow packages/translations maintainers to
get the whole picture since you insist so much on proposing a wrong
solution and calling mine “papering over the bug”, I'll reply anyway.

Goswin von Brederlow <goswin-v-b@web.de> (22/02/2012):
> Now my patch (attached) fixes this problem by only computing the
> description-md5 field if it is missing in Packages.bz2. A simple one
> line fix. The rest of the code already does all the right things and
> looks up the translation correclty including falling back to 'en' as
> needed.

That's only if you're interested in getting the translations back. Now
go read the block of code right above $data{'description-md5'}…

> Correct me if I'm wrong but here is how I understand Cyrils patch: It
> works by fixing the symptom instead of the problem. In [PATCH 2/4] it
> checks if the Packages.bz2 file contains an description-md5 field, If
> so it looks up the english translation for the package and replaces
> the description with the english translation, thereby restoring the
> long description for the package (line 146 with his patch).

… and this is needed so that storing the description in the database
(what I pointed to above: $descr, $sdescr, etc.) happens properly,
meaning: the long description, not the short one only.

> And now that the long description has been restored the computation of
> description-md5 a few lines later computes to the right value, the one
> that is already present.

As said on IRC, if applied after my patches, you're optimizing out an
MD5 sum in case it's present already; which doesn't hurt. But that
doesn't fix the issue with short descriptions floating around.

Mraw,
KiBi.

Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#657557; Package www.debian.org. (Wed, 22 Feb 2012 09:37:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Goswin von Brederlow <goswin-v-b@web.de>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Wed, 22 Feb 2012 09:37:07 GMT) Full text and rfc822 format available.

Message #64 received at 657557@bugs.debian.org (full text, mbox):

From: Goswin von Brederlow <goswin-v-b@web.de>
To: Cyril Brulebois <kibi@debian.org>
Cc: Goswin von Brederlow <goswin-v-b@web.de>, 657557@bugs.debian.org
Subject: Re: Bug#657557: [PATCH] Alternate patch for missing long descriptions
Date: Wed, 22 Feb 2012 10:34:10 +0100
Cyril Brulebois <kibi@debian.org> writes:

> For the sake of getting my fellow packages/translations maintainers to
> get the whole picture since you insist so much on proposing a wrong
> solution and calling mine “papering over the bug”, I'll reply anyway.

Please don't make up quotes I never said. And thanks for finally
responding with more than "wrong" to support your case, even if it
doesn't explain why long descriptions in the package db are necessary or
give an example of what remains broken with just my patch.

> Goswin von Brederlow <goswin-v-b@web.de> (22/02/2012):
>> Now my patch (attached) fixes this problem by only computing the
>> description-md5 field if it is missing in Packages.bz2. A simple one
>> line fix. The rest of the code already does all the right things and
>> looks up the translation correclty including falling back to 'en' as
>> needed.
>
> That's only if you're interesting in getting the translations back.

Which is what is done when generating the webpages so this seems rather
important to me.

> Now
> go read the block of code right above $data{'description-md5'}…

Let's see:

		# we add some additional data here
		my $descr = "$data{'description'}\000$data{'package'}\000"
		    .($data{'tag'}||'');

Description in database format.

		my $sdescr = $data{'description'};
		$sdescr =~ s/\n.*//os;

Description in text format.

		my $did = undef;
		if (exists($descriptions{$descr})) {
			$did  = $descriptions{$descr};
		} else {
			$did = 1 + $#descriptions;
			$descriptions[$did] = $descr;
			$descriptions{$descr} = $did;
		}

Reuse the same $did if the description already exists.
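
The id assignment in the quoted Perl can be sketched in Python. This is a minimal illustration that assumes the lookup tables start out empty. `make_record`, `assign_dids` and the sample records are hypothetical names and data introduced here. As in the quoted `$descr`, the record joins description, package name and tags with NUL bytes, so the id depends on all three fields.

```python
def make_record(description, package, tags=""):
    # Same layout as $descr in the quoted Perl: fields joined by NUL bytes.
    return description + "\x00" + package + "\x00" + tags


def assign_dids(records):
    """Return one integer id per record, reusing the id of repeats."""
    ids = {}    # record -> id, playing the role of %descriptions
    table = []  # id -> record, playing the role of @descriptions
    dids = []
    for record in records:
        if record not in ids:
            ids[record] = len(table)  # 1 + $#descriptions in Perl
            table.append(record)
        dids.append(ids[record])
    return dids


# Hypothetical records: the same package seen twice, then another one.
records = [
    make_record("GNU C compiler", "gcc"),
    make_record("GNU C compiler", "gcc"),
    make_record("transitional dummy package", "comixcursors"),
]
print(assign_dids(records))
```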

OK, so without your patch this bit of code will share a $did if the
short description matches. That certainly differs from what it used to
do. So let's see what effect that has:

----------------- http://localhost/sid/comixcursors -----------------
Package: comixcursors (0.7.2-2) 

transitional dummy package

ComixCursors is a set of mouse pointer themes for X11 in the style of comic-book art.

This package is transitional to install the right-handed, translucent cursor set, which is now in the ‘comixcursors-righthanded’ package.

Tags: Role: Application Data, Dummy Package, X Window System: Theme
Packages providing comixcursors
...
------------- http://localhost/sid/ttf-aoyagi-soseki -----------------
Package: ttf-aoyagi-soseki (20070207-6) 

transitional dummy package

This package is a dummy transitional package. It can be safely removed.

Tags: Culture: Japanese, Made Of: Font, Role: Standalone Data, Dummy Package
...
----------------------------------------------------------------------

Oh wait, we always use the description from the translation file even if
it is the English "translation".

>> Correct me if I'm wrong but here is how I understand Cyrils patch: It
>> works by fixing the symptom instead of the problem. In [PATCH 2/4] it
>> checks if the Packages.bz2 file contains an description-md5 field, If
>> so it looks up the english translation for the package and replaces
>> the description with the english translation, thereby restoring the
>> long description for the package (line 146 with his patch).
>
> … and this is needed so that storing the description in the database
> (what I pointed to above: $descr, $sdescr, etc.) happens properly,
> meaning: the long description, not the short one only.

True. Without your patch the long description is no longer stored in the
package database, only in the translations database.

What you haven't explained is why that is needed. Without your patch
packages_descriptions.db (only ever used by bin/parse-packages) and
descriptions_packages.db (used in DoShow.pm and Search.pm) will have
bogus entries.

But the package pages show the translation, and searching in the
descriptions also seems to properly look into the English translations.
E.g. searching for "It can be safely removed" gives among others:

--- http://localhost/search?searchon=all&keywords=It+can+be+safely+removed ---
...
Package gnash-opengl
sid (unstable) (oldlibs): dummy package for gnash-opengl removal
0.8.10-3: all 

----------------- http://localhost/sid/gnash-opengl -----------------
Package: gnash-opengl (0.8.10-3) 

dummy package for gnash-opengl removal

This package is a transitional package for gnash-opengl removal.

It can be safely removed when Gnash is installed.

Tags: User Interface: X Window System, Role: Dummy Package, Program, Interface Toolkit: uitoolkit::gtk, use::playing, Supports Format: SWF, ShockWave Flash, Works with: works-with::audio, works-with::video, X Window System: Application
----------------------------------------------------------------------

As you can see, the search string only appears in the long description
(meaning the English translation) and not the short description.

So where is the long description required in the package database?
Where is the descriptions_packages.db relevant? How do I get it to do
something wrong?

Or are they all just relics from before translation support was added?

To me it seems like the unique description id (DID) should be changed
from an integer to the description-md5. No need to search all descriptions
for every package again and again to generate the integer DID when we
already have a unique key in the description-md5 for just that use.

Just an optimization? Sure. But that would optimize away your patch, wouldn't it?

>> And now that the long description has been restored the computation of
>> description-md5 a few lines later computes to the right value, the one
>> that is already present.
>
> As said on IRC, if applied after my patches, you're optimizing out an
> MD5 sum in case it's present already; which doesn't hurt.

I never denied that.

What baffles me though is that you maintain my patch is wrong and at the
same time say it is an optimization.

> But that
> doesn't fix the issue with short descriptions floating around.

What issue is that?

> Mraw,
> KiBi.

MfG
        Goswin

PS: BACKEND is out of sync with the rest of the code. E.g.
packages_small.db contains the Description-md5 now. Right where the
Notes suggested "maybe add did right before shortdescription?"




Forcibly Merged 657557 660961. Request was from David Prévot <taffit@debian.org> to control@bugs.debian.org. (Thu, 23 Feb 2012 14:30:24 GMT) Full text and rfc822 format available.

Forcibly Merged 657557 660961. Request was from Simon Paillard <spaillard@debian.org> to control@bugs.debian.org. (Mon, 27 Feb 2012 22:18:03 GMT) Full text and rfc822 format available.

Information forwarded to debian-bugs-dist@lists.debian.org, Debian WWW Team <debian-www@lists.debian.org>:
Bug#657557; Package www.debian.org. (Fri, 02 Mar 2012 13:45:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Cyril Brulebois <kibi@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian WWW Team <debian-www@lists.debian.org>. (Fri, 02 Mar 2012 13:45:10 GMT) Full text and rfc822 format available.

Message #73 received at 657557@bugs.debian.org (full text, mbox):

From: Cyril Brulebois <kibi@debian.org>
To: 657557@bugs.debian.org
Subject: Re: Bug#657557: Fix for #657557
Date: Fri, 2 Mar 2012 14:40:51 +0100
Cyril Brulebois <kibi@debian.org> (21/02/2012):
> If troubles arise from applying those patches, I could be tricked
> into helping to fix it up.

Experimental was only working for packages which had long descriptions
before the archive got changed; I fixed it up with the attached patch.

The local diff is applied on powell and I ran a refresh to make sure it
was doing fine.

http://packages-powell.debian.org/en/experimental/libxkbcommon0

We probably want to clean up what's showing up in git status at some
point, I guess.

Mraw,
KiBi.
[0001-Bug-657557-Process-translations-in-experimental-too.patch (text/x-diff, attachment)]

Reply sent to "Michał Kułach" <michalkulach@gmail.com>:
You have taken responsibility. (Mon, 05 Mar 2012 21:45:06 GMT) Full text and rfc822 format available.

Notification sent to "Michał Kułach" <michalkulach@gmail.com>:
Bug acknowledged by developer. (Mon, 05 Mar 2012 21:45:06 GMT) Full text and rfc822 format available.

Message #78 received at 657557-close@bugs.debian.org (full text, mbox):

From: "Michał Kułach" <michalkulach@gmail.com>
To: 657557-close@bugs.debian.org
Subject: closing 657557
Date: Mon, 05 Mar 2012 22:40:35 +0100
Long descriptions are displayed correctly, so I'm closing this bug.

Thanks,
-- 
Michał Kułach (pl DDTP team)




Reply sent to "Michał Kułach" <michalkulach@gmail.com>:
You have taken responsibility. (Mon, 05 Mar 2012 21:45:07 GMT) Full text and rfc822 format available.

Notification sent to Jari Aalto <jari.aalto@cante.net>:
Bug acknowledged by developer. (Mon, 05 Mar 2012 21:45:07 GMT) Full text and rfc822 format available.

Bug archived. Request was from Debbugs Internal Request <owner@bugs.debian.org> to internal_control@bugs.debian.org. (Tue, 03 Apr 2012 07:45:59 GMT) Full text and rfc822 format available.



Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Mon Apr 21 02:27:55 2014; Machine Name: beach.debian.org

Debian Bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.