Debian Bug report logs - #998059
sphinx: LANGUAGE environment variable inconsistently affects output of objects.inv

Package: sphinx; Maintainer for sphinx is Debian Python Team <team+python@tracker.debian.org>;

Reported by: "Chris Lamb" <lamby@debian.org>

Date: Fri, 29 Oct 2021 09:00:05 UTC

Severity: normal

Found in versions 4.2.0-5, 4.5.0, sphinx/3.2.1-2

Forwarded to https://github.com/sphinx-doc/sphinx/pull/10949


Report forwarded to debian-bugs-dist@lists.debian.org, reproducible-bugs@lists.alioth.debian.org, Debian Python Team <team+python@tracker.debian.org>:
Bug#998059; Package sphinx. (Fri, 29 Oct 2021 09:00:07 GMT).


Acknowledgement sent to "Chris Lamb" <lamby@debian.org>:
New Bug report received and forwarded. Copy sent to reproducible-bugs@lists.alioth.debian.org, Debian Python Team <team+python@tracker.debian.org>. (Fri, 29 Oct 2021 09:00:07 GMT).


Message #5 received at submit@bugs.debian.org:

From: "Chris Lamb" <lamby@debian.org>
To: submit@bugs.debian.org
Subject: sphinx: LANGUAGE environement variable inconsistently affects output of objects.inv
Date: Fri, 29 Oct 2021 09:57:41 +0100
Package: sphinx
Version: 4.2.0-5
Severity: normal
User: reproducible-builds@lists.alioth.debian.org
Usertags: toolchain
X-Debbugs-Cc: reproducible-bugs@lists.alioth.debian.org

Hi,

An update has rendered a lot of packages that use Sphinx
unreproducible, as in, generating different output depending on
the surrounding environment.

I'm not entirely sure where the bug is here, but it seems like there is
something up with language handling and generating the objects.inv
file.

In particular, if we compare a 'first' build with LANGUAGE="en_GB:en"
and the 'second' with LANGUAGE="et_EE:et" environment variable, what
happens is that all of the documentation is identical *except* a
single entry in the objects.inv file which appears to be translated.

Decoding the zlib-encoded objects.inv file, I can see that the
difference is a translation one:

  # Sphinx inventory version 2
  # Project: OpenDrop
  # Version: 
  # The remainder of this file is compressed using zlib.

  developers/index std:doc -1 developers/index.html Developer notes
  -genindex std:label -1 genindex.html Index
  +genindex std:label -1 genindex.html Indeks
  getting_started/index std:doc -1 getting_started/index.html Getting Started
  index std:doc -1 index.html Overview
  modindex std:label -1 py-modindex.html Module Index
  py-modindex std:label -1 py-modindex.html Python Module Index

This is despite the output including the following logging message in
both builds:

  dumping search index in English (code: en)... done
  dumping object inventory... done

(Note the "code: en" here in both builds)
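
As a rough illustration of the decoding step described above (a sketch, not the exact commands used for this report): a version-2 inventory is four plain-text header lines followed by a single zlib stream. The helper names and the two tiny in-memory inventories below are hypothetical stand-ins for the real objects.inv files from the two builds.

```python
import zlib

HEADER = (
    b"# Sphinx inventory version 2\n"
    b"# Project: OpenDrop\n"
    b"# Version: \n"
    b"# The remainder of this file is compressed using zlib.\n"
)


def decode_inventory(data):
    """Split a version-2 inventory into (header lines, entry lines)."""
    # The first four lines are uncompressed; the rest is one zlib stream.
    parts = data.split(b"\n", 4)
    if len(parts) < 5:
        raise ValueError("not a complete version-2 inventory")
    header = [line.decode("utf-8") for line in parts[:4]]
    if not header[0].startswith("# Sphinx inventory version 2"):
        raise ValueError("unexpected inventory header")
    entries = zlib.decompress(parts[4]).decode("utf-8").splitlines()
    return header, entries


def diff_entries(first, second):
    """Pair up entries that differ; assumes both lists have the same order."""
    return [(a, b) for a, b in zip(first, second) if a != b]


def make_inventory(entries):
    """Hypothetical helper: build an in-memory inventory for the demo."""
    payload = "".join(entry + "\n" for entry in entries).encode("utf-8")
    return HEADER + zlib.compress(payload)


first_build = make_inventory([
    "genindex std:label -1 genindex.html Index",
    "modindex std:label -1 py-modindex.html Module Index",
])
second_build = make_inventory([
    "genindex std:label -1 genindex.html Indeks",
    "modindex std:label -1 py-modindex.html Module Index",
])

_, first_entries = decode_inventory(first_build)
_, second_entries = decode_inventory(second_build)
changed = diff_entries(first_entries, second_entries)
```

With real files, the bytes would come from reading each build's objects.inv in binary mode instead of from make_inventory.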

The curious thing is why 'Index' was translated (to 'Indeks') and not,
for instance, 'Module Index'. I've dumped some more info upstream at
the link below, but I'm also filing a Debian bug because, as mentioned,
it's causing a lot of reproducibility issues.

  https://github.com/sphinx-doc/sphinx/issues/9778


Regards,

-- 
      ,''`.
     : :'  :     Chris Lamb
     `. `'`      lamby@debian.org / chris-lamb.co.uk
       `-




Changed Bug title to 'sphinx: LANGUAGE environment variable inconsistently affects output of objects.inv' from 'sphinx: LANGUAGE environement variable inconsistently affects output of objects.inv'. Request was from "Chris Lamb" <lamby@debian.org> to control@bugs.debian.org. (Sun, 31 Oct 2021 11:45:02 GMT).


Information forwarded to debian-bugs-dist@lists.debian.org, Debian Python Team <team+python@tracker.debian.org>:
Bug#998059; Package sphinx. (Wed, 02 Feb 2022 19:33:05 GMT).


Acknowledgement sent to "Rebecca N. Palmer" <rebecca_palmer@zoho.com>:
Extra info received and forwarded to list. Copy sent to Debian Python Team <team+python@tracker.debian.org>. (Wed, 02 Feb 2022 19:33:05 GMT).


Message #12 received at 998059@bugs.debian.org:

From: "Rebecca N. Palmer" <rebecca_palmer@zoho.com>
To: 998059@bugs.debian.org
Subject: reproducibility workaround for sphinx objects.inv
Date: Wed, 2 Feb 2022 19:30:04 +0000
For anyone else trying to work around this: LC_ALL=C.UTF-8 isn't enough 
but LC_ALL=C.UTF-8 LANGUAGE=C.UTF-8 is.

https://salsa.debian.org/med-team/snakemake/-/commit/8bf4db496feed2fa84f23018acba0033e321ef69
https://salsa.debian.org/med-team/snakemake/-/pipelines
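
One possible explanation for why LC_ALL alone is not enough, sketched here under assumptions: Python's standard gettext module, when no explicit language list is given, consults LANGUAGE before LC_ALL, LC_MESSAGES and LANG. Whether Sphinx reaches that lookup for the affected string is not confirmed in this report. The function names below are hypothetical.

```python
import os

# Order in which Python's gettext.find() reads the environment when it is
# called without an explicit list of languages.
GETTEXT_ENV_ORDER = ("LANGUAGE", "LC_ALL", "LC_MESSAGES", "LANG")


def effective_language_setting(env):
    """Return (name, value) of the first non-empty variable, or None."""
    for name in GETTEXT_ENV_ORDER:
        value = env.get(name)
        if value:
            return name, value
    return None


def pinned_build_env(base=None):
    """Copy an environment and pin both variables from the workaround."""
    env = dict(os.environ if base is None else base)
    env["LC_ALL"] = "C.UTF-8"
    env["LANGUAGE"] = "C.UTF-8"
    return env


# LC_ALL on its own leaves LANGUAGE in charge of message lookup.
only_lc_all = {"LANGUAGE": "et_EE:et", "LC_ALL": "C.UTF-8"}
winner_before = effective_language_setting(only_lc_all)

# Pinning both variables removes the variation.
winner_after = effective_language_setting(pinned_build_env(only_lc_all))
```

The dictionary returned by pinned_build_env could be passed as the env argument of subprocess.run when invoking the documentation build.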




Information forwarded to debian-bugs-dist@lists.debian.org, Debian Python Team <team+python@tracker.debian.org>:
Bug#998059; Package sphinx. (Thu, 26 May 2022 22:39:03 GMT).


Acknowledgement sent to Nicolas Boulenguez <nicolas@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian Python Team <team+python@tracker.debian.org>. (Thu, 26 May 2022 22:39:03 GMT).


Message #17 received at 998059@bugs.debian.org:

From: Nicolas Boulenguez <nicolas@debian.org>
To: 998059@bugs.debian.org
Subject: reproducibility workaround for sphinx objects.inv
Date: Fri, 27 May 2022 00:38:27 +0200
A subclass of such issues affects punctuation.
For example, a straight apostrophe ' sometimes becomes a curly apostrophe ’
in the libxmlada package.
There are several links in the smartquotes section of
https://www.sphinx-doc.org/en/master/usage/configuration.html

Easy work-around:
  smartquotes=False in conf.py

Proper fix, when possible:
  use the right character in the .md sources in the first place.
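
For reference, the easy work-around amounts to a one-line setting in the project's conf.py. This is only a fragment; the rest of the file is left as it is.

```python
# conf.py (fragment)
# Stop Sphinx from converting straight quotes and dashes into their
# typographic equivalents, so the output no longer depends on that step.
smartquotes = False
```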



Information forwarded to debian-bugs-dist@lists.debian.org, debian-python@lists.debian.org, Debian Python Team <team+python@tracker.debian.org>:
Bug#998059; Package sphinx. (Sun, 02 Oct 2022 12:21:03 GMT).


Acknowledgement sent to James Addison <jay@jp-hosting.net>:
Extra info received and forwarded to list. Copy sent to debian-python@lists.debian.org, Debian Python Team <team+python@tracker.debian.org>. (Sun, 02 Oct 2022 12:21:03 GMT).


Message #22 received at 998059@bugs.debian.org:

From: James Addison <jay@jp-hosting.net>
To: Debian Bug Tracking System <998059@bugs.debian.org>
Subject: Re: sphinx: LANGUAGE environment variable inconsistently affects output of objects.inv
Date: Sun, 02 Oct 2022 13:17:39 +0100
Source: sphinx
Followup-For: Bug #998059
X-Debbugs-Cc: debian-python@lists.debian.org

(context: cross-posting based on an idea[1] that has been discussed upstream in sphinx's GitHub repository about how to resolve locale-based build variance)

The SPHINXOPTS[2] environment variable provides a way to selectively override defined environment variables (such as LANGUAGE) in a way that should only affect sphinx (limiting the effects on unrelated build steps).

For example:

  SPHINXOPTS='-D LANGUAGE="en_US.UTF-8"'

That would allow the objects.inv file to be built in a fixed language on a per-package basis.
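
A minimal sketch of how such an override ends up on the sphinx-build command line, under assumptions: the -D option of sphinx-build overrides a conf.py setting, and the setting for the documentation language is the lowercase name "language", so that spelling is used here rather than LANGUAGE. The directory names and the helper function are hypothetical.

```python
import shlex


def sphinx_build_command(source_dir, output_dir, overrides):
    """Build an argument list for sphinx-build with -D setting overrides."""
    cmd = ["sphinx-build", "-b", "html"]
    for name, value in overrides.items():
        # Each override is passed as: -D name=value
        cmd += ["-D", f"{name}={value}"]
    cmd += [source_dir, output_dir]
    return cmd


# Hypothetical paths; "language" is assumed to be the intended setting.
cmd = sphinx_build_command("docs", "build/html", {"language": "en"})

# A shell-quoted form, e.g. for logging the command before running it.
cmd_text = shlex.join(cmd)
```

Only the "-D language=en" part would go into SPHINXOPTS for builds that pass that variable through to sphinx-build.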

It doesn't seem ideal to artificially limit documentation localization for affected packages, but it could allow many of them to pass diffoscope reproducibility testing.

Alternatives explored: I wasn't able to identify a straightforward, supported way to disable creation of the objects.inv file, nor is it currently possible to invoke a multi-locale HTML sphinx build (an approach that might permit monolithic documentation output that wouldn't vary based on build environment locale).  I'll file a feature request for the former; the latter is tracked by an existing request[3].

[1] - https://github.com/sphinx-doc/sphinx/issues/9778#issuecomment-1264065231

[2] - https://github.com/sphinx-doc/sphinx/blob/v4.5.0/doc/man/sphinx-build.rst#environment-variables

[3] - https://github.com/sphinx-doc/sphinx/issues/788



Set Bug forwarded-to-address to 'https://github.com/sphinx-doc/sphinx/pull/10949'. Request was from James Addison <jay@jp-hosting.net> to control@bugs.debian.org. (Tue, 15 Nov 2022 10:24:03 GMT).


Information forwarded to debian-bugs-dist@lists.debian.org, jay@jp-hosting.net, Debian Python Team <team+python@tracker.debian.org>:
Bug#998059; Package sphinx. (Tue, 15 Nov 2022 10:33:03 GMT).


Acknowledgement sent to James Addison <jay@jp-hosting.net>:
Extra info received and forwarded to list. Copy sent to jay@jp-hosting.net, Debian Python Team <team+python@tracker.debian.org>. (Tue, 15 Nov 2022 10:33:03 GMT).


Message #29 received at 998059@bugs.debian.org:

From: James Addison <jay@jp-hosting.net>
To: Debian Bug Tracking System <998059@bugs.debian.org>
Subject: Re: sphinx: LANGUAGE environment variable inconsistently affects output of objects.inv
Date: Tue, 15 Nov 2022 10:28:05 +0000
Source: sphinx
Version: 3.2.1-2
Followup-For: Bug #998059
X-Debbugs-Cc: jay@jp-hosting.net

After taking another look at this issue a couple of weeks ago, and some further discussion with Chris on the relevant GitHub issue, it seems that a better approach (instead of disabling objects.inv output altogether, or configuring environment variables on a per-package basis) would be to set a neutral/null translation locale during reproducible builds for sphinx.  A pull request to do that has been opened upstream and this bug forwarded to it.
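
To illustrate the general idea only (this is not the code from the pull request): a null translation can be modelled with gettext.NullTranslations from the Python standard library, which returns every message unchanged. The sketch assumes that a reproducible build is detected through the SOURCE_DATE_EPOCH environment variable; the function and the stand-in catalog class are hypothetical.

```python
import gettext


def choose_translator(environ, catalog_translator):
    """Return a no-op translator when the build looks reproducible."""
    # Assumption: a non-empty SOURCE_DATE_EPOCH marks a reproducible build.
    if environ.get("SOURCE_DATE_EPOCH"):
        return gettext.NullTranslations()
    return catalog_translator


class FakeEstonian(gettext.NullTranslations):
    """Hypothetical stand-in for a translator loaded from an 'et' catalog."""

    def gettext(self, message):
        return {"Index": "Indeks"}.get(message, message)


reproducible = choose_translator({"SOURCE_DATE_EPOCH": "1635498000"}, FakeEstonian())
ordinary = choose_translator({}, FakeEstonian())

label_reproducible = reproducible.gettext("Index")
label_ordinary = ordinary.gettext("Index")
```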



Information forwarded to debian-bugs-dist@lists.debian.org, lamby@debian.org, Debian Python Team <team+python@tracker.debian.org>:
Bug#998059; Package sphinx. (Mon, 10 Apr 2023 08:30:03 GMT).


Acknowledgement sent to James Addison <jay@jp-hosting.net>:
Extra info received and forwarded to list. Copy sent to lamby@debian.org, Debian Python Team <team+python@tracker.debian.org>. (Mon, 10 Apr 2023 08:30:03 GMT).


Message #34 received at 998059@bugs.debian.org:

From: James Addison <jay@jp-hosting.net>
To: Debian Bug Tracking System <998059@bugs.debian.org>
Subject: Re: sphinx: LANGUAGE environment variable inconsistently affects output of objects.inv
Date: Mon, 10 Apr 2023 09:26:10 +0100
Package: python3-sphinx
Followup-For: Bug #998059
X-Debbugs-Cc: lamby@debian.org
Control: found -1 4.5.0
Control: notfound -1 5.0.0

Dear Maintainer,

My updated understanding is that this issue was fixed[1] in version 5.0.0 of
Sphinx.

I've documented[2] the process I followed using 'git bisect' to determine when
the fix was introduced (between versions 4.5.0 and 5.0.0 of sphinx).

Since my previous comment, the (I now believe, incorrect) changeset that I'd
forwarded to resolve the problem in more recent Sphinx versions was accepted,
so I've offered an explanation and revert changeset that I hope may be
acceptable before the upcoming 6.2.0 release.

Although it could be time-consuming to understand and verify my findings, if
someone has time to do that then I'd be grateful (I've confused myself a few
times while working on this bug, and I don't trust my findings enough yet to
close it).

Thank you,
James

[1] - https://github.com/sphinx-doc/sphinx/commit/e4e58a4f2791e528cdaa861b96636a1e37a558ba

[2] - https://github.com/sphinx-doc/sphinx/issues/9778#issuecomment-1501172176



Marked as found in versions 4.5.0. Request was from James Addison <jay@jp-hosting.net> to 998059-submit@bugs.debian.org. (Mon, 10 Apr 2023 08:30:03 GMT).


Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Wed May 17 12:16:34 2023; Machine Name: bembo
