Debian Bug report logs - #991059
diffoscope: subprocess.CalledProcessError: Command unsquashfs... returned non-zero exit status 1

Package: diffoscope; Maintainer for diffoscope is Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>; Source for diffoscope is src:diffoscope (PTS, buildd, popcon).

Reported by: Holger Levsen <holger@debian.org>

Date: Tue, 13 Jul 2021 13:27:01 UTC

Severity: normal

Found in version diffoscope/177

Fixed in version diffoscope/181

Done: Chris Lamb <lamby@debian.org>

Bug is archived. No further changes may be made.


Report forwarded to debian-bugs-dist@lists.debian.org, rclobus@rclobus.nl, Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>:
Bug#991059; Package diffoscope. (Tue, 13 Jul 2021 13:27:03 GMT) (full text, mbox, link).


Acknowledgement sent to Holger Levsen <holger@debian.org>:
New Bug report received and forwarded. Copy sent to rclobus@rclobus.nl, Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>. (Tue, 13 Jul 2021 13:27:03 GMT) (full text, mbox, link).


Message #5 received at submit@bugs.debian.org (full text, mbox, reply):

From: Holger Levsen <holger@debian.org>
To: Debian Bug Tracking System <submit@bugs.debian.org>
Subject: diffoscope: subprocess.CalledProcessError: Command unsquashfs... returned non-zero exit status 1
Date: Tue, 13 Jul 2021 15:22:50 +0200
[Message part 1 (text/plain, inline)]
Package: diffoscope
Version: 177
Severity: normal
x-debbugs-cc: Roland Clobus <rclobus@rclobus.nl>

Dear Maintainer,

https://jenkins.debian.net/job/reproducible_debian_live_build_cinnamon_bullseye/lastFailedBuild/consoleFull
shows a failure to run diffoscope on a cinnamon live-build hybrid.iso:

+ timeout 30m nice schroot --directory /srv/reproducible-results/live-build-cinnamon-Xj14Z0Pu -c source:jenkins-reproducible-unstable-diffoscope diffoscope -- --html /srv/reproducible-results/live-build-cinnamon-Xj14Z0Pu/live-build/cinnamon/live-image-amd64.hybrid.iso.html /srv/reproducible-results/live-build-cinnamon-Xj14Z0Pu/b1/live-build/cinnamon/live-image-amd64.hybrid.iso /srv/reproducible-results/live-build-cinnamon-Xj14Z0Pu/b2/live-build/cinnamon/live-image-amd64.hybrid.iso
+ RESULT=2
++ grep '^E: 15binfmt: update-binfmts: unable to open' /srv/reproducible-results/live-build-cinnamon-Xj14Z0Pu/tmp.VaTe8khm5d
++ true
+ LOG_RESULT=
+ '[' '!' -z '' ']'
+ true
+ set -e
+ cat /srv/reproducible-results/live-build-cinnamon-Xj14Z0Pu/tmp.VaTe8khm5d
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/diffoscope/comparators/utils/file.py", line 494, in compare
    difference = self._compare_using_details(other, source)
  File "/usr/lib/python3/dist-packages/diffoscope/comparators/utils/file.py", line 429, in _compare_using_details
    details.extend(
  File "/usr/lib/python3/dist-packages/diffoscope/comparators/utils/container.py", line 130, in comparisons
    my_members = OrderedDict(self.get_adjusted_members_sizes())
  File "/usr/lib/python3/dist-packages/diffoscope/comparators/utils/container.py", line 122, in get_adjusted_members_sizes
    for name, member in self.get_adjusted_members():
  File "/usr/lib/python3/dist-packages/diffoscope/comparators/utils/container.py", line 78, in get_filtered_members
    for name in filter_excludes(self.get_member_names()):
  File "/usr/lib/python3/dist-packages/diffoscope/comparators/squashfs.py", line 250, in get_member_names
    self.ensure_unpacked()
  File "/usr/lib/python3/dist-packages/diffoscope/comparators/squashfs.py", line 263, in ensure_unpacked
    output = our_check_output(
  File "/usr/lib/python3/dist-packages/diffoscope/comparators/utils/command.py", line 117, in our_check_output
    return subprocess.check_output(cmd, *args, **kwargs)
  File "/usr/lib/python3.9/subprocess.py", line 424, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '('unsquashfs', '-n', '-f', '-no', '-li', '-d', '.', '/tmp/diffoscope_tj0g1di9_cinnamon/tmp_ss17mt0LibarchiveContainerWithFilelist/0/891.squashfs')' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/diffoscope/main.py", line 746, in main
    sys.exit(run_diffoscope(parsed_args))
  File "/usr/lib/python3/dist-packages/diffoscope/main.py", line 700, in run_diffoscope
    difference = compare_root_paths(path1, path2)
  File "/usr/lib/python3/dist-packages/diffoscope/comparators/utils/compare.py", line 69, in compare_root_paths
    difference = compare_files(file1, file2)
  File "/usr/lib/python3/dist-packages/diffoscope/comparators/utils/compare.py", line 125, in compare_files
    return file1.compare(file2, source)
  File "/usr/lib/python3/dist-packages/diffoscope/comparators/utils/file.py", line 494, in compare
    difference = self._compare_using_details(other, source)
  File "/usr/lib/python3/dist-packages/diffoscope/comparators/utils/file.py", line 430, in _compare_using_details
    self.as_container.compare(
  File "/usr/lib/python3/dist-packages/diffoscope/comparators/utils/libarchive.py", line 366, in compare
    differences.extend(super().compare(other, **kwargs))
  File "/usr/lib/python3/dist-packages/diffoscope/comparators/utils/container.py", line 191, in compare_pair
    difference = compare_files(
  File "/usr/lib/python3/dist-packages/diffoscope/comparators/utils/compare.py", line 125, in compare_files
    return file1.compare(file2, source)
  File "/usr/lib/python3/dist-packages/diffoscope/comparators/utils/file.py", line 515, in compare
    difference = self.compare_bytes(other, source=source)
  File "/usr/lib/python3/dist-packages/diffoscope/comparators/utils/file.py", line 382, in compare_bytes
    return compare_binary_files(self, other, source)
  File "/usr/lib/python3/dist-packages/diffoscope/comparators/utils/compare.py", line 151, in compare_binary_files
    return Difference.from_operation(
  File "/usr/lib/python3/dist-packages/diffoscope/difference.py", line 269, in from_operation
    return Difference.from_operation_exc(
  File "/usr/lib/python3/dist-packages/diffoscope/difference.py", line 290, in from_operation_exc
    feeder1, operation1, excluded1 = operation_and_feeder(path1)
  File "/usr/lib/python3/dist-packages/diffoscope/difference.py", line 287, in operation_and_feeder
    operation.start()
  File "/usr/lib/python3/dist-packages/diffoscope/comparators/utils/command.py", line 45, in start
    self._process = subprocess.run(
  File "/usr/lib/python3.9/subprocess.py", line 507, in run
    stdout, stderr = process.communicate(input, timeout=timeout)
  File "/usr/lib/python3.9/subprocess.py", line 1134, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "/usr/lib/python3.9/subprocess.py", line 2001, in _communicate
    data = os.read(key.fd, 32768)
MemoryError

Sadly I don't have those .iso files available, but I suppose we could provide them if needed.


-- 
cheers,
	Holger

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁  holger@(debian|reproducible-builds|layer-acht).org
 ⢿⡄⠘⠷⠚⠋⠀  OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
 ⠈⠳⣄

Words may inspire but only action creates change.
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>:
Bug#991059; Package diffoscope. (Wed, 14 Jul 2021 18:39:03 GMT) (full text, mbox, link).


Acknowledgement sent to Roland Clobus <rclobus@rclobus.nl>:
Extra info received and forwarded to list. Copy sent to Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>. (Wed, 14 Jul 2021 18:39:03 GMT) (full text, mbox, link).


Message #10 received at 991059@bugs.debian.org (full text, mbox, reply):

From: Roland Clobus <rclobus@rclobus.nl>
To: Holger Levsen <holger@debian.org>, 991059@bugs.debian.org
Subject: Re: Bug#991059: diffoscope: subprocess.CalledProcessError: Command unsquashfs... returned non-zero exit status 1
Date: Wed, 14 Jul 2021 20:31:48 +0200
[Message part 1 (text/plain, inline)]
Hello Maintainer,

I'm planning to change the Jenkins job, such that the files that cause
diffoscope to crash will be published. Each file is 2.6GB.
The difference between these two files is located inside a squashfs
file, which is inside the iso file.
During the invocation of diffoscope, diffoscope needs lots of memory
(>32GB) and free space on /tmp (>32GB but <48GB).

In the meantime, the timeout of 30 minutes in Jenkins has been raised
to 120 minutes, but that still does not fix the crash.

I also noticed that the Jenkins job sets 'ulimit -v 10485760', which I
haven't investigated further yet.

I've attempted to reproduce the ISO files (using an older snapshot) on
my own computer, but there diffoscope was able to run until the end,
even though it needed 105 minutes wall time...

With kind regards,
Roland Clobus

[OpenPGP_signature (application/pgp-signature, attachment)]

Information forwarded to debian-bugs-dist@lists.debian.org, Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>:
Bug#991059; Package diffoscope. (Thu, 15 Jul 2021 16:45:02 GMT) (full text, mbox, link).


Message #13 received at 991059@bugs.debian.org (full text, mbox, reply):

From: Mattia Rizzolo <mattia@debian.org>
To: Roland Clobus <rclobus@rclobus.nl>, 991059@bugs.debian.org
Cc: Holger Levsen <holger@debian.org>
Subject: Re: Bug#991059: diffoscope: subprocess.CalledProcessError: Command unsquashfs... returned non-zero exit status 1
Date: Thu, 15 Jul 2021 18:41:00 +0200
[Message part 1 (text/plain, inline)]
On Wed, Jul 14, 2021 at 08:31:48PM +0200, Roland Clobus wrote:
> I'm planning to change the Jenkins job, such that the files that cause
> diffoscope to crash will be published. Each file is 2.6GB.

In this case, please have a look at reproducible_build.sh, which already
does something similar.  You might want to consider moving the relevant
function to _common.sh and calling it wherever else you need it.

> The difference between these two files is located inside a squashfs
> file, which is inside the iso file.
> During the invocation of diffoscope, diffoscope needs lots of memory
> (>32GB) and free space on /tmp (>32GB but <48GB).

Sigh, that's huge. :(

> In the meantime, the timeout of 30 minutes in Jenkins has been raised
> to 120 minutes, but that still does not fix the crash.
…
> I've attempted to reproduce the ISO files (using an older snapshot) on
> my own computer, but there diffoscope was able to run until the end,
> even though it needed 105 minutes wall time...

105 minutes on an idle system probably means that it would need 3-4 hours on
Jenkins.  120 minutes definitely won't be enough.


(BTW, this doesn't change that diffoscope shouldn't crash, just the
timeout wrapper exiting.)

-- 
regards,
                        Mattia Rizzolo

GPG Key: 66AE 2B4A FCCF 3F52 DA18  4D18 4B04 3FCD B944 4540      .''`.
More about me:  https://mapreri.org                             : :'  :
Launchpad user: https://launchpad.net/~mapreri                  `. `'`
Debian QA page: https://qa.debian.org/developer.php?login=mattia  `-
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>:
Bug#991059; Package diffoscope. (Wed, 11 Aug 2021 09:51:03 GMT) (full text, mbox, link).


Acknowledgement sent to Roland Clobus <rclobus@rclobus.nl>:
Extra info received and forwarded to list. Copy sent to Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>. (Wed, 11 Aug 2021 09:51:03 GMT) (full text, mbox, link).


Message #18 received at 991059@bugs.debian.org (full text, mbox, reply):

From: Roland Clobus <rclobus@rclobus.nl>
To: Mattia Rizzolo <mattia@debian.org>, 991059@bugs.debian.org
Cc: Holger Levsen <holger@debian.org>
Subject: Re: Bug#991059: diffoscope: out-of-memory
Date: Wed, 11 Aug 2021 11:49:46 +0200
[Message part 1 (text/plain, inline)]
On 15/07/2021 18:41, Mattia Rizzolo wrote:
> On Wed, Jul 14, 2021 at 08:31:48PM +0200, Roland Clobus wrote:

>> The difference between these two files is located inside a squashfs
>> file, which is inside the iso file.
>> During the invocation of diffoscope, diffoscope needs lots of memory
>> (>32GB) and free space on /tmp (>32GB but <48GB).

I've got the 2 ISO files that were used in a Jenkins run now, each 2.6GB.

In attempting to reproduce the case, I've mounted /tmp to a file instead
of using tmpfs and I've booted with 'single'. My computer has 32GB
memory, about 200MB is in use by the OS.

When I'm running diffoscope as root, I get the following metrics
(obtained by polling every 5 seconds):

21935816 1k blocks on /tmp (about 22GB)
1010 MiB memory at the peak
Needed time: 120 minutes +/- 10

However, when I run the same command as a regular user, I get an OOM
after about 5 minutes. At that time, the first squashfs image (2.3GB) is
completely decompressed to disc (8.0GB), and xxd is running.

The output of diffoscope with --debug:

2021-08-11 09:15:48 D: diffoscope.comparators.utils.file: Instantiating
a squashfs.SquashfsContainer for live/filesystem.squashfs
2021-08-11 09:15:48 D: diffoscope.comparators.squashfs: Extracting
/tmp/diffoscope_wzbylzvt_/tmpz60gl4slLibarchiveContainerWithFilelist/0/887.squashfs
to /tmp/diffoscope_wzbylzvt_/tmpg2r8zivtsquashfs
2021-08-11 09:15:48 D: diffoscope.comparators.utils.command: Calling
external command: unsquashfs -n -f -no -li -d .
/tmp/diffoscope_wzbylzvt_/tmpz60gl4slLibarchiveContainerWithFilelist/0/887.squashfs
2021-08-11 09:16:55 D: diffoscope.comparators.utils.command: Executing
xxd {}
2021-08-11 09:18:40 D: diffoscope.comparators.utils.command: Executing
xxd {}
Killed

So it appears to me that different code is activated for regular users
and root.
I hope this report helps in finding/fixing the issue.

With kind regards,
Roland Clobus

[OpenPGP_signature (application/pgp-signature, attachment)]

Information forwarded to debian-bugs-dist@lists.debian.org, Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>:
Bug#991059; Package diffoscope. (Thu, 12 Aug 2021 15:45:02 GMT) (full text, mbox, link).


Acknowledgement sent to Roland Clobus <rclobus@rclobus.nl>:
Extra info received and forwarded to list. Copy sent to Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>. (Thu, 12 Aug 2021 15:45:03 GMT) (full text, mbox, link).


Message #23 received at 991059@bugs.debian.org (full text, mbox, reply):

From: Roland Clobus <rclobus@rclobus.nl>
To: Mattia Rizzolo <mattia@debian.org>, 991059@bugs.debian.org
Cc: Holger Levsen <holger@debian.org>
Subject: Re: Bug#991059: diffoscope: out-of-memory
Date: Thu, 12 Aug 2021 17:41:23 +0200
[Message part 1 (text/plain, inline)]
Answering my own mail.

On 11/08/2021 11:49, Roland Clobus wrote:
> So it appears to me that different code is activated for regular users
> and root.

I think I've found the cause for the different code paths.
The squashfs image contains devices, which can only be extracted as root.

Output of unsquashfs as root, return value = 0:
created 255001 files
created 20202 directories
created 26672 symlinks
created 8 devices
created 0 fifos

Output of unsquashfs as user, return value = 1:
created 255001 files
created 20202 directories
created 26672 symlinks
created 0 devices
created 0 fifos
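
To make the difference between the two runs explicit, the summaries above
can be compared mechanically. This is only an illustrative sketch: the
function names are hypothetical, and the input is the unsquashfs summary
text exactly as pasted above.

```python
import re

# Matches summary lines such as "created 8 devices".
SUMMARY_RE = re.compile(r"^created (\d+) (\w+)$")

AS_ROOT = """created 255001 files
created 20202 directories
created 26672 symlinks
created 8 devices
created 0 fifos
"""

AS_USER = """created 255001 files
created 20202 directories
created 26672 symlinks
created 0 devices
created 0 fifos
"""


def parse_summary(text):
    """Return a dict like {"files": 255001, "devices": 8, ...}."""
    counts = {}
    for line in text.splitlines():
        match = SUMMARY_RE.match(line.strip())
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts


def differing_kinds(first, second):
    """Return only the entries whose counts differ between two summaries."""
    kinds = sorted(set(first) | set(second))
    return {
        kind: (first.get(kind), second.get(kind))
        for kind in kinds
        if first.get(kind) != second.get(kind)
    }


if __name__ == "__main__":
    print(differing_kinds(parse_summary(AS_ROOT), parse_summary(AS_USER)))
```

Only the device count differs, which fits the non-zero return value seen
when running as a regular user.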

With squashfuse (mounting the image in userspace) to replace the
unpacking to disc with unsquashfs, this issue might be avoided.

With kind regards,
Roland Clobus

[OpenPGP_signature (application/pgp-signature, attachment)]

Information forwarded to debian-bugs-dist@lists.debian.org, Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>:
Bug#991059; Package diffoscope. (Fri, 13 Aug 2021 17:03:03 GMT) (full text, mbox, link).


Acknowledgement sent to "Chris Lamb" <chris@reproducible-builds.org>:
Extra info received and forwarded to list. Copy sent to Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>. (Fri, 13 Aug 2021 17:03:03 GMT) (full text, mbox, link).


Message #28 received at 991059@bugs.debian.org (full text, mbox, reply):

From: "Chris Lamb" <chris@reproducible-builds.org>
To: "Roland Clobus" <rclobus@rclobus.nl>, 991059@bugs.debian.org, "Mattia Rizzolo" <mattia@debian.org>
Cc: "Holger Levsen" <holger@debian.org>
Subject: Re: Bug#991059: diffoscope: out-of-memory
Date: Fri, 13 Aug 2021 16:52:03 -0000
Hi Roland,

> > So it appears to me that different code is activated for regular users
> > and root.

In addition to the filesystem device difference (discussed below), the
other highly relevant difference is that processes run as root are
terminated by the OOM killer with a lower priority.

This is unlikely to be the underlying issue of course, but it will
introduce uncertainty to any experiment or testcase.

> I think I've found the cause for the different code paths.
> The squashfs image contains devices, which can only be extracted as root.

Ah, bravo — well discovered! Alas, I'm afraid I should have been able
to help you come to this earlier, for I clearly encountered precisely
this issue before and had completely forgotten about it:

  https://salsa.debian.org/reproducible-builds/diffoscope/commit/95dbe95a471e127798614727deea637186c1364f
  https://salsa.debian.org/reproducible-builds/diffoscope/-/issues/63

(I will be the first to admit that I did not really resolve the
underlying problem, merely prevented it from coming up in the
testsuite.)

One question though — why would the character devices existing or not
be relevant to it OOMing? Or rather, why aren't they simply compared
in the normal way? Sure, if character devices exist they will take
extra time and resources to be compared, but surely your ISO does not
contain so many character devices that it adds a significant burden to
the comparison process?

> With squashfuse (mounting the image in userspace) to replace the
> unpacking to disc with unsquashfs, this issue might be avoided.

Oh that's an interesting idea. However, let's keep it in the back
pocket for now — filesystem mounting (particularly of the FUSE
variety) would not be a trivial addition to diffoscope, so we should
be sure the effort and complexity is justified first.

Let's take stock. What do we want this diffoscope invocation on
Jenkins to actually do? In the first instance, we obviously don't want
it to OOM. But do we want it to extract these character devices or
not? And if we want or can skip over them, what should we do in that
situation? And is that going to be helpful at all in this OOM
situation anyway?

(A side question: can you confirm whether diffoscope is running as
root or not in your particular Jenkins test? I don't want to
misinterpret the logs.)


Best wishes,

--
      o
    ⬋   ⬊      Chris Lamb
   o     o     reproducible-builds.org 💠
    ⬊   ⬋
      o





Information forwarded to debian-bugs-dist@lists.debian.org, Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>:
Bug#991059; Package diffoscope. (Sat, 14 Aug 2021 14:09:02 GMT) (full text, mbox, link).


Acknowledgement sent to Roland Clobus <rclobus@rclobus.nl>:
Extra info received and forwarded to list. Copy sent to Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>. (Sat, 14 Aug 2021 14:09:03 GMT) (full text, mbox, link).


Message #33 received at 991059@bugs.debian.org (full text, mbox, reply):

From: Roland Clobus <rclobus@rclobus.nl>
To: Chris Lamb <chris@reproducible-builds.org>, 991059@bugs.debian.org, Mattia Rizzolo <mattia@debian.org>
Cc: Holger Levsen <holger@debian.org>
Subject: Re: Bug#991059: diffoscope: out-of-memory
Date: Sat, 14 Aug 2021 16:01:34 +0200
[Message part 1 (text/plain, inline)]
Hello Chris,

On 13/08/2021 18:52, Chris Lamb wrote:
>> I think I've found the cause for the different code paths.
>> The squashfs image contains devices, which can only be extracted as root.
> 
> Ah, bravo — well discovered! Alas, I'm afraid I should have been able
> to help you come to this earlier, for I clearly encountered precisely
> this issue before and had completely forgotten about it:
> 
>   https://salsa.debian.org/reproducible-builds/diffoscope/commit/95dbe95a471e127798614727deea637186c1364f
>   https://salsa.debian.org/reproducible-builds/diffoscope/-/issues/63
> 
> (I will be the first to admit that I did not really resolve the
> underlying problem, merely prevented it from coming up in the
> testsuite.)

This looks exactly like the issue at hand.
As you wrote, the commit avoids the issue instead of resolving it.
The wishlist ticket to handle this issue was closed
(https://salsa.debian.org/reproducible-builds/diffoscope/-/issues/65)

The hard part is that unsquashfs only has two possible return values: 0
and 1. No distinction is made between the possible causes of an error.

> One question though — why would the character devices existing or not
> be relevant to it OOMing? Or rather, why aren't they simply compared
> in the normal way? Sure, if character devices exist they will take
> extra time and resources to be compared, but surely your ISO does not
> contain so many character devices that it adds a significant burden to
> the comparison process?

There are only 8 devices in the image.

> Let's take stock. What do we want this diffoscope invocation on
> Jenkins to actually do? In the first instance, we obviously don't want
> it to OOM. But do we want it to extract these character devices or
> not? And if we want or can skip over them, what should we do in that
> situation? And is that going to be helpful at all in this OOM
> situation anyway?

The diffoscope invocation on Jenkins should primarily show whether the
content of the squashfs image is identical between two build runs, and
should list the differences of the files within the image. Attached is
the output that I can get running diffoscope as root.

The chance of character devices being different is nearly zero, so it
would be ok to have these as a blind spot.
Since the character devices are embedded in the squashfs image, they are
not active (not connected to a device). If they had been created,
a basic comparison would suffice.

The difference that is found by diffoscope (when running as root) lies
in a difference of regular files within the squashfs image. Both
squashfs images have different lengths and (due to the compression) are
totally different.

However, because unsquashfs returns a non-zero value, diffoscope assumes
that the extraction failed and reverts to a binary comparison (using
xxd). The output of xxd is piped directly to memory. So the 2.6GiB
squashfs image will become a 9.5GiB xxd file. Running 'diff -u' on these
xxd files results in an OOM on my computer (with 32GB).
Anyway, doing a binary comparison on squashfs files of this kind is not
that meaningful.
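
The size increase of the xxd output can be estimated with a small sketch.
It assumes xxd's default layout (16 input bytes per output line: an 8-digit
offset, eight groups of four hex digits, and an ASCII column); the function
names are hypothetical and the result is a rough estimate, not a
measurement of the files discussed here.

```python
BYTES_PER_LINE = 16  # xxd's default number of input bytes per output line


def xxd_line(offset, chunk):
    """Format one chunk in the assumed default xxd layout."""
    hexed = chunk.hex()
    groups = " ".join(hexed[i:i + 4] for i in range(0, len(hexed), 4))
    ascii_col = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
    # The hex column is padded to 39 characters (8 groups plus 7 spaces).
    return f"{offset:08x}: {groups:<39}  {ascii_col}\n"


# Length in bytes of one full output line, including the newline.
LINE_LEN = len(xxd_line(0, bytes(BYTES_PER_LINE)))

# Output bytes produced per input byte (full lines only).
EXPANSION = LINE_LEN / BYTES_PER_LINE


def estimated_dump_size(n_bytes):
    """Estimate the dump size in bytes, ignoring a partial last line."""
    return (n_bytes // BYTES_PER_LINE) * LINE_LEN


if __name__ == "__main__":
    print(LINE_LEN, EXPANSION)
```

Under these assumptions the hex dump is a bit more than four times the size
of its input, and diffoscope has to hold two such dumps for the comparison.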

On jenkins.debian.net, the amount of memory is limited with 'ulimit -v'
to 10GB, so that limit is reached rather quickly.
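
For reference, the value 10485760 passed to 'ulimit -v' is in units of
1024 bytes, which is where the 10GB figure comes from. The sketch below
shows the arithmetic and how the same cap could be applied from Python for
a local reproduction; the helper names are hypothetical and this is not
how the Jenkins job itself is set up.

```python
import resource

ULIMIT_V_KIB = 10485760  # the value given to 'ulimit -v', in KiB


def kib_to_gib(kib):
    """Convert KiB to GiB."""
    return kib / (1024 * 1024)


def limit_address_space(kib):
    """Cap the virtual memory of the current process (Unix only)."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    new_soft = kib * 1024  # setrlimit expects bytes
    if hard != resource.RLIM_INFINITY:
        # The soft limit may not exceed the existing hard limit.
        new_soft = min(new_soft, hard)
    resource.setrlimit(resource.RLIMIT_AS, (new_soft, hard))


if __name__ == "__main__":
    print(kib_to_gib(ULIMIT_V_KIB))
```

Calling limit_address_space(ULIMIT_V_KIB) before starting a comparison
should make large allocations fail with MemoryError, similar to the
traceback in the original report.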


Having written all this, I noticed that by focussing on the crash
itself, I lost sight of the overall goal: having the differences within the
squashfs image listed.

If possible, I would like to see something like:
* If the return value of unsquashfs is non-zero, look whether stderr
only contains lines like
'create_inode: could not create character device ./dev/console, because
you're not superuser!'
* If that is the case, resume normal operation, pretending the return
code to be zero
* If not, then something else happened, which is out-of-scope for this
ticket and handled with the current code
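
The steps above could be sketched roughly as follows. This is a minimal
illustration, not diffoscope's code: the regular expression is modelled
only on the single stderr line quoted above, and the function names are
hypothetical.

```python
import re
import subprocess

# Assumption: every "harmless" stderr line has the shape of the message
# quoted above. Other wordings would need their own patterns.
NOT_SUPERUSER_RE = re.compile(
    r"^create_inode: could not create .*, because you're not superuser!$"
)


def only_superuser_errors(stderr_text):
    """True if stderr is non-empty and every line is a not-superuser error."""
    lines = [line.strip() for line in stderr_text.splitlines() if line.strip()]
    return bool(lines) and all(NOT_SUPERUSER_RE.match(line) for line in lines)


def run_unsquashfs(cmd, cwd):
    """Run the extraction, tolerating failures caused only by device nodes."""
    proc = subprocess.run(
        cmd,
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    if proc.returncode != 0 and not only_superuser_errors(proc.stderr):
        # Something else went wrong: keep the existing error handling.
        raise subprocess.CalledProcessError(
            proc.returncode, cmd, proc.stdout, proc.stderr
        )
    return proc.stdout
```

With this, a return value of 1 caused solely by the skipped devices would
be treated like a successful extraction, and any other stderr content
would still raise an error as before.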

> (A side question: can you confirm whether diffoscope is running as
> root or not in your particular Jenkins test? I don't want to
> misinterpret the logs.)

I'm 99.9% sure that I'm not running as root, because I would have needed
a sudo invocation (and there would not have been an OOM). Holger, can
you confirm this?

With kind regards,
Roland Clobus
[out8.html (text/html, attachment)]
[OpenPGP_signature (application/pgp-signature, attachment)]

Information forwarded to debian-bugs-dist@lists.debian.org, Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>:
Bug#991059; Package diffoscope. (Mon, 16 Aug 2021 13:15:03 GMT) (full text, mbox, link).


Acknowledgement sent to Holger Levsen <holger@layer-acht.org>:
Extra info received and forwarded to list. Copy sent to Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>. (Mon, 16 Aug 2021 13:15:03 GMT) (full text, mbox, link).


Message #38 received at 991059@bugs.debian.org (full text, mbox, reply):

From: Holger Levsen <holger@layer-acht.org>
To: Roland Clobus <rclobus@rclobus.nl>, 991059@bugs.debian.org
Cc: Chris Lamb <chris@reproducible-builds.org>, Mattia Rizzolo <mattia@debian.org>
Subject: Re: Bug#991059: diffoscope: out-of-memory
Date: Mon, 16 Aug 2021 13:12:30 +0000
[Message part 1 (text/plain, inline)]
On Sat, Aug 14, 2021 at 04:01:34PM +0200, Roland Clobus wrote:
> > (A side question: can you confirm whether diffoscope is running as
> > root or not in your particular Jenkins test? I don't want to
> > misinterpret the logs.)
> I'm 99.9% sure that I'm not running as root, because I would have needed
> a sudo invocation (and there would not have been an OOM). Holger, can
> you confirm this?

diffoscope is run by call_diffoscope() from bin/reproducible_common.sh as
user jenkins.


-- 
cheers,
	Holger

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁  holger@(debian|reproducible-builds|layer-acht).org
 ⢿⡄⠘⠷⠚⠋⠀  OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
 ⠈⠳⣄

If nothing saves us from death, may love at least save us from life.
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>:
Bug#991059; Package diffoscope. (Tue, 17 Aug 2021 11:15:02 GMT) (full text, mbox, link).


Acknowledgement sent to "Chris Lamb" <lamby@debian.org>:
Extra info received and forwarded to list. Copy sent to Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>. (Tue, 17 Aug 2021 11:15:03 GMT) (full text, mbox, link).


Message #43 received at 991059@bugs.debian.org (full text, mbox, reply):

From: "Chris Lamb" <lamby@debian.org>
To: "Roland Clobus" <rclobus@rclobus.nl>, 991059@bugs.debian.org
Subject: Re: Bug#991059: diffoscope: out-of-memory
Date: Tue, 17 Aug 2021 12:02:46 +0100
Hi Roland,

> If possible, I would like to see something like:
> * If the return value of unsquashfs is non-zero, look whether stderr
>   only contains lines like
>   'create_inode: could not create character device ./dev/console, because
>   you're not superuser!'
> * If that is the case, resume normal operation, pretending the return
>   code to be zero
> * If not, then something else happened, which is out-of-scope for this
>   ticket and handled with the current code

Great idea. I've tried implementing this on diffoscope 'master' and it
appears to work for me. Can you test it there, or shall I release it first?
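
A minimal sketch of the check proposed in the quoted list, for illustration only. It is not the code that landed in diffoscope. The regular expression is an assumption based solely on the single stderr line quoted in this report, and the helper names (only_superuser_warnings, effective_returncode, run_extractor) are hypothetical.

```python
import re
import subprocess

# Assumed pattern: modelled on the one stderr line quoted in this bug report.
# Other messages that unsquashfs may print are not covered here.
NOT_SUPERUSER_RE = re.compile(
    r"^create_inode: could not create .*, because you're not superuser!$"
)


def only_superuser_warnings(stderr_text):
    """Return True if stderr is non-empty and every line is a 'not superuser' warning."""
    lines = [line.strip() for line in stderr_text.splitlines() if line.strip()]
    return bool(lines) and all(NOT_SUPERUSER_RE.match(line) for line in lines)


def effective_returncode(returncode, stderr_text):
    """Pretend the exit status is zero when the only complaints are privilege-related."""
    if returncode != 0 and only_superuser_warnings(stderr_text):
        return 0
    return returncode


def run_extractor(args):
    """Hypothetical wrapper: run a command and apply the check above to its result."""
    proc = subprocess.run(args, capture_output=True, text=True)
    code = effective_returncode(proc.returncode, proc.stderr)
    if code != 0:
        # Anything else is out of scope and handled as before: raise the error.
        raise subprocess.CalledProcessError(code, args, proc.stdout, proc.stderr)
    return proc.stdout
```

Keeping the decision in effective_returncode, separate from the subprocess call, means it can be exercised with captured stderr text and without a squashfs image at hand.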


Best wishes,

-- 
      ,''`.
     : :'  :     Chris Lamb
     `. `'`      lamby@debian.org / chris-lamb.co.uk
       `-



Information forwarded to debian-bugs-dist@lists.debian.org, Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>:
Bug#991059; Package diffoscope. (Tue, 17 Aug 2021 11:45:02 GMT) (full text, mbox, link).


Acknowledgement sent to Roland Clobus <rclobus@rclobus.nl>:
Extra info received and forwarded to list. Copy sent to Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>. (Tue, 17 Aug 2021 11:45:02 GMT) (full text, mbox, link).


Message #48 received at 991059@bugs.debian.org (full text, mbox, reply):

From: Roland Clobus <rclobus@rclobus.nl>
To: Chris Lamb <lamby@debian.org>, 991059@bugs.debian.org
Subject: Re: Bug#991059: diffoscope: handling of errors in unsquashfs
Date: Tue, 17 Aug 2021 13:42:28 +0200
[Message part 1 (text/plain, inline)]
Hello Chris,

On 17/08/2021 13:02, Chris Lamb wrote:
>> If possible, I would like to see something like:
>> * If the return value of unsquashfs is non-zero, look whether stderr
>>   only contains lines like
>>   'create_inode: could not create character device ./dev/console, because
>>   you're not superuser!'
>> * If that is the case, resume normal operation, pretending the return
>>   code to be zero
>> * If not, then something else happened, which is out-of-scope for this
>>   ticket and handled with the current code
> 
> Great idea. I've tried implementing this on diffoscope 'master' and it
> appears to work for me. Can you test it there, or shall I release it first?

Thanks for the code.

The test suite (which uses git directly) fails at the moment:
https://jenkins.debian.net/job/reproducible_diffoscope_from_git/1063/consoleFull

In parallel, I'll run the code on the images that I have here (using the
latest code in git).

With kind regards,
Roland Clobus

[OpenPGP_signature (application/pgp-signature, attachment)]

Information forwarded to debian-bugs-dist@lists.debian.org, Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>:
Bug#991059; Package diffoscope. (Tue, 17 Aug 2021 12:45:02 GMT) (full text, mbox, link).


Acknowledgement sent to "Chris Lamb" <lamby@debian.org>:
Extra info received and forwarded to list. Copy sent to Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>. (Tue, 17 Aug 2021 12:45:02 GMT) (full text, mbox, link).


Message #53 received at 991059@bugs.debian.org (full text, mbox, reply):

From: "Chris Lamb" <lamby@debian.org>
To: "Roland Clobus" <rclobus@rclobus.nl>, 991059@bugs.debian.org
Subject: Re: Bug#991059: diffoscope: handling of errors in unsquashfs
Date: Tue, 17 Aug 2021 13:34:46 +0100
Hi Roland,

> Thanks for the code.
>
> The test suite (which uses git directly) fails at the moment:
> https://jenkins.debian.net/job/reproducible_diffoscope_from_git/1063/consoleFull

Thanks. I will address this very shortly.


Best wishes,

--
      ,''`.
     : :'  :     Chris Lamb
     `. `'`      lamby@debian.org 🍥 chris-lamb.co.uk
       `-



Information forwarded to debian-bugs-dist@lists.debian.org, Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>:
Bug#991059; Package diffoscope. (Tue, 17 Aug 2021 14:09:05 GMT) (full text, mbox, link).


Acknowledgement sent to Roland Clobus <rclobus@rclobus.nl>:
Extra info received and forwarded to list. Copy sent to Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>. (Tue, 17 Aug 2021 14:09:05 GMT) (full text, mbox, link).


Message #58 received at 991059@bugs.debian.org (full text, mbox, reply):

From: Roland Clobus <rclobus@rclobus.nl>
To: Chris Lamb <lamby@debian.org>, 991059@bugs.debian.org
Subject: Re: Bug#991059: diffoscope: handling of errors in unsquashfs
Date: Tue, 17 Aug 2021 16:05:19 +0200
[Message part 1 (text/plain, inline)]
Hello Chris,

On 17/08/2021 14:34, Chris Lamb wrote:
>> The test suite (which uses git directly) fails at the moment:
>> https://jenkins.debian.net/job/reproducible_diffoscope_from_git/1063/consoleFull
> 
> Thanks. I will address this very shortly.

Thanks, ab780bf6ad6bfa0c88767827065d3fdba4fd3b32 made the test pass in
Jenkins.
https://jenkins.debian.net/job/reproducible_diffoscope_from_git/1065/

Meanwhile, my computer worked for 2 hours, and now I can confirm that
the output is generated without issues and is correct.

With kind regards,
Roland Clobus

[OpenPGP_signature (application/pgp-signature, attachment)]

Information forwarded to debian-bugs-dist@lists.debian.org, Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>:
Bug#991059; Package diffoscope. (Tue, 17 Aug 2021 14:57:04 GMT) (full text, mbox, link).


Acknowledgement sent to "Chris Lamb" <lamby@debian.org>:
Extra info received and forwarded to list. Copy sent to Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>. (Tue, 17 Aug 2021 14:57:04 GMT) (full text, mbox, link).


Message #63 received at 991059@bugs.debian.org (full text, mbox, reply):

From: "Chris Lamb" <lamby@debian.org>
To: "Roland Clobus" <rclobus@rclobus.nl>, 991059@bugs.debian.org
Subject: Re: Bug#991059: diffoscope: handling of errors in unsquashfs
Date: Tue, 17 Aug 2021 15:52:24 +0100
tags 991059 + pending
thanks

Hey,

> Thanks, ab780bf6ad6bfa0c88767827065d3fdba4fd3b32 made the test pass in
> Jenkins.
> https://jenkins.debian.net/job/reproducible_diffoscope_from_git/1065/
>
> Meanwhile, my computer worked for 2 hours, and now I can confirm that
> the output is generated without issues and is correct.

Huzzah! I'll release diffoscope with these changes very soon.


Best wishes,

--
      ,''`.
     : :'  :     Chris Lamb
     `. `'`      lamby@debian.org 🍥 chris-lamb.co.uk
       `-



Added tag(s) pending. Request was from "Chris Lamb" <lamby@debian.org> to control@bugs.debian.org. (Tue, 17 Aug 2021 14:57:05 GMT) (full text, mbox, link).


Reply sent to Chris Lamb <lamby@debian.org>:
You have taken responsibility. (Fri, 20 Aug 2021 09:21:03 GMT) (full text, mbox, link).


Notification sent to Holger Levsen <holger@debian.org>:
Bug acknowledged by developer. (Fri, 20 Aug 2021 09:21:03 GMT) (full text, mbox, link).


Message #70 received at 991059-close@bugs.debian.org (full text, mbox, reply):

From: Debian FTP Masters <ftpmaster@ftp-master.debian.org>
To: 991059-close@bugs.debian.org
Subject: Bug#991059: fixed in diffoscope 181
Date: Fri, 20 Aug 2021 09:18:38 +0000
Source: diffoscope
Source-Version: 181
Done: Chris Lamb <lamby@debian.org>

We believe that the bug you reported is fixed in the latest version of
diffoscope, which is due to be installed in the Debian FTP archive.

A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to 991059@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Chris Lamb <lamby@debian.org> (supplier of updated diffoscope package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@ftp-master.debian.org)


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Format: 1.8
Date: Fri, 20 Aug 2021 10:03:35 +0100
Source: diffoscope
Built-For-Profiles: nocheck
Architecture: source
Version: 181
Distribution: unstable
Urgency: medium
Maintainer: Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>
Changed-By: Chris Lamb <lamby@debian.org>
Closes: 991059
Changes:
 diffoscope (181) unstable; urgency=medium
 .
   [ Chris Lamb ]
 .
   * New features and bug fixes:
     - Don't require apksigner in order to compare .apk files using apktool.
     - Add a special-case to squashfs image extraction to not fail if we aren't
       root/superuser. (Closes: #991059)
     - Reduce the maximum line length to avoid O(n^2) Wagner-Fischer algorithm,
       which meant that diff generation took an inordinate amount of time.
       (Closes: reproducible-builds/diffoscope#272)
     - Include profiling information in --debug output if --profile is not set.
     - Don't print an orphan newline when the Black source code formatter
       self-test passes.
 .
   * Tests:
     - Update test to check specific contents of squashfs listing, otherwise it
       fails depending on the test system's uid-to-username mapping in passwd(5).
     - Assign "seen" and "expected" values to local variables to improve
       contextual information in/around failed tests.
 .
   * Misc changes:
     - Print the size of generated HTML, text (etc.) reports.
     - Profile calls to specialize and diffoscope.diff.linediff.
     - Update various copyright years.
Checksums-Sha1:
 0e5f4a306050ef58f7b9435a477fbc52e4ac75ef 4938 diffoscope_181.dsc
 d59229648a66bae0a166360d1e588238a7b79f48 1021288 diffoscope_181.tar.xz
 c474e7387490678f88b6538b42f3be351e07bb2d 6816 diffoscope_181_amd64.buildinfo
Checksums-Sha256:
 e6b13d0bf8d27cf62399086f4a62b7ad754cd4c0d71739a776764bb82600ea23 4938 diffoscope_181.dsc
 f4e3f1006b73b22f79a97704cc6c7dbc3d00925b9a3ceda95c2c3518e4841e96 1021288 diffoscope_181.tar.xz
 cc69b57590b24fc0b0d0e71c36da3e01481c7a8bc8948d5fd0fd0ce9d533ce8e 6816 diffoscope_181_amd64.buildinfo
Files:
 9818c1915fa81c37277b33e6146d7080 4938 devel optional diffoscope_181.dsc
 68abbf03316aef43860770c26ea37aaf 1021288 devel optional diffoscope_181.tar.xz
 0c71a23036ebc93ddca1d3469681f13c 6816 devel optional diffoscope_181_amd64.buildinfo

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEwv5L0nHBObhsUz5GHpU+J9QxHlgFAmEfcBwACgkQHpU+J9Qx
Hlh9sA/+LtgrnfLEY2kqtWdehj5cXF4si/Wfe7/jaHv3C1hEsIwci2y0CYbqyhC1
q4ChTNenj23Wtqvrb+QcOV+Va7yEW3IK8Vo6ski9BXYkDPcsKX+nfe2PC7BFs/Rm
lShdSJ+ZONgE053LOBsm6kBeH+d3jYX+NaZzeGKoB4orV1E69700sQ954OJ0fvYU
ZvGhprveH6GKmSirzzdE4Xw38crC6vY+JeRtMQiEy1ADy3WeUqjTvzioR+nD7oLd
ct82UAY96lzKqo6P81GCN/oRvpZSHHTSgF4tLXtC0kLkkHsDjbKhEyN5f1aJTdwP
V/obPgnF/PE8BOCAxOCIFcVBsEkdEpu5jRZW8cGSP0PNXXYENWzacJejzXrclpFu
i4GHNnXrody3+7iFURx6hI1gSjzinaMd31R/y2RrJ2yKLYz5rEaNKoDXmaUV4lbE
HrNXHCAcfz3+OQ9dgBh7fLJ2h+Nzl8t1zDKRPmta2d53rKdN8IqeATA/3v9W7V7X
9gEUwLgYS2totKwxS7aMSQSToQ2ZmOaSnbkME93mE/OyH5BZRjV6aq4faKsgLxOe
20f5CzUIw65sOzjAEAzrovU4a7d8G5TjxFBDe+Hhpi4mAMHfb8w9eEKX5O5gqUzL
BT8KpDAfoLPZYY5yAqubYTqvqbYptTwJ226C2AAd0IaVDGT9fsE=
=9G/H
-----END PGP SIGNATURE-----




Bug archived. Request was from Debbugs Internal Request <owner@bugs.debian.org> to internal_control@bugs.debian.org. (Mon, 20 Sep 2021 07:28:27 GMT) (full text, mbox, link).



Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Wed May 17 09:25:21 2023; Machine Name: buxtehude
