Debian Bug report logs -
#776658
lintian.d.o: Use database to reduce memory footprint
Reported by: Niels Thykier <niels@thykier.net>
Date: Fri, 30 Jan 2015 17:33:02 UTC
Severity: important
Found in version lintian/2.5.30+deb8u3
Done: Felix Lechner <felix.lechner@lease-up.com>
Bug is archived. No further changes may be made.
Report forwarded
to debian-bugs-dist@lists.debian.org, Debian Lintian Maintainers <lintian-maint@debian.org>:
Bug#776658; Package lintian.
(Fri, 30 Jan 2015 17:33:07 GMT) (full text, mbox, link).
Acknowledgement sent
to Niels Thykier <niels@thykier.net>:
New Bug report received and forwarded. Copy sent to Debian Lintian Maintainers <lintian-maint@debian.org>.
(Fri, 30 Jan 2015 17:33:07 GMT) (full text, mbox, link).
Message #5 received at submit@bugs.debian.org (full text, mbox, reply):
Package: lintian
Version: 2.5.30+deb8u3
Severity: important
The reporting framework consumes a rather substantial amount of
memory.
The harness process itself hogs ~1GB of RAM. This in itself is not
concerning. However, it retains this usage even while running lintian
and html_reports. For the former, it "just" needs the current "work
queue" in memory. For the latter, it should not need any memory worth
mentioning.
The html_reports process itself consumes up to 2GB while processing
templates. It is possible that there is nothing we can do about that
as there *is* a lot of data in play. But even then, we can free it as
soon as possible (so we do not keep it while running gnuplot at the
end of the run).
Currently, when harness -i runs, the gnuplot process seems to die for
"no apparent" reason. I suspect it is OOM'ed though harness +
html_reports "only" consumes 65-70%ish of the memory available and
gnuplot seems fairly cheap memory-wise in comparison.
When running harness -r alone, harness skips the parts of the code
that make it consume memory, and that seems to be sufficient to let
html_reports + gnuplot terminate successfully.
~Niels
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian Lintian Maintainers <lintian-maint@debian.org>:
Bug#776658; Package lintian.
(Sat, 31 Jan 2015 01:42:10 GMT) (full text, mbox, link).
Acknowledgement sent
to Russ Allbery <rra@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian Lintian Maintainers <lintian-maint@debian.org>.
(Sat, 31 Jan 2015 01:42:10 GMT) (full text, mbox, link).
Message #10 received at 776658@bugs.debian.org (full text, mbox, reply):
Niels Thykier <niels@thykier.net> writes:
> The html_reports process itself consumes up to 2GB while processing
> templates. It is possible that there is nothing we can do about that
> as there *is* a lot of data in play. But even then, we can free it as
> soon as possible (so we do not keep it while running gnuplot at the
> end of the run).
I think the code currently takes a very naive approach and loads the
entire state of the world into memory, and Perl's memory allocation is
known to aggressively trade space for speed.
If instead it stored the various things it cared about in a local SQLite
database, it would be a bit slower, but it would consume much less
memory. I bet the speed difference wouldn't be too bad. And this would
have the possibly useful side effect of creating a SQLite database full of
interesting statistics that one could run rich queries against.
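To make the suggestion concrete, here is a minimal Python sketch of the kind of local SQLite store and ad-hoc query being proposed (this is not lintian code; the schema, table name, and tag counts are invented for illustration):

```python
import sqlite3

# Hypothetical schema: the real lintian data model is richer than this.
conn = sqlite3.connect(":memory:")  # on disk this might be e.g. reporting.db
conn.execute("""
    CREATE TABLE tag_hits (
        package TEXT NOT NULL,
        version TEXT NOT NULL,
        tag     TEXT NOT NULL,
        count   INTEGER NOT NULL
    )""")
conn.executemany(
    "INSERT INTO tag_hits VALUES (?, ?, ?, ?)",
    [("lintian", "2.5.30", "spelling-error-in-binary", 3),
     ("lintian", "2.5.30", "package-has-long-file-name", 1),
     ("debhelper", "9.2", "spelling-error-in-binary", 2)])
conn.commit()

# The "rich queries" side effect: which tags are most common archive-wide?
for tag, total in conn.execute(
        "SELECT tag, SUM(count) FROM tag_hits GROUP BY tag ORDER BY 2 DESC"):
    print(tag, total)
```

The data lives on disk instead of in the Perl heap, so the process only ever holds one row (or one aggregate) at a time.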
--
Russ Allbery (rra@debian.org) <http://www.eyrie.org/~eagle/>
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian Lintian Maintainers <lintian-maint@debian.org>:
Bug#776658; Package lintian.
(Sat, 31 Jan 2015 09:07:59 GMT) (full text, mbox, link).
Acknowledgement sent
to Niels Thykier <niels@thykier.net>:
Extra info received and forwarded to list. Copy sent to Debian Lintian Maintainers <lintian-maint@debian.org>.
(Sat, 31 Jan 2015 09:07:59 GMT) (full text, mbox, link).
Message #15 received at 776658@bugs.debian.org (full text, mbox, reply):
On 2015-01-31 02:38, Russ Allbery wrote:
> Niels Thykier <niels@thykier.net> writes:
>
>> The html_reports process itself consumes up to 2GB while processing
>> templates. It is possible that there is nothing we can do about that
>> as there *is* a lot of data in play. But even then, we can free it as
>> soon as possible (so we do not keep it while running gnuplot at the
>> end of the run).
>
> I think the code currently takes a very naive approach and loads the
> entire state of the world into memory, and Perl's memory allocation is
> known to aggressively trade space for speed.
>
It does try to share a lot of the inner data structures - there are
indeed still some deficiencies in it. I really wish one could do things
like string interning in Perl.
> If instead it stored the various things it cared about in a local SQLite
> database, it would be a bit slower, but it would consume much less
> memory. I bet the speed difference wouldn't be too bad. And this would
> have the possibly useful side effect of creating a SQLite database full of
> interesting statistics that one could run rich queries against.
>
That is definitely worth consideration - thanks for the suggestion. It
would imply an immense rewrite of html_reports. While it is certainly
long overdue, it is not something I suspect I will have time (or mental
capacity) to do on my own.
I have started a different approach (see [1] for WIP code). It is
mostly a parallel track to your idea, so they can certainly co-exist.
The goal of this approach is to:
* Split harness into a "simple" coordinator
* Remove the Lab as a (primary) data store (it is too fragile)
* Harness state as datastore
The details of my design decisions are:
Harness - simple coordinator
============================
In my opinion, a lot of the (to quote private/TODO) "yuckness" of
harness happens because we want very well-determined failure handling,
but we never wrote harness with a structure that makes that trivial.
Notably, we do not want harness to crash (without logging it first) and
especially not while working on the Lab (see next section).
By moving logic out of harness, this rewrite will become easier as
there is less to juggle around with. Further, by moving it out of
harness (and into another process), we can ensure that any memory
consumption caused by this task will definitely be freed when the child
process terminates. I have previously tried to make harness free some
of its memory, with no luck.
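The "free memory by exiting the child" pattern is easy to demonstrate in any language; a minimal Python sketch follows (the real tools are Perl, and the worker below is a stand-in for running lintian, not actual lintian code):

```python
import subprocess
import sys

# The expensive phase runs in a separate child interpreter.  When the child
# exits, *all* of its memory goes back to the OS -- something that is hard
# to guarantee from inside a long-lived Perl (or Python) process, where the
# allocator often retains freed memory for reuse.
worker_code = "print(len([str(i) for i in range(1000000)]))"

result = subprocess.run([sys.executable, "-c", worker_code],
                        capture_output=True, text=True, check=True)
summary = int(result.stdout)  # only the small summary crosses back
print(summary)
```

The coordinator's own footprint stays small for the whole run; only the one-line summary survives the worker.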
Removing the Lab as data store
==============================
For me, there are several advantages in this. Firstly, the lab is very
fragile - if anything crashes (or is interrupted) while updating the
lab, the metadata is often trivially out of sync and the lab is (partly)
corrupted[0]. The end result is often that lintian/harness croaks on
importing stuff until someone manually runs a $lab->repair. However,
this does not fix all types of corruptions (see the FIXME in
L::Lab->repair), so... /o\
By removing the Lab as a data store, we can use a simpler and more
robust data store (more on that in the next section) AND use throw-away
labs. I had a talk with DSA (I think weasel) about getting a tmpfs disk
on another machine for the heavy lifting. This implies that we *can* in
fact throw away the lab after every run.
Harness state as datastore
==========================
I introduced a "harness state cache" a couple of versions back to track
which packages needed to be reprocessed, when we uploaded a new version
of lintian. This (YAML) file can be trivially extended to contain all
the necessary information required by harness and html_reports to
replace the Lab as a data store. It already features several
advantages over the Lab, namely:
* Atomic updates of the content (see save_state_cache in harness)
* Automatically recreated from scratch if it "vanishes".
* We can add/remove information to/from it without having to update
  the lab metadata.
Certainly, this file can (also?) be replaced by an SQL(-lite) database.
If someone is willing to do or help me with the SQL(-lite) part, I am
definitely open for it.
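The atomic-update property mentioned above boils down to the classic write-to-temp-then-rename pattern. A minimal Python sketch of it (the real save_state_cache is Perl and the real cache is YAML; JSON is used here only to stay within the standard library, and the function names mirror, not reproduce, the harness code):

```python
import json
import os
import tempfile

def save_state_cache(path, state):
    # Write to a temp file, then rename over the target: the rename is
    # atomic on POSIX, so readers either see the old cache or the complete
    # new one -- never a torn, half-written file.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".state-cache.")
    try:
        with os.fdopen(fd, "w") as fh:
            json.dump(state, fh)
            fh.flush()
            os.fsync(fh.fileno())  # make sure the data hits the disk first
        os.replace(tmp, path)      # atomic swap
    except BaseException:
        os.unlink(tmp)
        raise

def load_state_cache(path):
    # "Automatically recreated from scratch if it vanishes": a missing
    # cache is not an error, just an empty state.
    try:
        with open(path) as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {}
```

A crash between the fsync and the rename leaves only a stray temp file behind; the previous cache is untouched.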
~Niels
[0] Unless you manage to successfully run $LAB->close - harness does
not, lintian generally does.
[1]
http://anonscm.debian.org/cgit/users/nthykier/lintian.git/log/?h=reporting-rewrite
NB: Rebased regularly.
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian Lintian Maintainers <lintian-maint@debian.org>:
Bug#776658; Package lintian.
(Mon, 02 Feb 2015 21:15:10 GMT) (full text, mbox, link).
Acknowledgement sent
to Niels Thykier <niels@thykier.net>:
Extra info received and forwarded to list. Copy sent to Debian Lintian Maintainers <lintian-maint@debian.org>.
(Mon, 02 Feb 2015 21:15:10 GMT) (full text, mbox, link).
Message #20 received at 776658@bugs.debian.org (full text, mbox, reply):
[Message part 1 (text/plain, inline)]
On 2015-01-31 10:06, Niels Thykier wrote:
> [...]
>
> I have started a different approach (see [1] for WIP code). It is
> mostly a parallel track to your idea, so they can certainly co-exist.
>
> The goal of this approach is to:
>
> * Split harness into a "simple" coordinator
> * Remove the Lab as a (primary) data store (it is too fragile)
> * Harness state as datastore
> [...]
I got a patch series to implement this (see also [BRANCH]). I have
also managed to do a few tests on lilburn.debian.org with no issues.
The commit messages are at the bare minimum - apologies for that.
Review/remarks welcome.
With the rewrite:
* harness now uses ~16MB (rather than 700ish MB).
* harness now supports processing in "throw-away" labs (e.g. temp
labs cleaned up by lintian at the end of the run).
- Using $USE_PERMANENT_LAB = 0 in config
- $LINTIAN_SCRATCH_SPACE (if defined) becomes TMPDIR of the lintian
processes (and also the directory of the temp-labs).
* it is now possible to run harness on lintian.d.o with up to date
live data without affecting the automated run[1].
* it is possible to terminate lintian (e.g. kill -TERM) during a run
and harness will still mark the completed packages as up to date.
- caveat: sending harness a signal (e.g. ^C'ing harness) will not.
- caveat: Not recommended for $USE_PERMANENT_LAB = 1 cases.
* the full run is now implemented as "reschedule-all (eventually)" and
then an incremental run to start off with. The "clean mode" works
similarly, except it throws out the existing state files first.
What is missing / known issues:
* A patch to reduce the memory usage of html_reports (but that is for a
different patch series).
Nice to have / scope + feature creep:
* Have harness stop lintian when the time is up, rather than letting
it run to the end of the current run (which can be several hours
away).
* The state cache should probably be gzip'ed as it takes 30+MB of disk
(compressed is at 5-6ish MB)
Bonus features:
* Add --schedule-limit-groups parameter to change the limit of groups
to process. The timeout remains hard-coded though.
* It /should/ migrate the "last-processed-by" from the previous state
cache (untested). While it will still reschedule all packages,
their relative priority is retained.
Remarks on the patches:
* The patches 0001-0003 are self-contained and can be cherry-picked to
master (in any order in fact).
* The patches 0004-0008 are *not* self-contained and must all be
applied. I decided to have a patch per tool being updated, as it
was easier to write them that way.
- Nothing a "--no-ff" merge cannot solve though.
* The 0009-0011 are incremental patches on top of the 0004-0008 bundle.
If there are no reviews or objections, I intend to merge this later
this week and push it to lilburn.d.o to make graphs work again.
Thanks,
~Niels
[1] Take a copy of the config, logs/lintian.log and the harness-state.
Ensure $USE_PERMANENT_LAB is set to 0 and $LINTIAN_SCRATCH_SPACE is set
to a (for you) writable directory with enough capacity (probably *not*
/tmp). Furthermore change the rest of the config values to fit your
workspace and you are good to go.
You /may/ also want to pass --schedule-limit-groups N unless you do
not mind waiting several hours at worst.
[BRANCH]:
http://anonscm.debian.org/cgit/users/nthykier/lintian.git/log/?h=reporting-rewrite
NB: Rebased regularly.
[0001-Move-save_state_cache-to-L-Util.patch (application/mbox, attachment)]
[0002-L-Util-Add-untaint-subroutine.patch (application/mbox, attachment)]
[0003-lintian-Add-status-log-for-use-by-harness.patch (application/mbox, attachment)]
[0004-reporting-sync-state-New-internal-reporting-command.patch (application/mbox, attachment)]
[0005-Make-find_backlog-check-out-of-date-flag-in-the-stat.patch (application/mbox, attachment)]
[0006-Rewrite-harness-to-use-reporting-sync-state.patch (application/mbox, attachment)]
[0007-Rewrite-html_reports-to-use-harness-state-cache.patch (application/mbox, attachment)]
[0008-maintainer.tmpl-Use-out-of-date-marker.patch (application/mbox, attachment)]
[0009-harness-Move-backlog-time-out-handling-into-process_.patch (application/mbox, attachment)]
[0010-harness-Harness-lintian-in-a-subprocess.patch (application/mbox, attachment)]
[0011-r-harness-Add-schedule-limit-groups-cmd-line-argumen.patch (application/mbox, attachment)]
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian Lintian Maintainers <lintian-maint@debian.org>:
Bug#776658; Package lintian.
(Tue, 03 Feb 2015 08:33:08 GMT) (full text, mbox, link).
Acknowledgement sent
to Niels Thykier <niels@thykier.net>:
Extra info received and forwarded to list. Copy sent to Debian Lintian Maintainers <lintian-maint@debian.org>.
(Tue, 03 Feb 2015 08:33:08 GMT) (full text, mbox, link).
Message #25 received at 776658@bugs.debian.org (full text, mbox, reply):
On 2015-02-02 22:13, Niels Thykier wrote:
> I got a patch series to implement this (see also [BRANCH]). I have
> also managed to do a few tests on lilburn.debian.org with no issues.
> The commit messages are at the bare minimum - apologise for that.
> Review/remarks welcome.
>
> With the rewrite:
>
> * harness now uses ~16MB (rather than 700ish MB).
This was possibly a bit too short. The slightly longer version:
* The primary harness process only uses ~16MB.
* The sub-process running Lintian ends up using ~450MB. So the
  lintian run now has ~250MB of extra memory free.
* The html_reports tree will have ~700MB extra memory.
~Niels
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian Lintian Maintainers <lintian-maint@debian.org>:
Bug#776658; Package lintian.
(Mon, 23 Feb 2015 07:51:05 GMT) (full text, mbox, link).
Acknowledgement sent
to Niels Thykier <niels@thykier.net>:
Extra info received and forwarded to list. Copy sent to Debian Lintian Maintainers <lintian-maint@debian.org>.
(Mon, 23 Feb 2015 07:51:05 GMT) (full text, mbox, link).
Message #30 received at 776658@bugs.debian.org (full text, mbox, reply):
On 2015-01-30 18:30, Niels Thykier wrote:
> Package: lintian
> Version: 2.5.30+deb8u3
> Severity: important
>
> The reporting framework consumes a rather substantial amount of
> memory.
>
> [...]
>
> The html_reports process itself consumes up to 2GB while processing
> templates. It is possible that there is nothing we can do about that
> as there *is* a lot of data in play. But even then, we can free it as
> soon as possible (so we do not keep it while running gnuplot at the
> end of the run).
>
> Currently, when harness -i runs, the gnuplot process seems to die for
> "no apparent" reason. I suspect it is OOM'ed though harness +
> html_reports "only" consumes 65-70%ish of the memory available and
> gnuplot seems fairly cheap memory-wise in comparison.
> [...]
>
> ~Niels
>
>
I managed to reproduce the original issue locally. The problem is
indeed an OOM issue. The root cause is that html_reports reserves a
substantial part of the available memory, and with "overcommit memory"
set to false, the fork() call will fail due to insufficient memory. In
summary, we (briefly) require twice the memory usage of html_reports
when generating graphs.
This issue is fundamentally still present in html_reports, but the
changes so far have freed sufficient memory that lilburn^Wlindsay.d.o
has enough memory to pull off the fork.
Solutions may include the SQLite database Russ talked about or a
"pre-fork" the graph generation process. Sadly, reading the lintian.log
is both the primary source of our memory consumption *and* a hard
dependency for generating the graphs.
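The "pre-fork" idea can be sketched as follows (an assumed design, not lintian code; the summing child below is a stand-in for gnuplot): start the helper while the parent is still small, then load the large log and stream data points to it over a pipe, so the fork never happens from a multi-GB process:

```python
import subprocess
import sys

# Start the plotting helper *before* the parent grows.  Forking now is
# cheap; forking after loading a multi-GB log would briefly require twice
# the parent's memory with overcommit disabled.
helper = subprocess.Popen(
    [sys.executable, "-c",
     "import sys; print(sum(int(line) for line in sys.stdin))"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

# Now the parent may grow as large as it likes; the helper already exists.
big_log = list(range(1000))           # stand-in for parsing lintian.log
for entry in big_log:
    helper.stdin.write(f"{entry}\n")  # stream data points to the helper

out, _ = helper.communicate()         # close the pipe, collect the result
print(out.strip())
```

The streaming also means the helper never needs the whole data set in memory either; it only sees one data point at a time.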
~Niels
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian Lintian Maintainers <lintian-maint@debian.org>:
Bug#776658; Package lintian.
(Sun, 29 Mar 2015 08:15:04 GMT) (full text, mbox, link).
Acknowledgement sent
to Niels Thykier <niels@thykier.net>:
Extra info received and forwarded to list. Copy sent to Debian Lintian Maintainers <lintian-maint@debian.org>.
(Sun, 29 Mar 2015 08:15:05 GMT) (full text, mbox, link).
Message #35 received at 776658@bugs.debian.org (full text, mbox, reply):
Minor update:
I have written and merged a few more patches to further reduce the
memory consumption of html_reports. Based on my local testing, these
changes reduce the "general case" memory consumption of html_reports by
~25%.
They work by reducing the cost of "sharing" data between two tags, so
they are only applicable when lintian emits 2 or more tags for the same
package.
~Niels
Information forwarded
to debian-bugs-dist@lists.debian.org, Debian Lintian Maintainers <lintian-maint@debian.org>:
Bug#776658; Package lintian.
(Sun, 16 Jul 2017 09:30:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Niels Thykier <niels@thykier.net>:
Extra info received and forwarded to list. Copy sent to Debian Lintian Maintainers <lintian-maint@debian.org>.
(Sun, 16 Jul 2017 09:30:03 GMT) (full text, mbox, link).
Message #40 received at 776658@bugs.debian.org (full text, mbox, reply):
On Fri, 30 Jan 2015 17:38:40 -0800 Russ Allbery <rra@debian.org> wrote:
> Niels Thykier <niels@thykier.net> writes:
>
> > The html_reports process itself consumes up to 2GB while processing
> > templates. It is possible that there is nothing we can do about that
> > as there *is* a lot of data in play. But even then, we can free it as
> > soon as possible (so we do not keep it while running gnuplot at the
> > end of the run).
>
> I think the code currently takes a very naive approach and loads the
> entire state of the world into memory, and Perl's memory allocation is
> known to aggressively trade space for speed.
>
> If instead it stored the various things it cared about in a local SQLite
> database, it would be a bit slower, but it would consume much less
> memory. I bet the speed difference wouldn't be too bad. And this would
> have the possibly useful side effect of creating a SQLite database full of
> interesting statistics that one could run rich queries against.
>
> --
> Russ Allbery (rra@debian.org) <http://www.eyrie.org/~eagle/>
>
>
Hi Russ (and others),
I have been considering expanding the scope of the reporting framework
to include testing and to untangle the suite-related data in the
reports (i.e. get a separate report for each suite).
If I am to add that, I think a database might be the only realistic way
forward for that with all the bells and whistles we currently have.
However, I do not really have a lot of experience with Perl database
frameworks, so I could use some help here.
I suspect it would make sense to obsolete the harness state cache as
well (the YAML file), so most tools only need to deal with the database.
FTR, I would probably ask for a postgres database on lindsay.d.o. That
said, SQLite support would be great for local testing.
Thanks,
~Niels
Changed Bug title to 'litian.d.o: Use database to reduce memory footprint' from 'lintian: Memory consumption of harness and html_reports'.
Request was from Felix Lechner <felix.lechner@lease-up.com>
to control@bugs.debian.org.
(Mon, 30 Mar 2020 14:42:02 GMT) (full text, mbox, link).
Changed Bug title to 'lintian.d.o: Use database to reduce memory footprint' from 'litian.d.o: Use database to reduce memory footprint'.
Request was from Felix Lechner <felix.lechner@lease-up.com>
to control@bugs.debian.org.
(Mon, 30 Mar 2020 14:51:02 GMT) (full text, mbox, link).
Reply sent
to Felix Lechner <felix.lechner@lease-up.com>:
You have taken responsibility.
(Mon, 20 Apr 2020 00:42:04 GMT) (full text, mbox, link).
Notification sent
to Niels Thykier <niels@thykier.net>:
Bug acknowledged by developer.
(Mon, 20 Apr 2020 00:42:04 GMT) (full text, mbox, link).
Message #49 received at 776658-done@bugs.debian.org (full text, mbox, reply):
Hi,
On 2017-07-16, Niels Thykier wrote:
>
> I think a database might be the only realistic way forward
The reporting framework now uses an SQLite database. The framework is
separate from Lintian. There is a tag sieve that scans the archive [1],
and a public website to inspect the results [2]. Both are connected via
the same SQLite database.
[1] https://salsa.debian.org/lechner/taxiv
[2] https://salsa.debian.org/lechner/detagtive
I will move the two repos to our team area when they are ready.
The database import is helped greatly by a new, experimental JSON
output mode in Lintian.
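That import path is cheap to sketch. The exact shape of Lintian's experimental JSON output is not shown in this log, so the records below are hypothetical; the point is only how directly JSON findings map onto an SQLite table:

```python
import json
import sqlite3

# Hypothetical findings in a JSON shape invented for this sketch; the real
# Lintian JSON output mode may use different field names and nesting.
findings_json = """
[{"package": "foo", "version": "1.0", "tag": "spelling-error-in-binary"},
 {"package": "foo", "version": "1.0", "tag": "package-has-long-file-name"},
 {"package": "bar", "version": "2.1", "tag": "spelling-error-in-binary"}]
"""

conn = sqlite3.connect(":memory:")  # on disk this would be the reporting DB
conn.execute("CREATE TABLE findings (package TEXT, version TEXT, tag TEXT)")
# Named parameters let each JSON object be inserted as-is.
conn.executemany(
    "INSERT INTO findings VALUES (:package, :version, :tag)",
    json.loads(findings_json))
conn.commit()

(n,) = conn.execute("SELECT COUNT(*) FROM findings").fetchone()
print(n)
```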
The SQLite database appears sufficient for the time being. Its size
is approximately 230 MB with indices, and 150 MB without. Compressed,
the database is about 15 MB, and therefore just a bit larger than the
traditional lintian.log.gz. If historical information is worth
keeping, we may ask for a Postgres instance.
Also, we have a new lintian.d.o in beta. Please let us know what you think.
It is not clear that the database will reduce the memory footprint,
but I am closing this bug.
Kind regards
Felix Lechner
Bug archived.
Request was from Debbugs Internal Request <owner@bugs.debian.org>
to internal_control@bugs.debian.org.
(Mon, 18 May 2020 07:29:50 GMT) (full text, mbox, link).