Debian Bug report logs - #636292
dak/apt: deficiencies at handling out-of-sync metadata

Package: apt; Maintainer for apt is APT Development Team <deity@lists.debian.org>; Source for apt is src:apt.

Reported by: jidanni@jidanni.org

Date: Tue, 2 Aug 2011 01:15:02 UTC

Severity: normal

Merged with 582352

Found in version apt/0.8.15.5

Summary: Well, that has two problems we have observed in practice:


Report forwarded to debian-bugs-dist@lists.debian.org, spaillard@debian.org, debian-mirrors@lists.debian.org, hmh@debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Tue, 02 Aug 2011 01:15:05 GMT) Full text and rfc822 format available.

Acknowledgement sent to jidanni@jidanni.org:
New Bug report received and forwarded. Copy sent to spaillard@debian.org, debian-mirrors@lists.debian.org, hmh@debian.org, APT Development Team <deity@lists.debian.org>. (Tue, 02 Aug 2011 01:15:05 GMT) Full text and rfc822 format available.

Message #5 received at submit@bugs.debian.org (full text, mbox):

From: jidanni@jidanni.org
To: submit@bugs.debian.org
Subject: MD5Sum mismatch is due to multiple DNS queries!
Date: Tue, 02 Aug 2011 09:10:40 +0800
X-Debbugs-Cc: spaillard@debian.org, debian-mirrors@lists.debian.org, hmh@debian.org
Package: apt
Version: 0.8.15.5

I think I have a very good idea of what is causing all those MD5Sum
mismatch errors during apt-get update.
( http://article.gmane.org/gmane.linux.debian.user.mirrors/1368 )

You see during a single apt-get update, there will be TWO (2) queries
made to the DNS server for each ONE (1) line in a sources.list file.

I believe one query gets the thing. The other gets the checksum of the
thing.

Now you can guess what will happen when that one line is a round robin
site name.

Yup, if the _two different machines_ now being called are slightly out of
sync, naturally the checksums will not match!

The cure is to fix apt so that it only makes one query!

Making a second query not only does not even out the total load on the
servers any more, it also means there are several windows of time each
day when you are comparing apples from machine 1 to oranges from machine
2! Keep it all on one machine and you will be safe.

You can test it yourself. Turn on verbose debugging in your DNS server,
and do apt-get update, and check the log. Voila, two queries for each one line
in sources.list!

Now try a
$ ping example.com

Check your DNS logs. Only one DNS query is made, despite many repeated
connections. Ping has got it right. Apt has got it wrong.




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Wed, 03 Aug 2011 21:27:11 GMT) Full text and rfc822 format available.

Acknowledgement sent to jidanni@jidanni.org:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Wed, 03 Aug 2011 21:27:13 GMT) Full text and rfc822 format available.

Message #10 received at 636292@bugs.debian.org (full text, mbox):

From: jidanni@jidanni.org
To: debian-devel@lists.debian.org
Cc: 636292@bugs.debian.org, spaillard@debian.org, hmh@debian.org
Subject: apt MD5Sum mismatch is due to multiple DNS queries!
Date: Thu, 04 Aug 2011 05:24:59 +0800
Gentlemen, junior programmer me has finally found the reason
behind apt's MD5Sum mismatches: multiple DNS queries!
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=636292




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Thu, 04 Aug 2011 08:15:11 GMT) Full text and rfc822 format available.

Acknowledgement sent to Samuel Thibault <sthibault@debian.org>:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Thu, 04 Aug 2011 08:15:13 GMT) Full text and rfc822 format available.

Message #15 received at 636292@bugs.debian.org (full text, mbox):

From: Samuel Thibault <sthibault@debian.org>
To: jidanni@jidanni.org, 636292@bugs.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Thu, 4 Aug 2011 10:12:41 +0200
retitle 636292 MD5Sum mismatch error
thanks

jidanni@jidanni.org, on Tue 02 Aug 2011 09:10:40 +0800, wrote:
> I think I have a very good idea of what is causing all those MD5Sum
> mismatch errors during apt-get update.
> ( http://article.gmane.org/gmane.linux.debian.user.mirrors/1368 )
> 
> You see during a single apt-get update, there will be TWO (2) queries
> made to the DNS server for each ONE (1) line in a sources.list file.
> 
> I believe one query gets the thing. The other gets the checksum of the
> thing.
> 
> Now you can guess what will happen when that one line is a round robin
> site name.

I'm getting the error on all ftp.{uk,ch,fr}.debian.org sites, which do
not use round robin at all.

Samuel




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Thu, 04 Aug 2011 08:27:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to jidanni@jidanni.org:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Thu, 04 Aug 2011 08:27:03 GMT) Full text and rfc822 format available.

Message #20 received at 636292@bugs.debian.org (full text, mbox):

From: jidanni@jidanni.org
To: sthibault@debian.org
Cc: 636292@bugs.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Thu, 04 Aug 2011 16:24:18 +0800
>>>>> "ST" == Samuel Thibault <sthibault@debian.org> writes:
ST> I'm getting the error on all ftp.{uk,ch,fr}.debian.org sites, which do
ST> not use round robin at all.
All I know is rocky-mountain.csail.mit.edu is rock solid. Try that.




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Fri, 05 Aug 2011 01:57:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Jonathan Nieder <jrnieder@gmail.com>:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Fri, 05 Aug 2011 01:57:03 GMT) Full text and rfc822 format available.

Message #25 received at 636292@bugs.debian.org (full text, mbox):

From: Jonathan Nieder <jrnieder@gmail.com>
To: jidanni@jidanni.org
Cc: 636292@bugs.debian.org
Subject: Re: MD5Sum mismatch is due to multiple DNS queries!
Date: Fri, 5 Aug 2011 03:52:24 +0200
Hi,

積丹尼 wrote:

> You can test it yourself. Turn on verbose debugging in your DNS server,
> and do apt-get update, and check the log. Voila, two queries for each one line
> in sources.list!

That particular consequence of mirrors' use of round-robin DNS is
tracked as Bug#582352.  As far as I can tell, it violates the HTTP
spec and can confuse proxies even if the clients are fixed.  I would
be willing to carry out a protocol change to make this work (doing one
DNS query and using the IP as hostname from then on), but it's not
clear anyone involved is interested, so for now I just avoid
round-robin DNS in sources.list on machines I manage.

Thanks for the reproduction recipe.
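
A minimal sketch of the "one DNS query" idea, not taken from apt itself: resolve the mirror name once and reuse that single address for every request in the run. The names pin_mirror and request_head are hypothetical, and the sketch keeps the original name in the Host header rather than using the IP as hostname, which is a different choice from the protocol change described above.

```python
import socket


def pin_mirror(hostname, port=80, resolver=socket.getaddrinfo):
    """Resolve hostname once and return one IP to reuse for the whole run.

    resolver is injectable so the behaviour can be checked without network
    access; by default it is the standard socket.getaddrinfo.
    """
    infos = resolver(hostname, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return infos[0][4][0]


def request_head(hostname, path):
    """Build an HTTP/1.1 request head that still names the mirror alias.

    Assumption: the client connects to the pinned IP but sends the
    original hostname, so virtual hosting on the mirror keeps working.
    """
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {hostname}\r\n"
        "Connection: keep-alive\r\n\r\n"
    )
```

Every file of one update would then be requested from the address returned by the single pin_mirror call, so a round-robin alias cannot hand out a second machine halfway through.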




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Fri, 05 Aug 2011 02:27:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to jidanni@jidanni.org:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Fri, 05 Aug 2011 02:27:03 GMT) Full text and rfc822 format available.

Message #30 received at 636292@bugs.debian.org (full text, mbox):

From: jidanni@jidanni.org
To: jrnieder@gmail.com
Cc: 636292@bugs.debian.org, control@bugs.debian.org
Subject: Re: MD5Sum mismatch is due to multiple DNS queries!
Date: Fri, 05 Aug 2011 10:24:40 +0800
forcemerge 636292 582352
thanks
>>>>> "JN" == Jonathan Nieder <jrnieder@gmail.com> writes:
JN> 積丹尼 wrote:

>> You can test it yourself. Turn on verbose debugging in your DNS server,
>> and do apt-get update, and check the log. Voila, two queries for each one line
>> in sources.list!

JN> That particular consequence of mirrors' use of round-robin DNS is
JN> tracked as Bug#582352.  As far as I can tell, it violates the HTTP
JN> spec and can confuse proxies even if the clients are fixed.  I would
JN> be willing to carry out a protocol change to make this work (doing one
JN> DNS query and using the IP as hostname from then on), but it's not
JN> clear anyone involved is interested, so for now I just avoid
JN> round-robin DNS in sources.list on machines I manage.

JN> Thanks for the reproduction recipe.
I'll forcemerge the bugs. That will swing them into action.




Forcibly Merged 582352 636292. Request was from jidanni@jidanni.org to control@bugs.debian.org. (Fri, 05 Aug 2011 03:24:05 GMT) Full text and rfc822 format available.

Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Fri, 05 Aug 2011 17:42:05 GMT) Full text and rfc822 format available.

Acknowledgement sent to Kurt Roeckx <kurt@roeckx.be>:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Fri, 05 Aug 2011 17:42:05 GMT) Full text and rfc822 format available.

Message #37 received at 636292@bugs.debian.org (full text, mbox):

From: Kurt Roeckx <kurt@roeckx.be>
To: jidanni@jidanni.org, 636292@bugs.debian.org
Cc: debian-mirrors@lists.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Fri, 5 Aug 2011 19:40:09 +0200
On Tue, Aug 02, 2011 at 09:10:40AM +0800, jidanni@jidanni.org wrote:
> 
> I think I have a very good idea of what is causing all those MD5Sum
> mismatch errors during apt-get update.
> ( http://article.gmane.org/gmane.linux.debian.user.mirrors/1368 )
> 
> You see during a single apt-get update, there will be TWO (2) queries
> made to the DNS server for each ONE (1) line in a sources.list file.

I'm not sure what you mean.  I do see 2 queries, but it's for
the A (ipv4) and AAAA (ipv6) record:
19:03:20.575070 IP localhost.35750 > localhost.domain: 41865+ A?  ftp.be.debian.org. (35)
19:03:20.575688 IP localhost.domain > localhost.35750: 41865 1/4/7 A 77.243.184.65 (281)
19:03:20.575885 IP localhost.35750 > localhost.domain: 48866+ AAAA? ftp.be.debian.org. (35)
19:03:20.576190 IP localhost.domain > localhost.35750: 48866 1/4/7 AAAA 2a01:300:11:4:2e0:81ff:fe63:cdb2 (293)

There are no other queries, and this is perfectly normal.  There
is nothing wrong with this.

Even with multiple lines in the sources.list file I only see those
2 requests.

(tested with apt 0.8.15.4, I doubt 0.8.15.5 behaves differently.)

As far as I know, the hash sum mismatches have one of these
causes:
- They use an old version of the mirror script that didn't exclude
  InRelease in the first stage.  As a result the InRelease file
  was already updated while the Packages/Sources files were not
  updated until much later.  This has been a problem since
  ftp-master started generating those InRelease files, which was
  just after the squeeze release.
- There is always a delay between updating the Release file and
  the Packages and Sources file, and the error should go away
  after a short time.
- ftp-master generated broken files for some reason.  It sometimes
  happens, but not that often.

So I suggest you make sure that all the mirrors that you see
an issue with have updated their mirror script, since I think
that's the biggest issue at the moment.

This was fixed with this commit in archvsync:
commit 77223bb1af262e139a898020a05680e932d51888
Author: Joerg Jaspert <joerg@debian.org>
Date:   Tue Feb 22 22:32:13 2011 +0100

    ftpsync

    update rsync_options1 to also exclude the newish InRelease files in the first run

    Signed-off-by: Joerg Jaspert <joerg@debian.org>

This is part of the 80387 version that you can find in
project/ftpsync/ on the Debian mirrors.  80387 was released
the next day.

If they are using this script to update the mirror, you should
be able to see the version in project/trace/

If there is no version in that file (only a date) they're probably
using an even older script that's also broken.

If they're not using that script or the latest version of it, you
will very likely see the hash sum issues during the mirror sync.
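
The trace file check can be sketched roughly as follows. This assumes the trace file carries a line of the form "Used ftpsync version: 80387"; that line format, the function names, and the plain numeric comparison of version strings are assumptions for illustration, not something confirmed in this thread.

```python
import re

# First ftpsync version containing the InRelease fix, per the message above.
FIXED_VERSION = 80387


def ftpsync_version(trace_text):
    """Return the ftpsync version found in a trace file, or None.

    None corresponds to the "only a date" case, i.e. a script too old
    to record its version. The line format matched here is assumed.
    """
    match = re.search(r"^Used ftpsync version:\s*(\d+)", trace_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def mirror_script_ok(trace_text):
    """True if the trace file reports a version at or after the fix.

    Assumption: versions can be compared as plain integers.
    """
    version = ftpsync_version(trace_text)
    return version is not None and version >= FIXED_VERSION
```

Running mirror_script_ok over the trace file of each backend behind a multi-mirror alias would flag the ones that still need a script update.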


Another issue might be that you're behind some broken transparent
proxy and your connection gets directed to a different server for
each file you get.  As far as I know apt will only open 1
connection to the server and requests all files over that single
connection, so this really shouldn't happen.


Kurt





Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Fri, 05 Aug 2011 20:45:05 GMT) Full text and rfc822 format available.

Acknowledgement sent to Henrique de Moraes Holschuh <hmh@debian.org>:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Fri, 05 Aug 2011 20:45:05 GMT) Full text and rfc822 format available.

Message #42 received at 636292@bugs.debian.org (full text, mbox):

From: Henrique de Moraes Holschuh <hmh@debian.org>
To: Kurt Roeckx <kurt@roeckx.be>
Cc: jidanni@jidanni.org, 636292@bugs.debian.org, debian-mirrors@lists.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Fri, 5 Aug 2011 17:43:28 -0300
(cc's kept since I am not really sure everyone involved is subscribed
to debian-mirrors.  If you want me to start trimming them down, please
say so).

On Fri, 05 Aug 2011, Kurt Roeckx wrote:
> Even with multiple lines in the sources.list file I only see those
> 2 requests.

Hmm, a normal request like this is supposed to return a number of A or
AAAA records for, e.g. ftp.us.debian.org, and not just one.

Just so that we can close that door completely, does apt do the right
thing and use always the same A record or AAAA record from the returned
set, switching to the next one only if there are problems?  I believe it
does it right, but it would be nice to have a definitive answer on it
(and I don't really grok apt to take a quick look at the source to check
it myself).

> (tested with apt 0.8.15.4, I doubt 0.8.15.5 behaves differently.)
> 
> As far as I know, the hash sum mismatches have one of these
> causes:
> - They use an old version of the mirror script that didn't exclude
>   InRelease in the first stage.  As a result the InRelease file
>   was already updated while the Packages/Sources files were not
>   updated until much later.  This has been a problem since
>   ftp-master started generating those InRelease files, which was
>   just after the squeeze release.
> - There is always a delay between updating the Release file and
>   the Packages and Sources file, and the error should go away
>   after a short time.
> - ftp-master generated broken files for some reason.  It sometimes
>   happens, but not that often.
> 
> So I suggest you make sure that all the mirrors that you see
> an issue with have updated their mirror script, since I think
> that's the biggest issue at the moment.

That is actually quite possible.  However, it is also something we can
assert for sure:

> This was fixed with this commit in archvsync:
> commit 77223bb1af262e139a898020a05680e932d51888
> Author: Joerg Jaspert <joerg@debian.org>
> Date:   Tue Feb 22 22:32:13 2011 +0100
> 
>     ftpsync
> 
>     update rsync_options1 to also exclude the newish InRelease files in the first run
> 
>     Signed-off-by: Joerg Jaspert <joerg@debian.org>
> 
> This is part of the 80387 version that you can find in
> project/ftpsync/ on the Debian mirrors.  80387 was released
> the next day.
> 
> If they are using this script to update the mirror, you should
> be able to see the version in project/trace/
>
> If there is no version in that file (only a date) they're probably
> using an even older script that's also broken.

So, it is time to inspect the project/trace/* files in every mirror on
the multi-mirror aliases that users have complained about.

> Another issue might be that you're behind some broken transparent
> proxy and your connection gets directed to a different server for
> each file you get.  As far as I know apt will only open 1
> connection to the server and requests all files over that single
> connection, so this really shouldn't happen.

That might not be true if it is a http/1.0 proxy, or if persistent
connections get disabled for whatever reason.  In that case, apt would
have to make multiple connections, and therefore any proxy, transparent
or not, would likely round-robin over the multiple A and AAAA records.

The answer for that would be to update our repository format to have
something seqlock-like to allow apt to detect metadata generation
mismatch, and thus be able to automatically refetch things until it gets
all metadata with the same generation number:
http://en.wikipedia.org/wiki/Seqlock

Maybe using rsync or ftp can help, if it enforces the "get everything
using the same connection" that http might or might not allow apt to do.
But that does NOT scale well at the mirror server side, at all.
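
To make the seqlock-like idea concrete, here is a rough sketch under stated assumptions: every metadata file would carry a generation number, and the client refetches the whole set until all numbers agree. No such field exists in the repository format today; fetch_consistent and the fetch callback are hypothetical names.

```python
def fetch_consistent(fetch, names, max_tries=3):
    """Fetch all named metadata files until they share one generation.

    fetch(name) is a hypothetical callback returning (generation, data).
    Returns a dict name -> data, or raises if the mirror set never
    presents a consistent snapshot within max_tries rounds.
    """
    for _ in range(max_tries):
        results = {name: fetch(name) for name in names}
        generations = {generation for generation, _data in results.values()}
        if len(generations) == 1:
            # Every file comes from the same metadata generation.
            return {name: data for name, (_gen, data) in results.items()}
        # Skew detected (e.g. Release newer than Packages): fetch again.
    raise RuntimeError("metadata generations still differ after retries")
```

The point of the design is that a mismatch becomes a detectable, retryable condition instead of a hash failure reported to the user.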

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Fri, 05 Aug 2011 21:30:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to jidanni@jidanni.org:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Fri, 05 Aug 2011 21:30:06 GMT) Full text and rfc822 format available.

Message #47 received at 636292@bugs.debian.org (full text, mbox):

From: jidanni@jidanni.org
To: hmh@debian.org
Cc: kurt@roeckx.be, 636292@bugs.debian.org, debian-mirrors@lists.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Sat, 06 Aug 2011 05:26:31 +0800
Oh my god even my "rock solid" rocky-mountain server is crumbling today:

W: Failed to fetch http://rocky-mountain.csail.mit.edu/debian/dists/experimental/main/binary-i386/PackagesIndex  MD5Sum mismatch

W: Failed to fetch http://rocky-mountain.csail.mit.edu/debian/dists/unstable/main/binary-i386/PackagesIndex  MD5Sum mismatch

E: Some index files failed to download. They have been ignored, or old ones used instead.

My theories are up in the air. My reputation is ruined.




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Fri, 05 Aug 2011 22:09:05 GMT) Full text and rfc822 format available.

Acknowledgement sent to jidanni@jidanni.org:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Fri, 05 Aug 2011 22:09:05 GMT) Full text and rfc822 format available.

Message #52 received at 636292@bugs.debian.org (full text, mbox):

From: jidanni@jidanni.org
To: hmh@debian.org
Cc: kurt@roeckx.be, 636292@bugs.debian.org, debian-mirrors@lists.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Sat, 06 Aug 2011 06:07:19 +0800
Ha ha ha, it really does split a single apt-get update into two
different places completely across the Internet.

And maybe even for singular servers like rocky-mountain... maybe
upstream from it is the same splitting problem somewhere.

Anyway here we go:
# cat /etc/apt/sources.list.d/*
deb http://ftp.us.debian.org/debian unstable contrib
# tcpflow -i ppp0 &
# apt-get update
# ls -og /tmp/m
-rw-r--r-- 1 146150 Aug  6 05:54 064.050.233.100.00080-218.163.001.135.45826
-rw-r--r-- 1  68985 Aug  6 05:54 199.006.012.070.00080-218.163.001.135.56243
-rw-r--r-- 1    185 Aug  6 05:54 218.163.001.135.45826-064.050.233.100.00080
-rw-r--r-- 1   1432 Aug  6 05:54 218.163.001.135.56243-199.006.012.070.00080
$ host ftp.us.debian.org
ftp.us.debian.org has address 128.30.2.36
ftp.us.debian.org has address 199.6.12.70
ftp.us.debian.org has address 35.9.37.225
ftp.us.debian.org has address 64.50.233.100
ftp.us.debian.org has address 64.50.236.52
ftp.us.debian.org has IPv6 address 2001:500:61:28::70
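
For anyone repeating this, a small sketch that reads the tcpflow file names and lists which servers answered on port 80. It assumes the name layout seen in the listing above (zero-padded source address and port, a dash, then destination address and port); the function names are made up for illustration.

```python
def parse_flow_name(name):
    """Split a tcpflow file name into ((src_ip, src_port), (dst_ip, dst_port))."""
    def endpoint(text):
        parts = text.split(".")
        # First four fields are zero-padded octets, the fifth is the port.
        ip = ".".join(str(int(octet)) for octet in parts[:4])
        return ip, int(parts[4])

    src, dst = name.split("-")
    return endpoint(src), endpoint(dst)


def http_servers(names, port=80):
    """Return the sorted set of addresses that appear with the given port."""
    servers = set()
    for name in names:
        for ip, endpoint_port in parse_flow_name(name):
            if endpoint_port == port:
                servers.add(ip)
    return sorted(servers)
```

More than one address in the result for a single sources.list line means the update was split across machines.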




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Sat, 06 Aug 2011 01:33:06 GMT) Full text and rfc822 format available.

Acknowledgement sent to jidanni@jidanni.org:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Sat, 06 Aug 2011 01:33:06 GMT) Full text and rfc822 format available.

Message #57 received at 636292@bugs.debian.org (full text, mbox):

From: jidanni@jidanni.org
To: kurt@roeckx.be
Cc: 636292@bugs.debian.org, debian-mirrors@lists.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Sat, 06 Aug 2011 09:30:44 +0800
>>>>> "KR" == Kurt Roeckx <kurt@roeckx.be> writes:
KR> - There is always a delay between updating the Release file and
KR>   the Packages and Sources file, and the error should go away
KR>   after a short time.

NOT acceptable.
I hope on the mirrors they are not doing something like
$ cd staging_area && wget a b
when they should be doing
$ wget a b && mv a b staging_area




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Sat, 06 Aug 2011 01:39:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to jidanni@jidanni.org:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Sat, 06 Aug 2011 01:39:03 GMT) Full text and rfc822 format available.

Message #62 received at 636292@bugs.debian.org (full text, mbox):

From: jidanni@jidanni.org
To: kurt@roeckx.be
Cc: 636292@bugs.debian.org, debian-mirrors@lists.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Sat, 06 Aug 2011 09:37:50 +0800
> $ wget a b && mv a b staging_area
With a and b and staging_area all being on the same disk partition, for
an almost atomic operation...
OK this is probably not the culprit today, but it is just good practice.
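
A sketch of that practice in Python, assuming flat file names and a temporary file created inside the destination directory so the final rename stays on one filesystem. The function name publish is hypothetical. Each rename is atomic on its own; the set of renames as a whole is not.

```python
import os
import tempfile


def publish(files, live_dir):
    """Write every file completely first, then rename them into place.

    files maps a plain file name (no directories, an assumption of this
    sketch) to its bytes. Readers of live_dir never see a half-written
    file, because os.replace swaps in a finished one.
    """
    staged = []
    for name, data in files.items():
        # The temp file lives in live_dir so the rename is same-filesystem.
        fd, tmp_path = tempfile.mkstemp(dir=live_dir, prefix="." + name + ".")
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        staged.append((tmp_path, os.path.join(live_dir, name)))
    # Only after all downloads are complete do the quick renames happen.
    for tmp_path, final_path in staged:
        os.replace(tmp_path, final_path)
```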




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Sat, 06 Aug 2011 02:03:05 GMT) Full text and rfc822 format available.

Acknowledgement sent to jidanni@jidanni.org:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Sat, 06 Aug 2011 02:03:05 GMT) Full text and rfc822 format available.

Message #67 received at 636292@bugs.debian.org (full text, mbox):

From: jidanni@jidanni.org
To: hmh@debian.org
Cc: kurt@roeckx.be, 636292@bugs.debian.org, debian-mirrors@lists.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Sat, 06 Aug 2011 09:59:51 +0800
>>>>> "H" == Henrique de Moraes Holschuh <hmh@debian.org> writes:
H> Maybe using rsync or ftp can help, if it enforces the "get everything
H> using the same connection" that http might or might not allow apt to do.
H> But that does NOT scale well at the mirror server side, at all.

Well whatever you do, remember a+b+c+a+b+c=a+a+b+b+c+c, so please be
sure no round robin switching is occurring when it shouldn't, whether
during user operations or mirror operations. In the big picture the load all
evens out anyway, so no savings are had, and instead errors are introduced.




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Sat, 06 Aug 2011 09:54:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Kurt Roeckx <kurt@roeckx.be>:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Sat, 06 Aug 2011 09:54:07 GMT) Full text and rfc822 format available.

Message #72 received at 636292@bugs.debian.org (full text, mbox):

From: Kurt Roeckx <kurt@roeckx.be>
To: jidanni@jidanni.org
Cc: 636292@bugs.debian.org, debian-mirrors@lists.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Sat, 6 Aug 2011 11:50:51 +0200
On Sat, Aug 06, 2011 at 09:30:44AM +0800, jidanni@jidanni.org wrote:
> >>>>> "KR" == Kurt Roeckx <kurt@roeckx.be> writes:
> KR> - There is always a delay between updating the Release file and
> KR>   the Packages and Sources file, and the error should go away
> KR>   after a short time.
> 
> NOT acceptable.
> I hope on the mirrors they are not doing something like
> $ cd staging_area && wget a b
> when they should be doing
> $ wget a b && mv a b staging_area

Except that it's about 1000 files. This is basically what rsync
--delay-updates does, and what is being used.  And on a very busy
mirror this can actually take some time to do.


Kurt





Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Sat, 06 Aug 2011 21:30:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to jidanni@jidanni.org:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Sat, 06 Aug 2011 21:30:04 GMT) Full text and rfc822 format available.

Message #77 received at 636292@bugs.debian.org (full text, mbox):

From: jidanni@jidanni.org
To: kurt@roeckx.be
Cc: 636292@bugs.debian.org, debian-mirrors@lists.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Sun, 07 Aug 2011 05:26:40 +0800
>>>>> "KR" == Kurt Roeckx <kurt@roeckx.be> writes:
KR> Except that it's about 1000 files. This is basically what rsync
KR> --delay-updates does, and what is being used.  And on a very busy
KR> mirror this can actually take some time to do.
Well all I know is the 998 .debs should be done first.
Then the 1 index file and 1 checksum file second.
And that second step being as atomic as $ ln a b staging_area
You get into trouble when you put the president on the same slow train as
the common person, even if he is supposed to arrive after the other
participants are seated.




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Sat, 06 Aug 2011 23:57:07 GMT) Full text and rfc822 format available.

Acknowledgement sent to Kurt Roeckx <kurt@roeckx.be>:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Sat, 06 Aug 2011 23:57:07 GMT) Full text and rfc822 format available.

Message #82 received at 636292@bugs.debian.org (full text, mbox):

From: Kurt Roeckx <kurt@roeckx.be>
To: jidanni@jidanni.org
Cc: 636292@bugs.debian.org, debian-mirrors@lists.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Sun, 7 Aug 2011 01:56:03 +0200
On Sun, Aug 07, 2011 at 05:26:40AM +0800, jidanni@jidanni.org wrote:
> >>>>> "KR" == Kurt Roeckx <kurt@roeckx.be> writes:
> KR> Except that it's about 1000 files. This is basically what rsync
> KR> --delay-updates does, and what is being used.  And on a very busy
> KR> mirror this can actually take some time to do.
> Well all I know is the 998 .debs should be done first.
> Then the 1 index file and 1 checksum file second.

No, this is 1000 index files.  Please note that we have more than 1
suite and more than 1 arch, and each of those have several files.
Just take a look at the Release file itself to know how many files
need to be updated at the same time.

The new .debs are done first, so that if you get a Packages or
Sources file, you can actually download the files mentioned in
those files.  They are directly copied to the correct place since
they are new files and not updated files.

Then the Packages, Sources, Release and other files are first
all transferred, then moved to the correct place.

After that old files are removed.  And ftp-master only removes
them a few days after no Packages or Sources file mentions
them anymore.

The critical part is moving all the Packages/Sources/Release files
to the new place.  You want to do that in as short a time as
possible.

The problem you're most likely seeing is that the InRelease file
is done together with copying the .deb files, while it should be
part of the Packages/Sources/Release files part.  And I already
explained that part.

> And that second step being as atomic as $ ln a b staging_area

But also note that an atomic update on the server side doesn't
help.  If I start downloading the Release file, and while I'm
downloading it the Release/Packages files are
updated on the server, and I then download a Packages file, the
Packages and Release files still won't be from the same time.
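
The mismatch itself is just a failed comparison of the kind sketched below: the downloaded index is hashed and compared with the entry the Release file lists for it. This is a simplified illustration, not apt's code; it assumes the usual "MD5Sum:" section with indented "hash size path" lines, and the function names are made up.

```python
import hashlib


def release_md5sums(release_text):
    """Parse the MD5Sum section of a Release file into path -> (md5, size)."""
    entries = {}
    in_section = False
    for line in release_text.splitlines():
        if line.startswith("MD5Sum:"):
            in_section = True
        elif in_section and line.startswith(" "):
            digest, size, path = line.split()
            entries[path] = (digest, int(size))
        else:
            # Any other field header ends the section.
            in_section = False
    return entries


def matches_release(release_text, path, data):
    """True if data has the size and MD5 that Release lists for path."""
    expected = release_md5sums(release_text).get(path)
    if expected is None:
        return False
    digest, size = expected
    return len(data) == size and hashlib.md5(data).hexdigest() == digest
```

If Release was fetched before a mirror push and Packages after it, matches_release returns False even though both files are individually intact.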


Kurt





Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Sun, 07 Aug 2011 13:36:06 GMT) Full text and rfc822 format available.

Acknowledgement sent to Henrique de Moraes Holschuh <hmh@debian.org>:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Sun, 07 Aug 2011 13:36:06 GMT) Full text and rfc822 format available.

Message #87 received at 636292@bugs.debian.org (full text, mbox):

From: Henrique de Moraes Holschuh <hmh@debian.org>
To: Kurt Roeckx <kurt@roeckx.be>
Cc: jidanni@jidanni.org, 636292@bugs.debian.org, debian-mirrors@lists.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Sun, 7 Aug 2011 10:33:59 -0300
On Sun, 07 Aug 2011, Kurt Roeckx wrote:
> The new .debs are done first, so that if you get a Packages or
> Sources file, you can actually download the files mentioned in
> those files.  They are directly copied to the correct place since
> they are new files and not updated files.
> 
> Then the Packages, Sources, Release and other files are first
> all transferred, then moved to the correct place.

Well, that has two problems we have observed in practice:

1. Not all mirrors have up-to-date mirror scripts, and that
   _does_ include mirrors selected for the multi-mirror aliases;

2. Mirrors in the same multi-mirror alias are not updated at the
   same time, and it is very possible (especially in http
   scenarios) to get metadata skew problems across mirrors even
   when they are perfectly fine and internally consistent.

That doesn't even need a third issue (multiple DNS queries) to cause
problems: way too many users are behind http proxies and caches that
break things regardless.

Maybe we should start designing sequence tagging/generation tagging for
the metadata?  If nobody has time to implement it right now, it would be
a damn fine GSOC project for 2013...

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Mon, 08 Aug 2011 05:09:05 GMT) Full text and rfc822 format available.

Acknowledgement sent to jidanni@jidanni.org:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Mon, 08 Aug 2011 05:09:06 GMT) Full text and rfc822 format available.

Message #92 received at 636292@bugs.debian.org (full text, mbox):

From: jidanni@jidanni.org
To: hmh@debian.org
Cc: kurt@roeckx.be, 636292@bugs.debian.org, debian-mirrors@lists.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Mon, 08 Aug 2011 13:05:13 +0800
>>>>> "H" == Henrique de Moraes Holschuh <hmh@debian.org> writes:
H> That doesn't even need a third issue (multiple DNS queries)
OK. But at least that part could be fixed now.
No denying it is happening, as I showed with tcpflow!




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Thu, 11 Aug 2011 21:57:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to jidanni@jidanni.org:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Thu, 11 Aug 2011 21:57:03 GMT) Full text and rfc822 format available.

Message #97 received at 636292@bugs.debian.org (full text, mbox):

From: jidanni@jidanni.org
To: spaillard@debian.org, 636292@bugs.debian.org
Cc: carlos@fisica.ufpr.br, debian-mirrors@lists.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Fri, 12 Aug 2011 05:52:36 +0800
>>>>> "SP" == Simon Paillard <spaillard@debian.org> writes:
SP> Could you please try to use ftp.us.d.o and confirm up to date ftpsync on all
SP> backends solved your problem ?
I would be extremely ecstatically happy to.
However,
as I _proved_ in 636292 using tcpflow(1),
a simple "apt-get update",
will make TWO calls to the DNS.
The checksum will come from a _different_ round robin machine, four out
of five times. It's Russian Roulette. I can't bear to pull the trigger.
A user would have to be crazy to use a round robin mirror until the apt
team finally gets around to fixing this probably one line bug.




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Thu, 11 Aug 2011 23:12:42 GMT) Full text and rfc822 format available.

Acknowledgement sent to Kurt Roeckx <kurt@roeckx.be>:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Thu, 11 Aug 2011 23:12:42 GMT) Full text and rfc822 format available.

Message #102 received at 636292@bugs.debian.org (full text, mbox):

From: Kurt Roeckx <kurt@roeckx.be>
To: jidanni@jidanni.org
Cc: hmh@debian.org, 636292@bugs.debian.org, debian-mirrors@lists.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Fri, 12 Aug 2011 01:03:20 +0200
On Sat, Aug 06, 2011 at 06:07:19AM +0800, jidanni@jidanni.org wrote:
> Ha ha ha, it really does split a single apt-get update into two
> different places completely across the Internet.
> 
> And maybe even for singular servers like rocky-mountain... maybe
> upstream from it is the same splitting problem somewhere.
> 
> Anyway here we go:
> # cat /etc/apt/sources.list.d/*
> deb http://ftp.us.debian.org/debian unstable contrib
> # tcpflow -i ppp0 &
> # apt-get update
> # ls -og /tmp/m
> -rw-r--r-- 1 146150 Aug  6 05:54 064.050.233.100.00080-218.163.001.135.45826
> -rw-r--r-- 1  68985 Aug  6 05:54 199.006.012.070.00080-218.163.001.135.56243
> -rw-r--r-- 1    185 Aug  6 05:54 218.163.001.135.45826-064.050.233.100.00080
> -rw-r--r-- 1   1432 Aug  6 05:54 218.163.001.135.56243-199.006.012.070.00080
> $ host ftp.us.debian.org
> ftp.us.debian.org has address 128.30.2.36
> ftp.us.debian.org has address 199.6.12.70
> ftp.us.debian.org has address 35.9.37.225
> ftp.us.debian.org has address 64.50.233.100
> ftp.us.debian.org has address 64.50.236.52
> ftp.us.debian.org has IPv6 address 2001:500:61:28::70
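
The tcpflow file names above encode source and destination as
zero-padded address and port pairs, so the servers involved can be
listed mechanically. A small Python sketch; the function names are made
up for illustration:

```python
def parse_flow_name(name):
    """Split a tcpflow file name into ((src_ip, src_port), (dst_ip, dst_port))."""
    def endpoint(text):
        # Four zero-padded octets followed by a zero-padded port.
        parts = text.split(".")
        ip = ".".join(str(int(octet)) for octet in parts[:4])
        return ip, int(parts[4])

    src, dst = name.split("-")
    return endpoint(src), endpoint(dst)


def http_servers(names):
    """Return the distinct IPv4 addresses that sent data from port 80."""
    servers = set()
    for name in names:
        (src_ip, src_port), _ = parse_flow_name(name)
        if src_port == 80:
            servers.add(src_ip)
    return servers
```

Applied to the four file names in the listing, this yields
64.50.233.100 and 199.6.12.70, both of which appear in the host output.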

So let's take a real look at what it does:
First it does some UDP thing to all IP addresses:
[pid 25433] socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
[pid 25433] connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("128.30.2.36")}, 16) = 0
[pid 25433] getsockname(3, {sa_family=AF_INET, sin_port=htons(49660), sin_addr=inet_addr("10.0.200.1")}, [16]) = 0
[pid 25433] connect(3, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
[pid 25433] connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("199.6.12.70")}, 16) = 0
[pid 25433] getsockname(3, {sa_family=AF_INET, sin_port=htons(35821), sin_addr=inet_addr("10.0.200.1")}, [16]) = 0
[pid 25433] connect(3, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
[pid 25433] connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("35.9.37.225")}, 16) = 0
[pid 25433] getsockname(3, {sa_family=AF_INET, sin_port=htons(52379), sin_addr=inet_addr("10.0.200.1")}, [16]) = 0
[pid 25433] connect(3, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
[pid 25433] connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("64.50.233.100")}, 16) = 0
[pid 25433] getsockname(3, {sa_family=AF_INET, sin_port=htons(39421), sin_addr=inet_addr("10.0.200.1")}, [16]) = 0
[pid 25433] connect(3, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
[pid 25433] connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("64.50.236.52")}, 16) = 0
[pid 25433] getsockname(3, {sa_family=AF_INET, sin_port=htons(37020), sin_addr=inet_addr("10.0.200.1")}, [16]) = 0
[pid 25433] close(3)                    = 0
[pid 25433] socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP) = 3
[pid 25433] connect(3, {sa_family=AF_INET6, sin6_port=htons(80), inet_pton(AF_INET6, "2001:500:61:28::70", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
[pid 25433] getsockname(3, {sa_family=AF_INET6, sin6_port=htons(50188), inet_pton(AF_INET6, "2001:0:53aa:64c:2ca7:460f:aeac:9430", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
[pid 25433] close(3)                    = 0

No idea what it's really trying to do, but I guess it's trying to see which of them are routable.
The AF_UNSPEC part probably doesn't make much sense.
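
For what it's worth, this sequence looks like the source-address probing
that glibc's getaddrinfo() performs when it sorts results (RFC 3484
destination address selection): connect() on a datagram socket sends
nothing, it only makes the kernel choose a route and a source address,
which getsockname() then reports. The AF_UNSPEC connect dissolves that
association so the same socket can be reused for the next address. A
minimal Python sketch of the same probe, where source_address_for is an
illustrative name and not anything from apt:

```python
import socket


def source_address_for(dest_ip, port=80):
    """Return the local address the kernel would use to reach dest_ip.

    connect() on a datagram socket sends no packets; it only records the
    peer and makes the kernel pick a route and a source address.
    """
    family = socket.AF_INET6 if ":" in dest_ip else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.connect((dest_ip, port))
        return sock.getsockname()[0]
```

If there is no route to the destination, connect() fails with an
OSError, which is how unreachable addresses can be ranked last.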

Then it goes on with:
[pid 25433] socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
[pid 25433] fcntl(3, F_GETFL)           = 0x2 (flags O_RDWR)
[pid 25433] fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid 25433] connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("128.30.2.36")}, 16) = -1 EINPROGRESS (Operation now in progress)
[...]
[pid 25433] write(3, "GET /debian/dists/sid/InRelease HTTP/1.1\r\nHost: ftp.us.debian.org\r\nConnection: keep-alive\r\nCache-Control: max-age=0\r\nIf-Modified-Since: Thu, 11 Aug 2011 20:22:47 GMT\r\nUser-Agent: Debian APT-HTTP/1.3 (0.8.15.5)\r\n\r\n", 213) = 213
[...]
[pid 25433] read(3, "HTTP/1.1 304 Not Modified\r\nDate: Thu, 11 Aug 2011 22:32:27 GMT\r\nServer: Apache/2.2.9 (Debian)\r\nConnection: Keep-Alive\r\nKeep-Alive: timeout=15, max=100\r\nETag: \"1d0a203-239d0-4aa408f6173c0\"\r\n\r\n", 65536) = 191
[...]
[pid 25433] write(3, "GET /debian/dists/sid/main/binary-amd64/Packages.diff/Index HTTP/1.1\r\nHost: ftp.us.debian.org\r\nConnection: keep-alive\r\nCache-Control: max-age=0\r\nIf-Modified-Since: Thu, 11 Aug 2011 20:16:48 GMT\r\nUser-Agent: Debian APT-HTTP/1.3 (0.8.15.5)\r\n\r\n", 241) = 241
[...]
[pid 25433] read(3, "HTTP/1.1 304 Not Modified\r\nDate: Thu, 11 Aug 2011 22:32:28 GMT\r\nServer: Apache/2.2.9 (Debian)\r\nConnection: Keep-Alive\r\nKeep-Alive: timeout=15, max=99\r\nETag: \"1d0a308-7f6-4aa4079fb8c00\"\r\n\r\n", 65345) = 188

So it requested the InRelease file and the Packages.diff/Index file over the same connection.

And then, for some reason that is unclear to me, it closes the connection and opens a new one, this time to 199.6.12.70, to get the i18n files:

[pid 25433] close(3)                    = 0
[pid 25433] read(0, 0x7fff66c68790, 64000) = -1 EAGAIN (Resource temporarily unavailable)
[pid 25433] close(4294967295)           = -1 EBADF (Bad file descriptor)
[pid 25433] write(1, "102 Status\nURI: http://ftp.us.debian.org/debian/dists/sid/main/i18n/Index\nMessage: Connecting to ftp.us.debian.org (199.6.12.70)\n\n", 130) = 130
[pid 25433] socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
[pid 25433] fcntl(3, F_GETFL)           = 0x2 (flags O_RDWR)
[pid 25433] fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid 25433] connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("199.6.12.70")}, 16) = -1 EINPROGRESS (Operation now in progress)
[...]
[pid 25433] write(3, "GET /debian/dists/sid/main/i18n/Index HTTP/1.1\r\nHost: ftp.us.debian.org\r\nConnection: keep-alive\r\nCache-Control: max-age=0\r\nIf-Modified-Since: Thu, 11 Aug 2011 19:55:34 GMT\r\nUser-Agent: Debian APT-HTTP/1.3 (0.8.15.5)\r\n\r\n", 219 <unfinished ...>
[...]
[pid 25433] read(3, "HTTP/1.1 304 Not Modified\r\nServer: nginx/0.8.54\r\nDate: Thu, 11 Aug 2011 22:32:46 GMT\r\nLast-Modified: Thu, 11 Aug 2011 19:55:34 GMT\r\nConnection: keep-alive\r\n\r\n", 65536) = 158
[...]
[pid 25433] exit_group(100)             = ?

(It stops the program without closing the socket.)

This i18n/Index file is also covered by the InRelease, so this clearly is a problem.
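
One way to avoid spreading a single update over several mirrors would be
to resolve the host name once and reuse that address for every request
in the run. A minimal sketch in Python, under the assumption that
pinning the first returned address is acceptable; PinnedResolver and its
lookup parameter are hypothetical names, and this is not how apt's http
method is structured:

```python
import socket


class PinnedResolver:
    """Resolve each host name once and reuse the first address afterwards."""

    def __init__(self, lookup=None):
        # `lookup` is injectable so the sketch can be tried without a network.
        self._lookup = lookup or self._system_lookup
        self._pinned = {}

    @staticmethod
    def _system_lookup(host):
        infos = socket.getaddrinfo(host, 80, type=socket.SOCK_STREAM)
        return [info[4][0] for info in infos]

    def address_for(self, host):
        # Only the first call per host triggers a lookup.
        if host not in self._pinned:
            self._pinned[host] = self._lookup(host)[0]
        return self._pinned[host]
```

Pinning would only help with the multi-server case; a proxy, or a mirror
that is itself in the middle of a sync, could still serve files that do
not match each other.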


Kurt





Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Fri, 12 Aug 2011 01:45:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Henrique de Moraes Holschuh <hmh@debian.org>:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Fri, 12 Aug 2011 01:45:03 GMT) Full text and rfc822 format available.

Message #107 received at 636292@bugs.debian.org (full text, mbox):

From: Henrique de Moraes Holschuh <hmh@debian.org>
To: jidanni@jidanni.org
Cc: spaillard@debian.org, 636292@bugs.debian.org, carlos@fisica.ufpr.br, debian-mirrors@lists.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Thu, 11 Aug 2011 22:39:54 -0300
On Fri, 12 Aug 2011, jidanni@jidanni.org wrote:
> The checksum will come from a _different_ round robin machine, four out
> of five times. It's Russian Roulette. I can't bear to pull the trigger.
> A user would have to be crazy to use a round robin mirror until the apt
> team finally gets around to fixing this probably one line bug.

Now, don't be absurd.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Fri, 12 Aug 2011 08:48:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Dominik Bay <eimann@etherkiller.de>:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Fri, 12 Aug 2011 08:48:03 GMT) Full text and rfc822 format available.

Message #112 received at 636292@bugs.debian.org (full text, mbox):

From: Dominik Bay <eimann@etherkiller.de>
To: Henrique de Moraes Holschuh <hmh@debian.org>
Cc: jidanni@jidanni.org, 636292@bugs.debian.org, debian-mirrors@lists.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Fri, 12 Aug 2011 10:45:28 +0200
On Fri, Aug 12, 2011 at 03:39, Henrique de Moraes Holschuh
<hmh@debian.org> wrote:
> On Fri, 12 Aug 2011, jidanni@jidanni.org wrote:
>> The checksum will come from a _different_ round robin machine, four out
>> of five times. It's Russian Roulette. I can't bear to pull the trigger.
>> A user would have to be crazy to use a round robin mirror until the apt
>> team finally gets around to fixing this probably one line bug.
>
> Now, don't be absurd.

Yeah, this has been getting hilarious for a while now...
Now that ftpsync is fixed on the US mirrors, all checksum problems should
be solved.




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Fri, 12 Aug 2011 15:33:09 GMT) Full text and rfc822 format available.

Acknowledgement sent to Henrique de Moraes Holschuh <hmh@debian.org>:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Fri, 12 Aug 2011 15:33:10 GMT) Full text and rfc822 format available.

Message #117 received at 636292@bugs.debian.org (full text, mbox):

From: Henrique de Moraes Holschuh <hmh@debian.org>
To: Dominik Bay <eimann@etherkiller.de>
Cc: jidanni@jidanni.org, 636292@bugs.debian.org, debian-mirrors@lists.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Fri, 12 Aug 2011 12:31:15 -0300
On Fri, 12 Aug 2011, Dominik Bay wrote:
> On Fri, Aug 12, 2011 at 03:39, Henrique de Moraes Holschuh
> <hmh@debian.org> wrote:
> > On Fri, 12 Aug 2011, jidanni@jidanni.org wrote:
> >> The checksum will come from a _different_ round robin machine, four out
> >> of five times. It's Russian Roulette. I can't bear to pull the trigger.
> >> A user would have to be crazy to use a round robin mirror until the apt
> >> team finally gets around to fixing this probably one line bug.
> >
> > Now, don't be absurd.
> 
> Yeah, this has been getting hilarious for a while now...
> Now that ftpsync is fixed on the US mirrors, all checksum problems should
> be solved.

Hmm, no.  There is a real design bug in play: we cannot trust metadata
to be in sync *across* mirrors, and we cannot trust the network backends
to always connect to the same mirror.

The multiple DNS lookups bug just breaks a workaround for that design
bug that works well in a particular case (fortunately, a common one):
persistent connections.

What I consider absurd is jidanni's "probably one line bug" comment.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Sat, 13 Aug 2011 11:09:44 GMT) Full text and rfc822 format available.

Acknowledgement sent to jidanni@jidanni.org:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Sat, 13 Aug 2011 11:09:53 GMT) Full text and rfc822 format available.

Message #122 received at 636292@bugs.debian.org (full text, mbox):

From: jidanni@jidanni.org
To: hmh@debian.org
Cc: eimann@etherkiller.de, 636292@bugs.debian.org, debian-mirrors@lists.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Sat, 13 Aug 2011 19:04:22 +0800
>>>>> "H" == Henrique de Moraes Holschuh <hmh@debian.org> writes:
H> What I consider absurd is jidanni's "probably one line bug" comment.
Naw... it's probably just a case of
for(thing,checksum_of_thing){
	do_dns_query(); #move this line before the loop
	get_it();
}




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Sun, 21 Aug 2011 22:15:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to jidanni@jidanni.org:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Sun, 21 Aug 2011 22:15:03 GMT) Full text and rfc822 format available.

Message #127 received at 636292@bugs.debian.org (full text, mbox):

From: jidanni@jidanni.org
To: spaillard@debian.org
Cc: carlos@fisica.ufpr.br, debian-mirrors@lists.debian.org, 636292@bugs.debian.org, hmh@debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Mon, 22 Aug 2011 06:12:43 +0800
>>>>> "SP" == Simon Paillard <spaillard@debian.org> writes:
SP> It's no longer the case, all ftp.us have no 80387.

SP> jidanni, do you still observe issues ?

Yes, as a matter of fact I do.
I even recorded the exact time window for you. In UTC as a special bonus.
starting Sun Aug 21 21:30:51 UTC 2011
W: Failed to fetch http://ftp.us.debian.org/debian/dists/experimental/main/binary-i386/PackagesIndex  MD5Sum mismatch
W: Failed to fetch http://ftp.us.debian.org/debian/dists/unstable/main/binary-i386/PackagesIndex  MD5Sum mismatch
E: Some index files failed to download. They have been ignored, or old ones used instead.
ending Sun Aug 21 21:36:55 UTC 2011

I have a recommendation:
that you fellows fix this bug.
As I have noted, it is certainly a one-liner.
I mean aren't we running out of other things to blame for the problem? Thanks.




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Sun, 21 Aug 2011 22:36:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Kurt Roeckx <kurt@roeckx.be>:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Sun, 21 Aug 2011 22:36:03 GMT) Full text and rfc822 format available.

Message #132 received at 636292@bugs.debian.org (full text, mbox):

From: Kurt Roeckx <kurt@roeckx.be>
To: jidanni@jidanni.org
Cc: spaillard@debian.org, carlos@fisica.ufpr.br, debian-mirrors@lists.debian.org, 636292@bugs.debian.org, hmh@debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Mon, 22 Aug 2011 00:33:04 +0200
On Mon, Aug 22, 2011 at 06:12:43AM +0800, jidanni@jidanni.org wrote:
> starting Sun Aug 21 21:30:51 UTC 2011
[...]
> ending Sun Aug 21 21:36:55 UTC 2011
> 
> I have a recommendation:
> that you fellows fix this bug.
> As I have noted, it is certainly a one-liner.

As we already pointed out, it is not a one-liner.  If you're so
sure it's a one-liner, I suggest you submit a patch.

> I mean aren't we running out of other things to blame for the problem? Thanks.

Even if we fix the problem with connecting to multiple servers, there
are various other reasons why it can fail, and they have all been
explained already.

I'm not even sure that if you fix the multiple server connections
that would get better or worse results.  But I would still suggest
that we do try and connect to only 1 server.


Kurt





Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Mon, 22 Aug 2011 01:12:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to jidanni@jidanni.org:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Mon, 22 Aug 2011 01:12:03 GMT) Full text and rfc822 format available.

Message #137 received at 636292@bugs.debian.org (full text, mbox):

From: jidanni@jidanni.org
To: kurt@roeckx.be
Cc: spaillard@debian.org, carlos@fisica.ufpr.br, debian-mirrors@lists.debian.org, 636292@bugs.debian.org, hmh@debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Mon, 22 Aug 2011 09:08:52 +0800
>> starting Sun Aug 21 21:30:51 UTC 2011
Actually the first five minutes were spent in my 'sleep 5m' so it really is
<< starting Sun Aug 21 21:35:51 UTC 2011
>> ending   Sun Aug 21 21:36:55 UTC 2011
KR> I'm not even sure that if you fix the multiple server connections
KR> that would get better or worse results.  But I would still suggest
KR> that we do try and connect to only 1 server.
I've now also added a tcpflow(1) wrapper enabling me to send you all
byte-by-byte evidence the next time it happens... but why allow me that
wicked pleasure?




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Mon, 22 Aug 2011 06:51:05 GMT) Full text and rfc822 format available.

Acknowledgement sent to Kurt Roeckx <kurt@roeckx.be>:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Mon, 22 Aug 2011 06:51:05 GMT) Full text and rfc822 format available.

Message #142 received at 636292@bugs.debian.org (full text, mbox):

From: Kurt Roeckx <kurt@roeckx.be>
To: jidanni@jidanni.org
Cc: spaillard@debian.org, carlos@fisica.ufpr.br, debian-mirrors@lists.debian.org, 636292@bugs.debian.org, hmh@debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Mon, 22 Aug 2011 08:49:27 +0200
On Mon, Aug 22, 2011 at 09:08:52AM +0800, jidanni@jidanni.org wrote:
> >> starting Sun Aug 21 21:30:51 UTC 2011
> Actually the first five minutes were spent in my 'sleep 5m' so it really is
> << starting Sun Aug 21 21:35:51 UTC 2011
> >> ending   Sun Aug 21 21:36:55 UTC 2011
> KR> I'm not even sure that if you fix the multiple server connections
> KR> that would get better or worse results.  But I would still suggest
> KR> that we do try and connect to only 1 server.
> I've now also added a tcpflow(1) wrapper enabling me to send you all
> byte-by-byte evidence the next time it happens... but why allow me that
> wicked pleasure?

We know what the problem is, that's not needed.


Kurt





Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Wed, 14 Sep 2011 01:33:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to jidanni@jidanni.org:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Wed, 14 Sep 2011 01:33:03 GMT) Full text and rfc822 format available.

Message #147 received at 636292@bugs.debian.org (full text, mbox):

From: jidanni@jidanni.org
To: 636292@bugs.debian.org
Cc: debian-mirrors@lists.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Wed, 14 Sep 2011 09:28:24 +0800
[Message part 1 (text/plain, inline)]
K> We know what the problem is, that's not needed.
Are you sure?
[tcpflow.bz2 (application/octet-stream, attachment)]
[ppp-apt-get.log.2383 (text/plain, attachment)]

Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Wed, 14 Sep 2011 01:39:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to jidanni@jidanni.org:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Wed, 14 Sep 2011 01:39:03 GMT) Full text and rfc822 format available.

Message #152 received at 636292@bugs.debian.org (full text, mbox):

From: jidanni@jidanni.org
To: 636292@bugs.debian.org
Cc: debian-mirrors@lists.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Wed, 14 Sep 2011 09:35:05 +0800
Actually all that is going to happen is one day I will accidentally send
the tcpflow logs containing unrelated personal traffic too as the
filtering is too complex, so I would appreciate it if someone looked
into this bug.




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Wed, 14 Sep 2011 14:12:16 GMT) Full text and rfc822 format available.

Acknowledgement sent to jidanni@jidanni.org:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Wed, 14 Sep 2011 14:12:17 GMT) Full text and rfc822 format available.

Message #157 received at 636292@bugs.debian.org (full text, mbox):

From: jidanni@jidanni.org
To: spaillard@debian.org
Cc: carlos@fisica.ufpr.br, debian-mirrors@lists.debian.org, 636292@bugs.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Wed, 14 Sep 2011 21:58:38 +0800
[Message part 1 (text/plain, inline)]
Three different mirrors in a single _botched_ apt-get update.
[ppp-apt-get.log.1965 (text/plain, attachment)]
[flow.bz2 (application/octet-stream, attachment)]

Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Fri, 16 Sep 2011 00:36:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to jidanni@jidanni.org:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Fri, 16 Sep 2011 00:36:03 GMT) Full text and rfc822 format available.

Message #162 received at 636292@bugs.debian.org (full text, mbox):

From: jidanni@jidanni.org
To: ricardo.yanez@calel.org, 636292@bugs.debian.org
Cc: carlos@fisica.ufpr.br, debian-mirrors@lists.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Fri, 16 Sep 2011 08:33:15 +0800
> Given the difficulties in getting mirrors to use correct scripts
I recall that was taken care of.
> and the apt not-safe-enough behavior of using different hosts
Why doesn't someone take care of that.




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Fri, 16 Sep 2011 08:28:17 GMT) Full text and rfc822 format available.

Acknowledgement sent to David Kalnischkies <kalnischkies+debian@gmail.com>:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Fri, 16 Sep 2011 08:28:17 GMT) Full text and rfc822 format available.

Message #167 received at 636292@bugs.debian.org (full text, mbox):

From: David Kalnischkies <kalnischkies+debian@gmail.com>
To: jidanni@jidanni.org, 636292@bugs.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Fri, 16 Sep 2011 10:26:02 +0200
tags 636292 will-get-fixed-by-donkult-then-hell-freezes-over
kthxbye

On Fri, Sep 16, 2011 at 02:33,  <jidanni@jidanni.org> wrote:
>> Given the difficulties in getting mirrors to use correct scripts
> I recall that was taken care of.
>> and the apt not-safe-enough behavior of using different hosts
> Why doesn't someone take care of that.

I have good news for you:
This is open source software: YOU are part of the awesome team!
So feel free to blame yourself that you haven't taken care of it.
In fact, as we are all volunteers you can only blame yourself…


> A user would have to be crazy to use a round robin mirror until the apt
> team finally gets around to fixing this probably one line bug.

Before you get even crazier, feel free to post
your one-line patch to this bug report.
We need NOTHING else from you. I repeat: NOTHING ELSE!
No goddamn tcpflow logs nor any other data, just provide
your simple patch and everybody will be happy.
Thanks.


I can only speak for myself, but I haven't even tried to look at
this issue because of this sentence (and all the howling before
and after that). And I am pretty sure it will need a loooooooong time
until I feel motivated to do so thanks to your behavior in the buglog,
so if I were you I would submit a patch or keep silent until I can
provide something useful to fix the bug I respond to.
Bonus points if you can do both.


If you want to blame anyone in the meantime, blame yourself for considerably
lowering the chances of getting this or any related bug fixed by working hard on
demotivating at least one of the few people who regularly contribute to APT…

That's a great achievement, given that even the worst kids in my youth groups
can't make that happen, so:
Congratulations!

David Kalnischkies

P.S.: Don't bother to answer, the buglog includes enough messages already and
I will not read it anyway. Everything we need is your patch now, so hurry up.




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Fri, 16 Sep 2011 09:33:54 GMT) Full text and rfc822 format available.

Acknowledgement sent to Martin Bagge / brother <brother@bsnet.se>:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Fri, 16 Sep 2011 09:34:02 GMT) Full text and rfc822 format available.

Message #172 received at 636292@bugs.debian.org (full text, mbox):

From: Martin Bagge / brother <brother@bsnet.se>
To: jidanni@jidanni.org
Cc: ricardo.yanez@calel.org, 636292@bugs.debian.org, carlos@fisica.ufpr.br, debian-mirrors@lists.debian.org
Subject: Re: Bug#636292: MD5Sum mismatch is due to multiple DNS queries!
Date: Fri, 16 Sep 2011 11:23:31 +0200
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On 2011-09-16 02:33, jidanni@jidanni.org wrote:
>> Given the difficulties in getting mirrors to use correct scripts
> I recall that was taken care of.
>> and the apt not-safe-enough behavior of using different hosts
> Why doesn't someone take care of that.

Might be that the trade off between "time to spend" and "what I like to
do first" might not suite your wishes.

As always "Show the code" still applies.

- -- 
brother
http://sis.bthstudent.se
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBCAAGBQJOcxWTAAoJEJbdSEaj0jV7xeoH/2QUcPE0ZWMFvtUbW4hx1Vln
1YiirakR2ZbLKMUIPw9S/cHgcTWv5pNX13GGjnQIF/nnehh6wVqLXfqZ7X2Kh+s3
nQFy9jJPddLFGpTs0GbZ/XqHgdb5ETfiZkqu29edqIBJhQX2M2MmbZ0UbCLMC4Rg
C1v9HCH0q+UMwm3U9pTGt1nGsvN8jwO1McPgUaWCa9XhqezmdxQl9RQ1FDSmM31i
YHwAJE4hVdsTJ5bbuQKjORU0XQQU1FbFO414HSzdvr/3AavcAZCukyHLL5D2XmO3
aBMyNWzNqQ8BHO00m+Zrm197aAd+PuqRqyXvHen3m+hVHC3iBcLkUnHOJ687Lc4=
=IOih
-----END PGP SIGNATURE-----




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Fri, 16 Sep 2011 20:09:33 GMT) Full text and rfc822 format available.

Acknowledgement sent to Filipus Klutiero <chealer@gmail.com>:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Fri, 16 Sep 2011 20:09:33 GMT) Full text and rfc822 format available.

Message #177 received at 636292@bugs.debian.org (full text, mbox):

From: Filipus Klutiero <chealer@gmail.com>
To: 636292@bugs.debian.org
Subject: Similar bug
Date: Fri, 16 Sep 2011 16:06:29 -0400
A bug involving InRelease files which has similar symptoms was reported 
on http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=641769




Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Sat, 17 Sep 2011 15:37:23 GMT) Full text and rfc822 format available.

Acknowledgement sent to Henrique de Moraes Holschuh <hmh@debian.org>:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Sat, 17 Sep 2011 15:37:23 GMT) Full text and rfc822 format available.

Message #182 received at 636292@bugs.debian.org (full text, mbox):

From: Henrique de Moraes Holschuh <hmh@debian.org>
To: Filipus Klutiero <chealer@gmail.com>
Cc: 636292@bugs.debian.org, 641769@bugs.debian.org, debian-mirrors@lists.debian.org
Subject: Re: Bug#641769: [apt] fetches InRelease file, problematic on several mirrors (aka "Packages Hash Sum mismatch")
Date: Sat, 17 Sep 2011 12:34:02 -0300
retitle 636292 dak/apt: deficiencies at handling out-of-sync metadata
summary 636292 87
thanks

If anyone disagrees with the above triage, please change the summary
and/or title.  Thank you.

On Sat, 17 Sep 2011, Filipus Klutiero wrote:
> #636292 is labelled as being about round-robin mirrors and problems
> APT has with those, even though that is probably not the actual
> cause of the reporter's problem.

You're expected to read the entire thing when referred to a bug report in
a thread you're replying to.  Anyway, triaged.  I didn't want to do it
because it is not my bug, it is not a package I work on, and I have no
idea whether the apt developers agree with my analysis of #636292 or not.
Please feel free to improve the title or choose a new summary.

> But indeed, there would be room for a general bug "unreliable
> package indices fetching protocol" :-S

That misses the point, IMO.  To me, it looks like what's "broken" is
that the repository format _and_ the front-ends have deficiencies at
handling metadata which is unsynchronized either in-mirror or across
mirrors.  And these deficiencies are a lot more important nowadays than
they once were, as we now have many dinstall runs per day, lots of users
tracking testing and unstable, a larger set of metadata files, a larger
and more diverse set of mirrors... i.e., a lot more chances to hit
unsynchronized metadata windows.
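
A toy illustration of such a window, as a Python sketch: the Release
data comes from a mirror that has already synced, while the index comes
from one that has not. The two "mirrors" are plain dictionaries invented
for the example, the function names are hypothetical, and SHA-256 stands
in for whichever digest the Release file lists.

```python
import hashlib


def make_release(files):
    """Build the name -> digest mapping a mirror would publish for its files."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}


def verify_index(release_hashes, name, payload):
    """Check a downloaded index against the digest listed in the Release data."""
    return hashlib.sha256(payload).hexdigest() == release_hashes[name]


# Hypothetical mirror contents before and after a dinstall run.
stale_mirror = {"Packages": b"package list, old run"}
fresh_mirror = {"Packages": b"package list, new run"}
```

Each mirror is internally consistent here: its own index always matches
its own Release data. The mismatch only shows up when the two files are
taken from different mirrors.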

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh




Changed Bug title to 'dak/apt: deficiencies at handling out-of-sync metadata' from 'MD5Sum mismatch is due to multiple DNS queries!' Request was from Henrique de Moraes Holschuh <hmh@debian.org> to control@bugs.debian.org. (Sat, 17 Sep 2011 15:37:26 GMT) Full text and rfc822 format available.

Summary recorded from message bug 636292 message 87 Request was from Henrique de Moraes Holschuh <hmh@debian.org> to control@bugs.debian.org. (Sat, 17 Sep 2011 15:37:28 GMT) Full text and rfc822 format available.

Information forwarded to debian-bugs-dist@lists.debian.org, APT Development Team <deity@lists.debian.org>:
Bug#636292; Package apt. (Tue, 20 Sep 2011 02:57:05 GMT) Full text and rfc822 format available.

Acknowledgement sent to Filipus Klutiero <chealer@gmail.com>:
Extra info received and forwarded to list. Copy sent to APT Development Team <deity@lists.debian.org>. (Tue, 20 Sep 2011 02:57:05 GMT) Full text and rfc822 format available.

Message #191 received at 636292@bugs.debian.org (full text, mbox):

From: Filipus Klutiero <chealer@gmail.com>
To: 641769@bugs.debian.org
Cc: 636292@bugs.debian.org, debian-mirrors@lists.debian.org
Subject: Re: Bug#641769: [apt] fetches InRelease file, problematic on several mirrors (aka "Packages Hash Sum mismatch")
Date: Mon, 19 Sep 2011 22:52:53 -0400

On 2011-09-17 11:34, Henrique de Moraes Holschuh wrote:
> retitle 636292 dak/apt: deficiencies at handling out-of-sync metadata
> summary 636292 87
> thanks
>
> If anyone disagrees with the above triage, please change the summary
> and/or title.  Thank you.
>
> On Sat, 17 Sep 2011, Filipus Klutiero wrote:
>> #636292 is labelled as being about round-robin mirrors and problems
>> APT has with those, even though that is probably not the actual
>> cause of the reporter's problem.
> You're expected to read the entire thing when referred to a bug report in
> a thread you're replying to.
I was unaware of that.

[...]
>> But indeed, there would be room for a general bug "unreliable
>> package indices fetching protocol" :-S
> That misses the point, IMO.  To me, it looks like what's "broken" is
> that the repository format _and_ the front-ends have deficiencies in
> handling metadata which is unsynchronized either in-mirror or across
> mirrors.  And these deficiencies are a lot more important nowadays than
> they once were, as we now have many dinstall runs per day, lots of users
> tracking testing and unstable, a larger set of metadata files, a larger
> and more diverse set of mirrors... i.e., a lot more chances to hit
> unsynchronized metadata windows.

I don't think increasing dinstall frequency worsens these issues
significantly if dinstalls get shorter (unless previous dinstalls ran
during the night). I also think archive size growth should have been
compensated by performance increases. I think the time spent
synchronizing a mirror has probably not increased much. What did change
here (dramatically) is the proportion of that time during which APT
index updates fail. Round-robin mirrors might also have worsened.

Anyway, the repository format is not a problem per se, it's the 
combination of what's on a mirror and how APT fetches it that's a 
problem. If you assume the communication protocol is HTTP-like, then 
indeed there should be mechanisms to cope with race conditions - i.e. 
file versioning and/or having APT retry or report desynchronizations.
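
The "retry or report" idea can be sketched roughly as below. This is only
an illustration under stated assumptions, not how APT actually behaves:
`fetch` is a hypothetical callable that maps a file name to its bytes, and
the index is simplified to one "<sha256> <name>" pair per line, which is
not the real Release file format. The client re-fetches both the index and
the file when they disagree, and reports a desynchronization once the
retry budget is used up.

```python
import hashlib


class Desynchronized(Exception):
    """The index and the file never agreed within the retry budget."""


def fetch_consistent(fetch, index_name, file_name, max_attempts=3):
    """Fetch file_name and verify it against the hash listed in index_name.

    `fetch` is a hypothetical callable (name -> bytes). The index format,
    one "<sha256 hexdigest> <name>" pair per line, is an assumption made
    for this sketch. Returns (data, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        # Re-fetch the index on every attempt: it may be the stale half.
        expected = {}
        for line in fetch(index_name).decode().splitlines():
            digest, name = line.split()
            expected[name] = digest

        data = fetch(file_name)
        if hashlib.sha256(data).hexdigest() == expected.get(file_name):
            return data, attempt

    # Report the desynchronization instead of a bare hash mismatch.
    raise Desynchronized(
        f"{file_name} did not match {index_name} "
        f"after {max_attempts} attempts"
    )
```

Re-fetching both halves on each attempt matters because either one may be
stale during a mirror push; retrying only the file would never recover
from an outdated index.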






Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Mon Apr 21 16:18:13 2014; Machine Name: beach.debian.org

Debian Bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.