Debian Bug report logs -
#595790
The value from gethostid() should be more unique and not change when the host IP changes
Report forwarded
to debian-bugs-dist@lists.debian.org, Michael Stone <mstone@debian.org>:
Bug#595790; Package coreutils.
(Mon, 06 Sep 2010 17:39:08 GMT).
Acknowledgement sent
to martin f krafft <madduck@debian.org>:
New Bug report received and forwarded. Copy sent to Michael Stone <mstone@debian.org>.
(Mon, 06 Sep 2010 17:39:08 GMT).
Message #5 received at submit@bugs.debian.org:
Package: coreutils
Version: 8.5-1
Severity: normal
File: /usr/bin/hostid
Tags: upstream
I have never come across a (Debian) system where /usr/bin/hostid
didn't print 007f0101. That is because Debian uses /etc/hosts to map
127.0.1.1 to the hostname(s).
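For illustration, the arithmetic behind that value can be sketched in
shell. This assumes glibc's fallback when no /etc/hostid exists (resolve
the hostname to an IPv4 address, then swap the two 16-bit halves of the
address as stored on a little-endian machine). The helper name
hostid_from_ipv4 is made up for this sketch.

```sh
# hostid_from_ipv4 is a hypothetical helper, not part of coreutils or glibc.
# Assumption: a little-endian machine and glibc's fallback of swapping the
# two 16-bit halves of the resolved IPv4 address.
# The body runs in a subshell so the IFS and set -f changes do not leak.
hostid_from_ipv4() (
    set -f               # no pathname expansion while splitting
    IFS=.
    set -- $1            # split the dotted quad a.b.c.d into $1..$4
    # After the swap, the value prints as octets b, a, d, c.
    printf '%02x%02x%02x%02x\n' "$2" "$1" "$4" "$3"
)

hostid_from_ipv4 127.0.1.1   # prints 007f0101
hostid_from_ipv4 127.0.0.1   # prints 007f0100
```

Any host whose name resolves to 127.0.1.1 therefore ends up with the
same value.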
Arguably, having a host UUID would be quite nice, but as there is no
"one" IPv4 of a host, it's kinda useless to try to go that road.
Unless hostid [well, gethostid()] can be replaced with something
sensible, I suggest that it be removed from coreutils, or disabled,
vandalised, or otherwise physically harmed.
Btw, the info page says:
the 32-bit quantity happens to be closely related to the system's
Internet address, but that isn't always the case.
and that's clearly wrong. Again, the days when a system had "an
Internet address" are long gone, and apparently, it isn't even
always the case.
Feel free to reassign to glibc, where gethostid() comes from.
-- System Information:
Debian Release: squeeze/sid
APT prefers unstable
APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.35-trunk-amd64 (SMP w/1 CPU core)
Locale: LANG=en_GB, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Versions of packages coreutils depends on:
ii libacl1 2.2.49-3 Access control list shared library
ii libattr1 1:2.4.44-2 Extended attribute shared library
ii libc6 2.11.2-5 Embedded GNU C Library: Shared lib
ii libselinux1 2.0.96-1 SELinux runtime shared libraries
coreutils recommends no packages.
coreutils suggests no packages.
-- no debconf information
--
.''`. martin f. krafft <madduck@d.o> Related projects:
: :' : proud Debian developer http://debiansystem.info
`. `'` http://people.debian.org/~madduck http://vcs-pkg.org
`- Debian - when you have better things to do than fixing systems
Information forwarded
to debian-bugs-dist@lists.debian.org, Michael Stone <mstone@debian.org>:
Bug#595790; Package coreutils.
(Wed, 20 Feb 2013 02:03:03 GMT).
Acknowledgement sent
to Carlos Alberto Lopez Perez <clopez@igalia.com>:
Extra info received and forwarded to list. Copy sent to Michael Stone <mstone@debian.org>.
(Wed, 20 Feb 2013 02:03:04 GMT).
Message #10 received at 595790@bugs.debian.org:
I have been digging into how hostid works on Linux versus other UNIXes. Check:
http://lists.alioth.debian.org/pipermail/pkg-zfsonlinux-devel/2013-February/000005.html
Perhaps a quick and easy solution for this issue will be to check if
/etc/hostid is already configured on the system, and if not, just set it
to a random value on the postinst of coreutils. Something like:
    if [ ! -f /etc/hostid ]; then
        dd if=/dev/urandom bs=1 count=4 of=/etc/hostid 2>/dev/null
    fi
What do you think?
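A slightly more defensive variant of that idea, as a sketch only: the
function names and the explicit path argument are assumptions made for
illustration (so the code can be tried against a scratch file instead of
/etc/hostid). od is used to show the stored bytes as a 32-bit word in the
machine's own byte order, which is how glibc is understood to read the
file.

```sh
# ensure_hostid_file and show_hostid_file are hypothetical helper names.
# $1 is the path to create; pass /etc/hostid only in a real postinst.
ensure_hostid_file() {
    if [ ! -f "$1" ]; then
        # 4 random bytes, the size of the 32-bit host ID
        dd if=/dev/urandom bs=1 count=4 of="$1" 2>/dev/null
    fi
}

# Print the first 4 bytes of the file as one 32-bit hex word in the
# machine's own byte order.
show_hostid_file() {
    od -An -tx4 -N4 "$1" | tr -d ' \n'
    echo
}
```

Calling ensure_hostid_file twice on the same path leaves the first value
in place, so an existing, locally managed host ID is never overwritten.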
Information forwarded
to debian-bugs-dist@lists.debian.org, Michael Stone <mstone@debian.org>:
Bug#595790; Package coreutils.
(Wed, 20 Feb 2013 20:06:03 GMT).
Acknowledgement sent
to Carlos Alberto Lopez Perez <clopez@igalia.com>:
Extra info received and forwarded to list. Copy sent to Michael Stone <mstone@debian.org>.
(Wed, 20 Feb 2013 20:06:03 GMT).
Message #15 received at 595790@bugs.debian.org:
Here is a great summary from Lennart Poettering about the sources of
unique IDs on Linux systems:
http://0pointer.de/blog/projects/ids.html
He agrees that the current state of hostid is useless on most distros,
and he suggests symlinking /etc/hostid to /var/lib/dbus/machine-id (!!).
IMHO a more reasonable approach, one that doesn't force a dependency on
dbus, is just to randomize /etc/hostid in the postinst of coreutils, as I
suggested previously.
Regards!
Information forwarded
to debian-bugs-dist@lists.debian.org, Michael Stone <mstone@debian.org>:
Bug#595790; Package coreutils.
(Wed, 20 Feb 2013 21:42:03 GMT).
Message #18 received at 595790@bugs.debian.org:
Carlos Alberto Lopez Perez wrote:
> On 20/02/13 03:01, Carlos Alberto Lopez Perez wrote:
> > On 06/09/10 19:35, martin f krafft wrote:
> >> I have never come across a (Debian) system where /usr/bin/hostid
> >> didn't print 007f0101. That is because Debian uses /etc/hosts to map
> >> 127.0.1.1 to the hostname(s).
> >>
> >> Arguably, having a host UUID would be quite nice, but as there is no
> >> "one" IPv4 of a host, it's kinda useless to try to go that road.
Agreed. It is the wrong path.
Hosts have network devices. In the plural. Network devices have IP
addresses. Also plural. Where plural could be any of 0, 1, or many.
There isn't any one true IP address. That hostid tries to use one in
the way that it does is based upon flawed fundamental assumptions.
> >> Unless hostid [well, gethostid()] can be replaced with something
> >> sensible, I suggest that it be removed from coreutils, or disabled,
> >> vandalised, or otherwise physically harmed.
(chuckle) Because otherwise people will try to use it and simply
propagate the problem. But I would hate to have Debian bear the force
of the storm for those odd but misguided applications that try to use it.
I would vote to go down the path of creating a very likely to be
unique identifier for it.
> >> Feel free to reassign to glibc, where gethostid() comes from.
Obviously the coreutils hostid is simply an interface to the
gethostid() call. But would this be more or less likely to be
addressed there?
> Here is a great summary from Lennart Poettering about the sources of
> unique IDs on Linux systems:
>
> http://0pointer.de/blog/projects/ids.html
>
> He agrees that the current status of hostid is useless on most distros,
> and he suggests to symlink /etc/hostid to /var/lib/dbus/machine-id (!!).
This sentence is troubling. "On Linux, it is universally available,
given that almost all non-embedded and even a fair share of the
embedded machines ship D-Bus now." In the same sentence it is
declared universally available and then examples are given where it
is NOT available!
Also, AFAICS gethostid() only reads the first four bytes from the
file. Therefore using a symlink to /var/lib/dbus/machine-id would not
yield the desired result. It would reduce the entropy there down to
the first four characters from it. Mapping from binary data to the
character representation of it would be incorrect.
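To make that concrete, here is a small sketch. It assumes the machine-id
file holds 32 lowercase hexadecimal characters followed by a newline; the
sample ID and the helper name are invented for the example.

```sh
# first_four_bytes is a hypothetical helper: it dumps, in hex, the bytes
# that a 4-byte read from the start of the given file would see.
first_four_bytes() {
    od -An -tx1 -N4 "$1" | tr -d ' \n'
    echo
}

sample=$(mktemp)
printf '%s\n' 'deadbeef00112233445566778899aabb' > "$sample"   # made-up ID
first_four_bytes "$sample"    # prints 64656164, the ASCII codes of "dead"
rm -f "$sample"
```

Each of those four bytes can only be one of the sixteen ASCII codes for a
hex digit, so the resulting 32-bit value carries 16 bits of the ID, not 32.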
> IMHO a more reasonable approach that doesn't force a dependency on dbus,
> is just to randomize /etc/hostid on the postinst of coreutils as I
> suggested previously.
Agreed. Let's not use this as an excuse to increase dependencies upon
dbus. It would be gratuitous.
> > Perhaps a quick and easy solution for this issue will be to check if
> > /etc/hostid is already configured on the system, and if not, just set it
> > to a random value on the postinst of coreutils. Something like:
> >
> > if [ ! -f /etc/hostid ]; then
> > dd if=/dev/urandom bs=1 count=4 of=/etc/hostid 2>/dev/null
> > fi
> >
> > What do you think?
In concept I agree with you that Debian should automatically
create a randomized id value so that gethostid() and hostid then
return something that is not simply the first IP address it associates
with the host. In implementation I worry that /etc is not
the correct place to put this information. It might be read-only, for
example. Of course, at initial system installation time there is no
problem pushing values into /etc, and on new installations that is the
only time it would happen.
Perfect is the enemy of good. Therefore I vote to fix /etc/hostid if
it does not already exist.
Bob
Information forwarded
to debian-bugs-dist@lists.debian.org:
Bug#595790; Package coreutils.
(Wed, 20 Feb 2013 23:18:03 GMT).
Acknowledgement sent
to Michael Stone <mstone@debian.org>:
Extra info received and forwarded to list.
(Wed, 20 Feb 2013 23:18:03 GMT).
Message #23 received at 595790@bugs.debian.org:
Short version:
My inclination is to simply better document that hostid is an interface
without clear semantics which exists for compatibility with legacy
systems and should not be used in new applications.
Longer version:
What is the reason for wanting to use hostid? Historically this sort of
interface was most often used for software licensing and other such
applications which wanted to tie something to a particular piece of
hardware. On the proprietary hardware, the vendors would encode a serial
number which was nicely suited to that purpose. On Linux, we don't have
such a hardware serial because we don't control the hardware, and we
don't have a real strong desire to facilitate software licensing
schemes. (If someone wants to cobble one together, fine, but we're not
going to do the work for it.) Intel tried to implement cpuids for this
purpose, and it flopped; we're unlikely to get further than they did.
Any scheme that relies on a cookie isn't going to provide that "tied to
the hardware" guarantee (a guarantee which is increasingly meaningless
in a VM world, anyway). And, dbus already went and reinvented that
wheel--does it make any sense to try to re-reinvent that wheel? You
still won't be able to rely on hostid having a useful value, because it
will be installation-dependent (unlike dbus, which got to define the
semantics from the ground up). You'll basically have an interface which
on some systems has a great semantic, and on others does not, so the
documentation will have to say exactly what it should say now: "use
something else".
So should we just get rid of it? Doing so would probably break some
ancient stupid script somewhere, to save 40k. And it's within the realm
of possibility that someone, within a particular environment, is
actually managing the hostids to do something useful, so we shouldn't
break that.
Mike Stone
Information forwarded
to debian-bugs-dist@lists.debian.org, Michael Stone <mstone@debian.org>:
Bug#595790; Package coreutils.
(Thu, 21 Feb 2013 00:12:03 GMT).
Acknowledgement sent
to Carlos Alberto Lopez Perez <clopez@igalia.com>:
Extra info received and forwarded to list. Copy sent to Michael Stone <mstone@debian.org>.
(Thu, 21 Feb 2013 00:12:03 GMT).
Message #28 received at 595790@bugs.debian.org:
When you create a ZFS pool the value of the hostid is stored inside it.
When the system is about to import a pool it checks whether the hostid
stored inside the pool matches the current host's hostid. If it doesn't
match, it refuses to import the pool.
In this case you have to manually force an import, which in turn
overwrites the old value for the hostid stored on the pool.
This was meant (I'm guessing) for when you have a big SAN/NAS cluster with
many ZFS pools and many hosts accessing them at the same time. With this
feature each one of the hosts can automatically import its own pools
just by looking at the hostid value, without interfering with the others.
http://distfiles.scode.org/tmp/zfs-handbook/zfs-hostid.html
We are working on packaging ZoL (native ZFS on Linux with DKMS) [1] for
Debian and we are checking how we can solve this issue.
The solution that I have in mind is just to create /etc/hostid in the
postinst of the ZoL package [2]. But it would be much better if this
could be done by coreutils (ideally at install time).
Writing 4 random bytes to /etc/hostid is not perfect, but it is much
better than what we have now.
I was looking more closely at what FreeBSD does [3]:
First they check if the file /etc/hostid exists. If it exists they set
the hostid to that value. They have a sysctl variable in their kernel
for setting the hostid.
If this file is not available then they check if the system has
smbios.system.uuid defined (in Linux this is
/sys/class/dmi/id/product_uuid). If the system has this value then they
just assign the hostid to that value (via sysctl).
If this value is not available, then they just generate a random value
and write it to /etc/hostid so it is preserved across reboots, and
finally they set the value.
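The order of precedence described above can be sketched as a small shell
function. This is only an illustration of the decision logic, not the
FreeBSD script itself: the function name and its two path arguments are
assumptions, and it only reports which source would be used rather than
setting anything.

```sh
# pick_hostid_source is a hypothetical helper.
# $1: host ID file (for example /etc/hostid)
# $2: firmware UUID file (for example /sys/class/dmi/id/product_uuid)
pick_hostid_source() {
    if [ -f "$1" ]; then
        echo file        # an existing host ID file always wins
    elif [ -r "$2" ]; then
        echo uuid        # otherwise fall back to the firmware UUID
    else
        echo random      # last resort: generate and store a random value
    fi
}
```

For example, pick_hostid_source /etc/hostid /sys/class/dmi/id/product_uuid
prints which of the three branches a given machine would take.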
Regards!
--------
[1] http://bugs.debian.org/686447
[2]
http://lists.alioth.debian.org/pipermail/pkg-zfsonlinux-devel/2013-February/000005.html
[3] https://gitorious.org/freebsd/freebsd/blobs/HEAD/etc/rc.d/hostid
Information forwarded
to debian-bugs-dist@lists.debian.org:
Bug#595790; Package coreutils.
(Thu, 21 Feb 2013 00:54:03 GMT).
Acknowledgement sent
to Michael Stone <mstone@debian.org>:
Extra info received and forwarded to list.
(Thu, 21 Feb 2013 00:54:03 GMT).
Message #33 received at 595790@bugs.debian.org:
On Thu, Feb 21, 2013 at 01:08:00AM +0100, Carlos Alberto Lopez Perez wrote:
>When you create a ZFS pool the value of the hostid is stored inside it.
>When the system is going to import a pool it checks if the hostid stored
>inside the pool matches the current host hostid. If it doesn't match it
>refuses to import the pool.
Not surprising: SunOS is one of those legacy systems where old scripts
tend to assume hostid is a system serial number. :)
>We are working on packaging ZoL (ZFS on Linux native with DKMS) [1] for
>Debian and we are checking how we can solve this issue.
I'd suggest an application-specific UUID incorporating the hostname.
Mike Stone
Information forwarded
to debian-bugs-dist@lists.debian.org, Michael Stone <mstone@debian.org>:
Bug#595790; Package coreutils.
(Thu, 21 Feb 2013 02:18:03 GMT).
Acknowledgement sent
to martin f krafft <madduck@debian.org>:
Extra info received and forwarded to list. Copy sent to Michael Stone <mstone@debian.org>.
(Thu, 21 Feb 2013 02:18:03 GMT).
Message #38 received at 595790@bugs.debian.org:
The only use I've ever had for hostid was to splay cronjobs of
a large number of hosts in the same network, to prevent them
hammering the same server all at once, i.e. make them sleep for
(hostid modulo 86400) seconds every day…
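As a sketch of that kind of splay (the function name is made up, and the
host ID is passed in as the hex digits that hostid prints):

```sh
# splay_seconds is a hypothetical helper.
# $1: host ID as hex digits, as printed by hostid
# $2: length of the window in seconds (defaults to one day)
splay_seconds() {
    echo $(( 0x$1 % ${2:-86400} ))
}

splay_seconds 007f0101    # prints 28929
```

In a cron job this might be used as: sleep "$(splay_seconds "$(hostid)")".
Every machine that reports 007f0101 gets the same delay of 28929 seconds,
which is exactly why the value is of little use for this purpose today.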
--
.''`. martin f. krafft <madduck@d.o> Related projects:
: :' : proud Debian developer http://debiansystem.info
`. `'` http://people.debian.org/~madduck http://vcs-pkg.org
`- Debian - when you have better things to do than fixing systems
"in all unimportant matters, style, not sincerity, is the essential.
in all important matters, style, not sincerity, is the essential."
-- oscar wilde
Information forwarded
to debian-bugs-dist@lists.debian.org, Michael Stone <mstone@debian.org>:
Bug#595790; Package coreutils.
(Thu, 21 Feb 2013 02:18:05 GMT).
Acknowledgement sent
to martin f krafft <madduck@debian.org>:
Extra info received and forwarded to list. Copy sent to Michael Stone <mstone@debian.org>.
(Thu, 21 Feb 2013 02:18:05 GMT).
Message #43 received at 595790@bugs.debian.org:
also sprach Bob Proulx <bob@proulx.com> [2013.02.21.1038 +1300]:
> Also, AFAICS gethostid() only reads the first four bytes from the
> file. Therefore using a symlink to /var/lib/dbus/machine-id would not
> yield the desired result. It would reduce the entropy there down to
> the first four characters from it. Mapping from binary data to the
> character representation of it would be incorrect.
I suggest initialising the host ID from the UUID of the root
filesystem…
--
.''`. martin f. krafft <madduck@d.o> Related projects:
: :' : proud Debian developer http://debiansystem.info
`. `'` http://people.debian.org/~madduck http://vcs-pkg.org
`- Debian - when you have better things to do than fixing systems
"if they can get you asking the wrong questions,
they don't have to worry about answers."
-- thomas pynchon
Information forwarded
to debian-bugs-dist@lists.debian.org, Michael Stone <mstone@debian.org>:
Bug#595790; Package coreutils.
(Thu, 21 Feb 2013 02:51:03 GMT).
Acknowledgement sent
to Darik Horn <dajhorn@vanadac.com>:
Extra info received and forwarded to list. Copy sent to Michael Stone <mstone@debian.org>.
(Thu, 21 Feb 2013 02:51:03 GMT).
Message #48 received at 595790@bugs.debian.org:
> I'd suggest an application-specific UUID incorporating the hostname.
I expect that ZoL will remove or disable the hostid check when Illumos
implements the Multi Reader Protection feature, or something like it
for HA configurations. The behavior in question is a nonessential
holdover from Solaris that fits poorly into Linux systems.
When I wrote commit 0d54dcb5 to improve /etc/hostid handling on
Debian, which was quoted earlier, we briefly discussed using the
/var/lib/dbus/machine-id file instead, but decided that it wasn't
worthwhile because it wouldn't solve the underlying problem.
That said, I still don't understand the issue that is motivating this
discussion. Changing the return value of gethostid() after initial system
configuration will break a non-zero number of systems running legacy
software, and the status quo has been okay with existing users since
forever.
--
Darik Horn <dajhorn@vanadac.com>
Information forwarded
to debian-bugs-dist@lists.debian.org, Michael Stone <mstone@debian.org>:
Bug#595790; Package coreutils.
(Wed, 28 Sep 2016 07:36:06 GMT).
Acknowledgement sent
to Petter Reinholdtsen <pere@hungry.com>:
Extra info received and forwarded to list. Copy sent to Michael Stone <mstone@debian.org>.
(Wed, 28 Sep 2016 07:36:06 GMT).
Message #53 received at 595790@bugs.debian.org:
Control: reassign -1 libc6
Control: found -1 2.19-18
Control: The value from gethostid() should be more unique and not change when the host IP changes
Reassigning to glibc as that is the source of gethostid() where the
problem with the missing unique identifier originates. Using the
version number in stable, but the issue has been around since before that.
In my work as a system administrator for tens of thousands of machines, I
have often needed to get some semi-unique identifier out of the
operating system. On all other Unix-like operating systems, hostid and
gethostid() will provide this, but not on Linux. I find this rather sad,
and have had to spend time building our own solution to the problem
because gethostid() is useless on Linux.
Because of this, and to spare future system administrators that
pain, I fully support the request from Martin Krafft to extend Debian so
that gethostid() returns something sensible.
The described approach from FreeBSD, using /etc/hostid,
/sys/class/dmi/id/product_uuid or a random value (in that order), seems
like a sensible one. It might make sense to use other sources too, but
the goal should be to pick a value that will stay the same until the hardware
is replaced or, if no such hardware-dependent value exists, a value that will
stay the same as long as the operating system isn't reinstalled.
To avoid changing the ID on running systems I believe it should only be done
when libc6 is installed for the first time. Those who want to change their
hostid on a running system should be given a simple script to do so, rather
than having it changed automatically. This will fix the issue for future
installations. I am not sure how to sensibly fix it for existing installations
without ending up with a lot of machines with the same hostid, as 007f0100 is
a very common hostid on Linux already, and everyone with a private IP address
like those on 192.168.* will have collisions. But then again, a 32-bit number
can only provide 4,294,967,296 unique IDs, and with the number of Linux
machines in the world there are going to be collisions anyway. We should just
reduce the chance of collisions to a more sensible level.
Something like this should work, I guess:
    if [ ! -f /etc/hostid ]; then
        if [ -e /sys/class/dmi/id/product_uuid ]; then
            # sethostidfromuuid is a placeholder; it does not exist yet
            sethostidfromuuid "$(cat /sys/class/dmi/id/product_uuid)"
        else
            dd if=/dev/urandom bs=1 count=4 of=/etc/hostid 2>/dev/null
        fi
    fi
We need to figure out how to transform the UUID to a 32 bit integer, of course.
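One possible way to do that transformation, offered only as a sketch:
fold the UUID into 32 bits with the CRC that POSIX cksum computes. The
function name uuid_to_hostid is hypothetical, and the choice of CRC is an
assumption rather than anything agreed in this thread; any fixed,
well-mixing hash would serve equally well.

```sh
# uuid_to_hostid is a hypothetical helper.
# $1: a UUID string such as the content of /sys/class/dmi/id/product_uuid
uuid_to_hostid() {
    # Normalise case so the same UUID always maps to the same value.
    uuid=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    # cksum prints "<crc> <byte count>"; keep only the CRC.
    set -- $(printf '%s' "$uuid" | cksum)
    printf '%08x\n' "$1"
}
```

The result is eight hex digits. Storing it in /etc/hostid would still
require converting it to four raw bytes first.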
--
Happy hacking
Petter Reinholdtsen
Bug reassigned from package 'coreutils' to 'libc6'.
Request was from Petter Reinholdtsen <pere@hungry.com>
to 595790-submit@bugs.debian.org.
(Wed, 28 Sep 2016 07:36:06 GMT).
No longer marked as found in versions coreutils/8.5-1.
Request was from Petter Reinholdtsen <pere@hungry.com>
to 595790-submit@bugs.debian.org.
(Wed, 28 Sep 2016 07:36:06 GMT).
Marked as found in versions glibc/2.19-18.
Request was from Petter Reinholdtsen <pere@hungry.com>
to 595790-submit@bugs.debian.org.
(Wed, 28 Sep 2016 07:36:07 GMT).
Changed Bug title to 'The value from gethostid() should be more unique and not change when the host IP changes' from 'hostid: useless unless fixed'.
Request was from Petter Reinholdtsen <pere@hungry.com>
to control@bugs.debian.org.
(Wed, 28 Sep 2016 07:57:02 GMT).
Information forwarded
to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#595790; Package libc6.
(Wed, 28 Sep 2016 10:42:04 GMT).
Acknowledgement sent
to Florian Weimer <fw@deneb.enyo.de>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>.
(Wed, 28 Sep 2016 10:42:04 GMT).
Message #66 received at 595790@bugs.debian.org:
* Petter Reinholdtsen:
> Something like this should work, I guess:
>
> if [ ! -f /etc/hostid ]; then
> if [ -e /sys/class/dmi/id/product_uuid ]; then
> sethostidfromuuid $(cat /sys/class/dmi/id/product_uuid)
> else
> dd if=/dev/urandom bs=1 count=4 of=/etc/hostid 2>/dev/null
> fi
> fi
That's not very different from /etc/machine-id, is it?
> We need to figure out how to transform the UUID to a 32 bit integer,
> of course.
And I think this is the crux of the problem. Whatever we do, with
today's cluster sizes it's just not reliably unique.
You could use /etc/machine-id instead. Some effort goes into that to
make it actually unique.
DMI data seems risky because it depends on firmware, and there are so
many firmware bugs out there. It would also not address the matter of
changing host IDs as the result of host migrations.
Information forwarded
to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#595790; Package libc6.
(Wed, 28 Sep 2016 11:45:02 GMT).
Acknowledgement sent
to Petter Reinholdtsen <pere@hungry.com>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>.
(Wed, 28 Sep 2016 11:45:03 GMT).
Message #71 received at 595790@bugs.debian.org:
[Florian Weimer]
> That's not very different from /etc/machine-id, is it?
Ah, thank you very much for bringing this systemd setting to my
attention. I was not aware of it.
I agree that it seems very similar in purpose and implementation. Will
it be available on non-Linux Debian architectures too?
>> We need to figure out how to transform the UUID to a 32 bit integer,
>> of course.
>
> And I think this is the crux of the problem. Whatever we do, with
> today's cluster sizes it's just not reliably unique.
Well, for the set of machines we have available at work (ca. 3000) it
would be sufficiently unique. For most sites it would make the return
value from gethostid() unique. In most use cases it does not need to be
globally unique. As in the ZFS use case, it just needs to be unique among
the hosts sharing the storage system.
In another use case at work, it should be unique across the entire stock
of Linux machines.
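The two positions can be compared with a rough birthday-problem estimate.
The sketch below assumes every host draws an independent, uniformly random
32-bit ID, which is the best case for a random /etc/hostid; the function
name is made up for the example.

```sh
# collision_probability is a hypothetical helper.
# $1: number of hosts with independent, uniformly random 32-bit IDs
collision_probability() {
    awk -v n="$1" 'BEGIN {
        # birthday approximation: p ~ 1 - exp(-n(n-1) / (2 * 2^32))
        printf "%.6f\n", 1 - exp(-n * (n - 1) / (2 * 4294967296))
    }'
}

collision_probability 3000
```

For 3000 hosts the chance of any two sharing an ID comes out at roughly
0.1 %, while for 100000 hosts the same approximation gives well over one
half.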
> You could use /etc/machine-id instead. Some effort goes into that to
> make it actually unique.
I will definitely put this systemd value in my tool box. Again, thank
you very much for mentioning it. :)
> DMI data seems risky because it depends on firmware, and there are so
> many firmware bugs out there.
I did not quite understand what you mean here. Do you mean the DMI
value in your experience isn't unique?
> It would also not address the matter of changing host IDs as the
> result of host migrations.
As far as I can tell, host migration could be solved by storing the
wanted hostid in /etc/hostid when migrating.
On a related note, I had a look at the POSIX definition of
gethostid()[1], and its "Upon successful completion, gethostid() shall
return an identifier for the current host" is definitely very vague. So
glibc is surely not violating POSIX by changing the value when the host
changes IP address or by commonly returning identical IDs on different
machines, but real-world applications, on the other hand, expect the
hostid value to be reasonably unique and fixed across IP changes and
reboots.
[1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/gethostid.html
--
Happy hacking
Petter Reinholdtsen
Information forwarded
to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#595790; Package libc6.
(Wed, 28 Sep 2016 12:15:05 GMT) (full text, mbox, link).
Acknowledgement sent
to Michael Stone <mstone@debian.org>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>.
(Wed, 28 Sep 2016 12:15:05 GMT) (full text, mbox, link).
Message #76 received at 595790@bugs.debian.org (full text, mbox, reply):
On Wed, Sep 28, 2016 at 12:32:04PM +0200, Florian Weimer wrote:
>* Petter Reinholdtsen:
>> Something like this should work, I guess:
>>
>> if [ ! -f /etc/hostid ]; then
>> if [ -e /sys/class/dmi/id/product_uuid ]; then
>> sethostidfromuuid $(cat /sys/class/dmi/id/product_uuid)
>> else
>> dd if=/dev/urandom bs=1 count=4 of=/etc/hostid 2>/dev/null
>> fi
>> fi
>
>That's not very different from /etc/machine-id, isn't it?
>
>> We need to figure out how to transform the UUID to a 32 bit integer,
>> of course.
>
>And I think this is the crux of the problem. Whatever we do, with
>today's cluster sizes it's just not reliably unique.
>
>You could use /etc/machine-id instead. Some effort goes into that to
>make it actually unique.
>
>DMI data seems risky because it depends on firmware, and there are so
>many firmware bugs out there. It would also not address the matter of
>changing host IDs as the result of host migrations.
Yes, this seems a quixotic quest. In historic terms, this was mostly
used on systems that actually had some kind of serial number burned onto
the mainboard; it's fairly useless in the absence of that kind of
controlled environment. Many systems these days actually do have that
sort of ID, e.g., via dmi/smbios, but 1) it's not guaranteed to be there
2) it's unlikely to fit in a 32 bit int.
Other platforms have deprecated gethostid, that's the best way forward for
linux, IMO. This proposal doesn't fix the problem generally and actually
changes the semantics of the call. (It was originally expected that the
value would remain constant independent of a particular OS installation,
which is not a property of a value stored on disk.) The main users of
hostid (that I'm aware of) tended to be commercial software vendors
locking licenses to systems--and they typically didn't use gethostid on
linux because it was useless for the purpose. So I'm not aware of a
userbase for this call on linux, and nobody should be using it for new
development. If you need a stable unique id then you should be using
something like the dmi uuid *and you need to have hardware from a vendor
that sets such a property*.
If you want something tied to the OS instance rather than the machine,
then use /etc/machine-id (and gnash your teeth at the misnomer) rather
than reinventing it.
Mike Stone
Information forwarded
to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#595790; Package libc6.
(Wed, 28 Sep 2016 12:15:07 GMT) (full text, mbox, link).
Acknowledgement sent
to Florian Weimer <fw@deneb.enyo.de>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>.
(Wed, 28 Sep 2016 12:15:07 GMT) (full text, mbox, link).
Message #81 received at 595790@bugs.debian.org (full text, mbox, reply):
* Petter Reinholdtsen:
> [Florian Weimer]
>> That's not very different from /etc/machine-id, isn't it?
>
> Ah, thank you very much for bringing this systemd setting to my
> attention. I was not aware of it.
>
> I agree that it seem very similar in purpose and implementation. Will
> it be available on non-linux Debian architectures too?
It might be possible to port over this part, yes.
>>> We need to figure out how to transform the UUID to a 32 bit integer,
>>> of course.
>>
>> And I think this is the crux of the problem. Whatever we do, with
>> today's cluster sizes it's just not reliably unique.
>
> Well, for the set of machines we have available at work (ca. 3000) it
> would be sufficiently unique.
I simulated 100,000 random assigns of 32-bit host IDs to 3,000 hosts,
and got collisions in 104 cases.
For 5,000 hosts, I got 286, and for 10,000, 1,112 (again in 100,000
runs). I was lazy, it shouldn't be too hard to calculate expected
values accurately.
So a 32-bit value without central coordination is pretty much a time
bomb.
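The expected values can be estimated with the usual birthday
approximation, p ~ 1 - exp(-n(n-1) / (2 * 2^32)). The sketch below
assumes every host draws its ID independently and uniformly at random;
collision_probability is a helper name made up for this illustration.

```shell
#!/bin/sh
# Probability that at least two of N hosts share an ID when each host
# draws a uniformly random value from 2^BITS possibilities.
# Uses the birthday approximation, so it is an estimate, not an exact value.
collision_probability() {
    hosts=$1
    bits=${2:-32}    # ID width in bits; 32 matches gethostid()
    awk -v n="$hosts" -v b="$bits" 'BEGIN {
        space = 2 ^ b
        p = 1 - exp(-n * (n - 1) / (2 * space))
        printf "%.5f\n", p
    }'
}

for n in 3000 5000 10000; do
    printf '%6d hosts: %s\n' "$n" "$(collision_probability "$n")"
done
```

This gives roughly 0.1% for 3,000 hosts, 0.3% for 5,000 and 1.2% for
10,000, which is in line with the simulated counts per 100,000 runs.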
> For most sites it would make the return value from gethostid()
> unique.
The IP address of a host could be better than that. I doubt it is
possible to improve upon the glibc implementation.
>> DMI data seems risky because it depends on firmware, and there are so
>> many firmware bugs out there.
>
> I did not quite understand what you mean here. Do you mean the DMI
> value in your experience isn't unique?
I wouldn't count on them being unique. Most such ID fields are
definitely not, and there are groups out there who strongly oppose
device IDs.
Information forwarded
to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#595790; Package libc6.
(Wed, 28 Sep 2016 21:15:02 GMT) (full text, mbox, link).
Acknowledgement sent
to Petter Reinholdtsen <pere@hungry.com>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>.
(Wed, 28 Sep 2016 21:15:03 GMT) (full text, mbox, link).
Message #86 received at 595790@bugs.debian.org (full text, mbox, reply):
[Michael Stone]
> Other platforms have deprecated gethostid, that's the best way forward for
> linux, IMO.
Which platforms are these? I find that FreeBSD recommends using sysctl and
KERN_HOSTID to get the hostid integer directly from the kernel instead
of using gethostid(), which isn't really deprecating the feature, only
the way to get access to it. A quick search did not show any other
platforms deprecating the function and feature, so I am curious to learn
what those are.
> This proposal doesn't fix the problem generally and actually changes
> the semantics of the call. (It was originally expected that the value
> would remain constant independent of a particular OS installation,
> which is not a property of a value stored on disk.)
My proposal is to use the DMI info which should stay the same
independent of OS installation.
> The main users of hostid (that I'm aware of) tended to be commercial
> software vendors locking licenses to systems--and they typically
> didn't use gethostid on linux because it was useless for the
> purpose. So I'm not aware of a userbase for this call on linux, and
> nobody should be using it for new development.
The users I am aware of are zfs-linux and the tools we wrote at work to
detect when a Linux machine was reinstalled or had its hardware changed.
The zfs-linux use case requires the ID to be unique among the machines
sharing a storage solution, not globally unique.
A search in the source of all Debian packages[1] shows this list of 148
packages mentioning the string 'gethostid': actiona alpine amanda
apcupsd aplus-fsf arpwatch ats-lang-anairiats audit bacula bareos
bluefish bsdgames burp busybox casacore cde cdrdao cdrkit
chromium-browser cl-irc clisp cmucl condor coreutils ctwm cython dc3dd
dcmtk deheader deja-dup dicom3tools dietlibc dist dmtcp dx e17
eclipse-titan edk2 emscripten erlang facter fpc frama-c freebsd-utils
freetds fs-uae ga gcc-h8300-hms gdb ghc glibc gnucash gnulib
gnustep-netclasses golang golang-1.6 golang-1.7 golang-golang-x-sys
hercules highlight hugs98 hurd iputils isdnutils ivtools kfreebsd-10
krb5 ksh latrace ldc libcanberra libconvert-binary-c-perl
libdata-uuid-perl libexplain libpam-tacplus libpcap libposix-2008-perl
linux linux-grsec ltrace lua-posix manpages manpages-de manpages-es
manpages-fr manpages-ja manpages-pl metview minc-tools mingw-w64 mono
mono-reference-assemblies musl nam ncbi-tools6 netatalk newlib nim nmap
nordugrid-arc ns2 ntirpc nwchem open-iscsi open-vm-tools openafs openmpi
otp pidgin pidgin-nateon pimd polygraph praat prayer pulseaudio
python-ptrace qemu radare2 rat roaraudio samhain sbcl silo-llnl sipxtapi
slirp smlnj sniffit spl-linux splint strace talksoup.app tau tcpdump
tcpslice tkrat topal trinity tripwire uclibc uclmmbase uhd uw-imap vde2
xfsdump yap zephyr zfs-fuse zfsutils.
I do not know what they use gethostid() for. :)
[1] curl -s https://codesearch.debian.net/results/2308ff3051ed55cc/packages.json | jq -r '.Packages[]'
--
Happy hacking
Petter Reinholdtsen
Information forwarded
to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#595790; Package libc6.
(Wed, 28 Sep 2016 22:18:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Michael Stone <mstone@debian.org>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>.
(Wed, 28 Sep 2016 22:18:03 GMT) (full text, mbox, link).
Message #91 received at 595790@bugs.debian.org (full text, mbox, reply):
On Wed, Sep 28, 2016 at 11:11:21PM +0200, Petter Reinholdtsen wrote:
>[Michael Stone]
>> Other platforms have deprecated gethostid, that's the best way forward for
>> linux, IMO.
>
>Which platforms is this? I find FreeBSD recommend to use sysctl and
>KERN_HOSTID to get the hostid integer directly from the kernel instead
>of using gethostid(), which isn't really depricating the feature, only
>the way to get access to it. A quick search did not show any other
>platforms depricating the function and feature, so I am curious to learn
>what those are.
OpenBSD deprecates it; NetBSD doesn't have it at all. Neither of those
platforms is likely to have a useful value unless you set it yourself.
I'd wonder where this *is* expected to be a useful value more than I'd
wonder where it isn't.
>My proposal is to use the DMI info which should stay the same
>independent of OS installation.
Which doesn't exist on many, many platforms. If you need an ID tied to
the hardware that's the one to use, but you have to know that the
hardware you're deploying to actually supports that feature.
>The users I am aware of is zfs-linux and the tools we wrote at work to
>detect when a Linux machine was reinstalled or had its hardware changed.
For the latter case, just use the smbios values directly (assuming
you're buying enterprise style hardware, it should support machine
uuids.) That way you know that you're getting something tied to the
hardware, instead of hoping.
>The use case of zfs-linux require the ID to be unique among the machines
>sharing a storage solution, and not globally unique.
I can't understand why, for that use case, zfs-linux wouldn't simply
create a uuid itself. I see no obvious advantage in the program trying
to fix the semantics of a fundamentally broken function that was
introduced in BSD in the 80s and was removed from BSD itself back in the
90s.
>A search in the source of all Debian packages[1] show this list of 148
>packages mentioning the string 'gethostid': actiona alpine amanda
>apcupsd aplus-fsf arpwatch ats-lang-anairiats audit bacula bareos
>bluefish bsdgames burp busybox casacore cde cdrdao cdrkit
>chromium-browser cl-irc clisp cmucl condor coreutils ctwm cython dc3dd
>dcmtk deheader deja-dup dicom3tools dietlibc dist dmtcp dx e17
>eclipse-titan edk2 emscripten erlang facter fpc frama-c freebsd-utils
>freetds fs-uae ga gcc-h8300-hms gdb ghc glibc gnucash gnulib
>gnustep-netclasses golang golang-1.6 golang-1.7 golang-golang-x-sys
>hercules highlight hugs98 hurd iputils isdnutils ivtools kfreebsd-10
>krb5 ksh latrace ldc libcanberra libconvert-binary-c-perl
>libdata-uuid-perl libexplain libpam-tacplus libpcap libposix-2008-perl
>linux linux-grsec ltrace lua-posix manpages manpages-de manpages-es
>manpages-fr manpages-ja manpages-pl metview minc-tools mingw-w64 mono
>mono-reference-assemblies musl nam ncbi-tools6 netatalk newlib nim nmap
>nordugrid-arc ns2 ntirpc nwchem open-iscsi open-vm-tools openafs openmpi
>otp pidgin pidgin-nateon pimd polygraph praat prayer pulseaudio
>python-ptrace qemu radare2 rat roaraudio samhain sbcl silo-llnl sipxtapi
>slirp smlnj sniffit spl-linux splint strace talksoup.app tau tcpdump
>tcpslice tkrat topal trinity tripwire uclibc uclmmbase uhd uw-imap vde2
>xfsdump yap zephyr zfs-fuse zfsutils.
>
>I do not know what they use gethostid() for. :)
Pulling a couple at random:
libpcap -- the only occurrence is in lbl/os-sunos4.h
which is basically a list of function prototypes from a long obsolete OS
for historic curiosity.
xfsdump -- honestly seems like a bug or at least a misunderstanding:
ghdrp->gh_ipaddr = ( uint64_t )( unsigned long )gethostid( )
cdrdao -- questionable assumption in scsi-sun.c:
cpu_type = gethostid() >> 24
burp -- contains a couple of prototypes for the function, checks for it
in configure, doesn't seem to actually use it
This really is a function with no current value that should just be
forgotten. And certainly don't make random assumptions about the value
it returns.
Mike Stone
Information forwarded
to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#595790; Package libc6.
(Wed, 28 Sep 2016 22:33:02 GMT) (full text, mbox, link).
Acknowledgement sent
to Aurelien Jarno <aurelien@aurel32.net>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>.
(Wed, 28 Sep 2016 22:33:02 GMT) (full text, mbox, link).
Message #96 received at 595790@bugs.debian.org (full text, mbox, reply):
On 2016-09-28 09:33, Petter Reinholdtsen wrote:
> Control: reassign -1 libc6
> Control: found -1 2.19-18
> Control: The value from gethostid() should be more unique and not change when the host IP changes
>
> Reassigning to glibc as that is the source of gethostid() where the
> problem with the missing unique identifier originates. Using the
> version number in stable, but the issue have been around before that.
>
> In my work as a system administrator for tens of thousand of machines, I
> have often had the need to get some semi-unique identifier out of the
> operating system. On all other Unix like operating systems, hostid and
> gethostid() will provide this, but not on Linux. I find this rather sad,
> and have had to spend time generating our own solution to the problem
> because gethostid() is useless on Linux.
>
> Because of this, and to spare future system administrators to share that
> pain, I fully support the request from Martin Kraft to extend Debian to
> make sure the gethostid() value return something sensible.
>
> The described approach from FreeBSD, using /etc/hostid,
> /sys/class/dmi/id/product_uuid or a random value (in that order) seem
> like a sensible one. It might make sense to use other sources too, but
> the goal should be to pick a value that will stay the same until the hardware
> is replaced, and pick a value that will stay the same as long as the operating
> system isn't reinstalled if such hardware dependent value do not exist.
>
> To avoid changing the ID on running systems I believe it should only be done
> when libc6 is installed for the first time. Those willing to change their
> hostid at runtime should be provided a simple script to do so instead of doing
> it automatically. It will fix the issue for future installations. I am not
> sure how to sensibly fix it for existing installations without ending up with
> a lot of machines with the same hostid as 7f0100 is a very common hostid on
> Linux already, and everyone with a private IP address like those on 192.168.*
> will have collisions. But then again a 32 bit number can only provide
> 4.294.967.296 unique IDs and with the amount of Linux machines in the world
> there are going to be collisions anyway. We just should reduce the chance to
> a more sensible number.
>
> Something like this should work, I guess:
>
> if [ ! -f /etc/hostid ]; then
> if [ -e /sys/class/dmi/id/product_uuid ]; then
> sethostidfromuuid $(cat /sys/class/dmi/id/product_uuid)
> else
> dd if=/dev/urandom bs=1 count=4 of=/etc/hostid 2>/dev/null
> fi
> fi
>
> We need to figure out how to transform the UUID to a 32 bit integer, of course.
Hmm, DMI is something quite x86/aarch64 specific, so it means we will
always use the /dev/urandom fallback on other architectures.
Another question is about chroots. The above method means we might
end up with the same machine-id in chroots if the DMI UUID is available.
Is that really wanted?
In any case it looks to me like we should not reinvent the wheel. We already
ended up with two implementations of a unique machine ID, one in dbus
and one for systemd (which fortunately now tries to just copy the other
one if it already exists); I am not sure we want a third one. Could we
just copy (part of) this ID if it exists, and otherwise generate a random
number? Or even point the current gethostid() to /etc/machine-id if it
exists?
I am not even sure it's a good idea to fix this, it might be better to
just mark this function as deprecated, and encourage existing users of
this function (including hostid) to use something much longer than
32-bit to avoid collisions.
One thing is sure however, if we change the current behaviour, it will
change the hostid on many systems, including ones which do not return
007f0101.
Aurelien
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
Information forwarded
to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#595790; Package libc6.
(Wed, 28 Sep 2016 22:45:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Michael Stone <mstone@debian.org>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>.
(Wed, 28 Sep 2016 22:45:03 GMT) (full text, mbox, link).
Message #101 received at 595790@bugs.debian.org (full text, mbox, reply):
On Thu, Sep 29, 2016 at 12:30:39AM +0200, Aurelien Jarno wrote:
>Another question is about chroots. The above methods means we might
>end-up with the same machine-id in chroots id the DMI UUID is available.
>Is it something really wanted?
One of the many ambiguities of gethostid. :) Is it a unique ID (no) or
is it something that reflects the hardware (no)? Picking one will annoy
the people who think it's the other, even though both are currently wrong.
>I am not even sure it's a good idea to fix this, it might be better to
>just mark this function as deprecated, and encourage existing users of
>this function (including hostid) to use something much longer than
>32-bit to avoid collisions.
That's my vote, except that hostid(1) probably shouldn't change except
to say that nobody should use it.
>One thing is sure however, if we change the current behaviour, it will
>change the hostid on many systems, including ones which do not return
>007f0101.
Yes, changing it will likely be bad in the off chance that someone is
actually using the value. If you want to "fix" it (that is, define
semantics) it would be better to create a new system call than to change
the return value of a system call whose only semantic is that it returns
a stable value in some (not fully defined) case. Or just explain to
people how to use the options that already exist.
Mike Stone
Information forwarded
to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#595790; Package libc6.
(Thu, 29 Sep 2016 03:09:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Florian Weimer <fw@deneb.enyo.de>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>.
(Thu, 29 Sep 2016 03:09:03 GMT) (full text, mbox, link).
Message #106 received at 595790@bugs.debian.org (full text, mbox, reply):
* Michael Stone:
> Other platforms have deprecated gethostid, that's the best way forward
> for linux, IMO.
I agree. It's the most likely outcome if this issue were reported to
glibc upstream.
Information forwarded
to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#595790; Package libc6.
(Thu, 29 Sep 2016 04:15:02 GMT) (full text, mbox, link).
Acknowledgement sent
to Richard Laager <rlaager@wiktel.com>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>.
(Thu, 29 Sep 2016 04:15:02 GMT) (full text, mbox, link).
Message #111 received at 595790@bugs.debian.org (full text, mbox, reply):
On 09/28/2016 04:41 AM, Petter Reinholdtsen wrote:
> I did not quite understand what you mean here. Do you mean the DMI
> value in your experience isn't unique?
Absolutely, yes. I found this out because, for some reason that I don't
know, libvirtd wants a unique identifier. It defaults to looking at the
UUID from the BIOS. Unfortunately, SuperMicro boards have non-unique UUIDs.
Getting back to ZFS and /etc/hostid... I would think that a
randomly-generated /etc/hostid is probably sufficient. Whether that's
done in the libc, spl, or zfs package makes no difference to me.
--
Richard
Information forwarded
to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#595790; Package libc6.
(Thu, 29 Sep 2016 04:57:05 GMT) (full text, mbox, link).
Acknowledgement sent
to Petter Reinholdtsen <pere@hungry.com>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>.
(Thu, 29 Sep 2016 04:57:05 GMT) (full text, mbox, link).
Message #116 received at 595790@bugs.debian.org (full text, mbox, reply):
[Aurelien Jarno]
> In any case it looks to me we should not reinvent the wheel. We
> already ended-up with two implementations of a unique machine ID, one
> in dbus and one for systemd (which fortunately now try to just copy
> the other one if it already exists), I am not sure we want a third
> one. Could we just copy (part) of this ID if it exists, otherwise
> generate a random number? Or even point the current gethostid() to
> /etc/machine-id if it exists?
Peeking at the dbus and systemd UUID (and perhaps preferring them over
the DMI UUID) seems like a good idea, as long as /etc/hostid is updated
once during installation. Perhaps glibc is the wrong place to do this.
Perhaps a debian-installer udeb is a better place? It will of course
miss chroots, which is unfortunate.
We have /etc/machine-id from systemd, /var/lib/dbus/machine-id from dbus
and /sys/class/dmi/id/product_uuid from DMI which all contain 128 bits
coded as hexadecimal numbers. I guess using the lower 32 bits for
gethostid() is as good as any of the other options.
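A minimal sketch of the "lower 32 bits" idea, assuming the ID is a
128-bit value written as hexadecimal digits, optionally with dashes as
in DMI-style UUIDs. The function names are made up for this
illustration, and the sketch only prints the value; storing it in
/etc/hostid is deliberately left out.

```shell
#!/bin/sh
# Print the low 32 bits (the last 8 hex digits) of a hex-encoded ID.
# Dashes are dropped and the result is lower-cased.
low32_from_hex_id() {
    printf '%s\n' "$1" \
        | sed -e 's/-//g' -e 's/.*\(........\)$/\1/' \
        | tr 'A-F' 'a-f'
}

# Print the first readable ID among the three sources named above.
# Returns non-zero if none can be read (product_uuid usually needs root).
first_available_id() {
    for f in /etc/machine-id /var/lib/dbus/machine-id \
             /sys/class/dmi/id/product_uuid; do
        if [ -r "$f" ]; then
            cat "$f"
            return 0
        fi
    done
    return 1
}

# Example use (not run here):
#   id=$(first_available_id) && low32_from_hex_id "$id"
```

Taking the last eight digits is only one possible choice; any fixed
slice or a hash of the full ID would fit in 32 bits just as well and
would have the same collision behaviour as a random 32-bit value.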
> I am not even sure it's a good idea to fix this, it might be better to
> just mark this function as deprecated, and encourage existing users of
> this function (including hostid) to use something much longer than
> 32-bit to avoid collisions.
Mentioning alternatives with more bits in the gethostid() manual page
definitely sounds like a good idea.
> One thing is sure however, if we change the current behaviour, it will
> change the hostid on many systems, including ones which do not return
> 007f0101.
I agree that it should not be done automatically on existing
installations. This is why I propose to set a value in /etc/hostid only
on first-time installation of libc6, and to document in the manual page how
to set it for those who want to modify an existing installation.
--
Happy hacking
Petter Reinholdtsen
Information forwarded
to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#595790; Package libc6.
(Thu, 29 Sep 2016 07:09:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Florian Weimer <fw@deneb.enyo.de>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>.
(Thu, 29 Sep 2016 07:09:03 GMT) (full text, mbox, link).
Message #121 received at 595790@bugs.debian.org (full text, mbox, reply):
* Richard Laager:
> Getting back to ZFS and /etc/hostid... I would think that a
> randomly-generated /etc/hostid is probably sufficient. Whether that's
> done in the libc, spl, or zfs package makes no difference to me.
As I tried to explain, the risk of collisions without central
coordination looks rather high. glibc's current approach, using the
IP address associated with the host name, provides a certain level of
coordination, avoiding duplicates.
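For illustration, the mapping from address to host ID can be sketched
as follows. This assumes a little-endian machine without /etc/hostid,
where the value appears to be the resolved IPv4 address with its two
16-bit halves swapped; it is a reading of the observed behaviour, not
code taken from glibc, and hostid_from_ipv4 is a made-up helper name.
It does reproduce the two values quoted in this bug log: 007f0101 for
127.0.1.1 and 7f0100 for 127.0.0.1.

```shell
#!/bin/sh
# Print the host ID that a dotted-quad IPv4 address would map to under
# the assumption described above (little-endian, 16-bit halves swapped).
hostid_from_ipv4() {
    old_ifs=$IFS
    IFS=.
    # Split a.b.c.d into $1..$4 (local to this function).
    set -- $1
    IFS=$old_ifs
    # Octet order after the swap is b a d c.
    printf '%02x%02x%02x%02x\n' "$2" "$1" "$4" "$3"
}

hostid_from_ipv4 127.0.1.1
hostid_from_ipv4 127.0.0.1
```

Under this mapping any two hosts whose names resolve to the same
address, such as the 127.0.1.1 entry in /etc/hosts, get the same ID,
so the coordination is only as good as the address assignment.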
Information forwarded
to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#595790; Package libc6.
(Thu, 29 Sep 2016 10:21:02 GMT) (full text, mbox, link).
Acknowledgement sent
to Michael Stone <mstone@debian.org>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>.
(Thu, 29 Sep 2016 10:21:02 GMT) (full text, mbox, link).
Message #126 received at 595790@bugs.debian.org (full text, mbox, reply):
On Wed, Sep 28, 2016 at 09:03:38PM -0700, Richard Laager wrote:
>Getting back to ZFS and /etc/hostid... I would think that a
>randomly-generated /etc/hostid is probably sufficient. Whether that's
>done in the libc, spl, or zfs package makes no difference to me.
You still haven't explained why zfs doesn't just generate a uuid itself.
There's a large body of work ensuring reasonable uniqueness for uuids,
and there isn't a clear benefit to clinging to gethostid. Even on solaris
there's a big honkin' warning on the man page that it isn't guaranteed
to be unique (IIRC, gethostid on containers reflects the hardware the
container is running on).
Mike Stone
Information forwarded
to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#595790; Package libc6.
(Thu, 29 Sep 2016 20:03:02 GMT) (full text, mbox, link).
Acknowledgement sent
to Richard Laager <rlaager@wiktel.com>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>.
(Thu, 29 Sep 2016 20:03:02 GMT) (full text, mbox, link).
Message #131 received at 595790@bugs.debian.org (full text, mbox, reply):
On 09/29/2016 05:19 AM, Michael Stone wrote:
> On Wed, Sep 28, 2016 at 09:03:38PM -0700, Richard Laager wrote:
>> Getting back to ZFS and /etc/hostid... I would think that a
>> randomly-generated /etc/hostid is probably sufficient. Whether that's
>> done in the libc, spl, or zfs package makes no difference to me.
>
> You still haven't explained why zfs doesn't just generate a uuid itself.
>
> There's a large body of work ensuring reasonable uniqueness for uuids,
> and there isn't a clear benefit to clinging to getuid.
It can't be a full UUID. The on-disk format of ZFS uses a 32-bit
integer. It doesn't really matter what we use to derive it, but a 32-bit
integer is the constraint.
For example, if you want to use the low 32-bits of /etc/machine-id, that
would work too. It'd mean carrying a patch on Debian, but if the pain of
a patch and different behavior is less than the benefits of the change,
go for it.
> Even on solaris
> there's a big honkin' warning on the man page that it isn't guaranteed
> to be unique (IIRC, getuid on containers reflects the hardware the
> container is running on).
On Solaris the zone (container) wouldn't import the pool. Pools are
imported in the "global zone". So this isn't a problem.
--
Richard
Information forwarded
to debian-bugs-dist@lists.debian.org, GNU Libc Maintainers <debian-glibc@lists.debian.org>:
Bug#595790; Package libc6.
(Thu, 29 Sep 2016 20:39:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Petter Reinholdtsen <pere@hungry.com>:
Extra info received and forwarded to list. Copy sent to GNU Libc Maintainers <debian-glibc@lists.debian.org>.
(Thu, 29 Sep 2016 20:39:03 GMT) (full text, mbox, link).
Message #136 received at 595790@bugs.debian.org (full text, mbox, reply):
[Richard Laager]
> For example, if you want to use the low 32-bits of /etc/machine-id,
> that would work too. It'd mean carrying a patch on Debian, but if the
> pain of a patch and different behavior is less than the benefits of
> the change, go for it.
I guess we would have to verify that /etc/machine-id is available in the
initrd for this to work with / on zfs. But I guess that is a problem
with /etc/hostid too for gethostid(). :)
While researching this topic I came across
<URL: http://stackoverflow.com/questions/9258228/how-to-prevent-gethostid-from-doing-dns-lookups-on-linux >
which reports that gethostid() might lock up a program if the DNS server
becomes unavailable. A scary scenario just to get the machine ID.
I also came across <URL: http://0pointer.de/blog/projects/ids.html >,
which provides a very useful list of possible IDs to use in addition to
the gethostid() value. It agrees that gethostid() has unclear
semantics. :)
--
Happy hacking
Petter Reinholdtsen
Debian bug tracking system administrator <owner@bugs.debian.org>.
Last modified:
Sat Jul 1 20:41:49 2023;