Debian Bug report logs - #541198
python-mysqldb: utf8_bin collation will not convert to Unicode strings


Package: python-mysqldb; Maintainer for python-mysqldb is Debian Python Modules Team <python-modules-team@lists.alioth.debian.org>; Source for python-mysqldb is src:python-mysqldb.

Reported by: Christoph Burgmer <chrislb@gmx.de>

Date: Wed, 12 Aug 2009 12:03:01 UTC

Severity: normal

Found in version python-mysqldb/1.2.2-8

Forwarded to http://sourceforge.net/tracker/?func=detail&aid=2837134&group_id=22307&atid=374932



Report forwarded to debian-bugs-dist@lists.debian.org, chrislb@gmx.de, Debian Python Modules Team <python-modules-team@lists.alioth.debian.org>:
Bug#541198; Package python-mysqldb. (Wed, 12 Aug 2009 12:03:04 GMT)

Acknowledgement sent to Christoph Burgmer <chrislb@gmx.de>:
New Bug report received and forwarded. Copy sent to chrislb@gmx.de, Debian Python Modules Team <python-modules-team@lists.alioth.debian.org>. (Wed, 12 Aug 2009 12:03:04 GMT)

Message #5 received at submit@bugs.debian.org:

From: Christoph Burgmer <chrislb@gmx.de>
To: Debian Bug Tracking System <submit@bugs.debian.org>
Subject: python-mysqldb: utf8_bin collation will not convert to Unicode strings
Date: Wed, 12 Aug 2009 14:01:35 +0200
Package: python-mysqldb
Version: 1.2.2-8
Severity: normal

A string type column with a utf8_bin collation will not be converted to a
Python Unicode string, but instead will be returned as a utf8 (byte) string.

The MySQL documentation though clearly states: "A nonbinary string has a
character set and is converted to another character set in many cases, even
when the string has a _bin collation"[1].

I understand that a string with utf8_bin collation is still a string and
thus should not be dealt with differently. The utf8_bin collation is
essential when working with Unicode without wanting the Unicode collation
algorithm to kick in.
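
To illustrate the difference, here is a rough sketch in Python. It only
emulates the two behaviours: `binary_equal` compares encoded bytes, as a
_bin collation does, and `loose_equal` is a hypothetical stand-in for a
case- and accent-insensitive collation. It does not reproduce MySQL's actual
collation tables.

```python
import unicodedata


def binary_equal(a, b):
    """Compare two text strings byte by byte, as a _bin collation would."""
    return a.encode("utf-8") == b.encode("utf-8")


def loose_equal(a, b):
    """Crude case- and accent-insensitive comparison (an approximation only)."""
    def fold(s):
        # Split accented letters into base letter plus combining mark,
        # drop the combining marks, then ignore case.
        decomposed = unicodedata.normalize("NFD", s)
        stripped = u"".join(
            ch for ch in decomposed if not unicodedata.combining(ch)
        )
        return stripped.lower()
    return fold(a) == fold(b)
```

With these helpers, `binary_equal(u"\xfc", u"u")` is false while
`loose_equal(u"\xfc", u"u")` is true, which is the kind of folding a binary
collation avoids.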

How to reproduce:

CREATE TABLE t1 (
    a CHAR(10) CHARACTER SET utf8 COLLATE utf8_bin
);

INSERT INTO t1 VALUES ('ü');

In Python:
>>> import MySQLdb
>>> db = MySQLdb.connect(db='pymysqltest', charset='utf8', use_unicode=True)
>>> cur = db.cursor()
>>> cur.execute("SELECT a FROM t1;")
1L
>>> cur.fetchall()
(('\xc3\xbc',),)

Choosing utf8_general_ci instead of utf8_bin will properly yield Unicode
objects:

>>> cur.execute("SELECT a COLLATE utf8_general_ci FROM t1;")
1L
>>> cur.fetchall()
((u'\xfc',),)
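
As a possible stopgap, the caller can decode the byte strings by hand after
fetching. This is a minimal sketch, assuming the connection charset is utf8
as in the session above; `decode_rows` is a hypothetical helper and not part
of MySQLdb.

```python
def decode_rows(rows, encoding="utf-8"):
    """Return a copy of `rows` with every byte-string column decoded.

    `rows` is a sequence of row tuples, as returned by cursor.fetchall().
    Columns that are not byte strings (numbers, None, text that is
    already Unicode) are passed through unchanged.
    """
    return tuple(
        tuple(
            col.decode(encoding) if isinstance(col, bytes) else col
            for col in row
        )
        for row in rows
    )
```

Applied to the result above, `decode_rows(cur.fetchall())` would turn the
byte string '\xc3\xbc' into u'\xfc'. Note that this also decodes genuine
binary columns, so it is only suitable when no BLOB or BINARY data is
selected.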

[1] http://dev.mysql.com/doc/refman/5.1/en/charset-binary-collations.html

-- System Information:
Debian Release: squeeze/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: i386 (i686)

Kernel: Linux 2.6.30-1-686 (SMP w/1 CPU core)
Locale: LANG=de_DE@euro, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages python-mysqldb depends on:
ii  libc6                         2.9-23     GNU C Library: Shared libraries
ii  libmysqlclient16              5.1.37-1   MySQL database client library
ii  python                        2.5.4-2    An interactive high-level object-o
ii  python-support                1.0.3      automated rebuilding support for P

python-mysqldb recommends no packages.

Versions of packages python-mysqldb suggests:
ii  mysql-server                  5.1.37-1   MySQL database server (metapackage
ii  mysql-server-5.1 [mysql-serve 5.1.37-1   MySQL database server binaries
ii  python-egenix-mxdatetime      3.1.2-1    date and time handling routines fo
pn  python-mysqldb-dbg            <none>     (no description available)

-- no debconf information




Information forwarded to debian-bugs-dist@lists.debian.org, Debian Python Modules Team <python-modules-team@lists.alioth.debian.org>:
Bug#541198; Package python-mysqldb. (Thu, 13 Aug 2009 21:03:11 GMT)

Acknowledgement sent to Jonas Meurer <jonas@freesources.org>:
Extra info received and forwarded to list. Copy sent to Debian Python Modules Team <python-modules-team@lists.alioth.debian.org>. (Thu, 13 Aug 2009 21:03:11 GMT)

Message #10 received at 541198@bugs.debian.org:

From: Jonas Meurer <jonas@freesources.org>
To: Christoph Burgmer <chrislb@gmx.de>, 541198@bugs.debian.org
Subject: Re: [Python-modules-team] Bug#541198: python-mysqldb: utf8_bin collation will not convert to Unicode strings
Date: Thu, 13 Aug 2009 23:01:47 +0200
hey,

On 12/08/2009 Christoph Burgmer wrote:
> A string type column with a utf8_bin collation will not be converted to a
> Python Unicode string, but instead will be returned as a utf8 (byte) string.
> 
> The MySQL documentation though clearly states: "A nonbinary string has a
> character set and is converted to another character set in many cases, even
> when the string has a _bin collation"[1].
> 
> I understand that a string with utf8_bin collation is still a string and
> thus should not be dealt with differently. The utf8_bin collation is
> essential when working with Unicode without wanting the Unicode collation
> algorithm to kick in.

thanks for the bugreport. i forwarded it to the upstream bug tracking
system at sourceforge:
http://sourceforge.net/tracker/?func=detail&aid=2837134&group_id=22307&atid=374932

greetings,
 jonas

Set Bug forwarded-to-address to 'http://sourceforge.net/tracker/?func=detail&aid=2837134&group_id=22307&atid=374932'. Request was from Jonas Meurer <mejo@debian.org> to control@bugs.debian.org. (Thu, 13 Aug 2009 21:03:14 GMT)

Information forwarded to debian-bugs-dist@lists.debian.org, Debian Python Modules Team <python-modules-team@lists.alioth.debian.org>:
Bug#541198; Package python-mysqldb. (Sat, 10 Dec 2011 13:51:07 GMT)

Acknowledgement sent to Philipp Spitzer <philipp+debian@spitzer.priv.at>:
Extra info received and forwarded to list. Copy sent to Debian Python Modules Team <python-modules-team@lists.alioth.debian.org>. (Sat, 10 Dec 2011 13:51:07 GMT)

Message #17 received at 541198@bugs.debian.org:

From: Philipp Spitzer <philipp+debian@spitzer.priv.at>
To: 541198@bugs.debian.org
Subject: Thanks for reporting this bug
Date: Sat, 10 Dec 2011 14:50:27 +0100
It is still present in upstream python-mysqldb 1.2.3.






Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Sat Apr 19 18:11:06 2014; Machine Name: beach.debian.org
