Debian Bug report logs - #181378
grep is extremely slow


Package: grep; Maintainer for grep is Anibal Monsalve Salazar <anibal@debian.org>; Source for grep is src:grep.

Reported by: Max Zou <zoum@mzou.net>

Date: Mon, 17 Feb 2003 15:18:03 UTC

Severity: important

Tags: patch

Merged with 206470, 224993, 442882

Found in versions 2.5.1-1, 2.5.1-5, 2.5.1.ds1-2, grep/2.5.1.ds1-5, grep/2.5.1.ds2-1, grep/2.5.3~dfsg-1, grep/2.5.3~dfsg-2

Fixed in versions 2.5.1.ds2-6, grep/2.5.3~dfsg-3

Done: Anibal Monsalve Salazar <anibal@debian.org>

Bug is archived. No further changes may be made.



Report forwarded to debian-bugs-dist@lists.debian.org, Clint Adams <schizo@debian.org>, grep@packages.qa.debian.org:
Bug#181378; Package grep.


Acknowledgement sent to Max Zou <zoum@mzou.net>:
New Bug report received and forwarded. Copy sent to Clint Adams <schizo@debian.org>, grep@packages.qa.debian.org.


Message #5 received at submit@bugs.debian.org:

From: Max Zou <zoum@mzou.net>
To: submit@bugs.debian.org
Subject: grep is extremely slow
Date: Mon, 17 Feb 2003 23:10:28 +0800
[Message part 1 (text/plain, inline)]
Package: grep
Version: 2.5.1-1

When I try to use the latest "grep" to search for a pattern in a 100-KB file,
it is considerably slower than the previous version of grep.

Here is a time comparison with grep v2.4.2-3 on the same
machine.

# using grep v2.4.2-3
$ time ./grep-old test file |wc
2232    6696   44261

real    0m0.058s
user    0m0.060s
sys     0m0.000s

# using grep v2.5.1-1
$ time grep test srcfp/a |wc
2232    6696   44261

real    0m10.497s
user    0m10.430s
sys     0m0.010s

Is there any problem with the algorithm used in the latest grep?

I am using Debian sid, kernel-2.4.18 and libc6 2.3.1-11 on
a PIII 700MHz machine with 384MB RAM.

Thanks!

--
regards
ZM
[Message part 2 (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, rmgolbeck@debian.org (Ryan M. Golbeck):
Bug#181378; Package grep.


Acknowledgement sent to Michel Daenzer <daenzer@debian.org>:
Extra info received and forwarded to list. Copy sent to rmgolbeck@debian.org (Ryan M. Golbeck).


Message #10 received at 181378@bugs.debian.org:

From: Michel Daenzer <daenzer@debian.org>
To: Debian Bug Tracking System <181378@bugs.debian.org>
Subject: Depends on locale
Date: Mon, 11 Aug 2003 16:52:10 +0200
Package: grep
Version: 2.5.1-5
Followup-For: Bug #181378

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


I experienced the same problem, but I just noticed that grep is fast
when the LC_ALL environment variable is set to C. Here's the output of
locale on my system:

LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE=C
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES=en_US.UTF-8
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

Seems like grep handles character encoding conversions inefficiently or
something?


- -- System Information:
Debian Release: testing/unstable
Architecture: powerpc
Kernel: Linux thor 2.4.20-ben8-xfs-lolat #18 Wed Aug 6 10:56:56 CEST 2003 ppc
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8

Versions of packages grep depends on:
ii  libc6                         2.3.1-16   GNU C Library: Shared libraries an

- -- no debconf information

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/N62aWoGvjmrbsgARArZVAJ4iVqUDptDeldcvNgA2DlWmoOuXnwCffIQ5
n9TZeES03gReAsL5IkS9ock=
=KnnW
-----END PGP SIGNATURE-----



Information forwarded to debian-bugs-dist@lists.debian.org, Max Zou <zoum@mzou.net>, Dan Jacobson <jidanni@jidanni.org>, rmgolbeck@debian.org (Ryan M. Golbeck):
Bug#181378; Package grep.


Acknowledgement sent to "H. S. Teoh" <hsteoh@quickfur.ath.cx>:
Extra info received and forwarded to list. Copy sent to Max Zou <zoum@mzou.net>, Dan Jacobson <jidanni@jidanni.org>, rmgolbeck@debian.org (Ryan M. Golbeck).


Message #15 received at 181378@bugs.debian.org:

From: "H. S. Teoh" <hsteoh@quickfur.ath.cx>
To: control@bugs.debian.org, 181378@bugs.debian.org, 206470@bugs.debian.org
Subject: Identical bugs
Date: Thu, 4 Sep 2003 15:02:38 -0400
merge 181378 206470
thanks

These bugs appear to be the same (see latest messages in #181378).

As for the bugs themselves, could it be that the problem is caused by grep
localizing every input character, as opposed to localizing the regex and
then matching the resulting bytes? I haven't looked at the code to be
sure, but this is what immediately came to mind when I read about the
LC_CTYPE=C speed difference.

Translating every input character would, indeed, slow things down a lot. A
better alternative would be to localize the regex, match on a byte-by-byte
basis, and then localize the output only if it matches. However, this may
have pathological problems if multiple representations of the same
character are possible (e.g. Unicode combining diacritics vs. precomposed
characters). I'm not sure what the solution would be in this case.


T

-- 
If you look at a thing nine hundred and ninety-nine times, you are perfectly
safe; if you look at it the thousandth time, you are in frightful danger of
seeing it for the first time. -- G. K. Chesterton



Merged 181378 206470. Request was from "H. S. Teoh" <hsteoh@quickfur.ath.cx> to control@bugs.debian.org.


Severity set to `important'. Request was from Dan Jacobson <jidanni@jidanni.org> to control@bugs.debian.org.


Message sent on to Max Zou <zoum@mzou.net>:
Bug#181378.


Message #22 received at 181378-submitter@bugs.debian.org:

From: Michel Dänzer <daenzer@debian.org>
To: 181378-submitter@bugs.debian.org
Subject: patch
Date: Sat, 13 Dec 2003 16:19:43 +0100
[Message part 1 (text/plain, inline)]
This patch seems to help; I extracted it from the src rpm at

http://download.fedora.redhat.com/pub/fedora/linux/core/updates/1/SRPMS/

and tweaked one hunk for it to apply.


-- 
Earthling Michel Dänzer      |     Debian (powerpc), X and DRI developer
Software libre enthusiast    |   http://svcs.affero.net/rm.php?r=daenzer
[56-grep-2.5.1-gofast.patch (text/x-patch, attachment)]

Information forwarded to debian-bugs-dist@lists.debian.org, rmgolbeck@debian.org (Ryan M. Golbeck):
Bug#181378; Package grep.


Acknowledgement sent to Roland Illig <roland.illig@gmx.de>:
Extra info received and forwarded to list. Copy sent to rmgolbeck@debian.org (Ryan M. Golbeck).


Message #27 received at 181378@bugs.debian.org:

From: Roland Illig <roland.illig@gmx.de>
To: Debian Bug Tracking System <181378@bugs.debian.org>
Subject: grep: ... and Perl is a thousand times faster ...
Date: Wed, 18 Feb 2004 04:44:26 +0100
Package: grep
Version: 2.5.1.ds1-2
Severity: normal
Followup-For: Bug #181378

I found the magic frontier of grep: DFAs with 1024 states.

Please make grep a little quicker or replace it completely with pcre or
another fast implementation, as far as POSIX allows it.


$ time egrep .\{1024,\} debug | wc
    109   17987  169422

real    1m47.522s
user    1m28.960s
sys     0m3.680s

$ time egrep .\{1023,\} debug | wc
    109   17987  169422

real    0m1.074s
user    0m0.940s
sys     0m0.100s

$ time perl -ne '/.{1024,}/ and print' debug | wc 
    109   17987  169422

real    0m0.077s
user    0m0.070s
sys     0m0.000s


-- System Information:
Debian Release: testing/unstable
Architecture: i386
Kernel: Linux wwid 2.4.22-1-k7 #5 Sat Oct 4 14:11:12 EST 2003 i686
Locale: LANG=de_DE.ISO-8859-15@euro, LC_CTYPE=C (ignored: LC_ALL set to de_DE@euro)

Versions of packages grep depends on:
ii  libc6                       2.3.2.ds1-11 GNU C Library: Shared libraries an

-- no debconf information




Information forwarded to debian-bugs-dist@lists.debian.org, rmgolbeck@debian.org (Ryan M. Golbeck):
Bug#181378; Package grep.


Acknowledgement sent to Peter Moulder <Peter.Moulder@infotech.monash.edu.au>:
Extra info received and forwarded to list. Copy sent to rmgolbeck@debian.org (Ryan M. Golbeck).


Message #32 received at 181378@bugs.debian.org:

From: Peter Moulder <Peter.Moulder@infotech.monash.edu.au>
To: 181378@bugs.debian.org
Subject: perl not fair comparison: perl gets "wrong" answer for utf-8 text
Date: Tue, 29 Jun 2004 23:15:22 +1000
Suppose UTF-8 LC_CTYPE.

  $ (echo rôle; echo role) | grep 'r.le'
  rôle
  role
  $ (echo rôle; echo role) | perl -ne '/r.le/ and print'
  role
  $ (echo rôle; echo role) | grep 'r..le'
  $ (echo rôle; echo role) | perl -ne '/r..le/ and print'
  rôle

(This is with perl_5.8.3-3, grep_2.5.1.ds1-2.)

Perl is using octet/byte regexps, whereas grep is using character
regexps.  Although arguable, I believe users would prefer grep's
behaviour (other than its speed).
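
The difference between the two behaviours can be reproduced outside grep and Perl. The following is a minimal sketch in Python, not taken from the thread, assuming the input is UTF-8: a `str` pattern matches per character, while a `bytes` pattern matches per octet.

```python
import re

# "rôle" is 4 characters but 5 bytes in UTF-8: ô encodes as 0xC3 0xB4.
text = "rôle"
octets = text.encode("utf-8")

# Character regexp (grep's behaviour in a UTF-8 locale): "." consumes ô whole.
char_one_dot = re.fullmatch(r"r.le", text) is not None
char_two_dots = re.fullmatch(r"r..le", text) is not None

# Octet regexp (what the Perl one-liner does here): "." consumes one byte.
byte_one_dot = re.fullmatch(rb"r.le", octets) is not None
byte_two_dots = re.fullmatch(rb"r..le", octets) is not None

print("character regexp:", char_one_dot, char_two_dots)
print("octet regexp:    ", byte_one_dot, byte_two_dots)
```

The character regexp accepts 'r.le' and rejects 'r..le'; the octet regexp does the opposite, which mirrors the grep and perl results shown above.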


I believe a better solution would be for grep to convert the character
regexp to an octet regexp.  E.g. the character regexp "." (which I'll assume
for simplicity matches any character) might be translated to
(?:[\x00-\x7f]|[\xc0-\xf7][\x80-\xbf]*).
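
As an illustration only, the translation can be tried with Python's `re` module on byte strings. The helper name `octet_match` is made up for this sketch, and the expression is copied as proposed, so it does not validate UTF-8 (a lone lead byte also matches because of the `*`).

```python
import re

# Octet-level stand-in for the character regexp ".", as proposed above:
# one ASCII byte, or a lead byte followed by any continuation bytes.
DOT = rb"(?:[\x00-\x7f]|[\xc0-\xf7][\x80-\xbf]*)"


def octet_match(pattern: bytes, text: str) -> bool:
    """Match a byte-level pattern against the UTF-8 encoding of text."""
    return re.fullmatch(pattern, text.encode("utf-8")) is not None


one_char = rb"r" + DOT + rb"le"          # translation of 'r.le'
two_chars = rb"r" + DOT + DOT + rb"le"   # translation of 'r..le'

for word in ("rôle", "role"):
    print(word, octet_match(one_char, word), octet_match(two_chars, word))
```

With this translation the octet matcher agrees with grep's character semantics for both test words: 'r.le' matches each of them and 'r..le' matches neither.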

That translation assumes that an accented character formed by
composition is to be considered distinct from a single unicode character
(H. S. Teoh's example above).  I'm not familiar with the unicode spec.
Maybe it's reasonable to consider them different.  Otherwise, I believe
the translate-the-regexp approach is still applicable but requires
longer translations.


However, I wonder if the problem is just that the conversion of the
input stream to wchars is inefficient.  Off hand, I don't see why it
should make things so much slower.


pjrm.




Information forwarded to debian-bugs-dist@lists.debian.org, rmgolbeck@debian.org (Ryan M. Golbeck):
Bug#181378; Package grep.


Acknowledgement sent to Peter Moulder <Peter.Moulder@infotech.monash.edu.au>:
Extra info received and forwarded to list. Copy sent to rmgolbeck@debian.org (Ryan M. Golbeck).


Message #37 received at 181378@bugs.debian.org:

From: Peter Moulder <Peter.Moulder@infotech.monash.edu.au>
To: 181378@bugs.debian.org
Subject: gprof; combining diacritical marks; octet regexp conversion
Date: Wed, 30 Jun 2004 16:46:42 +1000
According to gprof on grep compiled with -pg (without installing libc6-prof),
~all the time is spent in check_multibyte_string.

In the case of utf-8, we don't need the return value of
check_multibyte_string: bytes other than the initial byte of utf-8
characters (ignoring the composition case) have (c & 0xc0) == 0x80 (i.e. bit7=1, bit6=0).
To handle the composition case, we add a test that the wide character
is a combining diacritical: "|| (c == 0xcc) || ((c == 0xcd) && (nextchar
<= 0xaf))".
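
A sketch of that byte test in Python, for illustration. The function names are hypothetical and this is not grep's code; it assumes well-formed UTF-8 and treats only U+0300..U+036F (encoded as CC 80..CC BF and CD 80..CD AF) as combining marks.

```python
def is_char_start(buf: bytes, i: int, join_combining: bool = False) -> bool:
    """True if buf[i] begins a character, per the byte tests described above."""
    c = buf[i]
    if (c & 0xC0) == 0x80:
        return False  # continuation byte: bit7=1, bit6=0
    if join_combining:
        nxt = buf[i + 1] if i + 1 < len(buf) else None
        # Lead bytes of the combining diacritical marks U+0300..U+036F.
        if c == 0xCC or (c == 0xCD and nxt is not None and nxt <= 0xAF):
            return False
    return True


def char_starts(buf: bytes, join_combining: bool = False) -> list:
    """Offsets at which a character starts in buf."""
    return [i for i in range(len(buf)) if is_char_start(buf, i, join_combining)]


print(char_starts("rôle".encode("utf-8")))
print(char_starts(b"e\xcc\x80"), char_starts(b"e\xcc\x80", join_combining=True))
```

No pass over the preceding bytes is needed to answer the question for a given offset, which is the point of the observation.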


Combining diacritical marks are a hassle for grep.  According to
http://en.wikipedia.org/wiki/Combining_diacritical_mark, a character can
be followed by more than one combining diacritical mark character.
Presumably, order doesn't matter, so grep 'a<string of n combining
diacritical mark characters>' can match n factorial different strings in
the haystack text, without counting use of precomposed characters.
The simplest way of handling this would be to convert to a canonical
form, say decomposed form with combining diacritical marks in sorted
order.

Note that I haven't checked the unicode standards on this point:
possibly order is to be considered significant, in which case the only
possible matches are decomposed vs use of precomposed character.  This
would make the convert-to-octet-regexp approach practical (see below).
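
For reference, the canonical decomposed form discussed here corresponds to what Unicode calls NFD. A small sketch using Python's standard `unicodedata` module (the helper name `canonical` is made up) shows how it treats the cases in question: precomposed and decomposed spellings become equal, marks of different combining classes are put into a fixed order, and marks of the same class keep their order.

```python
import unicodedata


def canonical(s: str) -> str:
    """Canonical decomposed form (Unicode NFD)."""
    return unicodedata.normalize("NFD", s)


precomposed = "\u00e8"   # è as a single code point
decomposed = "e\u0300"   # e followed by COMBINING GRAVE ACCENT
same_after_nfd = canonical(precomposed) == canonical(decomposed)

# Acute (class 230) and dot below (class 220) are reordered canonically.
reordered = canonical("a\u0301\u0323") == canonical("a\u0323\u0301")

# Macron and acute share class 230, so their relative order is preserved.
order_kept = canonical("e\u0304\u0301") != canonical("e\u0301\u0304")

print(same_after_nfd, reordered, order_kept)
```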

Another issue with decomposable characters is that we must use negative
lookahead tests: if searching for `o' then we must check that the
matched 'o' isn't followed by a combining diacritical mark character.
The alternative of canonicalizing to precomposed form instead of
decomposed form has its own expense: if there are 112 possible
diacritical mark characters, and characters can be followed by an
arbitrary selection of those, then we need an extra 112 bits per
canonical character to represent those.  And even that assumes that only
presence or absence (rather than number or order) is significant for
combining diacritical marks, i.e. that e<macron><acute><acute> is to be
considered equivalent to e<acute><macron>.  If number and order are
significant then no finite number of bits suffices.

Note that grep doesn't currently handle combining diacritical marks:

  $ printf 'e\xcc\x80\n'
  è
  $ printf 'e\xcc\x80\n' | grep 'è'
  <nothing>
  $ printf 'e\xcc\x80\n' | grep '^.$'
  <nothing>


More remarks on the idea of converting character regexps to byte regexps
(see previous message).

First, note that it works for UTF-8, but not e.g. GBK, precisely because
in UTF-8 one can't mistake the middle of a character for the beginning
of one.  E.g. in GBK, the string for 我我 contains the string for 椅;
there is no way to tell where the character beginnings are short of
something like what grep is already doing.
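
The GBK example can be checked with a byte-level substring search. This is a small Python sketch with a made-up helper name, relying on Python's built-in `gbk` and `utf-8` codecs:

```python
def contains_encoded(haystack: str, needle: str, encoding: str) -> bool:
    """Byte-level substring search on the encoded forms of both strings."""
    return needle.encode(encoding) in haystack.encode(encoding)


# In GBK the second byte of one 我 plus the first byte of the next spell 椅.
gbk_false_hit = contains_encoded("我我", "椅", "gbk")

# In UTF-8 a character's encoding cannot start in the middle of another one.
utf8_false_hit = contains_encoded("我我", "椅", "utf-8")

print(gbk_false_hit, utf8_false_hit)
```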

Is it worth adding special code for UTF-8 (probably sharable with other
UTF encodings) if we still need something like the current code to
handle GBK and other multi-byte encodings?  Well, UTF-8 is likely to
become the primary encoding on Debian systems.

The example translation given for "." may look discouragingly complex.
However, we should note that many, perhaps most, common
regexps don't need any translation at all (ignoring character
composition issues).  E.g. '[abc]a.*b($|\>)' doesn't need any change.
'rô*l' needs a minor change of wrapping the multi-byte ô: 'r(?:ô)*l'.
Similarly for character ranges that include multi-byte characters:
'r[ioôé]le' becomes 'r(?:[io]|ô|é)le'.  The fragment '.+' (extended
regexp) may or may not need change, depending on whether preceding &
following fragments only match whole characters.  E.g. 'a.+b' doesn't
need change, but 'a.+.+b' does need changing.  Whether '[a-z]' needs
changing depends on LC_COLLATE, i.e. it depends whether that range
includes ô and so on.  However, in most places where a regexp uses
[a-z], one would want to exclude ô and so on, so it would be a bug not
to specify LC_COLLATE=C.
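
The need for wrapping can be seen by running the untranslated and translated patterns as octet regexps. The sketch below uses Python's `re` on UTF-8 bytes; `search_utf8` is a hypothetical helper, and Python's regexp syntax is only a stand-in for grep's.

```python
import re


def search_utf8(pattern: str, text: str) -> bool:
    """Run pattern as an octet regexp over the UTF-8 encoding of text."""
    return re.search(pattern.encode("utf-8"), text.encode("utf-8")) is not None


# Unwrapped, "*" binds only to the last byte of ô (0xB4), not the whole character.
star_naive = search_utf8("rô*l", "rôôl")
star_wrapped = search_utf8("r(?:ô)*l", "rôôl")

# Unwrapped, the bracket expression matches single bytes, never the two-byte ô.
class_naive = search_utf8("r[ioôé]le", "rôle")
class_wrapped = search_utf8("r(?:[io]|ô|é)le", "rôle")

print(star_naive, star_wrapped, class_naive, class_wrapped)
```

In both cases only the wrapped form finds the match, while ASCII-only alternatives such as 'rile' behave the same either way.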

I've now checked the Single Unix Spec for the regexp '.': it matches all
characters other than NUL (and in particular it matches \n, except that
grep should already have split its input into lines excluding \n).


pjrm.




Severity set to `grave'. Request was from Dan Jacobson <jidanni@jidanni.org> to control@bugs.debian.org.


Merged 181378 206470 224993. Request was from Dan Jacobson <jidanni@jidanni.org> to control@bugs.debian.org.


Severity set to `important'. Request was from Colin Watson <cjwatson@debian.org> to control@bugs.debian.org.


Information forwarded to debian-bugs-dist@lists.debian.org, rmgolbeck@debian.org (Ryan M. Golbeck):
Bug#181378; Package grep.


Acknowledgement sent to Justin Pryzby <justinpryzby@users.sourceforge.net>:
Extra info received and forwarded to list. Copy sent to rmgolbeck@debian.org (Ryan M. Golbeck).


Message #48 received at 181378@bugs.debian.org:

From: Justin Pryzby <justinpryzby@users.sourceforge.net>
To: 181378@bugs.debian.org
Subject: profile
Date: Mon, 6 Dec 2004 10:20:37 -0500
I don't know that I can add useful info here, but this just bit me
too.

  $ wc -l /tmp/setuid;
  50 /tmp/setuid

  $ time ltrace grep -v /dev/ /tmp/setuid 2>&1 |LANG=C grep '^mbrtowc(' |wc -l
  77989

  real    0m14.802s
  user    0m6.118s
  sys     0m8.071s

  $ calc 50/14.802
          ~3.37792190244561545737

Justin



Information forwarded to debian-bugs-dist@lists.debian.org, rmgolbeck@debian.org (Ryan M. Golbeck):
Bug#181378; Package grep.


Acknowledgement sent to Simon Law <sfllaw@debian.org>:
Extra info received and forwarded to list. Copy sent to rmgolbeck@debian.org (Ryan M. Golbeck).


Message #53 received at 181378@bugs.debian.org:

From: Simon Law <sfllaw@debian.org>
To: 181378@bugs.debian.org
Subject: Red Hat's UTF-8 speedup patch
Date: Wed, 8 Dec 2004 16:29:56 -0500
[Message part 1 (text/plain, inline)]
tags 181378 +patch
thanks

Here is Fedora Core 3's patch to grep that makes it work quickly in
UTF-8 environments.

I haven't tested if it applies cleanly to Debian's grep.  But if you
make me a co-maintainer, I'll happily spend a couple of hours merging
this and other useful Red Hat patches into our grep.

Simon
[grep-2.5.1-gofast.patch (text/plain, attachment)]

Tags added: patch. Request was from Simon Law <sfllaw@debian.org> to control@bugs.debian.org.


Information forwarded to debian-bugs-dist@lists.debian.org, Anibal Monsalve Salazar <anibal@debian.org>:
Bug#181378; Package grep.


Acknowledgement sent to Nicolas François <nicolas.francois@centraliens.net>:
Extra info received and forwarded to list. Copy sent to Anibal Monsalve Salazar <anibal@debian.org>.


Message #60 received at 181378@bugs.debian.org:

From: Nicolas François <nicolas.francois@centraliens.net>
To: Debian Bug Tracking System <181378@bugs.debian.org>
Subject: grep is extremely slow with UTF-8
Date: Wed, 7 Sep 2005 11:52:34 +0200
[Message part 1 (text/plain, inline)]
Package: grep
Version: 2.5.1.ds1-5
Followup-For: Bug #181378

Hello,

I tried the gofast patch, and did not find a real improvement.

However, Fedora is now using a different patch, which dramatically
improves grep's performance in a UTF-8 environment.

Please find attached the following patches:
  * I put the original Fedora patches in the orig directory. The other
    patches are updated for the Debian package.
  * 64-egf-speedup.patch
    It does most of the work. Here is the explanation, according to:
    http://savannah.gnu.org/patch/?func=detailitem&item_id=3803
>    The full story behind this patch is that grep-2.5.1a does not handle
>    UTF-8 gracefully at all. The basic plan with handling UTF-8 in 2.5.1a
>    is:
>    * whenever a buffer is parsed, go through the entire buffer deciding
>      how many bytes make up each character
>    * use this information when necessary
>
>    This patch changes that to:
>    * when information about how many bytes make up a character is needed,
>      work it out on demand
>
>    On the face of it, this is a small obvious improvement. In fact it is
>    much better than that, because the original scheme would calculate
>    character lengths several times for each buffer: in fact, one full
>    pass for every single potential match!

  * 65-dfa-optional.patch
    I'm not sure this one is really needed.
    I've read that the DFA algorithm is slow for UTF-8; this patch disables
    it in that case (it can be forced on by setting an environment
    variable).
  * grep-2.5.1-tests.patch
    Fedora also added a test for UTF-8.
  * 66-match_icase.patch
  * 67-w.patch
    After running the new UTF-8 tests, these also seem to be needed.
    (They are not really related to grep's speed, but these patches may
    be interesting.)
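
The cost difference described in the quoted explanation of 64-egf-speedup.patch can be sketched with a toy model. This Python code is not the patch: the function names are made up, the candidate offsets stand in for potential matches, and the on-demand check uses a UTF-8-only single-byte test as a simplifying assumption.

```python
def char_lengths(buf):
    """One full pass: byte length of each UTF-8 character in buf."""
    lengths = []
    i = 0
    while i < len(buf):
        j = i + 1
        # Continuation bytes have the form 10xxxxxx.
        while j < len(buf) and (buf[j] & 0xC0) == 0x80:
            j += 1
        lengths.append(j - i)
        i = j
    return lengths


def eager_bytes_examined(buf, candidates):
    """Old scheme as described: a full pass for every potential match."""
    examined = 0
    for _offset in candidates:
        char_lengths(buf)
        examined += len(buf)
    return examined


def on_demand_bytes_examined(buf, candidates):
    """New scheme, simplified: inspect only the byte where a match starts."""
    examined = 0
    boundaries = []
    for offset in candidates:
        boundaries.append((buf[offset] & 0xC0) != 0x80)
        examined += 1
    return examined, boundaries


buf = ("rôle\n" * 200).encode("utf-8")
candidates = [i for i in range(len(buf)) if buf.startswith(b"le", i)]
eager = eager_bytes_examined(buf, candidates)
lazy, boundaries = on_demand_bytes_examined(buf, candidates)
print(len(buf), len(candidates), eager, lazy)
```

In this model the eager scheme examines buffer length times number of candidates bytes, while the on-demand scheme grows only with the number of candidates.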

I tried a grep package with all these patches, and for the following
command:
    grep '^' /var/lib/dpkg/available> /dev/null
grep is more than 1500 times faster in a UTF-8 environment.
(on my machine, it takes less than 3/4s instead of more than 10 minutes!)

Also, I did not notice any regression, and grep is not dramatically
slower in the C locale.

These patches may be important for Etch since the transition to UTF-8 is
mentioned on the (unofficial) Etch TODO list:
http://wiki.debian.net/?EtchTODOList

(And the French team is considering using UTF-8 for the default French
locale)

Thanks in advance,
-- 
Nekral
[patches.tar.bz2 (application/octet-stream, attachment)]

Reply sent to Santiago Ruano Rincon <santiago@unicauca.edu.co>:
You have taken responsibility.


Notification sent to Max Zou <zoum@mzou.net>:
Bug acknowledged by developer.


Message #65 received at 181378-close@bugs.debian.org:

From: Santiago Ruano Rincon <santiago@unicauca.edu.co>
To: 181378-close@bugs.debian.org
Subject: Bug#181378: fixed in grep 2.5.1.ds1-6
Date: Sat, 10 Sep 2005 22:47:05 -0700
Source: grep
Source-Version: 2.5.1.ds1-6

We believe that the bug you reported is fixed in the latest version of
grep, which is due to be installed in the Debian FTP archive:

grep_2.5.1.ds1-6.diff.gz
  to pool/main/g/grep/grep_2.5.1.ds1-6.diff.gz
grep_2.5.1.ds1-6.dsc
  to pool/main/g/grep/grep_2.5.1.ds1-6.dsc
grep_2.5.1.ds1-6_i386.deb
  to pool/main/g/grep/grep_2.5.1.ds1-6_i386.deb



A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to 181378@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Santiago Ruano Rincon <santiago@unicauca.edu.co> (supplier of updated grep package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@debian.org)


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Format: 1.7
Date: Sat, 10 Sep 2005 01:52:04 -0500
Source: grep
Binary: grep
Architecture: source i386
Version: 2.5.1.ds1-6
Distribution: unstable
Urgency: low
Maintainer: Anibal Monsalve Salazar <anibal@debian.org>
Changed-By: Santiago Ruano Rincon <santiago@unicauca.edu.co>
Description: 
 grep       - GNU grep, egrep and fgrep
Closes: 181378 206470 224993
Changes: 
 grep (2.5.1.ds1-6) unstable; urgency=low
 .
   * 64-egf-speedup.patch, 65-dfa-optional.patch, 66-match_icase.patch,
     67-w.patch speed up grep. Thanks to Nicolas François
     <nicolas.francois@centraliens.net> (Closes: #181378, #206470, #224993)
   * Deleted the CVS directories
Files: 
 7797de5e94d5c6b930a29e0a7fc5e205 669 base required grep_2.5.1.ds1-6.dsc
 caea29b0505d0401fb03d0f3c5b0de75 29266 base required grep_2.5.1.ds1-6.diff.gz
 38ead74511b3423ee277778ae85c2077 172330 base required grep_2.5.1.ds1-6_i386.deb

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFDI8JagY5NIXPNpFURAq2QAKC4AK77tJr5vlyg5sSVasgEMr49RQCgrZYI
wsu93RoCSrY292GZAvXAoTM=
=XhKe
-----END PGP SIGNATURE-----




Reply sent to Santiago Ruano Rincon <santiago@unicauca.edu.co>:
You have taken responsibility.


Notification sent to Max Zou <zoum@mzou.net>:
Bug acknowledged by developer.


Message #70 received at 206470-close@bugs.debian.org:

From: Santiago Ruano Rincon <santiago@unicauca.edu.co>
To: 206470-close@bugs.debian.org
Subject: Bug#206470: fixed in grep 2.5.1.ds1-6
Date: Sat, 10 Sep 2005 22:47:05 -0700
Source: grep
Source-Version: 2.5.1.ds1-6

We believe that the bug you reported is fixed in the latest version of
grep, which is due to be installed in the Debian FTP archive:

grep_2.5.1.ds1-6.diff.gz
  to pool/main/g/grep/grep_2.5.1.ds1-6.diff.gz
grep_2.5.1.ds1-6.dsc
  to pool/main/g/grep/grep_2.5.1.ds1-6.dsc
grep_2.5.1.ds1-6_i386.deb
  to pool/main/g/grep/grep_2.5.1.ds1-6_i386.deb



A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to 206470@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Santiago Ruano Rincon <santiago@unicauca.edu.co> (supplier of updated grep package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@debian.org)






Reply sent to Santiago Ruano Rincon <santiago@unicauca.edu.co>:
You have taken responsibility.


Notification sent to Max Zou <zoum@mzou.net>:
Bug acknowledged by developer.


Message #75 received at 224993-close@bugs.debian.org:

From: Santiago Ruano Rincon <santiago@unicauca.edu.co>
To: 224993-close@bugs.debian.org
Subject: Bug#224993: fixed in grep 2.5.1.ds1-6
Date: Sat, 10 Sep 2005 22:47:05 -0700
Source: grep
Source-Version: 2.5.1.ds1-6

We believe that the bug you reported is fixed in the latest version of
grep, which is due to be installed in the Debian FTP archive:

grep_2.5.1.ds1-6.diff.gz
  to pool/main/g/grep/grep_2.5.1.ds1-6.diff.gz
grep_2.5.1.ds1-6.dsc
  to pool/main/g/grep/grep_2.5.1.ds1-6.dsc
grep_2.5.1.ds1-6_i386.deb
  to pool/main/g/grep/grep_2.5.1.ds1-6_i386.deb



A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to 224993@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Santiago Ruano Rincon <santiago@unicauca.edu.co> (supplier of updated grep package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@debian.org)






Information forwarded to debian-bugs-dist@lists.debian.org, Anibal Monsalve Salazar <anibal@debian.org>:
Bug#181378; Package grep.


Acknowledgement sent to Adeodato Simó <asp16@alu.ua.es>:
Extra info received and forwarded to list. Copy sent to Anibal Monsalve Salazar <anibal@debian.org>.


Message #80 received at 181378@bugs.debian.org:

From: Adeodato Simó <asp16@alu.ua.es>
To: debian-devel@lists.debian.org, Anibal Monsalve Salazar <anibal@debian.org>
Cc: control@bugs.debian.org, 181378@bugs.debian.org, 281647@bugs.debian.org
Subject: Re: Accepted grep 2.5.1.ds2-1 (source i386 sparc)
Date: Mon, 26 Sep 2005 20:04:24 +0200
found 181378 2.5.1.ds2-1
thanks

* Anibal Monsalve Salazar [Mon, 26 Sep 2005 05:47:06 -0700]:

>    * Removed 64-egf-speedup.patch, 65-dfa-optional.patch,
>      66-match_icase.patch and 67-w.patch from debian/patches,
>      closes: #329876.

  Those patches fixed a bug (and two merged ones) that had been open for 2
  and a half years. I think it'd be useful if you tried to contact the
  authors of the patches, and tried to fix them instead of removing them?

>    * Removed grep.texi from upstream tarball, 50-rgrep-info.patch and
>      51-dircategory-info.patch from debian/patches, the GNU Free
>      Documentation License from debian/copyright and debian/fdl.txt,
>      closes: #281647.

  Still, grep.1 remains, which (a) contains verbatim paragraphs from
  grep.texi yet (b) comes in the upstream tarball with a license notice.
  Does this mean that grep.1 is?:

    - under the GFDL, so should be removed
    - under the GPL (the general license of the tarball), despite
      sharing contents with grep.texi
    - undistributable, because it has no license attached

  Cheers,

-- 
Adeodato Simó
    EM: asp16 [ykwim] alu.ua.es | PK: DA6AE621
 
Man is certainly stark mad; he cannot make a flea, yet he makes gods by the
dozens.
                -- Michel de Montaigne




Bug marked as found in version 2.5.1.ds2-1. Request was from Adeodato Simó <asp16@alu.ua.es> to control@bugs.debian.org.


Bug reopened, originator not changed. Request was from Aníbal Monsalve Salazar <anibal@debian.org> to control@bugs.debian.org.


Information forwarded to debian-bugs-dist@lists.debian.org, Anibal Monsalve Salazar <anibal@debian.org>:
Bug#181378; Package grep.


Acknowledgement sent to Aníbal Monsalve Salazar <anibal@debian.org>:
Extra info received and forwarded to list. Copy sent to Anibal Monsalve Salazar <anibal@debian.org>.


Message #89 received at 181378@bugs.debian.org:

From: Aníbal Monsalve Salazar <anibal@debian.org>
To: debian-devel@lists.debian.org, 281647@bugs.debian.org
Subject: Re: Accepted grep 2.5.1.ds2-1 (source i386 sparc)
Date: Tue, 27 Sep 2005 11:12:07 +1000
[Message part 1 (text/plain, inline)]
On Mon, Sep 26, 2005 at 08:04:24PM +0200, Adeodato Simó wrote:
>found 181378 2.5.1.ds2-1
>thanks
>
>* Anibal Monsalve Salazar [Mon, 26 Sep 2005 05:47:06 -0700]:
>
>>   * Removed 64-egf-speedup.patch, 65-dfa-optional.patch,
>>     66-match_icase.patch and 67-w.patch from debian/patches,
>>     closes: #329876.
>
>  Those patches fixed a bug (and two merged ones) that had been open for 2
>  and a half years. I think it'd be useful if you tried to contact the
>  authors of the patches, and tried to fix them instead of removing them?

Sure, the grep maintainers decided to pull them out and will go
through the patches again.

I have bcc-ed #181378.

>>   * Removed grep.texi from upstream tarball, 50-rgrep-info.patch and
>>     51-dircategory-info.patch from debian/patches, the GNU Free
>>     Documentation License from debian/copyright and debian/fdl.txt,
>>     closes: #281647.
>
>  Still, grep.1 remains, which (a) contains verbatim paragraphs from
>  grep.texi yet (b) comes in the upstream tarball with a license notice.
>  Does this mean that grep.1 is?:
>
>    - under the GFDL, so should be removed

grep.texi is the only documentation file under the GFDL whereas
grep.1 is not.

>    - under the GPL (the general license of the tarball), despite
>      sharing contents with grep.texi

grep.1 is covered by the license of the tarball which is the GPL.

>    - undistributable, because it has no license attached

I don't think so. If grep.1 is undistributable, so are many other
files.

grep.1 is not the only file without an explicit license. Other
files without an explicit license are:

lib/alloca.c
lib/closeout.h
lib/hard-locale.h
lib/regex.h
lib/savedir.h
lib/xstrtol.h
po/cat-id-tbl.c
src/dosbuf.c
src/getpagesize.h
src/grepmat.c
src/vms_fab.c
src/vms_fab.h
vms/config_vms.h
config.h

>  Cheers,
>
>-- 
>Adeodato Simó
>    EM: asp16 [ykwim] alu.ua.es | PK: DA6AE621
> 
>Man is certainly stark mad; he cannot make a flea, yet he makes gods by the
>dozens.
>                -- Michel de Montaigne

Aníbal Monsalve Salazar
--
 .''`. Debian GNU/Linux
: :' : Free Operating System
`. `'  http://debian.org/
  `-   http://v7w.com/anibal
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Anibal Monsalve Salazar <anibal@debian.org>:
Bug#181378; Package grep. (full text, mbox, link).


Acknowledgement sent to Nicolas François <nicolas.francois@centraliens.net>:
Extra info received and forwarded to list. Copy sent to Anibal Monsalve Salazar <anibal@debian.org>. (full text, mbox, link).


Message #94 received at 181378@bugs.debian.org (full text, mbox, reply):

From: Nicolas François <nicolas.francois@centraliens.net>
To: Aníbal Monsalve Salazar <anibal@debian.org>
Cc: debian-devel@lists.debian.org, 181378@bugs.debian.org
Subject: Re: Accepted grep 2.5.1.ds2-1 (source i386 sparc)
Date: Tue, 27 Sep 2005 23:53:41 +0200
Hello,

On Tue, Sep 27, 2005 at 11:12:07AM +1000, Aníbal Monsalve Salazar wrote:
> On Mon, Sep 26, 2005 at 08:04:24PM +0200, Adeodato Simó wrote:
> >found 181378 2.5.1.ds2-1
> >thanks
> >
> >* Anibal Monsalve Salazar [Mon, 26 Sep 2005 05:47:06 -0700]:
> >
> >>   * Removed 64-egf-speedup.patch, 65-dfa-optional.patch,
> >>     66-match_icase.patch and 67-w.patch from debian/patches,
> >>     closes: #329876.
> >
> >  Those patches fixed a bug (and two merged) that had been opened for 2
> >  and a half years. I think it'd be useful if you tried to contact the
> >  authors of the patches, and try to fix them instead of removing them?
> 
> Sure, the grep maintainers decided to pull them out and will go
> through the patches again.


I wondered if I introduced this issue while porting the Fedora patches to
Debian, so I tried Fedora's grep...which has the same issue.

You can reproduce it with this simple command:
echo foobar | grep -Fw ""

This was introduced by the patch I named '64-egf-speedup.patch'

You can fix it by replacing the 'while (1)' with 'while (len)' (or by
embedding this while loop in an 'if (len){...}'; I don't know if there is a
real difference, or which way is best).
Tim Waugh, who wrote the original patches, may have a better understanding
of grep's code.

The testsuite still passes with this patch.
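
To illustrate why an empty pattern can stall such a loop, here is a minimal sketch. It is not the code from search.c or from 64-egf-speedup.patch; the names fixed_word_match, find_in and is_word_char are hypothetical, and the retry-with-shorter-window step only loosely imitates a word-match loop. With 'while (1)' in place of 'while (len)', an empty pattern would keep "matching" zero bytes at the same position and never leave the inner loop. What grep itself should report for an empty pattern is not modelled here; the sketch simply returns 0.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Word constituents, as commonly defined for -w: letters, digits, '_'. */
static int is_word_char(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* Hypothetical helper: offset of the first occurrence of pat (plen bytes)
   within the first n bytes of s, or (size_t)-1 if there is none.
   An empty pattern is found at offset 0. */
static size_t find_in(const char *s, size_t n, const char *pat, size_t plen)
{
    if (plen > n)
        return (size_t)-1;
    for (size_t i = 0; i + plen <= n; i++)
        if (memcmp(s + i, pat, plen) == 0)
            return i;
    return (size_t)-1;
}

/* Returns 1 if pat occurs in line as a whole word, 0 otherwise. */
int fixed_word_match(const char *line, const char *pat)
{
    size_t size = strlen(line), plen = strlen(pat);
    const char *beg = line;

    while (beg <= line + size) {
        size_t off = find_in(beg, (size_t)(line + size - beg), pat, plen);
        if (off == (size_t)-1)
            return 0;
        beg += off;

        size_t len = plen;
        const char *cand = beg;
        /* The guard: with 'while (1)' and len == 0, the retry below would
           find an empty match at offset 0 again and again. */
        while (len) {
            if (cand > line && is_word_char(cand[-1]))
                break;                  /* left boundary is not a word edge */
            if (!is_word_char(cand[len]))
                return 1;               /* both boundaries are word edges */
            /* Right boundary failed: retry inside a shorter window. */
            off = find_in(beg, --len, pat, plen);
            if (off == (size_t)-1)
                break;
            cand = beg + off;
            len = plen;
        }
        beg++;                          /* resume after this candidate */
    }
    return 0;
}
```

Under these assumptions, fixed_word_match("foobar", "") returns immediately because the inner loop is never entered, which is the effect the 'while (len)' change is meant to have.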


BTW, I don't know if you received a mail I sent to grep@packages.debian.org,
which indicated that the additional patches (which I submitted because
they helped passing the testsuite) were fixing: #209194 #218873 #226397
#238167

If you plan to re-introduce these patches, please tell me. While checking
for this issue (#329876), I've seen that there was one issue fixed in a
Fedora update, related to this patch:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=161700
I can update 64-egf-speedup.patch if you want.

Kind Regards,
-- 
Nekral



Information forwarded to debian-bugs-dist@lists.debian.org, Anibal Monsalve Salazar <anibal@debian.org>:
Bug#181378; Package grep. (full text, mbox, link).


Acknowledgement sent to Aníbal Monsalve Salazar <anibal@debian.org>:
Extra info received and forwarded to list. Copy sent to Anibal Monsalve Salazar <anibal@debian.org>. (full text, mbox, link).


Message #99 received at 181378@bugs.debian.org (full text, mbox, reply):

From: Aníbal Monsalve Salazar <anibal@debian.org>
To: Nicolas François <nicolas.francois@centraliens.net>
Cc: Santiago Ruano Rincon <santiago@unicauca.edu.co>, debian-devel@lists.debian.org, 181378@bugs.debian.org
Subject: Re: Accepted grep 2.5.1.ds2-1 (source i386 sparc)
Date: Wed, 28 Sep 2005 09:04:03 +1000
[Message part 1 (text/plain, inline)]
On Tue, Sep 27, 2005 at 11:53:41PM +0200, Nicolas François wrote:
>On Tue, Sep 27, 2005 at 11:12:07AM +1000, Aníbal Monsalve Salazar wrote:
>>On Mon, Sep 26, 2005 at 08:04:24PM +0200, Adeodato Simó wrote:
>>>* Anibal Monsalve Salazar [Mon, 26 Sep 2005 05:47:06 -0700]:
>>>
>>>>   * Removed 64-egf-speedup.patch, 65-dfa-optional.patch,
>>>>     66-match_icase.patch and 67-w.patch from debian/patches,
>>>>     closes: #329876.
>>>
>>>  Those patches fixed a bug (and two merged) that had been opened for 2
>>>  and a half years. I think it'd be useful if you tried to contact the
>>>  authors of the patches, and try to fix them instead of removing them?
>>
>>Sure, the grep maintainers decided to pull them out and will go
>>through the patches again.
>
>I wondered if I introduced this issue while porting the Fedora patches to
>Debian, so I tried Fedora's grep...which has the same issue.
>
>You can reproduce it with this simple command:
>echo foobar | grep -Fw ""
>
>This was introduced by the patch I named '64-egf-speedup.patch'
>
>You can fix it by replacing the 'while (1)' with 'while (len)' (or by
>embedding this while loop in an 'if (len){...}'; I don't know if there is a
>real difference, or which way is best).
>Tim Waugh, who wrote the original patches, may have a better understanding
>of grep's code.
>
>The testsuite still passes with this patch.
>
>BTW, I don't know if you received a mail I sent to grep@packages.debian.org,
>which indicated that the additional patches (which I submitted because
>they helped passing the testsuite) were fixing: #209194 #218873 #226397
>#238167

I received it, thanks. I'll close the bugs.

>If you plan to re-introduce these patches, please tell me. While checking
>for this issue (#329876), I've seen that there was one issue fixed in a
>Fedora update, related to this patch:
>https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=161700
>I can update 64-egf-speedup.patch if you want.

Yes, please. I would like to reapply 64-egf-speedup.patch
(and 6[567]-*.patch) and an updated version will be very much
appreciated.

>Kind Regards,
>-- 
>Nekral

Regards,

Aníbal Monsalve Salazar
--
 .''`. Debian GNU/Linux
: :' : Free Operating System
`. `'  http://debian.org/
  `-   http://v7w.com/anibal
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Anibal Monsalve Salazar <anibal@debian.org>:
Bug#181378; Package grep. (full text, mbox, link).


Acknowledgement sent to Nicolas François <nicolas.francois@centraliens.net>:
Extra info received and forwarded to list. Copy sent to Anibal Monsalve Salazar <anibal@debian.org>. (full text, mbox, link).


Message #104 received at 181378@bugs.debian.org (full text, mbox, reply):

From: Nicolas François <nicolas.francois@centraliens.net>
To: 181378@bugs.debian.org
Cc: Santiago Ruano Rincon <santiago@unicauca.edu.co>
Subject: update for 64-egf-speedup.patch
Date: Wed, 28 Sep 2005 12:58:50 +0200
[Message part 1 (text/plain, inline)]
Hello,

Please find attached an update for the 64-egf-speedup.patch patch.
The other patches did not need to be updated and can be found in the
#181378 log.

This update intends to fix:
echo foobar | grep -Fw ""
(which was hanging with the previous version)

echo test | LC_ALL=C grep -Fw test
echo x test x | LC_ALL=C grep -Fw test

which were not working and were fixed by Tim Waugh (original author of the
patches).

I intend to mail Tim Waugh about the first issue, to check if my fix is
correct/optimal. Since grep being slow on UTF-8 is not that critical, it may
be better to wait for his answer before releasing it. I will CC the BTS.

Kind Regards,
-- 
Nekral
[64-egf-speedup.patch (text/plain, attachment)]

Information forwarded to debian-bugs-dist@lists.debian.org, Anibal Monsalve Salazar <anibal@debian.org>:
Bug#181378; Package grep. (full text, mbox, link).


Acknowledgement sent to Nicolas François <nicolas.francois@centraliens.net>:
Extra info received and forwarded to list. Copy sent to Anibal Monsalve Salazar <anibal@debian.org>. (full text, mbox, link).


Message #109 received at 181378@bugs.debian.org (full text, mbox, reply):

From: Nicolas François <nicolas.francois@centraliens.net>
To: twaugh@redhat.com
Cc: 181378@bugs.debian.org
Subject: grep hanging with -Fw and an empty pattern
Date: Wed, 28 Sep 2005 13:26:27 +0200
Hello,

Sorry for contacting you directly.

I'm trying to port your patch (grep-2.5.1-egf-speedup.patch) to Debian.
This patch triggered an issue when an empty pattern is used with the -Fw
options.
(see http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=329876)

I tried the Fedora grep-2.5.1-48.2 binary, which suffers from the same
issue (on a Debian system, with a Debian libc and libpcre):
   echo foobar | grep -Fw ""
hangs (this could appear with the -Fwf options when the patterns file
contains an empty line).

Changing the 'while (1)' loop to a 'while (len)' loop in search.c fixes this
issue.  However, I don't know if this is correct or optimal (I don't know
what should happen if we enter the loop with len>0 and len is then
decreased to 0; maybe this should also be caught earlier).

Does it seem correct to you?

Sorry I could not check if a Red Hat system suffers from this (that's the
reason why I do not use the BTS) and thanks a lot for the impressive
speed-up of grep in a UTF-8 environment,
-- 
Nekral



Information forwarded to debian-bugs-dist@lists.debian.org, Anibal Monsalve Salazar <anibal@debian.org>:
Bug#181378; Package grep. (full text, mbox, link).


Acknowledgement sent to Tim Waugh <twaugh@redhat.com>:
Extra info received and forwarded to list. Copy sent to Anibal Monsalve Salazar <anibal@debian.org>. (full text, mbox, link).


Message #114 received at 181378@bugs.debian.org (full text, mbox, reply):

From: Tim Waugh <twaugh@redhat.com>
To: Nicolas François <nicolas.francois@centraliens.net>
Cc: 181378@bugs.debian.org
Subject: Re: grep hanging with -Fw and an empty pattern
Date: Thu, 29 Sep 2005 13:22:39 +0100
[Message part 1 (text/plain, inline)]
On Wed, Sep 28, 2005 at 01:26:27PM +0200, Nicolas François wrote:

> Changing the 'while (1)' loop to a 'while (len)' loop in search.c fixes this
> issue.  However, I don't know if this is correct or optimal (I don't know
> what should happen if we enter the loop with len>0 and len is then
> decreased to 0; maybe this should also be caught earlier).
> 
> Does it seem correct to you?

Yes, looks correct to me.  Thanks.

Tim.
*/
[Message part 2 (application/pgp-signature, inline)]

Reply sent to Anibal Monsalve Salazar <anibal@debian.org>:
You have taken responsibility. (full text, mbox, link).


Notification sent to Max Zou <zoum@mzou.net>:
Bug acknowledged by developer. (full text, mbox, link).


Message #119 received at 181378-close@bugs.debian.org (full text, mbox, reply):

From: Anibal Monsalve Salazar <anibal@debian.org>
To: 181378-close@bugs.debian.org
Subject: Bug#181378: fixed in grep 2.5.1.ds2-2
Date: Wed, 26 Oct 2005 03:02:09 -0700
Source: grep
Source-Version: 2.5.1.ds2-2

We believe that the bug you reported is fixed in the latest version of
grep, which is due to be installed in the Debian FTP archive:

grep_2.5.1.ds2-2.diff.gz
  to pool/main/g/grep/grep_2.5.1.ds2-2.diff.gz
grep_2.5.1.ds2-2.dsc
  to pool/main/g/grep/grep_2.5.1.ds2-2.dsc
grep_2.5.1.ds2-2_alpha.deb
  to pool/main/g/grep/grep_2.5.1.ds2-2_alpha.deb
grep_2.5.1.ds2-2_i386.deb
  to pool/main/g/grep/grep_2.5.1.ds2-2_i386.deb
grep_2.5.1.ds2-2_sparc.deb
  to pool/main/g/grep/grep_2.5.1.ds2-2_sparc.deb



A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to 181378@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Anibal Monsalve Salazar <anibal@debian.org> (supplier of updated grep package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@debian.org)


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Format: 1.7
Date: Wed, 26 Oct 2005 19:14:35 +1000
Source: grep
Binary: grep
Architecture: source i386 alpha sparc
Version: 2.5.1.ds2-2
Distribution: unstable
Urgency: low
Maintainer: Anibal Monsalve Salazar <anibal@debian.org>
Changed-By: Anibal Monsalve Salazar <anibal@debian.org>
Description: 
 grep       - GNU grep, egrep and fgrep
Closes: 181378 206470 224993 240239 257900 267718 284676
Changes: 
 grep (2.5.1.ds2-2) unstable; urgency=low
 .
   * Patched 64-egf-speedup.patch with patch from Nicolas François
     <nicolas.francois@centraliens.net>. Put 64-egf-speedup.patch,
     65-dfa-optional.patch, 66-match_icase.patch and 67-w.patch back
     in, closes: #181378, #206470, #224993.
   * Fixed "minor documentation syntax error", closes: #240239,
     #257900. Patches by Allard Hoeve <allard@byte.nl> and Derrick
     'dman' Hudson <dman@dman13.dyndns.org>.
   * Fixed "info page not in main info menu", closes: #284676,
     #267718. Patches by Rui Tiago Cação Matos
     <a28525@alunos.det.ua.pt> and Paul Brook <paul@nowt.org>.
Files: 
 88b2af4b3578729420158583be03731f 660 utils required grep_2.5.1.ds2-2.dsc
 14e96467e8623210c797ec104ed9e3b2 21354 utils required grep_2.5.1.ds2-2.diff.gz
 e69a3fbbab86633594273203f7f2207e 139112 utils required grep_2.5.1.ds2-2_i386.deb
 76128b684a7deac71454c5f6b5697345 140514 utils required grep_2.5.1.ds2-2_sparc.deb
 01da865bef322c130f6f46abad86d1f9 147868 utils required grep_2.5.1.ds2-2_alpha.deb

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)

iD8DBQFDX1MXipBneRiAKDwRAkE4AKCuQ7V6POyqk3uqYL4c5ifTHLtu6ACdHk7e
Kowqh+yG6VdaC2w+ve8bhyc=
=sBND
-----END PGP SIGNATURE-----




Reply sent to Anibal Monsalve Salazar <anibal@debian.org>:
You have taken responsibility. (full text, mbox, link).


Notification sent to Max Zou <zoum@mzou.net>:
Bug acknowledged by developer. (full text, mbox, link).


Message #124 received at 206470-close@bugs.debian.org (full text, mbox, reply):

From: Anibal Monsalve Salazar <anibal@debian.org>
To: 206470-close@bugs.debian.org
Subject: Bug#206470: fixed in grep 2.5.1.ds2-2
Date: Wed, 26 Oct 2005 03:02:09 -0700
Source: grep
Source-Version: 2.5.1.ds2-2





Reply sent to Anibal Monsalve Salazar <anibal@debian.org>:
You have taken responsibility. (full text, mbox, link).


Notification sent to Max Zou <zoum@mzou.net>:
Bug acknowledged by developer. (full text, mbox, link).


Message #129 received at 224993-close@bugs.debian.org (full text, mbox, reply):

From: Anibal Monsalve Salazar <anibal@debian.org>
To: 224993-close@bugs.debian.org
Subject: Bug#224993: fixed in grep 2.5.1.ds2-2
Date: Wed, 26 Oct 2005 03:02:09 -0700
Source: grep
Source-Version: 2.5.1.ds2-2





Bug archived. Request was from Debbugs Internal Request <owner@bugs.debian.org> to internal_control@bugs.debian.org. (Tue, 26 Jun 2007 10:04:53 GMT) (full text, mbox, link).


Bug unarchived. Request was from Aníbal Monsalve Salazar <anibal@debian.org> to control@bugs.debian.org. (Thu, 06 Sep 2007 22:51:01 GMT) (full text, mbox, link).


Bug reopened, originator not changed. Request was from Aníbal Monsalve Salazar <anibal@debian.org> to control@bugs.debian.org. (Thu, 06 Sep 2007 22:51:02 GMT) (full text, mbox, link).


Bug marked as found in version 2.5.3~dfsg-2. Request was from Aníbal Monsalve Salazar <anibal@debian.org> to control@bugs.debian.org. (Thu, 06 Sep 2007 22:51:03 GMT) (full text, mbox, link).


Tags removed: patch Request was from Touko Korpela <tkorpela@phnet.fi> to control@bugs.debian.org. (Mon, 17 Sep 2007 20:24:02 GMT) (full text, mbox, link).


Information forwarded to debian-bugs-dist@lists.debian.org, Anibal Monsalve Salazar <anibal@debian.org>:
Bug#181378; Package grep. (full text, mbox, link).


Acknowledgement sent to Touko Korpela <tkorpela@phnet.fi>:
Extra info received and forwarded to list. Copy sent to Anibal Monsalve Salazar <anibal@debian.org>. (full text, mbox, link).


Message #144 received at 181378@bugs.debian.org (full text, mbox, reply):

From: Touko Korpela <tkorpela@phnet.fi>
To: 181378@bugs.debian.org
Subject: sid+lenny slow, etch is OK
Date: Mon, 17 Sep 2007 23:47:07 +0300
It seems that etch version 2.5.1.ds2-6 is not slow, but 2.5.3~dfsg-2 is 
very slow with UTF-8 (I'm on i386)




Bug marked as fixed in version 2.5.1.ds2-6. Request was from Touko Korpela <tkorpela@phnet.fi> to control@bugs.debian.org. (Mon, 17 Sep 2007 20:57:04 GMT) (full text, mbox, link).


Bug marked as found in version 2.5.3~dfsg-1. Request was from Touko Korpela <tkorpela@phnet.fi> to control@bugs.debian.org. (Mon, 17 Sep 2007 21:06:02 GMT) (full text, mbox, link).


Information forwarded to debian-bugs-dist@lists.debian.org, Anibal Monsalve Salazar <anibal@debian.org>:
Bug#181378; Package grep. (full text, mbox, link).


Acknowledgement sent to Nicolas François <nicolas.francois@centraliens.net>:
Extra info received and forwarded to list. Copy sent to Anibal Monsalve Salazar <anibal@debian.org>. (full text, mbox, link).


Message #153 received at 181378@bugs.debian.org (full text, mbox, reply):

From: Nicolas François <nicolas.francois@centraliens.net>
To: 181378@bugs.debian.org
Subject: Re: Bug#181378: update for 64-egf-speedup.patch
Date: Mon, 1 Oct 2007 16:20:36 +0200
[Message part 1 (text/plain, inline)]
tags 181378 patch
forcemerge 181378 442882
thanks

Hello,

Please find attached updated patches for grep-2.5.3~dfsg:
 * 64-egf-speedup.patch
   This provides the speedup when the DFA algorithm is not used.
   However, the DFA algorithm is used for most grep executions.
   (So there is no speed improvement if 65-dfa-optional.patch is not
   applied)
 * 65-dfa-optional.patch
   This disables the DFA algorithm, which can be very slow in UTF-8
   environments. The DFA algorithm can be enabled with an environment
   variable.
   (This patch is not valid if 64-egf-speedup.patch is not applied)

These two patches are tightly coupled and must be applied together.
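
As a rough sketch of how such an environment switch could look, assuming details the message does not give: the variable name GREP_USE_DFA, the function names, and the default of skipping the DFA only in multibyte locales are all assumptions made for illustration, not taken from 65-dfa-optional.patch.

```c
#include <stddef.h>
#include <stdlib.h>

/* Pure decision helper (hypothetical). env_value is the value of the
   override variable, or NULL when it is unset; mb_cur_max is the current
   locale's MB_CUR_MAX. */
int want_dfa(const char *env_value, size_t mb_cur_max)
{
    if (env_value != NULL)
        return atoi(env_value) != 0;   /* explicit override wins */
    /* Assumed default: use the DFA only in single-byte locales. */
    return mb_cur_max == 1;
}

/* Thin wrapper reading the real environment and locale.
   The variable name is an assumption for this sketch. */
int want_dfa_from_env(void)
{
    return want_dfa(getenv("GREP_USE_DFA"), (size_t)MB_CUR_MAX);
}
```

Keeping the decision in a pure function makes it easy to check without touching the process environment; a caller would consult want_dfa_from_env() once, after setlocale(), before choosing a matcher.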

There used to be two other patches in 2.5.1 as well, which improve the
results of the grep testsuite:
 * 66-match_icase.patch
   This patch fixes some uses of the -i option.
   It could probably be applied without the previous patches.

 * 67-w.patch
   This patch fixes the -w option.
   This probably fixes issues introduced by the first two patches.

I tried to add a few comments in the header of the patches.

With the 4 patches applied, 3 tests fail in the grep testsuite, but the
results are better than an unpatched upstream.

It would be nice to have a patch to split the testsuite into two categories:
known working test cases and known broken test cases (e.g. in the spencer1
testsuite, I don't expect the handling of case insensitive matches for non
latin characters to be fixed in the near future).
This would allow running the testsuite at build time and detecting
regressions in later uploads.
Currently, too many test cases/sub cases fail for the testsuite to be
considered useful at build time.

I'm also concerned about the maintainability of these patches.
I will try to reduce their size and comment them, but do not wait for this
for an upload (I won't have time in the next two weeks).

With these 4 patches applied, there are probably a few bugs in the BTS
which can be closed (obviously the "grep too slow" bugs, but you should
also check if the locale dependent bugs (or the bugs which involve the -i
or -w options) are still reproducible)

I will subscribe to the PTS for grep, but do not hesitate to ping me if
these patches break grep.

Kind Regards,
-- 
Nekral
[64-egf-speedup.patch (text/x-diff, attachment)]
[65-dfa-optional.patch (text/x-diff, attachment)]
[66-match_icase.patch (text/x-diff, attachment)]
[67-w.patch (text/x-diff, attachment)]

Tags added: patch Request was from Nicolas François <nicolas.francois@centraliens.net> to control@bugs.debian.org. (Mon, 01 Oct 2007 14:24:05 GMT) (full text, mbox, link).


Forcibly Merged 181378 206470 224993 442882. Request was from Nicolas François <nicolas.francois@centraliens.net> to control@bugs.debian.org. (Mon, 01 Oct 2007 14:24:06 GMT) (full text, mbox, link).


Information forwarded to debian-bugs-dist@lists.debian.org, Anibal Monsalve Salazar <anibal@debian.org>:
Bug#181378; Package grep. (full text, mbox, link).


Acknowledgement sent to Thomas Viehmann <tv@beamnet.de>:
Extra info received and forwarded to list. Copy sent to Anibal Monsalve Salazar <anibal@debian.org>. (full text, mbox, link).


Message #162 received at 181378@bugs.debian.org (full text, mbox, reply):

From: Thomas Viehmann <tv@beamnet.de>
To: 181378@bugs.debian.org, 442882@bugs.debian.org
Subject: grep: diff for NMU version 2.5.3~dfsg-2.1
Date: Wed, 03 Oct 2007 21:59:42 +0200
tags 181378 + pending
tags 442882 + pending
thanks

Hi,

The following is the diff for the grep 2.5.3~dfsg-2.1 NMU with
the patches by Nicolas François. As per Santiago's mail to d-devel[1]
and the apparently only inadvertently lowered severity, I'm NMUing this
as an RC bug. According to my tests, the new version resolves the
regressions reported against grep 2.5.3~dfsg-1.

Kind regards

T.

1. http://lists.debian.org/debian-devel/2007/09/msg00946.html

--
Thomas Viehmann, tv@beamnet.de

diff -u grep-2.5.3~dfsg/debian/changelog grep-2.5.3~dfsg/debian/changelog
--- grep-2.5.3~dfsg/debian/changelog
+++ grep-2.5.3~dfsg/debian/changelog
@@ -1,3 +1,11 @@
+grep (2.5.3~dfsg-2.1) unstable; urgency=high
+
+  * Non-maintainer upload.
+  * Reinstate patches by Nicolas François <nicolas.francois@centraliens.net>
+    Closes: #181378, #442882
+
+ -- Thomas Viehmann <tv@beamnet.de>  Tue, 02 Oct 2007 23:02:35 +0200
+
 grep (2.5.3~dfsg-2) unstable; urgency=low
 
   * Removed 65-dfa-optional.patch. (Closes: #439827, #440195, #440342)
only in patch2:
unchanged:
--- grep-2.5.3~dfsg.orig/debian/patches/64-egf-speedup.patch
+++ grep-2.5.3~dfsg/debian/patches/64-egf-speedup.patch
@@ -0,0 +1,792 @@
+--- src/search.c.orig
++++ src/search.c
+@@ -18,10 +18,15 @@
+ 
+ /* Written August 1992 by Mike Haertel. */
+ 
++#ifndef _GNU_SOURCE
++# define _GNU_SOURCE 1
++#endif
+ #ifdef HAVE_CONFIG_H
+ # include <config.h>
+ #endif
+ 
++#include <assert.h>
++
+ #include <sys/types.h>
+ 
+ #include "mbsupport.h"
+@@ -43,6 +48,9 @@
+ #ifdef HAVE_LIBPCRE
+ # include <pcre.h>
+ #endif
++#ifdef HAVE_LANGINFO_CODESET
++# include <langinfo.h>
++#endif
+ 
+ #define NCHAR (UCHAR_MAX + 1)
+ 
+@@ -68,6 +76,19 @@
+     error (2, 0, _("memory exhausted"));
+ }
+ 
++/* UTF-8 encoding allows some optimizations that we can't otherwise
++   assume in a multibyte encoding. */
++static int using_utf8;
++
++void
++check_utf8 (void)
++{
++#ifdef HAVE_LANGINFO_CODESET
++  if (strcmp (nl_langinfo (CODESET), "UTF-8") == 0)
++    using_utf8 = 1;
++#endif
++}
++
+ #ifndef FGREP_PROGRAM
+ /* DFA compiled regexp. */
+ static struct dfa dfa;
+@@ -134,49 +155,6 @@
+ }
+ #endif /* !FGREP_PROGRAM */
+ 
+-#ifdef MBS_SUPPORT
+-/* This function allocate the array which correspond to "buf".
+-   Then this check multibyte string and mark on the positions which
+-   are not single byte character nor the first byte of a multibyte
+-   character.  Caller must free the array.  */
+-static char*
+-check_multibyte_string(char const *buf, size_t size)
+-{
+-  char *mb_properties = xmalloc(size);
+-  mbstate_t cur_state;
+-  wchar_t wc;
+-  int i;
+-
+-  memset(&cur_state, 0, sizeof(mbstate_t));
+-  memset(mb_properties, 0, sizeof(char)*size);
+-
+-  for (i = 0; i < size ;)
+-    {
+-      size_t mbclen;
+-      mbclen = mbrtowc(&wc, buf + i, size - i, &cur_state);
+-
+-      if (mbclen == (size_t) -1 || mbclen == (size_t) -2 || mbclen == 0)
+-	{
+-	  /* An invalid sequence, or a truncated multibyte character.
+-	     We treat it as a single byte character.  */
+-	  mbclen = 1;
+-	}
+-      else if (match_icase)
+-	{
+-	  if (iswupper((wint_t)wc))
+-	    {
+-	      wc = towlower((wint_t)wc);
+-	      wcrtomb(buf + i, wc, &cur_state);
+-	    }
+-	}
+-      mb_properties[i] = mbclen;
+-      i += mbclen;
+-    }
+-
+-  return mb_properties;
+-}
+-#endif /* MBS_SUPPORT */
+-
+ #if defined(GREP_PROGRAM) || defined(EGREP_PROGRAM)
+ #ifdef EGREP_PROGRAM
+ COMPILE_FCT(Ecompile)
+@@ -193,6 +171,7 @@
+   size_t total = size;
+   char const *motif = pattern;
+ 
++  check_utf8 ();
+ #if 0
+   if (match_icase)
+     syntax_bits |= RE_ICASE;
+#@@ -303,47 +282,78 @@ hunk6
+@@ -303,20 +282,9 @@ hunk6
+   struct kwsmatch kwsm;
+   size_t i, ret_val;
+ #ifdef MBS_SUPPORT
+-  char *mb_properties = NULL;
+-  if (MB_CUR_MAX > 1)
+-    {
+-      if (match_icase)
+-        {
+-          char *case_buf = xmalloc(size);
+-          memcpy(case_buf, buf, size);
+-	  if (start_ptr)
+-	    start_ptr = case_buf + (start_ptr - buf);
+-          buf = case_buf;
+-        }
+-      if (kwset)
+-        mb_properties = check_multibyte_string(buf, size);
+-    }
++  int mb_cur_max = MB_CUR_MAX;
++  mbstate_t mbs;
++  memset (&mbs, '\0', sizeof (mbstate_t));
+ #endif /* MBS_SUPPORT */
+ 
+   buflim = buf + size;
+@@ -329,21 +282,63 @@ hunk6
+ 	  if (kwset)
+ 	    {
+ 	      /* Find a possible match using the KWset matcher. */
+-	      size_t offset = kwsexec (kwset, beg, buflim - beg, &kwsm);
++#ifdef MBS_SUPPORT
++	      size_t bytes_left = 0;
++#endif /* MBS_SUPPORT */
++	      size_t offset;
++#ifdef MBS_SUPPORT
++	      /* kwsexec doesn't work with match_icase and multibyte input. */
++	      if (match_icase && mb_cur_max > 1)
++		/* Avoid kwset */
++		offset = 0;
++	      else
++#endif /* MBS_SUPPORT */
++	      offset = kwsexec (kwset, beg, buflim - beg, &kwsm);
+ 	      if (offset == (size_t) -1)
+-		goto failure;
++		return (size_t)-1;
++#ifdef MBS_SUPPORT
++	      if (mb_cur_max > 1 && !using_utf8)
++		{
++		  bytes_left = offset;
++		  while (bytes_left)
++		    {
++		      size_t mlen = mbrlen (beg, bytes_left, &mbs);
++		      if (mlen == (size_t) -1 || mlen == 0)
++			{
++			  /* Incomplete character: treat as single-byte. */
++			  memset (&mbs, '\0', sizeof (mbstate_t));
++			  beg++;
++			  bytes_left--;
++			  continue;
++			}
++
++		      if (mlen == (size_t) -2)
++			/* Offset points inside multibyte character:
++			 * no good. */
++			break;
++
++		      beg += mlen;
++		      bytes_left -= mlen;
++		    }
++		}
++	      else
++#endif /* MBS_SUPPORT */
+ 	      beg += offset;
+ 	      /* Narrow down to the line containing the candidate, and
+ 		 run it through DFA. */
+ 	      end = memchr(beg, eol, buflim - beg);
+ 	      end++;
+ #ifdef MBS_SUPPORT
+-	      if (MB_CUR_MAX > 1 && mb_properties[beg - buf] == 0)
++	      if (mb_cur_max > 1 && bytes_left)
+ 		continue;
+ #endif
+ 	      while (beg > buf && beg[-1] != eol)
+ 		--beg;
+-	      if (kwsm.index < kwset_exact_matches)
++	      if (
++#ifdef MBS_SUPPORT
++		  !(match_icase && mb_cur_max > 1) &&
++#endif /* MBS_SUPPORT */
++		  (kwsm.index < kwset_exact_matches))
+ 		goto success;
+ 	      if (dfaexec (&dfa, beg, end - beg, &backref) == (size_t) -1)
+ 		continue;
+@@ -351,13 +363,47 @@
+ 	  else
+ 	    {
+ 	      /* No good fixed strings; start with DFA. */
++#ifdef MBS_SUPPORT
++	      size_t bytes_left = 0;
++#endif /* MBS_SUPPORT */
+ 	      size_t offset = dfaexec (&dfa, beg, buflim - beg, &backref);
+ 	      if (offset == (size_t) -1)
+ 		break;
+ 	      /* Narrow down to the line we've found. */
++#ifdef MBS_SUPPORT
++	      if (mb_cur_max > 1 && !using_utf8)
++		{
++		  bytes_left = offset;
++		  while (bytes_left)
++		    {
++		      size_t mlen = mbrlen (beg, bytes_left, &mbs);
++		      if (mlen == (size_t) -1 || mlen == 0)
++			{
++			  /* Incomplete character: treat as single-byte. */
++			  memset (&mbs, '\0', sizeof (mbstate_t));
++			  beg++;
++			  bytes_left--;
++			  continue;
++			}
++
++		      if (mlen == (size_t) -2)
++			/* Offset points inside multibyte character:
++			 * no good. */
++			break;
++
++		      beg += mlen;
++		      bytes_left -= mlen;
++		    }
++		}
++	      else
++#endif /* MBS_SUPPORT */
+ 	      beg += offset;
+ 	      end = memchr (beg, eol, buflim - beg);
+ 	      end++;
++#ifdef MBS_SUPPORT
++	      if (mb_cur_max > 1 && bytes_left)
++		continue;
++#endif /* MBS_SUPPORT */
+ 	      while (beg > buf && beg[-1] != eol)
+ 		--beg;
+ 	    }
+@@ -475,24 +521,144 @@
+   *match_size = len;
+   ret_val = beg - buf;
+  out:
+-#ifdef MBS_SUPPORT
+-  if (MB_CUR_MAX > 1)
+-    {
+-      if (match_icase)
+-        free((char*)buf);
+-      if (mb_properties)
+-        free(mb_properties);
+-    }
+-#endif /* MBS_SUPPORT */
+   return ret_val;
+ }
+ #endif /* defined(GREP_PROGRAM) || defined(EGREP_PROGRAM) */
+ 
++#ifdef MBS_SUPPORT
++static int f_i_multibyte; /* whether we're using the new -Fi MB method */
++static struct
++{
++  wchar_t **patterns;
++  size_t count, maxlen;
++  unsigned char *match;
++} Fimb;
++#endif
++
+ #if defined(GREP_PROGRAM) || defined(FGREP_PROGRAM)
+ COMPILE_FCT(Fcompile)
+ {
++  int mb_cur_max = MB_CUR_MAX;
+   char const *beg, *lim, *err;
+ 
++  check_utf8 ();
++#ifdef MBS_SUPPORT
++  /* Support -F -i for UTF-8 input. */
++  if (match_icase && mb_cur_max > 1)
++    {
++      mbstate_t mbs;
++      wchar_t *wcpattern = xmalloc ((size + 1) * sizeof (wchar_t));
++      const char *patternend = pattern;
++      size_t wcsize;
++      kwset_t fimb_kwset = NULL;
++      char *starts = NULL;
++      wchar_t *wcbeg, *wclim;
++      size_t allocated = 0;
++
++      memset (&mbs, '\0', sizeof (mbs));
++# ifdef __GNU_LIBRARY__
++      wcsize = mbsnrtowcs (wcpattern, &patternend, size, size, &mbs);
++      if (patternend != pattern + size)
++	wcsize = (size_t) -1;
++# else
++      {
++	char *patterncopy = xmalloc (size + 1);
++
++	memcpy (patterncopy, pattern, size);
++	patterncopy[size] = '\0';
++	patternend = patterncopy;
++	wcsize = mbsrtowcs (wcpattern, &patternend, size, &mbs);
++	if (patternend != patterncopy + size)
++	  wcsize = (size_t) -1;
++	free (patterncopy);
++      }
++# endif
++      if (wcsize + 2 <= 2)
++	{
++fimb_fail:
++	  free (wcpattern);
++	  free (starts);
++	  if (fimb_kwset)
++	    kwsfree (fimb_kwset);
++	  free (Fimb.patterns);
++	  Fimb.patterns = NULL;
++	}
++      else
++	{
++	  if (!(fimb_kwset = kwsalloc (NULL)))
++	    error (2, 0, _("memory exhausted"));
++
++	  starts = xmalloc (mb_cur_max * 3);
++	  wcbeg = wcpattern;
++	  do
++	    {
++	      int i;
++	      size_t wclen;
++
++	      if (Fimb.count >= allocated)
++		{
++		  if (allocated == 0)
++		    allocated = 128;
++		  else
++		    allocated *= 2;
++		  Fimb.patterns = xrealloc (Fimb.patterns,
++					    sizeof (wchar_t *) * allocated);
++		}
++	      Fimb.patterns[Fimb.count++] = wcbeg;
++	      for (wclim = wcbeg;
++		   wclim < wcpattern + wcsize && *wclim != L'\n'; ++wclim)
++		*wclim = towlower (*wclim);
++	      *wclim = L'\0';
++	      wclen = wclim - wcbeg;
++	      if (wclen > Fimb.maxlen)
++		Fimb.maxlen = wclen;
++	      if (wclen > 3)
++		wclen = 3;
++	      if (wclen == 0)
++		{
++		  if ((err = kwsincr (fimb_kwset, "", 0)) != 0)
++		    error (2, 0, err);
++		}
++	      else
++		for (i = 0; i < (1 << wclen); i++)
++		  {
++		    char *p = starts;
++		    int j, k;
++
++		    for (j = 0; j < wclen; ++j)
++		      {
++			wchar_t wc = wcbeg[j];
++			if (i & (1 << j))
++			  {
++			    wc = towupper (wc);
++			    if (wc == wcbeg[j])
++			      continue;
++			  }
++			k = wctomb (p, wc);
++			if (k <= 0)
++			  goto fimb_fail;
++			p += k;
++		      }
++		    if ((err = kwsincr (fimb_kwset, starts, p - starts)) != 0)
++		      error (2, 0, err);
++		  }
++	      if (wclim < wcpattern + wcsize)
++		++wclim;
++	      wcbeg = wclim;
++	    }
++	  while (wcbeg < wcpattern + wcsize);
++	  f_i_multibyte = 1;
++	  kwset = fimb_kwset;
++	  free (starts);
++	  Fimb.match = xmalloc (Fimb.count);
++	  if ((err = kwsprep (kwset)) != 0)
++	    error (2, 0, err);
++	  return;
++	}
++    }
++#endif /* MBS_SUPPORT */
++
++
+   kwsinit ();
+   beg = pattern;
+   do
+@@ -511,6 +677,76 @@
+     error (2, 0, err);
+ }
+ 
++#ifdef MBS_SUPPORT
++static int
++Fimbexec (const char *buf, size_t size, size_t *plen, int exact)
++{
++  size_t len, letter, i;
++  int ret = -1;
++  mbstate_t mbs;
++  wchar_t wc;
++  int patterns_left;
++
++  assert (match_icase && f_i_multibyte == 1);
++  assert (MB_CUR_MAX > 1);
++
++  memset (&mbs, '\0', sizeof (mbs));
++  memset (Fimb.match, '\1', Fimb.count);
++  letter = len = 0;
++  patterns_left = 1;
++  while (patterns_left && len <= size)
++    {
++      size_t c;
++
++      patterns_left = 0;
++      if (len < size)
++	{
++	  c = mbrtowc (&wc, buf + len, size - len, &mbs);
++	  if (c + 2 <= 2)
++	    return ret;
++
++	  wc = towlower (wc);
++	}
++      else
++	{
++	  c = 1;
++	  wc = L'\0';
++	}
++
++      for (i = 0; i < Fimb.count; i++)
++	{
++	  if (Fimb.match[i])
++	    {
++	      if (Fimb.patterns[i][letter] == L'\0')
++		{
++		  /* Found a match. */
++		  *plen = len;
++		  if (!exact && !match_words)
++		    return 0;
++		  else
++		    {
++		      /* For -w or exact look for longest match.  */
++		      ret = 0;
++		      Fimb.match[i] = '\0';
++		      continue;
++		    }
++		}
++
++	      if (Fimb.patterns[i][letter] == wc)
++		patterns_left = 1;
++	      else
++		Fimb.match[i] = '\0';
++	    }
++	}
++
++      len += c;
++      letter++;
++    }
++
++  return ret;
++}
++#endif /* MBS_SUPPORT */
++
+ EXECUTE_FCT(Fexecute)
+ {
+   register char const *beg, *try, *end;
+@@ -519,69 +755,256 @@
+   struct kwsmatch kwsmatch;
+   size_t ret_val;
+ #ifdef MBS_SUPPORT
+-  char *mb_properties = NULL;
+-  if (MB_CUR_MAX > 1)
+-    {
+-      if (match_icase)
+-        {
+-          char *case_buf = xmalloc(size);
+-          memcpy(case_buf, buf, size);
+-	  if (start_ptr)
+-	    start_ptr = case_buf + (start_ptr - buf);
+-          buf = case_buf;
+-        }
+-      mb_properties = check_multibyte_string(buf, size);
+-    }
++  int mb_cur_max = MB_CUR_MAX;
++  mbstate_t mbs;
++  memset (&mbs, '\0', sizeof (mbstate_t));
++  const char *last_char = NULL;
+ #endif /* MBS_SUPPORT */
+ 
+   for (beg = start_ptr ? start_ptr : buf; beg <= buf + size; beg++)
+     {
+       size_t offset = kwsexec (kwset, beg, buf + size - beg, &kwsmatch);
+       if (offset == (size_t) -1)
+-	goto failure;
++	return offset;
+ #ifdef MBS_SUPPORT
+-      if (MB_CUR_MAX > 1 && mb_properties[offset+beg-buf] == 0)
+-	continue; /* It is a part of multibyte character.  */
++      if (mb_cur_max > 1 && !using_utf8)
++	{
++	  size_t bytes_left = offset;
++	  while (bytes_left)
++	    {
++	      size_t mlen = mbrlen (beg, bytes_left, &mbs);
++
++	      last_char = beg;
++	      if (mlen == (size_t) -1 || mlen == 0)
++		{
++		  /* Incomplete character: treat as single-byte. */
++		  memset (&mbs, '\0', sizeof (mbstate_t));
++		  beg++;
++		  bytes_left--;
++		  continue;
++		}
++
++	      if (mlen == (size_t) -2)
++		/* Offset points inside multibyte character: no good. */
++		break;
++
++	      beg += mlen;
++	      bytes_left -= mlen;
++	    }
++
++	  if (bytes_left)
++	    continue;
++	}
++      else
+ #endif /* MBS_SUPPORT */
+       beg += offset;
++#ifdef MBS_SUPPORT
++      /* For f_i_multibyte, the string at beg now matches first 3 chars of
++	 one of the search strings (less if there are shorter search strings).
++	 See if this is a real match.  */
++      if (f_i_multibyte
++	  && Fimbexec (beg, buf + size - beg, &kwsmatch.size[0], start_ptr == NULL))
++	goto next_char;
++#endif /* MBS_SUPPORT */
+       len = kwsmatch.size[0];
+       if (start_ptr && !match_words)
+ 	goto success_in_beg_and_len;
+       if (match_lines)
+ 	{
+ 	  if (beg > buf && beg[-1] != eol)
+-	    continue;
++	    goto next_char;
+ 	  if (beg + len < buf + size && beg[len] != eol)
+-	    continue;
++	    goto next_char;
+ 	  goto success;
+ 	}
+       else if (match_words)
+-	for (try = beg; len; )
+-	  {
+-	    if (try > buf && WCHAR((unsigned char) try[-1]))
+-	      break;
+-	    if (try + len < buf + size && WCHAR((unsigned char) try[len]))
+-	      {
+-		offset = kwsexec (kwset, beg, --len, &kwsmatch);
+-		if (offset == (size_t) -1)
+-		  break;
+-		try = beg + offset;
+-		len = kwsmatch.size[0];
+-	      }
+-	    else if (!start_ptr)
+-	      goto success;
+-	    else
+-	      goto success_in_beg_and_len;
+-	  } /* for (try) */
+-      else
+-	goto success;
+-    } /* for (beg in buf) */
++	{
++	  while (len)
++	    {
++	      int word_match = 0;
++	      if (beg > buf)
++		{
++#ifdef MBS_SUPPORT
++		  if (mb_cur_max > 1)
++		    {
++		      const char *s;
++		      int mr;
++		      wchar_t pwc;
++
++		      if (using_utf8)
++			{
++			  s = beg - 1;
++			  while (s > buf
++				 && (unsigned char) *s >= 0x80
++				 && (unsigned char) *s <= 0xbf)
++			    --s;
++			}
++		      else
++			s = last_char;
++		      mr = mbtowc (&pwc, s, beg - s);
++		      if (mr <= 0)
++			memset (&mbs, '\0', sizeof (mbstate_t));
++		      else if ((iswalnum (pwc) || pwc == L'_')
++			       && mr == (int) (beg - s))
++			goto next_char;
++		    }
++		  else
++#endif /* MBS_SUPPORT */
++		  if (WCHAR ((unsigned char) beg[-1]))
++		    goto next_char;
++		}
++#ifdef MBS_SUPPORT
++	      if (mb_cur_max > 1)
++		{
++		  wchar_t nwc;
++		  int mr;
+ 
+- failure:
+-  ret_val = -1;
+-  goto out;
++		  mr = mbtowc (&nwc, beg + len, buf + size - beg - len);
++		  if (mr <= 0)
++		    {
++		      memset (&mbs, '\0', sizeof (mbstate_t));
++		      word_match = 1;
++		    }
++		  else if (!iswalnum (nwc) && nwc != L'_')
++		    word_match = 1;
++		}
++	      else
++#endif /* MBS_SUPPORT */
++		if (beg + len >= buf + size || !WCHAR ((unsigned char) beg[len]))
++		  word_match = 1;
++	      if (word_match)
++		{
++		  if (start_ptr == NULL)
++		    /* Returns the whole line now we know there's a word match. */
++		    goto success;
++		  else {
++		    /* Returns just this word match. */
++		    *match_size = len;
++		    return beg - buf;
++		  }
++		}
++	      if (len > 0)
++		{
++		  /* Try a shorter length anchored at the same place. */
++		  --len;
++		  offset = kwsexec (kwset, beg, len, &kwsmatch);
++
++		  if (offset == -1)
++		    goto next_char; /* Try a different anchor. */
++#ifdef MBS_SUPPORT
++
++		  if (mb_cur_max > 1 && !using_utf8)
++		    {
++		      size_t bytes_left = offset;
++		      while (bytes_left)
++			{
++			  size_t mlen = mbrlen (beg, bytes_left, &mbs);
++
++			  last_char = beg;
++			  if (mlen == (size_t) -1 || mlen == 0)
++			    {
++			      /* Incomplete character: treat as single-byte. */
++			      memset (&mbs, '\0', sizeof (mbstate_t));
++			      beg++;
++			      bytes_left--;
++			      continue;
++			    }
++
++			  if (mlen == (size_t) -2)
++			    {
++			      /* Offset points inside multibyte character:
++			       * no good. */
++			      break;
++			    }
++
++			  beg += mlen;
++			  bytes_left -= mlen;
++			}
++
++		      if (bytes_left)
++			{
++			  memset (&mbs, '\0', sizeof (mbstate_t));
++			  goto next_char; /* Try a different anchor. */
++			}
++		    }
++		  else
++#endif /* MBS_SUPPORT */
++		  beg += offset;
++#ifdef MBS_SUPPORT
++		  /* The string at beg now matches first 3 chars of one of
++		     the search strings (less if there are shorter search
++		     strings).  See if this is a real match.  */
++		  if (f_i_multibyte
++		      && Fimbexec (beg, len - offset, &kwsmatch.size[0],
++				   start_ptr == NULL))
++		    goto next_char;
++#endif /* MBS_SUPPORT */
++		  len = kwsmatch.size[0];
++		}
++	    }
++	}
++       else
++	goto success;
++next_char:;
++#ifdef MBS_SUPPORT
++      /* Advance to next character.  For MB_CUR_MAX == 1 case this is handled
++	 by ++beg above.  */
++      if (mb_cur_max > 1)
++	{
++	  if (using_utf8)
++	    {
++	      unsigned char c = *beg;
++	      if (c >= 0xc2)
++		{
++		  if (c < 0xe0)
++		    ++beg;
++		  else if (c < 0xf0)
++		    beg += 2;
++		  else if (c < 0xf8)
++		    beg += 3;
++		  else if (c < 0xfc)
++		    beg += 4;
++		  else if (c < 0xfe)
++		    beg += 5;
++		}
++	    }
++	  else
++	    {
++	      size_t l = mbrlen (beg, buf + size - beg, &mbs);
++
++	      last_char = beg;
++	      if (l + 2 >= 2)
++		beg += l - 1;
++	      else
++		memset (&mbs, '\0', sizeof (mbstate_t));
++	    }
++	}
++#endif /* MBS_SUPPORT */
++    }
++
++  return -1;
+ 
+  success:
++#ifdef MBS_SUPPORT
++  if (mb_cur_max > 1 && !using_utf8)
++    {
++      end = beg + len;
++      while (end < buf + size)
++	{
++	  size_t mlen = mbrlen (end, buf + size - end, &mbs);
++	  if (mlen == (size_t) -1 || mlen == (size_t) -2 || mlen == 0)
++	    {
++	      memset (&mbs, '\0', sizeof (mbstate_t));
++	      mlen = 1;
++	    }
++	  if (mlen == 1 && *end == eol)
++	    break;
++
++	  end += mlen;
++	}
++     }
++  else
++ #endif /* MBS_SUPPORT */
+   end = memchr (beg + len, eol, (buf + size) - (beg + len));
+   end++;
+   while (buf < beg && beg[-1] != eol)
+@@ -591,15 +1016,6 @@
+   *match_size = len;
+   ret_val = beg - buf;
+  out:
+-#ifdef MBS_SUPPORT
+-  if (MB_CUR_MAX > 1)
+-    {
+-      if (match_icase)
+-        free((char*)buf);
+-      if (mb_properties)
+-        free(mb_properties);
+-    }
+-#endif /* MBS_SUPPORT */
+   return ret_val;
+ }
+ #endif /* defined(GREP_PROGRAM) || defined(FGREP_PROGRAM) */
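Note on the next_char step in the Fexecute hunk above: in a UTF-8 locale the patch advances past a whole character by inspecting only its lead byte, instead of calling mbrlen. A minimal sketch of that lookup follows, assuming the same historical (up to 6-byte) UTF-8 ranges the patch tests against. The name utf8_seq_len is hypothetical and does not appear in grep.

```c
#include <stddef.h>

/* Hypothetical helper (not part of grep): number of bytes in the UTF-8
   sequence introduced by lead byte C, using the same thresholds as the
   next_char code in the patch.  Bytes below 0xc2 (ASCII, continuation
   bytes, and the overlong leads 0xc0/0xc1) and the bytes 0xfe/0xff are
   treated as single bytes.  */
size_t
utf8_seq_len (unsigned char c)
{
  if (c < 0xc2)
    return 1;
  if (c < 0xe0)
    return 2;
  if (c < 0xf0)
    return 3;
  if (c < 0xf8)
    return 4;
  if (c < 0xfc)
    return 5;
  if (c < 0xfe)
    return 6;
  return 1;
}
```

The patch itself adds one less than these values to beg, because the enclosing for loop supplies the final increment.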
only in patch2:
unchanged:
--- grep-2.5.3~dfsg.orig/debian/patches/67-w.patch
+++ grep-2.5.3~dfsg/debian/patches/67-w.patch
@@ -0,0 +1,118 @@
+reverted:
+--- src/search.c	2007-10-01 14:47:55.000000000 +0200
++++ src/search.c	2007-09-30 23:38:45.000000000 +0200
+@@ -282,6 +284,7 @@
+   static int use_dfa_checked = 0;
+   size_t i, ret_val;
+ #ifdef MBS_SUPPORT
++  const char *last_char = NULL;
+   int mb_cur_max = MB_CUR_MAX;
+   mbstate_t mbs;
+   memset (&mbs, '\0', sizeof (mbstate_t));
+@@ -338,6 +341,8 @@
+ 		  while (bytes_left)
+ 		    {
+ 		      size_t mlen = mbrlen (beg, bytes_left, &mbs);
++
++		      last_char = beg;
+ 		      if (mlen == (size_t) -1 || mlen == 0)
+ 			{
+ 			  /* Incomplete character: treat as single-byte. */
+@@ -398,6 +403,8 @@
+ 		  while (bytes_left)
+ 		    {
+ 		      size_t mlen = mbrlen (beg, bytes_left, &mbs);
++
++		      last_char = beg;
+ 		      if (mlen == (size_t) -1 || mlen == 0)
+ 			{
+ 			  /* Incomplete character: treat as single-byte. */
+@@ -475,10 +483,84 @@
+ 	      if (match_words)
+ 		while (match <= best_match)
+ 		  {
++		    int lword_match = 0;
++		    if (match == buf)
++		      lword_match = 1;
++		    else
++		      {
++			assert (start > 0);
++#ifdef MBS_SUPPORT
++			if (mb_cur_max > 1)
++			  {
++			    const char *s;
++			    int mr;
++			    wchar_t pwc;
++			    if (using_utf8)
++			      {
++				s = match - 1;
++				while (s > buf
++				       && (unsigned char) *s >= 0x80
++				       && (unsigned char) *s <= 0xbf)
++				  --s;
++			      }
++			    else
++			      s = last_char;
++			    mr = mbtowc (&pwc, s, match - s);
++			    if (mr <= 0)
++			      {
++				memset (&mbs, '\0', sizeof (mbstate_t));
++				lword_match = 1;
++			      }
++			    else if (!(iswalnum (pwc) || pwc == L'_')
++				     && mr == (int) (match - s))
++			      lword_match = 1;
++			  }
++			else
++#endif /* MBS_SUPPORT */
++			if (!WCHAR ((unsigned char) match[-1]))
++			  lword_match = 1;
++		      }
++
++		    if (lword_match)
++		      {
++			int rword_match = 0;
++			if (start + len == end - beg - 1)
++			  rword_match = 1;
++			else
++			  {
++#ifdef MBS_SUPPORT
++			    if (mb_cur_max > 1)
++			      {
++				wchar_t nwc;
++				int mr;
++
++				mr = mbtowc (&nwc, buf + start + len,
++					     end - buf - start - len - 1);
++				if (mr <= 0)
++				  {
++				    memset (&mbs, '\0', sizeof (mbstate_t));
++				    rword_match = 1;
++				  }
++				else if (!iswalnum (nwc) && nwc != L'_')
++				  rword_match = 1;
++			      }
++			    else
++#endif /* MBS_SUPPORT */
++			    if (!WCHAR ((unsigned char) match[len]))
++			      rword_match = 1;
++			  }
++
++			if (rword_match)
++			  {
++			    if (!start_ptr)
++			      /* Returns the whole line. */
++			      goto success;
++			    else
++			      {
++				goto assess_pattern_match;
++			      }
++			  }
++		      }
+-		    if ((match == buf || !WCHAR ((unsigned char) match[-1]))
+-			&& (len == end - beg - 1
+-			    || !WCHAR ((unsigned char) match[len])))
+-		      goto assess_pattern_match;
+ 		    if (len > 0)
+ 		      {
+ 			/* Try a shorter length anchored at the same place. */
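Note on the left-hand word check in 67-w.patch above: when using_utf8 is set, the patch finds the character preceding the match by stepping backwards over continuation bytes (0x80 to 0xbf) until it reaches a lead byte or the start of the buffer. A minimal sketch of just that scan, assuming well-formed UTF-8 input. The name utf8_prev_char_start is hypothetical and not part of grep.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of grep): given POS > BUF inside UTF-8
   text, return the start of the character that ends just before POS.
   It steps back over continuation bytes (0x80..0xbf), mirroring the
   using_utf8 branch of the patch.  */
const char *
utf8_prev_char_start (const char *buf, const char *pos)
{
  const char *s;

  assert (pos > buf);
  s = pos - 1;
  while (s > buf
	 && (unsigned char) *s >= 0x80
	 && (unsigned char) *s <= 0xbf)
    --s;
  return s;
}
```

The patch then decodes the character at that position with mbtowc and tests it with iswalnum (or against L'_') to decide whether the match starts on a word boundary.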
only in patch2:
unchanged:
--- grep-2.5.3~dfsg.orig/debian/patches/65-dfa-optional.patch
+++ grep-2.5.3~dfsg/debian/patches/65-dfa-optional.patch
@@ -0,0 +1,73 @@
+The DFA algorithm is slow with multibyte characters.
+This patch disables the DFA algorithm, but it can be re-enabled by setting
+the GREP_USE_DFA environment variable.
+
+This patch requires 64-egf-speedup.patch
+--- src/search.c.orig	2005-09-06 22:22:17.000000000 +0200
++++ src/search.c	2005-09-06 22:25:41.000000000 +0200
+@@ -326,6 +326,8 @@
+   char eol = eolbyte;
+   int backref, start, len;
+   struct kwsmatch kwsm;
++  static int use_dfa;
++  static int use_dfa_checked = 0;
+   size_t i, ret_val;
+ #ifdef MBS_SUPPORT
+   int mb_cur_max = MB_CUR_MAX;
+@@ -333,6 +335,26 @@
+   memset (&mbs, '\0', sizeof (mbstate_t));
+ #endif /* MBS_SUPPORT */
+ 
++  if (!use_dfa_checked)
++    {
++      char *grep_use_dfa = getenv ("GREP_USE_DFA");
++      if (!grep_use_dfa)
++	{
++#ifdef MBS_SUPPORT
++	  /* Turn off DFA when processing multibyte input. */
++	  use_dfa = (MB_CUR_MAX == 1);
++#else
++	  use_dfa = 1;
++#endif /* MBS_SUPPORT */
++	}
++      else
++	{
++	  use_dfa = atoi (grep_use_dfa);
++	}
++
++      use_dfa_checked = 1;
++    }
++
+   buflim = buf + size;
+ 
+   for (beg = end = buf; end < buflim; beg = end)
+@@ -400,7 +422,8 @@
+ #endif /* MBS_SUPPORT */
+ 		  (kwsm.index < kwset_exact_matches))
+ 		goto success;
+-	      if (dfaexec (&dfa, beg, end - beg, &backref) == (size_t) -1)
++	      if (use_dfa &&
++		  dfaexec (&dfa, beg, end - beg, &backref) == (size_t) -1)
+ 		continue;
+ 	    }
+ 	  else
+@@ -409,7 +432,9 @@
+ #ifdef MBS_SUPPORT
+ 	      size_t bytes_left = 0;
+ #endif /* MBS_SUPPORT */
+-	      size_t offset = dfaexec (&dfa, beg, buflim - beg, &backref);
++	      size_t offset = 0;
++	      if (use_dfa)
++		offset = dfaexec (&dfa, beg, buflim - beg, &backref);
+ 	      if (offset == (size_t) -1)
+ 		break;
+ 	      /* Narrow down to the line we've found. */
+@@ -451,7 +476,7 @@
+ 		--beg;
+ 	    }
+ 	  /* Successful, no backreferences encountered! */
+-	  if (!backref)
++	  if (use_dfa && !backref)
+ 	    goto success;
+ 	}
+       else
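Note on 65-dfa-optional.patch above: the decision is made once per process and cached in a static variable. If GREP_USE_DFA is set, its value (parsed with atoi) wins; otherwise the DFA is used only in single-byte locales. A stand-alone sketch of that logic follows. The function names decide_use_dfa and use_dfa_cached are hypothetical, and the sketch ignores the MBS_SUPPORT build switch that the patch honours.

```c
#include <stdlib.h>

/* Hypothetical stand-alone version of the decision in the patch.
   ENV_VALUE is the value of GREP_USE_DFA, or NULL if it is unset.
   Because atoi is used, "0" and any non-numeric string disable the
   DFA, and any non-zero number enables it.  */
int
decide_use_dfa (const char *env_value, int mb_cur_max)
{
  if (env_value == NULL)
    return mb_cur_max == 1;	/* DFA only for single-byte locales.  */
  return atoi (env_value);
}

/* Cached wrapper: reads the environment on the first call only, as the
   use_dfa_checked flag does in the patch.  */
int
use_dfa_cached (void)
{
  static int checked = 0;
  static int use_dfa;

  if (!checked)
    {
      use_dfa = decide_use_dfa (getenv ("GREP_USE_DFA"), (int) MB_CUR_MAX);
      checked = 1;
    }
  return use_dfa;
}
```

With the patch applied, something like GREP_USE_DFA=1 in the environment would re-enable the DFA in a UTF-8 locale, which may help when comparing timings.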
only in patch2:
unchanged:
--- grep-2.5.3~dfsg.orig/debian/patches/66-match_icase.patch
+++ grep-2.5.3~dfsg/debian/patches/66-match_icase.patch
@@ -0,0 +1,39 @@
+This fixes
+    echo Y | LC_ALL=en_US.UTF-8 grep -i '[y]'
+The expected output is:
+    Y
+
+Without this patch, it works in non-UTF-8 environments but fails in UTF-8
+environments.
+
+The definition of RE_ICASE comes from glibc (/usr/include/regex.h).
+
+Maybe lib/posix/regex.h should be removed to enforce the use of
+glibc's regex.h.
+
+--- lib/posix/regex.h.orig	2004-01-05 12:09:12.984391131 +0000
++++ lib/posix/regex.h	2004-01-05 12:09:24.717990622 +0000
+@@ -109,6 +109,10 @@
+    treated as 'a\{1'.  */
+ #define RE_INVALID_INTERVAL_ORD (RE_DEBUG << 1)
+
++/* If this bit is set, then ignore case when matching.
++   If not set, then case is significant.  */
++#define RE_ICASE (RE_INVALID_INTERVAL_ORD << 1)
++
+ /* This global variable defines the particular regexp syntax to use (for
+    some interfaces).  When a regexp is compiled, the syntax used is
+    stored in the pattern buffer, so changing this does not affect
+--- src/search.c.orig	2005-09-06 23:50:40.000000000 +0200
++++ src/search.c	2005-09-06 23:59:33.000000000 +0200
+@@ -172,10 +167,8 @@
+   char const *motif = pattern;
+ 
+   check_utf8 ();
+-#if 0
+   if (match_icase)
+     syntax_bits |= RE_ICASE;
+-#endif
+   re_set_syntax (syntax_bits);
+   dfasyntax (syntax_bits, match_icase, eolbyte);
+ 
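Note on 66-match_icase.patch above: the test case is a case-insensitive match of "Y" against the bracket expression '[y]'. The sketch below checks the same pattern and input through the POSIX regcomp/regexec interface with REG_ICASE. This is an assumption made for illustration: grep itself goes through the GNU re_set_syntax/RE_ICASE path that the patch touches, and the sketch runs in whatever locale the program starts in (normally "C"), so it does not reproduce the UTF-8 failure. The name matches_icase is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of grep): return 1 if TEXT matches
   PATTERN ignoring case, 0 if it does not, and -1 if PATTERN fails to
   compile.  Uses POSIX basic regular expressions.  */
int
matches_icase (const char *pattern, const char *text)
{
  regex_t re;
  int rc;

  if (regcomp (&re, pattern, REG_ICASE | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0 ? 1 : 0;
}
```

Calling matches_icase ("[y]", "Y") corresponds to the echo Y | grep -i '[y]' case in the patch description.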




Tags added: pending Request was from Thomas Viehmann <tv@beamnet.de> to control@bugs.debian.org. (Wed, 03 Oct 2007 20:15:04 GMT) (full text, mbox, link).


Tags added: pending Request was from Thomas Viehmann <tv@beamnet.de> to control@bugs.debian.org. (Wed, 03 Oct 2007 20:15:06 GMT) (full text, mbox, link).


Reply sent to Anibal Monsalve Salazar <anibal@debian.org>:
You have taken responsibility. (full text, mbox, link).


Notification sent to Max Zou <zoum@mzou.net>:
Bug acknowledged by developer. (full text, mbox, link).


Message #171 received at 181378-close@bugs.debian.org (full text, mbox, reply):

From: Anibal Monsalve Salazar <anibal@debian.org>
To: 181378-close@bugs.debian.org
Subject: Bug#181378: fixed in grep 2.5.3~dfsg-3
Date: Thu, 04 Oct 2007 12:47:03 +0000
Source: grep
Source-Version: 2.5.3~dfsg-3

We believe that the bug you reported is fixed in the latest version of
grep, which is due to be installed in the Debian FTP archive:

grep_2.5.3~dfsg-3.diff.gz
  to pool/main/g/grep/grep_2.5.3~dfsg-3.diff.gz
grep_2.5.3~dfsg-3.dsc
  to pool/main/g/grep/grep_2.5.3~dfsg-3.dsc
grep_2.5.3~dfsg-3_i386.deb
  to pool/main/g/grep/grep_2.5.3~dfsg-3_i386.deb



A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to 181378@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Anibal Monsalve Salazar <anibal@debian.org> (supplier of updated grep package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@debian.org)


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Format: 1.7
Date: Thu, 04 Oct 2007 21:01:16 +1000
Source: grep
Binary: grep
Architecture: source i386
Version: 2.5.3~dfsg-3
Distribution: unstable
Urgency: high
Maintainer: Anibal Monsalve Salazar <anibal@debian.org>
Changed-By: Anibal Monsalve Salazar <anibal@debian.org>
Description: 
 grep       - GNU grep, egrep and fgrep
Closes: 181378 350206 368575 429435 432636 439931 441006 442882 444164
Changes: 
 grep (2.5.3~dfsg-3) unstable; urgency=high
 .
   * Acknowledge NMU. Closes: #181378, #442882, #439931, #368575,
     #350206, #432636, #441006, #444164, #429435
   * debian/control: added homepage
Files: 
 9634be6dc678db211b80a505b5fd2fdc 703 utils required grep_2.5.3~dfsg-3.dsc
 f38cd1e84e70ea244f6ef0ef901e2edc 16438 utils required grep_2.5.3~dfsg-3.diff.gz
 55072d5f3ab1ad8d59a5355de11b53f4 277064 utils required grep_2.5.3~dfsg-3_i386.deb

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHBN0BgY5NIXPNpFURAjnRAKCNKtYgLLVuFbU59OhGzfSEoiYBrQCZAXF+
6joAY6olWsHhkpkDJm4TkQY=
=5qXC
-----END PGP SIGNATURE-----





Reply sent to Anibal Monsalve Salazar <anibal@debian.org>:
You have taken responsibility. (full text, mbox, link).


Notification sent to Dan Jacobson <jidanni@jidanni.org>:
Bug acknowledged by developer. (full text, mbox, link).


Reply sent to Anibal Monsalve Salazar <anibal@debian.org>:
You have taken responsibility. (full text, mbox, link).


Notification sent to Pierre Habouzit <madcoder@debian.org>:
Bug acknowledged by developer. (full text, mbox, link).


Bug archived. Request was from Debbugs Internal Request <owner@bugs.debian.org> to internal_control@bugs.debian.org. (Mon, 05 Nov 2007 07:27:21 GMT) (full text, mbox, link).



Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Thu Jan 11 19:13:44 2018; Machine Name: beach
