Debian Bug report logs - #355966
mawk: segfault setting RS to invalid regexp

Package: mawk; Maintainer for mawk is Steve Langasek <vorlon@debian.org>; Source for mawk is src:mawk.

Reported by: Devin Bayer <devin@freeshell.org>

Date: Thu, 9 Mar 2006 00:33:07 UTC

Severity: normal

Tags: fixed-upstream, patch

Found in version mawk/1.3.3-11



Report forwarded to debian-bugs-dist@lists.debian.org, James Troup <james@nocrew.org>:
Bug#355966; Package mawk. Full text and rfc822 format available.

Acknowledgement sent to Devin Bayer <devin@freeshell.org>:
New Bug report received and forwarded. Copy sent to James Troup <james@nocrew.org>. Full text and rfc822 format available.

Message #5 received at submit@bugs.debian.org (full text, mbox):

From: Devin Bayer <devin@freeshell.org>
To: Debian Bug Tracking System <submit@bugs.debian.org>
Subject: mawk segfaults on invalid regexp
Date: Wed, 08 Mar 2006 16:16:24 -0800
Package: mawk
Version: 1.3.3-11
Severity: normal

$ mawk -v RS='\n|'
mawk: line 0: regular expression compile failed (missing operand)

|
Segmentation fault (core dumped)

backtrace:
#0  0x08057f46 in matherr ()
#1  0x080537f0 in ?? ()
#2  0x00000000 in ?? ()

-- System Information:
Debian Release: testing/unstable
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'unstable'), (1, 'experimental')
Architecture: i386 (i686)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.11-1-k7-smp
Locale: LANG=en_US, LC_CTYPE=en_US (charmap=UTF-8) (ignored: LC_ALL set to en_US.UTF-8)

Versions of packages mawk depends on:
ii  libc6                         2.3.5-12.1 GNU C Library: Shared libraries an

mawk recommends no packages.

-- no debconf information



Information forwarded to debian-bugs-dist@lists.debian.org, Steve Langasek <vorlon@debian.org>:
Bug#355966; Package mawk. (Tue, 06 Jan 2009 19:24:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Jörgen Tegnér <jorgen.tegner@telia.com>:
Extra info received and forwarded to list. Copy sent to Steve Langasek <vorlon@debian.org>. (Tue, 06 Jan 2009 19:24:02 GMT) Full text and rfc822 format available.

Message #10 received at 355966@bugs.debian.org (full text, mbox):

From: Jörgen Tegnér <jorgen.tegner@telia.com>
To: 355966@bugs.debian.org
Cc: 355966-submitter@bugs.debian.org
Subject: Patch
Date: Tue, 06 Jan 2009 20:23:01 +0100
teg@burken:~/test/mawk/mawk-1.3.3/rexp$ diff -u rexp2.c.orig rexp2.c
--- rexp2.c.orig	2009-01-06 20:05:46.000000000 +0100
+++ rexp2.c	2009-01-06 20:06:33.000000000 +0100
@@ -323,6 +323,8 @@
    register STATE *p ;
    unsigned *lenp ;
 {
+  if (!p) return (char *) 0;
+  
    if (p[0].type == M_STR && p[1].type == M_ACCEPT)
    {
       *lenp = p->len ;

The output becomes
$./mawk -v RS='\n|'
mawk: line 0: regular expression compile failed (missing operand)

|
$

/Jörgen





Message sent on to Devin Bayer <devin@freeshell.org>:
Bug#355966. (Tue, 06 Jan 2009 19:24:04 GMT) Full text and rfc822 format available.

Message sent on to Devin Bayer <devin@freeshell.org>:
Bug#355966. (Mon, 13 Jul 2009 00:51:04 GMT) Full text and rfc822 format available.

Message #16 received at 355966-submitter@bugs.debian.org (full text, mbox):

From: Thomas Dickey <dickey@his.com>
To: 355966-submitter@bugs.debian.org
Subject: re: #355966 mawk segfaults on invalid regexp
Date: Sun, 12 Jul 2009 20:46:46 -0400
This fix is in the current version at

	ftp://invisible-island.net/mawk/

-- 
Thomas E. Dickey <dickey@invisible-island.net>
http://invisible-island.net
ftp://invisible-island.net

Added tag(s) fixed-upstream. Request was from Thomas Dickey <dickey@his.com> to control@bugs.debian.org. (Tue, 28 Jul 2009 08:51:20 GMT) Full text and rfc822 format available.

Information forwarded to debian-bugs-dist@lists.debian.org, Steve Langasek <vorlon@debian.org>:
Bug#355966; Package mawk. (Mon, 01 Mar 2010 19:57:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Jonathan Nieder <jrnieder@gmail.com>:
Extra info received and forwarded to list. Copy sent to Steve Langasek <vorlon@debian.org>. (Mon, 01 Mar 2010 19:57:03 GMT) Full text and rfc822 format available.

Message #23 received at 355966@bugs.debian.org (full text, mbox):

From: Jonathan Nieder <jrnieder@gmail.com>
To: 355966@bugs.debian.org, control@bugs.debian.org
Cc: Jörgen Tegnér <jorgen.tegner@telia.com>, Devin Bayer <devin@freeshell.org>
Subject: Re: mawk segfaults on invalid regexp
Date: Mon, 1 Mar 2010 13:56:05 -0600
retitle 355966 mawk: segfault setting RS to invalid regexp
tags 355966 + patch
thanks

Hi,

Devin Bayer wrote:

> $ mawk -v RS='\n|'
> mawk: line 0: regular expression compile failed (missing operand)
> 
> |
> Segmentation fault (core dumped)

Jörgen Tegnér wrote:

--- rexp2.c.orig	2009-01-06 20:05:46.000000000 +0100
+++ rexp2.c	2009-01-06 20:06:33.000000000 +0100
@@ -323,6 +323,8 @@
    register STATE *p ;
    unsigned *lenp ;
 {
+  if (!p) return (char *) 0;
+  
    if (p[0].type == M_STR && p[1].type == M_ACCEPT)
    {
       *lenp = p->len ;

Thanks, Jörgen, for the patch!

A similar fix was applied in mawk 1.3.3-20090711.  Because of Aleksey
Cheusov’s regex patch, the code had moved from rexp2.c to
rexp/rexp4.c.

For context, it is probably worth mentioning that is_string_split()
already returns NULL when its argument is a regexp that does not
represent a literal string.  In that case, a regexp pointing to null
is stored in RS, which is analogous to what normally happens when
an invalid regexp is used.  The compile_error_count was already
incremented, another compile-time error might be reported, and when
it is time to start execution mawk checks compile_error_count and
exits before there is a chance to try to use an invalid value.

Regards,
Jonathan
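
The guard discussed in this message can be illustrated with a minimal,
self-contained C sketch. This is not mawk's source: the STATE layout, the
M_OTHER constant, the str field and the name literal_of() are hypothetical
stand-ins chosen for illustration, and only the shape of the check (a null
machine is reported as "not a literal string" instead of being dereferenced)
follows the patch quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for mawk's regexp machine types. */
enum { M_STR, M_ACCEPT, M_OTHER };

typedef struct {
    int type;          /* one of M_STR, M_ACCEPT, M_OTHER */
    size_t len;        /* length of the literal, for M_STR states */
    const char *str;   /* literal text, for M_STR states */
} STATE;

/*
 * Return the literal string when the machine is exactly
 * "match this string, then accept"; return NULL otherwise.
 * A null machine (the result of a failed compile) takes the
 * same NULL path instead of being dereferenced.
 */
const char *
literal_of(const STATE *p, size_t *lenp)
{
    assert(lenp != NULL);

    if (!p)
        return NULL;              /* the guard added by the patch */

    if (p[0].type == M_STR && p[1].type == M_ACCEPT) {
        *lenp = p->len;
        return p->str;
    }
    return NULL;                  /* a real regexp, not a plain string */
}
```

With this shape, a caller that already handles "not a literal string" needs
no extra branch for a failed compile; as described above, the error count
recorded at compile time is what stops execution later.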




Changed Bug title to 'mawk: segfault setting RS to invalid regexp' from 'mawk segfaults on invalid regexp' Request was from Jonathan Nieder <jrnieder@gmail.com> to control@bugs.debian.org. (Mon, 01 Mar 2010 19:57:10 GMT) Full text and rfc822 format available.

Added tag(s) patch. Request was from Jonathan Nieder <jrnieder@gmail.com> to control@bugs.debian.org. (Mon, 01 Mar 2010 19:57:10 GMT) Full text and rfc822 format available.

Information forwarded to debian-bugs-dist@lists.debian.org, Steve Langasek <vorlon@debian.org>:
Bug#355966; Package mawk. (Mon, 01 Mar 2010 20:00:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Jonathan Nieder <jrnieder@gmail.com>:
Extra info received and forwarded to list. Copy sent to Steve Langasek <vorlon@debian.org>. (Mon, 01 Mar 2010 20:00:03 GMT) Full text and rfc822 format available.

Message #32 received at 355966@bugs.debian.org (full text, mbox):

From: Jonathan Nieder <jrnieder@gmail.com>
To: 355966@bugs.debian.org
Cc: Jörgen Tegnér <jorgen.tegner@telia.com>, Devin Bayer <devin@freeshell.org>
Subject: Re: mawk segfaults on invalid regexp
Date: Mon, 1 Mar 2010 13:58:18 -0600
Jonathan Nieder wrote:

> For context, it is probably worth mentioning that is_string_split()
> already returns NULL when its argument is a regexp that does not
> represent a literal string.  In that case,

Here by “In that case” I mean not the case of a regexp that does not
represent a literal string but the case of an invalid regexp.  Sorry
for the confusion.

Jonathan




Information forwarded to debian-bugs-dist@lists.debian.org, Steve Langasek <vorlon@debian.org>:
Bug#355966; Package mawk. (Tue, 10 Aug 2010 23:33:06 GMT) Full text and rfc822 format available.

Acknowledgement sent to Carlos Carvalho <carlos@fisica.ufpr.br>:
Extra info received and forwarded to list. Copy sent to Steve Langasek <vorlon@debian.org>. (Tue, 10 Aug 2010 23:33:07 GMT) Full text and rfc822 format available.

Message #37 received at 355966@bugs.debian.org (full text, mbox):

From: Carlos Carvalho <carlos@fisica.ufpr.br>
To: 355966@bugs.debian.org
Subject: mawk still fails with RS="\0"
Date: Tue, 10 Aug 2010 20:20:23 -0300
mawk cannot handle records separated by nulls, such as for example the
ones generated by find. Only the first record is handled:

% mkdir mawk-fails
% cd mawk-fails
%mawk-fails touch a b c
%mawk-fails find -printf "%p\0"|mawk 'BEGIN {RS="\0"} {print}' 
.

gawk works:

%mawk-fails find -printf "%p\0"|gawk 'BEGIN {RS="\0"} {print}'
.
./a
./b
./c
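
For reference, the record splitting that the gawk output above reflects can
be sketched in a few lines of C. This is an illustration only, not code from
mawk or gawk; the function name count_nul_records() is hypothetical. It
treats every NUL byte as a record terminator and relies on an explicit
length rather than on C string termination.

```c
#include <stddef.h>
#include <string.h>

/*
 * Count the records in buf[0..len), where each NUL byte ends a record.
 * Trailing bytes with no terminating NUL count as one final record.
 */
size_t
count_nul_records(const char *buf, size_t len)
{
    size_t count = 0;
    const char *p = buf;
    const char *end = buf + len;

    while (p < end) {
        const char *nul = memchr(p, '\0', (size_t)(end - p));

        count++;
        if (nul == NULL)
            break;            /* unterminated final record */
        p = nul + 1;          /* step past the separator */
    }
    return count;
}
```

Fed the 14 bytes that the find command above would emit for ".", "./a",
"./b" and "./c", the function counts four records, matching the four lines
gawk prints.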




Information forwarded to debian-bugs-dist@lists.debian.org, Steve Langasek <vorlon@debian.org>:
Bug#355966; Package mawk. (Wed, 11 Aug 2010 00:21:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Thomas Dickey <dickey@his.com>:
Extra info received and forwarded to list. Copy sent to Steve Langasek <vorlon@debian.org>. (Wed, 11 Aug 2010 00:21:02 GMT) Full text and rfc822 format available.

Message #42 received at 355966@bugs.debian.org (full text, mbox):

From: Thomas Dickey <dickey@his.com>
To: Carlos Carvalho <carlos@fisica.ufpr.br>, 355966@bugs.debian.org
Cc: Steve Langasek <vorlon@debian.org>
Subject: Re: Bug#355966: mawk still fails with RS="\0"
Date: Tue, 10 Aug 2010 20:17:07 -0400 (EDT)
On Tue, 10 Aug 2010, Carlos Carvalho wrote:

> mawk cannot handle records separated by nulls, such as for example the
> ones generated by find. Only the first record is handled:
>
> % mkdir mawk-fails
> % cd mawk-fails
> %mawk-fails touch a b c
> %mawk-fails find -printf "%p\0"|mawk 'BEGIN {RS="\0"} {print}'
> ..

It works with upstream mawk (just tested mawk-1.3.4-20100625).

This is a duplicate of #135614

-- 
Thomas E. Dickey
http://invisible-island.net
ftp://invisible-island.net




Information forwarded to debian-bugs-dist@lists.debian.org, Steve Langasek <vorlon@debian.org>:
Bug#355966; Package mawk. (Wed, 11 Aug 2010 00:24:06 GMT) Full text and rfc822 format available.

Acknowledgement sent to Thomas Dickey <dickey@his.com>:
Extra info received and forwarded to list. Copy sent to Steve Langasek <vorlon@debian.org>. (Wed, 11 Aug 2010 00:24:06 GMT) Full text and rfc822 format available.

Message #47 received at 355966@bugs.debian.org (full text, mbox):

From: Thomas Dickey <dickey@his.com>
To: Carlos Carvalho <carlos@fisica.ufpr.br>, 355966@bugs.debian.org
Cc: Steve Langasek <vorlon@debian.org>
Subject: Re: Bug#355966: mawk still fails with RS="\0"
Date: Tue, 10 Aug 2010 20:20:20 -0400 (EDT)
On Tue, 10 Aug 2010, Thomas Dickey wrote:

> On Tue, 10 Aug 2010, Carlos Carvalho wrote:
>
>> mawk cannot handle records separated by nulls, such as for example the
>> ones generated by find. Only the first record is handled:
>> 
>> % mkdir mawk-fails
>> % cd mawk-fails
>> %mawk-fails touch a b c
>> %mawk-fails find -printf "%p\0"|mawk 'BEGIN {RS="\0"} {print}'
>> ..
>
> It works with upstream mawk (just tested mawk-1.3.4-20100625).
>
> This is a duplicate of #135614

...however, #355966 is unrelated to this additional report about embedded
nulls.

-- 
Thomas E. Dickey
http://invisible-island.net
ftp://invisible-island.net




Information forwarded to debian-bugs-dist@lists.debian.org, Steve Langasek <vorlon@debian.org>:
Bug#355966; Package mawk. (Wed, 11 Aug 2010 01:57:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Carlos Carvalho <carlos@fisica.ufpr.br>:
Extra info received and forwarded to list. Copy sent to Steve Langasek <vorlon@debian.org>. (Wed, 11 Aug 2010 01:57:03 GMT) Full text and rfc822 format available.

Message #52 received at 355966@bugs.debian.org (full text, mbox):

From: Carlos Carvalho <carlos@fisica.ufpr.br>
To: 355966@bugs.debian.org
Subject: Re: Bug#355966: mawk still fails with RS="\0"
Date: Tue, 10 Aug 2010 22:53:48 -0300
Thomas Dickey (dickey@his.com) wrote on 10 August 2010 20:20:
 >On Tue, 10 Aug 2010, Thomas Dickey wrote:
 >
 >> On Tue, 10 Aug 2010, Carlos Carvalho wrote:
 >>
 >>> mawk cannot handle records separated by nulls, such as for example the
 >>> ones generated by find. Only the first record is handled:
 >>> 
 >>> % mkdir mawk-fails
 >>> % cd mawk-fails
 >>> %mawk-fails touch a b c
 >>> %mawk-fails find -printf "%p\0"|mawk 'BEGIN {RS="\0"} {print}'
 >>> ..
 >>
 >> It works with upstream mawk (just tested mawk-1.3.4-20100625).
 >>
 >> This is a duplicate of #135614

There you say that processing nulls is a gawk extension. However I
don't see it in the list of extensions of gawk manpage. It mentions
only the ability to split a string with the null string. Further, the
null string is not "\0".

 >...however, #355966 is unrelated to this additional report about embedded
 >nulls.

It's related to RS, which is where I stumbled on the nulls.

Anyway, it'd be quite useful to have this in Debian. mawk is a lot
faster than gawk...




Information forwarded to debian-bugs-dist@lists.debian.org, Steve Langasek <vorlon@debian.org>:
Bug#355966; Package mawk. (Wed, 11 Aug 2010 08:09:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Thomas Dickey <dickey@his.com>:
Extra info received and forwarded to list. Copy sent to Steve Langasek <vorlon@debian.org>. (Wed, 11 Aug 2010 08:09:04 GMT) Full text and rfc822 format available.

Message #57 received at 355966@bugs.debian.org (full text, mbox):

From: Thomas Dickey <dickey@his.com>
To: Carlos Carvalho <carlos@fisica.ufpr.br>, 355966@bugs.debian.org
Cc: Steve Langasek <vorlon@debian.org>
Subject: Re: Bug#355966: mawk still fails with RS="\0"
Date: Wed, 11 Aug 2010 04:04:26 -0400 (EDT)
On Tue, 10 Aug 2010, Carlos Carvalho wrote:

> Thomas Dickey (dickey@his.com) wrote on 10 August 2010 20:20:
> >On Tue, 10 Aug 2010, Thomas Dickey wrote:
> >
> >> On Tue, 10 Aug 2010, Carlos Carvalho wrote:
> >>
> >>> mawk cannot handle records separated by nulls, such as for example the
> >>> ones generated by find. Only the first record is handled:
> >>>
> >>> % mkdir mawk-fails
> >>> % cd mawk-fails
> >>> %mawk-fails touch a b c
> >>> %mawk-fails find -printf "%p\0"|mawk 'BEGIN {RS="\0"} {print}'
> >>> ..
> >>
> >> It works with upstream mawk (just tested mawk-1.3.4-20100625).
> >>
> >> This is a duplicate of #135614
>
> There you say that processing nulls is a gawk extension. However I
> don't see it in the list of extensions of gawk manpage. It mentions
> only the ability to split a string with the null string. Further, the
> null string is not "\0".

My understanding is that almost all of the embedded-nulls are the same -
an extension over POSIX.  Once you introduce it in one area, it gets 
implied into a lot of other areas.

> >...however, #355966 is unrelated to this additional report about embedded
> >nulls.
>
> It's related to RS, which is where I stumbled on the nulls.
>
> Anyway, it'd be quite useful to have this in Debian. mawk is a lot
> faster than gawk...

I agree.

-- 
Thomas E. Dickey
http://invisible-island.net
ftp://invisible-island.net






Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Thu Apr 17 04:13:24 2014; Machine Name: buxtehude.debian.org

Debian Bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.