Debian Bug report logs - #421845
confused by nested divs with class attributes


Package: markdown; Maintainer for markdown is Matt Kraai <kraai@debian.org>; Source for markdown is src:markdown.

Reported by: Joey Hess <joeyh@debian.org>

Date: Tue, 1 May 2007 23:48:01 UTC

Severity: important

Found in versions markdown/1.0.1-6, markdown/1.0.2~b7-2, markdown/1.0.1-3

Fixed in version markdown/1.0.2~b8-1

Done: Matt Kraai <kraai@debian.org>



Report forwarded to debian-bugs-dist@lists.debian.org, Matt Kraai <kraai@debian.org>:
Bug#421845; Package markdown.


Acknowledgement sent to Joey Hess <joeyh@debian.org>:
New Bug report received and forwarded. Copy sent to Matt Kraai <kraai@debian.org>.


Message #5 received at submit@bugs.debian.org:

From: Joey Hess <joeyh@debian.org>
To: Debian Bug Tracking System <submit@bugs.debian.org>
Subject: confused by nested divs with class attributes
Date: Tue, 1 May 2007 19:45:33 -0400
Package: markdown
Version: 1.0.2~b7-2
Severity: important

This is a test case that I derived from a much larger page in ikiwiki
that was failing to validate when built with this version of markdown.
The old version of markdown generates valid html for the same input.

joey@kodama:~>cat foo
<div class="inlinepage">
<div class="toggleableend">
foo
</div>
</div>
joey@kodama:~>markdown foo
<div class="inlinepage">
<div class="toggleableend">
foo
</div>

<p></div></p>

Note that the bug only happens if both divs have an attribute. If one
is literally just "<div>", the bug does not appear.

Given how important nested divs can be to css-friendly layout, this is a
pretty bad bug. :-/

-- System Information:
Debian Release: lenny/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: i386 (i686)

Kernel: Linux 2.6.20-1-686 (SMP w/1 CPU core)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages markdown depends on:
ii  perl                          5.8.8-7    Larry Wall's Practical Extraction 

markdown recommends no packages.

-- no debconf information

-- 
see shy jo

Information forwarded to debian-bugs-dist@lists.debian.org, Matt Kraai <kraai@debian.org>:
Bug#421845; Package markdown.


Acknowledgement sent to Joey Hess <joeyh@debian.org>:
Extra info received and forwarded to list. Copy sent to Matt Kraai <kraai@debian.org>.


Message #10 received at 421845@bugs.debian.org:

From: Joey Hess <joeyh@debian.org>
To: 421845@bugs.debian.org
Subject: partial analysis
Date: Tue, 1 May 2007 20:38:15 -0400
Seems that the use of gen_extract_tagged call in markdown extracts the
following block from my test case:

<div class="baz">
<div class="bar">
foo
</div>

Which leaves the final closing div dangling. 

It's having trouble finding the right closing tag. I checked, and
Text::Balanced is generating a closetag of "</div>". Problem is, this
matches the first closing div, rather than the second one. It doesn't
seem to notice that there is a nested div.

I am not yet sure if the bug is in how Text::Balanced is used, or if
Text::Balanced really cannot handle this. It seems to have code to check
for a nested tag, but that check is not firing.
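
The difference between stopping at the first closing tag and tracking nesting can be sketched as follows. This is only an illustration in Python, not the Text::Balanced code; the names `DIV_TOKEN`, `extract_first_close`, and `extract_balanced` are hypothetical, and the sample input is the test case from the original report.

```python
import re

# Hypothetical tokenizer: any opening <div ...> or closing </div> tag.
DIV_TOKEN = re.compile(r'<(/?)div\b[^>]*>')

SAMPLE = (
    '<div class="inlinepage">\n'
    '<div class="toggleableend">\n'
    'foo\n'
    '</div>\n'
    '</div>\n'
)


def extract_first_close(text):
    """Naive extraction: stop at the first </div>, ignoring nesting."""
    end = text.index('</div>') + len('</div>')
    return text[:end]


def extract_balanced(text):
    """Nesting-aware extraction: stop when the depth returns to zero."""
    depth = 0
    for m in DIV_TOKEN.finditer(text):
        depth += -1 if m.group(1) else 1
        if depth < 0:
            return None  # closing tag without a matching opening tag
        if depth == 0:
            return text[:m.end()]
    return None  # ran out of input before the outer div was closed


if __name__ == '__main__':
    print(extract_first_close(SAMPLE))  # leaves the final </div> behind
    print(extract_balanced(SAMPLE))     # includes both closing tags
```

The naive version reproduces the symptom described here: the block ends at the inner closing tag and the outer one is left dangling.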

-- 
see shy jo

Information forwarded to debian-bugs-dist@lists.debian.org, Matt Kraai <kraai@debian.org>:
Bug#421845; Package markdown.


Acknowledgement sent to Joey Hess <joeyh@debian.org>:
Extra info received and forwarded to list. Copy sent to Matt Kraai <kraai@debian.org>.


Message #15 received at 421845@bugs.debian.org:

From: Joey Hess <joeyh@debian.org>
To: 421845@bugs.debian.org
Subject: hmm
Date: Tue, 1 May 2007 21:41:51 -0400
Actually, markdown 1.0.1-6 does also fail to handle these nested divs.

-- 
see shy jo

Bug marked as found in version 1.0.1-6. Request was from Joey Hess <joeyh@debian.org> to control@bugs.debian.org. (Wed, 02 May 2007 01:45:03 GMT)


Information forwarded to debian-bugs-dist@lists.debian.org, Matt Kraai <kraai@debian.org>:
Bug#421845; Package markdown.


Acknowledgement sent to Matt Kraai <kraai@ftbfs.org>:
Extra info received and forwarded to list. Copy sent to Matt Kraai <kraai@debian.org>.


Message #22 received at 421845@bugs.debian.org:

From: Matt Kraai <kraai@ftbfs.org>
To: Joey Hess <joeyh@debian.org>, 421845@bugs.debian.org
Subject: Re: Bug#421845: partial analysis
Date: Tue, 1 May 2007 21:27:49 -0700
On Tue, May 01, 2007 at 08:38:15PM -0400, Joey Hess wrote:
> Seems that the use of gen_extract_tagged call in markdown extracts the
> following block from my test case:
> 
> <div class="baz">
> <div class="bar">
> foo
> </div>
> 
> Which leaves the final closing div dangling. 
> 
> It's having trouble finding the right closing tag. I checked, and
> Text::Balanced is generating a closetag of "</div>". Problem is, this
> matches the first closing div, rather than the second one. It doesn't
> seem to notice that there is a nested div.
> 
> I am not yet sure if the bug is in how Text::Balanced is used, or if
> Text::Balanced really cannot handle this. It seems to have code to check
> for a nested tag, but that check is not firing.

The regular expression that Markdown passes to Text::Balanced to match
opening tags contains a backreference to the attribute value quote
character.  Text::Balanced wraps this expression in another set of
parentheses when matching for a nested tag, which breaks this
backreference.
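
The renumbering effect described above can be sketched with a small example. It uses Python's `re` module as a stand-in for Perl, and `OPEN_TAG` is a hypothetical, much simplified pattern, not the expression Markdown actually passes to Text::Balanced. One difference to keep in mind: Python rejects a backreference to a still-open group when compiling the pattern, whereas in Perl, as far as I understand, the pattern compiles and the match simply fails.

```python
import re

# Hypothetical, simplified opening-tag pattern (not Markdown's real one):
# group 1 captures the quote character and \1 must repeat it.
OPEN_TAG = r'<div\s+class=(["\'])[^"\']*\1>'


def matches_bare(text):
    """Match with the pattern used as written."""
    return re.match(OPEN_TAG, text) is not None


def matches_wrapped(text):
    """Match after wrapping the pattern in one more capturing group."""
    try:
        # The outer parentheses become group 1, so \1 now points at the
        # enclosing group, which is still open, instead of at the quote.
        return re.match('(' + OPEN_TAG + ')', text) is not None
    except re.error:
        # Python refuses such a pattern when compiling it.
        return False


if __name__ == '__main__':
    tag = '<div class="bar">'
    print(matches_bare(tag))     # the bare pattern matches the tag
    print(matches_wrapped(tag))  # the wrapped pattern does not
```

Either way the wrapped pattern never recognizes the nested opening tag, which is consistent with the nested-tag check not firing.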

I don't know whether this is a bug in Markdown or Text::Balanced, nor
do I see a good way to fix it.  Help!

-- 
Matt



Information forwarded to debian-bugs-dist@lists.debian.org, Matt Kraai <kraai@debian.org>:
Bug#421845; Package markdown.


Acknowledgement sent to Joey Hess <joeyh@debian.org>:
Extra info received and forwarded to list. Copy sent to Matt Kraai <kraai@debian.org>.


Message #27 received at 421845@bugs.debian.org:

From: Joey Hess <joeyh@debian.org>
To: Matt Kraai <kraai@ftbfs.org>, 421845@bugs.debian.org
Subject: Re: Bug#421845: partial analysis
Date: Wed, 2 May 2007 01:26:21 -0400
Matt Kraai wrote:
> The regular expression that Markdown passes to Text::Balanced to match
> opening tags contains a backreference to the attribute value quote
> character.  Text::Balanced wraps this expression in another set of
> parentheses when matching for a nested tag, which breaks this
> backreference.

Aha. That explains it[1].

> I don't know whether this is a bug in Markdown or Text::Balanced, nor
> do I see a good way to fix it.  Help!

Well, one way is to go to a more complex regexp, with two halves, each
half being hardcoded for one type of quoting.

         (?:                             # Match one attr name/value pair
                 \s+                             # There needs to be at least some whitespace
                                                 # before each attribute name.
                 [\w.:_-]+               # Attribute name
                 \s*=\s*
                 (?:
                   ".+?"                         # "Attribute value"
                 |
                   '.+?'                         # 'Attribute value'
                 )
         )*                              # Zero or more
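
As a rough check that this form survives being wrapped in another set of parentheses, here is a sketch using Python's `re.VERBOSE` mode in place of Perl's `/x`. The attribute sub-pattern is transcribed from above; the surrounding `<div` ... `>` in `OPEN_DIV` and the helper `matches` are assumptions added for illustration, not Markdown's actual expression.

```python
import re

# Attribute sub-pattern transcribed from the message above. It contains no
# capturing groups, so there is no backreference to renumber.
ATTRS = r'''
    (?:                 # Match one attr name/value pair
        \s+             # Whitespace before each attribute name
        [\w.:_-]+       # Attribute name
        \s*=\s*
        (?:
            ".+?"       # "Attribute value"
          |
            '.+?'       # 'Attribute value'
        )
    )*                  # Zero or more
'''

# Hypothetical opening-tag pattern built around the attribute sub-pattern.
OPEN_DIV = r'<div' + ATTRS + r'\s*>'


def matches(pattern, text):
    """Return True if the verbose-mode pattern matches at the start of text."""
    return re.match(pattern, text, re.VERBOSE) is not None


if __name__ == '__main__':
    for tag in ('<div class="inlinepage">', "<div class='toggleableend'>", '<div>'):
        # Both the bare pattern and the parenthesized one accept each tag.
        print(tag, matches(OPEN_DIV, tag), matches('(' + OPEN_DIV + ')', tag))
```

Because each quoting style is spelled out in its own branch, adding an outer capturing group only adds a group; it does not change what any part of the pattern refers to.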

-- 
see shy jo

[1] Why you're maintaining markdown, and not me, that is. ;-)

Information forwarded to debian-bugs-dist@lists.debian.org, Matt Kraai <kraai@debian.org>:
Bug#421845; Package markdown.


Acknowledgement sent to Matt Kraai <kraai@ftbfs.org>:
Extra info received and forwarded to list. Copy sent to Matt Kraai <kraai@debian.org>.


Message #32 received at 421845@bugs.debian.org:

From: Matt Kraai <kraai@ftbfs.org>
To: Joey Hess <joeyh@debian.org>
Cc: 421845@bugs.debian.org
Subject: Re: Bug#421845: partial analysis
Date: Wed, 2 May 2007 00:05:02 -0700
On Wed, May 02, 2007 at 01:26:21AM -0400, Joey Hess wrote:
> Well, one way is to go to a more complex regexp, with two halves, each
> half being hardcoded for one type of quoting.
> 
>          (?:                             # Match one attr name/value pair
>                  \s+                             # There needs to be at least some whitespace
>                                                  # before each attribute name.
>                  [\w.:_-]+               # Attribute name
>                  \s*=\s*
>                  (?:
>                    ".+?"                         # "Attribute value"
>                  |
>                    '.+?'                         # 'Attribute value'
>                  )
>          )*                              # Zero or more

[1]  I'll upload a fixed version as soon as 421371 is fixed.

-- 
Matt

1. And this shows why we should both be maintaining it.  :)



Reply sent to Matt Kraai <kraai@debian.org>:
You have taken responsibility.


Notification sent to Joey Hess <joeyh@debian.org>:
Bug acknowledged by developer.


Message #37 received at 421845-close@bugs.debian.org:

From: Matt Kraai <kraai@debian.org>
To: 421845-close@bugs.debian.org
Subject: Bug#421845: fixed in markdown 1.0.2~b8-1
Date: Thu, 10 May 2007 03:02:02 +0000
Source: markdown
Source-Version: 1.0.2~b8-1

We believe that the bug you reported is fixed in the latest version of
markdown, which is due to be installed in the Debian FTP archive:

markdown_1.0.2~b8-1.diff.gz
  to pool/main/m/markdown/markdown_1.0.2~b8-1.diff.gz
markdown_1.0.2~b8-1.dsc
  to pool/main/m/markdown/markdown_1.0.2~b8-1.dsc
markdown_1.0.2~b8-1_all.deb
  to pool/main/m/markdown/markdown_1.0.2~b8-1_all.deb
markdown_1.0.2~b8.orig.tar.gz
  to pool/main/m/markdown/markdown_1.0.2~b8.orig.tar.gz



A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to 421845@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Matt Kraai <kraai@debian.org> (supplier of updated markdown package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@debian.org)


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Format: 1.7
Date: Wed, 09 May 2007 19:50:43 -0700
Source: markdown
Binary: markdown
Architecture: source all
Version: 1.0.2~b8-1
Distribution: experimental
Urgency: low
Maintainer: Matt Kraai <kraai@debian.org>
Changed-By: Matt Kraai <kraai@debian.org>
Description: 
 markdown   - Text-to-HTML conversion tool
Closes: 421845 421849
Changes: 
 markdown (1.0.2~b8-1) experimental; urgency=low
 .
   * New beta release, closes: #421845.
   * Remove trailing spaces in debian/changelog and debian/rules.
   * Remove debian/changelog~ and debian/control~.
   * Add syntax documentation written by Joey Hess.
   * Install the upstream changelog and README, closes: #421849.
Files: 
 60688c5fab56f82ef62f76fdc4c18c5d 535 web optional markdown_1.0.2~b8-1.dsc
 a887162bdb39e960fd87733cec231079 18822 web optional markdown_1.0.2~b8.orig.tar.gz
 9c7b3e4716baa27d6babb4b19174a4c1 5061 web optional markdown_1.0.2~b8-1.diff.gz
 9a900d62384db62cc01f1e9096c0228d 26188 web optional markdown_1.0.2~b8-1_all.deb

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGQolEfNdgYxVXvBARAt+bAJ4/5T3J0jbBCbj03EL50mtnoZ2xBwCeOpXq
XctVs8oZcd56jk5o+P4Oykg=
=51rq
-----END PGP SIGNATURE-----




Marked as found in versions markdown/1.0.1-3. Request was from Andreas Beckmann <anbe@debian.org> to control@bugs.debian.org. (Sat, 02 Nov 2013 15:57:25 GMT)


Bug archived. Request was from Debbugs Internal Request <owner@bugs.debian.org> to internal_control@bugs.debian.org. (Mon, 05 Dec 2016 09:21:24 GMT)


Bug unarchived. Request was from Don Armstrong <don@debian.org> to control@bugs.debian.org. (Wed, 07 Dec 2016 01:33:36 GMT)




Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Wed Oct 11 23:39:03 2017; Machine Name: buxtehude
