Bugs item #1331849, was opened at 2005-10-19 09:06
Message generated for change (Comment added) made by cmwoods
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=390963&aid=1331849&group_id=27659
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: HTML/XML/XHTML Parser
Group: Current - all platforms
Status: Open
Resolution: None
Priority: 5
Submitted By: Christopher M. Woods (cmwoods)
Assigned to: Nobody/Anonymous (nobody)
Summary: Closing Tag ignored when Start is implied/inferred
Initial Comment:
It appears that Tidy ignores a closing tag whose
start tag was implied/inferred when the hash lookup
functionality (ELEMENT_HASH_LOOKUP) is enabled. A trace
into the code shows that:
- when creating a new tag from a token in the input
stream, Tidy calls lookup() [in tags.c] which will
check the Hash Table and return the item.
- when creating an implied/inferred tag, Tidy calls
LookupTagDef() [in tags.c] which does NOT check the
Hash Table (nor the custom tags table) - only the
original def table.
When Tidy encounters the closing tag for the element
(e.g. for table close in ParseTable() [in parser.c]),
it does a check to see if the tag pointers are the same
(not the contents of the pointers). This comparison
fails because one of the pointers is from the original
def table and the other is from the hash table.
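
A minimal sketch of the mismatch, to make the failure concrete. It is not Tidy's code: TagDef, cache_copy(), lookup_by_name(), lookup_by_id() and the two comparison helpers are hypothetical stand-ins, and the flat cache array stands in for the real hash table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a tag definition; not Tidy's real struct. */
typedef struct {
    int id;
    const char *name;
} TagDef;

enum { TAG_UNKNOWN, TAG_TABLE, TAG_TR, N_TAGS };

/* The "original def table". */
static const TagDef tag_defs[N_TAGS] = {
    { TAG_UNKNOWN, "unknown" },
    { TAG_TABLE,   "table"   },
    { TAG_TR,      "tr"      },
};

/* Stand-in for the hash table: it holds COPIES of definitions. */
static TagDef cache[N_TAGS];
static int cached[N_TAGS];

/* Copy entry i into the cache on first use and return the copy. */
const TagDef *cache_copy(int i)
{
    assert(i > TAG_UNKNOWN && i < N_TAGS);
    if (!cached[i]) {
        cache[i] = tag_defs[i];
        cached[i] = 1;
    }
    return &cache[i];
}

/* Path used for tags read from the input: returns the cached copy. */
const TagDef *lookup_by_name(const char *s)
{
    for (int i = 1; i < N_TAGS; ++i)
        if (strcmp(s, tag_defs[i].name) == 0)
            return cache_copy(i);
    return NULL;
}

/* Path used for implied tags: returns the original table entry. */
const TagDef *lookup_by_id(int tid)
{
    for (const TagDef *np = tag_defs + 1; np < tag_defs + N_TAGS; ++np)
        if (np->id == tid)
            return np;
    return NULL;
}

/* Identity comparison: false for a copy and its original. */
int same_tag_by_pointer(const TagDef *a, const TagDef *b)
{
    return a == b;
}

/* Key comparison: true whenever both definitions carry the same id. */
int same_tag_by_id(const TagDef *a, const TagDef *b)
{
    return a != NULL && b != NULL && a->id == b->id;
}
```

With these stand-ins, lookup_by_id(TAG_TABLE) and lookup_by_name("table") describe the same element but live at different addresses, so the pointer test fails while the id test succeeds.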
I see two options to address this:
1) Change all the pointer comparisons to key/unique
fields in the tag def structure.
2) Add the hash lookup functionality to LookupTagDef():
#ifdef ELEMENT_HASH_LOOKUP
    /* this breaks if declared elements get changed between two */
    /* parser runs since Tidy would use the cached version rather */
    /* than the new one */
    for (np = tags->hashtab[hash(s)]; np != NULL; np = np->next)
        if (tmbstrcmp(s, np->name) == 0)
            return np;
    for (np = tag_defs + 1; np < tag_defs + N_TIDY_TAGS; ++np)
        if (tmbstrcmp(s, np->name) == 0)
            return install(tags, np);
#else
    for (np = tag_defs + 1; np < tag_defs + N_TIDY_TAGS; ++np)
        if (np->id == tid)
            return np;
#endif /* ELEMENT_HASH_LOOKUP */
(Note: I did not include the declared tags list in this
functional change but it probably should be included
for consistency.)
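
One way option 2 could look, as a self-contained sketch rather than a patch. The names here (TagDef, hash_name(), install_copy(), lookup_by_name(), lookup_tag_def()) and the fixed-size pool are assumptions made for illustration and do not match Tidy's actual signatures. The idea is that the id-based lookup resolves the tag's name from the def table and then goes through the same cache as the name-based lookup, so both paths hand back the same pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tag definition with a chain link for the hash buckets. */
typedef struct TagDef {
    int id;
    const char *name;
    struct TagDef *next;
} TagDef;

enum { TAG_UNKNOWN, TAG_TABLE, TAG_TR, N_TAGS };
enum { HASH_SIZE = 31 };

static const TagDef tag_defs[N_TAGS] = {
    { TAG_UNKNOWN, "unknown", NULL },
    { TAG_TABLE,   "table",   NULL },
    { TAG_TR,      "tr",      NULL },
};

static TagDef pool[N_TAGS];          /* storage for the cached copies */
static size_t pool_used;
static TagDef *hashtab[HASH_SIZE];

unsigned hash_name(const char *s)
{
    unsigned h = 0;
    for (; *s != '\0'; ++s)
        h = (unsigned char)*s + 31u * h;
    return h % HASH_SIZE;
}

/* Copy a definition into the pool and link it into its bucket. */
const TagDef *install_copy(const TagDef *old)
{
    assert(pool_used < N_TAGS);      /* each tag is installed at most once */
    TagDef *np = &pool[pool_used++];
    unsigned h = hash_name(old->name);
    *np = *old;
    np->next = hashtab[h];
    hashtab[h] = np;
    return np;
}

/* Name lookup: cached copy if present, otherwise install one. */
const TagDef *lookup_by_name(const char *s)
{
    for (const TagDef *np = hashtab[hash_name(s)]; np != NULL; np = np->next)
        if (strcmp(s, np->name) == 0)
            return np;
    for (const TagDef *np = tag_defs + 1; np < tag_defs + N_TAGS; ++np)
        if (strcmp(s, np->name) == 0)
            return install_copy(np);
    return NULL;
}

/* Id lookup routed through the same cache as the name lookup. */
const TagDef *lookup_tag_def(int tid)
{
    for (const TagDef *np = tag_defs + 1; np < tag_defs + N_TAGS; ++np)
        if (np->id == tid)
            return lookup_by_name(np->name);
    return NULL;
}
```

In this sketch an implied table obtained via lookup_tag_def(TAG_TABLE) and a closing tag obtained via lookup_by_name("table") compare equal as pointers, which is what the existing comparisons rely on.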
----------------------------------------------------------------------
Comment By: Christopher M. Woods (cmwoods)
Date: 2005-10-19 13:59
Message:
Logged In: YES
user_id=576763
I must apologize here - I copied and pasted the wrong
snippet of code when I posted my suggested code change. What
I pasted does not compile, but it is conceptually what needs
to change.
If the Hash functionality is being used then it must be used
in this function as well so that the pointer comparisons
used throughout the code function as originally intended.
I will try to update with a corrected code change suggestion
once I look into the code some more and determine how to get
a clean reference to the hash table.
----------------------------------------------------------------------
Comment By: Christopher M. Woods (cmwoods)
Date: 2005-10-19 13:25
Message:
Logged In: YES
user_id=576763
I should add to my last comment that this issue is not
experienced with the new hash lookup functionality disabled
- as both functions then return pointers from the tag def table.
It manifests specifically because pointer comparisons are
occurring and the hash table functionality creates a copy of
the tag def - the pointers are different but the content is
the same. This could be addressed by changing the hash
logic to add an extra layer of indirection (a pointer
instead of a copy) or by something similar to the change I
specified in the original issue report.
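
A rough sketch of the extra-indirection idea, under the assumption that a bucket node can simply point at the def table entry. HashNode, install_ref() and the other names are hypothetical and not taken from tags.c. Because no copy is made, every lookup path returns an address inside the original def table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tag definition; not Tidy's real struct. */
typedef struct {
    int id;
    const char *name;
} TagDef;

enum { TAG_UNKNOWN, TAG_TABLE, TAG_TR, N_TAGS };
enum { HASH_SIZE = 31 };

static const TagDef tag_defs[N_TAGS] = {
    { TAG_UNKNOWN, "unknown" },
    { TAG_TABLE,   "table"   },
    { TAG_TR,      "tr"      },
};

/* A bucket node refers to the original definition instead of copying it. */
typedef struct HashNode {
    const TagDef *def;
    struct HashNode *next;
} HashNode;

static HashNode nodes[N_TAGS];       /* static pool, avoids malloc here */
static size_t nodes_used;
static HashNode *hashtab[HASH_SIZE];

unsigned hash_name(const char *s)
{
    unsigned h = 0;
    for (; *s != '\0'; ++s)
        h = (unsigned char)*s + 31u * h;
    return h % HASH_SIZE;
}

/* Record a reference to def in its bucket and return def itself. */
const TagDef *install_ref(const TagDef *def)
{
    assert(nodes_used < N_TAGS);     /* each tag is installed at most once */
    HashNode *n = &nodes[nodes_used++];
    unsigned h = hash_name(def->name);
    n->def = def;
    n->next = hashtab[h];
    hashtab[h] = n;
    return def;
}

/* Name lookup through the hash table; falls back to the def table. */
const TagDef *lookup_by_name(const char *s)
{
    for (const HashNode *n = hashtab[hash_name(s)]; n != NULL; n = n->next)
        if (strcmp(s, n->def->name) == 0)
            return n->def;
    for (const TagDef *np = tag_defs + 1; np < tag_defs + N_TAGS; ++np)
        if (strcmp(s, np->name) == 0)
            return install_ref(np);
    return NULL;
}

/* Id lookup straight from the def table, unchanged. */
const TagDef *lookup_by_id(int tid)
{
    for (const TagDef *np = tag_defs + 1; np < tag_defs + N_TAGS; ++np)
        if (np->id == tid)
            return np;
    return NULL;
}
```

The trade-off in this variant is one more pointer dereference per hash hit, in exchange for leaving the id-based lookup and all existing pointer comparisons untouched.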
----------------------------------------------------------------------
Comment By: Christopher M. Woods (cmwoods)
Date: 2005-10-19 13:19
Message:
Logged In: YES
user_id=576763
"When Tidy encounters the closing tag for the element
(e.g. for table close in ParseTable() [in parser.c]),
it does a check to see if the tag pointers are the same
(not the contents of the pointers). This comparison
fails because one of the pointers is from the original
def table and the other is from the hash table."
This means that if an inferred table is started, it does not
exit when </table> is encountered.
Using the sample file Tidy generates:
line 6 column 1 - Warning: <tr> isn't allowed in <body> elements
line 9 column 1 - Warning: inserting implicit <table>
line 25 column 1 - Warning: discarding unexpected </table>
line 9 column 1 - Warning: plain text isn't allowed in <table> elements
line 29 column 1 - Warning: missing <tr>
line 29 column 1 - Warning: discarding unexpected <table>
line 29 column 1 - Warning: <tr> isn't allowed in <tr> elements
line 37 column 1 - Warning: discarding unexpected </tr>
line 29 column 1 - Warning: <tr> isn't allowed in <tr> elements
line 45 column 1 - Warning: discarding unexpected </tr>
line 46 column 1 - Warning: discarding unexpected </table>
line 9 column 1 - Warning: plain text isn't allowed in <table> elements
line 9 column 1 - Warning: missing </table> before </body>
line 9 column 1 - Warning: <table> lacks "summary" attribute
The output is incorrectly ordered and the two tables are
merged into one (see input vs. output in browser).
----------------------------------------------------------------------
Comment By: Björn Höhrmann (hoehrmann)
Date: 2005-10-19 09:18
Message:
Logged In: YES
user_id=188003
Which kind of problem does this cause?
----------------------------------------------------------------------