
[ tidy-Bugs-1264455 ] Multiple URLs in profile incorrectly modified: msg#00079

web.html-tidy.tracker

Subject: [ tidy-Bugs-1264455 ] Multiple URLs in profile incorrectly modified

Bugs item #1264455, was opened at 2005-08-20 01:08
Message generated for change (Comment added) made by hoehrmann
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=390963&aid=1264455&group_id=27659

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: HTML/XML/XHTML Parser
Group: Current - all platforms
Status: Open
Resolution: Wont Fix
Priority: 5
Submitted By: Klaus Johannes Rusch (krusch)
Assigned to: Björn Höhrmann (hoehrmann)
Summary: Multiple URLs in profile incorrectly modified

Initial Comment:
Multiple URLs in the profile attribute of a head
element are incorrectly converted into a single URL
with escaped whitespace, for example

<head profile="http://gmpg.org/xfn/11
http://dublincore.org/documents/dcq-html/">

becomes

<head
profile="http://gmpg.org/xfn/11%20http://dublincore.org/documents/dcq-html/">

and a warning is issued:
line 2 column 1 - Warning: <head> escaping malformed
URI reference

HTML 4.01 reference:
http://www.w3.org/TR/REC-html40/struct/global.html#h-7.4.1

profile = uri [CT]
This attribute specifies the location of one or
more meta data profiles, separated by white space. For
future extensions, user agents should consider the
value to be a list even though this specification only
considers the first URI to be significant.
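
The two readings of the attribute value can be made concrete with a
small Python sketch. It is an illustration only, not Tidy's
implementation: the function names are hypothetical, and the escaping
step merely approximates the reported output by percent-encoding the
whitespace between the two URIs.

```python
from urllib.parse import quote

# Characters left untouched when escaping; "%" is included so that
# existing percent-escapes are not encoded a second time.
URI_SAFE = ":/?#[]@!$&'()*+,;=%~"


def escape_as_single_uri(value: str) -> str:
    """Read the whole attribute value as ONE URI (profile = uri).

    Runs of whitespace are collapsed to a single space, which is then
    percent-encoded, similar to the output shown in the report.
    """
    collapsed = " ".join(value.split())
    return quote(collapsed, safe=URI_SAFE)


def split_as_uri_list(value: str) -> list[str]:
    """Read the attribute value as a whitespace-separated list of URIs."""
    return value.split()


profile = "http://gmpg.org/xfn/11\nhttp://dublincore.org/documents/dcq-html/"
print(escape_as_single_uri(profile))
print(split_as_uri_list(profile))
```

Under the first reading the two profiles merge into one URI that
identifies neither of them; under the second both survive intact.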



----------------------------------------------------------------------

>Comment By: Björn Höhrmann (hoehrmann)
Date: 2005-08-21 01:43

Message:
Logged In: YES
user_id=188003

Well, I don't think changing it to profile="a" helps here:
we'd still get bug reports for treating profile="a b" as an
error, and if we changed Tidy to allow it, we would get bug
reports for not treating it as an error.

----------------------------------------------------------------------

Comment By: Klaus Johannes Rusch (krusch)
Date: 2005-08-21 01:17

Message:
Logged In: YES
user_id=365576


Input (valid according to the wording of the HTML 4.01
specification but not matching the profile=uri pattern)
<head profile="A B">

Intended interpretation according to the wording of the HTML
4.01 specification (only the first value is honored):
<head profile="A">

Yes, you can find arguments for both <head profile="A"> and
<head profile="A B"> in the HTML specification, which is
obviously not consistent here, and it is unlikely that an
agreement will be reached.
Given the interpretation instructions that only the first
value is currently honored and additional values are
reserved for future use, dropping the extra URIs would be
reasonable, though, and would not change the behaviour of a
conforming user agent.

<head profile="A%20B"> neither matches the intent of the
document author, nor the interpretation instructions in the
specification.
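
A minimal Python sketch of the "first URI only" reading described
above, assuming a user agent simply takes the first
whitespace-separated token. The helper name is hypothetical and is
not taken from Tidy or from any browser.

```python
def first_profile_uri(value: str) -> str:
    """Return the first whitespace-separated URI, or "" if there is none."""
    parts = value.split()
    return parts[0] if parts else ""


# With the original value the user agent sees profile A.
print(first_profile_uri("A B"))
# After escaping, the "first URI" is a different string altogether.
print(first_profile_uri("A%20B"))
```

Dropping the extra URI leaves the first value unchanged for such a
user agent, while escaping the whitespace replaces it with a URI the
author never wrote.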


----------------------------------------------------------------------

Comment By: Björn Höhrmann (hoehrmann)
Date: 2005-08-20 20:06

Message:
Logged In: YES
user_id=188003

Well, this specific issue has come up many, many times in the
past; my conclusion is that we don't do anything about it
until the HTML WG gets around to clarifying its specs. You
can easily avoid Tidy's behavior by using the --fix-uri
option.

----------------------------------------------------------------------

Comment By: Klaus Johannes Rusch (krusch)
Date: 2005-08-20 16:06

Message:
Logged In: YES
user_id=365576


The description does indicate, though, that while only one
URI is currently considered significant, more than one can be
provided.

Stripping additional URLs might be reasonable, since that is
what browsers are supposed to do and what the DTD and later
specifications support. Converting "url1 url2" to
"url1%20url2" is not, since the result is a URL that is
formally correct but no longer valid.

With the restriction to one URI and the lack of a
namespace-like mapping of meta elements to profile
definitions, the profile attribute looks pretty useless
anyway, but at least Tidy should not break the profile
definition.


----------------------------------------------------------------------

Comment By: Björn Höhrmann (hoehrmann)
Date: 2005-08-20 01:15

Message:
Logged In: YES
user_id=188003

It says "profile = uri" and that's what's been implemented
in Tidy ever since. This is a known problem with the HTML
and XHTML specifications. I'm happy to change Tidy's
behavior once the W3C clarifies this issue. I think it's
pretty clear that in HTML 4.01 and document types that
build on it only a single URI is allowed and the above just
describes error handling behavior. This is quite evident
from even recent specifications like the M12N in XML Schema
where the anyURI type is used here instead of list types.

----------------------------------------------------------------------




