Re: [xmlc] 2.3beta getElementById() behavior: msg#00030

java.enhydra.xmlc

Subject: Re: [xmlc] 2.3beta getElementById() behavior

Quoting Erik Rasmussen <rasmussenerik@xxxxxxxxx>:

> Right. I'm more on the XHTML side of things, but I do some XML
> parsing too (SAX mostly). It seems to me that if you're depending on
> elements having an "id" attribute in XML-land, then you're doing
> something wrong. The "id" attribute is very HTML-specific.

Yes, but XHTML 1.0 is the XML-ized version of HTML 4.01. HTML 4.01 defines "id"
as of type ID (even though the HTML DTDs are totally invalid), and so does
XHTML 1.0. XHTML is XML, where no attribute is of type ID unless a DTD declares
it so (or DOM3's setIdAttribute() is used). The XHTML DTDs do declare it, so
the "id" attribute is predictably of type ID, even if it loses its ID'ness
because of issues like the importNode/adoptNode stuff we've been discussing.

You're right that it doesn't make sense in XML, and that's why the fallback
recursion is limited to specific DOM implementations: those implementing
HTMLDocument, of which the XHTML DOM is one.
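
To make the fallback concrete, here is a minimal sketch of the idea, written
against the plain org.w3c.dom API rather than the actual XMLC/Xerces classes.
The class and method names (IdFallbackSketch, findById, walk) are hypothetical
and only illustrate the approach: ask the DOM for a true ID first, and only
then recurse looking for an attribute literally named "id".

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration; not the XMLC or Xerces implementation.
class IdFallbackSketch {

    // Try the DOM's own ID lookup first, then fall back to a tree walk.
    static Element findById(Document doc, String id) {
        Element hit = doc.getElementById(id);
        if (hit != null) {
            return hit;
        }
        return walk(doc.getDocumentElement(), id);
    }

    // Depth-first search for an element whose "id" attribute equals id.
    static Element walk(Element el, String id) {
        if (el == null) {
            return null;
        }
        if (id.equals(el.getAttribute("id"))) {
            return el;
        }
        for (Node c = el.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element) {
                Element found = walk((Element) c, id);
                if (found != null) {
                    return found;
                }
            }
        }
        return null;
    }

    // Parse a string with the default JAXP parser (no DTD, no validation).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<html><body><div id=\"main\">x</div></body></html>");
        // Found by the fallback walk, since no DTD declares "id" as an ID here.
        System.out.println(findById(doc, "main").getTagName());
    }
}
```

The walk costs a full traversal per lookup, which is one reason to keep it as
a fallback only.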

> The
> interface for an XML document shouldn't even have a getElementById()
> method, in my opinion. The XHTMLDocument interface (or whatever it's
> called) should extend XMLDocument and HTMLDocument, thereby
> acquiring getElementById() from the latter.
>

There is no "XMLDocument" interface, just "Document". But, in fact, the
XHTMLDocument interface *does* extend the "HTMLDocument" interface, which
extends the "Document" interface. I guess you must be saying that "Document"
shouldn't have getElementById() and only "HTMLDocument" and extensions would
have that.

Two comments about that...

1. Your opinion is shared by Elliotte Rusty Harold. He implemented his XOM
library without getElementById(). And if you ask him if it will be implemented
at some point, I'm quite positive that you'll get an emphatic "NO"! Kindred
spirits, I guess :-)

2. Keep in mind that the fact that getElementById() ends in "Id" doesn't imply
anything about the attribute name. The fact that the HTML DOM chose a specific
attribute, "id", to represent an attribute of type ID says nothing about any
other markup dialect. In fact, in HTML, it's not even really of type ID,
because there's no DTD defining it as such (again, the HTML DTDs are invalid,
and if you try to use DOM3's normalizeDocument() with validation enabled and
provide the path to one of the HTML DTDs, it will fail miserably, saying it
can't parse the DTD). The XHTML 1.0 DTDs are valid (other than Basic, which is
broken) and define the attribute "id" as of type ID. So, in XHTML, they are
true IDs in both the HTML and XML sense.

Personally, I don't see an issue with defining certain attributes as of type ID.
I just wish they stayed consistent. Right now, you need a DTD defining certain
attributes as of type ID or you need to use DOM3's setIdAttribute(). XML
Schema, as I understand it, has no concept of defining attributes as of type
ID. That inconsistency notwithstanding, getElementById() is a perfectly
reasonable method to have on "Document", not just HTMLDocument.
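
As a rough illustration of the two routes to ID'ness mentioned above, the
sketch below uses the JDK's default JAXP parser and the standard DOM Level 3
calls. The class name, the sample documents, and the element names are made
up for the example. One document has no DTD and gets its ID via
setIdAttribute(); the other declares "id" as type ID in an internal DTD
subset.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical illustration of DTD-declared IDs vs. DOM3 setIdAttribute().
class IdTypeSketch {

    // No DTD: "id" is just an ordinary attribute as far as XML is concerned.
    static final String PLAIN = "<list><item id=\"a\"/></list>";

    // Internal DTD subset declaring item/@id as type ID.
    static final String WITH_DTD =
            "<!DOCTYPE list ["
            + "<!ELEMENT list (item*)>"
            + "<!ELEMENT item EMPTY>"
            + "<!ATTLIST item id ID #IMPLIED>"
            + "]><list><item id=\"a\"/></list>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    static Element firstItem(Document doc) {
        return (Element) doc.getElementsByTagName("item").item(0);
    }

    public static void main(String[] args) {
        Document plain = parse(PLAIN);
        System.out.println("no DTD, before: " + (plain.getElementById("a") != null));

        // DOM3: explicitly mark the existing "id" attribute as being of type ID.
        firstItem(plain).setIdAttribute("id", true);
        System.out.println("no DTD, after: " + (plain.getElementById("a") != null));

        Document typed = parse(WITH_DTD);
        System.out.println("with DTD: " + (typed.getElementById("a") != null));
    }
}
```

Note that setIdAttribute() marks one attribute on one element; it does not
declare "id" as an ID for every element in the document.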

> It seems like it wouldn't be too difficult for
> XHTMLElementImpl.cloneNode() and XHTMLElementImpl.importNode() to
> access its XHTMLDocument's id map. The implementations of these
> methods in XMLElementImpl wouldn't care about maintaining maps of any
> attributes.
>

Well, right now, all the ID magic takes place in the core DOM implementation of
Xerces. It's the fallback recursion that is localized to HTMLDocumentImpl,
LazyHTMLDocumentImpl, and XHTMLBaseDocumentImpl. Localizing the core ID magic
to HTMLDocumentImpl would be totally pointless because HTML documents aren't
validated and, therefore, never know anything about the ID'ness of the "id"
attribute, other than simply assuming it is the de facto attribute treated as
an attribute of type "ID". It would also make it difficult for various DOM
implementations to share logic, and DOM3's ability to define certain
attributes as of type ID wouldn't exist.

> It all seems pretty straightforward to me, but I'm no expert in XML
> parsing or DTDs, so I'm sure there are more complicated issues that
> I'm missing. I'd be against a standard "xml:id" attribute.
>

I'm not clear on why. There are already other magic XML attributes such as
"xml:space". "xml:id" would clear up the whole "what is an Id in XML" debate
and allow getElementById() to work for well-formed, but not necessarily
validated, XML documents. There would be no requirement for a DTD to define
it, and you'd have attributes of type ID even when using Schema instead of DTD
(or no DTD/Schema at all).
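
To show what that could look like from the application side, here is a
hedged sketch that does by hand what an "xml:id"-aware parser would do on its
own: walk a namespace-aware DOM and mark every "xml:id" attribute as an ID
with DOM3's setIdAttributeNS(). The class and method names are hypothetical,
and nothing here assumes the parser itself knows about "xml:id".

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration: register xml:id attributes as IDs without a DTD.
class XmlIdSketch {

    // Namespace awareness is needed so "xml:id" resolves to the XML namespace.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            return f.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Recursively mark each xml:id attribute as being of type ID.
    static void registerXmlIds(Element el) {
        if (el.hasAttributeNS(XMLConstants.XML_NS_URI, "id")) {
            el.setIdAttributeNS(XMLConstants.XML_NS_URI, "id", true);
        }
        for (Node c = el.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element) {
                registerXmlIds((Element) c);
            }
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<doc><sec xml:id=\"s1\"/><sec xml:id=\"s2\"/></doc>");
        registerXmlIds(doc.getDocumentElement());
        System.out.println(doc.getElementById("s2").getTagName());
    }
}
```

The single up-front walk replaces the per-lookup recursion of a fallback, and
it works on documents that are merely well-formed.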

> just my $0.02,

Noted, but I can't say I agree.

Jake

> Erik
>
> On Jan 26, 2007, at 3:08 AM, Jacob Kjome wrote:
> > Keep in mind that the discussion here is XML-centric. Because
> > there is no inherent "id" attribute in XML (people have proposed a
> > standard "xml:id" attribute), there is no fallback like there is in
> > the HTML DOM, where it can simply recurse the DOM for elements with
> > the "id" attribute and see if they match the desired value. I
> > like what Elliotte Rusty Harold had to say about it and agree that
> > the ID'ness of attributes should be carried over. And it sounds
> > like it's simply an issue with cloneNode(). Which might make
> > sense if the original implementors of cloneNode() decided that if
> > they cloned ID'ness and someone cloned a node with an attribute of
> > type "ID", then there'd be duplicate IDs, creating an invalid
> > DOM. However, there are lots of ways for a user to mess up a DOM and
> > make it invalid. What's one more, especially when users are
> > likely to know what they are doing and avoid doing bad things?
> >
> >
> >> But again, that's just my opinion as an end-user outside the DOM
> >> black box. I'm pretty sure that javascript DOM behaves the way
> >> I've described. But then javascript is out there in Anything Goes
> >> Land without DTDs to respect.
> >
> > The JavaScript DOM is HTML-centric and treats "id" attributes as of
> > type "ID". You said it yourself. There's no "DTDs to respect".
> > As such, it must make assumptions about what is an Id. They simply
> > don't have to face the issues that Xerces (or any parser) has with
> > XML. HTML is easy.
>
>
