Re: Thought on future of XMLC (msg#00119, java.enhydra.xmlc)
Hi Arno!

Arno Schatz <list@xxxxxxxxxxxxx> writes:
> We are setting the bar a bit higher if we require html being correct to the
> spec. Most html developers can live without the spec until they get to know
> XMLC and JTidy. On my other project, I am working with an external html
> design studio. Even though they are very capable, I again have to explain
> what correct HTML is and what not.

Is it too much to ask people to know how to use their tools? You are doing the internet a big favor by forcing them to learn how to write correct HTML. Sorry you have to deal with this attitude; it sounds frustrating.

It was a major mistake for browsers to ever support invalid HTML. It's why there is so much invalid HTML and why it's so hard to make compatible pages, develop new web browsers, or write tools that analyze and index HTML. If browsers had simply rejected bad HTML, there wouldn't be any.

> Technically XMLC does not need to do that either, if it adopts the concept
> of not changing more of the original html as necessary.

Actually, it does. XMLC transforms the document into an object hierarchy; the original document is gone. The DOM does not represent the formatting of the text file that was input; it represents the data in it. There is no way to reproduce the original formatting, even if valid, only the original meaning. If the input is invalid, it leads to cases that are impossible to handle. For instance, I have seen <BR></BR> used. This is not valid HTML, and it doesn't render the same as <BR>. Yet there is no way for XMLC to represent this in the DOM; it would always output <BR>. Modifying the DOM to be able to represent any type of invalid HTML would be a monumental undertaking.

> However, it insists on generating the HTML from nodes even if they were not
> changed (instead of taking the original HTML from the file), which makes it
> slower and IMHO more difficult to use.

This seems very hard to implement and would result in something complex and unpredictable.
XMLC has no way to know what is going to be modified. It would have to know how to get back to the original parts of the document for every subtree; however, defining the subtrees means parsing the document. The rendering of parts of the document would differ depending on whether it came from the DOM or from the original document. If this is really needed, I think it indicates that XMLC is fundamentally flawed.

> One great thing about xmlc is, that it lets the html developer work the way
> they are used to (with static html). This being said, I don't think we
> should force them to validate their HTML, because that is not the way they
> work. I think the tools should work the way most people are used to work
> successfully on projects. (Not adapting the people to a way of working we
> think they should work.)

Java programmers are expected to write valid Java, so why shouldn't HTML developers write valid HTML? Sorry, I just don't buy this. It's why there is an HTML standard. HTML is a file format, not free-form text. There are a lot of unemployed web designers; they can be replaced!

If the way people want to work is to create something that isn't HTML and only looks like it, well, it isn't HTML. I don't think we (as technical people) should be encouraging improper use of tools, even if it is the way they `like' to work. We should have the patience to educate people who don't come from a computer background on how computer software works and why doing something that is almost right makes it an artificial intelligence problem to deal with. Programmers have to make compromises to implement things within the constraints of the languages they use; HTML developers might have to compromise their design a little to be able to do it in HTML, but that is just part of using software.

> taking the design goal of xmlc serious, that it only supports valid
> documents, then why not breaking the build if xmlc encounters any warning?

I actually think it should.
At least in the case where it needs to modify the contents to produce a valid document. This seems to be a big part of the problem you are having: JTidy tries to correct the document to valid HTML, but the results are not predictable. Tidy was designed to correct the document, after which one examines the result and edits it as needed. However, when using XMLC, one can't edit the results; one has to go back to the original document. There is a big delay between creating and compiling. Since they are warnings, people ignore them (especially when some of the warnings are debatable as being actual warnings). Sorry, I made a big mistake by not figuring out how to turn these into errors. IMHO, this would be a good thing to add to XMLC, with an option to re-enable the document correction.

> Why support the Swing parser? I think these two things are already
> compromises.

The Swing parser is there for historic reasons; it was the original parser, since there was no other freely available parser at the time. It probably should not be used.

If people really want to use invalid HTML, XMLC is the wrong tool. One of the tools based on string substitution is the way to go. However, IMHO, invalid HTML is the wrong thing to do no matter what tool they are using.

Take care
Mark
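
To make the <BR></BR> point above concrete, here is a minimal sketch of a parse-and-serialize round trip. It uses the JDK's own XML parser and serializer as a stand-in, not XMLC's HTML parsers, so it only illustrates the general DOM behavior: the tree stores an empty element in one form, and the way it was typed in the source file is not recoverable. The class name DomRoundTrip and the method roundTrip are made up for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of XMLC.
public class DomRoundTrip {

    /** Parses the markup into a DOM tree, then serializes the tree back to text. */
    public static String roundTrip(String markup) {
        try {
            // After this call only the object hierarchy exists; the source text is gone.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));

            // Identity transform: writes the tree out without changing it.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        // Two different spellings of the same empty element in the source text.
        String a = roundTrip("<p>one<br></br>two</p>");
        String b = roundTrip("<p>one<br/>two</p>");

        System.out.println(a);
        System.out.println(b);
        // Both inputs build the same tree, so the serialized forms are identical.
        System.out.println(a.equals(b)); // prints true
    }
}
```

In XML the two spellings mean the same thing, so nothing is lost here. The trouble with HTML is that browsers treat <BR></BR> and <BR> differently, yet the tree has only one way to hold an empty BR element.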