
Re: How do I use DTD's with Scala 2.3.0's XML object literals?: msg#00521


Subject: Re: How do I use DTD's with Scala 2.3.0's XML object literals?

Raphael,

For the most part I went out of my way to implement the spec as it is... and I don't claim that an XML object conforms to a DTD.

But for all situations where it matters (when you serialize something out, actually producing an XML document), you can have your DTD (and validate against it, too).

Processing instructions are supported.

Could you be more specific about what you would like to see changed? I offered two alternatives in my preceding mail, neither of which I like. I am not sure which one you would prefer.

cheers,
Burak

On 1/30/07, Raphael James Cohn <raphaelcohn@xxxxxxxxx> wrote:
Burak,
 
Thank you for your comprehensive reply. I agree with you; DTDs are a nuisance, and they, along with processing instructions, make life hard for API developers and, potentially, users ("one has to go through the docElem method first, which one may find confusing and inconvenient"). However, if that's the XML spec, then we should stick with it. XML _is_ awkward to work with, primarily because its data structures are completely unlike those in use in most modern programming languages. That's why other pragmatic data forms are popular in the agile community (eg JSON and YAML; others argue that augmented TSV is good for tabular data). The work you've done with Scala makes it a whole lot easier to work with, and I think you should be very proud of that.
 
I understand why you may not want to change this behaviour, but please be cautious. If you don't completely implement the spec, put it up in lights. Why? Because it violates the Principle of Least Surprise. Perhaps we could at least have a sensible compiler message, eg "Processing instructions and DOCTYPEs are not supported"? After all, people (including me) are going to just paste in their XML... It should also be very hard for a user to use a library and produce an invalid document. Any means by which it can be done (and which won't be supported) should probably be noted. For example, XHTML is not 'officially' XHTML without the DTD. Whilst this might seem moot, there are at least two instances I can think of where it matters: IE6 and Excel's interpretation of CSS attributes. In both, the presence or absence of the DTD causes rendering changes!
On another note, I was thinking it would be nice to have a way to tell the compiler to ignore whitespace. This matters in some XML documents, eg Jetty configuration files.
 
I'd also love to be able to extend this syntax to general string and text templating. I think David Pollak referred to this. With multi-line strings, this would be a killer app. It would reduce to about 2 days what has just taken me 2 weeks in Java...
 
Ta for the prompt reply,
 
Raph
 
----- Original Message ----
From: Burak Emir <burak.emir@xxxxxxxxx>
To: Raphael James Cohn <raphaelcohn@xxxxxxxxx>
Cc: scala@xxxxxxxxxxxxxx
Sent: Tuesday, 30 January, 2007 9:46:46 AM
Subject: Re: How do I use DTD's with Scala 2.3.0's XML object literals?

Hi Raphael,

On 1/29/07, Raphael James Cohn <raphaelcohn@xxxxxxxxx> wrote:
Hi,

I'm using scala's wickedly cool XML object literals (great idea - no more need
for Freemarker or Velocity for XML - could we have them for other things, too,
please... like a compile time templating engine!) to generate XHTML reports for
rehersal. To create valid XHTML, I need to have a DTD before the first element.
However, the compiler complains if I have a method like this:-

    private def template(category: TestCategory) =
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">


This won't work because I consciously left out DTDs from the parser. The biggest trouble is that DTDs would mess up Scala's "data model" (the XML people's slang for XML object mapping). They may only appear at the top :/ So there is a class scala.xml.Document (http://www.scala-lang.org/docu/files/api/scala/xml/Document.html) which can contain XML nodes but is not regarded as a node itself.

There are routines for writing out XML with a prefixed DTD declaration though, see Utility.toXML
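
A minimal sketch of writing a node out with a DOCTYPE in front of it, assuming the scala-xml library as it ships for current Scala releases (the API in 2.3.0 may well differ). It uses XML.write together with scala.xml.dtd.DocType rather than Utility.toXML; the names XhtmlWriter and render are made up for illustration.

```scala
import java.io.StringWriter
import scala.xml.{Node, XML}
import scala.xml.dtd.{DocType, PublicID}

// Hypothetical helper, not part of the thread or of the library.
object XhtmlWriter {
  // The DOCTYPE lives outside the node tree, so it is described separately.
  val xhtmlStrict: DocType = DocType(
    "html",
    PublicID("-//W3C//DTD XHTML 1.0 Strict//EN",
             "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"),
    Nil) // no internal subset

  // Serializes the XML declaration, then the DOCTYPE, then the node itself.
  def render(page: Node): String = {
    val out = new StringWriter
    XML.write(out, page, "UTF-8", true, xhtmlStrict)
    out.toString
  }
}
```

With this, the template method can keep returning a plain element from an XML literal, and the DTD is only attached at the point where the report is written out, e.g. XhtmlWriter.render(template(category)).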

<head>
<meta http-equiv="content-type" content="application/xhtml+xml; charset=UTF-8"/>
<title>{category.name} Testing Results</title>
<style type="text/css" media="all">
(SNIPPED lots of content)

Removing the DTD causes it to compile.

Does anybody know why?

Additionally, does the compiler or API understand XML and HTML literals? Is there
a utility method I could use to convert strings?

The compiler parses your XML and gives you the objects, no more, no less (XHTML is OK, plain HTML is not).
Of course, the API offers methods for parsing XML, the easiest being ConstructingParser (http://www.scala-lang.org/docu/files/api/scala/xml/parsing/ConstructingParser.html):

import scala.io.Source
import scala.xml.parsing.ConstructingParser
val src = Source.fromString("<root/>") // placeholder input; there's also fromFile, fromInputStream, fromURL...
val prs = ConstructingParser.fromSource(src, true /* preserve whitespace */)
val doc = prs.document() // parses the document, returns a scala.xml.Document
val root = doc.docElem // the content node

If you know that your source does not have a prologue (DTD, processing instructions or comments), you can directly tell the parser to grab the content node like this:

val root = prs.element(scala.xml.TopScope)(0) // parse an element, assuming no namespace bindings.

The (0) at the end is necessary to get the "first" element, because the routine is designed to return a node sequence in some cases.
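
For the case where the source does have a prologue, here is a small sketch under the same assumption of a current scala-xml release. The input string, the file name note.dtd and the object name PrologueDemo are hypothetical; the point is only that the DOCTYPE ends up on the Document and the element is reached through docElem.

```scala
import scala.io.Source
import scala.xml.parsing.ConstructingParser

// Hypothetical demo object, for illustration only.
object PrologueDemo {
  // A tiny document with a DOCTYPE declaration before the root element.
  val input: String =
    """<!DOCTYPE note SYSTEM "note.dtd">
      |<note><to>Raph</to></note>""".stripMargin

  def rootLabel: String = {
    val parser = ConstructingParser.fromSource(Source.fromString(input), true)
    val doc = parser.document() // a scala.xml.Document, which is not a Node
    doc.docElem.label           // the root element is only reachable via docElem
  }
}
```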


Many thanks

Raph

For information, the compile error (I'm using scalac in ant, with Scala 2.3.0) is:-


I don't think we would add support for parsing the DTD. It is very easy to achieve, but from a design point of view it adds nothing and causes trouble. We certainly don't want to make scala.xml.Document a node (something that can appear anywhere in an XML document).

The alternative, making the parser clever enough to understand that something with a doctype declaration is not a node but a scala.xml.Document, is also annoying, because then in order to do anything with it (traverse, transform, match, XPath...), one has to go through the docElem method first, which one may find confusing and inconvenient.

cheers,
Burak

--
Burak Emir
Research Assistant / PhD Candidate
Programming Methods Group
EPFL, 1015 Lausanne, Switzerland
http://lamp.epfl.ch/~emir
