Hi Cefn,
thanks for the files, I'll see what I can do to put a working example
together in the next day or so.
As far as parsers are concerned, you generally get two varieties:
top-down (of which recursive descent is the best-known form) and
bottom-up.
YAPP XSLT is an example of a pure recursive descent parser, with no
workarounds to fix any of the problems inherent in this type of
parser (e.g. poor error detection and reporting).
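For what it's worth, here is a minimal sketch of what a hand-written
recursive descent parser looks like, in Java, for simple arithmetic
such as '3 + 4'. It is not how YAPP is implemented; the class name
ArithmeticParser, the three-rule grammar in the comments and the
integer-only evaluation are all assumptions made up for illustration.
Each grammar rule becomes one method, and the first unexpected
character simply aborts the parse with no attempt at recovery.

```java
public class ArithmeticParser {
    private final String input;
    private int pos = 0;

    private ArithmeticParser(String input) {
        this.input = input;
    }

    /** Parses and evaluates an integer expression such as "3 + 4". */
    public static int evaluate(String source) {
        ArithmeticParser p = new ArithmeticParser(source);
        int value = p.expr();
        p.skipSpaces();
        if (p.pos < p.input.length()) {
            throw p.error("unexpected trailing input");
        }
        return value;
    }

    // expr ::= term (('+' | '-') term)*
    private int expr() {
        int value = term();
        while (true) {
            skipSpaces();
            if (accept('+')) value += term();
            else if (accept('-')) value -= term();
            else return value;
        }
    }

    // term ::= factor (('*' | '/') factor)*
    private int term() {
        int value = factor();
        while (true) {
            skipSpaces();
            if (accept('*')) value *= factor();
            else if (accept('/')) value /= factor();
            else return value;
        }
    }

    // factor ::= NUMBER | '(' expr ')'
    private int factor() {
        skipSpaces();
        if (accept('(')) {
            int value = expr();
            skipSpaces();
            if (!accept(')')) throw error("expected ')'");
            return value;
        }
        int start = pos;
        while (pos < input.length() && Character.isDigit(input.charAt(pos))) pos++;
        if (start == pos) throw error("expected a number");
        return Integer.parseInt(input.substring(start, pos));
    }

    // Consumes the given character if it is next in the input.
    private boolean accept(char c) {
        if (pos < input.length() && input.charAt(pos) == c) {
            pos++;
            return true;
        }
        return false;
    }

    private void skipSpaces() {
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) pos++;
    }

    // The only error handling: report the position and give up.
    private IllegalArgumentException error(String message) {
        return new IllegalArgumentException(message + " at position " + pos);
    }

    public static void main(String[] args) {
        System.out.println(evaluate("3 + 4"));
    }
}
```

Running main prints 7. Input such as '3 +' throws an
IllegalArgumentException naming the position where parsing stopped,
and nothing more.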
For Java, the main free parser generators are Antlr as you've
mentioned, JavaCC and JCup. I've used all of these in different
projects, and can't say I've found any one particularly better or
worse than any other. It's a case of horses for courses, and what
syntax you feel most comfortable working with. JavaCC and Antlr are
the two biggest contenders, JCup being a much smaller and less
supported project.
JCup is used for the XPath expression parser in the current o:XML
compiler/interpreter implementation. JavaCC is used for the various
parsers in SML. In fact, if you are interested in examples of XML-
oriented JavaCC parsers, check out SML [1] (source in CVS [2]).
If you're interested in parsers, compilers, etc., then I can recommend
the book Modern Compiler Design [3]. It's a good read and a great
resource; not exactly a beginner's volume, but it covers the whole
compiler/interpreter realm very thoroughly. Otherwise the so-called
Red Dragon Book [4] is supposed to be the classic in the genre.
Also remember that BNF is not a single, strictly defined grammar
language; there are probably as many variations as there are
implementations. A grammar written in BNF-like syntax for one
parser generator will not usually work with another.
hth,
/m
[1] http://www.o-xml.org/projects/sml.html
[2] http://cvs.pingdynasty.com/viewcvs/sml/src/javacc/com/pingdynasty/sml/
[3] http://www.amazon.co.uk/Modern-Compiler-Design-Computer-Science/dp/0471976970
[4] http://en.wikipedia.org/wiki/Compilers:_Principles%2C_Techniques%2C_and_Tools
On 25 Apr 2007, at 18:27, Cefn Hoile wrote:
WORKING EXAMPLE?
I've been trying to get a basic example of YAPP compiling a source
file which conforms to a BNF spec.
I've been using the example of arithmetic directly from the YAPP
page, and a text file containing '3 + 4' but I'm filling in a lot
of blanks as there doesn't seem to be a worked example with
verified code - how to run the full-fledged BNF+YAPP process, e.g.
* start with BNF and example source (plus a templates file for
'compilation')
* generate lexer and parser (which itself imports both lexer and
templates file)
* run the source through the parser to complete a compilation
(triggering rules in the templates file)
I attach the files I'm currently using to test run the process,
(which would be ideal supporting files for a worked example, if
they worked :). They are as minimal as possible, really just a
vehicle for the BNF right now.
The XSLT templating is being coordinated through ANT, and the full
output is shown at the end of this mail. The early lines of output
from ANT are just copying original files into a working directory
where they can be safely operated on. ANT terminates with an
unusual-looking error 'Content is not allowed in Prolog'.
Can you point me to an example which should actually work, so I can
complete this whole pipeline correctly at least once? I'm using
Xalan 2_7_0.
ALTERNATIVES TO YAPP?
I would very much appreciate your suggestions for alternatives to
YAPP, either in the form of concrete tools, or the abstract
concepts I can use to search out those tools. The worst case
scenario is that I have to build my own tool.
You explained that YAPP is really only a proof of concept, which I
appreciate. I'm also a bit concerned about the comments about the
impossibility of error reporting and matching which you warn about
in the YAPP information page. Strikes me that the first (error
reporting) is one of the main reasons you define a grammar. The
second I haven't fully understood the impact of yet for my case.
You seem to imply that there are other forms of parsing/compilation
which might avoid these limitations. Could you suggest some terms I
could use to search for these alternative approaches (especially
tools based on XML)?
I'm beginning to speculate whether I should be using RelaxNG
(similarly simple to BNF) and writing my own compiler routines
based on metadata inserted into the RelaxNG grammar file - elements
and attributes from a bespoke namespace. However I've not tackled
this kind of thing before, hence the desire to use a framework
which works out of the box (if I could get YAPP to run a test case).
Cefn
http://cefn.com/blog/
<data.zip>
P.S. YAPP also relies on the xalan:ignore function, not just the
nodeset function. I had a go at getting it to work with Saxon (the
XSLT 2.0 processor which I'm using for other projects). Not sure if
this is feasible/sensible. I reverted to Xalan as I couldn't trace
the xalan:ignore function.
P.P.S. Here is the full output from ANT
...