I have been developing a deterministic parser backend for spirit.
Is there any interest for this?
Here is what it is:
It is implemented as a separate rule type, deterministic_rule.
A parser is assigned to the rule, e.g.
rule=real_p;
and a function: build_expression is called on the parser. This
function will build a deterministic parse graph. As I understand it,
this approach will fit well with the new spirit2.
A node in this graph has the following data:
1. A range map between e.g. char and a new node. (May point to itself)
2. A node builder to support rules referring to other rules
3. A separate node to support recursion
Each rule holds a root node and a set of end node references.
If the input is exhausted at an end node reference, and we are not
inside a recursion,
the parsing is successful. Otherwise it fails.
The following parsers are supported so far:
alternative, |
difference, -
primitives, char_p,str_p,space_p etc.
sequence, >>
looping, *,+
optional,!
numerics, real_p,int_p,etc
directives,lexeme_d
other rules.
In addition, skip parsers can be used. Currently this must be added to
the rule type.
The following is a partial implementation (from memory) of a c++ literal.
I added two exceptions (keywords) as well, to show that it works as expected.
typedef deterministic_rule<std::string::iterator,int> rule_t;
uint_parser<unsigned, 16,4,4> hex_quad;
rule_t rule;
rule=(
(
alpha_p
| '_'
| ("\\b" >> hex_quad)
) >>
*(
alnum_p
| '_'
| ("\\b" >> hex_quad)
)
) - (str_p("int")|str_p("if"))
;
rule.simplify_nodes();
Simplifying a rule is not necessary. It removes duplicate brances,
increasing the clarity of the parse graph. In this case, it removed 6
nodes.
When combining rules, we need to build a temporary node builder in
place of the actual node unification process (serial join,parallel
join or imprint)
The node builder does not need to be expanded until that path is
traversed during parsing.
I also attach the output of the calculator grammar (with a skip
parser). This is not optimal with respect to the number of nodes
(Extra nodes generated in '(' >> expression >> ')')
Due to the complexity of this tree, I have not been able to fully
verify its validity. Only actual parsing will prove if it works.
What is missing:
Actions. I have some ideas of how to implement at least ast parse tree
node actions, but I need help in a general solution.
Better recursion implementation:
e.g. rule1= rule2 >> *(('+' >> rule2) | ('++' >> rule2)) will
currently not work.
(rule1= rule2 >> *(('+' >> rule2) | ('-' >> rule2)) will work)
Support for unicode range maps.
Testing,testing,testing.
Regards,
Peder Holt
numerics.dot.png
Description: PNG image
literal.dot.png
Description: PNG image
calc.dot.png
Description: PNG image
-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier.
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642_______________________________________________
Spirit-devel mailing list
Spirit-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.sourceforge.net/lists/listinfo/spirit-devel