On 17 Dec 2004, at 12:57, Frank Raiser wrote:
Hello,
Hi Frank, thanks for posting to the list -
As SrcML is more of a framework there are several modules included
and I suggest you check them all out from CVS.
I found I had to download antlr.jar and dom4j-full.jar to build.
I also got compilation errors in some of the modules, e.g.
parser-shared (after compiling config, api and util successfully).
I'll find the top-level build.xml file and try that.
From what we've seen in the programming guide for o:XML it appears as
if o:XML is heavily based on XSL, with, for example, procedures being
used instead of the XSL templates. Of course this might be a bit blunt,
but from that introduction we cannot yet see the big difference.
There's an introductory article on XML.com which may interest you ->
http://www.xml.com/pub/a/2004/07/21/oxml.html
Though for the purpose of examining MLML, you can disregard o:XML
completely.
SrcML is essentially a framework. It does have an intermediary code
format specified in our DTD, though it is not really intermediate but
rather an underlying format. The data is kept as XML and worked with
as such, but it is not visible to the user/developer most of the time.
It's also not intermediate because we have tools working on SrcML and
producing SrcML as output again.
Framework for application development, similar to an IDE? Or framework
for code manipulation?
I believe that the data format, whether you wish to call it
intermediary or not, is really important (understatement).
It's the basis that all other tools and utilities have to operate on,
be they visualisers, analysers or processors. That's why it must
contain the right information at the right level of abstraction.
I don't really see a difference between the switch statement in Java
and C. Do you mind explaining this point a little further?
C switch statements allow loop unrolling in rather bizarre ways (as,
for example, in Duff's device); as far as I know this is not possible
in Java. (Googling, I found it:
http://java.sun.com/docs/books/jls/first_edition/html/14.doc.html#35518)
The difference is more syntactic than semantic, but important
nonetheless.
Anyhow, just an example of language differences, an issue which you've
identified yourselves with SrcML.
As a possible solution that I've recently considered, the specific
constructs of any language could be represented in a generic way, e.g.:
<mlml:statement name="java:switch">
...
The drawback turned out to be too heavy for our goals. One of the main
goals of the SrcML framework is to allow developers to create tools
which can work on different languages, without (in the most idealistic
case) even knowing what language they are working with. This, however,
turns out to be impossible when every single construct is only
available in a language-dependent namespace.
The 'infoset' of MLML is:
- Types (classes)
- Variables and variable references (fields)
- Functions and function calls (static and dynamic, OO and procedural)
  (methods)
- Operators and operations (expressions)
- Instructions and statements (as per above - defined language-specific
  statements)
This allows quite extensive analysis, for example for Aspect-Oriented
or meta-programming, without having to look inside any
language-specific constructs. For example, all possible branches
leading to reassignment of a particular type field can be found with an
XPath expression.
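
As a rough illustration of that kind of query, here is a minimal sketch
in Python. The element and attribute names (mlml:function,
mlml:operation, mlml:reference, the urn:example:mlml namespace) are
made up for the example and are not the actual MLML vocabulary; the
standard library's ElementTree also only supports a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, invented for this example.
NS = {"mlml": "urn:example:mlml"}

# Hypothetical MLML-like fragment; the vocabulary is an assumption.
MLML = """
<mlml:type xmlns:mlml="urn:example:mlml" name="Account">
  <mlml:variable name="balance"/>
  <mlml:function name="deposit">
    <mlml:statement name="java:if">
      <mlml:operation name="assign">
        <mlml:reference name="balance"/>
      </mlml:operation>
    </mlml:statement>
  </mlml:function>
  <mlml:function name="reset">
    <mlml:operation name="assign">
      <mlml:reference name="balance"/>
    </mlml:operation>
  </mlml:function>
  <mlml:function name="report">
    <mlml:operation name="assign">
      <mlml:reference name="total"/>
    </mlml:operation>
  </mlml:function>
</mlml:type>
"""


def functions_reassigning(root, field):
    """Return names of functions containing an assignment to `field`."""
    # The descendant search (.//) passes straight through any
    # language-specific statement wrapper such as java:if.
    path = (
        ".//mlml:operation[@name='assign']"
        f"/mlml:reference[@name='{field}']"
    )
    hits = []
    for fn in root.findall(".//mlml:function", NS):
        if fn.find(path, NS) is not None:
            hits.append(fn.get("name"))
    return hits


if __name__ == "__main__":
    print(functions_reassigning(ET.fromstring(MLML), "balance"))
```

Note that the assignment inside the java:if statement is found without
the query knowing anything about that statement.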
It's not in itself complete, but the goal of MLML is to create a
'language of languages' rather than a language superset.
Of course the required level of abstraction will be different depending
on any particular use, so the code format will necessarily be a
compromise between different needs.
If you're interested in this approach you can take a look at the
languages module of SrcML, which provides a means of dynamically
gathering information about a language at runtime.
I'll try to put some time aside to spend with SrcML. I'm very
interested in your experiences and the work you've done.
We do intend to work with our result documents, but when transforming
a program from functional to procedural by such means we will end up
with an unreadable, mostly auto-generated version of the original code
which does not really reflect the intentions of the original code. ...
This is a prime
reason why we do not plan on having any such conversion process at all.
Auto-generated or transformed code is not a problem, unless you're
intending to transform the program back to source format and expect it
to still be readable and intelligible. Is that a design goal of SrcML?
/m