
Conceptual Graphs are Step 6: msg#00034

Subject: Conceptual Graphs are Step 6
* Tom Passin wrote May 1 on another thread:

The best approach, IMO, is to first convert the natural language constructs into Conceptual Graphs.

I agree Conceptual Graphs seem useful, but building them is NOT the
first step.  Classical NLP theory says five preprocessing steps would
have to be run first on that language, even to extract its SITUATIONS.
These steps are summarized in this diagram:

[1] http://www.lexikos.com/charts/index_files/image003.jpg

Each prior step is *required* to cull out one well-known kind of
ambiguity from a paragraph - lexical, structural, semantic, referential,
etc.  Unless all 5 are present and accurate, errors accumulate so fast
that related software will just be a digital garbage churn.

Conceptual Graphs are ... easy to comprehend and were specifically designed to translate natural language statements into a formal logic.

I agree, except CGs really do not *translate* NL.  They *express*
assertions in predicate calculus about the role players that fill
verb-specific association templates based on case frames.
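
As a rough illustration of that point, here is a minimal Python sketch of a case frame whose roles are filled by role players and then written out as one predicate-calculus assertion. The `CaseFrame` class, the `assert_frame` helper, the role names, and the output notation are all hypothetical choices made for this example; none of them comes from any CG tool or from MODELER.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseFrame:
    """A verb-specific template: the verb plus the roles it expects."""
    verb: str
    roles: tuple


def assert_frame(frame, players):
    """Express one filled case frame as a predicate-calculus string.

    `players` maps each role name to the entity that fills it.
    """
    missing = [r for r in frame.roles if r not in players]
    if missing:
        raise ValueError(f"unfilled roles for '{frame.verb}': {missing}")
    event = "e"  # one existentially quantified event variable
    parts = [f"{frame.verb}({event})"]
    parts += [f"{role}({event}, {players[role]})" for role in frame.roles]
    return f"exists {event}. " + " & ".join(parts)


# Hypothetical frame for "John gave Mary a book".
GIVE = CaseFrame("give", ("agent", "theme", "recipient"))
formula = assert_frame(
    GIVE, {"agent": "John", "theme": "book", "recipient": "Mary"}
)
print(formula)
```

Note that nothing here parses English: the caller already had to decide who the agent and recipient are, which is the work of the earlier steps.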

There is hope such assertions may prove useful.  A persistent theory
holds that they will let agents in an FOL engine *react to* English
statements with enhanced cleverness.  Time will tell.

Even if that theory is true, in practice it cannot help until agent
software can also - and first - *understand* those English statements,
by removing ambiguities about what their speaker intended to say.

At minimum, this means selecting which (dictionary) sense of every
word the speaker had in mind, and which (contextual) topic best models
the intended subject of every pronoun and definite noun phrase.
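
To make "selecting a dictionary sense" concrete, the sketch below uses a simplified Lesk-style gloss overlap, a textbook heuristic and not anything taken from MODELER. The tiny `SENSES` inventory, the sense ids, the stopword list, and the `pick_sense` function are assumptions invented for this example; a real system would draw on public lexical data instead.

```python
# Hypothetical two-sense inventory; the glosses are written for this example.
SENSES = {
    "bank": {
        "bank/finance": "institution that accepts deposits and lends money",
        "bank/river": "sloping land beside a river or stream",
    }
}

# Small illustrative stopword list so function words do not count as evidence.
STOPWORDS = {"a", "and", "at", "of", "on", "or", "that", "the"}


def pick_sense(word, context, senses=SENSES):
    """Return the sense id whose gloss shares the most words with the context."""
    context_words = set(context.lower().split()) - STOPWORDS

    def overlap(item):
        _, gloss = item
        return len(context_words & set(gloss.split()))

    best_id, _ = max(senses[word].items(), key=overlap)
    return best_id


river_sense = pick_sense("bank", "She sat on the bank of the river and fished")
money_sense = pick_sense("bank", "He deposits money at the bank")
print(river_sense, money_sense)
```

A heuristic this crude fails as soon as the context shares no words with any gloss, which is one reason such guesses need the other preprocessing steps around them.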

Good news:  NLP software is getting better at resolving such
ambiguities.  With heuristics, public lexical data, and sneaky tricks,
software encapsulating steps 1-5 can now guess about such things and
in some circumstances generate low error rates.

My MODELER design can do this.  In fact, if you configure it properly,
and interact with it about noticed problems (like misspellings), it will
cull ambiguities from your paragraphs with *virtually no* errors, and
dump an XTM chart which says in *your* ontology what they asserted:

[2] http://www.lexikos.com/charts/

MODELER uses tricks, one of which is WORDS scripts.  They resemble CGs,
and let you make similar assertions.  They can also substitute for a
parser's level 3 code by forcing *you*, not grammar rules, to map each
paragraph's English syntax into legal WORDS syntax.

Regardless of how they may be built, WORDS scripts can expand like
association templates into TM structures that resemble CGs, but use TM
paradigms and *your* TM ontology to express the intended meaning of the
original paragraph.  This happens in subgraphs of MODELER's internal
topic chart, which for each input it returns (by default) in XTM.


I consider Topic Maps to be essentially a subset of Conceptual Graphs (with a few additional wrinkles). Some CGs can be expressed as TMs and some cannot. The ones that can be so expressed are nearly identical except for some syntax details.

I believe TMs can hold graph structures fully equivalent to those of
any CG, but TMs have no standard inferencing model.  CGs do: some
FOL engine that can infer things by using predicate calculus.

I suspect that any parts of CGs which a TM cannot express are related
to the FOL engine that TMs lack.  But to me, the normal conversion
direction would go from TMs toward logic processing - not the other way
around - so these gaps should present no real problems in any case.  The
TM application software would simply have to take up the slack if a
chart becomes fodder for something besides FOL.


The RDF folks are wrestling right now with how to make statements about subgraphs. In a CG, you can draw a box around a collection of conceptual relations and their topics. The box is an assertion (anything placed on the page in a CG is asserted by definition), and it is called a "context box". We need something equivalent for topic maps, and you will want it for the kind of things you seem to be getting into.

I can easily believe RDF folks want to annotate subgraphs, because
to learn what was said in any English paragraph, you need only query
the subgraphs in its chart of topics - what associations were stated
for each topic present.  Adding new statements about similar subgraphs
would be a first natural step to "reacting to them" in the RDF world.

Independent of details on *reacting*, the business impact of the
software that charts topics should be non-trivial, as it will let
people write new kinds of IR software that avoids lexical ambiguity:

1) chart any English paragraph about *your* domain in an XTM file
2) find the speaker's intended meaning in *your* TM-based ontology
3) query the merged version of such charts in something like TMQL
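
The three steps above can be sketched with plain Python data structures. This is not TMQL and not real Topic Map merging (which follows subject-identity rules); it assumes, purely for illustration, that two topics with the same id name the same subject and that an association is a hashable `(type, ((role, player), ...))` tuple. The `merge_charts` and `associations_for` names are hypothetical.

```python
def merge_charts(*charts):
    """Union several charts; identical associations collapse into one."""
    merged = set()
    for chart in charts:
        merged |= set(chart)
    return merged


def associations_for(topic, chart):
    """Return every association in which `topic` plays some role."""
    return sorted(
        assoc for assoc in chart
        if topic in (player for _, player in assoc[1])
    )


# Two hypothetical charts, each built from a different paragraph.
chart_a = [("give", (("agent", "john"), ("recipient", "mary")))]
chart_b = [
    ("give", (("agent", "john"), ("recipient", "mary"))),
    ("own", (("owner", "mary"), ("thing", "book"))),
]

merged = merge_charts(chart_a, chart_b)
about_mary = associations_for("mary", merged)
about_john = associations_for("john", merged)
print(about_mary)
```

Because every player is already a disambiguated topic id, the query matches on intended meaning instead of on surface words.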

And, if you really want it to, a chart can also serve as the input
to a step 6 process, which reformats all its Topic Map subgraphs
into proper CG notation, so predicate calculus engines can crunch it.

So, Tom, okay - if you really do think CGs in volume are a good idea,
here's a partial proposal for a 2-part project able to produce them:

    1. On this public thread, we debate XTM models that would be ideal
       for a chart, and craft a few prototypes by hand in LTM.  Iff we
       get a consensus, we will then have a CG-compatible base ontology
       in TM terms, which MODELER will adapt to use for step 5 output.

    2. To guide such work in part 1, we can also imagine a file-to-file
       converter that takes that XTM chart and produces CGIF.  We should
       limit it to use *only* the XTM input syntax, then notice and feed
       back requirements on what data that part 1 output must contain.
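
To give the part 2 discussion something concrete to criticize, here is a minimal sketch of such a file-to-file converter. It reads only XTM 1.0 `<association>` elements and prints one CGIF-style line per association. The function names, the sample topic ids, and the exact output layout are assumptions made for this example; the output has not been validated against the CGIF grammar, and the mapping (association type becomes a concept, each role becomes a relation) is just one candidate to debate.

```python
import xml.etree.ElementTree as ET

XTM = "{http://www.topicmaps.org/xtm/1.0/}"
HREF = "{http://www.w3.org/1999/xlink}href"


def ref(elem):
    """Topic id from the first direct-child topicRef of elem, without '#'."""
    return elem.find(f"{XTM}topicRef").get(HREF).lstrip("#")


def xtm_to_cgif(xtm_text):
    """Emit one CGIF-style line per XTM association (illustrative mapping)."""
    root = ET.fromstring(xtm_text)
    lines = []
    for n, assoc in enumerate(root.iter(f"{XTM}association"), start=1):
        label = f"x{n}"  # defining label for this association's concept
        assoc_type = ref(assoc.find(f"{XTM}instanceOf"))
        parts = [f"[{assoc_type}: *{label}]"]
        for member in assoc.findall(f"{XTM}member"):
            role = ref(member.find(f"{XTM}roleSpec"))
            player = ref(member)
            parts.append(f"({role} ?{label} {player})")
        lines.append(" ".join(parts))
    return "\n".join(lines)


# Hypothetical chart fragment for "John gave something to Mary".
SAMPLE = """<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
          xmlns:xlink="http://www.w3.org/1999/xlink">
  <association>
    <instanceOf><topicRef xlink:href="#give"/></instanceOf>
    <member>
      <roleSpec><topicRef xlink:href="#agent"/></roleSpec>
      <topicRef xlink:href="#john"/>
    </member>
    <member>
      <roleSpec><topicRef xlink:href="#recipient"/></roleSpec>
      <topicRef xlink:href="#mary"/>
    </member>
  </association>
</topicMap>"""

cgif = xtm_to_cgif(SAMPLE)
print(cgif)
```

Even this toy version feeds back requirements on part 1: it fails on any association that lacks an `instanceOf` or a `roleSpec`, so the agreed XTM model would have to guarantee both.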

Seriously, I have no concrete plans yet in MODELER for a CGIF output
module, because I do not know CGs or FOL in depth; because I sense they
are more a tool for teaching logic than a standard; and because it
is not yet clear to me how many people actually care about CGIF.

I hope this thread might yield interesting comments on such issues.

Regards,
Dan Corwin

