Subject: Re: ANC, FROWN, Fuzzy Logic - msg#00077
List: science.linguistics.corpora
Daoud Clarke wrote:
>
> As far as I understand it, fuzzy logic isn't about uncertainty in
> qualities, it is about degrees of qualities, or vagueness.
>
> <snip>
All this bit about fuzzy sets and Bayesian inference was very well put,
and I find nothing to disagree with.
> I think perhaps what the reference to Greg Chaitin's work was getting
> at was perhaps related to the following. In practice we are always
> faced with a finite corpus, whereas the theoretical corpora generated
> by rules are infinite. We can view our finite corpus as a sample from
> some hypothetical infinite corpus. The question is, what theory gives
> us the best estimate of this infinite corpus, given the finite sample?
> Using our finite corpus we can form theories about the infinite corpus,
> which may or may not incorporate our linguistic knowledge of the
> language in question. From an information theoretic perspective, the
> best theory would be the one that enabled us to express the finite
> corpus using the least amount of information -- the one that best
> compressed the information in the corpus.
>
> Of course theories become large and unwieldy, so we may prefer the
> minimum description length principle: the best theory for a sequence of
> data is the one that minimises the size of the theory plus the size of
> the data described using the theory.
>
> Some of this has been put into practice by Bill Teahan, who applies
> text compression techniques to NLP applications. It would be extremely
> interesting however to see whether the use of linguistic theories can
> help provide better text compression. To my awareness this has not been
> looked into.
I'd just point out that theory evaluation metrics based on
description length are useful only for some purposes, and that one need
not use them unless one's purposes call for that kind of
evaluation. (There are no "universal" theory evaluation metrics, because
the space of purposes to which a theory can be put is infinite. I see this
as one of the root Cartesian flaws.)
A model that also predicted neuropsychological phenomena during speech
would be more useful in my book than one that only produced a formal
grammatical abstraction of utterances.
A model that also captured phenomena of language evolution over a social
network would be more useful in my book than one that only feeds a
treebank.
-- Mark
Mark P. Line
Polymathix
San Antonio, TX
Previous Message by Date:
Re: ANC, FROWN, Fuzzy Logic
Hi,
I'm a DPhil student looking at some related stuff.
On 24 Jul 2006, at 02:24, John F. Sowa wrote:
> Linda, Rob, Chris, and Mark,
> I agree with Rob on following point:
> RF> As far as I know fuzzy logic is just a way of keeping
> > track of uncertain qualities, it does not explain the
> > underlying uncertainty.
As far as I understand it, fuzzy logic isn't about uncertainty in
qualities, it is about degrees of qualities, or vagueness. Consider the
set of tall people, for example. At what height do we say that someone
belongs to this set? Fuzzy set theory proposes that there should be a
degree of membership to some sets, so if someone is not tall or short,
but somewhere in between, we should assign that person a degree of
membership to the set of tall people, say 0.5. Note that there is no
uncertainty about the person's height: we know exactly how tall they
are, we are just not sure whether to call them tall or not.
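A minimal sketch of such a degree of membership, assuming a simple linear ramp between two heights. The thresholds of 160 cm and 190 cm and the name `tall_membership` are hypothetical, chosen only for illustration:

```python
def tall_membership(height_cm: float,
                    short_cm: float = 160.0,
                    tall_cm: float = 190.0) -> float:
    """Degree of membership in the fuzzy set of tall people, in [0, 1].

    The thresholds are illustrative assumptions, not empirical values.
    """
    if height_cm <= short_cm:
        return 0.0  # definitely not tall
    if height_cm >= tall_cm:
        return 1.0  # definitely tall
    # In between, membership rises linearly with height.
    return (height_cm - short_cm) / (tall_cm - short_cm)


# The height is known exactly; only the label "tall" is a matter of degree.
halfway = tall_membership(175.0)
```

Under these assumed thresholds, someone who is exactly 175 cm gets the membership 0.5 mentioned above.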
At first sight, the issue of representing colours would seem to be
perfect for fuzzy logic: since we can decompose colours (for example)
into the primaries red, green and blue, we can represent each possible
colour as partially belonging to the fuzzy sets of red, green and blue
colours. Then we can define ways to calculate, using fuzzy set
operations, to what degree something that is turquoise can be called
blue, for example. There are numerous problems with this, however. The
most obvious is that the behaviour of the representations changes if you
look at colours in a different way. For example, you could equally
classify colours in terms of their degree of membership to the fuzzy
sets of cyan, magenta, yellow and black 'colours'; in this case, fuzzy
intersection would make colours closer to white, whereas in the red,
green and blue decomposition, fuzzy intersection makes colours darker.
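The dependence on the decomposition can be sketched as follows, assuming the usual minimum operator for fuzzy intersection and a simplified three-component cyan, magenta, yellow decomposition (black is left out, and each component is taken to be one minus the corresponding red, green or blue value). All names here are illustrative:

```python
from typing import Tuple

Colour = Tuple[float, float, float]


def fuzzy_intersection(a: Colour, b: Colour) -> Colour:
    """Componentwise minimum of two membership vectors."""
    return tuple(min(x, y) for x, y in zip(a, b))


def complement(colour: Colour) -> Colour:
    """Convert RGB memberships to CMY or back (assumed: c = 1 - r, etc.)."""
    return tuple(1.0 - x for x in colour)


red: Colour = (1.0, 0.0, 0.0)   # RGB memberships
blue: Colour = (0.0, 0.0, 1.0)

# Intersect directly in the red, green and blue decomposition.
in_rgb = fuzzy_intersection(red, blue)

# Intersect in the cyan, magenta and yellow decomposition, then convert back.
in_cmy = complement(fuzzy_intersection(complement(red), complement(blue)))
```

Here `in_rgb` comes out as black, (0, 0, 0), while `in_cmy` comes out as magenta, (1, 0, 1): the same two colours and the same operation give a darker result in one decomposition and a lighter one in the other.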
It may be that you are interested in representing uncertainty. The
standard system for reasoning with uncertainty is Bayesian inference.
The idea is that the mathematics of probability is perfectly suited for
reasoning about uncertainty. For example, not everyone has the same
idea of what turquoise should look like, so when someone uses
the term 'turquoise' we are not sure exactly what colour she is
referring to. We could ask people to specify their idea of turquoise in
terms of its red, green and blue components, and then use their opinions
to estimate a probability distribution for the term 'turquoise' over
all the possible colours. (This would be a continuous function over the
cube between the points (0,0,0) and (1,1,1) in three-dimensional space,
with a dimension corresponding to each of red, green and blue, such
that integrating the function over this cube would give 1.)
Repeating this for all the terms for colours in the English language,
we could then use this, for example, to estimate the probability that,
given that someone had used the term 'blue', they meant the same colour
that another person would refer to as 'turquoise'.
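A rough sketch of that estimate, under several assumptions: the continuous distribution is replaced by a coarse histogram over the colour cube, the prior over terms is uniform, and the survey answers are invented purely for illustration. None of the names or numbers come from real data:

```python
from collections import Counter
from typing import Dict, List, Tuple

Colour = Tuple[float, float, float]
Bin = Tuple[int, int, int]


def to_bin(colour: Colour, bins_per_axis: int = 4) -> Bin:
    """Map an RGB point in the unit cube to a histogram cell."""
    return tuple(min(int(c * bins_per_axis), bins_per_axis - 1) for c in colour)


def colour_given_term(samples: List[Colour]) -> Dict[Bin, float]:
    """Histogram estimate of P(colour cell | term) from survey answers."""
    counts = Counter(to_bin(c) for c in samples)
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}


def term_given_colour(cell: Bin, likelihoods: Dict[str, Dict[Bin, float]],
                      prior: Dict[str, float]) -> Dict[str, float]:
    """Bayes' rule: P(term | cell) is proportional to P(cell | term) P(term)."""
    joint = {t: prior[t] * likelihoods[t].get(cell, 0.0) for t in likelihoods}
    z = sum(joint.values())
    return {t: (p / z if z > 0 else 0.0) for t, p in joint.items()}


def prob_other_says(speaker_term: str, listener_term: str,
                    likelihoods: Dict[str, Dict[Bin, float]],
                    prior: Dict[str, float]) -> float:
    """P(another person would say listener_term | speaker said speaker_term)."""
    return sum(p_cell * term_given_colour(cell, likelihoods, prior)[listener_term]
               for cell, p_cell in likelihoods[speaker_term].items())


# Hypothetical survey answers (RGB components in [0, 1]).
survey = {
    "blue": [(0.1, 0.2, 0.9), (0.0, 0.1, 0.8), (0.1, 0.6, 0.8), (0.2, 0.3, 0.9)],
    "turquoise": [(0.2, 0.8, 0.8), (0.1, 0.7, 0.8), (0.3, 0.9, 0.8), (0.1, 0.6, 0.9)],
}
likelihoods = {term: colour_given_term(answers) for term, answers in survey.items()}
prior = {term: 1.0 / len(survey) for term in survey}  # assumed uniform

p_blue_as_turquoise = prob_other_says("blue", "turquoise", likelihoods, prior)
```

With real survey data one would want a smoothed density rather than a four-cell-per-axis histogram, and a prior reflecting how often each term is actually used.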
Unfortunately I have no idea how this relates to vantage theory.
> I also agree that Greg Chaitin makes many good points, but
> the connection between those points and this discussion is
> not clear.
> RF> the solution is to understand language to be fundamentally
> > a corpus and not a logical system of rules and classes over
> > that corpus.
> The first half of that sentence doesn't say much, since Chomsky
> also claimed that language is a corpus, but one that is generated
> by rules. Saying that the corpus is not generated by rules might
> be a reasonable claim, but then it is necessary to answer Chris's
> questions:
> CB> how should we, as scientists, proceed in trying to derive
> > objective and generalizable knowledge about language from
> > corpora?
> >
> > once we have decided what to try and explain, what kind of
> > models we should use?
I think perhaps what the reference to Greg Chaitin's work was getting
at was perhaps related to the following. In practice we are always
faced with a finite corpus, whereas the theoretical corpora generated
by rules are infinite. We can view our finite corpus as a sample from
some hypothetical infinite corpus. The question is, what theory gives
us the best estimate of this infinite corpus, given the finite sample?
Using our finite corpus we can form theories about the infinite corpus,
which may or may not incorporate our linguistic knowledge of the
language in question. From an information theoretic perspective, the
best theory would be the one that enabled us to express the finite
corpus using the least amount of information -- the one that best
compressed the information in the corpus.
Of course theories become large and unwieldy, so we may prefer the
minimum description length principle: the best theory for a sequence of
data is the one that minimises the size of the theory plus the size of
the data described using the theory.
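As a toy illustration of that trade-off, the sketch below compares two "theories" of a character sequence: a uniform model with no parameters, and a unigram model whose character frequencies must be stated as part of the theory. The charge of half of log2(n) bits per parameter is a common approximation and is an assumption here, as are the function names and the 27-symbol alphabet:

```python
import math
from collections import Counter
from typing import Dict, Tuple


def uniform_total_bits(text: str, alphabet_size: int) -> float:
    """Theory size is zero; each character costs log2(alphabet_size) bits."""
    return len(text) * math.log2(alphabet_size)


def unigram_total_bits(text: str, alphabet_size: int) -> float:
    """Theory size (frequency parameters) plus data size under those frequencies."""
    n = len(text)
    counts = Counter(text)
    # Assumed cost of stating each free parameter: 0.5 * log2(n) bits.
    theory_bits = 0.5 * (alphabet_size - 1) * math.log2(n)
    data_bits = -sum(c * math.log2(c / n) for c in counts.values())
    return theory_bits + data_bits


def best_theory(text: str, alphabet_size: int = 27) -> Tuple[str, Dict[str, float]]:
    """Return the theory minimising theory size plus data size."""
    if not text:
        raise ValueError("text must be non-empty")
    scores = {
        "uniform": uniform_total_bits(text, alphabet_size),
        "unigram": unigram_total_bits(text, alphabet_size),
    }
    return min(scores, key=scores.get), scores


long_choice, _ = best_theory("aaaaaaaaab" * 50)  # plenty of skewed data
short_choice, _ = best_theory("ab")              # almost no data
```

With plenty of skewed data the richer unigram theory pays for itself; with only two characters its parameters cost more than they save, and the simpler uniform theory has the shorter total description.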
Some of this has been put into practice by Bill Teahan, who applies
text compression techniques to NLP applications. It would be extremely
interesting however to see whether the use of linguistic theories can
help provide better text compression. To my awareness this has not been
looked into.
Daoud Clarke
Next Message by Date:
RE: ANC, FROWN, Fuzzy Logic
Daoud Clarke wrote:
>It would be extremely interesting however to see whether the use of
>linguistic theories can help provide better text compression. To my
>awareness this has not been looked into.
Several researchers have used improvement in total description length as the
result of morphological analysis to justify the existence of morphology
(including me: see my paper in Computational Linguistics in 2001, and our
website at linguistica.uchicago.edu). At a crude level, it is clear that the
redundancy in lists of words -- for example, treating jumps, jumped,
jumping, laughs, laughed, laughing all as separate and unrelated words in
the lexicon of English -- leads to a longer description of an English corpus
than one in which there is a list of stems and affixes, and some machinery
that explicitly indicates how they may be composed in the language in
question. The devil is in the details, and there has been a lot of work in
this area over the last half dozen years.
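The crude-level point can be checked with a back-of-the-envelope count. This is only a rough proxy, not the method of the paper or of Linguistica: it assumes each letter or separator costs log2(27) bits, and each pointer into a list costs log2 of that list's length:

```python
import math
from typing import List

BITS_PER_SYMBOL = math.log2(27)  # assumed: 26 letters plus a separator


def spelling_bits(items: List[str]) -> float:
    """Bits to spell out every item, each followed by a separator."""
    return sum(len(item) + 1 for item in items) * BITS_PER_SYMBOL


def pointer_bits(items: List[str]) -> float:
    """Bits for one pointer to each item in a list of this length."""
    if not items:
        return 0.0
    return len(items) * math.log2(len(items))


words = ["jumps", "jumped", "jumping", "laughs", "laughed", "laughing"]
stems = ["jump", "laugh"]
suffixes = ["s", "ed", "ing"]

# The stem and suffix lists must regenerate exactly the same words.
generated = {stem + suffix for stem in stems for suffix in suffixes}

# Lexicon as six separate, unrelated words.
flat_bits = spelling_bits(words)

# Lexicon as stems and suffixes, plus pointers stating that every listed
# stem combines with every listed suffix.
factored_bits = (spelling_bits(stems) + spelling_bits(suffixes)
                 + pointer_bits(stems) + pointer_bits(suffixes))
```

Even on six words the factored lexicon is less than half the size of the flat list under these assumptions, and the gap grows as more stems share the same suffixes.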
John Goldsmith