Toward a Semantic General Theory of Everything
The notion of a universal semantic cognitive map is introduced as a general indexing space for semantics, useful to reduce semantic relations to geometric and topological relations. As a first step in designing the concept, the notion of semantics is operationalized in terms of human subjective experience and is related to the concept of spatial position. Then synonymy and antonymy are introduced in geometrical terms. Further analysis building on previous studies of the authors indicates that the universal semantic cognitive map should be locally low-dimensional. This essay ends with a proposal to develop a metric system for subjective experiences based on the outlined approach. We conclude that a computationally defined universal semantic cognitive map is a necessary tool for the emerging new science of the mind: a scientific paradigm that includes subjective experience as an object of study.
© 2009 Wiley Periodicals, Inc. Complexity 15: 12–18, 2010
ALEXEI V. SAMSONOVICH, REBECCA F. GOLDIN, AND GIORGIO A. ASCOLI

Alexei V. Samsonovich and Giorgio A. Ascoli are at Krasnow Institute for Advanced Study, George Mason University, 4400 University Drive, Fairfax, Virginia 22030 (e-mail: ascoli@gmu.edu)

Rebecca F. Goldin is at Department of Mathematics, George Mason University, 4400 University Drive, Fairfax, Virginia 22030

INTRODUCTION

Imagine that in a remote future all meaningful information is laid out in some abstract space based on its semantics. Let us call this space the universal semantic space, and the entire distribution of available chunks* of information in it (which may look like stars in the galaxy on the cover illustration) the universal semantic cognitive map. The idea may sound intuitively familiar, as spatial analogies play a prominent role in human cognition and memory [1, 2]. However, the notion of a universal semantic cognitive map goes beyond mere indexing or analogy. The map in and by itself can be used as a language and a complete representation of all information that is “indexed” on it. Indeed, if each precise map location uniquely determines the associated meaning then, literally, one point of this map should be worth thousands of words. This representation of information, however, would make no practical sense, if map coordinates were assigned to chunks at random. Instead, as for a road atlas, a useful map would enable semantic inferences by virtue of geometric and topological inferences. Therefore, we assume that:
• semantic relations among chunks are captured by geometric and topological relations among their images in the map;
• points of the map and displacements on the map are associated with definite and consistent semantics; and
• the semantics of each map location can be obtained as a composition of semantics of finite displacements along a path leading to that location from some reference point on the map.

*By a “chunk” we refer to a coherent representation of meaningful information, in any format and in any medium. Examples: an utterance, a document, a sculpture, a movie, a traffic light. A message of any kind is also a chunk.

Complexity © 2009 Wiley Periodicals, Inc., Vol. 15, No. 4
DOI 10.1002/cplx.20293
Published online 25 November 2009 in Wiley InterScience (www.interscience.wiley.com)

If this hypothetical construct is ever to materialize, it will in fact constitute a General Theory of Everything, once a dream of a great-grandfather of Ijon Tichy (a Stanislaw Lem’s character). In comparison, we have a rather modest ambition to outline a more precise notion of this challenge from epistemological and mathematical perspectives. In this essay, we first address the critical question of how to measure semantics practically. Next, we introduce key distinctions and definitions of semantic maps. Then, we illustrate these concepts with a simplified example. Finally, we end with a discussion of the potential impact that semantic map may have on the evolution of science and the progress of humankind.

COGNITIVE-PSYCHOLOGICAL AND COGNITIVE-NEUROSCIENTIFIC UNDERPINNINGS

In this work, we treat semantics as potential experience of a human subject. In general, the notion of semantics and its definition have been a subject of perpetual philosophical debate ([3, 4]; see also the last section) to which we are not offering a conclusion. We will expand on our position in the last section with explanations and discussion of related background issues. For now, we just note that our choice allows us to measure semantics by measuring human experience and therefore to define all necessary concepts operationally. In this case, how should we measure the meaning of a chunk? The measurement of the meaning is not the measurement of the amount of information in the sense of Shannon [5], because in the current framework, we need to distinguish among various qualities or qualia rather than quantities of information. For this reason, e.g., it seems natural to expect that semantics should be measured by some sort of a vector rather than a scalar. Even the task of finding the number and semantics of the vector components seems too challenging. We start with an observation that could simplify the task substantially: to construct a meaningful multidimensional semantic map, we only need a measure of semantic dissimilarity of any two chunks that can be interpreted as distance with a direction, plus a chunk with known semantic measure. The latter is easy to ascribe: it could be the empty chunk, defined to have zero semantic “position.”

To operationalize the notion of semantic dissimilarity, we interpret semantics as potential average human experience. If a representation is meaningful to a human subject then the subject understands its meaning (whatever this means) when he/she becomes aware of it. Building on this premise, the notion of the meaning of a chunk can be made precise by association with the mental state of awareness of the chunk, as follows. Assume that, in principle, it is possible to measure the dissimilarity of any given pair of mental states, using their neural (e.g., functional magnetic resonance imaging) and/or behavioral (e.g., introspective report) correlates. Of course, there are substantial technical issues here: how to filter out the noise and effects of unrelated factors, how to overcome cultural differences, etc. We skip their discussion for now and assume that there is a protocol according to which measurements can be repeated with a sufficient number of subjects yielding consistent results. In this case, we could say that the notion of dissimilarity of two mental states is operationally defined, and so is the notion of dissimilarity of two chunks. At this point, it is only an assumption that these general operational definitions exist.

The next question refers to the notion of semantic space that provides an infrastructure for the map. In cognitive psychology and linguistics, there were many limited attempts (and associated criticisms: [6]) to make the idea of semantic space mathematically precise. For example, Russell [7] introduced a two-dimensional circumplex model to represent feelings, Osgood et al. [8] introduced a three-dimensional semantic differential, Gärdenfors [4] developed the notion of a conceptual space, etc. The state of the art in linguistics related to the semantic space idea is represented by latent semantic analysis (LSA; [9]) and its variations, including the probabilistic topic model [10], association spaces [11], and other techniques. We eschew any requirement to give a comprehensive review here.

The growing success of LSA and related examples supports our central assumption that the idea of the universal semantic cognitive map can be given a precise mathematical sense in general, and not only in those limited special cases. Specifically, we assume that it is possible to map any given set of chunks to points in a universal semantic space endowed with a metric that captures semantic relations among chunks.

KINDS AND PROPERTIES OF SEMANTIC COGNITIVE MAPS

In this text, we use the word “map” to mean both a map of sets from the chunks to a metric space, as well as its image. Semantic relations among chunks captured by the map image may include dissimilarity, antonymy, and synonymy. For example, the ambient space in which the map lives may have a metric that captures dissimilarity (called dissimilarity metric): the more dissimilar two chunks are the greater is the distance between them on the map. Naturally, zero distance should imply identity of meaning and vice versa. In this case, we call the
map a strong semantic cognitive map. As an alternative, a map may tend to pull all synonyms together and all antonyms apart, with a soft constraint on the spread of the entire distribution and without representing dissimilarity per se [12]. In this case, we call the map a weak semantic cognitive map (Figure 1).

FIGURE 1. Kinds of semantic cognitive map: strong (left: modified with permission from http://en.wikipedia.org/wiki/File:Calabi-Yau.png) and weak (right).

A proposal to build a universal semantic cognitive map capable of accommodating all possible human knowledge and experiences would imply a huge program for many generations to come. Today, we are far from the final goal, yet it could make sense to address one relatively simple question about the final product. Assuming that the universal semantic cognitive map can be constructed, what can we say a priori about its geometric and topological properties?

We propose to narrow this question down to two issues that in our view can be addressed today. One has to do with geometric representation of antonymy on a strong semantic cognitive map. How can one introduce synonymy and antonymy into the geometrical framework outlined above in which only the notion of dissimilarity is initially assumed to be represented by distances? To the best of our knowledge, this is an open problem [13].

Another issue concerns the semantics of the coordinates of the ambient space in which the map image lives. A recognized limitation of LSA and related approaches is that semantic dimensions cannot be clearly identified: “The typical feature of LSA is that dimensions are latent. That means there are no explicit interpretations for the dimensions” (Ref. 14, p. 414). Is it possible to construct a semantic cognitive map, the dimensions of which have clearly identifiable, general semantics?

For now, we assume that the ambient space of the cognitive map is a vector space $\mathbb{R}^n$, and therefore chunks are associated with vectors under any semantic map of chunks to $\mathbb{R}^n$. According to the above assumptions, the semantics of a given chunk are given by a composition of semantics of elements of a path leading to it from 0, representing the empty chunk. We also assume that with a sufficiently large set of chunks, elements of a path can be made small enough so that semantics of a path element would approach an “elementary semantic difference.” In other words, our hope is that as the difference of meaning of two chunks becomes smaller and smaller, it should also become simpler and easier to grasp. Indeed, when the necessary level of precision is specified, the difference between two similar items is usually easier to characterize than the difference between items of different kind. Consider, for example, the following pairs of chunks: “a new car” and “a used car,” “the number 4” and “the number 5,” “a new car with a scratch” and “polysemy.” Consistent with this observation, we assume that the notion of an elementary semantic flavor makes sense, understood as the meaning of a finite semantic difference that cannot be further reduced to a composition of smaller differences. Based on this definition, elementary semantic flavors must differ from each other qualitatively, implying that two different flavors cannot correspond to the same direction in $\mathbb{R}^n$. On the other hand, every direction in $\mathbb{R}^n$ associated with a pair of chunks that are separated along this direction should be representable by a composition of elementary semantic flavors. Based on these considerations, we assume that any point on the map has local coordinates associated with elementary semantic flavors.

We shall continue this consideration in the next section using a simple example (which is not representative of today’s state of the art in computational or cognitive linguistics). Specifically, we restrict chunks to English documents, replace elementary semantic flavors with English words, neglecting their ambiguity and context-dependence, and assume that coordinates given by these words on the map are global Cartesian coordinates. Although these simplifications may introduce
inconsistencies, this toy example is useful to illustrate our ideas.

A TOY EXAMPLE

Let us start with a finite set $X$ consisting of the set of documents currently existing in the literature. We are setting out to present a mathematical framework in which $X$ can be given geometrical structure, and that structure can be studied to understand more about the structure of language and human knowledge. As a first step in this process, we would like to consider a map

$A : X \to$ [...]

• the coordinates of $\mathbb{R}^n$ have definite semantics,
• $A(X)$ records dissimilarities among elements of $X$ by distances in $\mathbb{R}^n$ (with an appropriate choice of metric): the more distant two documents are, the less similar are their meanings, and vice versa.

It may not be possible to satisfy all these properties at once, but we begin by noting that a part of this general framework has been used already in LSA. Attempts to improve on this analysis will not only rely on changing parameters but also on varying $A$ and [...]

[...] [17]. Our present goal is not to find the best representation of documents, but rather to introduce a framework in which such maps could be explored.

The first question that we address here is how to incorporate the notions of synonymy and antonymy into this geometrical framework. This question relates to the issue of translating relationships among words and documents into relationships among documents.

We posit the existence of two functions (defined operationally): [...] applicability $g : X \times W \to [0, 1]$, and [...] $g_A$ on $A(X) \times W$, where $A(X)$ is the image. In particular, we let $f_A(p, w) = f(x, w)$ and $g_A(p, w) = g(x, w)$, where $p = A(x)$. On each fiber, $F_w$, we then extend $f_A(\cdot, w)$ and $g_A(\cdot, w)$ to smooth functions on an open neighborhood of the image $A(F_w)$. Such extensions are clearly not unique, but we choose one to be as simple as possible, i.e. such that it does not vary wildly in regions not containing points in $A(X)$. We abuse notation and denote the extensions by $f_A$ and $g_A$ as well. Thus, the domains of $f_A(\cdot, w)$ and $g_A(\cdot, w)$ are subsets of $\mathbb{R}^n$. Then, for each word $w$, we may associate a vector to each point in its domain of applicability: the gradient of $f_A(\cdot, w)$ at that point.

The notions of synonymy and antonymy that conform to a common intuitive understanding can be now introduced as follows. We compute gradient vectors at $p$ for all words that apply to documents at $p$. Then, we define synonyms and antonyms to be pairs of words such that gradient vectors at $p$ are nearly parallel for synonyms and nearly antiparallel for antonyms; the notions “nearly parallel” and “nearly antiparallel” are made precise with a threshold angle, which is a parameter of the theory. Note that if $A(X)$ satisfies an appropriate density property near $p$ in $\mathbb{R}^n$ then the gradients at $p$ will be independent of the extensions of $f_A$ and $g_A$ to a neighborhood of $A(X)$.

Assuming that it is possible to generalize this toy example to the universal semantic cognitive map considered above, we relate the local coordinates on the universal semantic cognitive map to the above notions of synonyms and antonyms. A pair of antonyms should correspond to opposite directions along a coordinate that captures their semantics, two synonyms should correspond to nearly the same direction, and semantically different pairs of antonyms should correspond to different local coordinates on the map. This approach allows us to estimate the dimension of the map by analyzing the system of synonym–antonym relations.

We have previously demonstrated by example the possibility of simultaneous geometric representation of synonym and antonym relations in $W$ using a weak semantic cognitive map [12]. Our example constructed by numerical optimization of a certain energy function was a map from $W$ to a vector space $V$:

$r : W \to V$ (3)

i.e., words were represented as vectors in $V$, such that almost every antonym pair had an obtuse angle between their vectors, whereas almost every synonym pair had an acute angle between their vectors (exceptions from this rule constituted only 1% of synonym and antonym pairs and could be due to inconsistencies of the dictionary itself). In contrast to $\mathbb{R}^n$, the vector space $V$ is low dimensional, with most of the data captured in just four dimensions (95% of the variance of the data is captured in three dimensions; the fifth and higher dimensions have negligible variance). Moreover, the first three principal components (PC) of the emergent distribution of words in $V$ had clearly identifiable semantics: “good-bad” (PC 1), “calming-exciting” (PC 2), and “open-closed” (PC 3). This result suggests that in the case outlined above of a strong semantic cognitive map, it will be possible to select a semantically meaningful local coordinate system using these same general semantics (possibly producing similar PC characteristics for the local density of the universal semantic cognitive map).

WHY SEMANTIC COGNITIVE MAPPING MATTERS TO SCIENCE

The brain creates the Universe as we know it, and the only Universe that we know directly: the mind. It all starts from humans being aware of their own subjective experiences. Scientists call some of these experiences “observations” and attribute their origin to the physical world “out there.” This belief allows us to develop phenomenological knowledge of the World which, together with precise metrics, logic, and experimentation, leads to progress, including tools to modify the World itself. Paradoxically, however, in this circle of empirical science experiences themselves are left out as a hard problem. We recently proposed a scientific approach to study experiences, because humans observe them just as they observe other real phenomena [18]. Because experiences are subjective, their phenomenological knowledge must be developed via introspection rather than with psychophysiological measures. To yield consistent scientific results, however, those observations must become objective, in the sense that they can be reliably repeated by, and quantitatively communicated among, independent researchers (see the cover illustration of this issue). The key missing elements for this purpose are precise metrics and measuring tools for human subjective experiences, such as Cartesian coordinates of the mental space. Logic and experimentation, applied to measures of subjective experiences, will lead to more progress, including tools to engineer artificial minds.

A long time has passed since scientists were satisfied with the belief of a world made of particles and fields. Today physicists are preoccupied with interpretation of far more abstract concepts, while even the century-old quantum mechanics lacks interpretation [19, 20]. Brain scientists are similarly puzzled with developing a scientific description of the brain at the higher cognitive level. In particular, there is no general consensus on how to introduce subjective experience into natural science [21]. The new scientific paradigm [18] should be based on the classical scientific method [22–24], while at the same time it needs to be (a) semantically oriented and (b) subject-oriented, i.e. it needs to include subjective experience as a subject of study. The
explicit connection between semantics and subjectivity can be ensured by selecting an operational definition of semantics based on subjectivity, and this is the choice we made in this work.

To better understand what semantics are, how to measure and to quantify semantics, and how to build quantitative theories in semantic terms, it would be useful if we could represent semantics geometrically and apply powerful tools of geometry to (real rather than latent) semantic analysis. The notion of semantics, or the meaning of a representation, has many definitions and interpretations. Major philosophical, linguistic, and mathematical movements have been founded on trying to articulate these notions [25, 26]. The long and numerous controversial discussions of semantics in the literature from ancient to modern times justify the freedom of our choice, which is to define semantics in terms of subjective experience: in other words, we say that “meaningful” equals “meaningful to a subject”. In this case, the notion of semantics and the notion of potential experience become synonymous.

Although the new scientific paradigm needs to include subjective experience as a topic of investigation, no conceptual and mathematical tools are yet available to represent and study subjective experiences quantitatively with adequate accuracy, completeness, and generality. We need a powerful apparatus to deal with semantics of experience, and therefore, with semantics in general: the word “experience” does not add a restriction in this context. Accordingly, we expect implications of semantic cognitive maps for the entire natural science, not just for the new science of the mind. In conclusion, to develop a unified semantic theory applicable to subjective experience and to objective physical reality, we need to better understand semantics itself, from a conceptual to a computational level. This sort of knowledge can be extracted from natural language and all documents, viewed as a collective product of all human minds of all generations.

ACKNOWLEDGMENTS

The authors thank Dr. Harold J. Morowitz for his continuous encouragement.
REFERENCES
1. Lakoff, G.; Johnson, M. Metaphors We Live by, 2nd ed.; University of Chicago Press: Chicago, IL, 1980.
2. Samsonovich, A.V.; Ascoli, G.A. A simple neural network model of the hippocampus suggesting its pathfinding role in episodic
memory retrieval. Learn Memory 2005, 12, 193–208.
3. Baldinger, K. Semantic Theory: Towards a Modern Semantics; Brown W.C., Wright, R., Transl.; St. Martin’s Press: New York, 1980.
4. Gärdenfors, P. Conceptual Spaces: The Geometry of Thought. MIT Press: Cambridge, MA, 2004.
5. Shannon, C.E. A mathematical theory of communication. Bell Syst Technical J, 1948, 27, 379–423, 623–656.
6. Tversky, A.; Gati, I. Similarity, separability, and the triangle inequality. Psychol Rev 1982, 89, 123–154.
7. Russell, J.A. A circumplex model of affect. J Pers Soc Psychol 1980, 39, 1161–1178.
8. Osgood, C.E.; Suci, G.; Tannenbaum, P. The Measurement of Meaning. University of Illinois Press: Urbana, IL, 1957.
9. Landauer, T.K.; Dumais, S.T. A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and
representation of knowledge. Psychological Rev 1997, 104, 211–240.
10. Griffiths, T.L.; Steyvers, M. A probabilistic approach to semantic representation. In: Proceedings of the 24th Annual Conference
of the Cognitive Science Society; Gray, W.D.; Schunn, C.D., Eds.; Lawrence Erlbaum Associates: Hillsdale, NJ, 2002; pp 381–386.
11. Steyvers, M.; Shiffrin, R.M.; Nelson, D.L. Word association spaces for predicting semantic similarity effects in episodic memory.
In: Experimental Cognitive Psychology and Its Applications: Festschrift in Honor of Lyle Bourne, Walter Kintsch, and Thomas
Landauer; Healy, A., Ed. American Psychological Association: Washington, DC, 2004; pp 237–249.
12. Samsonovich, A.V.; Ascoli, G.A. Cognitive map dimensions of the human value system extracted from natural language. In:
Advances in Artificial General Intelligence: Concepts, Architectures and Algorithms. Proceedings of the AGI Workshop 2006.
Frontiers in Artificial Intelligence and Applications, vol. 157; Goertzel, B.; Wang, P., Eds. Amsterdam, The Netherlands: IOS Press,
2007; pp 111–124. ISBN 978-1-58603-758-1.
13. Schwab, D.; Lafourcade, M.; Prince, V. Antonymy and Conceptual Vectors. In: Proceedings COLING 2002: The 19th International
Conference on Computational Linguistics. http://www.aclweb.org/anthology-new/C/C02/C02-1061.pdf; 2002.
14. Hu, X.; Cai, Z.; Wiemer-Hastings, P.; Graesser, A.C.; McNamara, D.S. Strengths, limitations, and extensions of LSA. In: Handbook
of Latent Semantic Analysis; Landauer, T.K.; McNamara, D.S.; Dennis, S.; Kintsch, W., Eds.; Lawrence Erlbaum Associates:
Mahwah, NJ, 2007; pp 401–425.
15. Mandelbrot, B. The Fractal Geometry of Nature. Lecture notes in mathematics 1358. W. H. Freeman: New York, NY, 1982. ISBN
0716711869.
16. Martin, D.I.; Berry, M.W. Mathematical foundations behind latent semantic analysis. In: Handbook of Latent Semantic Analysis;
Landauer, T. K.; McNamara, D.S.; Dennis, S.; Kintsch, W., Eds.; Lawrence Erlbaum Associates: Mahwah, NJ, 2007; pp 35–55.
17. Dennis, S. Introducing word order within the LSA framework. In: Handbook of Latent Semantic Analysis; Landauer, T.K.;
McNamara, D.S.; Dennis, S.; Kintsch, W., Eds.; Lawrence Erlbaum Associates: Mahwah, NJ, 2007. pp 449–464.
18. Ascoli, G.A.; Samsonovich, A.V. Science of the conscious mind. Biol Bull 2008, 215, 204–215.
19. Bell, J.S. On the problem of hidden variables in quantum mechanics. Rev Mod Phys 1966, 38, 447.
20. Feynman, R.P. Simulating physics with computers. Int J Theoretical Phys 1982, 21, 467–488.
21. Chalmers, D.J. The Conscious Mind: In Search of a Fundamental Theory. Oxford University Press: Oxford, UK, 1996.
22. Popper, K.R. The Logic of Scientific Discovery. Unwin Hyman: London, 1934/1959.
23. Kuhn, T.S. The Structure of Scientific Revolutions. University of Chicago Press: Chicago, IL, 1962.
24. Lakatos, I. Falsification and the methodology of scientific research programs. In: Criticism and the Growth of Knowledge;
Lakatos, I.; Musgrave, A., Eds.; Cambridge University Press: Cambridge, UK, 1970.
25. Katz, J. Semantic Theory. Harper & Row: New York, 1972.
26. Putnam, H. Is semantics possible? In: Concepts: Core Readings; Margolis, E.; Laurence, S.; Eds. The MIT Press: Cambridge, MA,
1999; pp 177–187.