Research Topic Approval

Lynellen D. S. Perry
Department of Computer Science
Mississippi State University
January 24, 1997

	In my pursuit of a Computer Science Ph.D., I have chosen Artificial Intelligence as my 
area of concentration.  Within Artificial Intelligence, I am interested in the field of Natural 
Language Processing, and corpus-based language processing techniques in particular.  I am also 
interested in fractals, chaos, dynamical systems, and tools which have dynamic characteristics, 
such as neural networks (Kolen and Pollack, 1990).  To blend these two interests, I plan to 
explore the fractal and chaotic properties of natural language using fractal, chaotic, or dynamic 
tools.  In the analysis of language corpora, such tools should, in principle, be able to exploit the 
self-similarity of written language.
	There are several lines of research in the literature that combine the study of fractals, 
chaos, and natural language.  Pollard-Gott (1986) analyzed the poetry of Wallace Stevens, 
examining its stylistic features of sound, word, and image repetition.  She showed how a fractal 
structure could be discovered in several Stevens poems and suggested that fractal structure may 
be important in helping humans appreciate art forms such as music and writing.  George David 
Birkhoff makes a related argument in his ‘Theory of aesthetic value’ (as reported by Schroeder, 
1991): for art to be pleasing and interesting, it cannot be too predictable or too surprising.  
Phrased in signal vocabulary, art cannot lie in the ‘brown’ noise range (too predictable and 
boring) or in the ‘white’ noise range (too random and unpredictable to be enjoyable).  Richard 
Voss determined that most famous classical music is in the ‘pink’ noise range (Schroeder, 1991) 
where, like Bach’s music, “It is great because it is inevitable and yet surprising” (Balthasar van 
der Pol as quoted in Schroeder, 1991).  
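The contrast between the two extremes can be made concrete with a small numerical sketch.  It assumes that ‘brown’ noise can be modeled as a running sum of ‘white’ noise samples and uses lag-1 autocorrelation as a crude stand-in for predictability; the sample size, the seed, and the function name are illustrative choices, and ‘pink’ noise, which lies between the two, is not generated here.

```python
import random

def lag1_autocorrelation(xs):
    """Sample lag-1 autocorrelation: near 0 for unpredictable series, near 1 for smooth ones."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    return cov / var

rng = random.Random(0)  # fixed seed so the run is repeatable
white = [rng.gauss(0.0, 1.0) for _ in range(5000)]

# Assumption: brown noise modeled as a running sum (random walk) of the white samples.
brown = []
total = 0.0
for w in white:
    total += w
    brown.append(total)

white_r = lag1_autocorrelation(white)
brown_r = lag1_autocorrelation(brown)
print(f"white: {white_r:.3f}  brown: {brown_r:.3f}")
```

In this sketch each white sample says almost nothing about the next one, while each brown sample is almost entirely determined by its predecessor.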
	‘Brown’, ‘pink’, and ‘white’ noise are terms for particular power laws.  A power law 
relationship that is relevant to natural language is Zipf’s law.  Zipf observed that “the distribution 
of word frequencies in English, if the words are aligned according to their ranks, is an inverse 
power law with the exponent very close to 1” (Li, 1992).  Zipf’s law is also observed in random 
texts (Li, 1992) and Li argues that this occurs because rank is chosen as the independent variable.  
If word length were chosen instead, one would find an exponential distribution.  Li questions 
whether Zipf’s law in natural language is also due to choosing rank as the independent variable.  
Both Li and Mandelbrot (1983) agree that while Zipf’s law appears to be very interesting at first, 
it is not a deep law in natural language.  However, the fact remains that Zipf’s law is observable 
and that power laws “are, by definition, self-similar” (Schroeder, 1991), and self-similarity is an 
important characteristic of fractals (West and Shlesinger, 1990).  Li (1992) also says that 
“other scaling phenomena such as the 1/f noise or long-range correlation” are missing from 
natural languages, “as observed by the author that the mutual information function between two 
letters decays faster than power laws of small exponents”.  This lack of long-range correlation 
resembles chaotic behavior, in which a system is so sensitive to initial conditions that two points 
which begin in close proximity end up far apart.
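The observation that random texts also show a Zipf-like rank-frequency curve can be checked with a short sketch.  It assumes a random text built by drawing characters uniformly from a small alphabet plus a space, with words taken as the runs of letters between spaces.  The four-letter alphabet, the text length, the frequency cutoff, and the function names are illustrative assumptions, not Li’s settings, and a least-squares line on log-log axes is only a crude estimate of the exponent.

```python
import math
import random
from collections import Counter

def random_text_words(n_chars, alphabet="abcd", seed=0):
    """Draw characters uniformly from alphabet + space and split the result into words."""
    rng = random.Random(seed)
    symbols = alphabet + " "
    text = "".join(rng.choice(symbols) for _ in range(n_chars))
    return text.split()  # split() drops the empty words between adjacent spaces

def zipf_slope(words, min_count=5):
    """Least-squares slope of log(frequency) against log(rank), ignoring very rare words."""
    counts = sorted(Counter(words).values(), reverse=True)
    points = [(math.log(rank), math.log(c))
              for rank, c in enumerate(counts, start=1) if c >= min_count]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx

words = random_text_words(200_000)
slope = zipf_slope(words)
print(f"fitted log-log slope: {slope:.2f}")
```

A clearly negative slope from text with no linguistic structure at all is the kind of result that motivates Li’s caution about reading Zipf’s law as a deep property of language.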
	Mandelbrot uses lexicographical trees to show “why the generalized Zipf’s law holds” 
(1983).  These trees ‘scale’, meaning that “each branch taken by itself is in some way a 
reduced-scale version of the whole tree” (Mandelbrot, 1983).  This is another way of saying that 
the trees are self-similar, and it is strictly true for a random language.  However, “actual 
lexicographical trees are far from being strictly scaling” (Mandelbrot, 1983) and “commonly 
spoken and written languages do not grow on self-similar trees – or if we insist on hanging them 
from such trees . . . most branches would be dead” (Schroeder, 1991).  Despite this, Mandelbrot 
used the lexicographical trees and Zipf’s law to propose a measure for natural language that is a 
counterpart of physical energy, physical entropy, and the “cost of coding and Shannon’s 
information” (Mandelbrot, 1983).  This is the “temperature of discourse.  The ‘hotter’ the 
discourse, the higher the probability of use of rare words” (Mandelbrot, 1983).  
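The link between ‘temperature’ and rare words can be illustrated with a minimal sketch.  It assumes the generalized Zipf form in which the probability of the word of rank r is proportional to (r + V)^(-B), and it assumes that the temperature of discourse can be read as 1/B; that reading, the vocabulary size, the value of V, and the rank at which a word counts as ‘rare’ are all assumptions made for this illustration.

```python
def zipf_mandelbrot(n_ranks, B, V=0.0):
    """Normalized probabilities p(r) proportional to (r + V) ** -B for r = 1..n_ranks."""
    weights = [(r + V) ** -B for r in range(1, n_ranks + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def rare_word_mass(probs, first_rare_rank=1000):
    """Total probability carried by the words at rank first_rare_rank and beyond."""
    return sum(probs[first_rare_rank - 1:])

# Assumption: temperature is read as 1/B, so a smaller B means a 'hotter' discourse.
masses = {}
for B in (1.5, 1.2, 1.0):
    probs = zipf_mandelbrot(n_ranks=50_000, B=B, V=2.0)
    masses[B] = rare_word_mass(probs)
    print(f"B = {B:.1f}  temperature 1/B = {1 / B:.2f}  rare-word mass = {masses[B]:.3f}")
```

Under these assumptions, lowering B flattens the distribution and shifts probability toward the low-frequency tail, which matches the description of a ‘hotter’ discourse.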
	Shanon (1993) also argues for a fractal structure in natural language.  He reports “fractal 
patterns manifested by linguistic expressions as they are employed to describe states of affairs in 
different levels of resolution or detail”.  In particular he points out that temporal expressions, 
many verbs, and many adjectives exhibit a fractal linguistic pattern.
	‘Predictable’ is a synonym for deterministic and ‘surprising’ is a synonym for random.  Chaos 
lies between these two mathematical concepts.  Poston (1987) suggests a dynamical, continuous 
approach to speech understanding and word meaning disambiguation because word “meanings 
can change continuously over time, . . . when they do shift discontinuously the jumps may be 
better modeled by continuous dynamics with bifurcating attractors than by discrete models”.
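The sense in which chaos sits between the deterministic and the random can be sketched with the logistic map, a standard textbook example that is not taken from the works cited here.  The rule is fully deterministic, yet two starting points that differ by a tiny amount soon follow unrelated paths.  The parameter value, the starting point, the perturbation, and the step count are illustrative assumptions.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x) and return the whole orbit."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    return orbit

a = logistic_orbit(0.2, 60)
b = logistic_orbit(0.2 + 1e-10, 60)  # same rule, almost the same starting point
gaps = [abs(x - y) for x, y in zip(a, b)]
print(f"initial gap {gaps[0]:.1e}, largest gap within 60 steps {max(gaps):.2f}")
```

Rerunning either orbit from the same starting value reproduces it exactly, so the unpredictability comes entirely from uncertainty about the initial condition.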
	Nicolis and Katsikas (1993) also report using chaos to work with natural language.  “A 
multifractal strange chaotic attractor . . . is responsible for a highly non-linear linguistic filtering 
process, limiting drastically: 1) at the syntactical level: the grammatically legitimate words; 2) at 
the semantic level: the ‘interesting’ key features of a pattern” (Nicolis and Katsikas, 1993).
	Pollack (1991) proposes a test for the complexity of syntactic structure based on the 
state-space limit of a dynamical recognizer and uses “neural networks as non-linear dynamical 
systems” for this task.  Andreyev et al. (1996) exploit recurrent neural networks with chaos for 
the purpose of storing and retrieving information (images).
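The idea of a dynamical recognizer can be sketched as a state that is moved by a different map for each input symbol, followed by a decision on the final state.  The toy below is only in the spirit of that idea: it assumes a one-dimensional state and hand-set maps instead of trained neural network weights, and the helper name and the chosen language (strings containing an odd number of 1s) are hypothetical choices made for illustration.

```python
def make_recognizer(maps, decide, start):
    """Build a recognizer from symbol-indexed state maps, a decision function, and a start state."""
    def accepts(string):
        z = start
        for symbol in string:
            z = maps[symbol](z)  # each input symbol selects the map applied to the state
        return decide(z)
    return accepts

# Assumption: hand-set toy maps on the unit interval, not learned weights.
parity_maps = {"0": lambda z: z, "1": lambda z: 1.0 - z}
odd_ones = make_recognizer(parity_maps, decide=lambda z: z > 0.5, start=0.0)

for s in ("", "1", "1101", "0110"):
    print(repr(s), odd_ones(s))
```

Replacing the hand-set maps with trained non-linear maps over a higher-dimensional state is what would turn this toy into something closer to a network-based recognizer.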
	Elman (1995) also combines natural language, chaos, and neural networks, suggesting “an 
alternative view of computation, in which language processing is seen as taking place in a 
dynamical system.  The lexicon is viewed as consisting of regions of state space within that 
system; the grammar consists of the dynamics (attractors and repellers) which constrain 
movement in that space” (Elman, 1995).
	I have begun exploring the use of neural networks that capture recurring but highly 
varied syntactic patterns in part-of-speech sequences of up to 13 words.  The neural networks 
can generalize to patterns that are similar, but not identical, to patterns that they have seen 
previously.  One part of that exploration, an experiment currently underway, explicitly 
incorporates a fractal character into the neural network architecture.  I plan to continue this 
line of research, in addition to exploring other neural network architectures and other forms of 
dynamical tools.


APPROVED:

Lois Boggess, Ph. D. _________________________________________Date_______________

Gene Boggess, Ph. D._________________________________________Date_______________

Susan Bridges, Ph. D._________________________________________Date_______________

Julia Hodges, Ph. D.__________________________________________Date_______________

Donald Dearholt, Ph. D._______________________________________Date_______________

Brad Carter, Ph. D.___________________________________________Date_______________

Bibliography

Andreyev, Y. V., Y. L. Belsky, A. S. Dmitriev, and D. A. Kuminov. 1996. Information processing 
using dynamical chaos: Neural networks implementation. IEEE Transactions on Neural 
Networks. Vol. 7, No. 2, pp. 290-299.

Elman, Jeffrey L. 1995. Language as a dynamical system. In Robert F. Port and T. van Gelder 
(Eds.) Mind as motion: Explorations in the dynamics of cognition. Cambridge, MA: MIT 
Press, pp. 195-223.

Kolen, John F. and Jordan B. Pollack. 1990. Back propagation is sensitive to initial conditions. 
Complex Systems, Vol. 4, No. 3, pp. 269-280.

Li, Wentian. 1992. Random texts exhibit Zipf’s-law-like word frequency distribution. IEEE 
Transactions on Information Theory, Vol. 38, No. 6, pp. 1842-1845.

Mandelbrot, Benoit B. 1983. The fractal geometry of nature. San Francisco: W. H. Freeman and 
Company.

Nicolis, John S. and Anastassis A. Katsikas. 1993. Chaotic dynamics of linguistic-like processes at 
the syntactical and semantic levels: In the pursuit of a multifractal attractor. In Bruce J. 
West (Ed.) Patterns, Information and Chaos in Neuronal Systems. Singapore: World 
Scientific. Studies of Nonlinear Phenomena in Life Science, Vol. 2, pp. 123-231.

Pollack, Jordan B. 1991. The induction of dynamical recognizers. Machine Learning. Boston: 
Kluwer Academic Publishers. Vol. 7, pp. 227-252.

Pollard-Gott, Lucy. 1986. Fractal repetition structure in the poetry of Wallace Stevens. Language 
and Style, Vol. 19, No. 3, pp. 233-249.

Poston, Tim. 1987. Mister! Your back wheel’s going round! In Thomas T. Ballmer and Wolfgang 
Wildgen (Eds.) Process Linguistics. Tübingen: Max Niemeyer Verlag, pp. 11-36.

Schroeder, M. R. 1991. Fractals, chaos, power laws: Minutes from an infinite paradise. New York: 
W. H. Freeman and Company.

Shanon, Benny. 1993. Fractal patterns in language. New Ideas in Psychology, Vol. 11, No. 1, pp. 
105-109.

West, Bruce and Michael Shlesinger. 1990. The noise in natural phenomena. American Scientist, 
Vol. 78, Jan./Feb., pp. 40-45.