Research Topic Approval

Lynellen D. S. Perry
Department of Computer Science
Mississippi State University
January 24, 1997

In my pursuit of a Computer Science Ph.D., I have chosen Artificial Intelligence as my area of concentration. Within Artificial Intelligence, I am interested in the field of Natural Language Processing, and in corpus-based language processing techniques in particular. I am also interested in fractals, chaos, dynamical systems, and tools that have dynamic characteristics, such as neural networks (Kolen and Pollack, 1990). To blend these two interests, I plan to explore the fractal and chaotic properties of natural language using fractal, chaotic, or dynamic tools. In the analysis of language corpora, these tools would be theoretically capable of taking advantage of the self-similarity of written language.

There are several lines of research in the literature that combine the study of fractals, chaos, and natural language. Pollard-Gott (1986) analyzed the poetry of Wallace Stevens, examining its stylistic features of sound, word, and image repetition. She showed how a fractal structure could be discovered in several Stevens poems and suggested that fractal structure may be important in helping humans appreciate art forms such as music and writing. George David Birkhoff makes a similar argument (as reported by Schroeder, 1991) in his ‘Theory of aesthetic value’: for art to be pleasing and interesting, it cannot be too predictable or too surprising. In signal-processing terms, art cannot lie in the ‘brown’ noise range (too predictable and boring) or in the ‘white’ noise range (too random and unpredictable to be enjoyable). Richard Voss determined that most famous classical music is in the ‘pink’ noise range (Schroeder, 1991) where, like Bach’s music, “It is great because it is inevitable and yet surprising” (Balthasar van der Pol as quoted in Schroeder, 1991).
‘Brown’, ‘pink’, and ‘white’ noise are terms for particular power laws, with power spectra that fall off as 1/f^2, 1/f, and 1/f^0 (flat), respectively. A power law relationship that is relevant to natural language is Zipf’s law. Zipf observed that “the distribution of word frequencies in English, if the words are aligned according to their ranks, is an inverse power law with the exponent very close to 1” (Li, 1992). Zipf’s law is also observed in random texts (Li, 1992), and Li argues that this occurs because rank is chosen as the independent variable. If word length were chosen instead, one would find an exponential distribution. Li questions whether Zipf’s law in natural language is also due to choosing rank as the independent variable. Both Li and Mandelbrot (1983) agree that while Zipf’s law appears to be very interesting at first, it is not a deep law in natural language. However, the fact remains that Zipf’s law is observable and that power laws “are, by definition, self-similar” (Schroeder, 1991), and self-similarity is an important characteristic of fractals (West and Shlesinger, 1990).

Li (1992) also says that “other scaling phenomena such as the 1/f noise or long-range correlation” are missing from natural languages, “as observed by the author that the mutual information function between two letters decays faster than power laws of small exponents”. The lack of long-range correlation resembles chaotic behavior, in which a system is so sensitive to initial conditions that two points that begin in close proximity end up far apart.

Mandelbrot uses lexicographical trees to show “why the generalized Zipf’s law holds” (1983). These trees ‘scale’, meaning that “each branch taken by itself is in some way a reduced-scale version of the whole tree” (Mandelbrot, 1983). This is another way to say that the trees are self-similar, and it is strictly true for a random language.
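The random-text observation attributed to Li (1992) above can be explored with a short simulation. What follows is a minimal sketch under stated assumptions, not Li's procedure: it draws characters uniformly from a small alphabet plus the space, splits the result into 'words', and fits a least-squares line to log frequency against log rank. The four-letter alphabet, the text length, the min_count cutoff, the seed, and all function names are assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def random_text(n_chars, alphabet="abcd", seed=0):
    """Return n_chars symbols drawn uniformly from alphabet plus the space.

    The alphabet and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    symbols = alphabet + " "
    return "".join(rng.choices(symbols, k=n_chars))


def rank_frequency(text, min_count=5):
    """Word counts in descending order, dropping words seen fewer than min_count times."""
    counts = Counter(text.split())  # split() discards runs of spaces
    return sorted((c for c in counts.values() if c >= min_count), reverse=True)


def loglog_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank), rank starting at 1."""
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


freqs = rank_frequency(random_text(200_000))
slope = loglog_slope(freqs)
print(f"{len(freqs)} word types, fitted log-log slope = {slope:.2f}")
```

A Zipf-like inverse power law would appear as a roughly straight line with slope near -1 on such a log-log plot. The fitted value depends on the alphabet size and on the cutoff, so it should be read as indicative only.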
However, “actual lexicographical trees are far from being strictly scaling” (Mandelbrot, 1983) and “commonly spoken and written languages do not grow on self-similar trees – or if we insist on hanging them from such trees . . . most branches would be dead” (Schroeder, 1991). Despite this, Mandelbrot used the lexicographical trees and Zipf’s law to propose a measure for natural language that is a counterpart of physical energy, physical entropy, and the “cost of coding and Shannon’s information” (Mandelbrot, 1983). This is the “temperature of discourse. The ‘hotter’ the discourse, the higher the probability of use of rare words” (Mandelbrot, 1983).

Shanon (1993) also argues for a fractal structure in natural language. He reports “fractal patterns manifested by linguistic expressions as they are employed to describe states of affairs in different levels of resolution or detail”. In particular, he points out that temporal expressions, many verbs, and many adjectives exhibit a fractal linguistic pattern.

‘Predictable’ is a synonym for deterministic and ‘surprising’ is a synonym for random. Between these two mathematical concepts lies chaos. Poston (1987) suggests a dynamical, continuous approach to speech understanding and word meaning disambiguation because word “meanings can change continuously over time, . . . when they do shift discontinuously the jumps may be better modeled by continuous dynamics with bifurcating attractors than by discrete models”. Nicolis and Katsikas (1993) also report using chaos to work with natural language. “A multifractal strange chaotic attractor . . . is responsible for a highly non-linear linguistic filtering process, limiting drastically: 1) at the syntactical level: the grammatically legitimate words; 2) at the semantic level: the ‘interesting’ key features of a pattern” (Nicolis and Katsikas, 1993).
Pollack (1991) proposes a test for the complexity of syntactic structure based on the state-space limit of a dynamical recognizer and uses “neural networks as non-linear dynamical systems” for this task. Andreyev et al. (1996) exploit recurrent neural networks with chaos for the purpose of storing and retrieving information (images). Elman (1995) also combines natural language, chaos, and neural networks, suggesting “an alternative view of computation, in which language processing is seen as taking place in a dynamical system. The lexicon is viewed as consisting of regions of state space within that system; the grammar consists of the dynamics (attractors and repellers) which constrain movement in that space” (Elman, 1995).

I have begun exploring neural networks that capture recurring but highly varied syntactic patterns in part-of-speech sequences of up to 13 words. The neural networks can generalize to patterns that are similar to, but not identical to, patterns that they have seen previously. One part of that exploration, an experiment that is currently underway, explicitly incorporates a fractal character into the neural network architecture. I plan to continue this line of research, in addition to exploring other neural network architectures and other forms of dynamical tools.

APPROVED:

Lois Boggess, Ph.D. ____________________ Date ____________

Gene Boggess, Ph.D. ____________________ Date ____________

Susan Bridges, Ph.D. ____________________ Date ____________

Julia Hodges, Ph.D. ____________________ Date ____________

Donald Dearholt, Ph.D. ____________________ Date ____________

Brad Carter, Ph.D. ____________________ Date ____________

Bibliography

Andreyev, Y. V., Y. L. Belsky, A. S. Dmitriev, and D. A. Kuminov. 1996. Information processing using dynamical chaos: Neural networks implementation. IEEE Transactions on Neural Networks, Vol. 7, No. 2, pp. 290-299.

Elman, Jeffrey L. 1995. Language as a dynamical system. In Robert F. Port and T. van Gelder (Eds.) Mind as motion: Explorations in the dynamics of cognition. Cambridge, MA: MIT Press, pp. 195-223.

Kolen, John F. and Jordan B. Pollack. 1990. Back propagation is sensitive to initial conditions. Complex Systems, Vol. 4, No. 3, pp. 269-280.

Li, Wentian. 1992. Random texts exhibit Zipf’s-law-like word frequency distribution. IEEE Transactions on Information Theory, Vol. 38, No. 6, pp. 1842-1845.

Mandelbrot, Benoit B. 1983. The fractal geometry of nature. San Francisco: W. H. Freeman and Company.

Nicolis, John S. and Anastassis A. Katsikas. 1993. Chaotic dynamics of linguistic-like processes at the syntactical and semantic levels: In the pursuit of a multifractal attractor. In Bruce J. West (Ed.) Patterns, Information and Chaos in Neuronal Systems. Singapore: World Scientific. Studies of Nonlinear Phenomena in Life Science, Vol. 2, pp. 123-231.

Pollack, Jordan B. 1991. The induction of dynamical recognizers. Machine Learning, Vol. 7, pp. 227-252. Boston: Kluwer Academic Publishers.

Pollard-Gott, Lucy. 1986. Fractal repetition structure in the poetry of Wallace Stevens. Language and Style, Vol. 19, No. 3, pp. 233-249.

Poston, Tim. 1987. Mister! Your back wheel’s going round! In Thomas T. Ballmer and Wolfgang Wildgen (Eds.) Process Linguistics. Tübingen: Max Niemeyer Verlag, pp. 11-36.

Schroeder, M. R. 1991. Fractals, chaos, power laws: Minutes from an infinite paradise. W. H. Freeman and Company.

Shanon, Benny. 1993. Fractal patterns in language. New Ideas in Psychology, Vol. 11, No. 1, pp. 105-109.

West, Bruce and Michael Shlesinger. 1990. The noise in natural phenomena. American Scientist, Vol. 78, Jan./Feb., pp. 40-45.