Part of Advances in Neural Information Processing Systems 12 (NIPS 1999)
Bradley Tonkes, Alan Blair, Janet Wiles
Recent theories suggest that language acquisition is assisted by the evolution of languages towards forms that are easily learnable. In this paper, we evolve combinatorial languages which can be learned by a recurrent neural network quickly and from relatively few examples. Additionally, we evolve languages for generalization in different "worlds", and for generalization from specific examples. We find that languages can be evolved to facilitate different forms of impressive generalization for a minimally biased, general purpose learner. The results provide empirical support for the theory that the language itself, as well as the language environment of a learner, plays a substantial role in learning: that there is far more to language acquisition than the language acquisition device.
1 Introduction: Factors in language learnability
In exploring issues of language learnability, the special abilities of humans to learn complex languages have been much emphasized, with one dominant theory based on innate, domain-specific learning mechanisms specifically tuned to learning human languages. It has been argued that without strong constraints on the learning mechanism, the complex syntax of language could not be learned from the sparse data that a child observes [1]. More recent theories challenge this claim and emphasize the interaction between learner and environment [2]. In addition to these two theories is the proposal that rather than "language-savvy infants", languages themselves adapt to human learners, and the ones that survive are "infant-friendly languages" [3-5]. To date, relatively few empirical studies have explored how such adaptation of language facilitates learning. Hare and Elman [6] demonstrated that
classes of past tense forms could evolve over simulated generations in response to changes in the frequency of verbs, using neural networks. Kirby [7] showed, using a symbolic system, how compositional languages are more likely to emerge when learning is constrained to a limited set of examples. Batali [8] has evolved recurrent networks that communicate simple structured concepts. Our argument is not that humans are general purpose learners. Rather, current research questions require exploring the nature and extent of biases that learners bring to language learning, and the ways in which languages exploit those biases [2]. Previous theories suggesting that many aspects of language were unlearnable without strong biases are gradually breaking down as new aspects of language are shown to be learnable with much weaker biases. Studies include the investigation of how languages may exploit biases as subtle as attention and memory limitations in children [9]. A complementary study has shown that general purpose learners can evolve biases in the form of initial starting weights that facilitate the learning of a family of recursive languages [10].
In this paper we present an empirical paradigm for continuing the exploration of factors that contribute to language learnability. The paradigm we propose necessitates the evolution of languages comprising recursive sentences over symbolic strings: languages whose sentences cannot be conveyed without combinatorial composition of symbols drawn from a finite alphabet. The paradigm is not based on any specific natural language, but rather, it is the simplest task we could find to illustrate the point that languages with compositional structure can be evolved to be learnable from few sentences. The simplicity of the communication task allows us to analyze the language and its generalizability, and highlight the nature of the generalization properties.
We start with the evolution of a recursive language that can be learned easily from five sentences by a minimally biased learner. We then address issues of robust learning of evolved languages, showing that different languages support generalization in different ways. We also address a factor to which scant regard has been paid, namely that languages may evolve not just to suit their learners, but also to be easily generalizable from a specific set of concepts. It seems almost axiomatic that learning paradigms should sample randomly from the training domain. It may be that human languages are not learnable from random sentences, but are easily generalizable from just those examples that a child is likely to be exposed to in its environment. In the third series of simulations, we test whether a language can adapt to be learnable from a core set of concepts.
2 A paradigm for exploring language learnability
We consider a simple language task in which two recurrent neural networks try to communicate a "concept" represented by a point in the unit interval, [0, 1], over a symbolic channel. An encoder network sends a sequence of symbols (thresholded outputs) for each concept, which a decoder network receives and processes back into a concept (the framework is described in greater detail in [11]). For communication to be successful, the decoder's output should approximate the encoder's input for all concepts. The architecture for the encoder is a recurrent network with one input unit and five output units, and with recurrent connections from both the output and hidden units back to the hidden units. The encoder produces a sequence of up to five symbols (states of the output units) taken from Σ = {A, ..., J}, followed by the $ symbol, for each concept taken from [0, 1]. To encode a value x ∈ [0, 1], the network
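To make the communication setup concrete, the following sketch (in Python with NumPy) instantiates an encoder and decoder of the kind described above. The hidden-layer sizes, weight scales, the sigmoid thresholding rule, the identification of each 5-bit output pattern with a symbol, and the use of an all-ones pattern as the stop ($) symbol are illustrative assumptions rather than the parameterization used in the paper; the weights here are random, whereas in the simulations they are evolved and trained.

```python
import numpy as np

class Encoder:
    """Recurrent encoder: maps a concept x in [0, 1] to a sequence of at
    most five thresholded output patterns ("symbols").  The hidden layer
    receives the input plus recurrent connections from both the previous
    hidden state and the previous (thresholded) output state."""

    def __init__(self, n_hidden=10, n_out=5, rng=None):
        rng = rng or np.random.default_rng(0)
        self.W_xh = rng.normal(scale=0.5, size=(n_hidden, 1))         # input -> hidden
        self.W_hh = rng.normal(scale=0.5, size=(n_hidden, n_hidden))  # hidden -> hidden
        self.W_oh = rng.normal(scale=0.5, size=(n_hidden, n_out))     # output -> hidden
        self.W_ho = rng.normal(scale=0.5, size=(n_out, n_hidden))     # hidden -> output
        self.n_hidden, self.n_out = n_hidden, n_out

    def encode(self, x, max_len=5):
        h = np.zeros(self.n_hidden)
        o = np.zeros(self.n_out)
        stop = (1,) * self.n_out          # assumed bit pattern for the '$' symbol
        symbols = []
        for _ in range(max_len):
            h = np.tanh(self.W_xh @ np.array([x]) + self.W_hh @ h + self.W_oh @ o)
            o = (1.0 / (1.0 + np.exp(-(self.W_ho @ h))) > 0.5).astype(float)  # threshold
            sym = tuple(int(b) for b in o)
            symbols.append(sym)
            if sym == stop:               # stop once the '$' pattern is emitted
                break
        return symbols

class Decoder:
    """Recurrent decoder: reads the symbol sequence and outputs a scalar
    in [0, 1] intended to approximate the encoder's input."""

    def __init__(self, n_in=5, n_hidden=10, rng=None):
        rng = rng or np.random.default_rng(1)
        self.W_xh = rng.normal(scale=0.5, size=(n_hidden, n_in))
        self.W_hh = rng.normal(scale=0.5, size=(n_hidden, n_hidden))
        self.W_hy = rng.normal(scale=0.5, size=(1, n_hidden))
        self.n_hidden = n_hidden

    def decode(self, symbols):
        h = np.zeros(self.n_hidden)
        for s in symbols:
            h = np.tanh(self.W_xh @ np.array(s, dtype=float) + self.W_hh @ h)
        y = 1.0 / (1.0 + np.exp(-(self.W_hy @ h)))   # squash the readout into [0, 1]
        return float(y[0])

# Communication is successful to the extent that decode(encode(x)) tracks x.
enc, dec = Encoder(), Decoder()
for x in np.linspace(0.0, 1.0, 5):
    print(f"x = {x:.2f}  ->  {enc.encode(x)}  ->  {dec.decode(enc.encode(x)):.3f}")
```

With random weights the decoded values will not track x; in the simulations the decoder learns, and the encoder's language evolves, so that the discrepancy is minimized across the interval.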