An Adaptive Multi-Agent System Based on "Neural Darwinism"

Andrei Popescu-Belis

Laboratoire d'Informatique pour la Mécanique et
les Sciences de l'Ingénieur (LIMSI - CNRS)
B.P. 133
F-91403 Orsay Cedex, France
popescu@limsi.fr

Abstract

The Theory of Neuronal Group Selection provides a basis for designing sensorimotor systems that adapt to their environment. A new modeling technique is proposed, using a multi-agent architecture rather than a connectionist one. The resulting system performs perceptual categorization of a simple environment, while its learning is based on reentrant links between agents of the control device. Experiments confirm that the system is stable and adapts appropriately.

TNGS and Multi-Agent Architecture

The Theory of Neuronal Group Selection (TNGS), proposed by G. M. Edelman (1987) and commonly referred to as "neural Darwinism", introduces some elementary neurobiological principles in order to build a global description of the central nervous system (Edelman 1989). According to the TNGS, the cerebral cortex is structured in repertoires of neuronal groups which act as selective systems on perceptual input (Edelman 1978, 1987). As repertoires can also categorize the activity of other repertoires, the perceptual input is categorized in an increasingly abstract way. Some repertoires are connected through reentrant pathways, enabling elementary associative learning.

The computer models of the TNGS (Reeke et al. 1990) provide convincing support for the theory. However, despite the ambition of the TNGS to account for higher cognitive functions such as language and even consciousness (Edelman 1989), the models seem difficult to extend, as they rely heavily on finely tuned scalar connections between simulated neurons ("integrative units").

Proposals for Higher-Level Computer Modeling

My first proposal is to use agents instead of "integrative units", symbolic message passing instead of numeric synaptic inputs, acquaintances instead of connections, and discrete activation states instead of scalar ones. This corresponds to the "neuronal group" concept, which lies at the very basis of the TNGS. Although the genesis of neuronal groups is simulated in (Reeke et al. 1990, pp. 616-627), and neuronal groups show clear biological relevance (Mountcastle 1978), they are not used in the Darwin series. Secondly, my aim here is to study the explicit representation of the values or needs of a system designed according to the TNGS principles. The lack of such a representation is a limitation of Darwin III: its "repulsive reflex" is implicitly defined by some connections, and seems to be merely an arbitrary link between perception and motion, used to check the proper execution of categorization operations. Yet value is a central element in the explanation of cognitive functions given in (Edelman 1989).

Description of the Architecture

The system presented here is a simple "animat", simulating very basic features of a living being: positional perception of the environment, internal needs or values, and motor capabilities. The environment is linear: a segment, an infinite line, or a circle. Two areas of the environment (N and B) provide two different resources, also named N and B, for nutriment and beverage respectively. Presence in one of these zones is sufficient to refill the animat's corresponding reservoir (also called N or B) completely and instantly. Outside these regions, the reservoirs empty at constant rates.
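As an illustration only, the environment and reservoir dynamics described above could be sketched as follows. The names `Zone`, `Animat` and `step`, the reservoir capacity, and the depletion rate are assumptions made for this sketch; the paper gives no numeric parameters, and only the segment or infinite-line case is covered.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    """A resource area on the line, given by its two end points."""
    lo: float
    hi: float

    def contains(self, x: float) -> bool:
        return self.lo <= x <= self.hi


@dataclass
class Animat:
    position: float
    capacity: float = 100.0   # assumed reservoir capacity
    drain: float = 1.0        # assumed constant depletion per cycle
    levels: dict = field(default_factory=dict)

    def __post_init__(self):
        # Both reservoirs start full unless levels are given explicitly.
        if not self.levels:
            self.levels = {"N": self.capacity, "B": self.capacity}


def step(animat: Animat, zones: dict, move: float) -> bool:
    """Move the animat, then refill or drain each reservoir.

    Returns True while both reservoirs are non-empty.
    """
    animat.position += move
    for name, zone in zones.items():
        if zone.contains(animat.position):
            # Presence in a zone refills its reservoir completely and instantly.
            animat.levels[name] = animat.capacity
        else:
            # Outside the zone, the reservoir empties at a constant rate.
            animat.levels[name] = max(0.0, animat.levels[name] - animat.drain)
    return all(level > 0 for level in animat.levels.values())
```

In this sketch, the number of calls to `step` before it returns False plays the role of the lifetime of one instance.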

A Multi-Agent Based Control Device

Mainly inspired by the TNGS notion of neuronal group, agents of this system react to various patterns of a repertoire (set) of agents they are assigned to watch, i.e., to combinations of messages received in their mailboxes. The response message also depends on an internal discrete state (the agent is activated or not). Agents are triggered in a predefined order, starting with the input and resource-watching repertoires and ending with motor output. One such pass constitutes an activation cycle.
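A minimal sketch of such an agent and of one activation cycle is given below, under simplifying assumptions: each agent reacts to a single message combination (`pattern`), a message is just the sender's name, and only activated agents send messages. The class `Agent`, the function `activation_cycle`, and the wiring used in the usage note are hypothetical and are not taken from the actual system.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    pattern: frozenset                             # message combination the agent reacts to
    watchers: list = field(default_factory=list)   # agents assigned to watch this one
    mailbox: set = field(default_factory=set)
    active: bool = False                           # internal discrete state

    def fire(self) -> None:
        """Match the mailbox against the pattern, update the state, notify watchers."""
        self.active = bool(self.pattern) and self.pattern <= self.mailbox
        self.mailbox.clear()
        if self.active:
            for watcher in self.watchers:
                watcher.mailbox.add(self.name)


def activation_cycle(ordered_agents, stimuli):
    """Trigger every agent once, in the predefined order (input first, motor last).

    `stimuli` maps agent names to external messages delivered before firing.
    Returns the names of the agents that ended the cycle activated.
    """
    for agent in ordered_agents:
        agent.mailbox |= stimuli.get(agent.name, set())
        agent.fire()
    return [agent.name for agent in ordered_agents if agent.active]
```

For example, a chain in which a hypothetical second-order recognizer watches two first-order recognizers and is itself watched by a motor agent activates end to end only when both recognizers receive their input in the same cycle.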

Visual Categorization. The linear retina ("Visual Cells") provides the control system with a topographic representation of the environment. Agents from three repertoires scrutinize this input area (RecoPosN, RecoPosB, RecoSense). They receive messages broadcast by the "Visual Cells", and respond specifically to combinations of incoming messages. "Second order recognizers", R-of-Rs, as in (Edelman 1978, 1987), categorize combinations of these outgoing messages. There are 16 R-of-R agents, corresponding to two- or three-message combinations.

Reentrant Protocols. These are communication rules between agents of the InteroB and InteroN repertoires (interoception, or resource watching) and agents of the R-of-R repertoires, as well as between R-of-R and MotCel (motor cells). While interpreting and simplifying Edelman's "reentrance", these protocols provide the basis for adaptive learning through acquaintance evolution. Simultaneous activation of two agents establishes an acquaintance link between them, which is not permanent but subject to gradual forgetting, i.e., a decrease of the acquaintance degree. Conversely, activation of an agent induces (under some conditions) activation of its acquaintances.
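The acquaintance mechanism can be illustrated by the following sketch. It assumes a degree in [0, 1] that is set to 1 on co-activation and decreases linearly at each cycle, and it uses a simple threshold as a stand-in for the unspecified conditions under which activation is passed on. The class `Acquaintances`, its parameters, and the agent names in the usage note are assumptions made for this illustration.

```python
class Acquaintances:
    """Acquaintance links between agents, with gradual forgetting."""

    def __init__(self, decay: float = 0.05, threshold: float = 0.5):
        self.decay = decay            # assumed loss of degree per cycle
        self.threshold = threshold    # assumed minimum degree for passing activation
        self.degree = {}              # (name_a, name_b), sorted -> degree in (0, 1]

    def update(self, active: set) -> None:
        """Apply one cycle: forget a little, then link co-active agents."""
        for pair in list(self.degree):
            self.degree[pair] -= self.decay
            if self.degree[pair] <= 0:
                del self.degree[pair]          # the link is completely forgotten
        names = sorted(active)
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                self.degree[(a, b)] = 1.0      # simultaneous activation (re)establishes the link

    def recall(self, name: str) -> set:
        """Return the acquaintances that `name` would activate when it is active."""
        return {
            other
            for pair, d in self.degree.items()
            if name in pair and d >= self.threshold
            for other in pair
            if other != name
        }
```

With `decay=0.25`, a link created between a hypothetical `InteroN` agent and an R-of-R agent still passes activation two cycles later, stops doing so after the third cycle, and disappears after the fourth unless the two agents are co-activated again.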

The two directions of each protocol operate simultaneously, even if one direction looks more like learning (MotCel > R-of-R and R-of-R > Intero) and the other looks like execution or application of what was learnt (Intero > R-of-R > MotCel). As in biological systems, there is no separation here between the two phases, learning being possible at any moment. Experimental evidence confirms the stability of these reentrant protocols.

Operating principle. Once initialized, the animat has to acquire motor control, and then to spot the regions where its needs are satisfied. Reduced random motor activity leads to coherent linkage between MotCel agents and R-of-R agents. This is not, however, sufficient to make the system recognize the utility of the N and B regions; an external "push" is necessary to drive it onto N, then B. Indeed, there is no prior knowledge about these regions: they behave initially as perceptual landmarks, and are associated with the corresponding value only after learning.

On arrival at N (or B), active InteroN (or InteroB) agents are de-activated and connected to the active R-of-R agents. These connections later serve to convey activation messages from the Intero agents to the proper motor agents (MotCel), and are reinforced through further successful application, i.e., continuous comings and goings between N and B. Oscillation between N and B is a stable and appropriate behavior, which may be altered by changes in the environment (e.g., shifting of the N or B areas).

Experimental Results

In order to quantify the system's stability, instruction by external intervention was permitted during the first 1000 cycles, then the lifetime (until one reservoir was emptied) was recorded. In one series of experiments, 12 instances were given the same numeric parameters and similar instruction. The average lifetime was 1,550,000 cycles. Six instances were stopped after at least 1,000,000 cycles, one being left to run for up to 6 million (6 × 10^6) cycles.

By comparison, when no external help is provided, the lifetime never exceeds 1000 cycles. Fewer than 20% of the instances are able to spot one resource zone (the others find none). The two resource zones are never both recognized without external intervention, as the discovery of one zone makes it too "attractive" for the animat to move away from it for long random explorations.

The agent has no reason to leave the N-B interval for the outer zone of the infinite line, as oscillation between N and B is a satisfactory trajectory. However, after proper training, the agent was moved outside the segment and was always able to return to it, whatever its resource levels were. When the training is restricted to the segment, behavior outside it relies on generalization to this new situation; current work seeks to raise the number of correct generalizations toward its theoretical limit.

References

Edelman, G. M. 1987. Neural Darwinism: The Theory of Neuronal Group Selection. New York: Basic Books.

Edelman, G. M. 1989. The Remembered Present: A Biological Theory of Consciousness. New York: Basic Books.

Edelman, G. M., and Mountcastle, V. B. 1978. The Mindful Brain: Cortical Organization and the Group Selective Theory of Higher Brain Function. Cambridge: MIT Press.

Reeke, G. N., Finkel, L. H., Sporns, O., and Edelman, G. M. 1990. Synthetic Neural Modeling: A Multilevel Approach to the Analysis of Brain Complexity. In Edelman, G. M., Gall, W. E., and Cowan, W. M., eds. 1990. Signal and Sense: Local and Global Order in Perceptual Maps, 607-707. New York: Wiley-Liss.