Evolving Behaviour Trees for the Mario Bros Game Using Grammatical Evolution

 

                     Miguel Nicolau (1) and Diego Perez (2)

            (1) Natural Computing Research & Applications Group,
                University College Dublin, Dublin, Ireland
            (2) University Carlos III, Madrid, Spain

            Miguel.Nicolau@ucd.ie, diego.perez.liebana@gmail.com

       Abstract.

1     Introduction

2     The 2010 Mario AI Competition
The 2010 Mario AI Competition is a contest organized by Julian Togelius and
Sergey Karakovskiy, and it is the successor to the competition held by the same
organizers in 2009 [JSR10]. The 2010 competition took place in four different
events: EvoStar 2010, World Congress on Computational Intelligence (WCCI)
2010, Conference on Computational Intelligence and Games (CIG) 2010 and
Games Innovation Conference (GIC) 2010.
    Participants submit a bot that can compete in up to three different tracks:
gameplay, learning and level generation. The bot presented by the authors for
this competition took part in the gameplay track at CIG'10, where bots are
evaluated on levels that have not previously been seen by the competitors.
    The score of each bot is based on the distance run by Mario (the bot), plus
the sum of some other factors, like collected items, enemies killed and time left.
The evaluation is made over several executions, varying level length, enemy types
and difficulty, so the final score is the sum of all these evaluations. The bot that
gets the highest score becomes the winner of the competition.

2.1   The Mario Bros Benchmark
The Mario Bros Benchmark, used for running the competition, is open-source
software written in Java and developed by Julian Togelius, Sergey Karakovskiy,
Tom Schaul and Jan Koutnik [MBW].
    This benchmark makes it possible to create an agent that plays the Mario Bros
game just by writing a small Java class that overrides two methods: one to
retrieve information about the level geometry and enemies, and the other to
specify the actions used to move the bot. Both methods are called by the engine,
in this order, in every execution cycle.
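    The two-method contract described above can be pictured with the following
minimal sketch. It is written in Python for brevity rather than in the
benchmark's Java, and every name in it (`Agent`, `integrate_observation`,
`get_action`, `ForwardAgent`, `run_cycles`) is a hypothetical stand-in chosen
for illustration, not the benchmark's actual API. The `grid[row][column]`
layout is likewise an assumption.

```python
from abc import ABC, abstractmethod


class Agent(ABC):
    """Hypothetical contract: one method receives data, one returns actions."""

    @abstractmethod
    def integrate_observation(self, level_scene, enemies):
        """Store the level geometry and enemy grids for the current cycle."""

    @abstractmethod
    def get_action(self):
        """Return the controls to press in the current cycle."""


class ForwardAgent(Agent):
    """Toy agent: always runs right, and jumps when the cell ahead is blocked."""

    def __init__(self):
        self.blocked_ahead = False

    def integrate_observation(self, level_scene, enemies):
        centre = len(level_scene) // 2
        # Assumed layout: row `centre` is Mario's row, and column
        # `centre + 1` is the cell immediately to his right.
        self.blocked_ahead = level_scene[centre][centre + 1] != 0

    def get_action(self):
        return {"right": True, "jump": self.blocked_ahead}


def run_cycles(agent, observations):
    """Mimic the engine loop: observation first, then action, once per cycle."""
    actions = []
    for level_scene, enemies in observations:
        agent.integrate_observation(level_scene, enemies)
        actions.append(agent.get_action())
    return actions
```

The point of the sketch is only the call order: the agent's decision in a
cycle can rely on the observation delivered earlier in that same cycle.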

Environment information. All the information that can be used to analyse
the world around Mario is given in two two-dimensional arrays (21x21): one
provides data about the geometry of the level, the other about the enemies that
populate it. These arrays are centred on Mario, so 10 grid cells in each
direction from Mario's position can be processed every cycle.
    Additionally, three different levels of detail can be specified to retrieve data
in both arrays, depending on the information we are looking for:
 – Zoom 2: The data is represented in a binary array. For enemies, 0 means
   that there is no enemy at that position, while 1 means there is an enemy.
   Likewise, for the level scene, 1 means that there is an obstacle and 0 that
   Mario can pass through.
 – Zoom 1: This zoom level represents the data with integers, grouping
   objects under common identifiers. For the enemy information, 0 means no
   enemy at all, 2 represents an enemy that Mario can kill by jumping on it,
   and 9 represents an enemy that can only be killed by shooting it. For the
   level scene, different identifiers represent types of blocks, such as those
   that can or cannot be broken, contain hidden items or can spawn enemies.
 – Zoom 0: This zoom level is a very close view of the internal representation
   of the engine, where every kind of enemy or block in the level has its own
   identifier, different from any other entity in the game.
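
    As an illustration of how these arrays might be queried, the sketch below
collapses a grid to the binary Zoom 2 view and scans the cells to Mario's
right in a Zoom 1 enemy grid. The codes 0, 2 and 9 are those quoted above; the
function names, the `enemies[row][column]` layout and the look-ahead distance
are assumptions made for the example only.

```python
GRID_SIZE = 21           # 21x21 cells, as stated in the text
CENTRE = GRID_SIZE // 2  # Mario's cell, 10 cells away from every edge

NO_ENEMY = 0    # Zoom 1 enemy codes quoted in the text
STOMPABLE = 2
SHOOT_ONLY = 9


def to_zoom2(grid):
    """Collapse any grid to the binary view: 1 where something is present."""
    return [[1 if cell != 0 else 0 for cell in row] for row in grid]


def enemy_ahead(enemies, max_distance=3):
    """Return the code of the nearest enemy to Mario's right, or NO_ENEMY.

    Assumes row-major indexing (enemies[row][column]) with columns growing
    to the right; this layout is an assumption of the sketch.
    """
    last = min(CENTRE + max_distance, GRID_SIZE - 1)
    for column in range(CENTRE + 1, last + 1):
        code = enemies[CENTRE][column]
        if code != NO_ENEMY:
            return code
    return NO_ENEMY
```

A condition such as "enemy ahead" in a controller could then be answered from
Zoom 2 alone, while choosing between stomping and shooting needs the Zoom 1
codes.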

   Apart from these arrays, further inputs describe the current state of the
game:
 – Mario position: A pair of float values giving the coordinates of Mario in
   the level.
 – Mario status: The state of the game: running, win or dead.
 – Mario mode: Mario can be small or big, with or without the ability to fire.
 – Mario state indicators: Whether Mario can shoot and jump, the time left
   for the level, and whether Mario is on the ground.
 – Mario kills: Statistics about the enemies killed by Mario, indicating how
   they were killed (by stomp, by fire or by hitting them with a shell).

Mario effectors. The actions that can be performed by Mario are all the
inputs that a human player could use with a control pad. They are represented
as a boolean array, where each control has a fixed index assigned. The controls
are the following:
 – Directions: One control for each direction: Left, Right, Up and Down.
 – Jumping: Makes Mario jump.
 – Speed and Fire: Mario can fire, if he is in the proper mode, by using this
   control. It can also be used to make Mario go faster, but only while he is
   moving right or left. Jumps made with this button pressed let Mario reach
   farther places.
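
    The boolean action array can be sketched as follows. The index assigned to
each control here is an assumption made for illustration (the text does not
list the benchmark's actual ordering), and `Key`, `make_action` and
`long_jump_right` are hypothetical names.

```python
from enum import IntEnum


class Key(IntEnum):
    """Index of each control in the action array (ordering assumed)."""
    LEFT = 0
    RIGHT = 1
    UP = 2
    DOWN = 3
    JUMP = 4
    SPEED_FIRE = 5


def make_action(*pressed):
    """Build the boolean array with only the given controls pressed."""
    action = [False] * len(Key)
    for key in pressed:
        action[key] = True
    return action


def long_jump_right():
    """Run and jump to the right; holding speed lets the jump reach farther."""
    return make_action(Key.RIGHT, Key.JUMP, Key.SPEED_FIRE)
```

Since several entries can be true at once, a single array expresses combined
inputs such as a running jump.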

3     Grammatical Evolution
Grammatical Evolution (GE) [OR03] is an evolutionary approach that specifies
the syntax of possible solutions through a context-free grammar, which is then
used to map binary strings onto functional and syntactically correct solutions.
Those binary strings can therefore be evolved by any search algorithm; typically,
a variable-length genetic algorithm is used.
    One of the main advantages of GE is that the syntax of the resulting solutions
is specified through a grammar. This facilitates the application of GE to a variety
of problems with relative ease, and explains its usage for the current application.
    GE works as a genotype-to-phenotype mapping process. Variable-length
integer strings are created by an evolutionary process (typically a genetic
algorithm [Hol75,Gol89]), and then used to choose production rules from a
grammar, which creates a functional program, syntactically correct for the
problem domain. Finally, this program is evaluated, and its fitness is returned
to the evolutionary algorithm.

3.1   Example Mapping Process
To illustrate the mapping process, consider the grammar shown in Fig. 1,
specifying a generic gameplay behaviour, and the following integer string:
(4, 5, 3, 6, 8, 5, 9, 1). The first integer is used to choose one of the two
productions of the start symbol <BT>, through the formula 4%2 = 0, i.e. the
first production is chosen, so the mapping string becomes <BT> <Node>.
    The following integer is then used with the first unmapped symbol in the
mapping string, so through the formula 5%2 = 1 the symbol <BT> is replaced by
<Node>, and thus the mapping string becomes <Node> <Node>.
    The mapping process continues in this fashion, so in the next step the
mapping string becomes <Action> <Node> through the formula 3%2 = 1, and through
6%5 = 1 it becomes moveRight; <Node>. The remaining steps use 8%2 = 0, 5%2 = 1
and 9%5 = 4, and the last integer is left unused. After all symbols are mapped,
the final program becomes moveRight; if(enemyAhead) then shoot;, which could be
executed in an endless loop.
    Sometimes the integer string may not have enough values to fully map a
syntactically valid program; several options are available, such as reusing the
same integers (in a process called wrapping [OR03]), assigning the individual
the worst possible fitness, or replacing it with a legal individual. In this
study, an unmapped individual is replaced by its parent.
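
    The mapping process of this section can be sketched in a few lines of
Python. This is a minimal illustration under stated assumptions, not the
implementation used in this study: the non-terminal names (`<BT>`, `<Node>`,
`<Condition>`, `<Action>`) are assumed labels for the four rules of Fig. 1,
the trailing semicolon of the two conditional productions is dropped so that
the output matches the program quoted in the text, and `ge_map` and
`map_or_keep_parent` are hypothetical helper names.

```python
# Grammar with the shape of Fig. 1; non-terminal names are assumptions.
GRAMMAR = {
    "<BT>": [["<BT>", "<Node>"], ["<Node>"]],
    "<Node>": [["<Condition>"], ["<Action>"]],
    "<Condition>": [
        ["if(obstacleAhead) then", "<Action>"],
        ["if(enemyAhead) then", "<Action>"],
    ],
    "<Action>": [["moveLeft;"], ["moveRight;"], ["jump;"], ["crouch;"], ["shoot;"]],
}


def ge_map(codons, grammar=GRAMMAR, start="<BT>"):
    """Map an integer string to a program.

    Returns (program, codons_used); program is None when the integers run
    out before every symbol is mapped (no wrapping is applied).
    """
    symbols = [start]
    used = 0
    while True:
        # Locate the first (leftmost) unmapped symbol.
        index = next((i for i, s in enumerate(symbols) if s in grammar), None)
        if index is None:
            return " ".join(symbols), used
        if used == len(codons):
            return None, used
        productions = grammar[symbols[index]]
        choice = codons[used] % len(productions)  # the modulo rule of the text
        used += 1
        symbols[index:index + 1] = productions[choice]


def map_or_keep_parent(child_codons, parent_program):
    """Fall back to the parent's program when the child fails to map."""
    program, _ = ge_map(child_codons)
    return parent_program if program is None else program
```

With the integer string of the example, `ge_map([4, 5, 3, 6, 8, 5, 9, 1])`
returns the program `moveRight; if(enemyAhead) then shoot;` after consuming
seven of the eight integers, whereas a string such as `[4, 4, 4]` keeps
expanding the recursive first production and fails to map.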

4     Behaviour Trees
4.1   Introduction
Behaviour Trees (BTs) were introduced a few years ago as a means to encode
formal system specifications [Dro04,Col07]. Recently, they have been shown to
provide a means to encode game AI in a modular, scalable and reusable manner
[CDC10]. They have been used in high-revenue commercial games, such as
“Façade” [MS04], “Halo 2” [Isl05], “Halo 3” and “Spore” [Mch07], and in many
other unpublished commercial projects [CDC10], which illustrates their
flexibility and growing importance in the commercial game AI world.

        <BT>        ::=   <BT> <Node>
                      |   <Node>
        <Node>      ::=   <Condition>
                      |   <Action>
        <Condition> ::=   if(obstacleAhead) then <Action>;
                      |   if(enemyAhead) then <Action>;
        <Action>    ::=   moveLeft;
                      |   moveRight;
                      |   jump;
                      |   crouch;
                      |   shoot;

    Fig. 1. Example grammar for a simple approach to a generic shooting game.

    BTs are simply a hierarchical way of organising behaviours in a descending
order of complexity; broad behavioural tasks are at the top of the tree, and
these are broken down into several sub-tasks. For example, a soldier in a first-
person shooter game might have a behaviour AI that breaks down into patrol,
investigate and attack tasks. Each of these can then be further broken down:
attacking, for example, will require movement tactics, weapon management and
aiming algorithms. These can in turn be decomposed, right down to the level
of playing sounds or animation sprites.
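
    The soldier example can be made concrete with a small sketch. It uses
sequence and selector nodes, two composite node types common in the BT
literature but not described in this text, and it assumes a priority ordering
(attack, then investigate, then patrol) purely for illustration. All class
and function names are hypothetical.

```python
class Action:
    """Leaf node: runs a callable and reports whether it succeeded."""

    def __init__(self, name, run):
        self.name = name
        self.run = run

    def tick(self, log):
        log.append(self.name)
        return self.run()


class Sequence:
    """Composite that succeeds only if every child succeeds, in order."""

    def __init__(self, *children):
        self.children = children

    def tick(self, log):
        # all() stops at the first failing child.
        return all(child.tick(log) for child in self.children)


class Selector:
    """Composite that tries its children in order until one succeeds."""

    def __init__(self, *children):
        self.children = children

    def tick(self, log):
        # any() stops at the first succeeding child.
        return any(child.tick(log) for child in self.children)


def build_soldier(enemy_visible, heard_noise):
    """Broad tasks at the top, each broken down into simpler sub-tasks."""
    attack = Sequence(
        Action("enemy visible?", lambda: enemy_visible),
        Action("aim", lambda: True),
        Action("fire", lambda: True),
    )
    investigate = Sequence(
        Action("heard noise?", lambda: heard_noise),
        Action("move to noise", lambda: True),
    )
    patrol = Action("patrol", lambda: True)
    return Selector(attack, investigate, patrol)
```

Ticking the root walks the tree from the broad tasks down to the leaves, and
the `log` list records which leaves were visited on that tick.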

4.2   Behaviour Trees for Mario
FIXME DIEGO

4.3   Incorporation into GE
FIXME Grammar encoding. First option giving full syntax to GE was not good,
second option fixing the structure of the grammar to resemble an and-or tree
(ref) much more successful.

Extensions to standard GE. FIXME MIGUEL Another novel approach
was the encoding of crossover points in the grammar. This is a technique pre-
sented recently for GE [ND06], in which a specific symbol is used in the grammar,
to label crossover points; the evolutionary algorithm then only slices an individ-
ual according to these points. This made a lot of sense in the work presented
here: many of the parameters passed to jenn3d specify styling options, which
can therefore be exchanged as a whole between different structures (a 2-point
crossover operator was used). This makes crossover act solely as an exploitation
operator; standard point mutation still ensures the exploration of novel param-
eter values.
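
    A 2-point crossover restricted to labelled points could look like the
sketch below. How the marker symbol is translated into positions on the
integer string is not specified here, so the sketch simply assumes that each
parent comes with a list of legal cut indices; `two_point_marked_crossover`
and its parameters are hypothetical names.

```python
import random


def two_point_marked_crossover(parent_a, marks_a, parent_b, marks_b, rng):
    """Swap the segments lying between two marked cut points of each parent.

    parent_*: list of integer codons.
    marks_*:  codon indices at which cutting is allowed (assumed to be
              recorded while mapping, wherever the marker symbol occurs).
    """
    if len(marks_a) < 2 or len(marks_b) < 2:
        return parent_a[:], parent_b[:]  # not enough legal cut points
    a1, a2 = sorted(rng.sample(marks_a, 2))
    b1, b2 = sorted(rng.sample(marks_b, 2))
    child_a = parent_a[:a1] + parent_b[b1:b2] + parent_a[a2:]
    child_b = parent_b[:b1] + parent_a[a1:a2] + parent_b[b2:]
    return child_a, child_b
```

Because whole marked segments are exchanged, no new codon values are created
by this operator; only the mutation operator introduces them.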

5     Experiments
FIXME MIGUEL
     The experimental parameters used are shown in Table 1. Note that to ensure
all individuals in the initial population were valid, a form of population
initialisation known as Sensible Initialisation [RA03] was used. A variation of
tournament selection was used, which ensures that each individual participates
in at least one tournament event. Also, the mutation rate was set such that, on
average, one mutation event occurs per individual (its probability is therefore
variable, and dependent on the length of each individual). Finally, note that
there is no maximum number of generations; evolution always continues until the
user decides to terminate the execution.
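
    The two operators just described can be sketched as follows. Both
functions are illustrations under assumptions rather than the code used in
the experiments: the codon range (0 to 255), the way rivals are drawn and the
"higher fitness is better" convention are all assumed, and the names `mutate`
and `tournaments` are hypothetical.

```python
import random


def mutate(codons, rng, max_codon=255):
    """Point mutation with per-codon probability 1/len(codons).

    Returns the mutated copy and the number of mutation events, whose
    expected value is one per individual. Assumes a non-empty individual.
    """
    rate = 1.0 / len(codons)
    child, events = [], 0
    for codon in codons:
        if rng.random() < rate:
            child.append(rng.randint(0, max_codon))
            events += 1
        else:
            child.append(codon)
    return child, events


def tournaments(population, fitness, rng, size=2):
    """Run one tournament per individual, so that everyone takes part.

    Each individual meets `size - 1` randomly drawn rivals and the fittest
    entrant (higher is better) wins. Returns one winner per tournament.
    """
    winners = []
    for i, entrant in enumerate(population):
        rivals = population[:i] + population[i + 1:]
        field = [entrant] + rng.sample(rivals, size - 1)
        winners.append(max(field, key=fitness))
    return winners
```

Tying the mutation probability to the length keeps the expected disruption per
individual constant, however long the integer strings grow during evolution.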

                            Table 1. Experimental Setup

                    Initial Population Size                       20
                    Evolutionary Population Size                  10
                    Derivation-tree Depth (for initialisation)    10
                    Tail Ratio (for initialisation)              20%
                    Selection Tournament Size                      2
                    Elitism (for generational replacement)       20%
                    Crossover Ratio                              90%
                    Average Mutation Events per Individual         1

5.1   Results

6     Conclusions

References
[MBW] Mario AI Benchmark, http://code.google.com/p/marioai/
[MAI10] 2010 Mario AI Championship, http://www.marioai.org
[JSR10] Togelius, J., Karakovskiy, S., Baumgarten, R.: The 2009 Mario AI
    Competition. In: IEEE Congress on Evolutionary Computation, Proceedings.
    pp. FIXME–FIXME IEEE Press (2010)
[CDC10] Champandard, A., Dawe, M., Cerpa, D. H.: Behavior Trees: Three Ways of
    Cultivating Strong AI. In: Game Developers Conference, Audio Lecture. (2010)
[Col07] Colvin, R., Hayes, I. J.: A Semantics for Behavior Trees. ARC Centre for
    Complex Systems, tech. report ACCS-TR-07-01. (2007)
[Dro04] Dromey, R. G.: From Requirements to Design: Formalizing the Key Steps. In:
    International Conference on Software Engineering and Formal Methods,
    Proceedings. (2004)
[Gol89] Goldberg, D. E.: Genetic Algorithms in Search, Optimization and Machine
    Learning. Addison Wesley (1989)
[Hol75] Holland, J. H.: Adaptation in Natural and Artificial Systems. University of
    Michigan Press (1975)
[Isl05] Isla, D.: Managing Complexity in the Halo 2 AI System. In: Game Developers
    Conference, Proceedings. (2005)
[Mch07] McHugh, L.: Three Approaches to Behavior Tree AI. In: Game Developers
    Conference, Proceedings. (2007)
[MS04] Mateas, M., Stern, A.: Managing Intermixing Behavior Hierarchies. In: Game
    Developers Conference, Proceedings. (2004)
[ND06] Nicolau, M., Dempsey, I.: Introducing Grammar Based Extensions for Gram-
    matical Evolution. In: IEEE Congress on Evolutionary Computation, Proceedings.
    pp. 2663–2670 IEEE Press (2006)
[OR03] O’Neill, M., Ryan, C.: Grammatical Evolution: Evolutionary Automatic
    Programming in an Arbitrary Language. Kluwer Academic Publishers (2003)
[RA03] Ryan, C., Azad, R.M.A.: Sensible initialisation in grammatical evolution. In:
    Barry, A.M. (ed.) GECCO 2003: Proceedings of the Bird of a Feather Workshops,
    Genetic and Evolutionary Computation Conference. pp. 142–145. AAAI (July 2003)