A Proposal for An Agent that Plays 7 Wonders

                                Ben Gardiner
                            Willamette University
                              November 27, 2012

                                      Abstract
          In the ever-growing field of artificial intelligence and machine learning,
      we use simplified models of complex sets of decisions as testbeds for
      measuring the efficiency and overall "intelligence" of a program. One
      such set of modeling systems is board games, in which a series of complex,
      interacting decisions forms a strategy. Furthermore, the outcome of a
      game provides a decisive metric for determining the performance of the
      agent playing it. In this literature review, we examine the research
      performed on strategic games that are inaccessible and non-deterministic,
      that is, games that involve imperfect information about the game state
      and include elements of chance. We propose an agent that plays the game
      7 Wonders based upon the techniques surveyed here.

  Keywords: artificial intelligence, board games, imperfect information,
game theory

1     Introduction
Artificial intelligence is a constantly growing area of interest because increases
in computing power continually expand the range of questions that can be answered. We will
explore the applications of artificial intelligence with regard to board games,
specifically the game 7 Wonders. This will further the field of artificial intelli-
gence because 7 Wonders is a complex game with multiple strategies relevant
to any given scenario. Thus, it provides a model for complex decisions that
occur in real-world applications of artificial intelligence, such as programming
of robots that need to perform intricate tasks unsafe for humans.
    In this paper, we propose a project to develop an agent to play 7 Wonders
in a way that defeats most human opponents. We will begin by implementing
a heavily abridged version of the rules of the game, then creating an agent
for it. Once the agent has achieved a reasonable level of proficiency with
the abridged rules, we will add elements back into the game, and observe
how the agent adapts, potentially with additional programmatic help. This
is detailed to a greater extent in section 4. In section 2, we define necessary
terms and review existing research done on board games, then, in section 3
we summarize the rules of 7 Wonders. We finish the paper in section 5 with
a rough timeline for anticipated progress.

2     Existing Work
In the field of artificial intelligence, we utilize an agent-environment model
to simulate the complex decisions that a program needs to make. One such set of
models is board games, because the many varieties of games give rise to
complex decision trees. We will begin by defining the terms needed
to discuss artificial intelligence, then examine existing research that has been
done in board games, starting with games simple enough to be solved, and
finishing with the newly emerging "Euro-style" board games, which have many
more rules and components than traditional, more popular varieties.

2.1    Background
Following the example set by [6], we will define an artificial intelligence to
be a rational agent acting in a suitable environment.
    An agent is an entity, such as a board game player, that perceives infor-
mation, and acts based upon that information. A rational agent is one whose
actions logically follow from the information it has perceived and are deemed
correct for its task [6]. So in the case of our agent being a game
player, a rational agent is one whose actions, or plays, are always performed
to help win the game.
    To measure the rationality of an agent, or the correctness of its actions,
we define a performance measure as a means of determining the efficacy of
a given agent for a specific task [6]. Our performance measure is simple:
whether or not the agent wins the game. For games with more than two
players, we can measure performance relative to the other players; for
example, an agent performs adequately if it consistently places second out
of four players.
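    To make this concrete, a hypothetical placement-based performance measure
could be computed as follows; the method and its inputs are our own
illustration, not part of any cited work:

    // Hypothetical placement-based performance measure: returns the agent's
    // finishing place (1 = first) given every player's final score.
    static int placement(int[] finalScores, int agentIndex) {
        int place = 1;
        for (int i = 0; i < finalScores.length; i++) {
            // each opponent with a strictly higher score pushes the agent down one place
            if (i != agentIndex && finalScores[i] > finalScores[agentIndex]) {
                place++;
            }
        }
        return place;
    }

Under the criterion above, an adequate agent would consistently return a
placement of 2 or better in four-player games.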
   Then, we must define a series of properties an environment can have, to
specify the problem that the agent is attempting to solve. An environment
can be:
      • Fully or partially observable: Can the state of the environment be com-
        pletely determined by observation, or is there hidden information?
        Games such as checkers or chess are fully observable, because there is
        no hidden information, but games like poker are imperfect information
        games, or partially observable, in that a player's opponents have
        information that the player does not.
      • Single agent or multi-agent: Is the state of the environment determined
        by other agents in addition to the one we are programming? In game
        terms, is it a solitaire or multiplayer game?
      • Competitive or cooperative: If an environment is multi-agent, are the
        agents working towards a common goal, or do they have individual
        agendas?
      • Deterministic or stochastic: An environment is deterministic if its next
        state can be fully ascertained from its current state and the agent's
        action. It is stochastic otherwise [6].
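    To make these axes concrete, the following sketch (our own illustration;
the enum and record names are hypothetical, not from [6]) classifies
7 Wonders along them:

    // Hypothetical classification of a game environment along the axes above.
    public class EnvironmentDemo {
        enum Observability { FULLY_OBSERVABLE, PARTIALLY_OBSERVABLE }
        enum Agency { SINGLE_AGENT, MULTI_AGENT }
        enum Interaction { COMPETITIVE, COOPERATIVE }
        enum Dynamics { DETERMINISTIC, STOCHASTIC }

        record Environment(Observability observability, Agency agency,
                           Interaction interaction, Dynamics dynamics) {}

        public static void main(String[] args) {
            // 7 Wonders: hidden hands, several players with individual agendas,
            // and shuffled decks introducing chance.
            Environment sevenWonders = new Environment(
                    Observability.PARTIALLY_OBSERVABLE, Agency.MULTI_AGENT,
                    Interaction.COMPETITIVE, Dynamics.STOCHASTIC);
            System.out.println(sevenWonders);
        }
    }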
    Board games thus provide a convenient means of modeling such environments,
because there are complex games that fall into each of these categories.
   We see that we can formally classify an artificial intelligence problem
based upon the type of environment. Furthermore, it is clear that we can use
games to model these systems. In our next section, we discuss existing work
that has been done in the field of artificial intelligence with board games.

2.2      Literature Review
We begin this section by examining known solved games, such as checkers,
and what they have in common. We will then look at other families of games,
such as poker, or the recently emerging Euro-games genre.

2.2.1   Solved Games
Some games have been studied to the point that artificial players can con-
sistently beat human opponents, even at the level of grand master or world
champion. While these games are no longer interesting for study, they are
useful as points of reference for the development of the field. We will utilize
the game-theoretic definition of the term "solved game": a game has been
solved if its game-theoretic value can be computed for the initial position
and a strategy can be devised to achieve that value [10]. If
a game is solved, the idea of developing an intelligent player is trivialized,
because there are known rules for devising a strategy.
    According to [10], there is a family of games known as convergent games,
“where the size of the state space decreases as the game progresses.” This
leads to the set of end moves being small enough that it can be categorized
in a database. These games have been solved to the extent that the solution
space is searchable via brute-force methods, using a technique known as
retrograde analysis. This method consists of beginning at the desired end
states and working backwards to generate the sequences of moves that reach
them, under the minimax-style assumption that opponents will always make the
move that is best for them [6, 11]. Solved convergent games include Nine
Men's Morris, mancala games, and checkers, all of which are fully detailed
in [10].
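    As a sketch of the idea (our own illustration, not any of the
implementations surveyed in [10]), retrograde analysis over an explicitly
enumerated two-player game graph can be written as follows; the state
representation is assumed to be small enough to enumerate:

    import java.util.*;

    // A minimal sketch of retrograde analysis for a two-player game whose
    // (small) state graph is enumerated explicitly. Values are always from the
    // perspective of the player to move; terminal states are assumed to be
    // losses for the player to move, as in most convergent games.
    class Retrograde {
        enum Value { WIN, LOSS, UNKNOWN }

        // successors maps every non-terminal state to its one-move successors;
        // every listed successor is either a key of successors or in terminalLoss.
        static Map<Integer, Value> solve(Map<Integer, List<Integer>> successors,
                                         Set<Integer> terminalLoss) {
            Map<Integer, Value> value = new HashMap<>();
            for (Integer s : successors.keySet()) value.put(s, Value.UNKNOWN);
            for (Integer s : terminalLoss) value.put(s, Value.LOSS);

            boolean changed = true;
            while (changed) {                 // propagate backwards to a fixpoint
                changed = false;
                for (Integer s : successors.keySet()) {
                    if (value.get(s) != Value.UNKNOWN) continue;
                    List<Integer> next = successors.get(s);
                    // WIN if some move leaves the opponent in a LOSS;
                    // LOSS if every move leaves the opponent in a WIN.
                    if (next.stream().anyMatch(t -> value.get(t) == Value.LOSS)) {
                        value.put(s, Value.WIN);
                        changed = true;
                    } else if (next.stream().allMatch(t -> value.get(t) == Value.WIN)) {
                        value.put(s, Value.LOSS);
                        changed = true;
                    }
                }
            }
            return value;   // any state still UNKNOWN can only be drawn or cycled into
        }
    }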
    The other family of games examined in [10] is divergent games, whose state
space grows as the game progresses, so endgame databases cannot be generated.
The methods for solving these games are either search-based or knowledge-based.
    A common theme among these solved games is that all of them are fully
observable, two-agent, and deterministic, and therefore, according to [7],
are "easy" to program. This gives us several specific directions in which to
expand the research: we can examine non-deterministic, n-agent (n greater than 2), and/or
partially observable games, which is our task in the next section.

2.2.2   Stochastic Games
Fully Observable Games We briefly mention backgammon, which is fully
observable but stochastic because of the dice rolls that generate the
available moves. In 1979, a backgammon program defeated the then world
champion [2]. A later agent, TD-Gammon, learns simply by playing against
itself, and was still considered a top-tier player in 1994, after several
years in which human players had improved their play [?]. This suggests
that observable, stochastic games like backgammon can be programmed
effectively, and that observable, two-agent games are no longer an
interesting frontier for artificial intelligence research.

Partially Observable Games One of the early attempts in partially ob-
servable games was a partner-Dominoes player, first published in 1973 [8].
The authors detail the challenges of imperfect information, which result
from the fact that, at any time, there is a set of unknown dominoes in the
opponents' and partner's hands, and they begin defining a strategy based
upon probabilistic analyses. While this work is rather rudimentary, it lays a
foundation for the later works we will discuss.
    Poker is an interesting game to model with artificial intelligence because
it is stochastic and only partially observable, and because deceiving
opponents is a necessary component of skilled play. This requires algorithms
that model and understand opponents as part of formulating a strategy,
because there are many approaches to playing to win, depending largely on
the person playing [3]. Billings et al. examine the opponent modeling and
selective sampling needed to create a skilled automated poker player. After
describing their player in [3], the authors proceed to explain why this
approach cannot easily be generalized to other non-deterministic games.
They observe that there are a variety of important factors, so the
initial goals of the intelligent agent and its performance measure could vary,
depending upon what the designer wishes to accomplish [3]. This is largely
what makes most games interesting to human players, because there are
many useful strategies to attempt, so any game is an intellectual exercise.

Euro-Games Finally, we will examine more recent research in a genre of
games known as euro-games, where there is a complex set of rules and playing
components, often including pieces or different types of cards. In 2009, Szita
et al. created an agent for the game Settlers of Catan, using Monte Carlo
tree search [9]. While their agent was not as successful as they had hoped,
the authors noted that success would be more probable with a well-defined
heuristic function, and concluded that Monte Carlo tree search is viable for
other multiplayer games of similar complexity [9].
    A similar Monte Carlo method was used in [5] to create an agent just for
the initial part of determining territories (drafting) in the game Risk. Gibson
et al. use the UCT Monte Carlo method combined with a “machine-learned
evaluation function” to make quick draft picks, and conclude that the two
combined form an effective computational approach.
    One of the quintessentially complex games of the modern board game era is
Dominion. Fynbo et al. provide the first documented and published imple-
mentation of an agent that plays Dominion in [4]. They conclude that the
game is complex enough that it requires several different rational agents for
various aspects of the game including one for measuring the overall progress
of the game. While their agent is largely unsuccessful in the playing of cards,
they observe that their artificial neural network using back-propagation is
very skilled at determining the state of the game, which, for human players,
is a more intangible aspect of strategy. Finally, they demonstrate that two
techniques not previously attempted for complex games work with a high
success rate [4].

2.3    Conclusion
We note that programming board games is a useful field in artificial intelli-
gence research, because there are games for each possible environment. Then,
we examine games that have been researched exhaustively, all of which are
fully observable. The problem of examining partially observable games has
been investigated in a variety of ways, including cooperative dominoes, poker,
and the recently emerging Euro-games. While there has been a great deal
of research into traditional or more popular games such as chess, checkers,
or even poker, the Euro-games have seen much less formal exposure, largely
because they are a recent development in the gaming community. We have
found that a small number of Euro-games have been formally examined, with
no unifying ideas or approaches with regard to complex, multi-agent, par-
tially observable stochastic games; much more research remains to be done.
We propose an implementation of an agent that plays a game not yet
documented in the literature, examining new approaches along the way.

3     Game Rules
The game 7 Wonders is structured in the form of three drafts. A draft is a
process where a pool of cards is distributed evenly among each player, seven
cards per player to start, and each player selects a card and then passes the
remainder of their cards to the player on their left. Then, once everybody
has selected a card, they play it in one of three ways, fully detailed in [1].
Each card has a cost, a name, and an ability; see figure 1a.

[Figure 1: (a) a sample card from 7 Wonders; (b) the victory point symbol.]
    In order to play the card, the player must be able to produce the resources
depicted in the cost; that is, the player must have enough cards whose
'ability' generates a resource so that every resource in the cost is covered.
An alternative means of obtaining resources is purchasing them, for two
coins, from the two players on either side of the current player, known as
the neighbors. For example, to play the card in figure 1a, a player needs to
be able to produce at least one resource of the "wood" type, shown in the
upper left-hand corner of the card.
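    As an illustration of this affordability check (a hypothetical sketch; the
Card, Player, and Resource types and the helper methods are our own
inventions, not a final design), the rule could be coded as:

    import java.util.Map;

    // Hypothetical affordability check, ignoring purchases from neighbors:
    // a card is playable if, for every resource type in its cost, the player
    // produces at least that many resources of the type.
    static boolean canPlay(Card card, Player player) {
        Map<Resource, Integer> produced = player.productionByType(); // assumed helper
        for (Map.Entry<Resource, Integer> need : card.costByType().entrySet()) {
            if (produced.getOrDefault(need.getKey(), 0) < need.getValue()) {
                return false; // at least one required resource is not covered
            }
        }
        return true;
    }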
    Alternatively, if a player does not want, or is unable, to play the card,
they may build a stage of the "wonder," which is similar to playing a card in that a cost
must be paid. However, the stages of the wonder are unique to each player,
and also are public information. One builds the next stage of the wonder by
paying the cost shown, and then sliding the card face-down underneath the
corresponding wonder stage.
    The final option for playing a card is to sell it, face down, for three coins,
which can be used to purchase resources from one’s neighbors.
    Thus, when selecting a card, a player must weigh both how useful it will
be in the future and what its immediate impact is. When every player has
selected and played a card, they pick up the set of cards passed by the
player to their right, and the process repeats. The game therefore follows
a simultaneous-play paradigm. When the game finishes, the players count
their victory points, denoted by the symbol shown in figure 1b.
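    The following sketch shows how one such draft could be driven. It is our
own illustration: the Player and Card types are placeholder assumptions, and
the simplification of playing every card (rather than handling any leftover
card specially) is ours:

    import java.util.*;

    // Placeholder types, assumed for illustration only.
    interface Card {}
    interface Player {
        Card selectCard(List<Card> hand); // choose one card from the current hand
        void play(Card card);             // play it, build a wonder stage, or sell it
    }

    class DraftRound {
        // One draft: simultaneous picks, then hands pass left. Here we assume
        // player (i + 1) mod n sits to the left of player i, so rotating the
        // list by one gives each player the hand from their right.
        static void runDraft(List<Player> players, List<List<Card>> hands) {
            while (!hands.get(0).isEmpty()) {
                List<Card> picks = new ArrayList<>();
                for (int i = 0; i < players.size(); i++) {
                    Card chosen = players.get(i).selectCard(hands.get(i));
                    hands.get(i).remove(chosen);
                    picks.add(chosen);
                }
                for (int i = 0; i < players.size(); i++) {
                    players.get(i).play(picks.get(i));
                }
                Collections.rotate(hands, 1); // pass the remaining hands along
            }
        }
    }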
    The player with the most victory points wins the game. An agent for
the game will need to be able to select cards based upon what it has already
played, and what potential cards are coming in the future. It must be able
to evaluate its current points, its potential points given its progress, and
speculate on expected values of cards at the end of the game.
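    As an illustration of this kind of evaluation (a hypothetical heuristic of
our own devising; every method named here is an assumption, and the weighting
is untuned), an agent might blend a card's immediate and future value:

    // Hypothetical evaluation heuristic: blend a card's immediate victory
    // points with a rough estimate of its future value.
    static double evaluate(Card card, GameState state) {
        double immediate = card.victoryPoints();
        // estimateFutureValue is an assumed helper, e.g. counting later
        // plays that this card's resources would enable.
        double future = estimateFutureValue(card, state);
        double progress = state.turnsPlayed() / (double) state.totalTurns();
        // early on, potential dominates; near the end, concrete points do
        return progress * immediate + (1.0 - progress) * future;
    }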
    There are several types of cards. At an abstract level, these can be de-
scribed as cards that provide resources, cards that produce victory points,
and cards that offer economic benefit in terms of commerce with other play-
ers.

4    Proposal
We propose to develop an agent to play 7 Wonders. We will begin by im-
plementing the rules of the game using an object-oriented language such as
Java. For example, there will be a 'Card' abstract class holding basic
information, with each type of card extending it. There will also have to be
a Player class, and classes for several other pieces of information. The
agent will then interact with the rest of the game through an extension of
the Player class.
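    A minimal sketch of this design might look like the following; the names,
fields, and the illustrative Resource subset are provisional rather than
final:

    import java.util.*;

    // Provisional sketch of the core classes described above.
    enum Resource { WOOD, STONE, ORE, CLAY }  // illustrative subset
    class Cost { List<Resource> resources = new ArrayList<>(); }

    abstract class Card {
        String name;
        Cost cost;                                  // what must be produced to play it
        abstract void applyAbility(Player owner);   // the card's effect when played
    }

    class ResourceCard extends Card {
        Resource produces;
        void applyAbility(Player owner) { owner.addProduction(produces); }
    }

    class Player {
        List<Card> played = new ArrayList<>();
        int coins;
        void addProduction(Resource r) { /* record the new production ability */ }
    }

    // The agent interacts with the game as an extension of Player.
    class AgentPlayer extends Player {
        Card selectCard(List<Card> hand) {
            return hand.get(0); // placeholder: real decision logic goes here
        }
    }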
    The agent for 7 Wonders will largely build upon skills acquired in the Ma-
chine Learning course. Techniques that we used in this course can be easily
extended to implementing the game rules, and then developing a framework
of interaction between the agent and player. Furthermore, the course pro-
vided an introduction to the types of machine learning algorithms necessary
for developing an intelligent, game-playing agent.
    Our current plan is to use the Monte-Carlo tree search method, which
constructs a tree of potential future game states rooted at the current one
and ranks candidate moves according to an estimated value. There are several
steps involved, fully detailed in [5, 9]. However, we need to learn more
details of this approach, and of several others, before making a final
implementation decision. This is currently the largest roadblock: the bulk
of the technical topics remain to be learned, because the Monte-Carlo tree
search method was selected from only a brief survey of artificial
intelligence techniques. Once we have fully familiarized ourselves with the
necessary details, we will be able to exploit them fully to the advantage of
the project. We
will begin by designing an abridged version of the rules, with a small subset
of types of cards. Once we have implemented this condensed version of the
game, we will begin designing and implementing an agent for it. An initial
goal, or milestone, would be a competent agent with the smallest version of
the game. Then, we would select one rule to add, to minimize the added
complexity, and redesign the artificial intelligence accordingly. Depending upon how simplified the
rules are, there are several sets of potential deliverables, each corresponding
to a different iteration of the game rules and the agent that can play with it.
Our final, stretch goal is to have the agent developed with the full rules con-
sistently win eighty percent of its games with human opponents. However,
we are not anticipating this much success; we anticipate sacrifices either
in the win rate or in how much of the game we develop.
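    To make the planned approach concrete, here is a minimal sketch of the
four Monte-Carlo tree search phases (selection, expansion, simulation,
backpropagation). It is our own illustration, not a final design: the
GameState interface is an assumption, rewards are taken from a single
player's perspective (ignoring real opponents' turns), and the rollout
policy is uniformly random:

    import java.util.*;

    // GameState is an assumed interface for whatever rules version is in play.
    interface GameState {
        List<Integer> legalMoves();      // moves encoded as integers for simplicity
        GameState apply(int move);       // returns the successor state
        boolean isTerminal();
        double result();                 // e.g. 1.0 win, 0.0 loss, from the agent's view
    }

    class MctsNode {
        GameState state;
        MctsNode parent;
        List<MctsNode> children = new ArrayList<>();
        List<Integer> untriedMoves;
        int move;                        // the move that led to this node
        int visits = 0;
        double totalReward = 0.0;

        MctsNode(GameState state, MctsNode parent, int move) {
            this.state = state;
            this.parent = parent;
            this.move = move;
            this.untriedMoves = new ArrayList<>(state.legalMoves());
        }

        // UCB1: balance mean reward (exploitation) against rarely visited children.
        MctsNode bestChild(double c) {
            return Collections.max(children, Comparator.comparingDouble(
                    (MctsNode ch) -> ch.totalReward / ch.visits
                            + c * Math.sqrt(Math.log(visits) / ch.visits)));
        }
    }

    class Mcts {
        static final Random RNG = new Random();

        static int search(GameState root, int iterations) {
            MctsNode rootNode = new MctsNode(root, null, -1);
            for (int i = 0; i < iterations; i++) {
                // 1. Selection: descend through fully expanded, non-terminal nodes.
                MctsNode node = rootNode;
                while (node.untriedMoves.isEmpty() && !node.children.isEmpty()) {
                    node = node.bestChild(Math.sqrt(2));
                }
                // 2. Expansion: add one child for a randomly chosen untried move.
                if (!node.untriedMoves.isEmpty()) {
                    int m = node.untriedMoves.remove(RNG.nextInt(node.untriedMoves.size()));
                    MctsNode child = new MctsNode(node.state.apply(m), node, m);
                    node.children.add(child);
                    node = child;
                }
                // 3. Simulation: random playout to the end of the game.
                GameState s = node.state;
                while (!s.isTerminal()) {
                    List<Integer> moves = s.legalMoves();
                    s = s.apply(moves.get(RNG.nextInt(moves.size())));
                }
                double reward = s.result();
                // 4. Backpropagation: update statistics on the path to the root.
                for (MctsNode n = node; n != null; n = n.parent) {
                    n.visits++;
                    n.totalReward += reward;
                }
            }
            // Recommend the most-visited move at the root.
            return Collections.max(rootNode.children,
                    Comparator.comparingInt((MctsNode ch) -> ch.visits)).move;
        }
    }

In a full 7 Wonders agent, the simulation step would also need to sample the
hidden information, such as the cards held by opponents, before rolling out.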
    In our first version of the rules of the game, we propose to ignore the
stages of the wonder, and the ability to purchase resources from opponents.
Cards can still be sold for coins, simply as an alternative if one is unable to
play the card. In our second stage of the rules, we will add the ability to
purchase resources from the neighbors. In the third version of the rules, we
propose to add the ability to build the wonder, in addition to an ability to
play certain cards for free.
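    One hypothetical way to encode these staged rule versions is as a small
configuration object that the game engine consults before offering purchase,
wonder, or free-build actions; the flag names are our own:

    // Hypothetical configuration of the staged rule versions described above.
    record RuleSet(boolean neighborPurchasing, boolean wonderStages,
                   boolean freeBuilds) {
        static final RuleSet VERSION_1 = new RuleSet(false, false, false);
        static final RuleSet VERSION_2 = new RuleSet(true,  false, false);
        static final RuleSet VERSION_3 = new RuleSet(true,  true,  true);
    }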
    If the full rules of the game are developed, and we have an agent that
plays to our satisfaction, then we will have gone far above what we expected.
In this case, we will develop a visual, interactive user interface that makes
the agent easy to play against.

5    Timetable
 Date      Deliverable
 14-Jan    First Day of Classes
 31-Jan    Initial game rules implemented
 14-Feb    Agent for initial rules completed
 28-Feb    Second iteration of rules implemented
 14-Mar    Second Agent Implemented
 28-Mar    Third Rules Version
 11-Apr    Third Agent Implemented
 17-Apr    SSRD
 5-May     Final Deadline

References
 [1] Antoine Bauza. 7 Wonders game rules, 2010.
 [2] H.J. Berliner. Backgammon computer program beats world champion.
     Artificial Intelligence, 14(2):205–220, 1980.
 [3] Darse Billings, Aaron Davidson, Jonathan Schaeffer, and Duane Szafron.
     The challenge of poker. Artificial Intelligence, 134(1-2):201–240, Jan-
     uary 2002.
 [4] Rasmus Fynbo and Christian Nelleman. Developing an agent for Dominion
     using modern AI-approaches. Master's thesis, IT University of
     Copenhagen, 2010.
 [5] Richard Gibson, Neesha Desai, and Richard Zhao. An automated tech-
     nique for drafting territories in the board game Risk, 2010.
 [6] Stuart J. Russell and Peter Norvig. Artificial intelligence: A Modern
     Approach. Prentice Hall, third edition, 2010.
 [7] Jonathan Schaeffer and H. Jaap van den Herik. Games, computers, and
     artificial intelligence. Artificial Intelligence, 134(1-2):1–7, 2002.
 [8] Michael H. Smith. A learning program which plays partnership domi-
     noes. Communications of the ACM, 16(8):462–467, August 1973.
 [9] Istvan Szita, Guillaume Chaslot, and Pieter Spronck. Monte-Carlo tree
     search in Settlers of Catan. In H. Jaap van den Herik and Pieter Spronck,
     editors, ACG, volume 6048 of Lecture Notes in Computer Science, pages
     21–32. Springer, 2009.
[10] H. Jaap van den Herik, Jos W.H.M. Uiterwijk, and Jack van Rijswijck.
     Games solved: Now and in the future. Artificial Intelligence,
     134(1-2):277–311, 2002.
[11] Brandon Wilson. Improving game-tree search by incorporating error
     propagation and social orientations. In The 10th International Con-
     ference on Autonomous Agents and Multiagent Systems, volume 3 of
     AAMAS ’11, pages 1355–1356, Richland, SC, 2011. International Foun-
     dation for Autonomous Agents and Multiagent Systems.
