Integrated information theory of consciousness: an updated account

Archives Italiennes de Biologie, 150: 290-326, 2012.

G. Tononi
Department of Psychiatry, University of Wisconsin, Madison, WI, USA

Abstract

This article presents an updated account of integrated information theory of consciousness (IIT) and some of its implications. IIT stems from thought experiments that lead to phenomenological axioms (existence, compositionality, information, integration, exclusion) and corresponding ontological postulates. The information axiom asserts that every experience is specific – it is what it is by differing in its particular way from a large repertoire of alternatives. The integration axiom asserts that each experience is unified – it cannot be reduced to independent components. The exclusion axiom asserts that every experience is definite – it is limited to particular things and not others and flows at a particular speed and resolution. IIT formalizes these intuitions with postulates. The information postulate states that only “differences that make a difference” from the intrinsic perspective of a system matter: a mechanism generates cause-effect information if its present state has selective past causes and selective future effects within a system. The integration postulate states that only information that is irreducible matters: mechanisms generate integrated information only to the extent that the information they generate cannot be partitioned into that generated within independent components. The exclusion postulate states that only maxima of integrated information matter: a mechanism specifies only one maximally irreducible set of past causes and future effects – a concept. A complex is a set of elements specifying a maximally irreducible constellation of concepts, where the maximum is evaluated over elements and at the optimal spatio-temporal scale. Its concepts specify a maximally integrated conceptual information structure or quale, which is identical with an experience. Finally, changes in information integration upon exposure to the environment reflect a system’s ability to match the causal structure of the world. After introducing an updated definition of information integration and related quantities, the article presents some theoretical considerations about the relationship between information and causation and about the relational structure of concepts within a quale. It also explores the relationship between the temporal grain size of information integration and the dynamic of metastable states in the corticothalamic complex. Finally, it summarizes how IIT accounts for empirical findings about the neural substrate of consciousness, and how various aspects of phenomenology may in principle be addressed in terms of the geometry of information integration.

Key words
Brain • Experience • Awareness • Causation • Emergence

Corresponding Author: Giulio Tononi, 6001 Research Park Boulevard, Madison, WI, 53719, USA - E-mail: gtononi@wisc.edu

Phenomenology: Consciousness as integrated information

Everybody knows what consciousness is: it is what vanishes every night when we fall into dreamless sleep and reappears when we wake up or when we dream. Thus, consciousness is synonymous with experience – any experience – of shapes or sounds, thoughts or emotions, about the world or about the self.

It is also common knowledge that our consciousness depends on certain parts of the brain. For example, the widespread destruction of the cerebral cortex leaves people permanently unconscious (vegetative), whereas the complete removal of the cerebellum, even richer in neurons, hardly affects consciousness. Furthermore, it matters how the cerebral cortex is functioning. For example, cortical neurons remain active throughout sleep, although their firing patterns may change. Correspondingly, at certain times during sleep consciousness fades, while at other times we dream. It is also well established that different parts of the cortex influence qualitative aspects of consciousness: damage to certain parts of the cortex impairs the experience of color, whereas other lesions impair that of visual shapes. Neuroscientific findings are making progress in identifying the neural correlates of consciousness (Koch, 2004). However, to explain why experience is generated in the cortex and not in the cerebellum, why it fades in certain stages of sleep, why some cortical areas contribute color and others sound, and to address difficult issues such as the presence and quality of consciousness in newborn babies, in animals, or in pathological conditions, empirical studies are usefully complemented by a theoretical approach. Integrated information theory (IIT) constitutes such an approach. What follows is an outline of IIT, streamlined and updated with respect to previous expositions (Tononi, 2004, 2008).

Three thought experiments
Three thought experiments lie at the heart of IIT – the photodiode thought experiment, the camera thought experiment, and the internet thought experiment.

The photodiode thought experiment
Consider a human and a simple photodiode facing a blank screen that is alternately on and off. The photodiode can tell ‘light’ from ‘dark’ just as well as a human. However, a human also has an experience of light or dark, whereas the photodiode presumably does not. What is the critical property that humans have and photodiodes lack?

According to IIT, the critical property has to do with how much information is generated when the distinction between light and dark is made. From the intrinsic perspective of a system – photodiode or human – information can best be defined as a “difference that makes a difference”¹: the more alternatives (differences) can be distinguished, to the extent they lead to distinguishable consequences (make a difference), the greater the information. When the blank screen turns on, the photodiode’s mechanism, which can distinguish between a low and a high current, detects a high current and, say, triggers the output ‘light’ rather than the output ‘dark’. Since the distinction is between two alternatives, the photodiode generates 1 bit of information. We take that bit of information to specify ‘light’ as opposed to ‘dark’, but it is important to realize that, from the photodiode’s perspective, the only specification it can make is whether its input was in one of two ways and whether therefore its outputs should be in one of two ways – this way or not this way. Any further specification is impossible because it does not have mechanisms for it. Therefore, when the photodiode detects and reports ‘light’, such light cannot possibly mean what it means for us – it does not even mean that it is a visual attribute.

When a human reports pure light, by contrast, mechanisms in his brain distinguish, in a specific way, among a much larger number of alternatives, and are primed accordingly for a large number of different outcomes, thus generating many bits of information. This is because ‘light’ is distinguished not only from ‘dark’, but from a multitude of other possibilities, for example a red screen, a green screen, this movie frame, that movie frame, a sound, a different sound, a thought, another thought, and so on. In other words, each alternative can be distinguished from the others in its own specific way, and can lead to different consequences, including different verbal reports, actions, thoughts, memories etc. To us, then, ‘light’ is much more meaningful precisely because we have mechanisms that can specifically distinguish this particular state of affairs we call ‘light’ against each and every one of a large number of alternatives, and lead to appropriately different consequences. Indeed, as a human, no matter how hard I try, I cannot empty an experience of meaning: I cannot reduce the experience of ‘light’ to ‘this and not this’. More generally, if I am not blind from birth, I cannot reduce myself to lacking visual experiences; if I am not color-blind, I cannot reduce myself to seeing the world in black-and-white; if I know English, I cannot see the word “English” and not understand it; if I am an experienced musician, I cannot reduce myself to listening to a sonata as if I were a novice, and so on.

This central point may be appreciated either by addition or by subtraction. By addition, I realize that I
can only see ‘light’ the way I see it, as progressively more and more meaning is added by mechanisms that specify how ‘light’ differs from each of countless alternatives: from various colors, shapes, and countless other visual and non-visual experiences. By subtraction, I can realize that, if I were to lose one neural mechanism after the other, my being conscious of ‘light’ would degrade – it would lose its non-coloredness, its non-shapedness, it would even lose its visualness – while its meaning is progressively stripped down to just ‘one of two ways’, as with the photodiode. Either way, the theory says that, the more my mechanisms specify how ‘light’ differs from its many alternatives, and thereby lead to different consequences – the more they specify what light means – the more I am conscious of it.

The camera thought experiment
Information – the ability to discriminate among a large number of alternatives – is thus an essential ingredient for consciousness. However, another thought experiment, this time involving a digital camera, shows the need for a second ingredient. Assume the sensor chip of the camera is a collection of a million binary photodiodes. Taken together, then, the camera’s photodiodes can distinguish among 2^1,000,000 alternative states, an immense number, corresponding to 1 million bits of information. Indeed, the camera would respond differently to every possible image. Yet few would argue that the camera is conscious. What is the critical difference between a human being and a camera?

According to IIT, the difference has to do with information integration. From the point of view of an external observer, the camera may be considered as a single system with a repertoire of 2^1,000,000 states. However, the chip is not an integrated entity: since its 1 million photodiodes have no way to interact, each photodiode performs its own local discrimination between a low and a high current, completely independent of what every other photodiode might be doing. In reality, the chip is just a collection of 1 million independent photodiodes, each with a repertoire of 2 inputs and outputs – there is no intrinsic point of view associated with the camera chip as a whole. This is easy to see: if the sensor chip were cut into 1 million pieces each holding its individual photodiode, the performance of the camera would not change at all.

By contrast, a human distinguishes among a vast repertoire of alternatives as a single, integrated system, one that cannot be broken down into independent components each with their own separate repertoire. Phenomenologically, every experience is an integrated whole, one that means what it means by virtue of being one, and which is experienced from a single point of view. For example, no matter how hard I try, experiencing the full visual field cannot be reduced into experiencing separately the left half and the right half. No matter how hard I try, I cannot reduce the experience of a red apple into the separate experience of its color and its shape. Indeed, the only way to split an experience into independent experiences seems to be splitting the brain in two, as in patients who underwent the section of the corpus callosum to treat severe epilepsy (Gazzaniga, 2005). Such patients do indeed experience the left half of the visual field independently of the right side, but then the surgery has created two separate consciousnesses instead of one. Therefore, underlying the unity of experience must be causal interactions among certain elements within the brain. This means that these elements work together as an integrated system, which is why, unlike the camera, their performance breaks down if they are disconnected.

The internet thought experiment
Unlike the camera chip, the internet is obviously integrated – in fact, its main purpose is to permit exchanges of messages between any point of the net and any other point. It can also be used to disseminate or ‘broadcast’ messages from any one node to many others. The integration is achieved by routers that act as dynamic switches connecting any address in the network with any other address. And yet it seems unlikely that, at least in its current form, the internet is giving rise to some kind of globally integrated consciousness. What could be the critical difference between the network of neurons inside the brain that gives rise to human consciousness, and the network of internet routers connecting devices throughout the world?

According to IIT, the difference has to do with the fact that the neural substrate of consciousness is wired to achieve maxima of integrated information, whereas the internet is not. Consider the internet first. The internet is not designed to achieve a maximum of integrated information, but to ensure point-to-point
communication. Indeed, interactions within the internet can typically be reduced to independent components, and they better be independent, otherwise there would be a chaotic cross-talk and point-to-point communication would not be possible. In other words, the ability to obtain independent, point-to-point signaling excludes the ability to perform global computations, and vice versa. Thus, the internet, while integrated enough to permit point-to-point signaling, is certainly not maximally integrated – not from the intrinsic perspective of the internet itself. On the other hand, from the perspective of an external user, this has great advantages. For example, from a particular node, say the terminal of an information technologist, one can access without any cross-talk a connected hand-held device to diagnose exactly what the speech recognition module is doing or why it may be malfunctioning; or how the power regulating circuits are performing; or one can access a connected peripheral, say a printer, to diagnose if it is running properly; or access anybody else’s computer and check any aspect of its functioning; and so on for any other connected device. Moreover, one can check the computations of any connected node at a range of spatial and temporal scales, from the operations performed by individual transistors at microsecond resolution to daily averages of traffic over a hub. However, the price of such complete access is that the internet is not well suited, at least in its current form, to achieve what one may call ‘global’, autonomous computations.

By contrast, within consciousness information is maximally integrated: every experience is whole, and the entire set of concepts that make up any particular experience – what makes the experience what it is and what it is not – are maximally interrelated. This integration is excellent for a context-dependent understanding of a particular state of affairs, but the flip side of maximal information integration is exclusion. No matter how hard I try, I cannot become conscious of what is going on within the modules in my brain that perform language parsing: I hear and understand an English sentence, but I have no conscious access to how the relevant parts of my brain are achieving this computation, although of course they must be connected to those other parts that give rise to my present consciousness. Similarly, I have no conscious access to those other parts of my brain that are in charge of blood pressure regulation; or to the complex computations in the cerebellum that help maintain my posture. And I certainly do not have access to whatever is going on in peripheral organs in my body, such as the liver, the kidneys and so on. Furthermore, while I can interact with other people, I have no access to their internal workings. Exclusion applies also within consciousness: at any given time, there is only one consciousness – one maximally integrated subject – me – having one full experience, not a multitude of partial consciousnesses, each experiencing a subset of the contents of my experience. Instead, each experience is compositional, i.e. structured – it is constituted of different aspects in various combinations: I see the shape of the apple, I see its red color, I see a position in space, and I also see that the apple is red and occupies that position. Exclusion also occurs in spatio-temporal terms: what I experience, I experience at a particular spatial and temporal resolution: I have no way to experience directly processes within my brain – even within the parts that are involved in generating experience – that happen at a much finer spatial grain, such as the workings of molecules and atoms within neural cells, or at a much finer temporal grain, such as the millisecond-by-millisecond traffic of spikes among neurons. Similarly, I cannot experience events at a coarser spatial or temporal scale: for example, no matter how hard I try, I cannot lump together into a single experience an entire movie, a waking day, or a lifetime: there is a “right” time scale at which consciousness flows – at other time scales, consciousness simply does not exist.

Phenomenological axioms, ontological postulates, and identities

Based on the intuitions provided by these thought experiments, the main tenets of IIT can be presented as a set of phenomenological axioms, ontological postulates, and identities. The central axioms, which are taken to be immediately evident, are as follows:

An initial axiom is simply that consciousness exists. Paraphrasing Descartes, “I experience therefore I am”².

Another axiom concerns compositionality: experience is structured, consisting of multiple aspects in various combinations. Thus, even an experience of
pure darkness and silence contains visual and auditory aspects, spatial aspects such as left, center, and right, and so on.

A central axiom concerns information: experience is informative or specific – in that it differs in its particular way from other possible experiences. Thus, an experience of pure darkness and silence is what it is by differing, in its particular way, from an immense number of other possible experiences – including the experiences triggered by any frame of any possible movie.

Another axiom concerns integration: experience is integrated – in that it cannot be reduced to independent components. Thus, experiencing the word “SONO” written in the middle of a blank page cannot be reduced to an experience of the word “SO” at the right border of a half-page, plus an experience of the word “NO” on the left border of another half-page – the experience is whole.

Yet another axiom is exclusion: experience is exclusive – in that it has definite borders and a definite temporal and spatial grain. Thus, an experience encompasses what it does, and nothing more; at any given time there is only one experience having its full content, it flows at a particular speed, and it has a certain resolution such that certain distinctions are possible and finer or coarser distinctions are not.

To parallel the phenomenological axioms, IIT posits some ontological postulates:

An initial postulate is simply that mechanisms in a state exist. That is, there are operators that, given an input, produce an output, and at a given time such operators are in a particular state.

Another postulate concerns compositionality: mechanisms can be structured, forming higher order mechanisms in various combinations.

A central postulate concerns information: from the intrinsic perspective of a system, a mechanism in a state generates information only if it has both selective causes and selective effects within the system – that is, the mechanism must constitute “a difference that makes a difference within the system”. This intrinsic, causal notion of information can be assessed by examining the cause-effect repertoire (CER) specified by a mechanism in a state – the set of past system states that could have been the causes of its present state and the set of future system states that could have been its effects. If a mechanism in a state does not specify either selective causes or selective effects (for example by lacking inputs or outputs), then the mechanism does not generate any cause-effect information (CEI) within the system. Ontologically, the information postulate claims that, from the intrinsic perspective of a system, only differences that make a difference within the system exist.

Another postulate concerns integration: a mechanism in a state generates integrated information only if it cannot be partitioned into independent submechanisms. That is, the information generated within a system should be irreducible to the information generated within independent sub-systems or independent interactions. Integrated information (ϕ) can be captured by measuring to what extent the information generated by the whole differs from the information generated by its components (minimum information partition MIP). Ontologically, the integration postulate claims that only irreducible interactions exist intrinsically, i.e. in and of themselves.

Yet another postulate concerns exclusion: a mechanism in a state generates integrated information about only one set of causes and effects – the one that is maximally irreducible. That is, the mechanism can specify only one pair of causes and effects. By a principle of causal parsimony, this is the pair of causes and effects whose partition would produce the greatest loss of information. This maximally irreducible set of causes and effects is called a concept. Exclusion can be captured by measuring the maximum of integrated information ^{max}ϕ^{MIP} over all possible cause-effect repertoires of the mechanism over the system. Ontologically, the exclusion postulate claims that only maximally irreducible entities exist intrinsically³.

As will be discussed below, the postulates can be applied to subsets of elements within a system (mechanisms) as well as to systems (sets of concepts). A system of elements that generates cause-effect information (it has concepts), is irreducible (it cannot be
split into mutually independent subsystems), and is a local maximum of irreducibility (in terms of the concepts it generates) over a set of elements and over an optimal spatio-temporal grain of interactions, constitutes a complex – a maximally irreducible entity. In this view, only complexes are entities that exist intrinsically, i.e. in and of themselves.

Finally, IIT posits identities between phenomenological aspects and informational/causal aspects of systems. The central identity is the following: an experience is a maximally integrated conceptual information structure. Said otherwise, an experience is a “shape” or maximally irreducible constellation of concepts in qualia space (a quale), where qualia space is a space spanned by all possible past and future states of a complex. In this space, concepts are points in the space whose coordinates are the probabilities of past and future states corresponding to maximally irreducible cause-effect repertoires specified by various subsets of elements.

In what follows, the postulates of IIT are briefly illustrated by considering a set of mechanisms (a candidate system of elements). Within the system, the postulates are first applied to mechanisms in a state, alone or in combination (all subsets), to identify concepts; then the postulates are applied to different systems of elements and the collection of concepts they generate, in order to identify complexes⁴.

Information
The information postulate says that information is a difference that makes a difference from the intrinsic perspective of a system. This intrinsic, causal⁵ notion of information is assessed by considering if the present state of a mechanism can specify both past causes and future effects within the system.

Within a system X, consider a subset of elements S in its present state s⁶. The information s generates about some subset of elements of X in the past (P) is the effective information (EI) between P and s:

    EI(P | s) = D[(P | s), P^{Hmax}]

where D indicates the difference between two distributions, in this case between the distribution of P states that could have caused s given its present mechanism and state (the cause repertoire CR), and the maximum uncertainty (entropy) distribution P^{Hmax}, in which all P outputs are equally likely a priori⁷. Thus, EI(P|s) represents the differences in the past states of P that can be detected by mechanism S in its present state s. Similarly, the difference D between the distribution of F states that would be the effect of ‘fixing’ mechanism S in its present state s (the effect repertoire ER) and the distribution of states of F in which all F inputs are equally likely (F^{Hmax}) is the effective information s generates about future states of F:

    EI(F | s) = D[(F | s), F^{Hmax}]

Thus, EI(F|s) represents the differences to the future states of F made by mechanism S being in its present state s. Clearly, EI(P|s) > 0 only if past states of P make a difference to s, and EI(F|s) > 0 only if s makes a difference to F.

Based on the information postulate, a mechanism in a state (s) generates information from the intrinsic perspective of a system only if it both detects differences in the past states of the system and makes a difference to its future states. That is, s generates information only if it has both selective causes (EI(P|s) > 0) and selective effects (EI(F|s) > 0). The minimum of the two, which represents the ‘bottleneck’ in the channel between past causes over P and future effects over F as mediated by the mechanism S in its present state s, is called cause-effect information (CEI):

    CEI(P, F | s) = min[EI(P | s), EI(F | s)]

Clearly, CEI > 0 only if the system’s states make a difference to the mechanism, and the state of the mechanism makes a difference to the system. Thus an element that monitors the state of the system (say a parity detector), but has no effects on the system, may be relevant from the extrinsic perspective of an observer, but is irrelevant from the intrinsic perspective of the system, as it makes no difference to it. If CEI > 0, the cause and effect repertoires together can be said to specify a cause-effect repertoire (CER).

As an example, consider a mechanism A within an isolated system ABC (Fig. 1). The wiring diagram is unfolded into a directed acyclic graph over past, present, and future. A’s mechanism is a logical AND

Fig. 1. - A cause-effect repertoire (CER) and the cause-effect information it generates (“differences that make a
difference”). See text for explanation.
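
The arithmetic behind the Fig. 1 example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text leaves the difference measure D unspecified at this point, so the sketch assumes the Kullback-Leibler divergence in bits, and it assumes each repertoire is uniform over the states the text says are compatible with A being ON. All function and variable names are hypothetical.

```python
from itertools import product
from math import log2

# All 8 joint states of the elements (A, B, C); 1 = ON, 0 = OFF.
STATES = list(product((0, 1), repeat=3))


def uniform_over(states):
    """Distribution over STATES that is uniform on `states` and zero elsewhere."""
    support = set(states)
    return {s: (1 / len(support) if s in support else 0.0) for s in STATES}


def kl_bits(p, q):
    """Kullback-Leibler divergence D(p || q) in bits (an assumed stand-in for D)."""
    return sum(p[s] * log2(p[s] / q[s]) for s in STATES if p[s] > 0)


# Maximum entropy distribution: all 8 states equally likely.
max_entropy = uniform_over(STATES)

# Cause repertoire of A = ON: past states in which B and C are both ON (2 states).
cause_repertoire = uniform_over([s for s in STATES if s[1] == 1 and s[2] == 1])

# Effect repertoire of A = ON: future states in which B is OFF (4 states).
effect_repertoire = uniform_over([s for s in STATES if s[1] == 0])

ei_past = kl_bits(cause_repertoire, max_entropy)     # EI(P | s)
ei_future = kl_bits(effect_repertoire, max_entropy)  # EI(F | s)
cei = min(ei_past, ei_future)                        # CEI(P, F | s)

print(ei_past, ei_future, cei)  # 2.0 1.0 1.0 under the assumptions above
```

Under these assumptions the cause side narrows eight past states to two (2 bits), the effect side narrows eight future states to four (1 bit), and the smaller of the two is the cause-effect information. A different choice of D would give different numbers but the same bottleneck structure.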

gate of elements B and C, turning ON if both B and C are ON; moreover, if A is ON, it turns OFF B. Thus, A specifies that, starting from the eight possible past states of elements ABC (maximum entropy distribution), only two past outputs of ABC can lead to A’s present state (ON) – those in which B and C are both ON (cause repertoire CR), thereby ‘detecting differences’ and generating EI. Moreover, A specifies that, starting from maximum entropy over the inputs to ABC, A’s present state (ON) can only lead to four future states of ABC – those in which B is OFF (effect repertoire ER), thereby ‘making a difference’. Together, CR and ER specify the cause-effect repertoire CER = (ABC)_pa | A_pr, (ABC)_fu | A_pr, where the subscripts pa, pr, and fu refer to past, present, and future. The cause-effect information (CEI) generated by a mechanism over its cause-effect repertoire (CER) is the minimum between EI[(ABC)_pa | A_pr] and EI[(ABC)_fu | A_pr].

Integration
The integration postulate says that information is integrated if it cannot be partitioned into independent components. That is, a mechanism in a state generates integrated information only if it cannot be partitioned into submechanisms with independent causes and effects. This integrated (irreducible) information is quantified by ϕ (small phi), a measure of the difference D between the repertoire specified by a whole and the product of the repertoires specified by its partition into causally independent components. The difference is taken over the partition that yields the least difference from the whole (the minimum information partition (MIP)), i.e. ϕ^{MIP}.⁸

Consider a partition / that splits the interactions between P and S into independent interactions between parts of P and parts of S⁹, which can be done by ‘injecting’ noise (H^{max}) in the connections among them. One can then measure the difference D between the unpartitioned cause repertoire CR and the partitioned CR. For the partition that minimizes D, known as minimum information partition (MIP), the difference D is called ϕ (small phi). The same holds for the difference D between the unpartitioned and partitioned effect repertoire ER:

    ϕ^{MIP}(P | s) = D[(P | s), ∏(P | s / MIP)];
    ϕ^{MIP}(F | s) = D[(F | s), ∏(F | s / MIP)]

Thus, ϕ^{MIP}(P|s) is the ‘past’ integrated (irreducible) information, and ϕ^{MIP}(F|s) is the ‘future’ integrated

(irreducible) information. Clearly, ϕMIP(P|s) > 0 only if the past states of P make a difference to s that cannot be reduced to differences made by parts of P on parts of s, and likewise for ϕMIP(F|s) > 0.

Based again on the information postulate, a mechanism in a state (s) generates integrated information from the intrinsic perspective of a system only if this information is irreducible both in the past and in the future. That is, s generates integrated information only if it has both irreducible causes (ϕMIP(P|s) > 0) and irreducible effects (ϕMIP(F|s) > 0). The minimum of the two, which represents the ‘bottleneck’ in the channel between the past P and the future F as mediated by the mechanism S in its present state s, is called ‘cause-effect’ integrated information:

    ϕMIP (P, F | s) = min [ϕMIP (P | s), ϕMIP (F | s)]

As an example, Fig. 2a shows a set of 4 elements ABCD, where A is reciprocally connected to B and C is reciprocally connected to D. The wiring diagram is again unfolded into a directed acyclic graph over past, present, and future. Consider now the cause repertoire (ABCD)pa | (ABCD)pr and a partition between subsets of elements AB on one side and CD on the other side: ϕMIP (P | s) = (ABCD)pa | (ABCD)pr || (AB)pa | (AB)pr x (CD)pa | (CD)pr = 0. Similarly for the effect repertoire, ϕMIP (F | s) = (ABCD)fu | (ABCD)pr || (AB)fu | (AB)pr x (CD)fu | (CD)pr = 0. Thus, as expected, for this partition ϕMIP = min [ϕMIP (P | s), ϕMIP (F | s)] = 0. That is, considering the ‘whole’ CER specified by (ABCD)pa | (ABCD)pr and (ABCD)fu | (ABCD)pr adds nothing compared to considering the independent ‘partial’ CER specified by (AB)pa | (AB)pr, (AB)fu | (AB)pr and by (CD)pa | (CD)pr, (CD)fu | (CD)pr. In other words, there is no reason to maintain that the ‘whole’ CER ABCD exists in and of itself, as it makes no difference above and beyond the two partial CER AB and CD. Thus, searching for partitions among sets of elements yielding ϕMIP = 0 enforces a principle of causal parsimony.
As another example, consider a partition between interactions. The system depicted in Fig. 2b is such that A copies B and B copies A. For the cause-repertoire CR of AB and its partition into independent interactions of A with B and B with A one has that ϕMIP (P | s) = (AB)pa | (AB)pr || (B)pa | (A)pr x (A)pa | (B)pr = 0, and similarly for the effect repertoire ER. That is, the CER of AB over AB (written AB/AB) reduces without loss to the independent CER of A/B

Fig. 2. - Integrated information generated by an irreducible CER, as established by performing partitions. See text
for explanation.
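The AB x CD partition discussed for Fig. 2a can be checked numerically. The sketch below is illustrative only and rests on assumptions that go beyond the text: elements are binary, each element simply copies its partner (stated for Fig. 2b, assumed here for Fig. 2a as well), a cause repertoire is obtained by conditioning a uniform (maximum entropy) distribution over past states on the present state of the mechanism, and D is taken to be the Kullback-Leibler divergence in bits. All function names are hypothetical.

```python
from itertools import product
from math import log2

N = 4  # elements A, B, C, D stored at indices 0..3

def step_fig2a(s):
    """Assumed copy dynamics: A <- B, B <- A, C <- D, D <- C."""
    a, b, c, d = s
    return (b, a, d, c)

def cause_repertoire(step, mech, purview, present):
    """P(past state of purview | present state of mech), uniform prior on pasts."""
    counts = {}
    for past in product((0, 1), repeat=N):
        nxt = step(past)
        if all(nxt[i] == present[i] for i in mech):
            key = tuple(past[i] for i in purview)
            counts[key] = counts.get(key, 0) + 1
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def kl_bits(p, q):
    """KL divergence D(p || q) in bits (assumed choice for the difference D)."""
    return sum(pv * log2(pv / q[k]) for k, pv in p.items() if pv > 0)

def phi_cause_for_partition(step, present, part1, part2):
    """Difference between the whole cause repertoire and the product of the parts."""
    whole = cause_repertoire(step, range(N), range(N), present)
    r1 = cause_repertoire(step, part1, part1, present)
    r2 = cause_repertoire(step, part2, part2, present)
    product_rep = {}
    for past in product((0, 1), repeat=N):
        k1 = tuple(past[i] for i in part1)
        k2 = tuple(past[i] for i in part2)
        product_rep[past] = r1.get(k1, 0.0) * r2.get(k2, 0.0)
    return kl_bits(whole, product_rep)

present = (1, 0, 0, 1)  # an arbitrary present state of ABCD
d_ab_cd = phi_cause_for_partition(step_fig2a, present, (0, 1), (2, 3))
d_ac_bd = phi_cause_for_partition(step_fig2a, present, (0, 2), (1, 3))
print(d_ab_cd, d_ac_bd, min(d_ab_cd, d_ac_bd))
```

Under these assumptions the AB x CD cut loses nothing (0 bits), whereas a cut across the connected pairs (AC x BD) loses 4 bits. Because ϕMIP is evaluated on the partition yielding the least difference, the value for ABCD as a whole is 0, in line with the conclusion drawn from Fig. 2a.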

and B/A both in the past and in the future. Thus, there is no reason to maintain that the CER AB/AB exists in and of itself, as it makes no difference above and beyond the independent CER of A/B and B/A. Again, searching for partitions among interactions yielding ϕMIP = 0 enforces a principle of causal parsimony.
By contrast, consider a system in which A is a linear threshold unit that receives strong inputs from B and C, which if both ON are sufficient to turn A ON, and a weak input from D; and in which A has strong outputs to B and C (it turns both ON), and a weak output to D (Fig. 2c). Considering the CR of A/BCD, one has that its partition A/BC x D/[] ([] indicates the empty set) yields ϕMIP > 0, and the same holds for the ER. Thus, this CER is irreducible, since there is no way to partition it without losing some information – in this case some information about element D.

Exclusion
The exclusion postulate says that integrated information is about one set of causes and effects only – those that are maximally irreducible – other causes and effects are excluded. That is, a mechanism in a state can specify only one pair of causes and effects, which, by a principle of causal parsimony, is the one whose partition would produce the greatest loss of information. This maximally irreducible set of causes and effects (MICE) is called a concept or, for emphasis, a “core concept”.

For a given subset of elements S in a present state s, there are potentially many cause repertoires CR depending on the particular subset P one considers (within system X). Exclusion states that, at a given time, s can have only one CR – which is the one having the maximum value of ϕMIP (maxϕMIP), where the maximum is taken over all possible subsets P within the system¹⁰. The corresponding CR is called the core cause of s within X. Similarly, the effect repertoire ER having maxϕMIP over all possible subsets F within the system is called the core effect of s within X.
Based again on the information postulate, a mechanism in a state (s) generates integrated information from the intrinsic perspective of a system only if this information is maximally irreducible both in the past and in the future. That is, s generates maximally integrated information only if it has both maximally irreducible causes (maxϕMIP(P|s) > 0) and maximally irreducible effects (maxϕMIP(F|s) > 0). The minimum of the two, which represents the ‘bottleneck’ in the channel between the past P and the future F as mediated by the mechanism S in its present state s, is called ‘cause-effect’ maximally integrated information:

    maxϕMIP (P, F | s) = min [maxϕMIP (P | s), maxϕMIP (F | s)]

The cause-effect repertoire of s that has maxϕMIP (P,F|s) within a system X is called a concept. Thus, from the intrinsic perspective of a system, a concept is a maximally irreducible set of causes and effects (MICE) specified by a mechanism in a state.
For example, in Fig. 3 the powerset of CER (or ‘purviews’) of subset A within system ABCD includes, for the cause repertoires, A/A; A/B; A/C; A/D; A/AB; A/AC; A/AD; A/BC; A/BD; A/CD; A/ABC; A/ABD; A/ACD; A/BCD; A/ABCD. Of these, the partition A/BC || A/B x []/C = maxϕMIP turns out to be maximal (Fig. 3b), higher for example than the partition in Fig. 3a (A/BCD || A/BC x []/D). This is because partitioning away element B (or A) loses much more integrated information than any other partition. A similar result is obtained for the powerset of partitions of A/ABCD for the effect repertoires. By the exclusion postulate, only one CER exists – the one made of the maximally irreducible CR and ER – excluding any other CER¹¹.
The reason to consider exclusively the CER with maxϕMIP is as before a principle of causal parsimony – more precisely, a principle of least reducible reason. Consider A being ON in the previous example: it specifies a cause repertoire, but cannot distinguish which particular cause was actually responsible for its being ON; and with respect to its effects, it makes no difference which cause turned A ON. Since the particular cause does not matter, the exclusion postulate enforces causal parsimony, defaulting to the maximally irreducible set of causes for A being ON. These least ‘dispensable’ and thus most likely ‘responsible’ causes can be called the ‘core’ causes for A being ON, in the sense that their elimination would have made the most difference¹² ¹³. In turn, the fact that A is ON also specifies a forward repertoire of possible effects, but once again A should be held most responsible only for its

Fig. 3. - Maximally integrated information generated by a maximally irreducible CER over all possible CER specified
by a subset of elements within a system. See text for explanation.
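The search over candidate purviews illustrated in Fig. 3 has a simple combinatorial skeleton: list every non-empty subset of the system as a possible purview of the mechanism, score each one, and keep the single highest-scoring purview. The sketch below shows only that skeleton. The function names are hypothetical, and the scores passed in at the end are stand-in ranks, not values of ϕMIP computed from any mechanism.

```python
from itertools import combinations

def candidate_purviews(system):
    """All non-empty subsets of the system's elements, smallest first."""
    return ["".join(c)
            for r in range(1, len(system) + 1)
            for c in combinations(system, r)]

def core_purview(purviews, phi_mip):
    """Purview with the largest score; exclusion keeps only this one."""
    return max(purviews, key=phi_mip)

purviews = candidate_purviews("ABCD")
print(len(purviews))
print("; ".join("A/" + p for p in purviews))

# Stand-in ranks, NOT computed values: they only encode the statement
# that A/BC scores higher than A/BCD in the example of Fig. 3.
stand_in_rank = {"BC": 2, "BCD": 1}
core = core_purview(purviews, lambda p: stand_in_rank.get(p, 0))
print("core cause purview: A/" + core)
```

For a four-element system this yields the 15 purviews A/A through A/ABCD. A real evaluation would replace the stand-in ranks with the irreducibility of each purview under its own minimum information partition.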

maximally irreducible or ‘core’ effects: the effects for which A being ON is least dispensable, meaning that eliminating A’s output would have made the most difference¹⁴.

Concepts
A concept or ‘core’ concept thus specifies a maximally irreducible cause-effect repertoire (CER) implemented by a mechanism in a state. Within a concept, one can distinguish a core cause – the set of past input states (cause repertoire CR) constituting maximally irreducible causes of the present state of the mechanism; and a core effect – the set of future output states (effect repertoire ER) constituting maximally irreducible effects of its present state. For example, an element (or set of elements) implementing the concept “table”, when ON, specifies ‘backward’ the maximally irreducible set of inputs that could have caused its turning ON (e.g. seeing, touching, imagining a table); ‘forward’, it specifies the set of outputs that would be the effects of its turning ON (e.g. thinking of sitting at, writing over, pounding on a table)¹⁵.
As an example, consider the system in Fig. 4, whose wiring diagram is on the left. The middle panel shows the four concepts generated by the system, with their maximally irreducible cause-effect repertoires and the corresponding maxϕMIP. For the concept generated by all three elements (ABC, top row) the figure also shows the product repertoires generated by the minimum information partitions of its maximal cause and effect repertoires.
For a given set of elements, it is useful to consider concepts as points within a space (concept space) that has as many axes as the number of possible past and future states of the set (Fig. 4, right panel; the axes are depicted along a circle but should be imagined in a high-dimensional space; the points are indicated as stars). Each concept specifies a maximally irreducible CER, which is a set of probabilities over all possible past and future states, and these probabilities specify a particular point in concept space (more precisely, since probabilities must sum to 1, in the subspace given by the corresponding concept simplex). The concept ‘exists’ with an ‘intensity’ given by maxϕMIP, that is, its degree of irreducibility (shown by the size of the star).

                 Fig. 4. - An integrated conceptual information structure. See text for explanation.
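The idea of a concept as a point in concept space can be made concrete for a set of three binary elements, which has 8 possible past states and 8 possible future states and therefore 16 axes. The sketch below is a hedged illustration: the function name is hypothetical, and the repertoires are the uniform ones assumed for the AND-gate example of Fig. 1, reused here only as sample coordinates and not claimed to be maximally irreducible.

```python
from itertools import product

def concept_point(cause_rep, effect_rep, n_elements):
    """Coordinates of a CER: one axis per past state, then one per future state."""
    states = list(product((0, 1), repeat=n_elements))
    cause_axes = [cause_rep.get(s, 0.0) for s in states]
    effect_axes = [effect_rep.get(s, 0.0) for s in states]
    return cause_axes + effect_axes

# Sample repertoires over states (A, B, C), assumed uniform over compatible states:
# past states with B and C ON, future states with B OFF.
cause_rep = {(0, 1, 1): 0.5, (1, 1, 1): 0.5}
effect_rep = {s: 0.25 for s in product((0, 1), repeat=3) if s[1] == 0}

point = concept_point(cause_rep, effect_rep, 3)
print(len(point))                       # number of axes of concept space
print(sum(point[:8]), sum(point[8:]))   # each half is a probability distribution
```

Because the past and future coordinates each sum to 1, the point is confined to a lower-dimensional subspace of the 16 axes, which is the constraint the text refers to as the concept simplex.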

It is thus possible to evaluate the overall constellation of concepts generated by the set of elements in a single concept space, which can be called a conceptual information structure C. The relevant features one can consider include: i) the intensity, i.e. irreducibility maxϕMIP of existing concepts; ii) the “shape” of the constellation of concepts in concept space; iii) the dimensionality of the sub-space spanned by all the concepts; iv) the scope of the subspace covered by the concepts; v) the scope of the subspace covered by the concepts weighted by their intensity¹⁶ ¹⁷.

Complexes
By considering the conceptual information structure C (“constellation” C) specified in concept space by all the concepts generated by a system (Fig. 4), the postulates of IIT can be applied not only to find the maximally irreducible CER of a subset of elements (concepts), but also to find sets of elements, called complexes, which generate maximally integrated conceptual information structures. As with concepts, so with complexes, this can be done by: i) making sure, by partitioning the elements of a system, that the constellation of concepts generated by a set of elements cannot be reduced to the product of the constellations generated by the parts (integration postulate); ii) ensuring that the constellation of concepts generated by one part of the system has both selective causes and selective effects in the other part (information postulate); iii) choosing the set of elements that generates the most irreducible constellation of concepts (exclusion postulate).
As before, the irreducibility mandated by the integration postulate can be determined by measuring the difference D between the constellation of concepts generated by the whole, unpartitioned set of elements s, and that generated after its partition P into parts:

    ΦP→ (C | s) = D (C | s, C | s/P→);
    ΦP← (C | s) = D (C | s, C | s/P←)

where the arrow next to P indicates a unidirectional partition, i.e. one that separates causes from effects across the parts by injecting noise in the connections going from one part to the other. Applying as before the information postulate, one has:

    ΦP (C | s) = min [ΦP→ (C | s), ΦP← (C | s)]

That is, one first partitions across the inputs (causes) to one side of the partition (i.e. the outputs or effects from the other side), then the other way around, and one takes the minimum across the partition. Finally, as before, one finds the partition for which ΦP (C | s) reaches its minimum value, ΦMIP (C | s), where MIP is the minimum information partition, and ΦMIP stands for integrated conceptual information. Thus, if ΦMIP (C | s) > 0, no partition can divide the system into non-interacting, mutually independent parts. Moreover, the greater the value of ΦMIP, the more irreducible the constellation of concepts generated by a particular set of elements¹⁸. Finally, according to the exclusion postulate, out of many possible constellations of concepts generated by overlapping sets of elements only one exists: the one that is maximally irreducible. Thus, one needs to evaluate ΦMIP for all sets of elements s, i.e. s = A, B, C, AB, AC, BC, ABC¹⁹. The set of elements generating the constellation with the maximum value of ΦMIP (maxΦMIP, or maximally integrated conceptual information) constitutes the main complex within the overall system; the corresponding concept space (simplex) is called qualia space, and the constellation of concepts it generates – the maximally integrated conceptual (information) structure – is called a quale Q²⁰.
For example, an exhaustive analysis of the system in Fig. 4 shows that the full set ABC constitutes a complex, as no other set of elements yields integrated conceptual structures having a higher value of ΦMIP. In larger systems, one would first identify the main complex and then, recursively, identify other complexes among the remaining elements.
Therefore, a complex can be defined as a set of elements generating a maximally irreducible constellation of concepts (a maximally integrated conceptual structure). In essence, then, just like a concept specifies a particular, maximally integrated distribution of system states out of possible distributions (a point in concept space), a complex specifies a particular, maximally integrated conceptual structure (constellation of points) out of possible conceptual structures in concept space. As indicated by the information axiom, that constellation differs in its particular way from other possible constellations.
A schematic representation of a reduction of a system into complexes plus the residual interactions among them is illustrated in Fig. 5a. Note, for example, that due to the exclusion postulate, although complexes can interact, they cannot overlap. Thus, when two complexes of high maxΦMIP interact weakly, their union does not constitute a third complex, even though its ΦMIP value may be > 0: once again, there is no need to postulate additional entities, because they would make no further difference beyond what is accounted for by the two complexes of high maxΦMIP plus their weak interactions²¹. This is a direct application of Occam’s razor: “entities should not be multiplied beyond necessity”²². We recognize this principle intuitively when we talk to each other: most people would assume that there are just two consciousnesses (complexes of maxΦMIP) that interact a little, and not also a third consciousness (complex of lower ΦMIP) that includes both speakers. In summary, a complex is an individual, informationally integrated entity that is maximally irreducible: i) it cannot be partitioned into more integrated parts; ii) it is not part of a more integrated system; iii) it is separated through a boundary from everything external to it (it excludes it). In this view, any system of elements ‘condenses’ into distinct, non-overlapping complexes that constitute local maxima of integrated conceptual information.

Optimal spatio-temporal grain
The exclusion postulate should be applied not only over sets of elements, but over different spatial and temporal scales. For any given system, one can group and average the states of several micro-elements into states of a smaller number of macro-elements. Similarly, one can group and average states over several micro-intervals into longer macro-intervals. For each spatio-temporal grain, one calculates CER, concepts (maximally irreducible CER), and complexes (sets of elements generating maximally integrated conceptual structures). By the exclusion postulate, a particular set of elements, over a particular spatio-temporal grain, will yield the max value of ΦMIP, thereby excluding any overlapping subsets and spatio-temporal grains.
As an example, consider the brain: over which elements should one consider perturbations and the repertoire of possible states? A natural choice would be neurons, but other choices, such as neuronal groups at a coarser scale, or synapses at a finer scale, might also be considered, not to mention molecules and

Fig. 5. - Complexes: maxima of integrated conceptual information over elements, space, and time. In the left panel,
the blue ovals represent several separate complexes, i.e. local maxima of maxΦMIP, each containing a schematic
constellation, i.e. an integrated information structure comprising different concepts (stars). Each large blue oval –
a main complex corresponding to an individual consciousness generated by a subset of neurons within the brain
– is contained within a larger white oval that stands e.g. for the body, a system that does not constitute a complex
and is thus not conscious. Inside the body, besides the main complex, are smaller complexes having very low maxΦMIP
(only one shown) and presumably many smaller ones that are not represented. The curved lines represent
interactions among parts of the body that remain outside individual complexes and thus outside consciousness.
The large oval that encompasses both bodies indicates that the two consciousnesses interact within a larger sys-
tem that is again not a complex and is thus not conscious. The outer dashed oval stands for the immediate envi-
ronment. The right panels indicate that, within a system such as the brain, maxΦMIP will reach a maximum not only
within a particular subset of elements but also at a particular spatio-temporal scale. See text for further explanation.
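Grouping and averaging micro-states into macro-states, as invoked for the temporal grain, can be sketched for a single binary spike train. This is a minimal illustration under stated assumptions: the function name, the window length, and the threshold that turns an average firing rate into a burst/no-burst macro-state are all hypothetical choices, and the sketch performs only the coarse-graining step, not any evaluation of ΦMIP at either grain.

```python
def coarse_grain(spikes, window, threshold):
    """Average a binary micro-state sequence over non-overlapping windows,
    then binarize each average into a macro-state (1 = burst, 0 = no burst)."""
    macro = []
    for start in range(0, len(spikes) - window + 1, window):
        rate = sum(spikes[start:start + window]) / window
        macro.append(1 if rate >= threshold else 0)
    return macro

# Twelve micro-intervals (spike / no spike) grouped into three macro-intervals.
micro = [1, 1, 0, 1,  0, 0, 0, 1,  1, 1, 1, 1]
macro = coarse_grain(micro, window=4, threshold=0.5)
print(macro)
```

The same grouping could be applied across elements instead of across time. In either case the theory would then require computing concepts and complexes separately at each grain and retaining the grain at which ΦMIP is maximal.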

atoms. Importantly, under certain circumstances, a coarser spatial scale (‘macro’-level) may produce a complex with higher values of ΦMIP than a finer scale (‘micro’-level), despite the smaller number of macro- compared to micro-elements. In principle, then, it should be possible to establish if in the brain consciousness is generated by neurons or groups of neurons. In this case the exclusion postulate would also mandate that the spatial scale at which ΦMIP is maximal, be it neurons or neuronal groups, excludes finer or coarser groupings: there cannot be any superposition of (conscious) entities at different spatio-temporal scales if they share informational/causal interactions (Fig. 5b)²³.
Similar considerations apply to time. Integrated information can be measured at many temporal scales. Neurons can choose to spike or not at a scale of just a few milliseconds. However, consciousness appears to flow at a longer time scale, from tens of milliseconds to 2-3 seconds, usually reaching maximum vividness and distinctness at a few hundred milliseconds (Fig. 5c). IIT predicts that, despite the larger number of neural ‘micro’-states (spikes/no spikes, every few milliseconds), ΦMIP will be higher at the level of neural ‘macro’-states (burst of spikes/no bursts, averaged over hundreds of milliseconds). This is likely the case because a set of neurons widely distributed over the cerebral cortex can interact cooperatively only if there is enough time to set up transiently stable firing patterns (attractors, see below) by allowing spikes to percolate forward and backward. Again, the exclusion postulate would mandate that, whatever the temporal scale that maximizes ΦMIP, be it spikes or bursts, there cannot be

any superposition of (conscious) entities evolving at different temporal scales if they share informational/causal interactions²⁴ ²⁵.

Identity between maximally integrated conceptual structures (qualia) and experiences

In summary, a particular set of elements at a particular spatio-temporal scale yielding a maximum of integrated conceptual information (maxΦMIP) constitutes a complex, a ‘locus’ of consciousness. The set of its concepts – maximally irreducible cause-effect repertoires (maxϕMIP > 0) specified by various subsets of elements within the complex – constitutes a maximally integrated conceptual information structure or quale (Fig. 4) – a shape or constellation of points in qualia (concept) space²⁶.
Having defined complexes and qualia, IIT posits identities between phenomenological and informational/causal aspects of systems. The central identity is the following: an experience is a maximally integrated conceptual (information) structure or quale – that is, a maximally irreducible constellation of points in qualia space. Tentative corollaries of this identity include the following: i) the particular ‘content’ or quality of the experience is the shape of the maximally integrated conceptual structure in qualia space (the constellation of concepts); ii) a phenomenological distinction is a maximally irreducible cause-effect distinction (a concept). In other words, unless there is a mechanism that can generate a maximally irreducible cause-effect repertoire (concept) – a distinct point in the quale – there is no corresponding distinction in the experience the subject is having; iii) the intensity of each concept is its maxϕMIP value; iv) the ‘richness’ of an experience is the number of dimensions of the shape; v) the scope of the experience is the portion of qualia space spanned by its concepts; vi) the level of consciousness is the value of maximally integrated conceptual information maxΦMIP; vii) the similarity between concepts is their distance in qualia space, given the appropriate metric; viii) clusters of nearby concepts form modalities and submodalities of experience; ix) the similarity between experiences would be given by the similarity between the corresponding shapes (see also the final section and Tononi, 2008, 2010), and so on.
In principle, then, given the “wiring diagram” and present state of a given system, IIT offers a way of specifying the maximally integrated conceptual structure it generates (if any)²⁷. According to IIT, that structure completely specifies “what it is like to be” that particular mechanism in that particular state, whether that is a set of three interconnected logical gates in an OFF state; a complex of neurons within the brain of a bat spotting a fly through its sonar; or a complex of neurons within the brain of a human wondering about free will. In the latter examples, the full integrated conceptual structure is going to be extraordinarily complex and practically out of reach: we are not remotely close to having the full wiring diagram of the relevant portions of a rodent or human brain; even if we did, obtaining the precise quale would be computationally unfeasible²⁸. Nevertheless, by comparing some overall features of the shapes of qualia generated by different systems or by the same system in different states, it should be possible to evaluate broad similarities and differences between experiences. IIT also implies that, if a collection of mechanisms does not give rise to a single maximally integrated conceptual structure, but to separate qualia each reaching a maximum of integrated conceptual information, then there is nothing it is like to be that collection, whether it is an array of electronic circuits, a heap of sand, a swarm of bats, or a crowd of humans.

Matching
So far, the maximally integrated conceptual structures generated by a system of elements have been considered in isolation from the environment – as is the case for the brain when it dreams. But of course it is also essential to consider how integrated conceptual structures are affected by the external world, especially since the mechanisms generating them become what they are through a long evolutionary history, developmental changes, and plastic changes due to interactions with the environment.

In any situation, a complex of high maxΦMIP has at its disposal a large number of concepts – maximally irreducible cause-effect repertoires specified within a single conceptual structure. These concepts allow the complex to understand the situation and act in it in a context-dependent, valuable fashion. It would be helpful to have a measure that assesses how well the integrated conceptual structure generated by an

adapted complex fits the causal structure of the environment. One way to do so is to define
cause-effect matching (M) between a system and its environment as the difference between two terms,
called Capture and Mismatch:

         Matching = Capture – Mismatch

Capture is the minimum average difference between the constellations C when a complex interacts with
its environment (C World), compared to when it is exposed to an uncorrelated, structureless
environment (C Noise).

  Capture = min < D [ C |s World, C |s Noise ] >

As before, D specifies a distance metric. Capture is an indication of how well the system samples the
statistical structure of its environment (deviations from independence). Thus, high capture means that
the system is highly sensitive to the correlations in the environment. The system can do so in two
ways: on the input side, by sampling as many correlations as possible from the environment through a
large sensory bandwidth and distributing these correlations efficiently within the brain through a
specialized connectivity (thereby reflecting to what extent World deviates from Noise, Tononi et al.,
1996). On the output side, an organism can extract more information by actively exploring its
environment or modifying it to better pick up correlations, aided by a rich behavioral repertoire
(Tononi et al., 1999). Note that the minimum is taken because to match system constellations generated
with World and with Noise one should pair them in such a way as to minimize the overall difference.

Mismatch is the minimum average difference between the constellations C when a complex interacts with
its environment (C World), compared to when it is dreaming (C Dream), that is, when it is disconnected
from the environment both on the input and the output sides.

Mismatch = min < D [ C |s World, C |s Dream ] >

Mismatch is an indication of how well the system models the statistics of its environment. Thus, low
mismatch means that the system’s causal information structure generates a good intrinsic model of its
input. Again, the system can do so in two ways: by modifying its own connections so they generate a
correlation structure similar to that induced by the environment (the system’s Dream becomes a model
of World). In this way ‘memories’ formed over a long time can help to disambiguate / fill in current
inputs and, more generally, to predict many aspects of the environment (Tononi and Edelman, 1997).
Another way is to change the environment by exploring it or modifying it to make inputs match its own
values and expectations (World is made to conform to the system’s ‘Dream’). In general, the
interactions with the environment would have to match specific cause repertoires with specific effect
repertoires in a way that yields perception-action cycles of high adaptive value: in short, the
‘right’ cause should lead to the ‘right’ effect.

Note that the balance between the two terms in the expression for matching has two useful
consequences: maximizing Capture ensures that the system does not minimize Mismatch simply by
disconnecting from World. Conversely, minimizing Mismatch ensures that the system does not maximize
Capture simply by becoming sensitive to the correlations in its input from World without developing a
good generative model.

Importantly, since within a given system it is likely that similar states yield similar
constellations, a simpler expression for matching can be obtained by considering differences between
the probability distribution of system states S, rather than differences between sets of
constellations C:

M = D [S World, S Noise] – D [S World, S Dream]

(note that, while the above expression is based on the distribution of system states, in principle the
notion of matching can also be applied to the distribution of sequences of system states).

In the course of evolution, development, and learning, one would expect that the mechanisms of a
system change in such a way as to increase matching. Capture should increase because, everything else
being equal, an organism that obtains more information about the structure of the environment