Multimodal Interfaces

[1] Introduction to
    Human-Computer Interaction

          Jacques Bapst
Content Overview

             § Human-Computer interaction basic paradigms
             § Human capabilities / Human modeling / Cognitive Framework
                    Ø Input-Output channels / Human senses
                    Ø Model Human Processor
                    Ø Fitts' Law
                    Ø Human memory (sensory, short-term, long-term)
                    Ø Reasoning (deductive, inductive, abductive)
                    Ø Human perception
                    Ø Interaction process (action theory, affective aspects,   emotion)
             § User-Computer Dialog / Interaction Styles
                    Ø Mode / Temporal and Spatial Mode / Quasimode
                    Ø Command language
                    Ø Form fill-in / Spreadsheets
                    Ø Menu selection
                    Ø Natural language / Query language
                    Ø WIMP / Point-and-Click
                    Ø Direct manipulation / Indirect manipulation
                    Ø 3D Interfaces / Brain computer interface
             § Appendix : HCI and GUI short history
Multimodal Interfaces

[1.1] Human-Computer Interaction
      "Bridging the Gap"
Human-Computer Interaction

                                 "With teletype interface and the Fortran language,
                                          the computer will be easy to use"

                                                                                              From the RAND Corporation
                     1950 : A scientist shows how a "home computer" might look in 2004

Human-Computer Interaction (HCI)

             § Human-Computer Interaction is also known as Man-Machine
                 Interaction (MMI).
             § One possible definition (ACM) :
                    Ø Human-computer     interaction is a discipline concerned with the
                       design, evaluation and implementation of interactive computing
                       systems for human use and with the study of major phenomena
                       surrounding them.
             § HCI is a large interdisciplinary science involving
                    Ø Computer  science and Engineering
                    Ø Graphic design
                    Ø Cognitive psychology
                    Ø Ergonomics (user's physical capabilities)
                    Ø Sociology (wider context of the interaction)

             § User interface design is an important subset of the HCI field of
                 study.

Interaction Design and HCI

             § Relationship between Interaction Design, Human-Computer
                 Interaction and other fields : academic disciplines, design
                 practices and interdisciplinary fields.
                     Ø Academic disciplines : Ergonomics, Psychology, Cognitive sciences,
                        Informatics, Engineering, Computer Science, Social Sciences
                        (sociology, anthropology, ...)
                     Ø Design practices : Graphic Design, Product Design, Artist Design,
                        Industrial Design
                     Ø Interdisciplinary fields : Human Factors (HF), Human-Computer
                        Interaction (HCI), Cognitive Engineering, Cognitive Ergonomics,
                        Computer Supported Cooperative Work (CSCW), Information Systems
                     Ø Interaction Design sits at the centre, drawing on all of these fields
Multimodal Interfaces

[1.2] Human Capabilities
      Communication Channels
Human Characteristics / Abilities

             § HCI is undoubtedly a multi-disciplinary topic.
             § In HCI design, it is really important to understand something
                 about
                    Ø Human      information-processing characteristics
                          ü Cognitive architecture, memory, perception, motor skills, …
                    Ø How human action is structured
                    Ø The nature of human communication
                    Ø Human physical and physiological requirements

             § Humans are limited in their capacity to process information.
                 This has important implications for interaction design.
             § Important aspects
                    Ø Input-output    channels (senses and effectors)
                    Ø Memory
                    Ø Learning (acquiring skills)
                    Ø Reasoning / Problem solving (cognitive activity)

Input-Output Channels

             § Input in the human occurs mainly through the senses (sensory
                 channels) and output through the motor control of the effectors.
             § Five major senses : Sight, Hearing, Touch, Taste, Smell
                    Ø Taste and smell do not currently play a significant role in HCI,
                       except in specialized systems or virtual reality
             § In addition : Balance (sense of equilibrium) and Kinesthesis
                 (proprioception)
             § Effectors
                    Ø Limbs (arms, legs, body position, …)
                    Ø Fingers
                    Ø Eyes
                    Ø Head       / Face
                    Ø Body
                    Ø Vocal      system

Human Senses : Sight

             § Human vision is a highly complex activity with a range of
                 physical and perceptual limitations.
             § Primary source of information.
             § Two main stages
                    Ø Physical reception of the stimulus (photoreceptors of the retina)
                    Ø Processing and interpretation of that stimulus

             § Visual perception
                    Ø Size and depth (visual angle, stereoscopy, knowledge, …)
                    Ø Color (hue, intensity, saturation), Color blindness
                    Ø Brightness (luminance, contrast)

             § Reading
                    Ø Visual pattern perception
                    Ø Pattern decoding using internal representation of language
                    Ø Syntactic and semantic analysis (phrases)

Human Senses : Hearing

             § Hearing is often considered secondary to sight (we tend to
                 underestimate the amount of information that we receive through
                 our ears).
             § Two main stages
                    Ø Physical   reception of the stimulus (sound wave propagated along the
                       auditory canal, received by tympanic membrane and transmitted to the
                       cochlea)
                    Ø Processing    and interpretation of that stimulus
             § Hearing perception
                    Ø Pitch (main frequency)
                    Ø Loudness (amplitude)
                    Ø Timbre (spectrum, envelope)
                    Ø Location (stereophony)

             § Voice recognition
                    Ø Perception,   decoding, syntactic and semantic analysis

Human Senses : Touch

             § Touch is also known as haptic perception.
             § It provides us with vital information about our environment.
             § The skin contains three types of sensory receptor
                    Ø Thermoreceptors  respond to temperature
                    Ø Nociceptors      respond to intense pressure, heat and pain
                    Ø Mechanoreceptors respond to pressure

             § Touch acts as
                    Ø Sensory receptor : thermoreceptor, pressure receptor, pain
                    Ø Warning : hot, sharp, …
                    Ø Feedback : feel when in contact, necessary in prehension
             § A second aspect of haptic perception is the awareness of the
                 position of the body and limbs. This conscious awareness of
                 body position is known as kinesthesis or (if we include the sense
                 of equilibrium) proprioception.

Multimodal Interfaces

[1.3] Human Modeling
      Cognitive Framework
What is Cognition ? [1]
             § What goes on in the mind in our everyday activities ?
             § Different kinds of cognition.

What is Cognition ? [2]
             § Norman (1993) distinguishes between two general modes :
                    Ø Experiential     cognition
                          ü Driving a car, reading a book, having a conversation, playing a game, …
                    Ø Reflective    cognition
                          ü Designing, learning, problem solving, writing a book, …

             § Cognition may also be described in terms of specific kind of
                 processes :
                    Ø Attention
                    Ø Perception    and recognition
                    Ø Memory
                    Ø Learning
                    Ø Reading,speaking, listening
                    Ø Problem-solving, planning, reasoning, decision-making
                    Ø. . .

Human Modeling

             § Unfortunately no general and unified theory.
             § Cognitive and interaction models attempt to represent the users
                 as they interact with a system, modeling aspects of
                 their understanding, knowledge, intentions or processing.
             § Human models can be divided into categories according to their
                 abstraction level.

                 From high-level reasoning down to reflexes :
                     Ø Theory of Action [Norman]
                     Ø Rasmussen's model
                     Ø Human processor [Card, ...]
                     + Others : GOMS, ICS, CCT, Keystroke, …

Model Human Processor (MHP)

             § From Card, Moran and
                 Newell (1983).
             § Describes the cognitive
                 process that people go
                 through between
                 perception and action
             § Simplistic view of human
                 behavior
                    Ø Ignores   environment and
                       other people (social
                       interactions)
             § Low level model
                    Ø Performance  oriented
                    Ø Allows empirical tests

             § Basis of GOMS model
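             § As a rough illustration (using the typical cycle times reported by
                 Card, Moran and Newell : perceptual τp ≈ 100 ms, cognitive τc ≈ 70 ms,
                 motor τm ≈ 70 ms), a simple stimulus-response action is predicted
                 to take about τp + τc + τm ≈ 240 ms.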

Fitts' Law [1]
             § Fitts' law (1954) predicts the time required to move from a
                 starting position to a final target area, based on the size of and
                 the distance to this target area.
             § It describes the behavior of rapid, aimed movements.

                 (Figure : successive aimed movements d0 … d4 toward a target of
                  width W at distance D)

                                            T = k log2(2D / W)

             k : constant based on the cycle times τp and τm (usually k ≈ 0.1 s)

                                  D :   10 cm     10 cm     30 cm
                                  W :   1 cm      1 mm      2 mm
                                  T :   0.43 s    0.76 s    0.82 s
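             § Worked example (first column of the table) : with D = 10 cm and
                 W = 1 cm, T = 0.1 · log2(2 · 10 / 1) = 0.1 · log2(20) ≈ 0.43 s.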

Fitts' Law [2]
             § Fitts' law has been formulated in several different ways.
             § One common form is the Shannon formulation.

                                                 T = a + b log2(D / W + 1)

                                  a and b are empirical constants which depend on the pointing
                                  device and the user's dexterity. [a ≈ 0.1 s, b ≈ 0.1 s]

                  § The logarithmic term of the formula ( log2(D / W + 1) ) is called
                     the index of difficulty (ID) → T = a + b · ID
                  § Useful when
                        Ø Designing user interfaces
                        Ø Comparing pointing devices (determining a and b)

                  § Test yourself : www.tele-actor.net/fitts/index.html
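
                  § A small computational sketch (assuming Python ; a and b are the
                     approximate values quoted above, so the times are only indicative) :

                        import math

                        def index_of_difficulty(d, w):
                            """ID = log2(D/W + 1), in bits (Shannon formulation)."""
                            return math.log2(d / w + 1)

                        def movement_time(d, w, a=0.1, b=0.1):
                            """Predicted movement time T = a + b * ID, in seconds."""
                            return a + b * index_of_difficulty(d, w)

                        # Same targets as in the table of the previous slide (sizes in cm)
                        for d, w in [(10, 1), (10, 0.1), (30, 0.2)]:
                            print(f"D={d} cm, W={w} cm -> ID={index_of_difficulty(d, w):.2f} bits, "
                                  f"T={movement_time(d, w):.2f} s")

                     The values obtained (≈ 0.45 s, 0.77 s, 0.82 s) are close to, but not
                     identical with, those of the previous table, which uses the original
                     form T = k log2(2D / W).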

Human Memory [1]
             § Much of our everyday activity relies on memory.
             § It is generally agreed that there are three types of memory :
                    Ø Sensory  buffers
                    Ø Short-term memory or working memory
                    Ø Long-term memory

             § These memories interact, with information being processed and
                 passed between memory stores :

                     Ø Sensory memories : iconic (visual), echoic (aural), haptic (touch)
                     Ø Attention passes selected information into short-term memory
                        (working memory) ; information not attended to is forgotten
                     Ø Repetition keeps information in short-term memory ; rehearsal /
                        encoding transfers it to long-term memory, from which it can
                        later be retrieved
Human Memory [2]
             § First stage : selection and encoding
                    Ø Determines    which information is attended to in the environment and
                      how it is interpreted.
                    Ø The more attention paid to something, and the more it is processed
                      in terms of thinking about it and comparing it with other knowledge,
                      the more likely it is to be remembered.
             § We don’t remember everything : involves filtering and
                 processing what is attended to.
             § Context is important in affecting our memory (i.e., where, when).
                    Ø Sometimes     it can be difficult to recall information that was encoded
                       in a different context.
             § We are much better at recognizing things than at recalling them.
                    Ø Better at remembering images than words
                    Ø A reason why interfaces are largely visual
                          ü GUIs provide visually-based options : recognition
                          ü Command-line UIs require users to remember commands : recall

Sensory Memory

             § The sensory memories act as buffers for stimuli received through
                 the senses (constantly overwritten by new information [0.1 … 0.5 s]).
             § A sensory memory exists for each sensory channel
                    Ø Iconic memory for visual stimuli
                    Ø Echoic memory for aural stimuli
                    Ø Haptic memory for touch

             § Information is passed from sensory memory into short-term
                 memory by attention.
             § Attention is the concentration of the mind on one out of a number
                 of competing stimuli.
             § We can choose which stimuli to attend to (according to our needs, level
                 of interest, … ; this explains the noisy party phenomenon).
             § Information received by sensory memories is quickly passed into
                 a more permanent memory store (generally the short-term memory)
                 or overwritten and lost.
Short-Term Memory

             § Short-term memory or working memory (a slightly different
                 concept) acts as a scratch-pad for temporary recall of information.
             § Examples of use :
                    Ø Calculation (e.g. 25 x 6)
                    Ø Reading

             § Short-term memory access time is in the order of 70 ms.
             § Information can only be held temporarily : ~ 200 ms … 10 s
             § Short-term memory has a limited capacity. This was established
                 in experiments by Miller ("The magical number seven, plus or minus two").
             § The average person can remember 7 ± 2 chunks of information.
             § Further studies say 4 ± 2 (if there is no relationship between the items)
             § A chunk of information is not precisely defined. It is any
                 meaningful unit (digits, words, people's faces, chess positions, etc.)
                 and depends on the user's experience with the kind of information.

Miller's Magical Number : 7±2

             § George Miller's theory (1956) says that 7±2 chunks of information
                 can be held in short-term memory at any one time.
             § It's one of the best known and most remembered findings in psychology.
             § But unfortunately, this theory is often misinterpreted by HCI
                 designers.
             § Examples of inappropriate application of the theory :
                    Ø Never have more than seven bullets in a list
                    Ø Have no more than seven options on a pull-down menu
                    Ø Display only seven icons on a menu-bar or tool-bar
                    Ø Place no more than seven tabs at the top of a website page
                    Ø and so on…

             § All of these are wrongly based on Miller's law because all the
                 items can be scanned and rescanned visually and hence do not
                 have to be recalled from short-term memory.

Long-Term Memory

             § The long-term memory is our permanent memory store
                 intended for the long-term storage of information (everything that
                 we know : experiential knowledge, procedural skills, etc.).
             § It has a huge capacity (if not unlimited).
             § Relatively slow access time (~0.1 s).
             § Forgetting, if at all, occurs much more slowly than in short-term
                 memory (long-term recall after minutes is the same as that after days).
             § Today, most researchers distinguish three long-term memory
                 sub-systems :
                    Ø Episodic memory : memory of events and experiences
                       in a serial form (chronology)
                    Ø Semantic memory : structured record of facts and concepts
                       that we have acquired
                       (episodic + semantic memory together form the declarative memory)
                    Ø Procedural skills : "know-how" memory (skills, procedures)

Long-Term Memory Structure

                     Long-term memory
                        Ø Declarative memory (facts)
                              ü Semantic memory : Paul's address, words meaning, …
                              ü Episodic memory : last birthday party, …
                        Ø Procedural memory (skills) : play piano, ride a bike, …

             § The information in semantic memory is derived from that in our
                 episodic memory (we can learn new concepts from our experiences).
             § Memory structure and processes are very complex and cannot
                 easily be reduced to a simple model.

Reasoning

             § Reasoning is the process by which we use our knowledge to
                 draw conclusions or infer new information.
             § There are different types of reasoning :
                    Ø Deductive  reasoning / Deduction
                    Ø Inductive reasoning / Induction
                    Ø Abductive reasoning / Abduction

             § Humans are able to think about things of which they have no
                 experience and solve problems they have never seen before.
             § Problem solving involves reasoning and vice versa.
             § Recurrent familiar situations allow people to acquire skills in a
                 particular domain (better information structure).
             § People build their own theories (called Mental models) to
                 understand the causal behavior of systems.
             § Sometimes these models are based on an incorrect interpretation of
                 the facts, and this can lead to the well-known "human error".
Deductive Reasoning

             § Deductive reasoning derives the logical conclusion from the
                 given premises.
                    Ø If it is Monday it rains
                    Ø It is Monday
                    Ø Therefore it rains

             § Note that the logical conclusion does not necessarily have to
                 correspond to our notion of truth.
             § Deductive reasoning is therefore sometimes misapplied :
                    Ø Some  people are students
                    Ø Some students attend a lecture about Multimodal interfaces
                    Ø Therefore some people attend a lecture about Multimodal interfaces

                                                 Is this logically correct ?
             § Human deduction is poor when there is a clash between
                 truth and logical validity.

Inductive Reasoning

             § Inductive reasoning is generalizing from cases we have seen
                 to infer information about cases we have not seen.
             § With inductive reasoning, we draw conclusions by moving from a
                 specific case or cases to a general rule (just the
                 opposite of deductive reasoning).
             § Example :
                    Ø Every EIA-FR  student I have ever seen owns a desktop computer
                    Ø Therefore I infer that all EIA-FR students own a desktop computer

             § Of course, this inference is unreliable and cannot be proved to be
                 true (or at least only with difficulty). It can only be proved to be false.
             § In spite of its unreliability, induction is a useful process, which we
                 use constantly in learning about our environment.

Abductive Reasoning

             § Abductive reasoning is reasoning from observed facts to the
               action or state that caused them.
             § This is the method we use to derive explanations for the events
               we observe (we try to find a hypothesis that would explain the
               observed facts).
             § Example :
                    Ø I know that Bob takes his car when he misses the bus
                    Ø If I see Bob driving his car
                    Ø Therefore I may infer that he missed the bus

             § In spite of its unreliability, people do infer explanations in this
               way and hold onto them until they have evidence to support an
               alternative theory or explanation.
             § In interactive systems, if an event always follows an action, the
               user will infer that the event is caused by the action.
               If, in fact, the event and the action are unrelated, confusion and
               even error may result.
Human Perception [1]
             § Human senses cannot easily be compared with computer
                 peripherals. Don't forget the functions of the brain, which
                 heavily process the sensory information before interpretation.
             § All human senses are subject to illusions in their perception
                 process.
             § Illusion research can provide fundamental insights into general
                 brain mechanisms.

Human Perception [2]

Human Perception [3]
             § Stroop effect.

                 (Stroop test stimulus : a column of French words — colour names such
                  as Noir, Vert, Rouge, Bleu, Orange mixed with unrelated words such
                  as Papier, Manger, Livre, Maison — originally printed in coloured ink)

Human Perception [4]
             § 2- and 3-dimensional interpretation

Human Perception [5]
             § Can you read this ?

                     Sleon une édtue de l'Uvinertisé de Cmabrigde, l'odrre des ltteers
                     dnas un mot n'a pas d'ipmrotncae, la suele coshe ipmrotnate est
                     que la pmeirère et la drenèire letrte soit à la bnnoe pclae.
                     Le rsete peut êrte dnas un dsérorde ttoal et vuos puoevz
                     tujoruos lrie snas porlblème. C'est prace que le creaveu hmauin
                     ne lit pas chuaqe ltetre elle-mmêe, mias le mot cmome un tuot.

                     Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn't
                     mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt
                     tihng is taht the frist and lsat ltteer be at the rghit pclae.
                     The rset can be a toatl mses and you can sitll raed it wouthit
                     porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey
                     lteter by istlef, but the wrod as a wlohe.

             § Probably a hoax, but nevertheless surprising.

Human Perception [6]
             § Face recognition
             § Inverted faces

Human-Machine Interaction Process

             § Action theory (D.A. Norman, 1986)
             § Action cycle composed of two main processes
                    Ø Execution process
                    Ø Evaluation process

             § Seven stages :
                    1. Establish a goal
                    2. Form an intention
                    3. Specify an action sequence
                    4. Execute an action
                    5. Perceive the system state
                    6. Interpret the state
                    7. Evaluate the system state with respect to the goals
                       and intentions

Action Cycle [1]
             § The action theory tries to deconstruct the process of translating an
                 intention into an action.
                    Ø The action itself has two major aspects :
                       doing something and checking (execution and evaluation)

                     Goals
                        Ø Gulf of execution ("What to do ?" / "How to do it ?") :
                           Intention to act → Sequence of actions → Execution → World (interaction)
                        Ø Gulf of evaluation :
                           World → Perception → Interpretation → Evaluation (comprehension) → Goals

Action Cycle [2]
             § Forming the goal.
                    Ø Can        be stated in a very imprecise way, e.g. "Make a nice meal"
             § Execution :
                    Ø Forming   the Intention.
                       Goals must be transformed into intentions, i.e., specific statements
                       of what has to be done to satisfy the goal; e.g. "Make a chicken
                       casserole using a can of prepared sauce"
                    Ø Specifying  an Action Sequence.
                       What is to be done to the world. The precise sequence of operators
                       that must be performed to effect the intention; e.g. "Defrost frozen
                       chicken, open can, ..."
                    Ø Executing   an Action.
                       Actually doing something. Putting the action sequence into effect on
                       the world; e.g. actually opening the can.

Action Cycle [3]
             § Evaluation :
                    Ø Perceiving  the State of the World.
                      Perceiving what has actually happened; e.g. the experience of
                      smell, taste and look of the prepared meal.
                    Ø Interpreting the State of the World.
                      Trying to make sense of the perceptions available; e.g. putting
                      those perceptions together to present the sensory experience of
                      a chicken casserole.
                    Ø Evaluating the Outcome.
                      Comparing what happened with what was wanted; e.g. did the
                      chicken casserole match up to the requirement of "a nice meal" ?
             § Design objective
                    Ø To reduce the evaluation's gulf
                    Ø To reduce the execution's gulf

             § Design principles :
                    Ø Visibility
                    Ø Good conceptual model
                    Ø Good mapping
                    Ø Feedback

Execution's and Evaluation's Gulfs

             § The gulfs represent the gaps that exist between the user and the
               interface
             § Gulf of execution
                    Ø Distance      from the user to the physical system
             § Gulf of evaluation
                    Ø Distance      from the physical system to the user
             § The goal : reduce the gulfs in order to reduce the cognitive effort
                                    required to perform a task
                                 Intentions → Action sequence → Interface mechanism

                                 Interface perception → Interpretation → Evaluation

Affective aspects / Emotions [1]
             § Human experience is far more complex and not limited
                 to perceptual and cognitive abilities.
             § Our emotional response to situations affects how we perform.
             § Positive emotions enable us to think more creatively
                 and to solve problem more quickly.
             § Negative emotions push us into narrow thinking.
                 A problem that is easy to solve becomes difficult if we are frustrated
                 or stressed.
             § Emotion involves both physical and cognitive events.
             § That biological response - known as affect - changes the way we
                 deal with different situations, and this has an impact on the way
                 we interact with computer systems.
             § If we try to build interfaces that promote positive responses
                 (e.g. by using aesthetics, reward, ergonomics, …) then they are
                 likely to be more successful.
Affective aspects / Emotions [2]
             § Recent empirical studies [Tractinsky, 2000] show that the
                 aesthetics of an interface can have a positive effect on
                 people's perception of the system's usability.
             § Importance of the look and feel of an interface (and not only
                 usability) is gaining acceptance within the HCI community.
             § Users are likely to be more tolerant when the interface is
                 pleasing :
                    Ø Beautiful graphics
                    Ø Well-designed fonts and icons
                    Ø Nice feel of the way the elements have been laid out
                    Ø Elegant use of images and color
                    Ø Good sense of balance
                    Ø. . .

             § A key concern is to strike a balance between designing
                 pleasurable interfaces and usable interfaces.

Affective aspects / Emotions [3]
             § Expressive interfaces can affect
                 user attitude and behavior.
                    Ø Reassuring,  informative, fun
                    Ø Intrusive, annoying, makes the user angry

             § Anthropomorphism pros and cons.
                    Ø Well  accepted by children
                    Ø Some people may feel anxious,
                      feeling inferior or stupid
                    Ø A controversial debate

             § Interface agents, avatars, virtual pets,
                 interactive toys
                    Ø Often considered as trying and intrusive
                    Ø Downright deceptive, deceiving
                      and frustrating

Individual Differences

             § The psychological principles and properties apply to the majority
                 of people.
             § Notwithstanding this, we should remember that, although we
                 share common processes, humans (i.e. users) are not all the
                 same.
             § We should be aware of individual differences
                    Ø Long term differences : sex, physical / intellectual capabilities, …
                    Ø Short term differences : emotion, stress, fatigue, …
                    Ø Continuous changes : age, experience, skills, …

             § These differences should be taken into account in interface design
                    Ø Define personas (artifacts)
                    Ø Promote flexibility and adaptability
                    Ø Universal accessibility (impaired people)

             § In summary : User-centered design (philosophy and process)

Multimodal Interfaces

[1.4] User-Computer Dialog
      Interaction Styles
      Interaction Paradigms
Mode

             § A mode is a distinct state of the interface in which the same
               user input will produce a different result than it would in another
               state. The mode influences the effect of actions.
             § Typical examples :
                    Ø Insert/Replace  mode in word processing applications
                    Ø Caps lock (keyboard physical mode)
                    Ø Tool palettes in photo editing or drawing applications
                    Ø Modal dialog boxes

             § Modal interfaces should be avoided, if at all possible, because
               they lead to confusion or input errors, known as mode errors
               (the user performs an action that is appropriate to a different mode
                 and hence gets an unexpected response).
             § If modes must be used, there should be clear indicators of the
               current mode to help prevent mode errors.
             § Nevertheless… modes can sometimes be helpful to control and
               guide user input.
Spatial Mode / Temporal Mode

             § One can also present a mode as a multiplexer of user input that
               allows giving one precise meaning to an action.
             § Distinction between spatial modes and temporal modes
               (spatial multiplexing of input vs. temporal multiplexing).
             § A spatial mode uses the location of a user's action to
               determine the meaning of an event.
                    Ø Example    : Resizing handles in a drawing editor
             § A temporal mode uses the ordering of events in time to
                 determine their meaning.
                    Ø Example    : Tools palette in a drawing editor
             § Drawback of temporal modes : they have no intrinsic feedback.
                    ØA   possible workaround : change the shape of the cursor or other
                       noticeable interface state
             § Avoid sub-modes and limit the number of states to what is strictly
                 necessary.

Quasimode and Modeless Interfaces

             § A quasimode (or spring-loaded mode) is a mode that is kept in
                 place only through some constant action on the part of the user
                 (e.g. pressing the Shift-Key or Ctrl-Key).
             § A quasimode is a modeless interaction that allows for the
                 benefits of a mode without the associated cognitive burden
                 (the user is performing a conscious action).
             § Modifier keys used in interfaces usually start a quasimode.

             § An interface that doesn't use modes is known as a modeless
                 interface (the same input from the user will always trigger the same
                 perceived action).
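
             § A minimal sketch of a quasimode (assuming Python/Tkinter ; the
                 horizontal-constraint behaviour is just an illustrative choice) :
                 while the Shift key is held, dragging is constrained to the horizontal
                 axis ; releasing the key immediately restores the normal behaviour.

                     import tkinter as tk

                     SHIFT_MASK = 0x0001                 # state bit set while Shift is held

                     root = tk.Tk()
                     canvas = tk.Canvas(root, width=400, height=300, bg="white")
                     canvas.pack()
                     item = canvas.create_oval(50, 50, 100, 100, fill="tomato")
                     last = {"x": 0, "y": 0}

                     def on_press(event):
                         last["x"], last["y"] = event.x, event.y

                     def on_drag(event):
                         dx, dy = event.x - last["x"], event.y - last["y"]
                         if event.state & SHIFT_MASK:    # quasimode : active only while Shift is down
                             dy = 0
                         canvas.move(item, dx, dy)
                         last["x"], last["y"] = event.x, event.y

                     canvas.tag_bind(item, "<ButtonPress-1>", on_press)
                     canvas.tag_bind(item, "<B1-Motion>", on_drag)
                     root.mainloop()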

Interaction Styles

             § Interaction can be seen as a dialog between the computer and
                 the user.
             § The choice of interaction style can have a profound effect on
                 the nature of this dialog.
             § The most common interaction styles :
                    Ø Command     language / Command line interface
                    Ø Form-fills and spreadsheets
                    Ø Menus
                    Ø Natural language and query language
                    Ø Question/answer dialog
                    Ø WIMP
                    Ø Point-and-click
                    Ø Direct manipulation
                    Ø 3D interfaces (→ virtual reality)
                    Ø Brain-computer interface

Command Language [1]
             § User types in commands
                 in response to a prompt
             § May use function keys,
                 abbreviations or whole-
                 word commands
             § Examples
                    Ø OS         commands
                          ü MS-DOS
                          ü Unix shell
                    Ø Applications
                          ü FTP
                          ü Telnet

Command Language [2]
             § Earliest form of interaction style and is still widely used
             § Flexible and extensible interface (appealing for expert users)
             § Requires a formally defined syntax (should use the user's vocabulary)
             § Useful for repetitive tasks
             § Supports regular expressions and the creation of user-defined scripts
                 and macros
                    Ø $> zip archive photo*.jpg
             § Suitable for interacting with networked computers (low bandwidth)
             § Poor usability :
                    Ø Typing   is tiring and error prone
                    Ø Difficult to remember task names and parameters (bad learnability)
                    Ø Difficult to remember correct syntax
                    Ø Error messages and assistance are hard to provide

             § Not suitable for non-expert users
Form Fill-in ("fill in the blanks")

             § Used primarily for data entry (but also useful in data retrieval)
             § Aimed at non-expert users
             § Paper form metaphor
             § Originally no need for a pointing
                 device (Keyboard, Tab, Enter)
             § Easy movement from field to field
             § Form fill-in interfaces were (and still are) especially useful for
                 routine, clerical work or for tasks that require a great deal of
                 data entry
             § User already familiar with the actual form (often based on an actual
                 paper form for compatibility)
                 (Screenshots : a classic form fill-in and a more modern form fill-in)

Spreadsheets Forms

             § Spreadsheets can be considered as a sophisticated variation
                 of form filling
             § Grid of cells with formula
             § System maintains consistency and updates values
                 immediately
             § User can manipulate
                 values (in any order)
                 and observe effects
             § Sometimes blurred
                 distinction between
                 input and output fields
             § Attractive style for complex forms
                 (example of a cell formula : = Qty * Unit Price)
             § Spreadsheets are an attractive and flexible medium
                 for interaction
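             § A minimal sketch of this consistency mechanism (assuming Python ;
                 the Sheet class and cell names are purely illustrative) : a cell holds
                 either a value or a formula, and formulas are re-evaluated whenever
                 a dependent value is read.

                     class Sheet:
                         def __init__(self):
                             self.values = {}      # cell name -> constant value
                             self.formulas = {}    # cell name -> function of the sheet

                         def set_value(self, name, value):
                             self.values[name] = value

                         def set_formula(self, name, fn):
                             self.formulas[name] = fn

                         def get(self, name):
                             # Formulas are recomputed on access, so output cells
                             # always reflect the current input values.
                             if name in self.formulas:
                                 return self.formulas[name](self)
                             return self.values[name]

                     sheet = Sheet()
                     sheet.set_value("Qty", 3)
                     sheet.set_value("UnitPrice", 2.50)
                     sheet.set_formula("Total", lambda s: s.get("Qty") * s.get("UnitPrice"))

                     print(sheet.get("Total"))     # 7.5
                     sheet.set_value("Qty", 4)     # the user changes an input cell ...
                     print(sheet.get("Total"))     # ... and the dependent cell follows : 10.0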
Menu Selection [1]
             § A menu is a set of options displayed on the screen
             § Text-based (possibly with numbered options) or GUI-based (mouse selection)
             § Selection and execution of one (or more) of the options results
                 in a state change of the interface
             § Classical web-pages are often mainly
                 based on menu selection

Menu Selection [2]
             § Advantages
                    Ø Affords  exploration ("look around")
                    Ø Relies on recognition rather than recall
                    Ø Ideal for novice or intermittent users
                    Ø Structures workflow and decision
                      making (sequential, hierarchical, grouping)
                    Ø User's input does not have to be
                      parsed → easier error handling
             § Disadvantages
                    Ø May  be slow for frequent users
                      (shortcuts should be implemented)
                    Ø Too many menus may lead to information
                      overload (discouraging the users)
                    Ø Not always suited for small graphic displays
                      (need adaptation)
                    Ø Highly hierarchical menus may be tedious (where to drill down ?)

Menu Selection / Guidelines [3]
             § Make menu options meaningful in the user’s language
             § Logically group similar options to aid recognition
             § Use hierarchical organization where appropriate
                 (menus/submenus)
             § Use sequential organization where appropriate (arrange options
                 in order to suggest a workflow sequence)

Natural Language Interaction

             § Natural language understanding
             § Forms : speech or written input (smart command language)
             § Very attractive style of interaction (at least at first glance)
             § A very difficult task
                    Ø Parsing language is very difficult (a language is vague and imprecise
                      by its very nature)
                    Ø Phrases and words are quite often ambiguous (homonyms, …)
                    Ø Spelling errors and/or variations exacerbate written input
                    Ø Synonyms exacerbate written and speech input
                    Ø Converting audio speech to machine-readable text is very difficult

             § Subject of considerable interest and research
             § Relatively successful in restricted domains or after an
                 extensive learning process (still natural language ?)
             § A simpler approach : query-language (restricted context, more
                 formal)

Natural / Query / Command Language

             § Distinction between written natural language, query language
                 and command language is sometimes blurred.
             § What appears as a natural language interface may simply be
                 a front-end for a query sub-system.
             § The question is
                 parsed into
                 keywords to
                 form a query
             § Other related
                 example :
                 web search
                 engine

Question / Answer

             § Simple mechanism for providing input to an application.
             § The user is asked a series of questions and is led through the
                 interaction step by step.
                    Ø Yes/no  response
                    Ø Multiple choice
                    Ø Codes

             § Examples : web questionnaires, web inquiries
             § Easy to learn but limited in functionality and power (appropriate
                 for restricted domains and for novice users).

WIMP

             § WIMP : Windows + Icons + Menus + Pointing devices
             § Popularized by Graphical User Interface (GUI)
             § Currently the most common environment for interacting with
                 computers (sometimes simply called Windowing system)
             § Need appropriate visual representation of objects to interact with
                    Ø Symbolized    pictorial
                       representations (icons)
                       may be difficult to interpret
                       because of their small size
             § Typically based on metaphors
                      The essence of metaphor is
                      understanding and experiencing
                      one kind of thing in terms of another

             § Particularly suited to explore
                 an application

Point-and-Click Interaction

             § Point-and-Click is closely related to WIMP but a little bit simpler
             § In information browsing or simple multimedia systems, most
                 interactions require only a single click of a mouse button
             § Closely related to hypertext idea
             § Popularized by the web
                 (browser)
             § Not limited to mouse
                 device, also used for
                 touch screen (interactive
                 information kiosks)

Direct Manipulation [1]
             § Direct manipulation (defined by Ben Shneiderman in 1982) is often
                 closely related to WIMP but not limited to it.
             § Direct manipulation involves continuous representation of the
                 object of interest, and rapid incremental reversible operations
                 whose impact on the object is immediately visible (feedback).
                 An auditory feedback may also be provided.
             § The intention is to allow a user
                 to directly manipulate objects
                 presented to them, using actions
                 that correspond at least loosely
                 to the physical world (metaphor).
             § Direct manipulation implies
                 physical actions instead of
                 complex syntax.
             § E.g. Drag-and-drop operation
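             § A minimal sketch of this idea (assuming Python/Tkinter ; the canvas
                 and rectangle are purely illustrative) : the object of interest stays
                 continuously visible, and every mouse movement produces an
                 immediate, incremental and reversible update of its position.

                     import tkinter as tk

                     root = tk.Tk()
                     canvas = tk.Canvas(root, width=400, height=300, bg="white")
                     canvas.pack()

                     item = canvas.create_rectangle(50, 50, 120, 100, fill="steelblue")
                     last = {"x": 0, "y": 0}

                     def on_press(event):
                         last["x"], last["y"] = event.x, event.y

                     def on_drag(event):
                         dx, dy = event.x - last["x"], event.y - last["y"]
                         canvas.move(item, dx, dy)                  # incremental, reversible operation
                         last["x"], last["y"] = event.x, event.y    # feedback is immediately visible

                     canvas.tag_bind(item, "<ButtonPress-1>", on_press)
                     canvas.tag_bind(item, "<B1-Motion>", on_drag)
                     root.mainloop()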
Direct Manipulation [2]
             § Features of a direct manipulation interface (highlighted by Ben
                 Shneiderman) :
                  Ø Visibility of the objects of interest
                   Ø Incremental action with rapid feedback on all actions
                  Ø Reversibility of all actions (allows exploration without penalties)
                  Ø Syntactic correctness of all actions (every user action is a legal
                    operation)
                  Ø Replacement of complex command language with actions to
                    manipulate directly the visible objects
             § With direct manipulation there is no clear distinction between
                 input and output (e.g. the document icon is an output expression in the
                 desktop metaphor, but that icon is used by the user to move the
                 document).
             § Directness partly depends on the gap between the user's goal and
                 the system image (the gulfs of evaluation and execution in the action theory).

Direct Manipulation [3]
             § Two variants of direct manipulation
                    Ø Program manipulation
                    Ø Content manipulation

             § Program manipulation is typically focused on the management
                 of the program and its interface.
                    Ø Selection
                    Ø Drag-and-drop
                     Ø Control manipulation (button pushing, scrolling, ...)
                    Ø Resizing, reshaping, repositioning
                    Ø Connecting objects

             § Content manipulation is primarily concerned with the direct
                 manual creation, modification and movement of data with a
                 pointing device.
                     Ø Drawing, painting, sketching
                    Ø 3D Modeling

Direct Manipulation [4]
             § The tool seems to disappear.
             § The user can apply intellect directly to the task (not to the tool).
             § Advantages
                    Ø Easy to learn and remember
                    Ø Encourages exploration (reduced anxiety)
                    Ø High subjective satisfaction (fun and entertaining)
                     Ø Relies on recognition memory (rather than recall)

             § Drawbacks
                     Ø Mouse operations may be slower than typing
                    Ø Need to learn meaning of components (visual representation)
                    Ø Not so intuitive (most users don't discover it independently)
                    Ø More difficult to program (is this relevant ?)
                    Ø Tedious for repeated actions
                    Ø History keeping is harder

Indirect Manipulation

             § Not all tasks can be described using concrete objects and not all
                 actions can be performed directly.
             § There is a continuum from "Do it yourself" to "Command control".
             § Most GUIs are a combination of direct and indirect manipulation.
                     Ø Using a menu is a rather indirect form of manipulation
                    Ø Using a pop-up menu is more direct, but it is less direct than
                      dragging an element.
             § Example 1 : choosing a color
                     Ø Using an "eye dropper" mouse pointer                        →    [direct]
                     Ø By typing the color values ([255, 255, 0] to get yellow)    →  [indirect]
             § Example 2 : defining a text size (see the sketch after this list)
                     Ø E.g. dragging a selection handle to stretch the text        →    [direct]
                     Ø E.g. typing a point size into a field                       →  [indirect]
             § The expression indirect manipulation is also used when the user
                 interacts with the real world (an instrument, a plant) through
                 an interface.
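
             § A small Python/Tk sketch of this continuum for the color example (an
                 illustration under assumptions : clicking a palette swatch stands in
                 for the eye dropper, and the widget names are invented). Both styles
                 change the same underlying value :

                  import tkinter as tk

                  root = tk.Tk()
                  preview = tk.Label(root, text="   selected color   ", bg="#ffffff")
                  preview.pack(padx=10, pady=10)

                  def set_color(hex_value):
                      preview.config(bg=hex_value)             # shared effect of both styles

                  # Direct : point at the color you want and click it.
                  palette = tk.Frame(root)
                  palette.pack()
                  for hex_value in ("#ff0000", "#00aa00", "#0000ff", "#ffff00"):
                      swatch = tk.Label(palette, bg=hex_value, width=4, cursor="crosshair")
                      swatch.pack(side="left", padx=2)
                      swatch.bind("<Button-1>", lambda e, h=hex_value: set_color(h))

                  # Indirect : describe the color symbolically, e.g. "255,255,0" for yellow.
                  entry = tk.Entry(root)
                  entry.pack(pady=10)

                  def from_text(event):
                      r, g, b = (int(v) for v in entry.get().split(","))
                      set_color(f"#{r:02x}{g:02x}{b:02x}")

                  entry.bind("<Return>", from_text)

                  root.mainloop()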
3D Interfaces [1]
             § We live in a three-dimensional world.
             § There is an increasing use of 3D effects in user interface design.
             § From simple techniques (shading, etching, sculptural effects,
                 3D icons, …) to more complex 3D workspaces.
             § 3D workspaces give extra space in a more natural way than
                 iconifying windows (objects shrink when they are further away).

3D Interfaces [2]
             § In 3D workspaces, objects are displayed in perspective and their
                 relative size, light, angle and occlusion provide an intuitive sense
                 of distance (see the projection sketch below).
             § The next step is virtual reality where the user can move within a
                 simulated 3D world (will be discussed later).
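
             § The "objects shrink with distance" effect follows from a simple perspective
                 projection. A minimal sketch (assuming a pinhole model with an illustrative
                 focal length f in pixels; the numbers are made up) :

                  def project(x, y, z, f=500.0):
                      """Map a 3D point (x, y, z) in camera space to 2D screen coordinates."""
                      return (f * x / z, f * y / z)

                  def apparent_width(width, z, f=500.0):
                      """A window of real width `width` at depth z appears f*width/z pixels wide."""
                      return f * width / z

                  # Example : the same 400-unit-wide window at two depths.
                  print(apparent_width(400, z=500))     # 400.0 px, near the viewer
                  print(apparent_width(400, z=2000))    # 100.0 px, four times further away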

Man-Computer Symbiosis

             § J.C.R. Licklider (1960) outlined "man-computer symbiosis"

                   "The hope is that, in not
                   too many years, human
                   brains and computing
                   machines will be coupled
                   together very tightly and
                   that the resulting
                   partnership will think as
                   no human brain has ever
                   thought and process data
                   in a way not approached
                   by the information-
                   handling machines we
                   know today."

Brain-Computer Interface

             § A direct brain-computer interface (BCI, or direct neural interface)
                 would add a new dimension to human-machine interaction.
                 It would represent one of the new frontiers in science and technology.
             § Cerebral electric activity is recorded via EEG : electrodes attached
                 to the scalp measure the electric signals of the brain. These signals
                 are amplified and transmitted to the computer, which transforms them
                 into device control commands (sketched below).
             § Interesting research work in this direction has already been initiated,
                 mainly motivated by the hope of creating new communication channels
                 for people with severe neuromuscular disorders.
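
             § A toy Python sketch of that signal path (purely illustrative : the samples
                 are synthetic, the band-power threshold and the command names are invented,
                 and real BCI pipelines use far more elaborate filtering and classification) :

                  import math

                  SAMPLE_RATE = 256                 # samples per second (a typical EEG rate)

                  def band_power(samples, low_hz, high_hz):
                      """Crude power estimate in [low_hz, high_hz] via a discrete Fourier sum."""
                      n = len(samples)
                      power = 0.0
                      for k in range(1, n // 2):
                          freq = k * SAMPLE_RATE / n
                          if low_hz <= freq <= high_hz:
                              re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
                              im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
                              power += (re * re + im * im) / n
                      return power

                  def to_command(samples, threshold=10.0):
                      """Map one window of amplified, digitised EEG samples to a control command."""
                      mu_power = band_power(samples, 8, 12)    # mu band, linked to motor imagery
                      return "MOVE_CURSOR" if mu_power > threshold else "IDLE"

                  # Example : one second of synthetic 10 Hz activity is mapped to a command.
                  window = [math.sin(2 * math.pi * 10 * t / SAMPLE_RATE) for t in range(SAMPLE_RATE)]
                  print(to_command(window))          # -> MOVE_CURSOR
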
Multimodal Interfaces

[1.5] Appendix
      HCI and GUI Short History
Douglas Engelbart's Vision (≈1955)

             § "…I had the image of sitting at a big CRT screen with all kinds of symbols, new
                 and different symbols, not restricted to our old ones. The computer could be
                 manipulated, and you could be operating all kinds of things to drive the
                 computer…
                 ... I also had a clear picture that one's colleagues could be sitting in other rooms
                 with similar work stations, tied to the same computer complex, and could be
                 sharing and working and collaborating very closely. And also the assumption that
                 there'd be a lot of new skills, new ways of thinking that would evolve…"
             § In 1962 Doug Engelbart developed a conceptual framework for
                 augmenting human intellect ("boost the collective IQ").
             § He was a precursor in perceiving computers as facilitators for
                 communication, rather than only for computation.
             § He founded the Augmentation Research Center at the Stanford Research
                 Institute (SRI), a precursor to Xerox PARC, and developed a working
                 vision of a collaborative computing environment, with a graphical
                 windowed interface, mouse, hypertext system, networking, and
                 electronic mail.
First Mouse (Douglas Engelbart, 1964)

Xerox PARC Alto + Star Projects (≈1975-1981)

             § Concept of personal workstation
                     Ø Local processor
                     Ø Idea of a local area network to share resources
             § Modern graphical user interface (GUI)
                     Ø Bit-mapped display, mouse
                     Ø Windows, menus, scroll bars, mouse selection, etc.
                     Ø Familiar user’s conceptual model (simulated desktop)
                    Ø Promoted recognizing/pointing rather than remembering/typing
                    Ø Property sheets to specify appearance/behavior of objects
                    Ø What you see is what you get (WYSIWYG)
                    Ø Modeless interaction

             § First system based upon usability engineering
             § Commercial flop

Apple Lisa (1983)

             § Predecessor of Macintosh
             § Based upon many ideas of the Star computer (Xerox)
             § Commercial failure as well (price ≈ $10'000)

Apple Macintosh (1984)

             § “Old ideas” but well done !
             § Mistakes of Lisa corrected + aggressive pricing
             § Interface guidelines encouraged consistency between
                 applications
             § Developer’s toolkit encouraged third-party software
             § Dominated desktop publishing thanks to the affordable laser printer
                 and excellent graphics

Windows 1.01 (1985)

             § Built on the cryptic MS-DOS operating system
             § Almost unusable
                    Ø No overlapping windows (unsightly tiled windows)
                    Ø No icons

Windows 2.03 (1988)

             § Added overlapping windows and Mac-like features
                     Ø Window-manipulation terminology : "Minimize", "Maximize"
                     Ø Keyboard shortcuts (underlined mnemonics)
                     Ø Introduction of Word for Windows and Excel

             § No commercial success
                     Ø Developers still maintained DOS versions of their applications

Windows 3.1 (1992)

             § Follows Windows 3.0 (1990), a transition version which introduced a
                 significantly revamped user interface and numerous technical
                 improvements.
             § First serious and successful desktop platform
                    Ø TrueType font system
                    Ø File Manager
                    Ø Program Manager
                    Ø Minesweeper

             § Followed in 1993 by Windows for Workgroups (3.11)
                 with native networking support

MacOS X

             § New Aqua GUI
                    Ø Double-buffering
                     Ø Minimized windows stretching and squeezing into the Dock
                     Ø Exposé feature to fit all applications on screen
                     Ø Several eye-candy features

And a lot more...

                     Ø Amiga Workbench (1985)
                     Ø Digital Research GEM (1985)
                     Ø NeXTstep (1988)
                     Ø OS/2 (1992)

And more...

                     Ø Windows 95 (1995)
                     Ø Linux KDE (1996)
                     Ø BeOS (1998)
                     Ø Windows Vista (2006)
GUI History Timeline

             § A history of the GUI (by Jeremy Reimer) :
                 arstechnica.com/articles/paedia/gui.ars

             § GUI Gallery (by Nathan Lineback) :
                 toastytech.com/guis

Future GUIs / 3D Skins [1]
             § Microsoft Task Gallery Research project

                  § Video : research.microsoft.com/adapt/taskgallery/video.mpg

Future GUIs [2]
             § Sun Looking Glass Research project

                  § Video : www.sun.com/software/looking_glass/demo.xml

Key Points / What You Should Know

             § Human Communication Channels
                     Ø Senses / Effectors
             § Human Modeling
                     Ø Model Human Processor / Fitts' Law
                    Ø Human Memory (Sensory / Short-term / Long-term)
                    Ø Action Theory / Action Cycle Principle / Execution + Evaluation Gulfs

             § Reasoning
                     Ø Deductive / Inductive / Abductive
                    Ø Affective aspects / Emotions

             § Interaction Paradigms
                     Ø Mode (Spatial / Temporal), Quasimode, Modeless
                     Ø Most Common Interaction Styles
                           ✓ Command-Line
                           ✓ …
                           ✓ Direct Manipulation / Indirect Manipulation
                           ✓ …
                           ✓ Brain-Computer Interface