Multimodal Interfaces
[1] Introduction to
Human-Computer Interaction
Jacques Bapst

Content Overview
§ Human-Computer interaction basic paradigms
§ Human capabilities / Human modeling / Cognitive Framework
Ø Input-Output channels / Human senses
Ø Model Human Processor
Ø Fitts' Law
Ø Human memory (sensory, short-term, long-term)
Ø Reasoning (deductive, inductive, abductive)
Ø Human perception
Ø Interaction process (action theory, affective aspects, emotion)
§ User-Computer Dialog / Interaction Styles
Ø Mode / Temporal and Spatial Mode / Quasimode
Ø Command language
Ø Form fill-in / Spreadsheets
Ø Menu selection
Ø Natural language / Query language
Ø WIMP / Point-and-Click
Ø Direct manipulation / Indirect manipulation
Ø 3D Interfaces / Brain computer interface
§ Appendix : HCI and GUI short history
EIA-FR / J. Bapst MMI_01 2

[1.1] Human-Computer Interaction
"Bridging the Gap"

Human-Computer Interaction
"With teletype interface and the Fortran language,
the computer will be easy to use"
From RAND-Corporation
1950 : A scientist shows how a "home computer" could look like in 2004
Human-Computer Interaction (HCI)
§ Human-Computer Interaction is also known as Man-Machine
Interaction (MMI).
§ One possible definition (ACM) :
Ø Human-computer interaction is a discipline concerned with the
design, evaluation and implementation of interactive computing
systems for human use and with the study of major phenomena
surrounding them.
§ HCI is a large interdisciplinary science involving
Ø Computer science and Engineering
Ø Graphic design
Ø Cognitive psychology
Ø Ergonomics (user's physical capabilities)
Ø Sociology (wider context of the interaction)
§ User interface design is an important subset of the HCI field of
study.
Interaction Design and HCI
§ Relationship between Interaction Design, Human-Computer
Interaction and other fields : academic disciplines, design
practices and interdisciplinary fields.
Academic Disciplines
Ø Ergonomics
Ø Psychology
Ø Cognitive sciences
Ø Informatics
Ø Engineering
Ø Computer Science
Ø Social Sciences (sociology, anthropology, ...)

Design Practices
Ø Graphic Design
Ø Product Design
Ø Artist Design
Ø Industrial Design

Interdisciplinary Fields
Ø Interaction Design
Ø Human-Computer Interaction (HCI)
Ø Information Systems
Ø Computer Supported Cooperative Work (CSCW)
Ø Human Factors (HF)
Ø Cognitive Engineering
Ø Cognitive Ergonomics
[1.2] Human Capabilities
Communication Channels

Human Characteristics / Abilities
§ HCI is undoubtedly a multi-disciplinary topic.
§ In HCI design, it is really important to understand something
about
Ø Human information-processing characteristics
ü Cognitive architecture, memory, perception, motor skills, …
Ø How human action is structured
Ø The nature of human communication
Ø Human physical and physiological requirements
§ Humans are limited in their capacity to process information.
This has important implications for the interaction design.
§ Important aspects
Ø Input-output channels (senses and effectors)
Ø Memory
Ø Learning (acquiring skills)
Ø Reasoning / Problem solving (cognitive activity)
Input-Output Channels
§ Input in the human occurs mainly through the senses (sensory
channels) and output through the motor control of the effectors.
§ Five major senses + Balance + Kinesthesis
Ø Sight
Ø Hearing
Ø Touch
Ø Taste
Ø Smell
Ø Sense of equilibrium
Ø Proprioception
Taste and smell do not currently play a significant role in HCI, except in specialized systems or virtual reality.
§ Effectors
Ø Limbs (arms, legs, body position, …)
Ø Fingers
Ø Eyes
Ø Head / Face
Ø Body
Ø Vocal system
Human Senses : Sight
§ Human vision is a highly complex activity with a range of
physical and perceptual limitations.
§ Primary source of information.
§ Two main stages
Ø Physical reception of the stimulus (photoreceptors of the retina)
Ø Processing and interpretation of that stimulus
§ Visual perception
Ø Size and depth (visual angle, stereoscopy, knowledge, …)
Ø Color (hue, intensity, saturation), Color blindness
Ø Brightness (luminance, contrast)
§ Reading
Ø Visual pattern perception
Ø Pattern decoding using internal representation of language
Ø Syntactic and semantic analysis (phrases)
Human Senses : Hearing
§ Hearing is often considered secondary to sight (we tend to
underestimate the amount of information that we receive through
our ears).
§ Two main stages
Ø Physical reception of the stimulus (sound wave propagated along the
auditory canal, received by tympanic membrane and transmitted to the
cochlea)
Ø Processing and interpretation of that stimulus
§ Hearing perception
Ø Pitch (fundamental frequency)
Ø Loudness (amplitude)
Ø Timbre (spectrum, envelope)
Ø Location (stereophony)
§ Voice recognition
Ø Perception, decoding, syntactic and semantic analysis
Human Senses : Touch
§ Touch is also known as haptic perception.
§ It provides us with vital information about our environment.
§ The skin contains three types of sensory receptor
Ø Thermoreceptors respond to temperature
Ø Nociceptors respond to intense pressure, heat and pain
Ø Mechanoreceptors respond to pressure
§ The sense of touch acts as
Ø Sensory receptor : thermoreceptors, pressure receptors, pain receptors
Ø Warning : hot, sharp, …
Ø Feedback : feel when in contact, necessary in prehension
§ A second aspect of haptic perception is the awareness of the
position of the body and limbs. This conscious awareness of
body position is known as kinesthesis or (if we include the sense
of equilibrium) proprioception.
[1.3] Human Modeling
Cognitive Framework

What is Cognition ? [1]
§ What goes on in the mind in our everyday activities ?
§ Different kinds of cognition.
What is Cognition ? [2]
§ Norman (1993) distinguishes between two general modes :
Ø Experiential cognition
ü Driving a car, reading a book, having a conversation, playing a game, …
Ø Reflective cognition
ü Designing, learning, problem solving, writing a book, …
§ Cognition may also be described in terms of specific kinds of
processes :
Ø Attention
Ø Perception and recognition
Ø Memory
Ø Learning
Ø Reading, speaking, listening
Ø Problem-solving, planning, reasoning, decision-making
Ø …
Human Modeling
§ Unfortunately no general and unified theory.
§ Cognitive and interaction models attempt to represent the users
as they interact with a system, modeling aspects of
their understanding, knowledge, intentions or processing.
§ Human models can be divided into categories according to their
abstraction level, from reasoning down to reflex :
Ø Theory of Action [Norman]
Ø Rasmussen's model
Ø Model Human Processor [Card, Moran, Newell]
Ø Others : GOMS, ICS, CCT, Keystroke-Level Model, …
Model Human Processor (MHP)
§ From Card, Moran and
Newell (1983).
§ Describes the cognitive
process that people go
through between
perception and action
§ Simplistic view of human
behavior
Ø Ignores environment and
other people (social
interactions)
§ Low level model
Ø Performance oriented
Ø Allows empirical tests
§ Basis of GOMS model
Fitts' Law [1]
§ Fitts' law (1954) predicts the time required to move from a
starting position to a final target area, based on the size and the
distance of that target area.
§ It describes the behavior of rapid, aimed movement.
[Figure : a target of width W at distance D from the start position; d0 … d4 mark successive steps of the movement]

T = k log2(2D / W)

k : constant based on the cycle times τp and τm (usually k ≈ 0.1 s)

  D   10 cm    10 cm    30 cm
  W    1 cm     1 mm     2 mm
  T   0.43 s   0.76 s   0.82 s
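The table values can be checked with a few lines of Python (a sketch; `fitts_time` is our own helper name, and D and W just need to share a unit):

```python
import math

def fitts_time(d, w, k=0.1):
    """Movement time T = k * log2(2D / W); k is based on the cycle
    times (tau_p, tau_m) and is usually about 0.1 s."""
    return k * math.log2(2 * d / w)

# Reproduce the three columns of the table above (D and W in cm)
for d, w in [(10, 1.0), (10, 0.1), (30, 0.2)]:
    print(f"D = {d} cm, W = {w} cm  ->  T = {fitts_time(d, w):.2f} s")
```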
Fitts' Law [2]
§ Fitts' law has been formulated in several different ways.
§ One common form is the Shannon formulation.
T = a + b log2(D / W + 1)
a and b are empirical constants which depend on the pointing
device and the user's dexterity. [a ≈ 0.1 s, b ≈ 0.1 s]
§ The logarithmic term of the formula ( log2(D / W + 1) ) is called the
index of difficulty (ID) ð T = a + b · ID
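The Shannon formulation can be sketched directly from these definitions (the function names are our own; a and b use the approximate values given above):

```python
import math

def index_of_difficulty(d, w):
    """Shannon formulation: ID = log2(D / W + 1), measured in bits."""
    return math.log2(d / w + 1)

def movement_time(d, w, a=0.1, b=0.1):
    """T = a + b * ID, with empirical device/user constants a and b."""
    return a + b * index_of_difficulty(d, w)

# A small, distant target is predicted to take longer than a large, near one
print(movement_time(300, 2))
print(movement_time(20, 10))
```

Comparing pointing devices then amounts to fitting a and b for each device from measured movement times.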
§ Useful when
Ø Designing user interfaces
Ø Comparing pointing devices (determining a and b)
§ Test yourself : www.tele-actor.net/fitts/index.html
Human Memory [1]
§ Much of our everyday activity relies on memory.
§ It is generally agreed that there are three types of memory :
Ø Sensory buffers
Ø Short-term memory or working memory
Ø Long-term memory
§ These memories interact, with information being processed and
passed between memory stores :
Ø Sensory memories : iconic (visual), echoic (aural), haptic (touch);
attended information passes to short-term memory; information not
attended to is lost
Ø Short-term memory (or working memory) : repetition keeps
information active; rehearsal / encoding passes it on to long-term
memory
Ø Long-term memory : information comes back through retrieval and
is lost through forgetting
Human Memory [2]
§ First stage : selection and encoding
Ø Determines which information is attended to in the environment and
how it is interpreted.
Ø The more attention paid to something, and the more it is processed
in terms of thinking about it and comparing it with other knowledge,
the more likely it is to be remembered.
§ We don’t remember everything : memory involves filtering and
processing what is attended to.
§ Context is important in affecting our memory (i.e., where, when).
Ø Sometimes it can be difficult to recall information that was encoded
in a different context.
§ We recognize things much better than we can recall them.
Ø Better at remembering images than words
Ø A reason why interfaces are largely visual
ü GUIs provide visually-based options : recognition
ü Command-line UIs require users to remember commands : recall
Sensory Memory
§ The sensory memories act as buffers for stimuli received through
the senses (constantly overwritten by new information [0.1 … 0.5 s]).
§ A sensory memory exists for each sensory channel
Ø Iconic memory for visual stimuli
Ø Echoic memory for aural stimuli
Ø Haptic memory for touch
§ Information is passed from sensory memory into short-term
memory by attention.
§ Attention is the concentration of the mind on one out of a number
of competing stimuli.
§ We can choose which stimuli to attend to (according to our needs, level
of interest, …; this explains the "cocktail party" phenomenon).
§ Information received by sensory memories is quickly passed into
a more permanent memory store (generally the short-term memory)
or overwritten and lost.
Short-Term Memory
§ Short-term memory or working memory (a slightly different
concept) acts as a scratch-pad for temporary recall of information.
§ Examples of use :
Ø Calculation (e.g. 25 x 6)
Ø Reading
§ Short-term memory access time is in the order of 70 ms.
§ Information can only be held temporarily : ~ 200 ms … 10 s
§ Short-term memory has a limited capacity. This was established
in experiments by Miller ("The magical number seven, plus or minus two").
§ The average person can remember 7 ± 2 chunks of information.
§ Further studies say 4 ± 2 (if there is no relationship between the items)
§ A chunk of information is not precisely defined. It is any
meaningful unit (digits, words, people's faces, chess positions, etc.)
and depends on the user's experience with that kind of information.
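Chunking is easy to illustrate: the same ten digits are much easier to hold in short-term memory as three chunks than as ten separate items. A minimal sketch (the digit string and grouping sizes are just an example):

```python
def chunk(digits, sizes):
    """Group a digit string into larger meaningful units (chunks)."""
    out, i = [], 0
    for n in sizes:
        out.append(digits[i:i + n])
        i += n
    return out

# Ten digits strain short-term capacity; three chunks fit comfortably
print(chunk("0791234567", [3, 3, 4]))
```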
Miller's Magical Number : 7±2
§ George Miller's theory (1956) says that 7±2 chunks of information
can be held in short-term memory at any one time.
§ It is one of the best known and best remembered findings in psychology.
§ But unfortunately, this theory is often misinterpreted by HCI
designers.
§ Examples of inappropriate application of the theory :
Ø Never have more than seven bullets in a list
Ø Have no more than seven options on a pull-down menu
Ø Display only seven icons on a menu-bar or tool-bar
Ø Place no more than seven tabs at the top of a website page
Ø and so on…
§ All of these are wrongly based on Miller's law because all the
items can be scanned and rescanned visually and hence do not
have to be recalled from short-term memory.
Long-Term Memory
§ The long-term memory is our permanent memory store
intended for the long-term storage of information (everything that
we know : experiential knowledge, procedural skills, etc.).
§ It has a huge capacity (if not unlimited).
§ Relatively slow access time (~0.1 s).
§ Forgetting, if at all, occurs much more slowly than in short-term
memory (long-term recall after minutes is the same as that after days).
§ Today, most researchers distinguish three long-term memory
sub-systems :
Ø Episodic memory : memory of events and experiences in a serial
form (chronology)
Ø Semantic memory : structured record of facts and concepts
that we have acquired
Ø Procedural memory : "know-how" memory (skills, procedures)
§ Episodic and semantic memory together form declarative memory.
Long-Term Memory Structure

Long-term memory
Ø Declarative memory (facts)
  ü Semantic memory (e.g. Paul's address, word meanings)
  ü Episodic memory (e.g. last birthday party)
Ø Procedural memory (skills)
  ü Play piano
  ü Ride a bike
§ The information in semantic memory is derived from that in our
episodic memory (we can learn new concepts from our experiences).
§ Memory structure and processes are very complex and cannot
easily be reduced to a simple model.
Reasoning
§ Reasoning is the process by which we use our knowledge to
draw conclusions or infer new information.
§ There are different types of reasoning :
Ø Deductive reasoning / Deduction
Ø Inductive reasoning / Induction
Ø Abductive reasoning / Abduction
§ Humans are able to think about things of which they have no
experience and solve problems they have never seen before.
§ Problem solving involves reasoning and vice versa.
§ Recurrent familiar situations allow people to acquire skills in a
particular domain (better information structure).
§ People build their own theories (called Mental models) to
understand the causal behavior of systems.
§ Sometimes these models are based on an incorrect interpretation of the
facts, and this can lead to the well-known "human error".
Deductive Reasoning
§ Deductive reasoning derives the logical conclusion from the
given premises.
Ø If it is Monday it rains
Ø It is Monday
Ø Therefore it rains
§ Note that the logical conclusion does not necessarily have to
correspond to our notion of truth.
§ Deductive reasoning is therefore sometimes misapplied :
Ø Some people are students
Ø Some students attend a lecture about Multimodal interfaces
Ø Therefore some people attend a lecture about Multimodal interfaces
Is this logically correct ?
§ Human deduction is poor when there is a clash between
truth and logical validity.
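The syllogism above is in fact logically invalid, which a toy set model makes explicit: we can build a world where both premises hold and the conclusion fails (the names and sets are purely illustrative):

```python
# A counterexample model for the syllogism above: both premises are true,
# yet the conclusion is false, so the deduction is logically invalid.
people    = {"alice"}
students  = {"alice", "bob"}   # "bob" is a student not in `people`
attendees = {"bob"}            # the students attending the lecture

premise1   = bool(people & students)      # some people are students
premise2   = bool(students & attendees)   # some students attend the lecture
conclusion = bool(people & attendees)     # some people attend the lecture

print(premise1, premise2, conclusion)     # True True False
```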
Inductive Reasoning
§ Inductive reasoning is generalizing from cases we have seen
to infer information about cases we have not seen.
§ With inductive reasoning, we draw conclusions by moving from a
specific case or cases to a general rule (just the
opposite of deductive reasoning).
§ Example :
Ø Every EIA-FR student I have ever seen owns a desktop computer
Ø Therefore I infer that all EIA-FR students own a desktop computer
§ Of course, this inference is unreliable and cannot (or only with
difficulty) be proved to be true. It can only be proved to be false.
§ In spite of its unreliability, induction is a useful process, which we
use constantly in learning about our environment.
Abductive Reasoning
§ Abductive reasoning is reasoning from observed facts to the
action or state that caused them.
§ This is the method we use to derive explanations for the events
we observe (we try to find a hypothesis that would explain the
observed facts).
§ Example :
Ø I know that Bob takes his car when he misses the bus
Ø I see Bob driving his car
Ø Therefore I may infer that he missed the bus
§ In spite of its unreliability, people do infer explanations in this
way and hold onto them until they have evidence to support an
alternative theory or explanation.
§ In interactive systems, if an event always follows an action, the
user will infer that the event is caused by the action.
If, in fact, the event and the action are unrelated, confusion and
even error may result.
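Abduction can be sketched as a search over a rule base for causes whose effects match the observation. The rules and names below are hypothetical, and the inference is only plausible, not certain: other causes could produce the same effect.

```python
# Toy sketch of abduction: from an observed effect, propose the cause(s)
# recorded in a (made-up) rule base.
rules = {
    "Bob missed the bus": "Bob drives his car",   # cause -> observed effect
    "Bob's bike is broken": "Bob takes the tram",
}

def abduce(observation):
    """Return every known cause whose effect matches the observation."""
    return [cause for cause, effect in rules.items() if effect == observation]

print(abduce("Bob drives his car"))
```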
Human Perception [1]
§ Human senses cannot easily be compared with computer
peripherals. Don't forget the functions of the brain, which
heavily processes the sensory information before interpretation.
§ All human senses are subject to illusions in their perception
process.
§ Illusion research can provide fundamental insights into general
brain mechanisms.
Human Perception [2]

[This slide contains images only]

Human Perception [3]
§ Stroop effect.
[Demo : a column of French words, color names (Noir, Vert, Rouge, Bleu, Orange) mixed with unrelated words (Papier, Manger, Livre, Maison, Creuser, Texte, Téléphone, Rire, Agenda, Golf), each printed in a colored ink; naming the ink color is slower when the word itself names a different color.]
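A similar set of Stroop stimuli can be reproduced in any terminal that supports ANSI colors (English color names here instead of the slide's French ones; the escape codes are standard SGR foreground colors):

```python
# Print color words in deliberately mismatched ink colors (Stroop stimuli).
ANSI = {"red": 31, "green": 32, "blue": 34, "black": 30}

stimuli = [("red", "blue"), ("green", "red"), ("blue", "green"), ("black", "red")]
for word, ink in stimuli:
    print(f"\033[{ANSI[ink]}m{word}\033[0m")   # word and ink disagree
```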
Human Perception [4]
§ 2- and 3-dimensional interpretation
Human Perception [5]
§ Can you read this ?
Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn't
mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt
tihng is taht the frist and lsat ltteer be at the rghit pclae.
The rset can be a toatl mses and you can sitll raed it wouthit
porbelm.
Tihs is bcuseae the huamn mnid deos not raed ervey lteter by
istlef, but the wrod as a wlohe.
§ Probably a hoax, but nevertheless surprising.
Human Perception [6]
§ Face recognition
§ Inverted faces
Human-Machine Interaction Process
§ Action theory (D.A. Norman, 1986)
§ Action cycle composed of two main processes
Ø Execution process
Ø Evaluation process
§ Seven stages :
1. Establish a goal
2. Form an intention
3. Specify an action sequence
4. Execute an action
5. Perceive the system state
6. Interpret the state
7. Evaluate the system state
with respect to the goals
and intentions
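The seven stages above can be sketched as a loop skeleton: execute, then evaluate, and repeat until the evaluation says the goal is satisfied (the function and the toy evaluation predicate are our own illustration):

```python
# Norman's seven stages as a simple execute-then-evaluate loop.
STAGES = [
    "establish a goal",
    "form an intention",
    "specify an action sequence",
    "execute the action",
    "perceive the system state",
    "interpret the state",
    "evaluate the state against goals and intentions",
]

def action_cycle(goal_satisfied, max_rounds=5):
    """Run the stages repeatedly; return the round on which the goal held."""
    for round_no in range(1, max_rounds + 1):
        for stage in STAGES:
            print(f"round {round_no}: {stage}")
        if goal_satisfied(round_no):
            return round_no
    return None

rounds = action_cycle(lambda n: n >= 2)   # toy evaluation: done on round 2
print(rounds)
```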
Action Cycle [1]
§ The action theory tries to deconstruct the process of translating an
intention into an action.
Ø The action itself has two major aspects :
doing something and checking (execution and evaluation)
Goals
Ø Gulf of execution : Intention to act (What to do ?) → Sequence of actions (How to do it ?) → Execution
Ø Gulf of evaluation : Perception → Interpretation → Comprehension / Evaluation
World (interaction)
Action Cycle [2]
§ Forming the goal.
Ø Can be stated in a very imprecise way, e.g. "Make a nice meal"
§ Execution :
Ø Forming the Intention.
Goals must be transformed into intentions, i.e., specific statements
of what has to be done to satisfy the goal; e.g. "Make a chicken
casserole using a can of prepared sauce"
Ø Specifying an Action Sequence.
What is to be done to the world. The precise sequence of operators
that must be performed to effect the intention; e.g. "Defrost frozen
chicken, open can, ..."
Ø Executing an Action.
Actually doing something. Putting the action sequence into effect on
the world; e.g. actually opening the can.
Action Cycle [3]
§ Evaluation :
Ø Perceiving the State of the World.
Perceiving what has actually happened; e.g. the experience of
smell, taste and look of the prepared meal.
Ø Interpreting the State of the World.
Trying to make sense of the perceptions available; e.g. putting
those perceptions together to present the sensory experience of
a chicken casserole.
Ø Evaluating the Outcome.
Comparing what happened with what was wanted; e.g. did the
chicken casserole match up to the requirement of "a nice meal" ?
§ Design objectives
Ø Reduce the gulf of evaluation
Ø Reduce the gulf of execution
§ Design principles
Ø Good conceptual model
Ø Visibility
Ø Good mapping
Ø Feedback
Execution's and Evaluation's Gulfs
§ The gulfs represent the gaps that exist between the user and the
interface
§ Gulf of execution
Ø Distance from the user to the physical system
§ Gulf of evaluation
Ø Distance from the physical system to the user
§ The goal : reduce the gulfs in order to reduce the cognitive effort
required to perform a task
Gulf of execution : Intentions → Action sequence → Interface mechanism
Gulf of evaluation : Interface perception → Interpretation → Evaluation
Affective aspects / Emotions [1]
§ Human experience is far more complex and not limited
to perceptual and cognitive abilities.
§ Our emotional response to situations affects how we perform.
§ Positive emotions enable us to think more creatively
and to solve problems more quickly.
§ Negative emotions push us into narrow thinking.
A problem that is easy to solve becomes difficult if we are frustrated
or stressed.
§ Emotion involves both physical and cognitive events.
§ That biological response - known as affect - changes the way we
deal with different situations, and this has an impact on the way
we interact with computer systems.
§ If we try to build interfaces that promote positive responses
(e.g. by using aesthetics, reward, ergonomics, …) then they are
likely to be more successful.
Affective aspects / Emotions [2]
§ Recent empirical studies [Tractinsky, 2000] show that the
aesthetics of an interface can have a positive effect on
people's perception of the system's usability.
§ Importance of the look and feel of an interface (and not only
usability) is gaining acceptance within the HCI community.
§ Users are likely to be more tolerant when the interface is
pleasing :
Ø Beautiful graphics
Ø Well-designed fonts and icons
Ø Nice feel of the way the elements have been laid out
Ø Elegant use of images and color
Ø Good sense of balance
Ø …
§ A key concern is to strike a balance between designing
pleasurable interfaces and designing usable interfaces.
Affective aspects / Emotions [3]
§ Expressive interfaces can affect
user attitude and behavior.
Ø Reassuring, informative, fun
Ø Intrusive, annoying, get user angry
§ Anthropomorphism pros and cons.
Ø Well accepted by children
Ø Some people may feel anxious,
feeling inferior or stupid
Ø A controversial debate
§ Interface agents, avatars, virtual pets,
interactive toys
Ø Often considered tiresome and intrusive
Ø Sometimes downright deceptive and frustrating
Individual Differences
§ The psychological principles and properties apply to the majority
of people.
§ Notwithstanding this, we should remember that, although we
share common processes, humans (i.e. users) are not all the
same.
§ We should be aware of individual differences
Ø Long term differences : sex, physical / intellectual capabilities, …
Ø Short term differences : emotion, stress, fatigue, …
Ø Continuous changes : age, experience, skills, …
§ These differences should be taken into account in interface design
Ø Define personas (artifacts)
Ø Promote flexibility and adaptability
Ø Universal accessibility (impaired people)
§ In summary : User-centered design (philosophy and process)
[1.4] User-Computer Dialog
Interaction Styles
Interaction Paradigms

Mode
§ A mode is a distinct state of the interface in which the same
user input will produce a different result than it would in another
state. The mode influences the effect of actions.
§ Typical examples :
Ø Insert/Replace mode in word processing applications
Ø Caps lock (keyboard physical mode)
Ø Tool palettes in photo editing or drawing applications
Ø Modal dialog boxes
§ Modal interfaces should be avoided if at all possible, because
they lead to confusion or input errors, known as mode errors
(the user performs an action that is appropriate to a different mode
and hence gets an unexpected response).
§ If modes must be used, there should be clear indicators of the
current mode to help prevent mode errors.
§ Nevertheless… modes can sometimes be helpful to control and
guide user input.
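The insert/replace example above can be sketched in a few lines: the same keystroke has a different effect depending on the editor's current state, and a mode error is exactly typing while believing the other mode is active (the `Editor` class is our own minimal illustration):

```python
# Minimal sketch of a mode: identical input, mode-dependent result.
class Editor:
    def __init__(self, text):
        self.text = list(text)
        self.pos = 0
        self.mode = "insert"   # the current mode should be clearly indicated

    def key(self, ch):
        if self.mode == "insert":
            self.text.insert(self.pos, ch)
        else:                  # "replace" mode overwrites in place
            self.text[self.pos] = ch
        self.pos += 1

e = Editor("abc")
e.key("X")             # insert mode: text becomes "Xabc"
e.mode = "replace"
e.key("Y")             # same input, different effect: "XYbc"
print("".join(e.text))
```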
Spatial Mode / Temporal Mode
§ One can also present a mode as a multiplexer of user input that
allows giving one precise meaning to an action.
§ Distinction between spatial modes and temporal modes
(spatial multiplexing of input vs. temporal multiplexing).
§ A spatial mode uses the location of a user's action to
determine the meaning of an event.
Ø Example : Resizing handles in a drawing editor
§ A temporal mode uses the ordering of events in time to
determine their meaning.
Ø Example : Tools palette in a drawing editor
§ Drawback of temporal modes : they have no intrinsic feedback.
Ø A possible workaround : change the shape of the cursor or some other
noticeable interface state
§ Avoid sub-modes and limit the number of states to what is strictly
necessary.
Quasimode and Modeless Interfaces
§ A quasimode (or spring-loaded mode) is a mode that is kept in
place only through some constant action on the part of the user
(e.g. pressing the Shift-Key or Ctrl-Key).
§ A quasimode is a modeless interaction that allows for the
benefits of a mode without the associated cognitive burden
(the user is performing a conscious action).
§ Modifier keys used in interfaces usually start a quasimode.
§ An interface that doesn't use modes is known as a modeless
interface (the same input from the user will always trigger the same
perceived action).
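A quasimode can be sketched as input handling where the alternative behavior exists only while the user actively holds a modifier key, so there is no hidden state to forget (the handler and its return strings are purely illustrative):

```python
# Sketch of a quasimode: Shift-click extends the selection only while
# the Shift key is physically held down.
def handle_click(target, shift_held=False):
    """A plain click selects; a Shift-click extends the selection."""
    return f"extend selection to {target}" if shift_held else f"select {target}"

print(handle_click("item3"))
print(handle_click("item3", shift_held=True))
```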
Interaction Styles
§ Interaction can be seen as a dialog between the computer and
the user.
§ The choice of interaction style can have a profound effect on
the nature of this dialog.
§ The most common interaction styles :
Ø Command language / Command line interface
Ø Form-fills and spreadsheets
Ø Menus
Ø Natural language and query language
Ø Question/answer dialog
Ø WIMP
Ø Point-and-click
Ø Direct manipulation
Ø 3D interfaces (→ virtual reality)
Ø Brain-computer interface
Command Language [1]
§ User types in commands
in response to a prompt
§ May use function keys,
abbreviations or whole-
word commands
§ Examples
Ø OS commands
ü MS-DOS
ü Unix shell
Ø Applications
ü FTP
ü Telnet
Command Language [2]
§ Earliest form of interaction style and is still widely used
§ Flexible and extensible interface (appealing for expert users)
§ Requires a formally defined syntax (should use the user's vocabulary)
§ Useful for repetitive tasks
§ Supports regular expressions and the creation of user-defined scripts
and macros
Ø $> zip archive photo*.jpg
§ Suitable for interacting with networked computers (low bandwidth)
§ Poor usability :
Ø Typing is tiring and error prone
Ø Difficult to remember task names and parameters (bad learnability)
Ø Difficult to remember correct syntax
Ø Error messages and assistance are hard to provide
§ Not suitable for non-expert users
Form Fill-in ("fill in the blanks")
§ Used primarily for data entry (but also useful in data retrieval)
§ Aimed at non-expert users
§ Paper form metaphor
§ Originally no need for a pointing
device (Keyboard, Tab, Enter)
§ Easy movement from field to field
§ Form fill-in interfaces were (and still are) especially useful for
routine, clerical work or for tasks that require a great deal of data entry
§ Users are already familiar with the actual form (often based on an
actual paper form, for compatibility)
[Figures : a classic form fill-in screen and a more modern form fill-in]
Spreadsheet Forms
§ Spreadsheets can be considered as a sophisticated variation
of form filling
§ Grid of cells with formula
§ System maintains consistency and updates values
immediately
§ User can manipulate
values (in any order)
and observe effects
§ Sometimes blurred
distinction between
input and output fields
§ Attractive style for complex forms
[Figure : a spreadsheet cell computing = Qty * Unit Price]
§ Spreadsheets are an attractive and flexible medium
for interaction
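The consistency-maintenance idea can be sketched in a few lines: formula cells are recomputed from their input cells every time they are read, so an edit to Qty is immediately reflected in the total (the cell names and `value` helper are our own toy model, not a real spreadsheet engine):

```python
# Minimal sketch of spreadsheet-style consistency: formulas are
# recomputed from their inputs on every read.
cells = {"qty": 3, "unit_price": 2.5,
         "total": lambda c: c["qty"] * c["unit_price"]}   # = Qty * Unit Price

def value(name):
    v = cells[name]
    return v(cells) if callable(v) else v

print(value("total"))   # total with the initial values
cells["qty"] = 10       # the user edits a cell ...
print(value("total"))   # ... and the dependent value updates
```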
Menu Selection [1]
§ A menu is a set of options displayed on the screen
§ Text-based (possibly with numbered options) or GUI-based (mouse selection)
§ Selection and execution of one (or more) of the options results
in a state change of the interface
§ Classical web-pages are often mainly
based on menu selection
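A text-based menu of the kind described above can be sketched in a few lines; because the options stay visible, the user relies on recognition (reading and picking) rather than recall (the option labels are illustrative):

```python
# A minimal numbered text menu: display the options, map a choice back.
OPTIONS = ["Open file", "Save file", "Print", "Quit"]

def render_menu(options):
    return "\n".join(f"{i + 1}. {label}" for i, label in enumerate(options))

def select(options, choice):
    """Map the typed 1-based number back to the chosen option."""
    return options[choice - 1]

print(render_menu(OPTIONS))
print(select(OPTIONS, 3))
```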
Menu Selection [2]
§ Advantages
Ø Affords exploration ("look around")
Ø Relies on recognition rather than recall
Ø Ideal for novice or intermittent users
Ø Structures workflow and decision
making (sequential, hierarchical, grouping)
Ø User's input does not have to be
parsed ð easier error handling
§ Disadvantages
Ø May be slow for frequent users
(shortcuts should be implemented)
Ø Too many menus may lead to information
overload (discouraging the users)
Ø Not always suited for small graphic displays
(need adaptation)
Ø Highly hierarchical menus may be tedious (where to drill down ?)
Menu Selection / Guidelines [3]
§ Make menu options meaningful in the user’s language
§ Logically group similar options to aid recognition
§ Use hierarchical organization where appropriate
(menus/submenus)
§ Use sequential organization where appropriate (arrange options
in order to suggest a workflow sequence)
Natural Language Interaction
§ Natural language understanding
§ Forms : speech or written input (smart command language)
§ Very attractive style of interaction (at least at first glance)
§ A very difficult task
Ø Parsing language is very difficult (a language is vague and imprecise
by its very nature)
Ø Phrases and words are quite often ambiguous (homonyms, …)
Ø Spelling errors and/or variations exacerbate written input
Ø Synonyms exacerbate written and speech input
Ø Converting audio speech to machine-readable text is very difficult
§ Subject of considerable interest and research
§ Relatively successful in restricted domains or after an
extensive learning process (still natural language ?)
§ A simpler approach : query-language (restricted context, more
formal)
Natural / Query / Command Language
§ Distinction between written natural language, query language
and command language is sometimes blurred.
§ What appears as a natural language interface may simply be
a front-end for a query sub-system.
§ The question is parsed into keywords to form a query.
§ Another related example : web search engines
Question / Answer
§ Simple mechanism for providing input to an application.
§ The user is asked a series of questions and is led through the interaction step by step.
Ø Yes/no response
Ø Multiple choice
Ø Codes
§ Examples : web questionnaires, web inquiries
§ Easy to learn but limited in functionality and power (appropriate for restricted domains and for novice users).
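The step-by-step structure above can be sketched as a scripted sequence of questions with per-question validation; the questions and allowed answers are illustrative only :

```python
# A question/answer interaction: the user is led step by step through
# yes/no and multiple-choice questions, each with a fixed set of valid
# answers.
QUESTIONS = [
    ("Do you want to subscribe? (y/n)", {"y", "n"}),
    ("Choose a plan (1=basic, 2=pro)", {"1", "2"}),
]

def run_dialog(answers):
    """Validate one answer per question; reject any out-of-range input."""
    results = []
    for (prompt, allowed), answer in zip(QUESTIONS, answers):
        if answer not in allowed:
            raise ValueError(f"invalid answer {answer!r} to {prompt!r}")
        results.append(answer)
    return results
```

In a live application the answers would come from `input()` or a web form rather than a prepared list.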
WIMP
§ WIMP : Windows + Icons + Menus + Pointing devices
§ Popularized by the Graphical User Interface (GUI)
§ Currently the most common environment for interacting with computers (sometimes simply called a windowing system)
§ Needs an appropriate visual representation of the objects to interact with
Ø Symbolized pictorial representations (icons) may be difficult to interpret because of their small size
§ Typically based on metaphors :
"The essence of metaphor is understanding and experiencing one kind of thing in terms of another"
§ Particularly suited to exploring an application
Point-and-Click Interaction
§ Point-and-Click is closely related to WIMP but a little bit simpler.
§ In information browsing or simple multimedia systems, most interactions require only a single click of a mouse button.
§ Closely related to the hypertext idea.
§ Popularized by the web (browser).
§ Not limited to the mouse device; also used for touch screens (interactive information kiosks).
Direct Manipulation [1]
§ Direct manipulation (defined by Ben Shneiderman in 1982) is often closely related to WIMP but not limited to it.
§ Direct manipulation involves continuous representation of the object of interest, and rapid incremental reversible operations whose impact on the object is immediately visible (feedback). Auditory feedback may also be provided.
§ The intention is to allow users to directly manipulate the objects presented to them, using actions that correspond at least loosely to the physical world (metaphor).
§ Direct manipulation implies physical actions instead of complex syntax.
§ E.g. a drag-and-drop operation
Direct Manipulation [2]
§ Features of a direct manipulation interface (highlighted by Ben Shneiderman) :
Ø Visibility of the objects of interest
Ø Incremental action with rapid feedback on all actions
Ø Reversibility of all actions (allows exploration without penalties)
Ø Syntactic correctness of all actions (every user action is a legal operation)
Ø Replacement of complex command languages with actions that directly manipulate the visible objects
§ With direct manipulation there is no clear distinction between input and output (e.g. the document icon is an output expression in the desktop metaphor, but the user also uses that icon to move the document).
§ Directness partly depends on the gap between the user's goals and the system image (the gulfs of evaluation and execution in action theory).
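The reversibility feature can be sketched as a minimal undo stack, where each incremental operation records how to reverse itself; the `Canvas` class and its icon-moving operation are illustrative, not from any real toolkit :

```python
# Reversibility of actions, sketched as a minimal undo stack: every
# incremental operation remembers the previous state so it can be
# undone without penalty.
class Canvas:
    def __init__(self):
        self.positions = {}   # object id -> (x, y)
        self.history = []     # stack of (object id, previous position)

    def move(self, obj, pos):
        """Apply an incremental, reversible move and record the old state."""
        self.history.append((obj, self.positions.get(obj)))
        self.positions[obj] = pos

    def undo(self):
        """Reverse the most recent action."""
        obj, previous = self.history.pop()
        if previous is None:
            del self.positions[obj]   # the object did not exist before
        else:
            self.positions[obj] = previous
```

Incremental action with rapid feedback corresponds to redrawing the object after every `move`, not only at the end of a command.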
Direct Manipulation [3]
§ Two variants of direct manipulation
Ø Program manipulation
Ø Content manipulation
§ Program manipulation is typically focused on the management of the program and its interface.
Ø Selection
Ø Drag-and-drop
Ø Control manipulation (button pushing, scrolling, ...)
Ø Resizing, reshaping, repositioning
Ø Connecting objects
§ Content manipulation is primarily involved with the direct manual creation, modification and movement of data with a pointing device.
Ø Drawing, painting, sketching
Ø 3D modeling
Direct Manipulation [4]
§ The tool seems to disappear.
§ The user can apply intellect directly to the task (not to the tool).
§ Advantages
Ø Easy to learn and remember
Ø Encourages exploration (reduced anxiety)
Ø High subjective satisfaction (fun and entertaining)
Ø Recognition memory
§ Drawbacks
Ø Mouse operations may be slower than typing
Ø Need to learn meaning of components (visual representation)
Ø Not so intuitive (most users don't discover it independently)
Ø More difficult to program (is this relevant ?)
Ø Tedious for repeated actions
Ø History keeping is harder
Indirect Manipulation
§ Not all tasks can be described using concrete objects, and not all actions can be performed directly.
§ There is a continuum from "Do it yourself" to "Command control".
§ Most GUIs are a combination of direct and indirect manipulation.
Ø Using a menu is a rather indirect manipulation
Ø Using a pop-up menu is more direct, but it is less direct than dragging an element
§ Example 1 : choosing a color
Ø Using an "eye dropper" mouse pointer → [direct]
Ø By typing the color values ([255, 255, 0] to get yellow) → [indirect]
§ Example 2 : defining a text size
§ The expression indirect manipulation is also used when the user interacts with the real world (instrument, plant) through an interface.
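The indirect route of Example 1 (typing the color component values) can be sketched as a small parser; the "[R, G, B]" input format is illustrative :

```python
# Indirect manipulation of a color: the user types component values
# as text instead of pointing at the color directly.
def parse_color(text: str) -> tuple[int, int, int]:
    """Parse a typed '[R, G, B]' string into a color triple."""
    parts = text.strip("[] ").split(",")
    r, g, b = (int(p) for p in parts)
    for component in (r, g, b):
        if not 0 <= component <= 255:
            raise ValueError("components must be in 0..255")
    return (r, g, b)
```

The direct counterpart (the eye-dropper pointer) needs no such syntax: the selected pixel already is the value.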
3D Interfaces [1]
§ We live in a three-dimensional world.
§ There is an increasing use of 3D effects in user interface design.
§ From simple techniques (shading, etching, sculptural effects, 3D icons, …) to more complex 3D workspaces.
§ 3D workspaces give extra space in a more natural way than iconizing windows (objects shrink when they are further away).
3D Interfaces [2]
§ In 3D workspaces, objects are displayed in perspective, and their relative size, lighting, angle and occlusion provide an intuitive sense of distance.
§ The next step is virtual reality, where the user can move within a simulated 3D world (will be discussed later).
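The relative-size depth cue follows from simple perspective projection : on-screen size scales as focal length over distance. A minimal sketch, with arbitrary units :

```python
# Perspective projection of object size: an object twice as far away
# appears half as large, which is the depth cue 3D workspaces exploit.
def projected_size(real_size: float, distance: float, focal: float = 1.0) -> float:
    """Apparent size of an object at a given distance from the viewer."""
    if distance <= 0:
        raise ValueError("object must be in front of the viewer")
    return real_size * focal / distance
```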
Man-Computer Symbiosis
§ J.C.R. Licklider (1960) outlined "man-computer symbiosis" :
"The hope is that, in not too many years, human brains and computing machines will be coupled together very tightly and that the resulting partnership will think as no human brain has ever thought and process data in a way not approached by the information-handling machines we know today."
Brain-Computer Interface
§ A direct brain-computer interface (BCI, or direct neural interface) would add a new dimension to human-machine interaction. It represents one of the new frontiers in science and technology.
§ Cerebral electric activity is recorded via the EEG : electrodes attached to the scalp measure the electric signals of the brain. These signals are amplified and transmitted to the computer, which transforms them into device control commands.
§ Interesting research work in this direction has already been initiated, mainly motivated by the hope of creating new communication channels for people with severe neuromuscular disorders.
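The pipeline described above (record, amplify, transform into commands) can be sketched at its very simplest as thresholding the power of a signal window; the threshold value and command names are purely illustrative, and real BCIs use far more sophisticated filtering and classification :

```python
# A toy signal-to-command mapping: average the power of a window of
# amplified EEG samples and map it to a device command by threshold.
def to_command(samples: list[float], threshold: float = 1.0) -> str:
    """Map the mean power of a signal window to a control command."""
    power = sum(s * s for s in samples) / len(samples)
    return "SELECT" if power > threshold else "IDLE"
```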
[1.5] Appendix
HCI and GUI Short History

Douglas Engelbart's Vision (≈1955)
§ "…I had the image of sitting at a big CRT screen with all kinds of symbols, new and different symbols, not restricted to our old ones. The computer could be manipulated, and you could be operating all kinds of things to drive the computer…
... I also had a clear picture that one's colleagues could be sitting in other rooms with similar work stations, tied to the same computer complex, and could be sharing and working and collaborating very closely. And also the assumption that there'd be a lot of new skills, new ways of thinking that would evolve…"
§ In 1962, Doug Engelbart developed a conceptual framework for augmenting human intellect ("boost the collective IQ").
§ He was a precursor in perceiving computers as facilitators for communication, rather than only computation.
§ He founded the Augmentation Research Center at the Stanford Research Institute (SRI), a precursor to Xerox PARC, and developed a working vision of a collaborative computing environment, with a graphical windowed interface, mouse, hypertext system, networking, and electronic mail.
First Mouse (Douglas Engelbart, 1964)
Xerox PARC Alto + Star Projects (≈1975-1981)
§ Concept of the personal workstation
Ø Local processor
Ø Idea of a local area network to share resources
§ Modern graphical user interface (GUI)
Ø Bit-mapped display, mouse
Ø Windows, menus, scroll bars, mouse selection, etc.
Ø Familiar user conceptual model (simulated desktop)
Ø Promoted recognizing/pointing rather than remembering/typing
Ø Property sheets to specify the appearance/behavior of objects
Ø What you see is what you get (WYSIWYG)
Ø Modeless interaction
§ First system based upon usability engineering
§ Commercial flop
Apple Lisa (1983)
§ Predecessor of Macintosh
§ Based upon many ideas of the Star computer (Xerox)
§ Commercial failure as well (price ≈ $10'000)
Apple Macintosh (1984)
§ "Old ideas" but well done !
§ Mistakes of Lisa corrected + aggressive pricing
§ Interface guidelines encouraged consistency between applications
§ Developer's toolkit encouraged third-party software
§ Domination in desktop publishing because of the affordable laser printer and excellent graphics
Windows 1.01 (1985)
§ Built on the cryptic MS-DOS operating system
§ Almost unusable
Ø No overlapping windows (unsightly tiled windows)
Ø No icons
Windows 2.03 (1988)
§ Added overlapping windows and Mac-like features
Ø Mac-like window-manipulation terminology : "Minimize", "Maximize"
Ø Keyboard shortcuts (underlined mnemonics)
Ø Introduction of Word for Windows and Excel
§ No commercial success
Ø Developers still maintained DOS versions of their applications
Windows 3.1 (1992)
§ Follows Windows 3.0 (1990), a transition version that introduced a significantly revamped user interface and numerous technical improvements.
§ First serious and successful desktop platform
Ø TrueType font system
Ø File Manager
Ø Program Manager
Ø Minesweeper
§ Followed in 1993 by Windows for Workgroups (3.11) with native networking support
Mac OS X
§ New Aqua GUI
Ø Double-buffering
Ø Minimized windows stretching and squeezing into the Dock
Ø Exposé feature to fit all applications on screen
Ø Several eye-candy features
And a lot more...
Amiga WorkBench (1985)
Digital Research GEM (1985)
NeXTstep (1988)
OS/2 (1992)
And more...
Linux KDE (1996)
Windows 95 (1995)
BeOS (1998)
Windows Vista (2006)
GUI History Timeline
§ A history of the GUI (by Jeremy Reimer) : arstechnica.com/articles/paedia/gui.ars
§ GUI Gallery (by Nathan Lineback) : toastytech.com/guis
Future GUI's / 3D Skins [1]
§ Microsoft Task Gallery Research project
§ Video : research.microsoft.com/adapt/taskgallery/video.mpg
Future GUI's [2]
§ Sun Looking Glass Research project
§ Video : www.sun.com/software/looking_glass/demo.xml
Key Points / What You Should Know
§ Human Communication Channels
Ø Senses / Effectors
§ Human Modeling
Ø Model Human Processor / Fitts' Law
Ø Human Memory (Sensory / Short-term / Long-term)
Ø Action Theory / Action Cycle Principle / Execution + Evaluation Gulfs
§ Reasoning
Ø Deductive / Inductive / Abductive
Ø Affective aspects / Emotions
§ Interaction Paradigms
Ø Mode (Spatial / Temporal), Quasimode, Modeless
Ø Most Common Interaction Styles
ü Command-Line
ü …
ü Direct Manipulation / Indirect Manipulation
ü …
ü Brain-Computer Interface