Evaluation of In-Car SDS Notification Concepts for Incoming Proactive Events


Hansjörg Hofmann and Mario Hermanutz and Vanessa Tobisch and Ute Ehrlich
and André Berton and Wolfgang Minker

Abstract Due to the mobile Internet revolution, people increasingly communicate via social networks and instant messaging applications using their smartphones. In order to stay “always connected”, they even use their smartphones while driving, which puts driver safety at risk. In order to reduce driver distraction, an intuitive speech interface which presents proactively incoming events to the driver needs to be developed. Before developing a new speech dialog system, developers have to examine which interaction style users prefer.
This paper reports on a recent driving simulation study in which several speech-based proactive notification concepts for incoming events were evaluated in different contextual situations. Four speech dialog concepts and two graphical user interface concepts, one of which includes an avatar, were designed and evaluated with respect to usability and driving performance. The results show significant differences between the speech dialog concepts. Informing the user verbally achieves the best usability result. Earcons are perceived as the least distracting. The presence of an avatar was not accepted by the participants and led to impaired steering performance.

Hansjörg Hofmann
Daimler AG, Ulm, Germany, e-mail: hansjoerg.hofmann@daimler.com
Mario Hermanutz
Daimler AG, Ulm, Germany, e-mail: mario.hermanutz@daimler.com
Vanessa Tobisch
Daimler AG, Ulm, Germany, e-mail: vanessa.tobisch@daimler.com
Ute Ehrlich
Daimler AG, Ulm, Germany, e-mail: ute.ehrlich@daimler.com
André Berton
Daimler AG, Ulm, Germany, e-mail: andre.berton@daimler.com
Wolfgang Minker
Ulm University, Germany, e-mail: wolfgang.minker@uni-ulm.de

Proceedings of 5th International Workshop on Spoken Dialog Systems
Napa, January 17-20, 2014

1 Introduction
Today, smartphones are considered people’s companions and are used in various daily situations. People do not even refrain from using their mobile devices manually while driving, which distracts the driver and endangers driver safety [5]. Due to the mobile Internet revolution, the frequency of use of mobile devices has increased. In order to be “always connected”, people no longer only send regular text messages or simply call each other. Nowadays, people communicate via social media, email and other (instant) messaging applications using their smartphones. Informa Telecoms & Media estimates that by the end of 2013 rich content messaging traffic per day will be twice the volume of SMS traffic [3]. According to Informa Telecoms & Media, each user sends an average of 32.6 rich content messages every day [3]. As more and more messages are sent per day, the users’ attention will be increasingly demanded by the large number of proactively incoming messages. This increased mental demand will impair driving performance, which is why an intuitive way of handling incoming proactive events and transferring their content to the driver while driving needs to be found. Speech interfaces offer a less distracting and intuitive way to comfortably control in-vehicle information systems and increase driver safety [11]. Therefore, an intuitive speech interface which presents proactively incoming events to the driver needs to be developed.
    Proactivity in human-machine interaction (HMI) in mobile environments has not gained much attention in the research community recently. Vico et al. [12] compare 2 proactive user interface concepts for a recommender system on a smartphone. The results showed that users prefer a widget-based concept over a status bar notification concept. However, the user interaction only concerned haptic input and visual output on mobile devices and did not involve any speech interaction, which would improve driver safety in the automotive environment. Bader et al. [1] conducted a user study in a real-world driving setup to examine user acceptance of a proactive recommender system. The results show that the proactive recommender system is perceived as helpful and does not distract from driving. Again, only visual output is used to inform the user about new information. A comparison of proactive speech dialog concepts has not been addressed yet. Furthermore, this study does not take the current contextual situation into account. An intelligent user interface needs to be adaptive and has to provide the information according to the current contextual situation.
    In this paper, we evaluate several speech-based proactive notification concepts for incoming events in different contextual situations. We aim to find out which speech interaction concept is the most adequate to inform the user proactively, depending on the current cognitive load and the priority of the incoming message. A speech dialog system (SDS) prototype supported by a graphical user interface (GUI) employing the designed notification concepts has been developed for German users. In a recent driving simulator study, these concepts were evaluated with respect to usability and driving performance. We investigate these measures only during the time frame in which a new message comes in. Keeping the speech interaction active or resuming a task afterwards is not the focus of this research work. The research work is performed within the scope of the EU FP7 funding project GetHomeSafe¹.
    The remainder of the paper is structured as follows: In Section 2, the speech-based proactive notification concepts are briefly described. Section 3 presents the experimental setup and its results. Finally, conclusions are drawn.
2 Proactive Notification Concepts
Different SDS and GUI concepts have been developed in order to simulate proactive incoming events. Depending on the driving situation and the message priority, one notification concept or another might be better accepted by the user. As sending and receiving emails is the most preferred application in the car while driving [6], an email application has been chosen as the use case. In this section, the speech dialog concepts are described first, followed by the different GUI concepts.

¹ http://www.gethomesafe-fp7.eu
2.1 Speech Dialog Concepts
The SDS prototypes have been developed for German users. As we aim to investigate usability and driving performance during the time frame in which a new message comes in, the speech interaction is finished after the system has read out the message to the user and the user has indicated that they want to reply to the message.
2.1.1 Sound Notifications
Sound notifications only alert the user in an unobtrusive way, using a simple sound. The first sound notification concept is an earcon. Earcons are commonly used in HMI to provide information and feedback to the user about computer entities [2]. Here, we employed the Microsoft Outlook² sound file which is played when an email is received. The second sound notification is a slight cough produced by the SDS, which is intended to alert the driver in a more human-like and unobtrusive way. After being alerted by the sound, the user has to request that the message be read out and afterwards has to indicate that they want to reply to it. A sample dialog is illustrated below:
    System:   [notification sound: earcon or cough]
    Driver:   Read out message.
    System:   The message from Ute Ehrlich with the subject “meeting” is: “Dear Mr. Hofmann, ...”
    Driver:   Reply to message.
Sound notification concepts inform the user about newly available information unobtrusively. The user has the control to decide when the content is provided.
2.1.2 Verbal Notifications
Verbal notifications alert the user and already provide content from the delivered message. The first concept only informs the user about the subject and the sender of the incoming message. After being informed by the system, the driver has to request that the message be read out:
         System:   You received a new message from Ute Ehrlich with the subject “Meeting”.
         Driver:   Read out message.
         System:   The message is: “Dear Mr. Hofmann, ...”
         Driver:   Reply to message.
   In the second verbal notification concept, the whole message is read out directly
without a request by the user:
         System: You received a new message from Ute Ehrlich with the subject “Meeting”.
                 The message is: “Dear Mr. Hofmann, ...”
         Driver: Reply to message.
Verbal notification concepts push information directly to the user without consulting the user first. Therefore, these proactive notification concepts are very obtrusive and immediately occupy the driver mentally. The second verbal notification concept requires fewer dialog steps than the other three notification concepts. However, since all the content is presented by the system at the beginning of the interaction, the user might miss some important information and then has to request that the message be repeated.

² http://office.microsoft.com/outlook/
2.2 GUI Design
Different GUIs have been designed in order to support the notification concepts and to draw the user’s attention unobtrusively to an incoming event. When designing the screens, we followed the internationally standardized AAM guidelines [4].
    The different screens and their interaction are illustrated in Figure 1. At the beginning, when the system is waiting for an incoming message, the start screen is presented. Depending on the speech dialog notification concept, different GUI screens are displayed. In case of a sound notification, only an email icon in the top bar of the screen is presented. When the message is read out, an overlay displaying the email’s sender and subject is presented. Once the email has been answered, the start screen appears again. In case of a verbal notification, the email icon appears when a new message comes in, and the GUI immediately displays the email details.
    We also investigated the effect of an avatar (see Figure 2) on usability and driving distraction. The avatar might help to draw the user’s attention to an incoming email but might also lead to a higher level of distraction. By showing human-like gestures, the avatar increases the naturalness of the interaction. At the beginning, when the system is waiting for an incoming message, the same start screen as illustrated in Figure 1 is presented. When the user has to be alerted about an incoming email, the avatar appears and stays on the screen until the user has answered the email. Afterwards, the avatar disappears again.

3 Evaluation
This section explains the experimental setup and procedure, followed by the results.

3.1 Method
3.1.1 Participants
The experiment was conducted at the Daimler AG research site in Ulm, Germany. In total, 25 German participants, consisting of employees, student employees, and externals, took part in the experiment. All participants possessed a valid driver’s license. Due to missing data recordings, the data of one participant had to be excluded from the analyses. Another participant did not feel comfortable during the experiment; the experiment therefore had to be aborted and the data was excluded from the analyses. The remaining participants comprised 13 male and 10 female subjects with an average age of 31.5 years (standard deviation (SD) = 12.8). 61% of the participants drove their car at least once a day. 52% had little to no experience with speech-controlled devices.

3.1.2 Experimental Design
Four speech-based notification concept variants and 2 GUI variants (with and without avatar) have been designed. Each speech concept was combined with each GUI variant, so that 8 different HMI concepts were evaluated in total.

Fig. 1 GUI interaction of the different notification concepts (panels: sound notification, verbal notification).

Fig. 2 Avatar screenshot.

   Each participant encountered all 8 conditions (within-subjects design). During the experiment, 8 tasks had to be accomplished for each condition. We investigated the participants’ speech dialog performance, the user acceptance of the notification concepts in different contextual situations, and the influence on driving performance while using the SDS.
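
The fully crossed design can be written down in a few lines. The following is a minimal sketch, not the study software: the concept labels are taken from the result tables of this paper, while the constant and function names are illustrative assumptions.

```python
from itertools import product

# Labels follow the concept names used in the result tables of the paper;
# the constant and function names are illustrative assumptions.
SPEECH_CONCEPTS = ["Earcon", "Cough", "Inform", "Readout"]
GUI_VARIANTS = ["NoAvatar", "Avatar"]


def build_conditions():
    """Return every speech concept paired with every GUI variant."""
    return list(product(SPEECH_CONCEPTS, GUI_VARIANTS))


conditions = build_conditions()
for speech, gui in conditions:
    print(f"{speech:8s} + {gui}")
print(len(conditions), "HMI concepts in total")
```

Crossing the 4 speech concepts with the 2 GUI variants yields the 8 conditions that every participant encounters.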

3.1.3 Materials
Speech Dialog Prototype
For the experiment, an SDS employing the different notification HMI concepts described in Section 2 has been developed. The SDS simulates incoming emails, which are pushed at random times. The emails were selected randomly and presented to the user applying the different HMI concepts in a random order.
    During the experiment, the participants had to solve several tasks. For each incoming email, the participant had to retrieve the content by using the SDS and had to reply to the message. The email topics were divided into business and leisure in order to give the emails different levels of importance.
    After the participant had indicated that they wanted to answer the email, a control question about the content of the email was asked to find out whether the participant had retrieved the content of the message. The control question was only asked when a message with high priority was presented, in order to emphasize the importance of high priority messages. If the answer was correct, the task was accomplished successfully. One of the goals of the study was to find out which HMI concept was most adequate in which situation. Therefore, after each email we asked the participants whether they found the way the content was presented to be obtrusive (1: “too obtrusive”, 0: “adequate”, -1: “insufficiently obtrusive”).
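
The three-level obtrusiveness scale lends itself to simple averaging: a mean near 0 indicates an adequate concept, a positive mean a concept that tends to be too obtrusive, and a negative mean one that tends to be insufficiently obtrusive. The sketch below illustrates this reading under stated assumptions: the rating codes come from the question above, whereas the tolerance band and the example ratings are hypothetical and not taken from the study.

```python
from statistics import mean

# Rating codes as defined in the obtrusiveness question of the study.
TOO_OBTRUSIVE, ADEQUATE, INSUFFICIENT = 1, 0, -1


def average_on(ratings):
    """Average a list of obtrusiveness ratings, each -1, 0 or 1."""
    if any(r not in (TOO_OBTRUSIVE, ADEQUATE, INSUFFICIENT) for r in ratings):
        raise ValueError("ratings must be -1, 0 or 1")
    return mean(ratings)


def describe(avg, tolerance=0.25):
    """Map an average to a verbal tendency; the tolerance is an assumption."""
    if avg > tolerance:
        return "tends to be too obtrusive"
    if avg < -tolerance:
        return "tends to be insufficiently obtrusive"
    return "adequate"


# Hypothetical ratings, not data from the study.
example = [1, 0, 1, 0, 0, 1, -1, 1]
avg = average_on(example)
print(round(avg, 3), describe(avg))
```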

Questionnaire
During the experiment different questionnaires were used:
• Preliminary Interview: collects demographic data about the participants.
• Subjective Assessment of Speech System Interfaces (SASSI) questionnaire [7]:
  covers 6 dimensions and is widely used for the subjective usability evaluation
  of SDS. As the speech interaction is very limited, only the relevant dimensions
  “system response accuracy”, “annoyance” and “speed” were used, which resulted in
  18 questions on a 5-point Likert scale (-2, ..., 2).
• Driving Activity Load Index (DALI) questionnaire [10]: covers 6 dimensions to
  evaluate the user’s cognitive load. We selected the 4 dimensions visual demand,
  auditory demand, temporal demand and interference, which were relevant for
  the comparison of the 8 conditions and their effects on the driving performance.
  For each dimension, one question was asked on a 6-point scale (0, ..., 5).
• Final Interview: asks questions about the usefulness of an avatar and its effect
  on cognitive load on a 5-point Likert scale (-2, ..., 2).

Driving Simulation Setup
The experiment was conducted in the driving simulator lab (see Figure 3). The participants sat in the driver’s seat of a car placed in front of a 75-inch flat screen TV on which the driving simulation was running. The participants controlled the driving simulation with the car’s steering wheel and pedals. During the experiment, the examiner sat at the control desk next to the car.
    Previous driving simulation studies employ the standard Lane Change Test (LCT) by Mattes [9], which does not place a continuous mental demand on the user. Furthermore, the LCT is based on single tracks, which limits the recordings to a certain time. We employed the ConTRe (Continuous Tracking and Reaction) task [8] as part of the OpenDS³ driving simulation software, which complements the de facto standard LCT with higher sensitivity and a more flexible driving task without restart interruptions. The steering task for lateral control resembles a continuous car-following task, which helps to obtain more detailed results.

Fig. 3 Driving simulator lab: (a) external view, (b) driver perspective.

   In order to simulate different cognitive load levels, the driver workload evoked by the driving simulation is varied. OpenDS allows parameters to be set that generate different levels of difficulty of the ConTRe task; these concern the lateral speed and the frequency of movement in the lateral control task. Here, we employ a low and a high difficulty level whose parameters have been determined experimentally.

3.1.4 Procedure
In the experiment, 8 conditions were evaluated. These 8 HMI concept variants were presented to the user in different contextual situations.
   The experiment was split into 2 main blocks, in which the SDS prototypes had to be used under different driver workload conditions (low and high). The order of the 2 blocks was counterbalanced between participants to control for learning and order effects. Within one block, each of the 8 conditions appeared 4 times in random order while driving: for each condition, 2 emails with high priority and 2 emails with low priority were presented to the user. After each email, the examiner asked the control question in case of an email with high priority and always asked the obtrusiveness question. Subsequently, the examiner resumed the driving simulation and the participant continued driving. In total, 32 tasks had to be accomplished in each block. After having finished all the tasks within one block, the participants had to fill out the DALI questionnaire.

³ www.opends.eu
    The overall procedure of the experiment was as follows. First of all, participants had to complete the preliminary interview. Afterwards, they got to know the driving simulation in a test drive lasting at least 4 minutes. Subsequently, the participants completed a two-minute baseline drive under both workload conditions. The order of the 2 baseline drives was counterbalanced between participants. Afterwards, the participants were shown an instruction video of the SDS, and the tasks, including the task priority and the follow-up questions, were explained. Next, the participants became familiar with the SDS by performing 4 trial tasks. Before the data collection was conducted, the participants were given further instructions to put them in the situation of the intended scenario. In order to motivate the participants, they were told that a high number of correctly answered control questions and a good driving performance throughout the experiment would have a positive effect on the payment they would receive in the end. Then, the first data collection block was conducted. After a short break, the second block was performed, followed by 2 further baseline drives. Finally, the participants had to fill out the SASSI and the final questionnaire.
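
The block structure described in this section can be illustrated with a short schedule generator. This is a sketch under assumptions rather than the tool used in the study: alternating the workload order by participant index is only one possible counterbalancing scheme, and the seed handling and all names are hypothetical.

```python
import random
from itertools import product

SPEECH_CONCEPTS = ["Earcon", "Cough", "Inform", "Readout"]
GUI_VARIANTS = ["NoAvatar", "Avatar"]
PRIORITIES = ["high", "high", "low", "low"]  # 2 high and 2 low per condition


def block_schedule(rng):
    """Return the 32 shuffled (speech, gui, priority) tasks of one block."""
    tasks = [
        (speech, gui, priority)
        for speech, gui in product(SPEECH_CONCEPTS, GUI_VARIANTS)
        for priority in PRIORITIES
    ]
    rng.shuffle(tasks)
    return tasks


def session_plan(participant_index, seed=0):
    """Alternate the workload order by participant index (an assumption)."""
    rng = random.Random(seed + participant_index)
    order = ["low", "high"] if participant_index % 2 == 0 else ["high", "low"]
    return [(workload, block_schedule(rng)) for workload in order]


plan = session_plan(participant_index=3)
for workload, tasks in plan:
    print(workload, "workload:", len(tasks), "tasks, first:", tasks[0])
```

Each block contains 8 conditions times 4 emails, which gives the 32 tasks per block stated above.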
3.1.5 Dependent Variables
The driving simulation OpenDS produces log files at run time. The driving performance was only recorded during the speech dialogs. After each task, the examiner logged the task success and the obtrusiveness.
    Based on the collected data, the following measures were computed in order to evaluate usability and driving performance. Based on the examiner’s logs, the task success (TS) of each speech dialog and the obtrusiveness (ON) of each task are assessed. Since the recognizer vocabulary was very limited and recognition errors were not the focus of this paper, the word accuracy is not computed. A subjective usability assessment is obtained by employing the SASSI questionnaire. Based on the OpenDS logs, we compute the mean deviation (MDev) of the steering wheel during each speech dialog. In order to assess subjective driver workload, the DALI questionnaire is analyzed.
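
To make the MDev measure concrete, the following sketch averages the steering deviation over the samples that fall inside one speech dialog. It rests on two assumptions that the paper does not confirm: that MDev is the mean of the absolute deviations, and that the log can be reduced to (timestamp, deviation) pairs. The sample layout is a hypothetical stand-in and not the actual OpenDS log format.

```python
def mean_deviation(samples, start, end):
    """Mean absolute steering deviation of samples with start <= t <= end.

    samples: iterable of (timestamp_s, deviation) pairs. This layout is a
    hypothetical stand-in, not the actual OpenDS log format.
    """
    window = [abs(dev) for t, dev in samples if start <= t <= end]
    if not window:
        raise ValueError("no samples inside the dialog window")
    return sum(window) / len(window)


# Hypothetical log excerpt and dialog window, for illustration only.
log = [(0.0, 0.02), (0.5, -0.10), (1.0, 0.20), (1.5, -0.15), (2.0, 0.05)]
print(mean_deviation(log, start=0.5, end=1.5))
```

Restricting the average to the dialog window mirrors the statement that driving performance was only recorded during the speech dialogs.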
    Depending on the contextual situation, different results are expected. During high driver workload, we expect a better usability evaluation for the sound notification concepts than for the verbal notification concepts because of the high obtrusiveness of the verbal notification concepts. During low driver workload, drivers might accept the verbal notification concepts better because they do not have to concentrate on the primary task that much. Concerning messages with high priority, we expect the verbal notification concepts to be better accepted because the important content is directly presented to the user. Drivers might accept the sound notification concepts better when messages with low priority are presented. Furthermore, we expect the sound notification concepts to distract less than the verbal notification concepts because the user can decide when the content is presented. Concerning the influence of the GUI on the driving performance, we expect the avatar to cause more driver distraction due to glances at the GUI screen.

3.2 Results
In the following, the most relevant results concerning usability and driving performance are presented. The results presented in this paper show the overall results when comparing the different speech dialog concepts and the GUI concepts. In the comparison of the speech dialog concepts, only the data in which the avatar is not present is used. When the GUI concepts are compared, the different speech dialog concepts are ignored. Concerning the ON of the different speech dialog concepts, detailed results with reference to the different driver workload and priority levels are presented. A detailed analysis comparing all 8 HMI concepts with reference to the contextual situations will be performed in a next step.
   In total, 730 dialogs during low and 730 dialogs during high driver workload were transcribed and analyzed. First, the results of the usability evaluation are described, followed by the driving performance. For the analyses of the data, repeated measures ANOVA tests were computed. Contrast analyses were applied in order to compare the notification concepts with one another.
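
For readers unfamiliar with the test, a one-way repeated measures ANOVA can be sketched as follows. This is a textbook formulation in plain Python and not the analysis script of the study: it omits the p-value, the sphericity check and the contrast analyses, and the example data are hypothetical.

```python
def rm_anova(data):
    """One-way repeated measures ANOVA.

    data: list of rows, one per participant, one column per condition.
    Returns (F, df_effect, df_error, partial_eta_squared).
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    # Partition the total variation into condition, subject and error parts.
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_cond - ss_subj

    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    f_value = (ss_cond / df_cond) / (ss_error / df_error)
    eta_p2 = ss_cond / (ss_cond + ss_error)
    return f_value, df_cond, df_error, eta_p2


# Hypothetical scores of 4 participants under 3 conditions.
scores = [[1, 2, 3], [2, 3, 5], [3, 3, 4], [2, 4, 6]]
f_value, df1, df2, eta_p2 = rm_anova(scores)
print(f"F({df1}, {df2}) = {f_value:.2f}, partial eta squared = {eta_p2:.2f}")
```

Removing the between-subject variation from the error term is what distinguishes this test from a between-groups ANOVA and suits a design in which every participant encounters all conditions.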

3.2.1 Usability
In this section, the results of the comparison of the speech dialog concepts are presented first, followed by the results of the comparison of the GUI concepts.
Comparison of Speech Dialog Concepts
Table 1 shows the TS of the different speech dialog concepts. All concepts achieve
more than 83% of TS. No significant differences between the concepts were found.
Table 1 Average TS comparing the speech dialog concepts.
            Earcon  Cough  Inform  Readout
TS [%]        85      83     88      85

    Figure 4 illustrates the ON results of the respective speech dialog concepts with reference to the different driver workload levels (DL_L, DL_H) and priority levels (P_L, P_H). No main effects concerning the driver workload or the message priority were found. Overall, “Earcon” was found to be the least obtrusive concept (F(1, 43) = 178.424, p < 0.001, η² = 0.81). However, “Earcon” tends to be insufficiently obtrusive. In contrast, “Cough” and “Readout” tend to be too obtrusive. “Inform” appears to be the most adequate concept for all conditions.
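
As a plausibility check, the effect size can be recomputed from the F statistic and its degrees of freedom. The worked equation below assumes that the reported η² is the partial eta squared; the paper does not state which variant was used.

```latex
\eta_p^2
  = \frac{F \cdot df_{\text{effect}}}{F \cdot df_{\text{effect}} + df_{\text{error}}}
  = \frac{178.424 \cdot 1}{178.424 \cdot 1 + 43}
  = \frac{178.424}{221.424}
  \approx 0.81
```

Under this assumption, the recomputed value agrees with the effect size reported for the “Earcon” contrast.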
Fig. 4 Average ON comparing the speech dialog concepts with reference to the different driver workload and priority levels.
                    Earcon  Cough  Inform  Readout
ON (P_L, DL_L)      -0.59   0.48    0.00    0.52
ON (P_L, DL_H)      -0.55   0.50    0.07    0.48
ON (P_H, DL_L)      -0.50   0.30   -0.05    0.43
ON (P_H, DL_H)      -0.52   0.41   -0.14    0.41

   In Figure 5, the overall SASSI result for each speech dialog concept is presented. “Inform” was the most preferred concept (F(1, 18) = 17.67, p < 0.001) and “Cough” was the least accepted by the participants (F(1, 18) = 19.65, p < 0.001).
Comparison of GUI Concepts
Figure 6 presents the average ON when comparing the 2 GUI concepts. Both variants seem to be adequate in obtrusiveness. No significant differences were found when comparing the GUI showing the avatar with the GUI without the avatar.
   In the final questionnaire, the participants stated that the avatar did not help to inform them about incoming emails (MV = −1.12, SD = 1.15). Furthermore, the presence of an avatar was generally perceived negatively (MV = −0.79, SD = 1.10).
Fig. 5 Average SASSI overall result comparing the speech dialog concepts.
                        Earcon  Cough  Inform  Readout
SASSI overall result     0.81   -0.06   1.02    0.29

Fig. 6 Average ON comparing the GUI concepts.
        Avatar  NoAvatar
ON       0.15     0.09

3.2.2 Driving Performance
The driving performance results confirm that, as intended, the average MDev during low driver workload (MDev = 0.061) was significantly lower (F(1, 88) = 963.56, p < 0.001, η² = 0.02) than during high driver workload (MDev = 0.184).
   In the following, the results of the comparison of the 4 speech dialog concepts are presented, followed by the results of the comparison of the 2 GUI concepts.
Comparison of Speech Dialog Concepts
When the participants used the SDS while driving (MDev = 0.125) the MDev was
higher compared to the baseline drives (MDev = 0.105). However, the difference
was not significant.
   Figure 7 shows the average MDev when comparing the 4 speech dialog concepts.
No significant differences could be revealed between the 4 concepts.
   In Figure 8, the overall results of the DALI questionnaire for each speech dialog
concept are presented. As illustrated in Figure 8, the 4 concepts were generally
rated as only slightly distracting. The “ReadOut” concept was found to be the most
distracting (F(1, 22) = 18.00, p < 0.001, η² = 0.45) and “Earcon” was the least
distracting speech dialog concept (F(1, 22) = 21.17, p < 0.001, η² = 0.49).
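
The effect sizes reported for the DALI comparison can be related to the F statistics with a short calculation. The sketch below assumes that η² denotes partial eta squared, which the text does not state explicitly; the function name `partial_eta_squared` is illustrative. Under this assumption, the effect size is F · df_effect / (F · df_effect + df_error).

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics and degrees of freedom as reported for the DALI results.
reported = [
    ("ReadOut (DALI)", 18.00, 1, 22),
    ("Earcon (DALI)", 21.17, 1, 22),
]

for label, f_value, df_effect, df_error in reported:
    value = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{label}: {value:.2f}")
# ReadOut (DALI): 0.45
# Earcon (DALI): 0.49
```

Both results agree with the reported values of 0.45 and 0.49, which is consistent with the partial eta squared reading.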
Comparison of GUI Concepts
Figure 9 shows the average MDev for the GUI concept with avatar and the
concept without avatar. The MDev was significantly higher when the avatar
was displayed on the screen (F(1, 261) = 11.09, p < 0.001, η² = 0.04).
   In the final questionnaire, the participants stated that they did not pay much
attention to the avatar (MV = −1.00, SD = 1.14) and that the avatar tended not to
distract them from driving (MV = −0.62, SD = 1.58).

        Proceedings of 5th International Workshop on Spoken Dialog Systems
        Napa, January 17-20, 2014
Fig. 7 Average MDev (left) comparing the speech dialog concepts (axes: MDev, RT [s]; categories: Earcon, Cough, Inform, Readout).

Fig. 8 Average DALI overall result comparing the speech dialog concepts (axis: DALI overall result; categories: Earcon, Cough, Inform, Readout).

Fig. 9 Average MDev comparing the GUI concepts (axes: MDev, RT [s]; categories: Avatar, NoAvatar).

3.3 Discussion
The results show that interacting with the SDS and responding to proactive events
did not negatively affect the steering performance of the participants.
   The participants were able to perform the tasks successfully using the 4 speech
dialog concepts. The results show that there are significant differences in usability
between the concepts. The use of Earcons was generally accepted by the
participants but seems to be insufficiently obtrusive. Earcons achieve the best DALI
result, which confirms their unobtrusiveness. Using sounds as signals to alert the
user is common in today’s cars, which may be why the participants
accepted this concept. “Cough” achieves the worst SASSI result. This may be because
participants are not used to such natural behavior from a machine and
might therefore have missed the notification sound. Informing the user about a
new incoming message is the most accepted speech dialog concept and seems to be
the most adequate in obtrusiveness. Reading out a message immediately achieves the
worst DALI result and appears to be too obtrusive, possibly because all the
information is presented at once, which overloads the user mentally.
   The use of an avatar did not help improve the interaction and was not accepted
by the participants. Although participants indicated that they did not pay much
attention to the avatar, impaired steering performance was observed when the avatar
was displayed on the screen.
4 Conclusions
This paper reports on a recent driving simulation study in which several
speech-based proactive notification concepts for incoming events in different contextual
situations were evaluated. 4 different speech dialog concepts and 2 GUI concepts,
one including an avatar, were designed. An SDS prototype supported by a GUI
employing the designed notification concepts was developed and evaluated on usability
and driving performance. The results show that the proactive presentation of
information by speech did not negatively affect the steering deviation. They also show
that there are significant differences when comparing the speech dialog concepts:
overall, informing the user verbally achieves the best result concerning usability.
Earcons are perceived to be the least distracting. The presence of an avatar was not
accepted by the participants and led to impaired steering performance.
    In the next step, we will analyze all evaluation measures in detail with reference
to the different driver workload and priority levels. Furthermore, we will evaluate
the driving performance in different time periods during the speech interaction.


References

 1. Bader, R., Siegmund, O., Woerndl, W.: A study on user acceptance of proactive in-vehicle
    recommender systems. In: 3rd International Conference on Automotive User Interfaces and
    Interactive Vehicular Applications (AutomotiveUI 2011) (2011)
 2. Blattner, M.M., Sumikawa, D.A., Greenberg, R.M.: Earcons and icons: their structure and
    common design principles. Hum.-Comput. Interact. 4(1), 11–44 (1989)
 3. Clark-Dickson, P., Talmesio, D., Sims, G.: VoIP and IP messaging: Operator strategies to
    combat the threat from OTT players (revised and updated). Tech. rep., Informa Telecoms &
    Media (2013)
 4. Driver Focus-Telematics Working Group: Statement of principles, criteria and verification
    procedures on driver interactions with advanced in-vehicle information and communication
    systems. Alliance of Automobile Manufacturers (2002)
 5. Governors Highway Safety Association: Distracted driving: What research shows and what
    states can do. Tech. rep., U.S. Department of Transportation (2011)
 6. Hofmann, H., Ehrlich, U., Berton, A., Minker, W.: Speech interaction with the internet - a user
    study. In: Proceedings of Intelligent Environments. Guanajuato, Mexico (2012)
 7. Hone, K.S., Graham, R.: Subjective assessment of speech-system interface usability. In: Pro-
    ceedings of Eurospeech (2001)
 8. Mahr, A., Feld, M., Moniri, M.M., Math, R.: The ConTRe (continuous tracking and
    reaction) task: A flexible approach for assessing driver cognitive workload with high
    sensitivity. In: A.L. Kun, L.N. Boyle, B. Reimer, A. Riener (eds.) Adjunct Proceedings of
    the 4th International Conference on Automotive User Interfaces and Interactive Vehicular
    Applications (AutomotiveUI 2012), October 17-19, Portsmouth, New Hampshire, United States,
    pp. 88–91. ACM, ACM Digital Library (2012)
 9. Mattes, S.: The lane-change-task as a tool for driver distraction evaluation. Proceedings of
    IGfA pp. 1–30 (2003)
10. Pauzie, A.: Evaluating driver mental workload using the driving activity load index (DALI).
    In: Proceedings of European Conference on Human Interface Design for Intelligent Transport
    Systems, pp. 67–77 (2008)
11. Peissner, M., Doebler, V., Metze, F.: Can voice interaction help reducing the level of distraction
    and prevent accidents? Meta-study on driver distraction and voice interaction. Tech. rep.,
    Fraunhofer-Institute for Industrial Engineering (IAO) and Carnegie Mellon University (2011)
12. Vico, D.G., Woerndl, W., Bader, R.: A study on proactive delivery of restaurant
    recommendations for Android smartphones. In: Workshop Personalization in Mobile Applications,
    ACM Recommender Systems Conference (2011)
