IAAA / PSTALN Dialog systems - Benoit Favre last generated on January 20, 2020 - page du TP

Page created by Andrea Chen
 
IAAA / PSTALN
                               Dialog systems

                Benoit Favre 

                        Aix-Marseille Université, LIS/CNRS

                     last generated on January 20, 2020

Benoit Favre (AMU)                PSTALN: Dialog             January 20, 2020   1 / 35
What is a dialog system?

   Definition :
      ▶   Input/output with natural language
      ▶   Free use of language
      ▶   Reproduces human agent behavior
      ▶   Reply (or not) in natural language
      ▶   Uni/multimodal
   Spoken Dialog System (SDS)
      ▶   Interactive system with spoken language
      ▶   Required when using an acoustic communication channel only (phone)
      ▶   Can free other modalities (hands free)
   Difficulty
      ▶   No control of inputs
      ▶   Contextualize information
      ▶   Automatic speech recognition → transcript errors

Examples
  Online discussion
     ▶   Eliza, virtual therapy
            ⋆   https://www.eclecticenergies.com/ego/eliza
     ▶   Mitsuku (best chatbot at the Loebner Prize 2013)
            ⋆   http://www.mitsuku.com/
  Automated voice services
     ▶   “To erase a message, say erase..."
  Customer care
     ▶   1013 (in France): describe your problem freely
     ▶   Air Travel Information System (ATIS)
  Assistants
     ▶   Clippy
     ▶   SIRI

Learning with an intelligent agent

   Replaces teacher in MOOC

Interaction with a robot

Caroline Lyon, Chrystopher L. Nehaniv, Joe Saunders, Interactive Language Learning by Robots: The Transition from Babbling to Word Forms, PLoS One, 2012

Nao (http://www.aldebaran-robotics.com)

Problems

  What’s a dialog?
     ▶   Study human behavior
  How to understand a sentence?
     ▶   Rule-based and template-based system
     ▶   Robust concept detection
  What strategies to make a successful dialog?
     ▶   Enforce local coherency
     ▶   Finite state machine
     ▶   Explore possible futures
  How to formulate an answer
     ▶   Language and speech synthesis

Corpus-based study
   Human-human dialogues
     ▶   Language in the wild
     ▶   Want to build a program that replicates human behavior
     ▶   Difficult to ignore non-verbal interactions
     ▶   Humans behave differently in front of a system
   Wizard of oz (WoZ)
     ▶   Replace the system by a human operator
     ▶   Make the user believe that it is a real system (simulate errors / wait
         time...)
     ▶   Collect users dialog strategies
   Human-machine dialog
     ▶   Collect interactions with an existing system
     ▶   Use these data to evaluate / improve the system
   Simulation
     ▶   From a user model, simulate human input
     ▶   Detect infinite loops
     ▶   Estimate resolution time
     ▶   Train system parameters
Human dialogs

  Sequence of spoken turns
     ▶   Between two or more people
  Who speaks after whom?
     ▶   Interruptions
     ▶   Finish someone else’s sentence
     ▶   Overlap
  Establishing common ground
     ▶   Use of a common vocabulary
     ▶   Acquiescence, reformulation, convergence

Speech acts

     Theory (Austin & Searle)
       ▶   Meaning can be expressed in terms of actions instead of concepts
           (declarative logical form)
     Examples:
       ▶   “I apologize"
       ▶   “Can you do this?"
    Types of dialog acts
       ▶   verdictives or judicial acts (acquit, convict, decree)
       ▶   exercitives (demote, command, order, pardon, bequeath)
       ▶   commissives (promise, vow, guarantee, bet, swear to)
       ▶   behabitives (apologize, thank, deplore, criticize)
       ▶   expositives (affirm, deny, postulate, remark)

Cooperation principle (Grice)

   For a conversation to be successful, speakers must cooperate:
      ▶   Quantity: give the right amount of information
      ▶   Quality: tell the truth
      ▶   Relevance: say important things
      ▶   Manner: be clear, brief and structured
   Example (Mitkov - Computational Linguistics)
      ▶   Quantity: Marie ate some chocolate → Marie did not eat all the
          chocolate
      ▶   Quality: (about an invoice) It costs an arm and a leg → It costs a lot
      ▶   Relevance: A: Can I watch TV? B: It’s bath time.
      ▶   Manner: Are you ready? vs Are you ready or are you not ready?
   Integration in a dialogue system:
      ▶   The user follows them if he has something to gain
      ▶   Should the system follow these principles?

Dialog acts (DA)
   Dialog acts: a specialization of speech acts
   Subdivision of a speaking turn into “intentions"
     ▶   Question
            ⋆   Closed / open questions
     ▶   Declaration
            ⋆   Short answers
     ▶   Disfluencies
            ⋆   Repetitions
            ⋆   Wrong pronunciation
     ▶   Interruption
     ▶   Filled pauses
     ▶   MRDA / DAMSL: specification of more than 160 types
   Task-oriented dialogue act categories
     ▶   Greetings
     ▶   Opening
     ▶   Negotiation
     ▶   Closing
     ▶   Good-bye
Anatomy of a dialog system
   [Figure: processing pipeline. question → Automatic transcription → words
   → Syntactic analysis → syntactic tree → Semantic analysis → concepts,
   relations → Dialog manager → logical representation → Syntactic generation
   → primitive syntax → Lexical generation → words, prosody → Speech
   synthesis → answer]

   Comprehension
     ▶   Dialogue acts
     ▶   Grammar
     ▶   Concepts
   Dialog management
     ▶   Matching
     ▶   Finite state machines
     ▶   Exploration
   Generation
     ▶   Templates
     ▶   Statistical generation
DA classification
   Model domain dialog acts (politeness, commands, information, ...)
   Corpus: (15651 instances, 66 classes)
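      ▶   “je n’ai plus de tonalité sur ma ligne” (“I no longer have a dial tone on my line”) → interruption_ligne
      ▶   “j’ai un problème suite euh déménagement” (“I have a problem following uh moving house”) → mise_en_service
      ▶   “euh téléphone illimité en panne” (“uh unlimited phone out of order”) → internet_voip
      ▶   “le clignotant reste toujours allumé” (“the indicator light always stays on”) → messagerie_vocale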
   Classifier
      1   Extract word n-grams
             ⋆   je je_n'ai je_n'ai_plus n'ai_plus_de plus_de_tonalité...
      2   Use them as features of a classifier (results from mlcomp.org)
             ⋆   0.207   SMO_weka_nominal
             ⋆   0.216   boostexter
             ⋆   0.232   sgd-logistic-stepsize0.3-iter5
             ⋆   0.272   liblinear-s6-B1
      3   Most relevant features
             ⋆   “plus de tonalité"
             ⋆   “sur ma ligne"
             ⋆   “une autre demande"
             ⋆   “ai un problème"
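
The feature extraction step above can be sketched as follows. This is a minimal illustration, not the course's code: the function name `word_ngrams`, whitespace tokenization, and the maximum order of 3 are assumptions inferred from the example features listed on the slide.

```python
def word_ngrams(sentence, max_n=3):
    """Return all word n-grams of order 1..max_n, joined with underscores."""
    words = sentence.split()  # assumption: plain whitespace tokenization
    features = []
    for n in range(1, max_n + 1):
        for start in range(len(words) - n + 1):
            features.append("_".join(words[start:start + n]))
    return features


features = word_ngrams("je n'ai plus de tonalité")
print(features)
```

Each n-gram then becomes one binary or count feature of a linear classifier such as those compared above.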
Grammars
    Extension of classic syntagmatic grammars
       ▶   Integrate domain semantics into grammar

                                              Request

                   Action                               Travel

                I would like   Mean       Departure     Arrival        Time

                              a train   from Paris   to Marseille   in the afternoon

    Grammar example:

Request -> $Action $Travel
Travel -> $Mean $Departure $Arrival $Time
Action -> I would like | I would have to | have you got
Mean -> a train | a flight | a plane
Departure -> from $City | starting from $City
Arrival -> to $City | arriving at $City
Time -> in the afternoon | today | tomorrow | at $Hour | the $Date
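
One possible way to run such a grammar is to expand it into a regular expression with one named group per slot. The sketch below makes several assumptions that are not on the slide: the `City` rule is a hypothetical toy list, the time rule is named `Time` and omits the `$Hour` and `$Date` alternatives (which the slide leaves undefined), and the helper names `to_regex` and `parse` are illustrative.

```python
import re

# Rules copied from the slide, except "City" (hypothetical toy list)
# and "Time" (restricted to alternatives that need no further rule).
GRAMMAR = {
    "Request": "$Action $Travel",
    "Travel": "$Mean $Departure $Arrival $Time",
    "Action": "I would like | I would have to | have you got",
    "Mean": "a train | a flight | a plane",
    "Departure": "from $City | starting from $City",
    "Arrival": "to $City | arriving at $City",
    "Time": "in the afternoon | today | tomorrow",
    "City": "Paris | Marseille | Lyon",
}
# Non-terminals reported as slots; each occurs once in the expanded regex.
SLOTS = {"Action", "Mean", "Departure", "Arrival", "Time"}


def to_regex(symbol):
    """Expand a non-terminal into a regex (the grammar is not recursive)."""
    alternatives = []
    for alternative in GRAMMAR[symbol].split("|"):
        parts = [to_regex(tok[1:]) if tok.startswith("$") else re.escape(tok)
                 for tok in alternative.split()]
        alternatives.append(r"\s+".join(parts))
    body = "|".join(alternatives)
    return f"(?P<{symbol}>{body})" if symbol in SLOTS else f"(?:{body})"


def parse(sentence):
    """Return a slot dictionary, or None if the sentence is not covered."""
    match = re.fullmatch(to_regex("Request"), sentence.strip())
    return match.groupdict() if match else None


result = parse("I would like a train from Paris to Marseille in the afternoon")
print(result)
```

A sentence outside the grammar returns `None`, which illustrates why pure grammar-based understanding is brittle on spontaneous speech.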

Concept detection
   Example: from Media corpus (WoZ, hotel reservation)
     ▶   Detect triplets (modality, type, value) linked to a task
     ▶   heu [+:reponse:oui oui] [+:connect:opposition mais] j’aimerai
         d’abord savoir si le [+:hotel:Ibis ibis] il y a un
         [?:service:jacuzzi jacuzzi] et une [?:service:piscine piscine] si
         ils acceptent [?:service:animaux les chiens]
   Formulate as sequence prediction
     ▶   BIO formalism (begin-inside-outside)
     ▶   Model type HMM/RNN/CRF
     ▶   Features on words, pos-tags...
                           ...         ...
                           occasion    O
                           de          O
                           la          B-evenement
                           fête        I-evenement
                           du          I-evenement
                           cheval      I-evenement
                           donc        B-connect
                           euh         O
                           à           B-loc-relative
                           proximité   I-loc-relative
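
Once a sequence model has predicted BIO tags, the concepts are recovered by grouping each B- tag with the I- tags that follow it. The sketch below shows only this decoding step on the tokens of the example above; the function name `bio_to_spans` is an assumption, and the tagger itself (HMM, RNN or CRF) is not shown.

```python
def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into (concept_type, text) spans."""
    spans = []
    current_type, current_words = None, []

    def flush():
        if current_type is not None:
            spans.append((current_type, " ".join(current_words)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("I-") and tag[2:] == current_type:
            current_words.append(token)      # continue the open span
        elif tag == "O":
            flush()                          # close any open span
            current_type, current_words = None, []
        else:
            # "B-x", or an "I-x" with no matching open span: start a new span
            flush()
            current_type, current_words = tag[2:], [token]
    flush()
    return spans


tokens = ["occasion", "de", "la", "fête", "du", "cheval", "donc", "euh", "à", "proximité"]
tags = ["O", "O", "B-evenement", "I-evenement", "I-evenement", "I-evenement",
        "B-connect", "O", "B-loc-relative", "I-loc-relative"]
spans = bio_to_spans(tokens, tags)
print(spans)
```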

Dialogue management

  [Figure: the dialogue manager takes the semantic analysis as input, relies
  on a user model, a task model, a discourse model and a world model, and
  outputs a response and commands]

  Task model: expected requests, possible commands
  User model: do not tell the user what he already knows, predict the
  following sentence (objectives, knowledge, interests)
  Discourse model: conversation history (pronouns resolution), dialogue
  states, what to do at a given time
  World model: general knowledge for understanding

ELIZA: the psychoanalyst

   Emacs: M-x doctor

ELIZA: How does it work?
   Key words → responses
     ▶   “HELLO” → “How are you today... What would you like to talk
         about?”
     ▶   “CAN YOU” → “Don’t you believe that I am capable of
Dialog management: AIML (ALICE)

    AIML (Artificial Intelligence Markup Language)

        <category>
          <pattern>WHO IS YOUR DADDY?</pattern>
          <template>Steve</template>
        </category>
    
    Recursive rules
    Memorize previous answer and topic (broad category)
    Synonyms
    Random
    Learn new responses from user input

ALICE: how to make a bot ?
     Available source code
        ▶   Program AB: https://code.google.com/p/program-ab/
        ▶   Knowledge base:
            https://code.google.com/p/aiml-en-us-foundation-alice/
     Question/answers
        ▶   Yes ("Is violet a color?")
        ▶   No ("Are fish mammals?")
        ▶   Sometimes ("Is the sky blue?")
     Reductions
        ▶   Synonyms: “Hello", “Hi there", “Howdy" → “Hi".
        ▶   Simplification: “I am feeling very happy right now" → “I am happy".
        ▶   Input segmentation: “Yes my name is Jim" → “Yes" + “My name is
            Jim".
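
The reductions above amount to rewriting the input into a simpler canonical form and matching again. The sketch below imitates this with a toy rule list in Python rather than AIML; the rule format, the replies, and the function names are all hypothetical, and input segmentation is not covered.

```python
import re

# Hypothetical toy rule base: pattern -> ("reduce", new input) or ("say", reply).
# "*" is a wildcard whose match is copied into the right-hand side.
RULES = [
    ("HELLO", ("reduce", "HI")),
    ("HI THERE", ("reduce", "HI")),
    ("HOWDY", ("reduce", "HI")),
    ("I AM FEELING VERY * RIGHT NOW", ("reduce", "I AM *")),
    ("HI", ("say", "Hi there!")),
    ("I AM *", ("say", "Why are you *?")),
]


def normalize(text):
    """Upper-case, drop punctuation, collapse whitespace."""
    return " ".join(re.sub(r"[^A-Z0-9 ]", " ", text.upper()).split())


def match(pattern, text):
    """Return the wildcard capture ('' if no wildcard), or None if no match."""
    if "*" not in pattern:
        return "" if pattern == text else None
    prefix, suffix = pattern.split("*")
    long_enough = len(text) > len(prefix) + len(suffix)
    if long_enough and text.startswith(prefix) and text.endswith(suffix):
        return text[len(prefix):len(text) - len(suffix)]
    return None


def respond(text, max_depth=10):
    """Apply the first matching rule; reductions rewrite the input and retry."""
    text = normalize(text)
    for _ in range(max_depth):  # bound the recursion to avoid rule loops
        for pattern, (kind, value) in RULES:
            star = match(pattern, text)
            if star is None:
                continue
            result = value.replace("*", star)
            if kind == "say":
                return result
            text = result
            break
        else:
            return "I do not understand."
    return "I do not understand."


print(respond("Howdy!"))
print(respond("I am feeling very happy right now"))
```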
Personality
 age = 15
 baseballteam = Red Sox
 birthday = Nov. 23, 1995
 birthplace = Bethlehem, Pennsylvania
 boyfriend = I am single
 celebrities = Oprah, Steve Carell, John Stewart, Lady Gaga
 ...
Voice XML
    W3C standard
    XML representation of a dialogue

             Please choose airline, hotel, or rental car.
        
             [airline hotel "rental car"]
        
             You have chosen .
        
    Can specify a grammar for the fields to be filled and the order in
    which to fill them
    Code evaluation ()
Information retrieval-based system
   Find most relevant answer for a question
      ▶   Example: Stack Overflow

   Multi-user: ask a question to another user (chatbots)
Automaton

  Finite state machine to represent dialogue states
     ▶   Don’t list flights until all the fields are filled
  Entry into a state is conditioned by an interpretation (detected concepts,
  recognized dialog acts...) and by the dialogue history
  A state corresponds to a command and/or a response from the system
  Limitation: locked in the structure of the dialogue
     ▶   Given a state and an interpretation of the user, we always go to
         another state
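
A minimal sketch of such an automaton for a flight-booking task is given below. The field names, state names and helper functions are hypothetical; the only behavior taken from the slide is that flights are not listed until all the fields are filled.

```python
# Hypothetical fields for a flight-booking task.
FIELDS = ["departure", "arrival", "date"]


def next_state(filled):
    """State reached given the fields understood so far."""
    for field in FIELDS:
        if field not in filled:
            return "ask_" + field   # keep asking for the first missing field
    return "list_flights"           # only reachable once everything is known


def run(interpretations):
    """Replay a dialog: each user turn contributes a dict of detected fields."""
    filled = {}
    states = [next_state(filled)]
    for detected in interpretations:
        filled.update(detected)
        states.append(next_state(filled))
    return states


states = run([
    {"departure": "Marseille", "arrival": "Barcelona"},
    {},                      # nothing understood: the system asks again
    {"date": "tomorrow"},
])
print(states)
```

The transition depends only on the current interpretation and the stored fields, which is what makes the dialog structure rigid.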

MDP
 Problem: Which trajectory to follow in the dialog automaton to
 minimize task completion time?
    ▶   We don’t know the future, we have to make assumptions
 Markov Decision Process
    ▶   At each time t, the process (the user) is in a state s (it has an
        intention)
    ▶   The system chooses to perform an action a (for example it asks a
        question)
    ▶   This action randomly places the process (the user) in state $s'$,
        according to a probability $P_a(s, s')$
    ▶   This change of state gives the system a gain $R_a(s, s')$
    ▶   The choices of the process depend only on a and s and not on the rest
        of the history (Markov property)
 How to maximize the cumulative gain over the entire dialogue?
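    ▶   We call policy, denoted $\pi$, the series of actions chosen by the
        system
    ▶   Use dynamic programming to find the policy that maximizes
        $$\sum_{t=0}^{\infty} \gamma^t R_{a_t}(s_t, s_{t+1})$$
        where $a_t$ is the action chosen at time $t$ and $\gamma$ is a
        discount factor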
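
Value iteration is one standard dynamic programming method for this problem. The sketch below applies it to a hypothetical three-state dialog; the states, actions, probabilities, rewards and the discount factor 0.9 are all invented for illustration.

```python
GAMMA = 0.9  # assumed discount factor

# Hypothetical toy dialog MDP.
# TRANSITIONS[state][action] = list of (next_state, probability, reward)
TRANSITIONS = {
    "s0": {  # user intent not understood yet
        "open_question": [("s1", 0.6, -1.0), ("s0", 0.4, -1.0)],
        "closed_question": [("s1", 0.9, -2.0), ("s0", 0.1, -2.0)],
    },
    "s1": {"give_answer": [("end", 1.0, 10.0)]},  # intent understood
    "end": {},                                    # terminal state
}


def q_value(values, outcomes):
    """Expected gain of an action: sum over s' of P * (R + gamma * V(s'))."""
    return sum(p * (r + GAMMA * values[s2]) for s2, p, r in outcomes)


def value_iteration(transitions, n_iter=200):
    """Return state values and the greedy policy (one action per state)."""
    values = {s: 0.0 for s in transitions}
    for _ in range(n_iter):
        values = {
            s: max((q_value(values, o) for o in actions.values()), default=0.0)
            for s, actions in transitions.items()
        }
    policy = {
        s: max(actions, key=lambda a: q_value(values, actions[a]))
        for s, actions in transitions.items() if actions
    }
    return values, policy


values, policy = value_iteration(TRANSITIONS)
print(values, policy)
```

In this toy setting the cheaper but less reliable open question comes out slightly ahead of the closed question.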

POMDP

  Partially Observable Markov Decision Process (POMDP)
       ▶   Extension of MDP where the state (user intent) isn’t directly observed
       ▶   Distribution over states that the user can be in given what she says
       ▶   Observations are user sentences
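
The distribution over states (the belief) can be maintained with a Bayesian update after each user sentence. The sketch below shows the exact update for a tiny state space, assuming the transition probabilities for the chosen action and the observation likelihoods are given; the intents and numbers are hypothetical.

```python
def belief_update(belief, transition, obs_likelihood):
    """b'(s') is proportional to P(o | s') * sum_s P(s' | s, a) * b(s)."""
    predicted = {
        s2: sum(belief[s] * transition[s][s2] for s in belief)
        for s2 in belief
    }
    unnormalized = {s2: obs_likelihood[s2] * predicted[s2] for s2 in belief}
    total = sum(unnormalized.values())
    return {s2: v / total for s2, v in unnormalized.items()}


# Hypothetical user intents, initially equally likely.
belief = {"book": 0.5, "cancel": 0.5}
# Assumption: the system's action does not change the user's intent.
transition = {"book": {"book": 1.0, "cancel": 0.0},
              "cancel": {"book": 0.0, "cancel": 1.0}}
# Assumed likelihood of the observed sentence under each intent.
obs_likelihood = {"book": 0.8, "cancel": 0.2}

new_belief = belief_update(belief, transition, obs_likelihood)
print(new_belief)
```

With realistic state spaces this exact computation becomes intractable, hence the need for approximate inference.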
  [Figure: two panels, MDP and POMDP, each showing a transition from state s
  to state s' with gain R(s,s') along a past → future axis]

  Requires approximate inference

Model comparison

  [Figure: comparison of four models (Markov chain, Markov decision process,
  hidden Markov chain, partially-observable MDP); the cells of the comparison
  are shown as question marks]

Advanced elements
  Initiative
     ▶   System initiative (e.g. a Freebox support technician following a script)
     ▶   User initiative (Google Now, voice command)
     ▶   Mixed initiative :
            ⋆   “SIRI, can you change brightness?" (user initiative)
            ⋆   “Yes, how bright?" (system initiative)
  Confirmation
     ▶   Explicit
            ⋆   “I would like to go from Marseille to Barcelona”
            ⋆   “Is your departure city Marseille?”
            ⋆   “Yes”
            ⋆   “Is your arrival city Barcelona?”
     ▶   Implicit
            ⋆   “I would like to go from Marseille to Barcelona”
            ⋆   “When would you like to go from Marseille to Barcelona?”
  Who is the user talking to?
     ▶   Kinect / Nao

Chatbots with deep learning
   Replace classic elements
      1   Intent classification (what do you want?)
      2   Slot filling (what are the parameters of your query?)
      3   Next action prediction
   Typical frameworks (e.g. Rasa)
      ▶   Provide pretrained intents / slots
      ▶   Active learning for annotating data
      ▶   Dialog flow is pre-scripted (best control)
   Data-driven approach to conversation modeling
      ▶   Given a conversation up to a point, can we predict what will happen
          next
      ▶   No need for linguistic analysis, but no linguistic prior
      ▶   Examples:
             1   Alternating language models
             2   Turn retrieval
             3   Machine translation (history → next turn)

Alternating language model

   A simplified version of the encoder-decoder (or seq2seq) framework
      ▶   Trained the same way as a regular word-based language model
      ▶   At prediction time, alternate between user input and generation
             ⋆   Training data needs to be in the same form

   Implementations
      ▶   Word-by-word prediction
      ▶   Any language model (GPT-2...)
      ▶   Attention mechanisms
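   [Figure: alternating language model over a token stream of the form
   M w1 w2 w3 H w1 w2 M ..., where each word w1 w2 w3 is predicted in turn]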

Representation learning
   Create an information retrieval system
      ▶   Which can retrieve the next turn given a history
      ▶   Encode history with a first recurrent model
      ▶   Encode next turn with a second recurrent model
      ▶   Compute a similarity between those representations (dot product)
   Training objective: triplet ranking
      ▶   Make sure the correct association has a higher score than a randomly
          selected pair
   Problem: the cost of retrieving a turn
      ▶   Everything can be precomputed, just the dot product remains
      ▶   Many approaches for finding approximate nearest neighbors in a
          high-dimensional space (e.g. locality preserving hashing)
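
At retrieval time the candidate responses are already encoded, so ranking reduces to dot products with the history vector. The sketch below performs an exhaustive ranking over hypothetical 2-dimensional vectors; the encoders are not shown, and an approximate nearest-neighbor index would replace the full scan for large candidate sets.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_responses(history_vec, response_vecs):
    """Indices of candidate responses, best dot-product score first."""
    scores = [dot(history_vec, r) for r in response_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical precomputed response representations and one encoded history.
response_vecs = [[0.9, 0.1], [0.0, 1.0], [0.5, 0.5]]
history_vec = [1.0, 0.2]

ranking = rank_responses(history_vec, response_vecs)
print(ranking)
```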
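   [Figure: bi-encoder. The history and the response are each encoded by
   their own model, and the two representations are compared with a cosine
   similarity]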

Bi-encoder training
   Maximize the margin between $h_i \cdot r_i$ and $n_i \cdot r_i$
      ▶   $h_i$ is the history
      ▶   $n_i$ is a random history
      ▶   $r_i$ is the response
   $$\text{Loss} = \frac{1}{n} \sum_i \max(0,\ 1 - h_i \cdot r_i + n_i \cdot r_i)$$
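
The loss can be checked by hand on a couple of vectors. The sketch below computes it in plain Python on hypothetical 2-dimensional representations; it is not the Keras model from the slides, only the formula above applied to fixed vectors.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def triplet_loss(histories, negatives, responses, margin=1.0):
    """Mean over i of max(0, margin - h_i . r_i + n_i . r_i)."""
    losses = [
        max(0.0, margin - dot(h, r) + dot(n, r))
        for h, n, r in zip(histories, negatives, responses)
    ]
    return sum(losses) / len(losses)


# Hypothetical representations for two training examples.
histories = [[1.0, 0.0], [0.5, 0.0]]   # h_i
negatives = [[0.0, 1.0], [1.0, 0.0]]   # n_i, random histories
responses = [[1.0, 0.0], [1.0, 0.0]]   # r_i

loss = triplet_loss(histories, negatives, responses)
print(loss)
```

The first example already satisfies the margin and contributes 0; the second scores the random history higher than the true one and contributes 1.5.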

   Keras model

Experiments on Datcha corpus

   Corpus: Orange ATH TV

               Stat                Train            Valid      Test
               Conversations      16,140              698       606
               Turns             465,693           20,090    18,392
               Words           7,744,262          327,979   299,340

   Preprocessing
     ▶   Tokenization (based on the Penn Treebank tokenizer)
     ▶   A few rules to strip additional URLs, phone numbers, etc.
     ▶   Lower case
     ▶   Concatenate turns of the same participant with 
     ▶   Separate conversations by 
     ▶   Replace all TC[1-9] by a generic TC

Datcha results

   Evaluation metrics
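      ▶   Perplexity (PPL): $-\frac{1}{n} \sum \log P(\text{turn} \mid \text{history})$
      ▶   Better-than-random (BTR): $\frac{1}{n} \left|\{\text{turns} : P(\text{turn} \mid \text{history}) > P(\text{turn} \mid \text{noise})\}\right|$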
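
Both metrics, as defined above, are simple averages over test turns. The sketch below computes them from lists of probabilities; the function names and the probability values are hypothetical, and the first function follows the slide's formula literally (mean negative log-probability).

```python
import math


def mean_neg_log_prob(probs):
    """-(1/n) * sum of log P(turn | history), as in the PPL formula above."""
    return -sum(math.log(p) for p in probs) / len(probs)


def better_than_random(p_true, p_noise):
    """Fraction of turns scored higher with their true history than with noise."""
    wins = sum(1 for t, n in zip(p_true, p_noise) if t > n)
    return wins / len(p_true)


# Hypothetical model probabilities for three test turns.
p_true = [0.5, 0.1, 0.3]    # P(turn | history)
p_noise = [0.2, 0.4, 0.1]   # P(turn | noise)

btr = better_than_random(p_true, p_noise)
score = mean_neg_log_prob(p_true)
print(btr, score)
```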
   Results on the ATH TV test set (last 3 files):

                        Method                     PPL      BTR
                        Language model            17.52   69.39%
                        Information retrieval     11.85   93.91%

   Parameters
      ▶   LM: vocab=30k, layers=2, hidden=650, sample=1024, maxlen=35,
          batch=20, optim=sgd, epochs=8
      ▶   Bi-encoder: vocab=30k, embeddings=128 (init=w2v), hidden=256,
          maxlen=64, repr=128, batch=256, optim=Nadam, epochs=100

t-SNE Analysis

   t-SNE Projections of turn representations

Conclusion
   Dialogue systems
     ▶   Understanding
     ▶   Planning
     ▶   Data driven
   Open issues
     ▶   Noisy input
            ⋆   Speech recognition / spelling errors
     ▶   Unrestricted domain
            ⋆   Chit-chat
            ⋆   Rapid development of new domains
     ▶   Grounding
            ⋆   Tackle real-world objects
     ▶   End-to-end training
            ⋆   Speech-recognition → dialog management → speech synthesis
     ▶   Training without experiencing
            ⋆   Simulation
   Current state of the art
     ▶   https://nlpprogress.com/english/dialogue.html