Multi-Word Units Under the Magnifying Glass

Vered Shwartz
Natural Language Processing Lab, Bar-Ilan University

Talk @ ONLP, December 26, 2018
Multi-Word Units (MWUs)*

A sequence of consecutive words that creates a new concept:
  Noun compounds: flea market, flea bite, flea bite treatment, ...
  Adjective-noun compositions: hot tea, hot day, ...
  Verb-particle constructions: wake up, let go, ...
  Light-verb constructions: make a decision, take a walk, ...
  Idioms: look what the cat dragged in, ...

The constituent words may combine in a non-trivial way:
  Implicit meaning
  Non-literal word usage

* Also referred to as multi-word expressions or phrases
Previous MWU Representations

Compositional distributional representations:
  vec(olive oil) = f(vec(olive), vec(oil))
  Many ways to learn f [Mitchell and Lapata, 2010, Zanzotto et al., 2010, Dinu et al., 2013]
  Usually applied to adjective-noun or noun-compound pairs, and limited to a specific number of words

Phrase embeddings:
  Arbitrarily long phrases
  Supervision from PPDB [Wieting et al., 2015]: limited in coverage
  Generalizing word2vec [Poliak et al., 2017]: can compose vectors for unseen phrases, but naive composition doesn't handle the complexity of phrases
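As an illustration (not code from the talk), here is a minimal sketch of two classic composition functions in the spirit of Mitchell and Lapata [2010]; the function names are mine and the vectors are random stand-ins for pre-trained word embeddings.

```python
# Minimal sketch of compositional distributional representations for a
# two-word MWU: vec(olive oil) = f(vec(olive), vec(oil)).
import numpy as np

def weighted_additive(v1: np.ndarray, v2: np.ndarray,
                      alpha: float = 0.5, beta: float = 0.5) -> np.ndarray:
    """f(v1, v2) = alpha * v1 + beta * v2 (weighted additive composition)."""
    return alpha * v1 + beta * v2

def multiplicative(v1: np.ndarray, v2: np.ndarray) -> np.ndarray:
    """f(v1, v2) = v1 * v2 element-wise (multiplicative composition)."""
    return v1 * v2

# Toy usage with random stand-ins for pre-trained vectors of "olive" and "oil".
rng = np.random.default_rng(0)
vec_olive, vec_oil = rng.normal(size=300), rng.normal(size=300)
vec_olive_oil = weighted_additive(vec_olive, vec_oil, alpha=0.3, beta=0.7)
```

Learning f then amounts to fitting the weights (or a more expressive function) against corpus-derived vectors of observed phrases.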
Enter contextualized word embeddings!

  Represent a word in context
    Good for word sense induction
  Trained as language models
    On a large corpus
    Capture world knowledge
  Improve the performance of various NLP applications
  Named after characters from Sesame Street

Are meaningful MWU representations built into these models?
Probing Tasks

Simple tasks designed to test a single linguistic property [Adi et al., 2017, Conneau et al., 2018]

  Representation       Minimal Model       Prediction
  SkipThoughts(s)                          What is s's length?
  InferSent(s)                             Is w in s?
  ...                                      ...

We follow the same approach for MWUs, with various representations.
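To make the setup concrete, here is a minimal, illustrative probing sketch (the data and the probe() helper are placeholders, not the talk's code): a frozen representation goes into a small classifier that predicts a single property.

```python
# Minimal probing sketch: frozen representation -> minimal model -> prediction.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def probe(representations: np.ndarray, labels: np.ndarray) -> float:
    """Fit a linear probe on frozen representations; return held-out accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        representations, labels, test_size=0.2, random_state=0)
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_train, y_train)
    return clf.score(X_test, y_test)

# Toy usage: 1,000 frozen 768-dimensional vectors with binary property labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 768))
y = rng.integers(0, 2, size=1000)
print(f"Probe accuracy: {probe(X, y):.3f}")
```

Keeping the model minimal matters: the probe should only read out what the representation already encodes, not learn the task itself.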
Representations

  Word Embeddings    Sentence Embeddings    Contextualized Word Embeddings
  word2vec           SkipThoughts           ELMo
  GloVe              InferSent*             OpenAI Transformer
  fastText           GenSen*                BERT

* supervised
Tasks and Results

1. MWU Type
Task Definition

  Dataset: Wiki50 corpus [Vincze et al., 2011]
  Input: sentence
  Goal: sequence labeling with BIO tags
    MWUs: noun compounds, adjective-noun compositions, idioms, light-verb constructions, verb-particle constructions
    Named entities: person, location, organization

  Example:
    Authorities  meted      out        summary   justice   in  cases  as  this
    O            B-MW_VPC   I-MW_VPC   B-MW_NC   I-MW_NC   O   O      O   O
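A sketch of how this can be probed (illustrative only; the tag set is truncated and the vectors and data are random placeholders): each token's frozen representation is classified independently into a BIO tag by a minimal model.

```python
# Illustrative sketch: MWU-type probing as per-token classification over
# frozen contextualized token vectors. All data here are toy placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression

TAGS = ["O", "B-MW_VPC", "I-MW_VPC", "B-MW_NC", "I-MW_NC"]  # truncated tag set

# One frozen vector per token of the example sentence, plus its gold BIO tags
# (in practice, one row per token over the whole Wiki50 corpus).
rng = np.random.default_rng(0)
token_vectors = rng.normal(size=(9, 1024))
gold_tags = ["O", "B-MW_VPC", "I-MW_VPC", "B-MW_NC", "I-MW_NC", "O", "O", "O", "O"]
y = np.array([TAGS.index(t) for t in gold_tags])

# The "minimal model": a single linear classifier applied to each token.
probe = LogisticRegression(max_iter=1000).fit(token_vectors, y)
predicted = [TAGS[i] for i in probe.predict(token_vectors)]
```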
1. MWU Type
Results

  [Bar chart: F1(M) on MWU labels and F1(N) on named-entity labels, comparing Majority, word2vec, GloVe, fastText, the OpenAI Transformer, ELMo, BERT, and humans (70.6 F1(M), 95.9 F1(N)).]

  (1) Identifying the MWU type is difficult; (2) named entities are easier; (3) context helps.
2. Noun Compound Literality
 A constituent word may be used in a non-literal way

2. Noun Compound Literality
Task Definition

  Dataset: based on [Reddy et al., 2011] and [Tratz, 2011]
  Input: sentence s, target word w ∈ s (part of an NC)
  Goal: is w used literally in the NC?

  Example: "The crash course in litigation made me a better lawyer"
    crash: non-literal; course: literal
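A sketch of how a target word's contextualized vector can be extracted for such a probe (illustrative: the checkpoint name "bert-base-uncased" and the averaging over word pieces are my assumptions, and the word-piece matching is deliberately crude):

```python
# Illustrative sketch (not the talk's code): get a contextualized vector for a
# target word with a pre-trained masked language model, to feed into a
# literality probe.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

sentence = "The crash course in litigation made me a better lawyer"
target = "crash"

inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state[0]       # (num_tokens, hidden_size)

# Average the vectors of the word pieces that belong to the target word.
# (Crude matching by surface form; a real pipeline would track token offsets.)
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
target_pieces = tokenizer.tokenize(target)
positions = [i for i, t in enumerate(tokens) if t in target_pieces]
target_vector = hidden[positions].mean(dim=0)           # input to the literality probe
```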
2. Noun Compound Literality
Results

  [Bar chart: accuracy for Majority, word2vec, GloVe, fastText, SkipThoughts, InferSent, GenSen, ELMo, the OpenAI Transformer, and BERT, grouped into word embeddings, sentence embeddings, and contextualized; human accuracy is 87.]

  (1) word embeddings < sentence embeddings < contextualized; (2) all models are far from human performance.
2. Noun Compound Literality
Analysis

     ELMo                                    OpenAI Transformer                      BERT
     A search team located the [crash]L site and found small amounts of human remains.
     landfill                                body                                    archaeological
     wreckage                                place                                   burial
     Web                                     man                                     wreck
     crash                                   missing                                 excavation
     burial                                  location                                grave
     After a [crash]N course in tactics and maneuvers, the squadron was off to the war...
     crash                                   few                                     short
     changing                                while                                   successful
     collision                               moment                                  rigorous
     training                                long                                    brief
     reversed                                couple                                  training

  (1) BERT > ELMo, both reasonable
  (2) OpenAI Transformer errs due to uni-directionality

2. Noun Compound Literality
Analysis
     ELMo                                    OpenAI Transformer                      BERT
     The gold/[silver]L price ratio is often analyzed by traders, investors, and buyers.
     silver                                  platinum                                silver
     blue                                    black                                   copper
     platinum                                gold                                    platinum
     purple                                  silver                                  gold
     yellow                                  red                                     diamond
     Growing up with a [silver]N spoon in his mouth, he was always cheerful...
     silver                                  mother                                  wooden
     rubber                                  father                                  greasy
     iron                                    lot                                     big
     tin                                     big                                     silver
     wooden                                  man                                     little

  Things get tougher when both constituent nouns are non-literal!

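The substitutes in the tables above come from each model's predictions for the target word in context. A minimal sketch of producing such substitutes with a masked language model (illustrative; the checkpoint name is an assumption, and only BERT-style masking is shown):

```python
# Illustrative sketch: top-k in-context substitutes for a constituent word
# from a masked language model, similar in spirit to the BERT column above.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

sentence = ("A search team located the {} site and found small amounts "
            "of human remains.").format(tokenizer.mask_token)

inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0]                   # (num_tokens, vocab_size)

mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
top_ids = logits[mask_pos[0]].topk(5).indices.tolist()
print(tokenizer.convert_ids_to_tokens(top_ids))          # model's top-5 substitutes
```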
3. Noun Compound Relations

  NCs express semantic relations between the constituent words
  May require world knowledge and common sense to interpret
3. Noun Compound Relations
Task Definition

  Dataset: based on [Hendrickx et al., 2013]
  Input: sentence s, NC ∈ s, paraphrase p
  Goal: does p explicate the NC?

  Example: access road
    "Road that makes access possible"  ✓
    "Road forecasted for access season"  ✗
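One illustrative way to probe this (not the talk's recipe): pair a frozen vector for the NC in context with a frozen vector for the paraphrase and feed the pair to a minimal classifier. The encode() helper below is a placeholder for any of the representations above.

```python
# Illustrative sketch: does a paraphrase explicate a noun compound?
# encode() stands in for any frozen representation; data are toy placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression

def encode(text: str) -> np.ndarray:
    """Placeholder: return a frozen vector for `text`."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.normal(size=300)

def features(nc_in_context: str, paraphrase: str) -> np.ndarray:
    v1, v2 = encode(nc_in_context), encode(paraphrase)
    # Concatenation plus element-wise product is one common way to pair vectors.
    return np.concatenate([v1, v2, v1 * v2])

X = np.stack([features("access road", "road that makes access possible"),
              features("access road", "road forecasted for access season")])
y = np.array([1, 0])                    # does the paraphrase explicate the NC?
probe = LogisticRegression(max_iter=1000).fit(X, y)
```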
3. Noun Compound Relations
Results

  [Bar chart: accuracy for Majority, word2vec, GloVe, fastText, SkipThoughts, InferSent, GenSen, ELMo, the OpenAI Transformer, and BERT, grouped into word embeddings, sentence embeddings, and contextualized; human accuracy is 92.]

  (1) word embeddings < sentence embeddings < contextualized; (2) all models are far from human performance; (3) the OpenAI Transformer fails.
3. Noun Compound Relations
Analysis

  [2D projections of paraphrase candidates for the noun compounds "drug money", "stage area", and "purse net" (e.g., "money earned from drug sales" and "money made from drug business" vs. "money while riding a drug"; "area where a stage is located"; "net riding a purse").]

  No clear signal from BERT. Capturing implicit information is challenging!
4. Adjective-Noun Relations

 Adjectives select different attributes of the noun they combine with

 The hot debate about the hot office (or: the cold war over the cold office)

4. Adjective-Noun Relations
Task Definition

  Dataset: based on [Hartung, 2015]
  Input: sentence s, AN ∈ s, attribute w
  Goal: is the attribute w conveyed in the AN?

  Example: warm support
    temperature  ✗
    emotionality  ✓
4. Adjective-Noun Relations
Results

  [Bar chart: accuracy for Majority, word2vec, GloVe, fastText, SkipThoughts, InferSent, GenSen, ELMo, the OpenAI Transformer, and BERT, grouped into word embeddings, sentence embeddings, and contextualized; human accuracy is 77.]

  The best model performs only slightly better than the majority baseline
  (capturing implicit information is challenging).
5. Adjective-Noun Entailment
Task Definition

  Dataset: [Pavlick and Callison-Burch, 2016]
  Input: premise p and hypothesis h that differ by a single adjective
  Goal: p → h?

  Example:
    p: Most people die in the class to which they were born.
    h: Most people die in the social class to which they were born.  ✓
5. Adjective-Noun Entailment
Results

  [Bar chart: F1 for Majority, word2vec, GloVe, fastText, SkipThoughts, InferSent, GenSen, ELMo, the OpenAI Transformer, and BERT, grouped into word embeddings, sentence embeddings, and contextualized; human F1 is 74.4.]

  Poor performance for all models; the best results come from sentence embeddings trained on RTE.
6. Verb-Particle Classification
 VPC meanings differ from their verbs’ meanings

6. Verb-Particle Classification
Task Definition

  Dataset: [Tu and Roth, 2012]
  Input: sentence s, verb-particle pair VP ∈ s
  Goal: is VP a verb-particle construction (VPC)?

  Example:
    VPC:      We did get on together
    Non-VPC:  Which response did you get on that?
6. Verb-Particle Classification
Results

  [Bar chart: accuracy for Majority, word2vec, GloVe, fastText, SkipThoughts, InferSent, GenSen, ELMo, the OpenAI Transformer, and BERT, grouped into word embeddings, sentence embeddings, and contextualized; human accuracy is 82.]

  Similar performance for all models. Is the good performance merely due to label imbalance?
6. Verb-Particle Classification
Analysis

  [2D projections of contextualized representations for occurrences of "have on", "give in", "make for", and "get on".]

  Very weak signal from ELMo. It mostly performs well due to label imbalance.
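A sketch of how such a per-pair projection can be produced (illustrative: the vectors are random placeholders, and PCA is my choice of projection, not necessarily the one used in the talk):

```python
# Illustrative sketch: project frozen contextualized vectors of "get on"
# occurrences to 2D and plot them by gold label (VPC vs. non-VPC).
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
vectors = rng.normal(size=(60, 1024))        # one vector per "get on" occurrence
is_vpc = rng.integers(0, 2, size=60).astype(bool)

points = PCA(n_components=2).fit_transform(vectors)
plt.scatter(points[is_vpc, 0], points[is_vpc, 1], label="VPC", marker="o")
plt.scatter(points[~is_vpc, 0], points[~is_vpc, 1], label="non-VPC", marker="x")
plt.legend()
plt.title("get on")
plt.show()
```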
Future Directions
Can we learn MWUs like humans do?

  [Cooper, 1999]: how do L2 learners process idioms?
    Infer from context: used 28% of the time (57% success rate)
    Rely on the literal meaning: used 19% of the time (22% success rate)
    ...
Inferring from context
We need richer context modeling

  Previous news stories may help a reader understand that "crocodile tears" refers to manipulative behavior.

  [Asl, 2013]: L2 learners interpret idioms with more success through extended contexts (stories) than through sentential contexts.
Relying on literal meaning
We need world knowledge

  "Cradle is something that you put the baby in"
  "You're stealing a child from a mother"
  "So robbing the cradle is like dating a really young person"

  [Cooper, 1999]
Recap

  1. Testing Existing Pre-trained Representations
     Contextualized word embeddings provide better MWU representations, but there is still a long way to go.

  2. Future Directions
     To represent MWUs like humans do, we need better context and world-knowledge modeling.

Thank you!
References I
 [Adi et al., 2017] Adi, Y., Kermany, E., Belinkov, Y., Lavi, O., and Goldberg, Y. (2017). Fine-grained analysis of sentence
    embeddings using auxiliary prediction tasks. In Proceedings of ICLR Conference Track.
 [Asl, 2013] Asl, F. M. (2013). The impact of context on learning idioms in EFL classes. TESOL Journal, 37(1):2.
 [Conneau et al., 2018] Conneau, A., Kruszewski, G., Lample, G., Barrault, L., and Baroni, M. (2018). What you can cram into a
    single vector: Probing sentence embeddings for linguistic properties. In Proceedings of the 56th Annual Meeting of the
    Association for Computational Linguistics (Volume 1: Long Papers), pages 2126–2136. Association for Computational
    Linguistics.
 [Cooper, 1999] Cooper, T. C. (1999). Processing of idioms by L2 learners of English. TESOL Quarterly, 33(2):233–262.
 [Dinu et al., 2013] Dinu, G., Pham, N. T., and Baroni, M. (2013). General estimation and evaluation of compositional
     distributional semantic models. In Proceedings of the Workshop on Continuous Vector Space Models and their Compositionality,
     pages 50–58, Sofia, Bulgaria. Association for Computational Linguistics.
 [Hartung, 2015] Hartung, M. (2015). Distributional Semantic Models of Attribute Meaning in Adjectives and Nouns. Ph.D. thesis,
    Heidelberg University.
 [Hendrickx et al., 2013] Hendrickx, I., Kozareva, Z., Nakov, P., Ó Séaghdha, D., Szpakowicz, S., and Veale, T. (2013).
    Semeval-2013 task 4: Free paraphrases of noun compounds. In SemEval, pages 138–143.
 [Mitchell and Lapata, 2010] Mitchell, J. and Lapata, M. (2010). Composition in distributional models of semantics. Cognitive
    science, 34(8):1388–1429.
 [Pavlick and Callison-Burch, 2016] Pavlick, E. and Callison-Burch, C. (2016). Most "babies" are "little" and most "problems" are
    "huge": Compositional entailment in adjective-nouns. In Proceedings of the 54th Annual Meeting of the Association for
    Computational Linguistics (Volume 1: Long Papers), pages 2164–2173, Berlin, Germany. Association for Computational
    Linguistics.
 [Poliak et al., 2017] Poliak, A., Rastogi, P., Martin, M. P., and Van Durme, B. (2017). Efficient, compositional, order-sensitive
     n-gram embeddings. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational
     Linguistics: Volume 2, Short Papers, pages 503–508, Valencia, Spain. Association for Computational Linguistics.

References II

 [Reddy et al., 2011] Reddy, S., McCarthy, D., and Manandhar, S. (2011). An empirical study on compositionality in compound
    nouns. In Proceedings of 5th International Joint Conference on Natural Language Processing, pages 210–218, Chiang Mai,
    Thailand. Asian Federation of Natural Language Processing.
 [Tratz, 2011] Tratz, S. (2011). Semantically-enriched parsing for natural language understanding. Ph.D. thesis, University of
     Southern California.
 [Tu and Roth, 2012] Tu, Y. and Roth, D. (2012). Sorting out the most confusing English phrasal verbs. In *SEM 2012: The First
     Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task,
     and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012), pages 65–69,
     Montréal, Canada. Association for Computational Linguistics.
 [Vincze et al., 2011] Vincze, V., Nagy T., I., and Berend, G. (2011). Multiword expressions and named entities in the wiki50
     corpus. In Proceedings of the International Conference Recent Advances in Natural Language Processing 2011, pages 289–295.
     Association for Computational Linguistics.
 [Wieting et al., 2015] Wieting, J., Bansal, M., Gimpel, K., and Livescu, K. (2015). Towards universal paraphrastic sentence
    embeddings. CoRR, abs/1511.08198.
 [Zanzotto et al., 2010] Zanzotto, F. M., Korkontzelos, I., Fallucchi, F., and Manandhar, S. (2010). Estimating linear models for
    compositional distributional semantics. In Proceedings of the 23rd International Conference on Computational Linguistics,
    pages 1263–1271. Association for Computational Linguistics.
