Inside ASCENT: Exploring a Deep Commonsense Knowledge Base and its Usage in Question Answering

Tuan-Phong Nguyen, Simon Razniewski, Gerhard Weikum
Max Planck Institute for Informatics
Saarbrücken, Germany

arXiv:2105.13662v1 [cs.AI] 28 May 2021

Abstract

ASCENT is a fully automated methodology for extracting and consolidating commonsense assertions from web contents (Nguyen et al., 2021). It advances traditional triple-based commonsense knowledge representation by capturing semantic facets like locations and purposes, and composite concepts, i.e., subgroups and related aspects of subjects. In this demo, we present a web portal that allows users to understand its construction process, explore its content, and observe its impact in the use case of question answering. The demo website and an introductory video are both available online.

1 Introduction

Commonsense knowledge (CSK) is an enduring theme of AI (McCarthy, 1960) that has been recently revived for the goal of building more robust and reliable applications (Monroe, 2020). Recent years have witnessed the emergence of large pre-trained language models (LMs), notably BERT (Devlin et al., 2018), GPT (Brown et al., 2020) and their variants, which significantly boosted the performance of tasks requiring natural language understanding such as question answering and dialogue systems (Clark et al., 2020). Although it has been shown that such LMs implicitly store some commonsense knowledge (Talmor et al., 2019), this comes with various caveats, for example regarding degree of truth, or negation, and their commercial development is inherently hampered by their low interpretability and explainability.

Structured knowledge bases (KBs), in contrast, give a great possibility of explaining and interpreting outputs of systems leveraging the resources. There have been great efforts towards building large-scale commonsense knowledge bases (CSKBs), including expert-annotated KBs (e.g., Cyc (Lenat, 1995)), crowdsourced KBs (e.g., ConceptNet (Speer and Havasi, 2012) and Atomic (Sap et al., 2019)) and KBs built by automatic acquisition methods such as WebChild (Tandon et al., 2014, 2017), TupleKB (Mishra et al., 2017), Quasimodo (Romero et al., 2019) and CSKG (Ilievski et al., 2020). Human-created KBs, although possessing high precision, usually suffer from low coverage. On the other hand, automatically-acquired KBs typically have better coverage, but also contain more noise. Nonetheless, despite different construction methods, these KBs are all based on a simple subject-predicate-object model, which has major limitations in validity and expressiveness.

We recently presented ASCENT (Nguyen et al., 2021), a methodology for automatically collecting and consolidating commonsense assertions from the general web. To overcome the limitations of prior works, ASCENT refines subjects with subgroups (e.g., circus elephant and domesticated elephant) and aspects (e.g., elephant tusk and elephant habitat), and captures semantic facets of assertions (e.g., ⟨lawyer, represents, clients, LOCATION: in courts⟩ or ⟨elephant, uses, its trunk, PURPOSE: to suck up water⟩).

For a given concept, ASCENT searches through the web with pattern-based search queries disambiguated using WordNet (Miller, 1995) hypernymy. Then, irrelevant documents are filtered out based on similarity comparison against the corresponding Wikipedia articles. We then use a series of judicious dependency-parse-based rules to collect faceted assertions from the retained texts. The semantic facets, which come from prepositional phrases and supporting adverbs, are then labeled by a supervised classifier. Finally, assertions are clustered using similarity scores from word2vec (Mikolov et al., 2013) and a fine-tuned RoBERTa (Liu et al., 2019) model.
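
The document-filtering step of this pipeline can be illustrated with a minimal sketch. It assumes the bag-of-words cosine similarity that Section 2.2 names for this step; the tokenizer, the function names and the `keep` cut-off are hypothetical choices made only for illustration and are not taken from the ASCENT implementation.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count lowercase word tokens (a deliberately naive tokenizer)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty document is treated as unrelated
    return dot / (norm_a * norm_b)


def rank_documents(documents: list[str], reference: str, keep: int) -> list[str]:
    """Return the `keep` documents most similar to the reference article.

    `keep` is a hypothetical cut-off; the paper only states that
    low-ranked documents are omitted.
    """
    ref_vec = bag_of_words(reference)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(bag_of_words(doc), ref_vec),
        reverse=True,
    )
    return ranked[:keep]
```

Here `reference` stands for the text of the Wikipedia article of the target concept, and `documents` for the texts returned by web search. A real system would likely add stop-word handling and term weighting, which this sketch leaves out.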
We executed the ASCENT pipeline for 10,000 prominent concepts (selected based on their respective number of assertions in ConceptNet) as primary subjects. In (Nguyen et al., 2021), we showed that the content of the resulting CSKB (hereinafter referred to as ASCENT KB) is a milestone in both salience and recall. As extrinsic evaluation, we conducted a comprehensive evaluation of the contribution of CSK to zero-shot question answering (QA) with pre-trained language models (Petroni et al., 2020; Guu et al., 2020).

This paper presents a companion web portal of the ASCENT KB, which enables the following interactions:

    1. Exploration of the construction process of ASCENT, by inspecting word sense and Wikipedia disambiguation, web search queries, clustered statements, and source sentences and documents.

    2. Inspection of the resulting KB, starting from subjects, predicates, objects, or examining specific subgroups or aspects.

    3. Observation of the impact of structured knowledge on question answering with pretrained language models, comparing generated answers across various CSKBs and QA settings.

The web portal is available at https://ascent.mpi-inf.mpg.de, and a screencast demonstrating the system can be found at qMkJXqu_Yd4.

2 ASCENT

Two major contributions of ASCENT are its expressive knowledge model, and its state-of-the-art extraction methodology. Details are in the technical paper (Nguyen et al., 2021). In this section, we revisit the most important points.

2.1 Knowledge model

ASCENT extends the traditional triple-based data model in existing CSKBs in two ways.

Expressive subjects. Subjects in existing CSKBs are usually single nouns, which implies two shortcomings: (i) different meanings for the same word are conflated, and (ii) refinements and variants of word senses are missed out. ASCENT has addressed this problem with the following means:

    1. When searching for source texts, ASCENT combines the target subject with an informative hypernym from WordNet to distinguish different senses of the word (e.g., “bus public transport” and “bus network topology” for the subject bus).

    2. ASCENT refines subjects with multi-word phrases into subgroups and aspects. For example, subgroups for the subject bus would be tourist bus and school bus, while one of its aspects would be bus driver.

Semantic facets. The validity of commonsense assertions is usually non-binary (Zhang et al., 2017; Chalier et al., 2020), and depends on specific temporal and spatial circumstances (e.g., lions live for 10-14 years in the wild but for more than 15 years in captivity). Moreover, CSK triples often benefit from further context regarding causes/effects and instruments (e.g., elephants communicate with each other by creating sounds, beer is served in bars). In ASCENT’s knowledge model, such information is added to SPO triples via semantic facets. ASCENT distinguishes 8 types of facets: cause, manner, purpose, transitive-object, degree, location, temporal and other-quality.

2.2 Extraction pipeline

ASCENT is a pipeline operating in three phases: source discovery, knowledge extraction and knowledge consolidation. Fig. 1 illustrates the architecture of the pipeline.

[Figure 1 (diagram). Stages: (1) Retrieval, (2) Extraction, (3) Consolidation. Components shown: Concept S, Web Search, Relevant websites, Coreference Resolution, Noun Chunking, Noun chunks, Open IE, OpenIE assertions, Subgroup/Aspect Extraction, Related terms, Supervised Facet Labeling, Assertion Clustering, Facet Clustering, Commonsense assertions of S.]

Figure 1: Architecture of the ASCENT extraction pipeline (Nguyen et al., 2021).

Source discovery. We utilize the Bing Web Search API to obtain documents specific to each subject, with search queries refined by the subject’s hypernyms in WordNet. We manually designed query templates for 35 prominent hypernyms (e.g., if subject s0 has hypernym animal.n.01, we produce the search query “s0 animal facts”; similarly, for the hypernym professional.n.01, the search query will be “s0 job descriptions”). We then compute the cosine similarity between the bag-of-words representations of each obtained document and a respective Wikipedia article to determine the relevance of the documents. Low-ranked documents will be omitted in further steps.

Knowledge extraction. The extractors take in the relevant documents and their outputs include: open information extraction (OIE) tuples, list of subgroups and list of aspects. To obtain OIE tuples, we extend the StuffIE approach (Prasojo et al., 2018), a list of carefully crafted dependency-parse-based rules, to pull out faceted assertions from the texts. Then we classify each facet into one of the eight semantic labels using a fine-tuned RoBERTa model. For subgroups, noun phrases whose head word is the target subject are collected as candidates and then are clustered using the hierarchical agglomerative clustering (HAC) algorithm on average word2vec representations. Finally, we collect aspects from possessive noun chunks and SPO triples where P is either “have”, “contain”, “be assembled of” or “be composed of”.

Knowledge consolidation. We perform clustering on SPO triples and facet values. For SPO triples, we first filter triple-pair candidates with fast word2vec similarity. After that, advanced similarity of triple pairs computed by another fine-tuned RoBERTa model is fed to the HAC algorithm to group the triples into semantically similar clusters. For facet values, we group phrases with the same head words together (e.g., “during evening” and “in the evening”).

2.3 Web portal

The web portal (https://ascent.mpi-inf.mpg.de) is implemented in Python using Django, and hosted on an Nginx web server. The underlying structured CSK is stored in a PostgreSQL database, while for the QA part, statements of all CSKBs are indexed and queried via Apache Solr, for fast text-based querying. All components are deployed on a virtual machine with access to 4 virtual CPUs and 8 GB of RAM.

In the demonstration session, we show how users can interact with the portal for exploring the KB (Section 4.1), understanding the KB construction (Section 4.2), and observing its utility for question answering (Section 4.3).

3 Commonsense QA setups

One common extrinsic use case of KBs is question answering. Recently, it was observed that priming language models (LMs) with relevant context can considerably benefit their performance in QA-like tasks (Petroni et al., 2020; Guu et al., 2020). In (Nguyen et al., 2021), to evaluate the contribution of structured CSK to QA, we conducted a comprehensive evaluation consisting of four different setups, all based on the above idea.

    1. In masked prediction (MP), LMs are asked to predict single masked tokens in generic sentences.

    2. In free generation (FG), LMs arbitrarily generate answer sentences to given questions.

    3. Guided generation (GG) extends free generation by answer prefixes that prevent the LMs from evading answering.

    4. Span prediction (SP) is the task of locating the answer of a question in provided context.

Examples of the QA setups can be seen in Table 1. Generally, given a question, our system will retrieve from CSKBs assertions relevant to it, and then use the assertions as additional context to guide the LMs. In the ASCENT demonstrator, we provide a web interface for experimenting with all of those QA setups with context retrieved from several popular CSKBs.

4 Demonstration experience

In the demonstration session, attendees will experience three main functionalities of our demonstration system.
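
The way retrieved assertions are turned into LM inputs for the four setups of Section 3 can be sketched as follows. The string layouts (`[SEP]`, and the `C:`, `Q:`, `A:` prefixes) follow the examples in Table 1; the function names are hypothetical, and the sketch only builds the inputs without calling any language model.

```python
from typing import Optional


def masked_prediction_input(masked_sentence: str, context: Optional[str] = None) -> str:
    """MP: the masked sentence, optionally followed by [SEP] and the context."""
    if context:
        return f"{masked_sentence} [SEP] {context}"
    return masked_sentence


def generation_input(
    question: str,
    context: Optional[str] = None,
    answer_prefix: Optional[str] = None,
) -> str:
    """FG/GG: optional context line, the question, and (for GG) an answer prefix."""
    lines = []
    if context:
        lines.append(f"C: {context}")
    lines.append(f"Q: {question}")
    if answer_prefix is not None:
        # Guided generation: the prefix nudges the LM away from evasive answers.
        lines.append(f"A: {answer_prefix}")
    return "\n".join(lines)


def span_prediction_input(question: str, context: str) -> dict:
    """SP: the answer is a span of the context, so a context is mandatory."""
    if not context:
        raise ValueError("span prediction requires a non-empty context")
    return {"question": question, "context": context}
```

For example, `generation_input("What do elephants eat?", answer_prefix="Elephants eat")` builds a guided-generation prompt without context, while passing a string of retrieved assertions as `context` yields the primed variant.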
Setup   Input                                               Sample output
MP      Elephants eat [MASK]. [SEP] Elephants eat roots,    everything (15.52%), trees (15.32%),
        grasses, fruit, and bark, and they eat a lot of     plants (11.26%)
        these things.
FG      C: Elephants eat roots, grasses, fruit, and bark,   They eat a lot of grasses, fruits, and...
        and they eat...
        Q: What do elephants eat?
GG      C: Elephants eat roots, grasses, fruit, and bark,   Elephants eat a lot of things.
        and they eat...
        Q: What do elephants eat?
        A: Elephants eat
SP      question=“What do elephants eat?”                   start=14, end=46,
        context=“Elephants eat roots, grasses, fruit,       answer=“roots, grasses, fruit, and bark”
        and bark, and they eat...”

Table 1: Examples of QA setups (Nguyen et al., 2021).

4.1 Exploring the ASCENT KB

Concept page. Suppose a user wants to know which knowledge ASCENT stores for elephants. They can enter the concept into the search field in the top right of the start page, and select the first result from the autocompletion list, or press enter, to arrive at the intended concept. The resulting website (see Fig. 2) is divided into three main areas.

At the top left, they can inspect an image, the WordNet synset used for disambiguation, the Wikipedia page used for result filtering, and a list of alternative lemmas, if existing.

At the top right, users can see subgroups and related aspects, which in our knowledge representation model can carry their own statements. This way, they can learn that the most salient aspects of elephants are their trunks, tusks and ears, or that elephant trunks have more than 40,000 muscles.

The body of the page presents the assertions, organized into groups of same-predicate assertions. In each group, assertions are sorted by their frequency, which is displayed beside their objects. For example, the most commonly mentioned foods of elephants are grasses, fruits, and plants. Many assertions come with a red asterisk. This indicates that the assertion comes with semantic facets. Clicking on an assertion opens a small box displaying an SVG-based visualisation of the assertion in which we illustrate all elements of the assertion: its subject, predicate, object, facet labels and values, frequency of the assertion as well as frequency of each facet. For example, one can see that the purpose of elephants using their trunks is to suck up water.

Searching and downloading assertions. As an alternative to exploring statements starting from a subject, users can start from a search functionality under the Browse menu. This way, they can search, for instance, for all concepts that eat grass (capybara, zebra, kangaroo, ...).

The website also provides a JSON-formatted data dump (678MB) of all 8.9 million assertions extracted by the pipeline and their corresponding source sentences and documents. This dataset is also accessible via the HuggingFace Datasets package³.

4.2 Inspecting the construction of assertions

For many downstream use cases, it is important to know about the provenance of information.

Users can inspect general properties of the construction process by observing the WordNet lemma and the Wikipedia page used for filtering, as well as inspect specific statistics about the number of retained websites, sentences, and assertions, in a panel at the bottom of subject pages (e.g., 435 websites were retained for elephant, from which 50k OpenIE assertions could be extracted).

Furthermore, users can look deeply into the construction process of each assertion on its own dedicated page, which displays the following:

    1. Clustered triples: These are triples that were grouped together in the knowledge consolidation phase (cf. Section 2.2), where the most frequent triple was selected as cluster representative. For example, for the assertion ⟨lion, eat, zebra, DEGREE: mostly⟩ (14), the cluster contains: ⟨lion, eat, zebra⟩ (9), ⟨lion, prey on, zebra⟩ (2), ⟨lion, feed on, zebra⟩ (1), ⟨lion, feed upon, zebra⟩ (1), ⟨lion, prey upon, zebra⟩ (1). The numbers in parentheses indicate their corresponding frequency.

    2. Facets: The assertion’s facets are presented in a table whose columns are facet value, facet type and clustered facets. The frequency of each clustered facet is also indicated.

    3. Source sentences and documents: Finally, we exhibit the sentences from which the assertions were extracted and their parent documents (in the form of URLs). Furthermore, in the extraction phase, we also recorded the position of assertion elements (i.e., subject, predicate, object, facet) in the source sentences.

³ ascent_kb
Figure 2: Example of A SCENT’s page for the concept elephant.
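
The cluster statistics shown on the assertion pages of Section 4.2 (a representative triple plus per-member frequencies) can be reproduced with a short sketch. It assumes that the members of a cluster are already known and only illustrates the selection of the most frequent triple as representative; the similarity-based clustering itself is not modeled, and all names are hypothetical.

```python
from collections import Counter

Triple = tuple[str, str, str]


def consolidate_cluster(extractions: list[Triple]):
    """Summarize one cluster of extracted triples.

    Returns the most frequent triple (the cluster representative), the
    total frequency of the cluster, and the members with their counts.
    """
    if not extractions:
        raise ValueError("a cluster needs at least one extracted triple")
    counts = Counter(extractions)
    members = counts.most_common()  # sorted by descending frequency
    representative, _ = members[0]
    return representative, sum(counts.values()), members


# The lion/zebra cluster from Section 4.2, one entry per extraction.
extractions = (
    [("lion", "eat", "zebra")] * 9
    + [("lion", "prey on", "zebra")] * 2
    + [("lion", "feed on", "zebra")]
    + [("lion", "feed upon", "zebra")]
    + [("lion", "prey upon", "zebra")]
)
representative, total, members = consolidate_cluster(extractions)
```

With this input, `representative` is the triple (lion, eat, zebra) and `total` is 14, matching the frequency reported for the clustered assertion in the example of Section 4.2.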

      We show that information to users by highlighting each kind of element with a different color in the source sentences.

4.3 Experimenting with commonsense QA

The third functionality experienced in the demo session is the utilization of commonsense knowledge for question answering (QA).

Input. There are four main parts in the input interface for the QA experiment:

  1. QA setup: The user chooses one QA setup they want to experiment with. Available are Masked Prediction, Span Prediction and Free/Guided Generation. If Masked Prediction is selected, the user can choose how many answers the LM should produce. For the Generation settings, users can provide an answer prefix to avoid overly evasive answers.

  2. Input query: The user enters the text question as input. The question can be in the form of a masked sentence (in the case of Masked Prediction), or a standard natural-language question (in other setups).

  3. Retrieval options: The user can select one supported retrieval method and the number of assertions to be retrieved per CSKB for each question.

  4. Context sources: The user selects the sources of context (i.e., “no context”, CSKBs and “custom context”). If a CSKB is selected, the system will retrieve from that KB assertions relevant to the given input question. If “custom context” is selected, the user must then enter their own content. The “no context” option is available for all setups but Span Prediction.

Output. The QA system presents its output in the form of a table which has three columns: Source, Answer(s) and Context. For Masked Prediction and Span Prediction, answers are printed with their respective confidence scores, whereas for Free/Guided Generation, only answers are printed. For Span Prediction, in which answers come directly from given contexts, we also highlight the answers in the contexts.

An example of the QA demo’s output for the question “What do rabbits eat?” under the Free Generation setting can be seen in Fig. 3. One can observe that language models’ predictions are heavily influenced by given contexts. Without context, GPT-2 is only able to generate an evasive answer.
When given context, it tends to re-generate the first sentence in the context first (e.g., see the answers aligning with ASCENT, TupleKB and ConceptNet in Fig. 3). For the context retrieved from Quasimodo, GPT-2 is able to overlook the erroneous first sentence; however, its generated answer is rather elusive despite the fact that subsequent statements in the context all contain direct answers to the question.

The question “Bartenders work in [MASK].” under the Masked Prediction setting is another example for the influence of context on LMs’ output. Since bartender is a subject well covered by the ASCENT KB, the assertions pulled out are all relevant (i.e., Bartenders work in bar. Bartenders work in restaurant. ...), which helps guide the LM to a good answer (bar). Meanwhile, because this subject is not present in TupleKB, its retrieved statements are rather unrelated (Work capitals have firm. Work experiences include statement. ...). Given that, the top-1 prediction for this KB was tandem, which is obviously an evasive answer.

Figure 3: Free Generation output for question: “What do rabbits eat?”.

5 Related work

CSKB construction. Cyc (Lenat, 1995) is the first attempt to build a large-scale commonsense knowledge base. Since then, there have been a number of other CSKB construction projects, notably ConceptNet (Speer and Havasi, 2012), WebChild (Tandon et al., 2014, 2017), TupleKB (Mishra et al., 2017), and more recently Quasimodo (Romero et al., 2019), Dice (Chalier et al., 2020), Atomic (Sap et al., 2019), and CSKG (Ilievski et al., 2020). The early approach to building a CSKB is based on human annotation (e.g., Cyc with expert annotation and ConceptNet with crowdsourcing annotation). Later projects tend to use automated methods based on open information extraction to collect CSK from texts (e.g., WebChild, TupleKB and Quasimodo). Lately, CSKG is an attempt to combine various commonsense knowledge resources into a single KB. The common thread of these CSKBs is that they are all based on SPO triples as knowledge representation, which has shortcomings (Nguyen et al., 2021). ASCENT is the first attempt to build a large-scale CSKB with assertions equipped with semantic facets built upon the ideas of semantic role labeling (Palmer et al., 2010).

KB visualization. Most CSKBs share their content via CSV files. Some, like ConceptNet, WebChild, Atomic and Quasimodo, have a web portal to visualise their assertions. The most common way for CSKB visualisation is to use a single page for each subject and group assertions by predicate (e.g., in ConceptNet and WebChild). Quasimodo, on the other hand, implements a simple search interface to filter assertions and presents assertions in a tabular way (Romero and Razniewski, 2020). The ASCENT demo has both functionalities: exhibiting assertions of each concept in a separate page, and supporting assertion filtering. Our demo also uses an SVG-based visualisation of assertions with semantic facets, which are a distinctive feature of the ASCENT knowledge model.

Context in LM-based question answering. Priming large pretrained LMs with context in QA-like tasks is a relatively new line of research (Petroni et al., 2020; Guu et al., 2020). In our original paper, we made the first attempt to evaluate the contribution of CSKB assertions to QA via four different setups based on that idea. While others use commonsense knowledge for (re-)training language models (Hwang et al., 2021; Ilievski et al., 2021; Ma et al., 2021; Mitra et al., 2020), to the best of our knowledge, our demo system is the first to visualize the effect of priming vanilla language models, i.e., without task-specific retraining.

6 Conclusion

We presented a web portal for a state-of-the-art commonsense knowledge base, the ASCENT KB. It allows users to fully explore and search the CSKB, inspect the construction process of each assertion, and observe the impact of structured CSKBs on different QA tasks. We hope that the portal enables interesting interactions with the ASCENT methodology, and that the QA demo allows researchers to explore the potential of combining structured data with pre-trained language models.

References

Tom B Brown et al. 2020. Language models are few-shot learners. In NeurIPS.

Yohan Chalier, Simon Razniewski, and Gerhard Weikum. 2020. Joint reasoning for multi-faceted commonsense knowledge. In AKBC.

Peter Clark et al. 2020. From ‘F’ to ‘A’ on the NY Regents science exams: An overview of the Aristo project. AI Magazine.

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. In NAACL.

Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, and Ming-Wei Chang. 2020. REALM: Retrieval-augmented language model pre-training. In ICML.

Jena D Hwang, Chandra Bhagavatula, Ronan Le Bras, Jeff Da, Keisuke Sakaguchi, Antoine Bosselut, and Yejin Choi. 2021. COMET-ATOMIC 2020: On symbolic and neural commonsense knowledge graphs. In AAAI.

Filip Ilievski, Alessandro Oltramari, Kaixin Ma, Bin Zhang, Deborah L McGuinness, and Pedro Szekely. 2021. Dimensions of commonsense knowledge. arXiv preprint arXiv:2101.04640.

Filip Ilievski, Pedro Szekely, and Bin Zhang. 2020. CSKG: The commonsense knowledge graph. arXiv preprint arXiv:2012.11490.

Douglas B Lenat. 1995. Cyc: A large-scale investment in knowledge infrastructure. Communications of the ACM.

Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692.

Kaixin Ma, Filip Ilievski, Jonathan Francis, Yonatan Bisk, Eric Nyberg, and Alessandro Oltramari. 2021. Knowledge-driven data construction for zero-shot evaluation in commonsense question answering. In AAAI.

John McCarthy. 1960. Programs with common sense. RLE and MIT computation center.

Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. In ICLR.

George A Miller. 1995. WordNet: a lexical database for English. Communications of the ACM.

Bhavana Dalvi Mishra, Niket Tandon, and Peter Clark. 2017. Domain-targeted, high precision knowledge extraction. TACL.

Arindam Mitra, Pratyay Banerjee, Kuntal Kumar Pal, Swaroop Mishra, and Chitta Baral. 2020. How additional knowledge can improve natural language commonsense question answering? arXiv preprint

Don Monroe. 2020. Seeking artificial common sense. Communications of the ACM.

Tuan-Phong Nguyen, Simon Razniewski, and Gerhard Weikum. 2021. Advanced semantics for commonsense knowledge extraction. In WWW.

Martha Palmer, Daniel Gildea, and Nianwen Xue. 2010. Semantic role labeling. Synthesis Lectures on Human Language Technologies.

Fabio Petroni, Patrick Lewis, Aleksandra Piktus, Tim Rocktäschel, Yuxiang Wu, Alexander H Miller, and Sebastian Riedel. 2020. How context affects language models’ factual predictions. In AKBC.

Radityo Eko Prasojo, Mouna Kacimi, and Werner Nutt. 2018. Stuffie: Semantic tagging of unlabeled facets using fine-grained information extraction. In CIKM.

Julien Romero and Simon Razniewski. 2020. Inside Quasimodo: Exploring construction and usage of commonsense knowledge. In CIKM.

Julien Romero, Simon Razniewski, Koninika Pal, Jeff Z. Pan, Archit Sakhadeo, and Gerhard Weikum. 2019. Commonsense properties from query logs and question answering forums. In CIKM.

Maarten Sap et al. 2019. Atomic: An atlas of machine commonsense for if-then reasoning. In AAAI.

Robyn Speer and Catherine Havasi. 2012. ConceptNet 5: A large semantic network for relational knowledge. Theory and Applications of Natural Language

Alon Talmor, Yanai Elazar, Yoav Goldberg, and Jonathan Berant. 2019. oLMpics - on what language model pre-training captures. TACL.

Niket Tandon, Gerard de Melo, Fabian M. Suchanek, and Gerhard Weikum. 2014. WebChild: harvesting and organizing commonsense knowledge from the web. In WSDM.

Niket Tandon, Gerard de Melo, and Gerhard Weikum. 2017. WebChild 2.0: Fine-grained commonsense knowledge distillation. In ACL.

Sheng Zhang, Rachel Rudinger, Kevin Duh, and Benjamin Van Durme. 2017. Ordinal common-sense inference. TACL.