Context-Sensitive Malicious Spelling Error Correction

Page created by Beverly Sparks
 
CONTINUE READING
Context-Sensitive Malicious Spelling Error Correction

                                                           Hongyu Gong, Yuchen Li, Suma Bhat, Pramod Viswanath
                                                                  University of Illinois at Urbana-Champaign
                                                         {hgong6, li215, spbhat2, pramodv}@illinois.edu

                                                              Abstract                           sider the sentence, “My thoughts are that peo-
                                                                                                 ple should stop being stupid.” which was as-
                                             Misspelled words of the malicious kind work
                                                                                                 signed a perceived toxicity score of 86% by Per-
                                             by changing specific keywords and are in-
arXiv:1901.07688v1 [cs.CL] 23 Jan 2019

                                             tended to thwart existing automated applica-
                                                                                                 spective. When the word “stupid” was misspelled
                                             tions for cyber-environment control such as         as “stup*d” in the same sentence, its toxicity
                                             harassing content detection on the Internet and     score reduces to 8%. This marked reduction in-
                                             email spam detection. In this paper, we fo-         duced by the deceptive spelling error on the key-
                                             cus on malicious spelling correction, which re-     word ‘stupid’ reflects the importance of accurate
                                             quires an approach that relies on the context       spelling correction for optimal toxicity detection.
                                             and the surface forms of targeted keywords.         Had the spelling error been detected and corrected
                                             In the context of two applications–profanity
                                                                                                 before scoring for toxicity, the score would not
                                             detection and email spam detection–we show
                                             that malicious misspellings seriously degrade       have lowered.
                                             their performance. We then propose a context-          Likewise, deliberate typographic errors are
                                             sensitive approach for malicious spelling cor-      common in phishing scams and spam emails. For
                                             rection using word embeddings and demon-            instance, “Please pya any fees you may owe to our
                                             strate its superior performance compared to         company” where pya is clearly a spelling error, in-
                                             state-of-the-art spell checkers.                    cluded to deceive spam filters that are sensitive to
                                                                                                 keywords. In both these instances, spelling cor-
                                         1   Introduction
                                                                                                 rection as a preprocessing step of the messages are
                                         Automatic spelling correction has been an im-           critical for a robust performance of the target sys-
                                         portant natural language processing component of        tem. Spelling correction in the preprocessing stage
                                         any input text interface of today (Jurafsky and         in malicious settings constitutes the focus of this
                                         Martin, 2014). While spelling errors in common          study.
                                         writing could be regarded as innocuous omissions,          In this paper, we empirically demonstrate the
                                         affecting opinions about the writer’s competence        effect of spelling errors in a malicious setting by
                                         at worst, those of the malicious kind can poten-        adding synthetic misspellings to sensitive words
                                         tially be highly offensive, since they are intended     in the context of two applications–profanity de-
                                         to deceive an automated mechanism–be it a spam          tection and spam detection. We propose an un-
                                         filter for emails or a profanity detector for social    supervised embedding-based algorithm to cor-
                                         media. For those applications that rely on au-          rect the targeted misspelled words. Earlier ap-
                                         tomatic detection of specific keywords to enable        proaches to spelling correction primarily depend
                                         real-time control of the cyber-environment, detect-     on the edit distance to find words morphologi-
                                         ing and correcting spelling errors is of primary im-    cally similar to corrections. More recently, spell
                                         portance.                                               checkers have been improved with the addition of
                                             Perspective (Google, 2017), is one such ap-         contextual information (e.g., n-gram modeling in
                                         plication that helps reduce abusive content on-         (Wint et al., 2018)), often with intensive compu-
                                         line by detecting profane language. It has been         tation and memory requirements (to obtain and
                                         pointed out that Perspective’s otherwise good per-      store N-gram statistics). A recently studied neu-
                                         formance in detecting toxic comments is not ro-         ral network-based spell checker learns the mis-
                                         bust to spelling errors (Hosseini et al., 2017). Con-   spelling pattern from annotated train data in a su-
pervised way, and was found to be sensitive to         misspelled sensitive words (Zhong, 2014) because
data domains and dependent on human annota-            word-based spam filters tend to make false deci-
tions (Ghosh and Kristensson, 2017). Our ap-           sions in the presence of misspellings (Saxena and
proach to make the correction procedure context-       Khan, 2015), and so are word-based cyberbullying
aware involves harnessing the geometric proper-        detection systems (Agrawal and Awekar, 2018).
ties of word embeddings. It is light-weight in         Misspelling is also seen in web attacks where a
comparison, unsupervised, and can be adapted to        phishing site has a domain name as a misspelling
a variety of data domains including Twitter data       of a legitimate website (Howard, 2008).
and spam data, since domain information can be         Approaches to correct malicious misspellings.
well captured by tuning embeddings on domain-          Given the malicious intent of misspellings in the
specific data (Taghipour and Ng, 2015).                context of the Internet, recent studies have pro-
   We first demonstrate the effect of spelling er-     posed correction strategies specifically for mali-
rors in a malicious setting in the context of two      cious misspellings. Levenshtein edit distance is
applications–profanity detection and spam detec-       commonly used in spell checking of detection sys-
tion. Then we propose an unsupervised algo-            tems. Spam filters (Zhong, 2014) and cyberbul-
rithm to correct the targeted spelling errors. Be-     lying detection systems (Wint et al., 2017) rely
sides performing well on the correction of syn-        on the idea of edit distance to recognize cam-
thetic spelling errors, the algorithm also performs    ouflaged words by capturing the pattern of mis-
well on spell checking of real data, and shows that    spellings. For example, (Rojas-Galeano, 2013)
it can improve hate speech detection using Twitter     studied an edit distance function designed to re-
data.                                                  veal instances where spammers had interspersed
                                                       black-listed words with non-alphabetic symbols.
2   Related Work                                       Lexical and context statistics, such as word fre-
                                                       quency, have also been used to make corrections
Spelling correction. Automatic spelling correc-        in social texts (Baziotis et al., 2017; Jurafsky and
tion in non-malicious settings has had a long re-      Martin., 2017). Existing approaches have the
search track as an important component of text         problem of domain specificity, since their lexical
processing and normalization. Early works have         statistics are obtained from a certain domain which
used edit distance to find morphologically simi-       might differ from the domain of target applica-
lar corrections (Ristad and Yianilos, 1998), noisy     tions.
channel model for misspellings (Jurafsky and              Our study considers the spelling correction in a
Martin, 2014), and iterative search to improve cor-    malicious setting where errors are not random, but
rections of distant spelling errors (Gubanov et al.,   are carefully introduced. Our context-aware ap-
2014). Word contexts have been shown to be             proach to spelling correction relies on the geome-
improve the robustness of spell checkers with n-       try of word embeddings, which has the advantage
gram language model as one approach to incorpo-        of efficient domain adaptation. As we will show in
rate contextual information (Hassan and Menezes,       Section 4.2, the embedding can be easily tuned on
2013; Farra et al., 2014). Other ways of incor-        a small corpus from the target domain to capture
porating contextual information include n-gram         domain knowledge.
statistics capturing the cohesiveness of a candidate
word with the given context (Wint et al., 2018).       3   Methods
Effect of malicious misspellings. Misspellings
are commonly seen in online social platforms such      We study malicious spelling correction for three
as Twitter and Facebook, and recent studies have       target applications–toxicity detection using the
drawn attention to the fact that online users de-      Perspective API, email spam detection, and hate
liberately introduce spelling errors to thwart bul-    speech detection on Twitter data. We work with
lying detection (Power et al., 2018) or to chal-       synthetic spelling errors in the first two applica-
lenge to moderators of online communities (Pa-         tions, and study the real misspellings in the third.
pegnies et al., 2017). This is because simple ways        For the toxicity detection task, we use the Per-
of filtering terms of profanity are rendered inad-     spective dataset (Google, 2017), which provides
equate in the presence of spelling errors. Like-       a set of comments collected from the Internet with
wise, email spam filters are often obfuscated by       human annotated scores of toxicity. Using the Per-
spective API we obtained the toxicity score for          stud and stupid can be thought of as the correct
2767 comments in the dataset.                            form of the erroneous word stupd. In such cases,
   The spam dataset consists of 960 emails from          spelling correction relies on the context in which
the Ling-Spam dataset 1 . We randomly split the          the word occurs.
data into 700 train emails and 260 test emails.
                                                         3.2   Effect of malicious spelling errors
Both the train and test sets were balanced with
respect to the positive and negative classes. The        We quantify the effect of spelling errors by show-
most frequent 2500 words in the training data were       ing that a simple mechanism of injecting malicious
selected, and we counted the occurrence of each          spelling errors greatly degrades the performance
word in the emails. The number of occurrences            of toxicity detection and spam detection. Toward
of these frequent words were used as a 2500-             this, we first describe the general mechanism to
dimension feature vector for each email. We first        generate synthetic errors and then study their im-
used these features to train a Naive Bayes classi-       pact on toxicity and spam detection. We point
fier. It achieved an accuracy of 98% in spam de-         out that owing to the absence of a dataset with in-
tection.                                                 tended malicious errors, we had to generate them
   As for the Twitter data, a total of 16K user posts    synthetically.
were collected from Twitter, a social communica-            Spelling error generation. We first choose the
tion platform and used in used in (Waseem and            “sensitive” words that the detection algorithm re-
Hovy, 2016). Of these tweets, 1937 were labeled          lies (the mechanism will be discussed separately
as being racist, 3117 were labeled as being sexist,      for each application we consider). We then replace
and the rest were neither racist nor sexist. The task    these words with their erroneous forms (which
of hate speech detection is to classify these tweets     may not be real words). Given the characteristics
into one of the three categories racist, sexist, nei-    of malicious errors mentioned in Section 3.1, our
ther. We randomly split the tweets into train and        assumption is that as a result of these errors the
test data, and trained a neural-network based hate       altered words will appear similar to the original
speech detection system on the training data (de-        ones. As illustrated in Table 1, we consider four
scribed later in Section 5.3).                           basic character-level operations to change the sen-
                                                         sitive words: insertion, permutation, replacement,
3.1   Characterization                                   and removal. In our experiments we perform these
                                                         operations on randomly picked characters of the
Here we summarize our assumptions on the char-
                                                         sensitive word. For permutation, we exchange this
acteristics of the malicious spelling errors.
                                                         character with the next one. Each operation in Ta-
(1) Malicious errors are usually made on the sensi-
                                                         ble 1 increases the edit distance by 1, and we gen-
tive keywords in order to obfuscate the true inten-
                                                         erate malicious misspellings of the sensitive word
tions and deceive detection systems that rely on
                                                         that are at most 2 edit distance from the correct
keywords (Weinberger et al., 2009).
                                                         word.
(2) The error yields a word that is similar to the
original word in surface form. The misspelled                        Table 1: Common spelling errors
words would be words that humans can easily un-
                                                              Type       insertion   permutation   replacement   removal
derstand (thus the erroneous words still convey the           Error        idio.t      moeny          chanse      stupd
intended meaning, while being in disguise). Their           Correction     idiot       money          chance      stupid
similarity is reflected in the small edit distance be-
tween the erroneous and the correct word (Mor-              Finally, we obtain revised sentences by replac-
gan, 1970). The errors often involve character-          ing the sensitive words in the original sentences
level operations as shown in Table 1 (Liu et al.,        with their erroneous counterparts. Next, we intro-
2011).                                                   duce our mechanism for selecting the “sensitive”
(3) With edit distance as the similarity criterion, it   words for each application.
is often the case that the word with an error is sim-       Perspective toxicity detection. We select 2767
ilar to multiple valid words, making correction a        clean comments from a total of 2969 comments
challenge in real applications. For example, both        in the Perspective data, with the selection criteria
                                                         given below.
  1
    http://csmining.org/index.php/
ling-spam-datasets.html                                    • The most toxic word in a comment should
contain more than two characters. A word           tection. In this section, we introduce our unsu-
      that is too short is not likely to be a meaning-   pervised algorithm on non-word error correction
      ful toxic word, and can have too many candi-       based on the relevant context.
      date corrections in the dictionary.                   Our correction method proceeds in two stages.
                                                         In the first stage of candidate enumeration, the
    • The most toxic word in a comment should ap-        algorithm detects the spelling error and proposes
      pear at least 100 times in the dataset, since      candidates that are valid words and similar in sur-
      rare words tend to be misspellings in online       face form to the misspelled word. In the next
      texts.                                             stage, correction via context, the best candidate
                                                         is chosen based on its context.
    • The most toxic word in a comment should
      be a content word, i.e., it should not be-         4.1   Candidate Enumeration
      long in the list of function words such as
      “must”, “ought”, “shall” and“should”. Func-        As in the case of non-word spelling correction
      tion words are not toxic and so were excluded      (Jurafsky and Martin, 2014), we check whether
      from our experiments.                              an erroneous word is valid or not using a vo-
                                                         cabulary of valid words. For toxicity detection
   For each comment, its predicted toxicity score        and spam detection, the vocabulary consists of
lies in the range of [0, 1]. The higher the score, the   words that occur more than 100 times in En-
more toxic the sentence is. For each sentence, we        glish Wikipedia corpus 2 For hate speech detec-
chose the most toxic word to be that word, with-         tion, the vocabulary consists of standard words
out which the toxicity score reduced the most. We        from Wikipedia and the list of internet slang words
then added malicious errors, as described above, to      as detailed in Section 5.3, since slang words fre-
this word. We note that for 2380 toxic comments          quently occur in tweets. For an invalid word, we
with toxicity scores higher than 0.5, spelling er-       find candidate words with the smallest Damerau-
rors brought down their predicted toxicity by 32%.       Levenshtein edit distance (Damerau, 1964; Leven-
Section 5 provides the details of the degradation in     shtein, 1966). This distance between two strings
toxicity detection.                                      is defined as the minimum number of unit oper-
   Spam detection. A Naive Bayes spam detec-             ations required to transform a string into another,
tion model provides the likelihood of each word          where the unit operations include character addi-
given the spam and non-spam classes. For a given         tion, deletion, transposition and replacement. We
word, the difference between these probabilities         note that because there are potentially multiple
reflects how important that word is in spam detec-       candidates having the smallest distance to the mis-
tion. We sorted all the words in decreasing order        spelled word, the output of this stage is a set of
of this difference in probabilities, and picked the      candidate corrected words.
most spam-indicative words. We then added ma-
licious errors to these words in test spam emails.       4.2   Correction via Context
For a spam filtering system which relies heavily         After enumerating all possible candidates, we
on the counts of spam-indicative words, the errors       choose the one that fits the best in the context.
can easily disguise spams as non-spams. This is          We propose a word-embedding-based method to
seen in the test accuracy dropping from 98% on           match the context with the candidate.
the original test data to 72% on the revised test        Pretrained embedding.         We use word2vec
data.                                                    CBOW model to train embeddings (Mikolov et al.,
   We note that for the Perspective data, we added       2013). The Perspective and Twitter data have an
errors to only one word per sentence, but did not        informal style, while email data consists of rela-
have such limits for the spam data. Also, we do          tively formal expressions. Word embeddings are
not limit the number of corrections during the spell     known to be domain-specific (Nguyen and Gr-
check process.                                           ishman, 2014) and naturally domain-specific cor-
                                                         pora are used to train word embeddings for use
4    Spelling Correction                                 in a given domain. We note that in the absence
We have shown that malicious errors degrade the            2
                                                             available at:   http://www.cs.upc.edu/˜nlp/
performance of toxicity detection and spam de-           wikicorpus/
of a large enough representative corpus to train          Table 2: Correction Accuracy on Perspective and Spam
domain-specific high-quality embeddings for this                Dataset     Our system   PyEnchant       Ekphrasis   Google
study, we reconcile with word embeddings trained              Perspective     0.840        0.476          0.713      0.323
on a large WikiCorpus (Al-Rfou et al., 2013) to                 Spam          0.786        0.618          0.639      0.526
capture the general lexical semantics and further
tuned on a domain-specific corpus, such as that of
                                                          tors:
the Perspective dataset, the spam emails data and
                                                                                            p
the Twitter data. This step allows us to combine                                     1      X
domain information in trained embeddings.                     dist(c, Tp ) = min          k   ai vi − vc k22 . (1)
                                                                             {ai } kvc k2
                                                                                               i=−p,
Error Correction. Word vectors permit us to de-                                                 i6=0
cide the fit of a word in a given context by consid-
ering the geometry of the context words in relation       This is a quadratic minimization problem for
to the given word. Firstly, the embedding of the          which we can find a closed-form solution.
compositional semantics of phrases or sentences              Instead of fixing one context window size, we
can be approximated by a linear combination of            consider multiple window sizes and weigh the dis-
the embeddings of their constituent words (Salehi         tances obtained using different window sizes. The
et al., 2015). Let vw be the embedding of the token       context words that are closer to the misspelled
w. Take the phrase “lunch box” as an example, it          word are more informative in candidate selection.
holds that                                                In the sentence “the stu*pid and stubborn adminis-
                                                          trators”, the word stubborn suggests that the mis-
                                                          spelled word should be a negative personality ad-
           vhate group ≈ αvhate + βvgroup ,               jective, and the word administrator that it should
                                                          be an adjective for people. Thus, the closest con-
where α and β are real-valued coefficients.               text word stubborn provides more relevant infor-
   This enables us to represent the context as a lin-     mation than the distant word administrator.
ear space of the component words. Another prop-              We thus weigh the distance by the inverse of the
erty is that semantically relevant words lie close in     context window size, i.e., the weight p1 for the win-
the embedding space. If a word fits the the con-          dow size p. Suppose that T is the full context, and
text, it should be close to the context space, i.e.,      the candidate-context distance is defined below:
the normalized Euclidean distance from the word                                          P
embedding to the context space should be small
                                                                                         X 1
                                                                       dist(c, T ) =               · dist(c, Tp ),       (2)
(Gong et al., 2017). We quantify the inconsistency                                             p
                                                                                         p=1
between the word and its context by the distance
between the word embedding and the span of the            where P = 4 in our experiments.
context words.                                               Given the context T , the suggested correction
                                                           ∗
                                                          c is the candidate with the smallest distance to
   For a misspelled word in a sentence, all words
                                                          the linear space of context words, i.e.,
occurring within a window of size p are considered
to be its context words. Let the context Tp be the                          c∗ = arg min dist(c, T ).                    (3)
set of words within distance p from w0 :                                                 c∈C

  Tp = {w−p , w−p+1 , . . . , w0 , . . . , wp−1 , wp },   5      Experiments
                                                          We evaluated our spelling correction approach in
where w0 is the misspelled word. Let C be the             three settings: toxicity and spam detection with
set of candidate replacements of w0 . Let vi be the       synthetic misspellings, and hate speech detection
word embedding of context word wi . We denote             with real spelling errors. Even though some recent
by dist(c, Tp ), the distance between a candidate c       works using neural networks (Ghosh and Kris-
and the linear span of the words in its context Tp ,      tensson, 2017) are available, they require ground
termed as the candidate-context distance of a can-        truth corrections for supervised training. For our
didate, defined as the normalized distance between        unsupervised approach, we compare its perfor-
the word embedding of the candidate, vc and its           mance with that of three strong unsupervised base-
linear approximation obtained by the context vec-         lines below, which are generic off-the-shelf tools
Table 3: Correction on perspective data

 original sentence   the stupid and stubborn administrators      anti American hate groups   you’re a biased fuck
 revised sentence    the stu*pid and stubborn administrators     anti American ahte groups   you’re a biased fucdk
 Our correction      the stupid and stubborn administrators      anti American hate groups   you’re a biased fuck
 PyEnchant           the sch*pid and stubborn administrators     anti American hate groups   you’re a biased Fuchs
 Ekphrasis           the stupid and stubborn administrators      anti American ate groups    youre a biased fuck
 Google              the stupid and stubborn administrators      anti American ahte groups   you’re a biased fucdk

used in many NLP systems such as AbiWord (abi,             bin is 0.55, whereas the revised toxicity is only
2018) and Google search engine (goo, 2018).                0.33 (a drop by 40%). This shows that toxicity
(1) Pyenchant: a generic spell checking library of         detection is very sensitive to spelling errors.
multiple correction algorithms (Kelly, 2015). We              From the same figure we see that our proposed
use the MySpell library in this work.                      error correction results in toxicity scores closer to
(2) Ekphrasis: a spelling correction and text nor-         the original ones when compared with the base-
malization tool for texts from social networks             lines, validating the effectiveness of our approach.
(Baziotis et al., 2017).                                   We note that in some cases our approach might re-
(3) Google spell checker: a Google search engine-          sult in a slightly higher toxicity score than the orig-
based spell check (Coad, 2018). We chose it for its        inal one, because of the pre-existing misspellings
ability to undertake context-sensitive correction on       in the original sentences. For example, original
the basis of user search history.                          sentences “you are a fagget”, “fukkin goof’s track
                                                           record” and “I suspect that closedmouth is g*y”
5.1   Toxicity Detection
                                                           have crude misspellings, which are used as In-
                                                           ternet slang. Our approach corrects these pre-
                                                           existing misspellings in addition to the errors we
                                                           add later, resulting in higher toxicity scores of cor-
                                                           rected sentences. We perform corrections of all
                                                           spelling errors, recovering more toxic words than
                                                           we added maliciously.

                                                           5.2     Spam Detection

  Figure 1: Toxicity scores by different approaches

  We took 2380 sentences from Perspective
dataset whose toxicity scores were higher than 0.5
and added synthetic errors maliciously to these
sentences as described in Section 3. Some ex-
ample Perspective sentences, revised sentences
and corrections given by different algorithms are              Figure 2: Spam detection accuracy on test emails
shown in Table 3. The correction accuracy
achieved by different approaches is reported in Ta-           We next evaluate our spelling correction on the
ble 2.                                                     spam data. Synthetic malicious errors were added
  We divided toxic sentences into 5 bins accord-           to spam-indicative words in spam mails based on
ing to their original toxicities. In Fig. 1, we report     our misspelling generation mechanism.
the average toxicity of the original, revised and             Some example spams are shown in Table 4.
corrected sentences in each bin. We see that the           Sensitive words such as “money” which are in-
malicious errors drastically reduce the sentences’         dicative of spams are highlighted, and the cor-
toxicities. The original average toxicity in the first     rections given by different approaches are shown.
Table 4: Correction examples on spam data

                       it really be a great oppotunity to make relatively    we have quit our jobs , and will soon buy a home on
 Original sentence
                       easy money , with little cost to you .                the beach and live off the interest on our money .
                       it really be a grfat opportunity to make relatively   we have quit our tobs, and will soon buy a home on
 Revised sentence
                       easy fmoney, with little cosgt to you.                the beach and live off the interest on our jmoney.
                       it really be a great opportunity to make relatively   we have quit our jobs, and will soon buy a home on
 Our correction
                       easy money, with little cost to you.                  the beach and live off the interest on our money.
                       it really be a graft opportunity to make relatively   we have quit our bots, and will soon buy a home on
 PyEnchant
                       easy fmoney, with little cost to you.                 the beach and live off the interest on our money.
                       it really be a great opportunity to make relatively   we have quit our tobs, and will soon buy a home on
 Ekphrasis
                       easy money, with little cost to you.                  the beach and live off the interest on our money.
                       it really be a great opportunity to make relatively   we have quit our jobs, and will soon buy a home on
 Google
                       easy fmoney, with little cosgt to you.                the beach and live off the interest on our jmoney.

Table 2 shows the spelling correction accuracy                      preprocessing tool which can replace aforemen-
achieved by our algorithm and the baselines for                     tioned tokens with a special set of tokens. For
the spam data.                                                      example, one tweet is “#isis #islam pc puzzle:
   Fig. 2 shows the spam detection accuracy on the                  converting to a religion of peace leading to vio-
original, revised and corrected test emails. Again                  lence ?,http://t.co/tbjusaemuh http://t.co/g4xoh...”,
we see that malicious errors resulted in a large                    which becomes “$HASHTAG$ $HASHTAG$ pc
accuracy drop, the accuracy was restored after                      puzzle: converting to a religion of peace leading
spelling correction, and the increase in accuracy                   to violence? $URL$ $URL$...” after preprocess-
using our approach was the maximal among those                      ing. This preprocessing stage can clean texts with-
compared.                                                           out dealing with misspellings. Tweets are prepro-
                                                                    cessed right before they are fed into the neural net-
5.3   Real Misspellings in Tweet Hate Speech                        work.
      Detection
                                                                       Hate speech detection. The state-of-the-art
Table 5: Example of tweets and hate speech categories               system for hate speech detection is a Bidirectional
                                                                    Long Short Term Memory (BLSTM) network
 Category    Example sentence                                       with attention mechanism (Agrawal and Awekar,
             #isis #islam pc puzzle: converting to a religion
  Racism                                                            2018). The model takes a sequence of words in a
             of peace leading to violence? http://t.co/tbjusaemuh
             Real question is do feminist liberal bigots            tweet, obtains word embeddings in the embedding
  Sexism     understand that different rules for men/women          layer, and passes them to bidirectional recurrent
             is sexism
  Neither    The mother and son team are sooooo nice !!!            layers to generate a dense representation of the in-
                                                                    put tweet. The feedforward layer takes the tweet
                                                                    vector and predicts its probability distribution over
   We have shown that synthetic misspellings are
                                                                    all three classes. The class with the highest likeli-
able to deceive the toxicity detector and the spam
                                                                    hood is chosen as the category of the input tweet.
detector, and reported the performance of spell
                                                                    In our experiment, we use a BLSTM model for the
checkers on the synthetic data. We also experi-
                                                                    hate speech detection task.
mented with data containing real spelling errors
collected from Twitter, where instances of hate                        Vocabulary construction. Firstly we build a
speech contain user-generated spelling errors. The                  vocabulary list using both a standard dictionary
correction performance of spell checkers are now                    and the frequent words (frequency higher than 5)
compared on the real misspellings.                                  in the training data. This list will serve as a ref-
   Tweet normalization. Some example tweets                         erence for spelling correction; words outside the
of racist and sexist nature are shown in Table 5.                   list will be taken as spelling errors and replaced
Tweets are notoriously noisy and unstructured                       with legal words from the vocabulary. The rea-
given the frequent occurrences of hashtags, URLs,                   son for collecting words from the training data is
reserved words and emojis. These non-standard                       to include the Internet slangs in tweets which may
tokens greatly increase the vocabulary size of                      not exist in the standard dictionary. Some exam-
tweets while also injecting noise to classifica-                    ples are “tmr” (for tomorrow), “fwd” (for forward)
tion tasks. We use Tweet Preprocessor, a tweet                      and “lol” (for laughing out loudly). Some previ-
Table 6: Hate speech detection results with different spelling corrections

          Category             Metric       Original    Our system      PyEnchant      Ekphrasis   Google
                              Precision      0.630        0.640           0.630         0.630      0.569
            racist             Recall        0.617        0.681           0.617         0.617      0.702
                              F1 score       0.623        0.660           0.623         0.623      0.628
                              Precision      0.641        0.630           0.629         0.629      0.649
            sexist             Recall        0.775        0.767           0.790         0.790      0.759
                              F1 score       0.701        0.692           0.701         0.701      0.700
                              Precision      0.721        0.721           0.718         0.718      0.703
     Macro average over
                               Recall        0.741        0.757           0.743         0.743      0.759
       all categories
                              F1 score       0.727        0.737           0.727         0.727      0.727

ous works proposed to replace these slang terms             results on the 571 test tweets to evaluate the ef-
with their standard forms (Gupta and Joshi, 2017;           fect of spelling correction on this downstream ap-
Modupe et al., 2017), which require either ex-              plication. As shown in Table 6, We report results
pert knowledge or human annotations. We argue               on the original test data, and also on the test data
that because these Internet slangs change rapidly           which are corrected by our approach, PyEnchant,
we should understand them in a data-driven man-             Ekphrasis, and google spell checker. We report
ner instead of standardizing them based on hu-              precision, recall and F1-score of racist and sexist
man knowledge. Since the representation of these            classification respectively and the macro-averages
slangs and their use in hate speech will be learned         (to evaluate the overall performance).
by the neural network from the train data, we add              The best performance of each metric is high-
these slangs to our vocabulary as legal words.              lighted in the table. Compared with the classifica-
   Spelling correction. Misspellings are another            tion performance on the test data with spelling er-
source of text noise which cannot be handled in the         rors for the racist category, our approach improves
tweet normalization stage. Some misspellings are            the precision by 1%, the recall by 6.4%, and the F1
user-created to deceive the online detection sys-           score by 3.7% absolute points. Both PyEnchant
tem. For example, the swear word “fucking” has a            and Ekphrasis spell checkers improve the recall of
lot of variants in tweets such as “fckn”, “f*ckin”,         sexist category, but decrease the precision, so their
“f**king”, “fckin”, “fuckin”, “fking”,“fkin” and            corrected forms achieve F1 scores similar to the
“fkn”. When a new variant of a swear word arises            original test data. Google spell checker also gives
in the test data, the model takes it as an out-of-          similar F1 score on both racist and sexist classes.
vocabulary word, and is unable to match it with             Our approach outperforms the other baselines in
any learned pattern. The purpose of spelling cor-           terms of F1 score for the racist class and the macro
rection is to map these new variants to words that          F1 score.
are known to the model. As discussed in Section 4,             For sexism-related tweets, our approach does
we enumerate the candidates of a misspelled word,           not improve the results compared to the original
and choose the candidate which best fits with the           test data. Taking a closer look at the nature of
context as its correction.                                  the sexist tweets we notice that they often con-
   Results. We do a random 80:20 train-test                 tain some abbreviations which might be taken as
split of the Twitter dataset. A detection system            misspellings, and that their contextual informa-
was trained on the normalized train data (with-             tion is insufficient to decide the appropriate cor-
out spelling correction) using the state-of-the-art         rections. For example, in the sexist tweet “I’m
BLSTM model. There were 3218 test tweets, 571               sorry but if you watch women ufc fights kys”. Our
of which had misspellings. We applied the dif-              approach replaces kys with keys, and the trained
ferent spelling correction approaches to these 571          neural network misclassified the corrected tweet.
tweets, and the corrected tweets were then cleaned          Another sexist-related tweet is “these nsw promo
as described in the tweet normalization stage. Pro-         girls think way too highly of themselves”, where
cessed tweets were input to the trained detection           nsw is incorrectly replaced with new by our ap-
system. We compared the hate speech detection               proach.
6   Conclusion                                          Jigsaw Google. 2017. Perspective. Available at:
                                                           https://www.perspectiveapi.com/.
In this study, we showed how malicious spelling
errors can deceive profanity- and spam detec-           Sergey Gubanov, Irina Galinskaya, and Alexey Baytin.
                                                          2014. Improved iterative correction for distant
tors. To deal with these malicious misspellings,          spelling errors. In ACL (2), pages 168–173.
we proposed a context-sensitive spelling correc-
tor based on word embeddings. Our spell checker         Itisha Gupta and Nisheeth Joshi. 2017. Tweet nor-
                                                            malization: A knowledge based approach. In Info-
is light-weight, unsupervised and can be easily             com Technologies and Unmanned Systems (Trends
incorporated into downstream applications. It               and Future Directions)(ICTUS), 2017 International
achieved a favorable spelling correction perfor-            Conference on, pages 157–162. IEEE.
mance when compared with general purpose spell-         Hany Hassan and Arul Menezes. 2013. Social text nor-
checking tools such as PyEnchant, Ekphrasis and           malization using contextual graph random walks. In
Google spell checkers on both synthetic and real          ACL (1), pages 1577–1586.
misspellings from different datasets.                   Hossein Hosseini, Sreeram Kannan, Baosen Zhang,
                                                          and Radha Poovendran. 2017. Deceiving google’s
                                                          perspective api built for detecting toxic comments.
References                                                arXiv preprint arXiv:1702.08138.
2018. Abiword. Available at: https://www.               Fraser Howard. 2008. Web attacks: Modern web at-
  abisource.com.                                          tacks. Network Security, 2008(4):13–15.
2018. Google search engine. Available at: https:        Dan Jurafsky and James H Martin. 2014. Speech and
  //www.google.com.                                       language processing, volume 3. Pearson London.

Sweta Agrawal and Amit Awekar. 2018. Deep learn-        Daniel Jurafsky and James H. Martin. 2017. Dis-
  ing for detecting cyberbullying across multiple so-     tributed representations of words and phrases and
  cial media platforms. In European Conference on         their compositionality. In Speech and Language
  Information Retrieval, pages 141–153. Springer.         Processing, chapter 5, pages 1–12.

Rami Al-Rfou, Bryan Perozzi, and Steven Skiena.         Ryan Kelly. 2015. Pyenchant. Available at: https:
  2013. Polyglot: Distributed word representations        //github.com/rfk/pyenchant.
  for multilingual nlp. In Proceedings of the Seven-    Vladimir I Levenshtein. 1966. Binary codes capable
  teenth Conference on Computational Natural Lan-         of correcting deletions, insertions, and reversals. In
  guage Learning, pages 183–192, Sofia, Bulgaria.         Soviet physics doklady, volume 10, pages 707–710.
  Association for Computational Linguistics.
                                                        Fei Liu, Fuliang Weng, Bingqing Wang, and Yang Liu.
Christos Baziotis, Nikos Pelekis, and Christos Doulk-     2011. Insertion, deletion, or substitution?: nor-
  eridis. 2017. Datastories at semeval-2017 task          malizing text messages without pre-categorization
  4: Deep lstm with attention for message-level and       nor supervision. In Proceedings of the 49th An-
  topic-based sentiment analysis. In Proceedings of       nual Meeting of the Association for Computational
  the 11th International Workshop on Semantic Eval-       Linguistics: Human Language Technologies: short
  uation (SemEval-2017), pages 747–754, Vancouver,        papers-Volume 2, pages 71–76. Association for
  Canada. Association for Computational Linguistics.      Computational Linguistics.
Noah Coad. 2018. Google spell check. Avail-             Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Cor-
  able at: https://github.com/noahcoad/                   rado, and Jeff Dean. 2013. Distributed representa-
  google-spell-check.                                     tions of words and phrases and their composition-
                                                          ality. In C. J. C. Burges, L. Bottou, M. Welling,
Fred J Damerau. 1964. A technique for computer de-        Z. Ghahramani, and K. Q. Weinberger, editors, Ad-
  tection and correction of spelling errors. Communi-     vances in Neural Information Processing Systems
  cations of the ACM, 7(3):171–176.                       26, pages 3111–3119. Curran Associates, Inc.
Noura Farra, Nadi Tomeh, Alla Rozovskaya, and Nizar     Abiodun Modupe, Turgay Celik, Vukosi Marivate, and
  Habash. 2014. Generalized character-level spelling      Melvin Diale. 2017. Semi-supervised probabilis-
  error correction. In ACL (2), pages 161–167.            tics approach for normalising informal short text
                                                          messages. In Information Communication Technol-
Shaona Ghosh and Per Ola Kristensson. 2017. Neural        ogy and Society (ICTAS), Conference on, pages 1–8.
  networks for text correction and completion in key-     IEEE.
  board decoding. arXiv preprint arXiv:1709.06429.
                                                        Howard L. Morgan. 1970. Spelling correction in sys-
Hongyu Gong, Suma Bhat, and Pramod Viswanath.             tems programs. In Communications of the ACM,
  2017. Geometry of compositionality.                     pages 90–94.
Thien Huu Nguyen and Ralph Grishman. 2014. Em-             Xinwang Zhong. 2014. Deobfuscation based on edit
  ploying word representations and regularization for        distance algorithm for spam filitering. In Machine
  domain adaptation of relation extraction. In ACL           Learning and Cybernetics (ICMLC), 2014 Interna-
  (2), pages 68–74.                                          tional Conference on, volume 1, pages 109–114.
                                                             IEEE.
Etienne Papegnies, Vincent Labatut, Richard Dufour,
   and Georges Linares. 2017. Impact of content fea-
   tures for automatic online abuse detection. In Inter-
   national Conference on Computational Linguistics
   and Intelligent Text Processing.

Aurelia Power, Anthony Keane, Brian Nolan, and
  Brian O’Neill. 2018.       Detecting discourse-
  independent negated forms of public textual cyber-
  bullying. Journal of Computer-Assisted Linguistic
  Research, 2(1):1–20.

Eric Sven Ristad and Peter N Yianilos. 1998. Learning
   string-edit distance. IEEE Transactions on Pattern
   Analysis and Machine Intelligence, 20(5):522–532.

Sergio A Rojas-Galeano. 2013.         Revealing non-
  alphabetical guises of spam-trigger vocables. Dyna,
  80(182):15–24.

Bahar Salehi, Paul Cook, and Timothy Baldwin. 2015.
  A word embedding approach to predicting the com-
  positionality of multiword expressions. In HLT-
  NAACL, pages 977–983.

Manish Saxena and PM Khan. 2015. Spamizer: An
 approach to handle web form spam. In Computing
 for Sustainable Global Development (INDIACom),
 2015 2nd International Conference on, pages 1095–
 1100. IEEE.

Kaveh Taghipour and Hwee Tou Ng. 2015. Semi-
  supervised word sense disambiguation using word
  embeddings in general and specific domains. In
  Proceedings of the 2015 Conference of the North
  American Chapter of the Association for Computa-
  tional Linguistics: Human Language Technologies,
  pages 314–323.

Zeerak Waseem and Dirk Hovy. 2016. Hateful sym-
  bols or hateful people? predictive features for hate
  speech detection on twitter. In Proceedings of the
  NAACL student research workshop, pages 88–93.

Kilian Weinberger, Anirban Dasgupta, John Langford,
  Alex Smola, and Josh Attenberg. 2009. Feature
  hashing for large scale multitask learning. In Pro-
  ceedings of the 26th annual international conference
  on machine learning, pages 1113–1120. ACM.

Zar Zar Wint, Theo Ducros, and Masayoshi Aritsugi.
  2017. Spell corrector to social media datasets in
  message filtering systems. In Digital Information
  Management (ICDIM), 2017 Twelfth International
  Conference on, pages 209–215. IEEE.

Zar Zar Wint, Théo Ducros, and Masayoshi Aritsugi.
  2018. Non-words spell corrector of social media
  data in message filtering systems. Journal of Dig-
  ital Information Management, 16(2).
You can also read