Ideas for research from social bots to beer - Thesis proposals - Florian Daniel
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
Thesis proposals Ideas for research from social bots to beer Florian Daniel florian.daniel@polimi.it February 25, 2019
Detection of harmful social bots Identification of harmful communication patterns Conversational screen readers Domain-specific content extraction
Empirical study shows: bots may cause harm to humans
Examples When abuse happens
What bots do
What rce, Cu sto merS vc Disclose
What else can 1
eCom me
else can Chat
Talk with user
AASlang
sensitive facts
go wrong? 10
Boo Denigrate
s tJui
they do? Redirect user SM c
Ss e
ex a Be grossly 3
m
Pu offensive Regulated
o
Ore Be indecent 4 by law
Tay
MS or obscene
Write post DeathThreat 2
o Be threatening
Geic
Post Comment post Da Make false 6
t ing allegations
SethRich Iva
Forward post Sp na
am
Action Abuse
Deceive
Pola
W rBo Spam
ise t
Like message Instag Sh
res ib
e Spread
Endorse misinformation
Trump, Colluding Not regulated
Follow user Bots
Mimic interest
Clone profile
, Jason Slotkin6
InstaClone
Participate Create user Invade space
F. Daniel, C. Cappiello, B. Benatallah. Bots Acting Like Humans: Understanding and Preventing Harm. IEEE Internet Computing,
2019, accepted for publication. https://ieeexplore.ieee.org/document/8611348Detection of harmful social bots
Tweets
Account
How do we
identify and
classify bots
Harm1 according to the
Human harm they may
Harm2 HarmN cause?Not Safe For Work
Profile picture stolen from
existing people (usually women)
Personalized profile preferences
aimed at emulating actual users
Catchy profile
description
paired with
unsafe URL
Tweets aim to
redirect users
to untrusted URLs
Lorenzo Cannone, Matteo Di Pierro. Detection and Classification of Harmful Bots in Human-Bot Interactions on Twitter.
MSc Thesis, Politecnico di Milano, December 2018.News-Spreader
Propagandistic profile
picture and tile
Propaganda High number of interactions
hashtags and
messages in
description
Lots of (political) news
retweeting
Lorenzo Cannone, Matteo Di Pierro. Detection and Classification of Harmful Bots in Human-Bot Interactions on Twitter.
MSc Thesis, Politecnico di Milano, December 2018.Spam-Bot
Profile tile with catchy
Profile picture with messages
corporate logo
Corporate link
High number of tweets
in description
with high frequency
paired with
messages of
job offers
(product sales)
Repetitive tweets aimed at
spamming URLs
Lorenzo Cannone, Matteo Di Pierro. Detection and Classification of Harmful Bots in Human-Bot Interactions on Twitter.
MSc Thesis, Politecnico di Milano, December 2018.Fake-Follower
Optional profile picture Poor profile
(usually missing) customization
Poor Low number of content posting (usually null)
description Following as main interaction
(usually
empty)
Optional (re)tweeting
activity
Lorenzo Cannone, Matteo Di Pierro. Detection and Classification of Harmful Bots in Human-Bot Interactions on Twitter.
MSc Thesis, Politecnico di Milano, December 2018.Thesis ingredients
Creation of
datasets
Multi-class ensemble classifier
Feature selection and engineering
Can we add more types of harm?
How to train classes as users use BotBuster? Web app
Figure 7.3: Client - Server architecture
Inception neural network. Users with less than 10 tweets, or less than 10Identification of harmful communication patterns
Bot code repositories
GitHub
Which potential harms can we identify
inside the code of the bots?ne
ts
n
s
ce
fac
tio
on
e
bs
ma
siv
ati
ive
ro
lleg
en
for
sit
g
to
t
Harmful code
off
ce
n
s
isin
en
ea
file
ere
i
ten
en
pa
sly
es
dm
pro
s
e
int
ec
ea
fal
es
rat
ive
s
s
e
gro
ind
clo
thr
am
ic
rea
ne
nig
ke
us
ad
ce
m
Clo
patterns
Dis
Ma
Inv
De
Sp
Sp
Ab
Be
Be
Pattern
De
Be
Action
Mi
Follow Indiscriminate follow
Whitelist-based follow
Blacklist-based follow
Twitter Phantom follow
Like Indiscriminate like
Whitelist-based like
Python
Blacklist-based like
Mass like
Tweet Fixed-content tweet
Next: patterns search AI-generated tweet
Trusted source tweet
engine + web site for Mention Indiscriminate mention
users Opt-in mention
Targeted mention
Whitelist-based mention
Blacklist-based mention
What else can we Retweet Indiscriminate retweet
learn from the code of
Whitelist-based retweet
Blacklist-based retweet
bots? Is it possible to
Mass retweet
Talk to Indiscriminate talk
trace back from Fixed-content talk
AI-generated talk
messages to code? Talk with opt-in
Targeted talk
Pause Mimic human
Andrea Millimaggi and Florian Daniel. On Social Satisfy API contraints
Bots Behaving Badly: Empirical Study of Code
Patterns on GitHub. Submitted for publication to Store Store persistently
ICWE 2019, under review. Enables Prevents Vulnerable to content abuse Vulnerable to trust abuseConversational screen readers
Amazon
Alexa
Google Assistant
Can we extract conversational knowledge from webpages?
What about “talking” to websites?Domain-specific content extraction
How effectively
can we extract
recipes from
free text? And
how do we do
it?Typical data science steps
Domain understanding
Data collection
Manual inspection of data
Hypotheses formulation
Feature engineering, data labeling
Algorithm engineering (from AI/machine learning
to statistics)
Online tool Validation and hypotheses verificationSoft skills Proposals Template Florian Daniel http://www.floriandaniel.it
You can also read