DTAI Thesis Topics Dept. Computer Science KU Leuven 2019-2020 De Raedt
DTAI Thesis Topics Dept. Computer Science KU Leuven 2019-2020 http://people.cs.kuleuven.be/~luc.deraedt/dtaithesis18-19.pdf Luc De Raedt
Lab for Declarative Languages
and Artificial Intelligence
Machine Learning
4 ZAP, 1 res. manager
± 4 post-docs
± 25 Ph.D. students
Declarative Languages and Systems
4 ZAP
± 3 post-docs
± 12 Ph.D. students
Bruynooghe: retired
Demoen: retired, still interested in education in informatics
AI is hot!
Self-driving cars
Eve (the robot scientist)
Siri
IBM Watson in Jeopardy and “Machine Reading”
AlphaGo — (Deep) learning …
DTAI's focus on AI
Machine Learning & Data Mining
how to extract knowledge from data
Uncertainty reasoning
how to represent and reason about uncertainty
Knowledge Representation
how to represent and reason about knowledge
Learning and Reasoning
DTAI's focus on Declarative
Languages
Declarative = specify the what rather than the how
Different types of languages
Logic
Functional
Constraints
Probabilistic
Explainable / Understandable AI
DTAI's methodology involves
Fundamental research
(theoretical as well as empirical)
Systems, Solvers and Software
Applications
A thesis can focus on one or more aspects,
depending on the interests of the student
This presentation does not go into depth
about techniques, but every thesis does
This presentation
Overview of research
illustrations of possible thesis topics.
List of contact persons for topics
Full information — see online
Own topic
should be aligned with the interests of a professor
Research Topics
Probabilistic Programming and Statistical Relational Learning
Predictive Learning and Clustering
Automated Data Science
Graph and Network Mining
Exploratory Data Mining
Privacy, Non-discrimination and Ethical aspects
Knowledge-Base Systems
Constraints
Verification of AI and ML
Static Analysis for Declarative Programming Languages
Functional Programming
(dtai.cs.kuleuven.be/research)
Applications
Sports Analytics
Health
Engineering & Sensors
Robotics
Games
Text and Web
Computational Creativity
Applications of the Knowledge Base System Paradigm
Research topics
Automated Data Science
Contact: Luc De Raedt
Can we (partly) automate data science?
Can we automatically derive the right features? The right representations?
Can we automatically discover what we can learn / predict?
Can we learn constraints?
Example
database about students, professors, courses, and marks …
The SYNTH project:
the democratisation of Data Science
the automation of Data Science
Automated Data Science
Contact: Luc De Raedt, Hendrik Blockeel
The Magical Ice Cream Factory
Automated Data Science
Contact: Luc De Raedt, Anton Dries
Inductive Programming
Motivation: FlashFill in Excel
Automated Data Science
Contact: Luc De Raedt
Learning formulas / constraints / program synthesis
Can we recover formulas from a CSV file?
What are the formulas here?
T1[:, 6] = SUM(T1[:, 3:5], row)
T2[:, 2] = SUMIF(T1[:, 1]=T2[:, 1], T1[:, 6])
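To make the two formulas concrete, the following minimal sketch checks whether a row-wise SUM and a SUMIF constraint hold on two small tables. The data in T1 and T2, the helper names, and the reading of the slide's notation as 1-based inclusive column ranges are all assumptions for illustration. The sketch only verifies candidate formulas; it does not show how they are learned.

```python
# Hypothetical tables; the slide's column indices are 1-based, Python's are 0-based.
T1 = [
    ["north", "q1", 1, 2, 3, 6],
    ["south", "q1", 4, 0, 1, 5],
    ["north", "q2", 2, 2, 2, 6],
]
T2 = [
    ["north", 12],
    ["south", 5],
]


def holds_row_sum(table, target, first, last):
    """Check T[:, target] = SUM(T[:, first:last], row) for every row."""
    return all(row[target - 1] == sum(row[first - 1:last]) for row in table)


def holds_sumif(t_out, out_key, out_val, t_in, in_key, in_val):
    """Check T_out[:, out_val] = SUMIF(T_in[:, in_key] = T_out[:, out_key], T_in[:, in_val])."""
    for row in t_out:
        # Sum the value column of t_in over the rows whose key matches this row's key.
        total = sum(r[in_val - 1] for r in t_in if r[in_key - 1] == row[out_key - 1])
        if row[out_val - 1] != total:
            return False
    return True


row_sum_ok = holds_row_sum(T1, target=6, first=3, last=5)
sumif_ok = holds_sumif(T2, 1, 2, T1, 1, 6)
print(row_sum_ok, sumif_ok)
```

A learner could, in principle, enumerate candidate formulas of this kind and keep those that hold on the data; that search is not shown here.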
Predictive learning and
clustering
Contact: Hendrik Blockeel, Jesse Davis
Basic idea: include in a single ensemble trees predicting
variables from many other variables
(Figure: variables X1 to X6)
Predictive learning and
clustering
Contact: Hendrik Blockeel, Jesse Davis
“Standard” machine learning
develop new algorithms for machine learning
Decision Trees
Predictive Clustering
Probabilistic Graphical Models
evaluation of machine learning (ROC etc.)
Probabilistic Programming and
Statistical Relational Learning
Contact: Luc De Raedt, Hendrik Blockeel, Jesse Davis, Gerda Janssens
Key open question in AI: integrate
Probabilistic reasoning
Logical or relational representations
Machine learning
This leads to statistical relational learning and probabilistic programming
Probabilistic Programming and
Statistical Relational Learning
E.g. ProbLog: a probabilistic Prolog
We first review some basic concepts of logic programming. An atom pred(t1, ..., tn)
consists of a predicate pred/n of arity n and terms ti. A term is either a (lowercase)
constant, an (uppercase) variable, or a functor func/n applied to n terms.
A definite clause is an expression of the form h :- b1, ..., bn, where h and the bi
are atoms. It states that h is true whenever all bi are true. If n is 0, we have a fact
f, which expresses that f is true. A substitution θ = {X1 = t1, ..., Xn = tn}
maps each variable Xi to a term ti. Applying a substitution θ to an atom a
yields aθ, in which each occurrence of Xi in a is replaced with ti.
A ProbLog [12, 2] program consists of a set of labeled facts pi :: ci, where pi
is a probability value and ci a fact, and a set of definite clauses. Each ground
instance of such a fact represents a random variable that is true with probability
pi. We use the following ProbLog program as a running example in the paper:
    0.05 :: burglary.
    0.01 :: earthquake.
    0.7 :: hears_alarm(john).
    0.6 :: hears_alarm(mary).
    alarm :- burglary.
    alarm :- earthquake.
    calls(Pers) :- alarm, hears_alarm(Pers).
It has the random variables burglary, earthquake, hears_alarm(john) and
hears_alarm(mary), and states that there is an alarm whenever there is a burglary
or an earthquake. The last clause states that if there is an alarm and a person
hears the alarm, that person will call.
Example query: P( hears_alarm(john) | burglary = true) ?
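The meaning of this program can be illustrated by brute force: enumerate every truth assignment to the four probabilistic facts, weight each assignment by the product of the fact probabilities, and add up the weights of the assignments in which a query holds. The Python sketch below does this for the running example. It is an illustration of the semantics only, not how ProbLog itself is implemented, and the function names are hypothetical.

```python
from itertools import product

# Probabilistic facts of the running example.
FACTS = {
    "burglary": 0.05,
    "earthquake": 0.01,
    "hears_alarm(john)": 0.7,
    "hears_alarm(mary)": 0.6,
}


def derive(world):
    """Apply the definite clauses to one truth assignment of the facts."""
    atoms = dict(world)
    atoms["alarm"] = world["burglary"] or world["earthquake"]
    for person in ("john", "mary"):
        atoms[f"calls({person})"] = atoms["alarm"] and world[f"hears_alarm({person})"]
    return atoms


def probability(query, evidence=None):
    """P(query | evidence), computed by summing over all possible worlds."""
    evidence = evidence or {}
    names = list(FACTS)
    numerator = denominator = 0.0
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        weight = 1.0
        for name in names:
            weight *= FACTS[name] if world[name] else 1.0 - FACTS[name]
        atoms = derive(world)
        if all(atoms[atom] == value for atom, value in evidence.items()):
            denominator += weight
            if atoms[query]:
                numerator += weight
    return numerator / denominator


p_alarm = probability("alarm")
p_calls_john = probability("calls(john)")
p_hears_given_burglary = probability("hears_alarm(john)", {"burglary": True})
print(p_alarm, p_calls_john, p_hears_given_burglary)
```

Since the facts are independent, the conditional query returns 0.7, the prior of hears_alarm(john). The enumeration visits 2^n worlds for n facts, so it is only usable on toy programs.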
To model univariate discrete distributions (e.g., uniform, Poisson), we also
allow for discrete distribution probabilistic facts of the form
X ~ distribution :: f, where X is a logical variable appearing in atom f and the
distribution is given by a probability density function. For example,
X ~ uniform(7) :: apples(X) specifies that apples(X) is true with X sampled
from the set of integers between 1 and 7 with equal probability. Each grounding of
all the variables (except X) in f denotes a random variable. All random variables …
Challenges on inference, learning, implementation, application, ...
Probabilistic Programming and
Statistical Relational Learning
Action and activity learning /
Dynamics
Travian: A massively multiplayer real-time strategy game
Commercial game run by TravianGames GmbH
~3.000.000 players spread over different “worlds”
Can we build a model of this world ?
Can we use it for playing better ?
[Thon et al. ECML 08]
Logic + Probability + Neural Networks
Contact: Hendrik Blockeel, Luc De Raedt
(Figure: sums of pairs of images given as data, a query sum, and its answer)
DeepProbLog [Manhaeve NeurIPS 2018]
Robotics (and Vision)
Contact: Luc De Raedt
Winograd's SHRDLU: "Put the blue pyramid on the block in the box"
(diagram adapted from Winograd, Understanding Natural Language (1972),
http://www.wiley.com/college/busin/icmis/oakman/outline/chap11/slides/blocks.htm)
First-MM: "Bring me the tea pot and the sugar"
Reality is harder
● Details are important! For reasoning, planning...
● We cannot ignore position, orientation, shape,
physics, etc...
● High-level concepts still useful (objects, properties
and relations, background knowledge)
The CLEVR Dataset and Variations
Robotics
Contact: Luc De Raedt
Learn probabilistic-logic model
Moldovan et al. ICRA 12, 13, 14
(Figure: objects on a shelf with the actions grasp, tap and push)
Verifying AI & ML systems
Contact: Luc De Raedt, Hendrik Blockeel, Jesse
Davis, Bettina Berendt & Wannes Meert
Verification of software has a long tradition (e.g. model checking
techniques)
How to verify systems that learn? That use AI?
Our approach: combine principles of probabilistic logics with
verification
Topics
inductive synthesis of specifications
Markov Decision Processes (& reinforcement learning)
Derive properties of learned systems …
Socially Aware Data Mining
Graph and Network Mining
Contact: Bettina Berendt
Help users
manage friends
and privacy by
data mining
Focus on privacy and
(anti-)discrimination
Text and Web
Contact: Bettina Berendt, Jesse Davis
Extraction of information from the web / social media
Taxonomy learning
Machine reading / Natural language processing
Knowledge-Base
Systems
Contact: Marc Denecker, Gerda Janssens
IDP
Advanced KBS system developed by group
FO(.) language rooted in predicate logic and logic programming
separation of domain knowledge and problem solving
Language extensions to increase expressivity
E.g. design patterns for FO(.) (past thesis)
Better solvers and more inference methods
E.g. a solver for rational numbers (past thesis)
Knowledge-Base
Systems
Contact: Marc Denecker, Gerda Janssens
Three themes for students:
Logical modeling of interesting AI problems
+ expressing AI knowledge domains
Logical analysis and implementation of software systems and tasks
+ software by applying inference on specifications
Advanced algorithmics and implementation
+ extending/optimising the IDP software package.
Applications of the Knowledge
Base System Paradigm
Logical modeling of AI problems
DAG coloring & extension
Analysing medieval manuscripts
vocabulary Vms {
extern vocabulary V
IsSource(Manuscript)
}
theory Tms : Vms {
{ ! x : IsSource(x)
Applications of the Knowledge
Base System Paradigm
Contact: Marc Denecker, Gerda Janssens
Software = Knowledge Base + Logical Inference + User
Interface
E.g., An interactive configuration system for an
insurance company
AIM : Build cheap, correct, reusable, maintainable
software from a logical specification
Applications of the Knowledge
Base System Paradigm
Winning the RuleML Challenge
Insurance application
Propagation constraints
and choices
Fill out necessary values
Knowledge-Base
Systems
Contact: Marc Denecker, Gerda Janssens
Advanced algorithmics and implementation + extending/
optimising the IDP software package.
help us win the next CP or ASP competition
+ E.g., structuring search space as a hierarchy of search
problems
+ E.g., linear programming techniques in IDP
+ E.g., improved computation of definitions
+ E.g., algorithms for revision inference (updating solutions)
Constraints
Contact: Tom Schrijvers, Marc Denecker, & Luc De Raedt
• Hyper heuristics to solve constraint satisfaction
and optimization problems — formalisation
• Search Heuristics
• Role in IDP
• Role in Data Mining
• Learning of constraints
Functional Programming
Contact: Tom Schrijvers
Haskell
★ Explicit Side-Effects
Monads, Transformers, Effect Handlers
★ Advanced Type Systems
Type Classes, Polymorphism, Kinds
★ Domain-Specific Languages
Design, Infrastructure, Applications
★ Much more…
EXPLANATION:
You know Functional Programming from the course Declaratieve Talen.
In our research we work on all functional languages, and Haskell in particular.
Current topics are:
- explicit side effects such as monads,
- advanced type system features
- domain-specific languages
Functional Programming
Widespread Adoption
Haskell Language + GHC Compiler
Haskell in industry, early adopters: Finance, Telecom, Many Others
Functional languages:
1936 λ calculus (Alonzo Church)
1958 Lisp (John McCarthy)
1973 ML (Robin Milner)
1987 Haskell (Haskell Committee)
Anonymous functions in mainstream languages:
2007 C#
2011 C++11
2014 Java 8, Swift
FP is mainstream now
EXPLANATION:
Many interesting challenges arise from the growing mainstream adoption of
Functional Programming. More and more companies are starting to work with
functional languages such as Haskell and F# (F-sharp), and mainstream
languages such as Java and C# are adopting functional concepts.
Functional Programming
201: The Oracle of Haskell
    abs x
      | x >= 0 = x
      | x < 0 = -x
GHC compiler + your oracle: ✓ exhaustive guards
EXPLANATION:
Develop an oracle that checks whether the guards in Haskell
programs cover all cases.
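What "covering all cases" means can be sketched by brute force. In the Python snippet below, each guard is modelled as a predicate, and the function reports the sample values for which no guard is true. The function name, the sample domain and the deliberately broken guard set are assumptions made for illustration. Testing a finite sample can expose a missing case, but it cannot prove that guards are exhaustive over all integers, which is what an actual oracle would need to establish.

```python
def uncovered(guards, domain):
    """Return the values in `domain` for which no guard evaluates to True."""
    return [x for x in domain if not any(guard(x) for guard in guards)]


# Finite sample of integers to test; a hypothetical choice.
sample = range(-5, 6)

# Guards of the `abs` example: x >= 0 and x < 0.
abs_guards = [lambda x: x >= 0, lambda x: x < 0]

# A deliberately incomplete variant that forgets the case x == 0.
broken_guards = [lambda x: x > 0, lambda x: x < 0]

missing_abs = uncovered(abs_guards, sample)
missing_broken = uncovered(broken_guards, sample)
print(missing_abs)     # []
print(missing_broken)  # [0]
```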
Static Analysis for Declarative Programming Languages
Contact: Tom Schrijvers, Gerda Janssens
★ Type Checking
★ Termination Analysis
★ Reasoning about Coroutines
EXPLANATION:
You know the declarative language Prolog from the course Declaratieve
Talen.
In our research we work on the automatic
analysis of Prolog programs.
Current topics are:
- a type checker to make Prolog statically typed
- determining whether programs terminate
- analysing complex control flow such as coroutines
Declarative Programming Languages
Automatically Inferring Properties of Interest
powerful, dynamic, flexible
    append([],L,L).
    append([X|Xs],Ys,[X|Zs]) :-
        append(Xs,Ys,Zs).
optimisation, correctness, termination
EXPLANATION:
Declarative languages such as Prolog are very powerful,
dynamic and flexible.
The challenge is to automatically derive important
properties of Prolog programs, in order to check
whether they are correct, whether they always terminate, and how
they can be compiled efficiently.
Declarative Programming Languages
Industrial-Strength Static Types for Prolog
Prolog program + types → your type checker → bugs
Case Study: Industrial Partner Prosyn
Expert System, 1 MegaLoC of Prolog
EXPLANATION:
Prolog is an untyped language. This makes it
easy for typos to introduce bugs that are hard
to track down.
In this thesis you develop a type system for
Prolog:
the programmer writes type signatures for their
predicates, and your type checker uses them to find
bugs.
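The idea of checking goals against programmer-written signatures can be sketched in a few lines of Python. Everything below is hypothetical: the signature table, the representation of ground Prolog goals as (name, arguments) tuples, and the two type names. It only shows how declared signatures let a checker flag a misspelled predicate and a wrongly typed argument. A type system for real Prolog would also have to deal with variables, compound terms and polymorphism.

```python
# Hypothetical signatures: predicate name -> expected argument types.
SIGNATURES = {
    "age": ("atom", "integer"),
    "parent": ("atom", "atom"),
}


def type_of(term):
    """Classify a Python value standing in for a ground Prolog term."""
    if isinstance(term, bool):
        return "unknown"
    if isinstance(term, int):
        return "integer"
    if isinstance(term, str):
        return "atom"
    return "unknown"


def check(goal):
    """Return a list of error messages for one ground goal (name, args)."""
    name, args = goal
    if name not in SIGNATURES:
        return [f"{name}/{len(args)}: no signature declared"]
    expected = SIGNATURES[name]
    if len(args) != len(expected):
        return [f"{name}/{len(args)}: expected {len(expected)} arguments"]
    return [
        f"{name}: argument {i} is {type_of(arg)}, expected {typ}"
        for i, (arg, typ) in enumerate(zip(args, expected), start=1)
        if type_of(arg) != typ
    ]


program = [
    ("age", ("john", 42)),
    ("age", ("mary", "fortytwo")),  # atom where an integer is expected
    ("parnet", ("john", "mary")),   # typo in the predicate name
]
errors = [message for goal in program for message in check(goal)]
for message in errors:
    print(message)
```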
You evaluate your type checker on the Prosyn expert system.
Application Areas
Industry
Contact: Wannes Meert
• Airplanes collect many flight parameters
• Airplane health & reliability extremely important
• BUT: Ground maintenance checks cost flying time
• Automating diagnostics and predicting when the airplane
will need repairs = win-win
Image source: http://www.b737.org.uk/737ng.htm
Theses with:
Boeing
Jetairfly
EuroMillions Basketball League
3E
Sirris
Thomson-Reuters
Xenit
Pepite
Melexis
Flanders Make
imec
Cern
…
Sports Analytics
Contact: Jesse Davis
Machine Learning for sports
Soccer & basketball
E-sports
Sports Analytics
Tasks
Strategy detection
Performance analysis & prediction
Scouting
Sports Analytics
Thesis Topics
Soccer analytics
Model flow of a game
Quantify team performance
Learn aging curves of players
Basketball analytics
Detect surprising events
Health
Tasks
Continuous monitoring
Injury risk profiles
Health
Thesis topics
Performance management and injury prevention
Sensor fusion for surface detection and skill detection
in runners
Kinect monitoring for qualitative feedback during
rehabilitation
Anomaly Detection
Contact: Jesse Davis, Hendrik Blockeel,
Wannes Meert
Anomalies are behaviors that do not conform to what is
expected
Example: typically no usage at night, except for sporadic
maintenance
Anomalies typically entail significant costs, such as
fraudulent credit card transactions, excess usage, etc.
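As a small illustration of "not conforming to what is expected", the sketch below flags readings that lie far from the median, measured in units of the median absolute deviation (MAD). The night-time usage values, the threshold of 5 and the function name are made up for the example. This is one simple baseline, not the group's method, and designing better detectors is exactly what the thesis topics are about.

```python
from statistics import median


def mad_anomalies(values, threshold=5.0):
    """Return the indices of values far from the median, in units of the MAD."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # Degenerate case: most values are identical, so flag anything that differs.
        return [i for i, v in enumerate(values) if v != med]
    return [i for i, v in enumerate(values) if abs(v - med) / mad > threshold]


# Hypothetical night-time usage on seven consecutive nights.
night_usage = [10, 0, 20, 10, 0, 530, 10]
anomalous_nights = mad_anomalies(night_usage)
print(anomalous_nights)  # [5]
```

The flagged night could be fraud or excess usage, but it could equally be the sporadic maintenance mentioned above, so a score like this cannot separate costly anomalies from harmless ones without extra context.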
Topics:
Design new algorithms to detect anomalies
Applications, e.g., airplanes, CERN, resources
Engineering & Sensors
Contact: Wannes Meert, Jesse Davis, Hendrik Blockeel, Luc De
Raedt
Analysing data from airplanes
Large Hadron Collider maintenance (CERN)
Anomaly Detection
Engineering & Sensors
The Automatic Engineer
Contact: Wannes Meert http://dtai.cs.kuleuven.be
Example use case: Automatic Engineer
Goal: Learn constraints and programs over heterogeneous knowledge sources to assist
engineers in proposing new designs, finding similar designs, and verifying designs.
Probabilistic programming
Active learning
Constraint programming
Measurements
Technical drawings
Standards
Spreadsheets
AI Challenges
Games
Contact: Luc De Raedt, Jesse Davis, Anton Dries,
Hendrik Blockeel
learning to solve science tests formulated in natural
language (like SAT, GMAT, GRE, …)
Tests as a testbed for intelligent behavior, for
“reasoning”
Allen AI Institute, Levesque’s Winograd test, IBM
Watson …
(Figure: collage of overlapping probability word problems, for example: "In a group of 10 people, 60 percent have brown eyes. Two people are selected from the group. What is the probability that neither of them has brown eyes?")
GOAL: solve the problem directly from text
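One of the word problems on this slide reads: in a group of 10 people, 60 percent have brown eyes; two people are selected from the group; what is the probability that neither of them has brown eyes? The Python sketch below works out the answer in two ways, assuming the two people are drawn uniformly at random without replacement (an assumption, since the text does not say so). It shows only the final calculation. Getting from the natural-language text to such a model is the actual research challenge.

```python
from fractions import Fraction
from itertools import combinations

# 60 percent of 10 people have brown eyes, so 6 do and 4 do not.
people = ["brown"] * 6 + ["other"] * 4

# Enumerate all unordered pairs and count those without brown eyes.
pairs = list(combinations(range(len(people)), 2))
favourable = [pair for pair in pairs if all(people[i] == "other" for i in pair)]
by_enumeration = Fraction(len(favourable), len(pairs))

# Same answer by the chain rule: first pick non-brown, then non-brown again.
by_formula = Fraction(4, 10) * Fraction(3, 9)

print(by_enumeration, by_formula)  # 2/15 2/15
```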
Computational Creativity
Games
Contact: Luc De Raedt
Algorithmic perspective on creative behaviour
(Help) generate e.g. humor, music, …
Examples: "I like my men like I like my graves: nameless." "I like my coffee like I like my country: cold."
Thesis Thomas Winters
Luc De Raedt: Artificial intelligence, reasoning about uncertainty, action and activity learning,
machine learning, data mining, constraint programming, probabilistic programming
(ProbLog), automated data science, language for mining and learning.
Applications in natural language, vision, robotics, automatic programming.
Verification of AI and ML. Computational Creativity.
Hendrik Blockeel: Machine learning, data mining, probabilistic logics, declarative languages for data
mining.
Application domains include bio-informatics, arts, history, compiler development,
optimization.
Jesse Davis: Machine learning, data mining for personalized medicine. Artificial intelligence, statistical
relational learning, transfer learning, anomaly detection.
Applications in healthcare (e.g., clinical practice, physical therapy, medical and biological
texts, etc.). Applications to sport (e.g., football and basketball).
Bettina Berendt: Web mining, privacy, social media, user issues.
Wannes Meert: Probabilistic programming and methods. Data Science Applications. Applications in
engineering. Collaborations with industry.
Tom Schrijvers: Functional programming, constraint and logic programming, type systems,
programming language theory, programming language design and implementation,
program analysis.
Marc Denecker: Constraint programming, Knowledge Base Systems, SAT solving, declarative
languages (formal modelling languages).
Applications in configuration, scheduling, optimization, security, business rule systems,
executable formal software specifications, logical workflow languages.
Gerda Janssens: Performant probabilistic ILP data mining systems, integration of logic programming
techniques in the knowledge representation language FO(.), program analysis and
abstract interpretation, implementations of logic programs, verification of functional
equivalence of C programs.
Bart Demoen: Education in informatics in schools.
Check out dtai-web for more details
Questions?
Advisable to contact promotors or daily advisors before selecting a topic
Also, attend the thesis info market after the Easter holidays