AI Methods for Digital Heritage: An Introduction to the Workshop - Prof. Dr. Günther Görz Department Informatik, AG Digital Humanities, FAU ...

AI Methods for Digital Heritage:
An Introduction to the Workshop

Prof. Dr. Günther Görz
Department Informatik, AG Digital Humanities, FAU Erlangen-Nürnberg
KÜNSTLICHE INTELLIGENZ 4 / 2009
KÜNSTLICHE INTELLIGENZ 4 / 2009:
Focus on Cultural Heritage and AI
  ● Building knowledge networks from cultural heritage data by federating the
    databases of memory institutions
  ● Knowledge transfer and education
        (Fostering group conversations in the museum café)
  ● Disclosure of texts through linguistic annotation and analysis with emphasis
    on their semantic content, also by means of virtual working environments
  ● Classifying named entities and time specifications in text corpora
  ● Conceptual modelling for the documentation of architecture
  ● Besides “close reading” there were first approaches to “distant reading”,
    but not yet for images / collections
G. Görz, FAU, Informatik DH                                                        3
An attempt to explore the
potential of AI for research in a
humanities discipline from
about the same time...
J. Barceló: Computational Intelligence in Archaeology (2009)
           Interdisciplinarity: science and technology AND hermeneutics

Imagining an “Automated Archaeologist” relying essentially on machine learning

●     Discovering the function of tools
●     Reconstructing incomplete data
●     Understanding what an archaeological site was
●     Explaining ancient societies
●     General common sense understanding

Current situation
● Mass digitization, indexing, networking, mediation:
       ●    Dramatic increase of digital data corpora,
            both through retro-digitization and genuine digital generation
             ● especially also of image collections

             ● 3D models

             ● multimedia data

             ● In progress: Federation of corpora by integrators such as Europeana
               or PHAROS (photo archives), but still insufficient on the semantic side.
● A lot of manual annotation is still going on (cf. Amazon Mechanical Turk);
  training sets suffer from severe bias problems
● Discussion point: Rigorous and consistent use of controlled vocabularies /
  formal ontologies ??

Current situation
● Great progress has been made in methods of Machine Learning (Deep Learning)
       ● applications in OCR for handwriting and deciphering of closed books / scrolls
       ● object recognition
       ● automatic annotation of works of art
       ● First approaches to Explainable AI
         ...which in my opinion can only be achieved by hybrid systems (cf. modelling).
● Discussion point: Methodological problems with unsupervised learning
       ● what could the (theoretical) framework – at least the terminology – be for explanation??
       ● Role of “curated knowledge” in cultural heritage (institutions)
● Labeling vs. semantics: semantics comes in with a reasoning framework
       ● “Sense-relational” encoding of meaning in structures
       ● Reasoning to reveal implicit knowledge – beyond previously stored associations in links
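
The last point can be illustrated with a minimal sketch: reasoning over a transitive relation yields facts that were never stored as explicit links. The relation name `falls_within`, the sample places and the helper `transitive_closure` are assumptions made for illustration only; they do not come from any system discussed in this talk.

```python
def transitive_closure(pairs):
    """Return all (a, c) pairs implied by chaining (a, b) and (b, c)."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Explicitly stored links (hypothetical sample data).
falls_within = {
    ("Erlangen", "Middle Franconia"),
    ("Middle Franconia", "Bavaria"),
    ("Bavaria", "Germany"),
}

# Facts that follow by reasoning but were never stored as links.
inferred = transitive_closure(falls_within) - falls_within
```

Here `inferred` contains, among others, the pair `("Erlangen", "Germany")`, which no stored link states directly.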

Current situation
  ● Modeling: Continuous development of CIDOC CRM (v. 7 → ISO),
    a general reference ontology for the cultural heritage sector with
    extensions for specific purposes
         ● Recent work is focussing on ontology design patterns (cordh, Linked Art, ...)
           and Linked Open Data
         ● Problems of vagueness, uncertainty and inconsistency
            in the sources
         ● Implementation with Semantic Web techniques opens
           up a potential for inferencing
                ●   mass data cause significant performance problems
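
A minimal sketch of what "potential for inferencing" can mean in practice: if an event is recorded only as a production, a subclass rule lets a query for events in general still find it. The class and property names follow CIDOC CRM naming conventions, but the hierarchy below is a simplified single-parent fragment, the triples are plain Python tuples rather than a real triple store, and all instance names are hypothetical.

```python
# Simplified fragment of a class hierarchy in the style of CIDOC CRM
# (assumption: only one superclass per class is modelled here).
SUBCLASS_OF = {
    "E12_Production": "E11_Modification",
    "E11_Modification": "E7_Activity",
    "E7_Activity": "E5_Event",
}

# Hypothetical instance data as (subject, predicate, object) triples.
triples = {
    ("production_1", "rdf:type", "E12_Production"),
    ("production_1", "P14_carried_out_by", "workshop_A"),
    ("production_1", "P108_has_produced", "panel_painting_1"),
}


def infer_types(triples, subclass_of):
    """Derive the implied superclass memberships for every typed subject."""
    inferred = set()
    for s, p, o in triples:
        if p != "rdf:type":
            continue
        parent = subclass_of.get(o)
        while parent is not None:
            inferred.add((s, "rdf:type", parent))
            parent = subclass_of.get(parent)
    return inferred


def instances_of(cls, triples):
    """Return all subjects typed with the given class."""
    return {s for s, p, o in triples if p == "rdf:type" and o == cls}


all_triples = triples | infer_types(triples, SUBCLASS_OF)
events = instances_of("E5_Event", all_triples)
```

Without the inference step, `instances_of("E5_Event", triples)` is empty; with it, `events` contains `production_1`. Materializing such implied triples over mass data is one place where the performance problems mentioned above arise.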

Ontology-Based Knowledge Extraction

Ontology-Based Knowledge Extraction: Ideal Case (VTM)

Linked Open Data Cloud

Making Data Fit for the Linked Open Data Cloud
● But... for “big data”: data integrity and semantics
       ● Many resources with (sometimes slightly) different data models and vocabularies
       ● ...different spellings, naming conventions, time specifications, multilingualism, etc.
       ● AI methods could help a lot (pattern recognition, parsing, learning, etc.), but currently
         the steps are semiautomatic (example taken from cordh/PHAROS, International Consortium of Photo Archives)

                                                                                           © Minadakis cordh
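
As a small, hedged illustration of the kind of cleaning step meant here, the sketch below folds spelling variants of names to a common key and maps a few date notations to year intervals. The function names, the supported notations and the ±10 year tolerance for "circa" are assumptions chosen for the example; it is not the procedure used in PHAROS.

```python
import re
import unicodedata


def name_key(name):
    """Fold case and diacritics so that spelling variants share one key."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


def year_range(text):
    """Map a few date notations to an (earliest, latest) pair, else None."""
    text = text.strip().lower()
    m = re.fullmatch(r"(\d{4})\s*[-–/]\s*(\d{4})", text)
    if m:
        return int(m.group(1)), int(m.group(2))
    m = re.fullmatch(r"(?:ca\.?|circa|c\.)\s*(\d{4})", text)
    if m:
        year = int(m.group(1))
        return year - 10, year + 10  # assumed tolerance for "circa"
    m = re.fullmatch(r"(\d{1,2})(?:st|nd|rd|th) century", text)
    if m:
        century = int(m.group(1))
        return (century - 1) * 100 + 1, century * 100
    if re.fullmatch(r"\d{4}", text):
        return int(text), int(text)
    return None  # unrecognized: leave for manual review
```

For instance, `name_key("Albrecht Dürer")` and `name_key("ALBRECHT  DURER")` yield the same key, while a variant such as "Duerer" would still need a dedicated rule. Anything `year_range` cannot parse returns `None`, which is where the semiautomatic, human-in-the-loop step comes in.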
Research Data, Research Questions and Knowledge Transfer

● Change of research goals and research questions with the availability
  of big Linked Open (Usable) Data??
       ●    Up to now: Research questions in humanities primarily solved by
            “close reading”, i.e. case studies, etc.
       ●    What has changed with the addition of “distant reading”?
       ●    First of all: degrees of granularity ... but not only
       ●    New research goals and questions?
● Hybrid systems (cf. B. Ludwig, 3/2020)
● Operationalization problem: From high-level questions down to “data”
        Computational Thinking
                     Foster 2011: How Computation Changes Research (in: Switching Codes)
Transdisciplinarity
● Our journal special issue (2009) already addressed transdisciplinarity in the true
  sense (Mittelstrass) with respect to the decisive contributions of AI techniques
● Federation of cultural heritage and science data as a starting point for
  transdisciplinary research reaching beyond the capabilities of particular
  disciplines
● Modelling and simulation of complex systems such as medieval cities
   ● in contrast to interdisciplinary work, the disciplines involved will themselves
     change through synergisms
   ● The treatment of difficult questions in a holistic dimension will lead to new
     problem-oriented ways of knowledge generation, development and transfer

Downright paradigmatic in our context

European Time Machine
● A few decisive steps towards a broad target portfolio,
  all of which require AI methods:
● Comprehensive digitization of a variety of historical sources
  requires a series of extraction processes,
  including document segmentation and “understanding”
● Alignment of named entities
● Simulation of hypothetical spatiotemporal 4D reconstructions
● Data acquisition goes hand in hand with modeling – in particular of events,
  actors, place, and time – and long term preservation.
● Important contributions of AI: computer vision and pattern recognition,
  natural language processing, machine learning, knowledge representation and
  processing, simulation
European Time Machine

                              © TMO 2020
Core Challenges
● Diverse data
● Uncertain data
       ●    imprecise – imprecision vs. inaccuracy
       ●    incomplete (unknown attribute values)
       ●    ambiguous
       ●    vague
       ●    inconsistent
● Plurality of access methods and audiences
● Strategy for sustainability
       ● Availability and stability
       ● Long term: data formats, standards, software, hardware
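
To make the distinctions among uncertain data more concrete, here is a minimal sketch for datings: imprecision is modelled as a year interval, incompleteness as an unknown bound, and inconsistency as two datings of the same object that cannot both hold. The class `Dating` and both functions are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Dating:
    """A dating as a closed year interval; None marks an unknown bound."""
    earliest: Optional[int]
    latest: Optional[int]


def definitely_before(a, b):
    """Three-valued test: True, False, or None if the data cannot decide."""
    if a.latest is not None and b.earliest is not None and a.latest < b.earliest:
        return True
    if a.earliest is not None and b.latest is not None and a.earliest >= b.latest:
        return False
    return None  # overlapping or unknown bounds: undecidable


def consistent(a, b):
    """Two datings of one object are consistent if their intervals overlap."""
    if None in (a.earliest, a.latest, b.earliest, b.latest):
        return None  # incomplete data: consistency cannot be checked
    return a.earliest <= b.latest and b.earliest <= a.latest
```

The point of returning `None` instead of guessing is that imprecise or incomplete sources should lead to "cannot decide" rather than to a false certainty in downstream reasoning.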

Cultural Heritage Research Data Ecosystem ?
Diverse data: Challenges on the institutional side

  Quoting Robert Sanderson, CNI Keynote 2020:
On the institutional side, in particular with memory institutions (GLAM), there
are still problems with the diversity of institutions, cultures and objects
● Libraries: Many non-unique information-carrying objects
● Archives: Many unique information-carrying objects
● Museums: Relatively few unique image-carrying objects
● Conservation (science): Activities to research and preserve (unique) objects
                                                                       © Sanderson

Important Tasks
● Engineering effective and efficient hybrid systems (architectures) capable of
  dealing with big data
● Building hypotheses by finding “interesting” regularities – also by inductive
  reasoning – and testing them against reliable data
● Unified access to European history as Linked Open Data through the
  Semantic (“Epistemic”) Web
● Other fields of activity for AI lie in the mediation of culture:
   ● Education requires localized and customized data at greatly enhanced
     levels of detail

Important Tasks
       ●    Need for new smart algorithms for meaningful extraction of information
            and creation of knowledge from noisy, heterogeneous and complex data
            at a massive scale
             ● Cultural tourism as an example: Information is required in different
               levels of granularity for the preparation of visits, the visits on site,
               and follow-up processing
       ●    High expectations for plausible and reliable explanations
            (“Explainable AI”) in causal chains
             ● !! Carefully distinguish causality from correlation

       ●    Coping with incompleteness, ambiguity, vagueness ... and errors
            will be our constant companion
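
The warning about causality and correlation can be illustrated with a small simulation, given here as a sketch under stated assumptions: two synthetic variables `x` and `y` never influence each other, yet they correlate because both depend on a hidden common cause `z`. The data are random numbers, not heritage data, and all variable names are made up for the example.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


rng = random.Random(0)  # fixed seed for reproducibility
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]   # hidden common cause
x = [zi + rng.gauss(0, 1) for zi in z]    # depends on z only
y = [zi + rng.gauss(0, 1) for zi in z]    # depends on z only

raw = pearson(x, y)
# Remove the influence of z, then correlate what is left.
adjusted = pearson(
    [xi - zi for xi, zi in zip(x, z)],
    [yi - zi for yi, zi in zip(y, z)],
)
```

In this setup `raw` comes out clearly positive, while `adjusted` is close to zero: the apparent relation between `x` and `y` disappears once the common cause is accounted for. An explanation component that reported "x causes y" from `raw` alone would be wrong.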

guenther.goerz@fau.de

https://wwwdh.cs.fau.de/

http://erlangen-crm.org/

http://wiss-ki.eu/