CLARIN: One Infrastructure For Many Languages - CLARIN ERIC The 31st of May 2021

Organisers

This edition of the CLARIN Café is organized and hosted by

Andreas Witt and Francesca Frontini
CLARIN ERIC BoD

Plan

14:00 - 14:10 Opening and CLARIN 101 - Francesca Frontini (ILC-CNR and CLARIN ERIC)

14:10 - 14:20 CLARIN as a Multilingual Infrastructure - Andreas Witt (IDS and CLARIN ERIC)

14:20 - 14:35 K-centres Under The Angle Of Multilinguality - Steven Krauwer (CLARIN ERIC)

14:35 - 14:50 The CKLD K-centre - Felix Rau (University of Cologne)

14:50 - 15:05 The European Language Equality Project (ELE) - Georg Rehm (DFKI)

15:05 - 15:20 The European Federation of National Institutions for Language (EFNIL) -
Sabine Kirchmeier (EFNIL, Deputy President)

15:20 - 15:30 Short coffee break

15:30 - 16:00 Panel discussion on "CLARIN, an infrastructure for a Multilingual Europe"

Recording

The event is recorded for further dissemination purposes.
Questions and comments? Put them in the chat box.

CLARIN 101
https://www.clarin.eu/content/clarin-in-a-nutshell
CLARIN ...

●   is the Common Language Resources and Technology
    Infrastructure
●   has had ERIC status since 2012 and has been an ESFRI Landmark since 2016
●   provides easy and sustainable access for scholars in the
    humanities and social sciences and beyond
    – to digital language data (in written, spoken, video or
      multimodal form)
    – and advanced tools to discover, explore, exploit, annotate,
      analyse or combine them, wherever they are located
    – through a single sign-on environment
●   serves as an ecosystem for knowledge sharing
●   is an integral part of the European Open Science Cloud
    –   See clarin.eu/eosc

CLARIN today

●   68 centres
●   21 members: AT, BG, CY, CZ, DE, DK, EE, FI, GR, HR, HU, IS,
    IT, LT, LV, NL, NO, PL, PT, SE, SI
●   3 observers: FR, UK, ZA

The Technical Infrastructure

clarin.eu/fair   vlo.clarin.eu   switchboard.clarin.eu

The Knowledge Infrastructure

    https://www.clarin.eu/content/clarin-for-researchers
     https://www.clarin.eu/content/knowledge-sharing
The café

#CLARINcafe

CLARIN as a multilingual infrastructure
Panel discussion
Panelists:
 ●   Sabine Kirchmeier (EFNIL, Deputy President)
 ●   Georg Rehm (DFKI)
 ●   Franciska de Jong (CLARIN ERIC)
 ●   Jurgita Vaičenonienė (Vytautas Magnus University and CLARIN-LT)
Getting involved in CLARIN
• Join our NewsFlash
  – https://www.clarin.eu/content/newsflash
• Check out our events
  – https://www.clarin.eu/events
• Open calls
  – https://www.clarin.eu/content/funding-opportunities
• Follow us on Twitter @CLARINERIC
• And stay tuned for the next cafés
  – https://www.clarin.eu/content/clarin-cafe
  – #clarincafe

See you at the next café

           ParlaMint II Release Café
          28th June 14:00-16:00 CEST
       Comparable parliamentary corpora
          now 13 more languages!

CLARIN as a Multilingual Infrastructure

Andreas Witt
31 May 2021

CLARIN - a language infrastructure

• Common Language Resources and Technology
  Infrastructure
• many object languages
   • BUT: most language resources (LRs) in computational
      linguistics focus on English
• one meta language: English

• many CLARIN tools aim to be language-neutral
   • BUT: the output quality often differs depending on
     the input language

CLARIN - a pan-European infrastructure

• members: consortia in 21 countries, many of which are
  EU member states
    • 24 official languages in the EU
    • in most EU countries more than one language is used
    • official language(s) + immigration + sign language(s)
• but countries outside the EU also take part in CLARIN
  (South Africa)

• CLARIN aims at supporting all languages

Example: Virtual Language Observatory

Put our hands together for…

• The European Federation of National Institutions for
  Language (EFNIL)

• The European Language Equality Project (ELE)

• CLARIN Knowledge Centre for linguistic diversity and
  language documentation (CKLD), University of Cologne

… and prevent this from happening:

K-centres Under The Angle Of Multilinguality
Steven Krauwer (steven@clarin.eu) & Bente Maegaard (bmaegaard@hum.ku.dk)

CLARIN Café - One Infrastructure For Many Languages
May 31 2021
CLARIN and languages

 • The announcement says “CLARIN is language-neutral” …
 • … but personally I would rather say “All languages are
   equally dear to CLARIN”
 • We do not just tolerate all languages …
 • … but we love them all, and we love the diversity!
 • CLARIN has many centres that give access to language data,
   services and tools, but in this presentation we will focus on
   the centres that give access to knowledge and expertise, the
   so-called K(nowledge)-centres
 • At this moment we have 23 (with 2 more in the pipeline),
   and here we will show how they support the multilingual
   nature of CLARIN and the diversity of languages

K-centres and their focus
 • K-centres come in different types and can focus on one or a
   combination of, for example:
         -   specific languages: Danish, Basque
         -   modalities: written text, speech
         -   linguistic topics: morphology, field linguistics
         -   language processing topics: text mining, speech recognition
         -   data types: treebanks, wordnets
         -   language-independent topics: IPR, data management
         -   … and many others
 • For more information about K-centres:
         - See the description of what they do and what they are at
           https://www.clarin.eu/content/knowledge-centres
         - See the full list of K-centres at
           http://vonweber.elsnet.org/cgi/kcentres_page.cgi
         - Search K-centres for specific expertise at
           http://vonweber.elsnet.org/cgi/kcentres_search.cgi
K-centres and languages: language portals
Some K-centres serve as portals for a language or group of
languages and have broad knowledge about the language(s)
and about the availability of resources and tools to work with
them. At present, the following have declared themselves
portals:
         - CLASSLA: Slovene, Croatian, Bosnian, Serbian, Montenegrin,
           Macedonian, Bulgarian
         - CORLI: French
         - CorpLingCz: Czech
         - DANSK: Danish
         - K-BLP: Belarusian
         - NLP:EL: Greek, Greek sign language
         - PolLinguaTec: Polish
         - PORTULAN: Portuguese
         - Spanish-K-centre: Spanish, Basque, Catalan, Galician
         - SWELANG: Swedish
 • Note: 11 out of 24 official EU languages are covered by a
   K-centre language portal
K-centres and languages: other language expertise
 • Some K-centres do not serve as a portal for a specific
   language but have working experience with it that they are
   happy to share, often from a specific perspective. Some
   examples:
         - CLARIN-SPEECH works on speech analysis for Swedish and English
         - IMPACT-CKC works on digitisation and OCR for a number of
           languages, such as Spanish, English, Polish, French, Dutch, German,
           Slovene, Czech, Latin, Bulgarian
         - PolLinguaTec serves as a portal for Polish, but also works on
           English, German, Russian, Ukrainian, Bulgarian, Lithuanian, French,
           Spanish, Hungarian, Hebrew
 • Note: the list of “other” languages covered by K-centres
   contains some 40 individual languages, language families
   and groups of languages; out of 24 official EU languages
   only Maltese and Slovak are completely missing
K-centres and languages: dealing with language barriers and diversity
Some K-centres address topics that are multilingual by their
very nature, such as
• Second language learning and bilingual development: e.g.
  ACE, CLARIN-HUMLAB, CLARIN-Learn, CLARIN-SMS
• Facilitating translation (manual or by machine): e.g.
  CLARIN-SMS, K-BLP, NLP:EL, PolLinguaTec, PORTULAN, TRTC
• Studying or dealing with language diversity: e.g. CKLD (see
  next presentation in the programme), CLARIN-HUMLAB,
  CLARIN-Learn, CLARIN-SMS, PhA-OeAW

Concluding remarks
From the perspective of multilinguality, K-centres offer a lot of
expertise to users of the CLARIN infrastructure:
         - Jointly the CLARIN K-centres provide expertise on nearly all official
           EU languages, and on a rich variety of other languages Europe- and
           worldwide.
         - Out of 24 official EU languages 11 have K-centre portals that can
           provide broad expertise on matters related to the language and its
           processing
         - Multilinguality and diversity as such are addressed as well
But we are not there yet:
         - For the remaining 13 languages coverage via K-centres is quite
           uneven. For some languages there is a wealth of language and
           language processing expertise available, even if there is no
           dedicated declared K-centre portal for it, whereas for others the
           available expertise offered by K-centres is focused on specialized
           topics rather than the language at large.
Finally:
          - Please note that K-centres are not the only places in CLARIN
            where one can find relevant expertise, but they are the ones that
            have agreed to make it more visible and more easily accessible!
CLARIN Knowledge-Centre for linguistic diversity and language documentation
Felix Rau
Data Center for the Humanities / University of Cologne
●   distributed K-Centre
●   since 2018
●   7 partners
●   from 2 countries (Germany/UK)
Linguistic Diversity and Language Documentation
●   Linguistic Diversity
     ○   Global linguistic diversity
     ○   Language typology
     ○   Language comparison
     ○   Typological databases
●   Language Documentation
     ○   Audio-visual data
     ○   Language archiving
     ○   Linguistic fieldwork
●   Minority languages
●   Endangered languages
●   Under-resourced & under-researched languages
●   Language community involvement
●   Participatory research
●   Multilingualism
Partners
ELAR   Endangered Languages Archive                    SOAS University of London
SWLI   SOAS World Languages Institute                  SOAS University of London

DCH    Data Centre for the Humanities                  University of Cologne
IfL    Department of Linguistics                       University of Cologne

HZSK   Hamburg Centre for Language Corpora             University Hamburg
INEL   Grammatical Descriptions, Corpora and           Academy of Sciences and Humanities
       Language Technology for Indigenous Northern     in Hamburg
       Eurasian Languages
ZAS    Leibniz-Zentrum Allgemeine Sprachwissenschaft
Activities
●   Consultations
     ○   CKLD Helpdesk (via CLARIN-D Helpdesk infrastructure)
     ○   Via the individual partner institutions (e.g. DCH)
●   Trainings
     ○   Field methods training (e.g. IfL, ELAR)
     ○   In-country language documentation training (e.g. ELAR)
     ○   EXMARaLDA training (HZSK)
●   Research
     ○   QUEST project: quality standards for audio-visual language data (INEL, DCH, HZSK, IfL, ZAS)
     ○   Joint publications
Audience
●   Linguists (and other researchers interested in endangered and minority
    languages)
●   Language communities and speakers
●   Educators (working with endangered and minority languages)
Thank you!
https://ckld.uni-koeln.de/
European Language Equality: An Overview

Georg Rehm (Co-Coordinator ELE, DFKI)
31-05-2021 CLARIN Café
http://www.european-language-equality.eu
European Language Equality (ELE) – Summary

Objective: development of a strategic research, innovation and deployment agenda to achieve digital language equality in Europe by 2030
Consortium: 52 partners from all over Europe
Coordinator: ADAPT Centre (Dublin City University)
Co-coordinator: DFKI
Runtime: 18 months – ELE and ELG will both end in June 2022
Start on 1 January 2021

Context: “Language Equality” EP Resolution
EP Resolution Language equality in the digital age
P8_TA(2018)0332 – partially based on the STOA study
Voting (11 Sept. 2018): 592 yes – 45 no

Selected Recommendations addressed by ELE:
25. Establish a large-scale, long-term coordinated funding programme for research, development and innovation in the field of language technologies, at European, national and regional levels, tailored specifically to Europe’s needs and demands
29. Create a European LT platform for sharing of services
27. Europe has to secure its leadership in language-centric AI

(Slide image: first page of the provisional edition of the adopted text P8_TA-PROV(2018)0332, European Parliament resolution of 11 September 2018 on language equality in the digital age (2018/2028(INI)), European Parliament 2014-2019)

European Language Equality: Consortium
• 5 core partners: ADAPT Centre, DFKI, Charles University, ILSP, University of the Basque Country
• 9 networks, associations, initiatives: LT Innovate (via Crosslang), EFNIL, ELEN, ECSPM, CLARIN, CLAIRE
  (via University of Leiden), NEM (via Eurescom), LIBER, Wikimedia

• 9 companies: Tilde, ELDA, Expert System, Sail Labs, Kantan MT (via Xcelerator MT), Pangeanic,
  Semantic Web Company, Ontotext (via Sirma AI), SAP
• 29 research organisations: University of Vienna, University of Antwerp, Institute for Bulgarian
  Language, University of Zagreb, University of Copenhagen, University of Tartu, University of
  Helsinki, CNRS, Research Institute for Linguistics, Institute for Icelandic Studies, FBK, University of
  Latvia, Institute of the Lithuanian Language, Luxembourg Institute of Technology, University of
  Malta, University of Utrecht, Language Council of Norway, Polish Academy of Sciences, University
  of Lisbon, Romanian Academy, University of Cyprus, Slovak Academy of Sciences, Jozef Stefan
  Institute, Barcelona Supercomputing Center, Royal Institute of Technology, University of Zurich,
  University of Sheffield, University of Vigo, Bangor University

European Language Equality: Approach
• Main result: Strategic Agenda and Roadmap – all deliverables provide input for this main report
  •   Detailed description of the European Language Equality Programme (cf. EP resolution)
• Research partners prepare updates of META-NET White Papers (one deliverable each).
• Networks and initiatives to produce reports (one deliverable each) in which they collect,
  consolidate and present their own position, needs, wishes, demands, visions etc.
• Companies to produce various technical deep dives for the different technology areas.
• Several additional reports to be produced, primarily by the core partners.
• Reports to be prepared in a more or less autonomous way based on templates
• Reports to be used as input for the strategic agenda and roadmap – the main project result.
• Total number of languages taken into account in ELE: approx. 75.
• Close collaboration with ELG

Work Packages
• WP1 (lead: R.C. “Athena”, ILSP): European Language Equality: Status Quo in 2020/2021
• WP2 (lead: Charles University): European Language Equality: The Future Situation in 2030
• WP3 (lead: Univ. of the Basque Country): Development of the Strategic Agenda and Roadmap
• WP4 (lead: DFKI): Communication – Dissemination – Exploitation – Sustainability
• WP5 (lead: ADAPT Centre): Project Management

Work Packages and main Deliverables

WP1 European Language Equality: Status Quo in 2020/2021
Task 1.1: Defining Digital Language Equality
Task 1.2: Language Technologies and Language-centric AI – State of the Art
Task 1.3: Language Technology Support of Europe’s Languages in 2020/2021
• Digital Language Equality: Definition of the concept
• Language Technology and language-centric AI: State of the Art
• 32 reports on the technology support of 32 European languages (META-NET White Paper update)

WP2 European Language Equality: The Future Situation in 2030
Task 2.1: The perspective of European LT developers (industry and research)
Task 2.2: The perspective of European LT users and consumers
Task 2.3: Science – Technology – Society: Language Technology in 2030
• Reports from networks, initiatives and associations
• Deep dives (MT, speech, text analytics, data)
• Report on external consultations and surveys
• Forecast: Language Technology in 2030

WP3 Development of the Strategic Agenda and Roadmap
Task 3.1: Desk research – landscaping
Task 3.2: Consolidation and aggregation of all input received
Task 3.3: Final round of feedback collection
• Existing strategic documents and projects in LT/AI
• Strategic agenda and roadmap: initial version
• Final round of feedback collection
• Strategic agenda and roadmap: final version

WP4 Communication – Dissemination – Exploitation – Sustainability
Task 4.1: Overall project communication and dissemination
Task 4.2: Liaise with EP/EC – organisation of a targeted workshop
Task 4.3: Organisation of final ELE conference
Task 4.4: Production of PR materials and sustainable results
• EP/EC Workshop
• ELE Conference
• ELE Strategic Agenda and Roadmap (print version, interactive version)
• Final ELE Book Publication

WP5 Project Management
Task 5.1: Overall project management including Project Management Office
Task 5.2: Digital collaboration and document management infrastructure

Timeline

M1    Start of the ELE project; ELE kick-off meeting
M3    Digital collaboration and document management infrastructure (D5.1)
      Digital Language Equality – preliminary definition (D1.1)
      Promotional materials and PR package (D4.1); project infrastructure (D5.1)
M4    Specification of the consultation process including templates, surveys, events etc. (D2.1)
M6    Communication and dissemination plan (D4.2)
M9    Project mgmt. report (D5.2)
      Report on the state of the art in Language Technology and Language-centric AI (D1.2)
M13   Digital Language Equality – full specification of the concept (D1.3)
M14   Reports on 32 European languages (D1.4-D1.35)
      Reports from relevant European initiatives (D2.2-D2.12); technology deep dives (D2.13-D2.16)
M15   Strategic agenda including roadmap – initial version (D3.2)
      Report on all external consultations and surveys (D2.17)
M16   Database and dashboard with the empirical data collected in D1.4-D1.35 (and others) (D1.36)
      Report on the state of Language Technology in 2030 (D2.18)
M17   Report on the final round of feedback collection (D3.3)
M18   Project mgmt. report (D5.3)
      Strategic agenda including roadmap – final version (D3.4)
      ELE EP/EC workshop (D4.3); ELE conference (D4.4); ELE book publication (D4.6)
      End of the ELE project

Accompanying activities:
• Continuous inclusion of the community through various means (esp. meetings, website, email, discussion groups etc.)
• External consultation and brainstorming meetings (both face-to-face and virtual)
• Feedback loops to include input and comments from the Language Technology community
HPC initiative (High Performance Computing) and RDA (Research Data Alliance), among others. DFKI Berlin hosts
the German/Austrian Chapter of W3C and has a good working relationship to DIN (Deutsches Institut für Normung).

                  Network – Initiative – Association                      Represented by ELE consortium partner(s)
 Association of European Research Libraries (LIBER)                     LIBER
 Big Data Value Association                                             SAP
 Confederation of Laboratories for AI Research in Europe (CLAIRE)       ULEID
 Cracking the Language Barrier                                          DFKI and various others
 European Civil Society Platform for Multilingualism (ECSPM)            ECSPM
 European Federation of National Institutions for Language (EFNIL)      EFNIL
 European Language Equality Network (ELEN)                              ELEN
 European Language Grid (ELG)                                           DFKI, ILSP, CUNI, ELDA and others
 European Lexicographic Infrastructure (ELEXIS)                         JSI
 European Research Infrastructure for LRs and Technology (CLARIN)       CLARIN ERIC (CUNI and ILSP are members)
 LT-Innovate – Europe's LT Business Association                         CRSLNG (SAIL, EXPSYS, TILDE are members)
 META-NET                                                               DFKI, CUNI, ILSP, TILDE, ELDA and others
 New European Media (NEM)                                               ERSCM
 Public-Private Partnership on AI (AI PPP)                              SAP, TILDE
 Wikipedia, Wikidata, Abstract Wikipedia                                WMD
       External networks, initiatives and associations that ELE will consult with through established connections
 AI4EU (European AI on Demand Platform), DBpedia, Europeana, European Association for Machine Translation (EAMT),
 European Commission (DG Translate, DG Interpretation/SCIC), European Parliament (DG Translation, CULT Committee,
 ITRE Committee), Global WordNet Association (GWA), HumanE-AI-Net (and other ICT-48-2020 projects), Network to
 promote linguistic diversity (NPLD), World Wide Web Consortium (W3C) and various others
        Table 5: Networks, initiatives and associations – either represented by ELE partners or external ones
ELE will communicate with all of these initiatives in terms of getting input and feedback for the strategic agenda and
roadmap with a special emphasis on the consortium partners and relevant networks and initiatives (Table 5).
1.3.6   Current situation in the countries
partners are already members of the LTC). The fully populated LTC is meant to be a representative, balanced and
inclusive body that includes representatives from all relevant stakeholders and from all European countries.
   No.                                         Deliverable name                           WP   Short name    Type   Diss. level   Date
  D1.1    Digital Language Equality – preliminary definition                               1   DCU          R        Public        3
  D1.2    Report on the state of the art in Language Technology and Language-centric AI    1   EHU          R        Public        9
  D1.3    Digital Language Equality – full specification of the concept                    1   DCU          R        Public        13
  D1.4    Report on Basque                                                                 1   EHU          R+OTH    Public        14
  D1.5    Report on Bulgarian                                                              1   IBL          R+OTH    Public        14
  D1.6    Report on Catalan                                                                1   BSC          R+OTH    Public        14
  D1.7    Report on Croatian                                                               1   FFZG         R+OTH    Public        14
  D1.8    Report on Czech                                                                  1   CUNI         R+OTH    Public        14
  D1.9    Report on Danish                                                                 1   UCPH         R+OTH    Public        14
  D1.10   Report on Dutch                                                                  1   UU           R+OTH    Public        14
  D1.11   Report on English                                                                1   USFD         R+OTH    Public        14
  D1.12   Report on Estonian                                                               1   UTART        R+OTH    Public        14
  D1.13   Report on Finnish                                                                1   UHEL         R+OTH    Public        14
  D1.14   Report on French                                                                 1   CNRS         R+OTH    Public        14
  D1.15   Report on Galician                                                               1   UVIGO        R+OTH    Public        14
  D1.16   Report on German                                                                 1   DFKI         R+OTH    Public        14
  D1.17   Report on Greek                                                                  1   ILSP         R+OTH    Public        14
  D1.18   Report on Hungarian                                                              1   NYTI         R+OTH    Public        14
  D1.19   Report on Icelandic                                                              1   SAM          R+OTH    Public        14
  D1.20   Report on Irish                                                                  1   DCU          R+OTH    Public        14
  D1.21   Report on Italian                                                                1   FBK          R+OTH    Public        14
  D1.22   Report on Latvian                                                                1   IMCS         R+OTH    Public        14
  D1.23   Report on Lithuanian                                                             1   LKI          R+OTH    Public        14
  D1.24   Report on Luxembourgish                                                          1   LIST         R+OTH    Public        14
  D1.25   Report on Maltese                                                                1   UOM          R+OTH    Public        14
  D1.26   Report on Norwegian                                                              1   LCNOR        R+OTH    Public        14
  D1.27   Report on Polish                                                                 1   PAS          R+OTH    Public        14
  D1.28   Report on Portuguese                                                             1   ULIS         R+OTH    Public        14
  D1.29   Report on Romanian                                                               1   ICIA         R+OTH    Public        14
  D1.30   Report on Serbian                                                                1   FILFUB       R+OTH    Public        14
D1.31   Report on Slovak                                                                      1   JULS     R+OTH   Public   14
D1.32   Report on Slovenian                                                                   1   JSI      R+OTH   Public   14
D1.33   Report on Spanish                                                                     1   BSC      R+OTH   Public   14
D1.34   Report on Swedish                                                                     1   KTH      R+OTH   Public   14
D1.35   Report on Welsh                                                                       1   BNGR     R+OTH   Public   14
D1.36   Database and dashboard with the empirical data collected in D1.4-D1.35 (and others)   1   ILSP     R+OTH   Public   16
D2.1    Specification of the consultation process including templates, surveys, events etc.   2   CUNI     R       Public   4
D2.2    Report from CLAIRE                                                                    2   ULEID    R+OTH   Public   14
D2.3    Report from CLARIN                                                                    2   CLARIN   R+OTH   Public   14
D2.4    Report from LT Innovate                                                               2   CRSLNG   R+OTH   Public   14
D2.5    Report from META-NET                                                                  2   CUNI     R+OTH   Public   14
D2.6    Report from ELG                                                                       2   DFKI     R+OTH   Public   14
D2.7    Report from ECSPM                                                                     2   ECSPM    R+OTH   Public   14
D2.8    Report from EFNIL                                                                     2   EFNIL    R+OTH   Public   14
D2.9    Report from ELEN                                                                      2   ELEN     R+OTH   Public   14
D2.10   Report from LIBER                                                                     2   LIBER    R+OTH   Public   14
D2.11   Report from NEM                                                                       2   ERSCM    R+OTH   Public   14
D2.12   Report from Wikipedia                                                                 2   WMD      R+OTH   Public   14
D2.13   Technology deep dive Machine Translation                                              2   TILDE    R       Public   14
D2.14   Technology deep dive Speech Technologies                                              2   SAIL     R       Public   14
D2.15   Technology deep dive Text Analytics and Natural Language Understanding                2   EXPSYS   R       Public   14
D2.16   Technology deep dive Data                                                             2   SWC      R       Public   14
D2.17   Report on all external consultations and surveys                                      2   CUNI     R       Public   15
D2.18   Report on the state of Language Technology in 2030                                    2   CUNI     R       Public   16
D3.1    Report on existing strategic documents and projects in LT/AI                          3   EHU      R       Public   3
D3.2    Strategic agenda including roadmap – initial version                                  3   DFKI     R       Public   15
  D3.3    Report on the final round of feedback collection                           3   EHU      R       Public       17
  D3.4    Strategic agenda including roadmap – final version                         3   DFKI     R       Public       18
  D4.1    Promotional materials and PR package                                       4   DFKI     R+OTH   Public       3
  D4.2    Communication and dissemination plan                                       4   DFKI     R       Public       6
  D4.3    Report on EP/EC workshop                                                   4   DFKI     R       Public       18
  D4.4    Report on ELE conference                                                   4   DFKI     R       Public       18
  D4.5    Strategic agenda and roadmap (print version, online version)               4   DFKI     R       Public       18
  D4.6    ELE book publication                                                       4   DFKI     R       Public       18
  D5.1    Digital collaboration and document management infrastructure               5   DCU      R+OTH Confidential   3
  D5.2    Project management report (interim report)                                 5   DCU      R     Confidential   9
  D5.3    Project management report (final report)                                   5   DCU      R     Confidential   18

                                                     Table 7: List of deliverables
Summary
• ELE is a new EU project that started in January 2021 and ends in June 2022
• Its goal is the development of the Strategic Research, Innovation and Implementation Agenda and
  a Roadmap for achieving full Digital Language Equality in Europe by 2030
• Once-in-a-decade opportunity
• Consortium with 52 partners covering all European countries and all major initiatives
• Many consultation events, roundtables, stakeholder meetings planned
• Close collaboration with its sister project European Language Grid (ELG)
• Intended overlap between ELE and ELG consortium, ELG NCCs and META-NET members
• Firmly establish LT and language-centric AI in Horizon Europe and Digital Europe Programme
• Please participate in our stakeholder events, surveys and questionnaires!
• ELE and ELG will finish up with a joint META-FORUM 2022 in June 2022.

You’re invited to stay in touch via https://european-language-equality.eu
European Language Equality

                                    Thank you!

Georg Rehm (Co-Coordinator ELE, DFKI)
31-05-2021 CLARIN Café
http://www.european-language-equality.eu

The European Language Equality project has received funding from the European Union under
grant agreement № LC-01641480 – 101018166 (ELE).
Presentation of EFNIL

                     Sabine Kirchmeier
                       Vice president
European Federation of National Institutions for Language (EFNIL)
Outline
1. About EFNIL
2. EFNIL Projects
3. EFNIL institutions and CLARIN

European Federation of National Institutions for Language -
    EFNIL: 40 member organizations from 29 countries

EFNIL activities
• Projects aiming at the description and analysis of the current linguistic
  situation in Europe
   – LLE: Articles describing language legislation in Europe
   – ELM: European Language Monitor
   – ELIPS: European Languages and their Intelligibility in the Public Space.
• Scientifically based analysis of cross-state language problems and
  questions of language policy in annual conferences and publications
• Consultation services in the field of language policy for political decision
  makers of the EU institutions and member states
• Propagation of the cultural and practical benefits of European linguistic
  diversity and plurilingualism through relevant actions and publications.
    – EFNIL Master’s Thesis Award (MTA).
                               www.efnil.org
European Language Monitor
• It is a scientific review of the language situation in
  European countries, repeated at four-year
  intervals.
• The information is comparable over time.
• The information is comparable across countries.
• It provides exact reference to the actual
  legislation in each country.
• It provides background knowledge about the
  status of the languages of Europe.

European Language Monitor
1.    Country situation. Official, regional, indigenous, immigrant languages spoken within
      and outside the country, legal status, accordance with conventions
2.    Legal situation. Language law, constitutional status, other regulations, language
      demands for citizenship
3.    Primary and secondary education. Languages of instruction, languages offered
4.    Tertiary education. Languages of instruction, languages used in publications and
      dissertations
5.    Media. Papers, TV, film, music
6.    Business. Regulations, company languages, annual reports, websites
7.    Dissemination of languages. Official languages taught abroad
8.    Language organizations. Official, non-governmental but publicly funded, private
9.    Language technology.

European Language Monitor
Visualisation online
What are language provisions about?

Official language plan or strategy?
• Yes: 10 (45%)
• No: 12 (55%)
Funding programme for language technology?
• Yes: 12 (54%)
• No: 7 (32%)
• No answer: 3 (14%)
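Both charts are based on 22 responses, yet 12 answers appear as 55% in the first chart and as 54% in the second. The slides do not say how the percentages were computed. One scheme that happens to reproduce both sets of figures is largest-remainder rounding, which forces the whole-number percentages to sum to 100. The sketch below is only an illustration under that assumption; the function name is hypothetical.

```python
from math import floor


def largest_remainder_percentages(counts):
    """Turn raw counts into whole-number percentages that sum to 100."""
    total = sum(counts.values())
    exact = {key: 100 * value / total for key, value in counts.items()}
    rounded = {key: floor(share) for key, share in exact.items()}
    leftover = 100 - sum(rounded.values())
    # Hand the leftover points to the entries with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda key: exact[key] - rounded[key], reverse=True)
    for key in by_remainder[:leftover]:
        rounded[key] += 1
    return rounded


# Counts as shown on the two charts (assumed to be one answer per respondent).
plan = largest_remainder_percentages({"Yes": 10, "No": 12})
funding = largest_remainder_percentages({"Yes": 12, "No": 7, "No answer": 3})
print(plan)
print(funding)
```

With two categories, the single leftover point goes to "No" (12 of 22 is about 54.5%). With three categories, the two leftover points go to "No" and "No answer", so "Yes" stays at 54%.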
ELIPS
ELIPS investigates the following topics:

• Plain language policies and actions
• Easy-to-read language policies and actions
• Terminology policies and actions
• Policies and actions on the use of other languages,
  gender, cultural and sexual diversity
• Training of information providers in public institutions
• Collaboration between translation services of the EU
  institutions and the experts in member states
1.6. Materials, instructions, services and tools
Country                            Web          Guidelines (online,   Models or   Tools
                                   service(s)   pdf or printed)       templates
Austria                            No answer    No answer             No answer   No answer
Belgium (Flemish Community)        No           Yes                   Yes         Yes
Bulgaria                           Yes          No                    No          No
Denmark                            Yes          Yes                   Yes         Yes
Estonia                            No           Yes                   No          No
Finland (Swedish)                  Yes          Yes                   Yes         No
Finland (Finnish)                  Yes          Yes                   Yes         Yes
Germany                            Yes          Yes                   No          Yes
Grand Duchy of Luxembourg          Yes          Yes                   Yes         No
Greece                             Yes          Yes                   Yes         Yes
Hungary                            No           Yes                   No          No
Iceland                            Yes          Yes                   No          No
Ireland (excl. Northern Ireland)   Yes          Yes                   No          Yes
Italy                              Yes          Yes                   No          No
Latvia                             Yes          No                    No          No
Lithuania                          Yes          Yes                   No          Yes
Malta                              No answer    No answer             No answer   No answer
Netherlands                        Yes          Yes                   No          Yes
Norway                             Yes          Yes                   Yes         Yes
Portugal                           No answer    No answer             No answer   No answer
Slovak Republic                    Yes          Yes                   No          No
Slovenia                           Yes          No                    No          Yes
Sweden                             Yes          Yes                   Yes         Yes
Switzerland                        No           Yes                   No          No
UK (England)                       Yes          Yes                   Yes         No
UK (Wales)                         No           Yes                   Yes         No
UK (Northern Ireland)              Unknown      Unknown               Unknown     Unknown
UK (Scotland)                      No           No                    No          No

Table 1: Which materials, instructions, services and tools are available in your country in
order to help public administration comply with the principles of plain language?
EFNIL conferences
Thessaloniki 2010   “Language, languages and new technologies: ICT in the service of languages”
London 2011         “The Role of Language Education in Creating a Multilingual Europe”
Budapest 2012       “Lexical Challenges in Multilingual Europe”
Vilnius 2013        “Translation and Interpretation in Europe”
Florence 2014       “Language use in university teaching and research past, present, future”
Helsinki 2015       “Language use in public administration – theory and practice in the European
                    states”
Warsaw 2016         “Stereotypes and linguistic prejudices in Europe”
Mannheim 2017       “National language institutions and national languages”
Amsterdam 2018      “Language variation: a factor of increasing language complexity and a challenge
                    for language policy within Europe”
Tallinn 2019        “Language and Economy: Language Industries in a Multilingual Europe”
Webinar 2020        “Language in the Corona Crisis”
Cavtat 2021         “The role of national language institutions in the digital age”

                    Conference publications available online on www.efnil.org
EFNIL Master’s Thesis Award
The EFNIL Master’s Thesis Award is an annual competition to find the best master’s
theses in Europe within the area of language use, language policy and multilingualism.
EFNIL wishes to inspire and motivate young researchers to engage in scientific projects
on language use, language policy and multilingualism, and to disseminate new
ground-breaking research on language use, language policy and multilingualism to a
wider audience.
The students who submit the three best theses will each be awarded:
1. the EFNIL Master's Thesis Award (1500 Euro)
2. an invitation to present their thesis at the annual international EFNIL conference
   (all expenses paid)
3. the opportunity to publish an article based on their thesis in EFNIL’s annual
   conference proceedings
4. the opportunity to publish the full thesis on EFNIL’s website www.efnil.org.

Next submission deadline is 31 December 2021.
EFNIL institutions and Digital Linguistics
• 5 EFNIL member organizations are CLARIN
  centres
• Some EFNIL institutions collect their
  own resources or participate in national
  resource-building projects, such as
  dictionaries and corpora.
• EFNIL as an organisation and 5 EFNIL member
  organizations are part of the ELE consortium.

EFNIL and CLARIN

How do EFNIL members use the CLARIN infrastructure?
What future improvements of CLARIN would you like to see?
•   More tools
•   More processed resources
•   More training possibilities
•   Legal workshops
•   Lower fees for small institutions
•   Clearer differentiation between CLARIN and other
    infrastructure projects/communities providing access to LT
    resources. If I have problem X, which infrastructure should I
    use?

Are EFNIL members satisfied with CLARIN?
Possible connections between EFNIL and CLARIN
• More formalized exchange of information (promotion of
  news and conferences)
• Involving more EFNIL members in CLARIN
• Tapping into EFNIL projects (plain language corpora)
• Joint efforts to cover more languages (minority languages)
• Cooperating on policy and legal issues
• Questions about language resources and infrastructures in
  ELM
• Joint projects:
   – multilingual language data – written and spoken
   – open online language teaching environments
   – grammar and dictionary collections.
Thank you for listening