Frédéric Blain
                              Lecturer of Translation Technology
                              University of Wolverhampton (UK)

Personal Details
Date of birth – 13 September 1986                                         Nationality – French
Website – fredblain.org                                                  GPG key – FC7C3BC0
LinkedIn – fredblain
Twitter – @fblain                                                  Contact : f.blain@wlv.ac.uk

Education & Qualifications
 2009 - 2013         Ph.D. in Computer Science
                     Computer Science Laboratory of the University of Le Mans (LIUM)
                     Le Mans University (France)
            Thesis   « Evolutive translation models »
          Advisors   Prof. Holger Schwenk – Facebook, LIUM
                     Dr. Jean Senellart – SYSTRAN International

 2007 - 2009         Master of Research Degree in Computer Science
                     Le Mans University (France)
          Domain     Human-computer communication and educational engineering
      Dissertation   « Creation of bilingual corpus from comparable resources »
         Advisors    Prof. Holger Schwenk – Facebook, LIUM
                     Dr. Jean Senellart – SYSTRAN International

 2004 - 2007         Bachelor’s Degree in Computer Science
                     Le Mans University (France)

Current & Previous Appointments
 2020 – present      Lecturer of Translation Technology at the University of Wolverhampton
                     Research Institute in Information and Language Processing – RGCL
 2015 – 2020         Research Associate in Machine Translation at the University of Sheffield
                     Department of Computer Science – NLP Research Lab
                     Supervised by Prof. Lucia Specia
 2013 – 2014         Postdoc Researcher in Machine Translation at the University of Le Mans
                     Department of Computer Science – Language and Speech Technology group
                     Supervised by Prof. Holger Schwenk
 2009 – 2012         Research engineer
                     SYSTRAN SA – Paris (France)
Professional and External Standing
RESEARCH INTERESTS

Natural Language Processing, with a focus on Machine Translation to enrich end users’ experience,
using deep-learning methods for weakly-supervised and unsupervised Quality Estimation of
Machine Translation. Previously worked on Post-Editing, incremental training and adaptation
through time.

RESEARCH GRANTS

Obtained funding by conceptualising and (co-)writing the following research grants :

 Date          Sponsor     Title                                                        Funding
 2019 – 2021   EC H2020    Browser-based Multilingual Translation                       € 542,187
                           – Role : named researcher and WP lead coordinator
 2018          Amazon      Predicting Relevance and Quality of Machine Translation      € 62,000
                           for Product Review
                           – Role : PI
 2017 – 2018   EAMT        Predicting Relevance and Quality of News Translation         € 9,800
                           – Role : co-PI

RESEARCH PROJECTS

Browser-based Multilingual Translation (Bergamot) – (January 2019 - December 2021)

Funded by the European Commission, the Bergamot project will add and improve client-side
machine translation in a web browser. Unlike current cloud-based options, running directly on
users’ machines empowers citizens to preserve their privacy and increases the uptake of language
technologies in Europe in various sectors that require confidentiality. Free software integrated
with an open-source web browser, such as Mozilla Firefox, will enable bottom-up adoption
by non-experts, resulting in cost savings for private and public sector users who would otherwise
procure translation or operate monolingually. Our combined research on user experience,
domain adaptation, quality estimation, outbound translation, and efficiency supports a broad
browser-based innovation plan.

Predicting Relevance and Quality of Machine Translation for Product Reviews – (2018)

Funded by the Amazon Academic Research Awards (AARA) program, this project aimed to devise
a Quality Estimation (QE) approach for the machine translation (MT) of product reviews. On
online market platforms such as Amazon, product reviews are abundant but written in a single
language (often English). Automatically translating such reviews could better enable products to
reach foreign markets. However, this type of content poses significant challenges to
state-of-the-art machine translation, which often produces translations of far from perfect
quality, so automatic quality estimation becomes paramount.

Quality Translation 21 (QT21) – (April 2015 - January 2018)

Quality Translation 21 is a machine translation project which has received funding from the
European Union’s Horizon 2020 Research and Innovation program. Many of the languages not
supported by our current technologies show common traits : they are morphologically complex,
with free and diverse word order. Often there are not enough training resources and/or processing
tools. Together this results in drastic drops in translation quality. The combined challenges
of linguistic phenomena and resource scenarios have created a large, and under-explored, grey
area in the language technology map of European languages.
Combining support from key stakeholders, QT21 addressed this grey area through substantially
improved statistical and machine-learning-based translation models, improved evaluation and
continuous learning from mistakes, all with a strong focus on scalability.

MateCAT – (October 2011 - October 2014)

European project led by the Bruno Kessler Foundation (FBK), and conducted with the Computer
Science laboratory of Le Mans University (LIUM), The University of Edinburgh and Translated
Srl. For professional translators, it aimed at reducing the post-editing cost through the use of
an optimised web-based CAT tool. To improve the user’s productivity, the project partners
worked on in-domain adaptation, project adaptation, automatic quality estimation, and both
online and incremental adaptation from user feedback. MateCAT is now used by thousands
of professional translators to deliver translations in more than 100 languages to 10,000 active
users all over the world.

COSMAT – (October 2009 - October 2012)

Led by the LIUM, working with SYSTRAN and INRIA, the project aimed at providing a collaborative
translation service of scientific documents to the scientific community. The result of this
project was planned to be hosted on HAL, an open archive where authors can deposit scholarly
documents from all academic fields. Independently of the characteristics bound to scientific
documents (domain adaptation, entity recognition, etc.), the collaborative aspect of this project
relied on both translated and reviewed versions of the scientific documents (PhD theses, articles,
etc.), which are used to improve the quality of the machine translation system through an
analysis based on post-editing.

PROGRAM COMMITTEES & REVIEWING

Program Committee Member for Conferences : ACL long/short (2016, 17, 18, 19, 20) ; COLING
(2016, 18) ; CoNLL (2017) ; EACL long/short (2017) ; EAMT (2017, 2020) ; EMNLP (2015, 17, 18) ;
IJCNLP long/short (2017) ; LREC (2016, 18, 20) ; MT-Summit (2017) ; NAACL (2015, 18)

Program Committee Member for Workshops : WMT (2015, 16, 17, 18) ; IAMT (2014) ; SemEval
(2016) ; QEAPE (2018) ; NLPOSS (2018)

PUBLICATIONS

Refereed Journal Papers
Fomicheva, M., Sun, S., Yankovskaya, L., Blain, F., Guzman, F., Fishel, M., Aletras, N.,
Chaudhary, V., Specia, L. « Unsupervised Quality Estimation for Neural Machine Translation ».
Transactions of the Association for Computational Linguistics (TACL). 2020.

Refereed Conference Papers – in Print/Press

Sun, S., Fomicheva, M., Blain, F., Chaudhary, V., El-Kishky, A., Renduchintala, A., Guzman, F.,
Specia, L. « An Exploratory Study on Multilingual Quality Estimation ». To appear in the
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational
Linguistics and the 9th International Joint Conference on Natural Language Processing, December,
2020.

Blain, F., Aletras, N., Specia, L. « Quality In, Quality Out : Learning from Actual Mistakes ».
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation,
November, 2020.

Okabe, S., Blain, F., Specia, L. « Multimodal Quality Estimation for Machine Translation ».
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, July, 2020.

Specia, L., Blain, F., Logacheva, V., Astudillo, R., Martins, A. « Findings of the WMT 2018
Shared Task on Quality Estimation ». Proceedings of the Third Conference on Machine Translation,
Shared Task Papers, Brussels, Belgium, November, 2018.

Ive, J., Scarton, C., Blain, F., Specia, L. « Sheffield Submissions for the WMT18 Quality
Estimation Shared Task ». Proceedings of the Third Conference on Machine Translation, Shared Task
Papers, Brussels, Belgium, November, 2018.

Ive, J., Blain, F., Specia, L. « deepQuest : A Framework for Neural-based Quality Estimation ».
Proceedings of COLING 2018, the 27th International Conference on Computational Linguistics,
Santa Fe, New Mexico, USA, August 20-26, 2018.

Chatterjee, R., Negri, M., Turchi, M., Blain, F., Specia, L. « Combining Quality Estimation
and Automatic Post-editing to Enhance Machine Translation output ». Proceedings of the
13th Biennial Conference of the Association for Machine Translation in the Americas, Boston, USA,
March 2018.

Blain, F., Specia, L., Madhyastha, P. « Exploring Hypotheses Spaces in Neural Machine
Translation ». Proceedings of the Machine Translation Summit XVI, Nagoya, Japan, September 2017.

Specia, L., Harris, K., Blain, F., Burchardt, A., Macketanz, V., Skadiņa, I., Negri, M.,
Turchi, M. « Translation Quality and Productivity : A Study on Rich Morphology Languages ».
Proceedings of the Machine Translation Summit XVI, Nagoya, Japan, September 2017.

Blain, F., Scarton, C., Specia, L. « Bilexical Embeddings for Quality Estimation ». Proceedings
of the Second Conference on Machine Translation, Volume 2 : Shared Task Papers, Copenhagen,
Denmark, September 2017.

Chatterjee, R., Negri, M., Turchi, M., Federico, M., Specia, L., Blain, F. « ». Proceedings
of the Second Conference on Machine Translation, Volume 1 : Research Papers, Copenhagen,
Denmark, September 2017.

Peter, J-T., Ney, H., Bojar, O., Pham, N-Q., Niehues, J., Waibel, A., Burlot, F., Yvon,
F., Pinnis, M., Sics, V., Bastings, J., Rios, M., Aziz, W., Williams, P., Blain, F., Specia,
L. « The QT21 Combined Machine Translation System for English to Latvian ». Proceedings of the
Second Conference on Machine Translation, Volume 2 : Shared Task Papers, Copenhagen,
Denmark, September 2017.

Logacheva, V., Blain, F., Specia, L. « USFD’s Phrase-level Quality Estimation Systems ».
Proceedings of the First Conference on Machine Translation (WMT), Berlin, Germany, August 2016.

Blain, F., Song, X., Specia, L. « Sheffield Systems for the English-Romanian WMT Translation
Task ». Proceedings of the First Conference on Machine Translation (WMT), Berlin, Germany,
August 2016.

Peter, J-T., Alkhouli, T., Ney, H., Huck, M., Braune, F., Fraser, A., Tamchyna, A., Bojar,
O., Haddow, B., Sennrich, R., Blain, F., Specia, et al. « The QT21/HimL Combined Machine
Translation System ». Proceedings of the First Conference on Machine Translation (WMT), Berlin,
Germany, August 2016.

Aker, A., Blain, F., Duque, A., Fomicheva, M., Seva, J., Shah, K. « USFD at SemEval-2016
Task 1 : Putting different State-of-the-Arts into a Box ». Proceedings of the 10th International
Workshop on Semantic Evaluation (SemEval-2016), San Diego, California, June 2016.

Blain, F., Logacheva, V., Specia, L. « Phrase-Level Segmentation and Labelling of Machine
Translation Errors ». Proceedings of the 10th Edition of Language Resources and Evaluation
Conference (LREC), Portorož, Slovenia, May 2016.

Shah, K., Logacheva, V., Paetzold, G., Blain, F., Beck, D., Bougares, F., Specia, L.
« SHEF-NN : Translation Quality Estimation with Neural Networks ». Proceedings of the Tenth
Workshop on Statistical Machine Translation (WMT), Lisbon, Portugal, September 2015.

Blain, F., Bougares, F., Hazem, A., Barrault, L., Schwenk, H. « Continuous Adaptation to
User Feedback for Statistical Machine Translation ». North American Chapter of the Association
for Computational Linguistics – Human Language Technologies (NAACL HLT 2015), Denver,
Colorado, USA, June 2015.

Blain, F., Hazem, A., Bougares, F., Barrault, L., Schwenk, H. « Project adaptation over
several days ». Translation in Transition 2015, Germersheim, Germany, January 2015.

Federico, M., Bertoldi, M., Cettolo, M., Negri, M., Turchi, M., Trombetti, M., Cattelan,
A., Farina, A., Lupinetti, D., Martines, A., Massidda, A., Schwenk, H., Barrault,
L., Blain, F., Koehn, P., Buck, C., Germann, U. « The MateCat tool ». Proceedings of the 25th
International Conference on Computational Linguistics (COLING’14), Dublin, Ireland, August 2014.

Blain, F. « Projet COSMAT : traduction automatique de contenus scientifiques pour l’anglais et
le français ». Workshop “Les Rencontres du Numérique de l’ANR” by the French National Research
Agency, Paris, France, April 2013.

Blain, F., Schwenk, H., Senellart, J. « Incremental Adaptation Using Translation Information
and Post-Editing Analysis ». Proceedings of the Eighth International Conference on Language
Resources and Evaluation (IWSLT’12), Hong Kong, China, December 2012.

Lambert, P., Schwenk, H., Blain, F. « Automatic Translation of Scientific Documents in the
HAL Archive ». Proceedings of the Eighth International Conference on Language Resources and
Evaluation, Istanbul, Turkey, May 2012.

Lambert, P., Senellart, J., Romary, L., Schwenk, H., Zipser, F., Lopez, P., Blain, F.
« Collaborative Machine Translation Service for Scientific texts ». Proceedings of the Demonstrations
at the 13th Conference of the European Chapter of the Association for Computational Linguistics,
Avignon, France, April 2012.

Blain, F., Senellart, J., Schwenk, H., Plitt, M., Roturier, J. « Qualitative Analysis of
Post-Editing for High Quality Machine Translation ». Proceedings of the 13th Machine Translation
Summit 2011, Xiamen, China, September 2011.

Non-Refereed Abstracts, Reports & other Publications – in Print

Bojar, O., Variš, D., Liu, Q., Negri, M., Turchi, M., Niehues, J., Specia, L., Blain, F.
« Human-informed Continuous Learning ». QT21 deliverable, 2018.

Burchardt, A., Bojar, O., Graham, Y., Harris, K., Liu, Q., Ma, Q., Marheinecke, K., Specia,
L., Blain, F., Skadiņa, I., Pinnis, M., Turchi, M., Macketanz, V., Peter, J-T., Variš,
D., Williams, P. « Quality Estimation Metrics and Analysis of 2nd Annot. Round and Error
Profiles ». QT21 deliverable, 2018.

Niehues, J., Ha, T-L., Burlot, F., Yvon, F., Peter, J-T., Blain, F., Bojar, O., Skadina,
I., Daiber, J., Sima’an, K., Valerio Miceli Barone, A., Sennrich, R., Williams, P., Kim,
Y., Schamper, J., Alkhouli, T., España-Bonet, C. « Final Report on Under-Resourced
Languages ». QT21 deliverable, 2018.

Burchardt, A., Blain, F., Bojar, O., Dehdari, J., Graham, Y., Görög, A., Heigold, G.,
Liu, Q., Ma, Q., Specia, L., Skadiņa, I., Pinnis, M., Turchi, M., Macketanz, V., Peter, J-T.,
Williams, P. « Evaluation Metrics and Analysis of First Annotation Round ». QT21 deliverable,
2017.

Niehues, J., Burlot, F., Peter, J-T., Blain, F., Bojar, O., Skadina, I., Sima’an, K., Williams,
P. « Intermediate Report on Under-resourced languages ». QT21 deliverable, 2017.

Blain, F., Burchardt, A., Bojar, O., Dugast, C., Graham, Y., Harris, K., Huck, M., Lommel,
A., Negri, M., Niehues, J., Specia, L., Thorsten, J-P., Turchi, M., Yvon, F., Brandon,
L., Cornelius, E., Liu, H., Melby, A. « Periodic Report M1-M18 ». QT21 deliverable, 2016.

Bertoldi, N., Blain, F. « Second Report on User-adaptive MT ». MateCAT deliverable, 2014.

Bertoldi, N., Turchi, M., Germann, U., Blain, F., Schwenk, H., Cattelan, A. « Open Source
Distribution ». Third report on field and lab tests, 2014.

Bertoldi, N., Turchi, M., Germann, U., Blain, F., Schwenk, H., Rousseau, A., Cattelan,
A. « Open Source Distribution ». MateCAT deliverable, 2014.

Non-Refereed Abstracts, Reports & other Publications – in Press

Blain, F. « COSMAT : traduction automatique de contenus scientifiques pour l’anglais et le
français ». Workshop “Les Rencontres du Numérique de l’ANR” by the French National Research
Agency, Paris, France, 2012.

Blain, F. « Learn from Post-Editing for High Quality Machine Translation ». Forum “Jeune
Recherche” by the Ph.D. school (finalist of the poster competition), Le Mans, France, November 2011.

Blain, F. « Post-Editing Analysis for High Quality Machine Translation ». Young researchers away
day, Nantes, France, April 2011.

Blain, F. « Modèle de traduction évolutif ». « R&T PME » workshop by the Delegation General pour
l’Armement (French Defense Agency), Issy-Les-Moulineaux, France, April 2011.
Teaching Experience
 2020/21            Machine Translation (module leader), Translation Technology
 2019 - 2020        Co-supervision of MSc students in Computer Science
                    – The University of Sheffield and Imperial College London, UK
 2017 - 2019        Guest lecturer on Machine Translation (NLP module, MSc students)
                    – The University of Sheffield, UK
 2014 (~16h)        « Algorithms and Advanced Programming », Recursion and binary trees
                    (language : C)
                    – Le Mans Université, France
 2006 - 2007        Instructorship for the « Internet and Computer Science Certification » (C2i)
                    – Le Mans Université, France

Contribution to open source projects
deepQuest – 1st framework for neural-based Quality Estimation (lead REF impact case study)
QuEst++ – An open source toolkit for pipelined Translation Quality Estimation
Moses – An open source Statistical Machine Translation system
MateCAT – An open source Computer Assisted Translation tool

Technical skills
 Programming                    Python, Perl, Shell scripting (sh/bash/csh), C/C++
 Machine Learning Software      Keras, PyTorch, scikit-learn, CRFsuite
 Systems                        GNU/Linux, Unix
 Versioning                     Git, SVN, CVS
 Languages                      French (native), English

References
On request.

Wolverhampton, September 2020.