Going beyond Google Translate?

Francesca Chessa
DLS, University of Sassari
Sassari (SS), Italy
fch @ uniss.it

Gavin Brelstaff
CRS4, Loc. Piscina Manna, Ed. 1
09010 Pula (CA), Italy
gjb @ crs4.it

ABSTRACT
We motivate and describe the design and implementation of a web-based system for the alignment of parallel texts. It builds on the interactive color-highlight interface now deployed at Google Translate. Through a series of simple point-and-click operations, translators can mark up equivalent text-ranges in their own translation and in the original. When successful, the visual cues created by this activity should benefit the understanding of readers with limited degrees of bilingualism, and may also capture aspects of semantic context not readily available to algorithmic SMT. We provide a working demonstration that treats poetic texts.

Categories and Subject Descriptors
H.1.2 [User/Machine Systems] Human factors. H.5.2 [User Interfaces] (D.2.2, H.1.2, I.3.6) Natural language. I.7.2 [Document Preparation] Markup languages. H.5.3 [Group and Organization Interfaces] Web-based interaction.

General Terms
Documentation, Design, Human Factors, Standardization.

Keywords
Multilingual web, Translation, Parallel texts, Semantic context, Intermediate representation, TEI markup, Poetry, Cross-browser.

1. INTRODUCTION
In terms of HCI, computer-assisted translation is still relatively unsophisticated. Typically, statistical machine translation (SMT) is first computed and then presented to the reader as a fait accompli, however inaccurate it might be. Recently, however, SMT web-services such as Google Translate [1] have adopted an interactive color-highlight system by which words or phrases in the source text that correspond to those in the translated text light up as the reader passes the cursor over them, thus benefiting from an existing metaphor refined over the years [2,3,4,5]. Although that service does let readers improve on the translated words via a popup text-box, any misalignment between compounds of words cannot be easily corrected, and can significantly impede semantic interpretation. Misalignment happens whenever SMT indicates a false equivalence between text-ranges in the original and the translation, and is usually a fair sign that SMT could not assimilate adequate context.

Here, we provide an interface that lets the human translator mark up what they consider a correct alignment between words, or groups of words, in the original and their own translation, with a view to articulating context that may not be readily available to SMT. We detail below how the interface runs off a web-page and allows the alignment of equivalent ranges in parallel texts via a simple point-and-click action. Alignments created by the user are instantaneously made visible using a variant of the interactive color-highlight system mentioned above. Key to reducing the complexity of the implementation of the interface is our systematic deployment of open-standard, non-proprietary web technologies. The same ideas might be integrated into a full text editor, but we prefer to deliver an alignment tool directly from the web-page in order to promote web-based collaborative interaction between translators. As such, we progress beyond Germann's Yawat demonstrator [6] to a cross-browser solution facilitated by the jQuery JavaScript library [7].

2. HCI TO COMPLEMENT SMT
Machine Intelligence typically progresses by the development of algorithms that emulate aspects of human perceptual and cognitive activity, where these algorithms process the same data available to humans (e.g. digital texts) and attempt to produce the same or better results. The tendency is towards objective algorithmic optimization that is to be achieved without explicit access to semantic context, in the hope that such elusive context might emerge implicitly as statistical correlations inherent in the computation of the algorithm. This is fine in scenarios where full automation can be achieved, but can be counterproductive otherwise, especially when human intervention is later required in order to first find and then correct a significant percentage of mistaken results, as can happen with SMT.

Indeed it may be preferable to design algorithms to complement acknowledged human skills rather than focus on optimizing existing algorithms to compete with and supersede them. A first step towards designing such algorithms, for SMT, is to establish an intermediate representation that might reasonably articulate semantic context so that it can be readily manipulated by both man and machine, by either cognition or computation (sketched in Figure 1).

Figure 1: A spatio-visual intermediate representation for semantic context
We side-step the innate complexities of a theoretical approach [8,9] and seek, instead, an intermediate representation amongst the elements presented as interactive color highlights in user interfaces like Google Translate. In the first instance, this representation might be pictured as simply drawing a set of labeled boxes around words and phrases in the original and translated texts and then joining lines between those boxes that carry equivalent meaning. Depending on the type of equivalence, the lines might be colored differently (e.g. green: literal; yellow: approximate; red: paraphrase). This perspective makes it clear that we are dealing with a spatio-visual mapping between parallel texts.

Although a diagram of such a mapping should be easy to digest visually, the task of constructing it in a standard machine-readable format may be considered beyond the competence of a typical translator, even one furnished with the latest computer graphics tools: the manipulation of intersecting lines quickly becomes overly complex for the non-technical user. Translators, by nature, are familiar with constructing texts rather than graphics. Thus we provide a pared-down graphical user interface (GUI) designed specifically to simplify the mapping task to a series of point-and-click operations occurring on top of the familiar interactive color-highlight system. The translator may then conceive of their task as a traditional markup task that operates upon text, not graphics, and which they can do in sequence (focusing on only one equivalence at a time) while periodically reviewing the overall results, either by tracing the cursor along the lines of text or by the other means described later. Beneath the GUI, the intermediate representation is maintained as digital text, not graphics, using the TEI markup language [10]: an open standard format defined in XML, and used extensively within academic text-annotation and archival communities.

3. DESIGN FACTORS
We were guided by both ergonomic and pragmatic factors in the design of the GUI:

1. Text selection is made by point-and-click, and not by the click-and-drag operation ubiquitous in text-editors. This mandates the automatic pre-segmentation (via jQuery) of each text into its constituent semantic atoms: e.g. words in most languages, and ideograms in Chinese. This focuses the user immediately on their task of aligning words between texts.

2. Any translator is necessarily engaged in a language-based task, and thus we try to keep our interface non-verbal so as not to disturb their cognitive activity, by focusing as far as possible on spatio-visual cues.

3. The default cursor sprite displayed in any web browser is bad for pointing at words while they are being read: either a little white hand, or a small vertical dark bar, obscures letters in the word and disrupts the task. We swap in a little see-through cursor instead, if the browser permits (Opera alone does not).

4. Since the two parallel texts are displayed on screen side by side, the words to be aligned across the texts are almost always in view at the same time. Thus when the user scrolls down in one text it is useful to scroll down the other synchronously and automatically, as we have programmed.

5. The fovea of the eye cannot resolve much more than a few words at a time, and reading fluently generally requires a delicate choreography of eye movements involving short-term visual memory. Thus, although our interface seems to pretend we can, we can never truly read two parallel texts simultaneously: it is simply impossible to resolve the letters in two well-separated locations at once [11]. Instead, our interface is designed to facilitate a smooth switching of gaze between those two locations. We do this by reinforcing the visibility of the highlighted words, so that when they are not being resolved by the fovea our peripheral vision is better able to direct the next eye movement towards them. Such reinforcement is achieved by the use of an extended semantic highlighting scheme, detailed below.

4. DOMAIN: POETIC TEXTS
We focus on texts considered to be an extreme challenge for SMT at Google: poetry [12]. Our intention is to express elusive aspects of semantic communication in order to differentiate those that can be spatio-visually articulated from others that cannot.

Any translator committed to providing a definitive version of a poem eventually arrives at an irreversible order of words, and may actually wish to document their choices by justifying their correspondence to the original. They may deviate from literal correspondence for many good reasons, seldom due to a wish to mystify or add artifice. To convey the thought expressed in a source text while judiciously ignoring literalness, word order, or grammatical voice is to obtain what Nida terms dynamic equivalence [13]. Such deviation from literalness is also essential simply to reestablish an equivalent esteem, in the translation, to that attained by the original work in its original language. That SMT seldom achieves this often becomes pitifully apparent when ...

In TEI [10] the <w> tag is provided to mark up word-like entities (not necessarily orthographic words) and we adopt it as the smallest semantic unit for alignment. It is also useful to align compounds of such semantic units to indicate the textual expression of a coherent idea; for this we enclose the units within a TEI <phr> (phrase) or <s> (sentence) tag. Here we intend a loose definition of a sentence that is again not tied to any particular typographic convention: poets often violate punctuation, omitting full-stops and commas to gain their effect, yet they still insist on pedantic positioning of their line-breaks and stanzas. In the latter case, TEI offers the <l> and the <lg> tag to delimit the start and end of each line and stanza, respectively, as the following extract from T.S. Eliot's Ash Wednesday illustrates:

    <lg>
      <l>Here are the years that walk between, bearing</l>
      <l>Away the fiddles and the flutes, restoring</l>
      <l>One who moves in the time between sleep and waking, wearing</l>
    </lg>
    <lg>
      <l>White light folded, sheathing about her, folded.</l>
      <l>The new years walk, restoring</l>
      ...
      <l>While jewelled unicorns draw by the gilded hearse.</l>
    </lg>

Yet this encoding becomes less attractive when we wish to grant the freedom to segment over more than one line or stanza. For example, if we attempt to segment the single phrase "wearing White light" straddling the two stanzas above:

    <phr>wearing</l></lg><lg><l>White light</phr>

we run into the problem of overlapping mark-up [13]: the XML stops being valid when one range-based tag intersects another. Two hierarchies ({lg, l} and {s, phr, w}) are competing and one needs to be given priority. TEI provides a way to preserve <l> and <lg> tags by assigning an enjamb attribute to those on the new line. Since semantic segmentation is our priority, we instead transform <l> and <lg> tags into point-based mark-up and dedicate range-based delimitation to segmentation and alignment, thus:

    ... waking, <phr>wearing

    White light</phr> folded, sheathing about her, folded. ...

For clarity, above we show only one level of semantic segmentation, a phr element, which would normally be nested inside an s element and within which would be nested several w elements. Now convenient for segmentation mark-up, this format has the advantage that it remains valid TEI/XML, while it can easily be transformed back into canonical line/stanza form by applying an XSL stylesheet that implements a technique known as grouping [14].

We also rationalize alignment mark-up with respect to conventional TEI practice, by labeling each <w> tag using an n attribute simply composed from the text in that element, first substituting underscore characters for any punctuation and intervals of white space. Thus we label the Latin text

    <w n="in_loca">in loca</w>

and then we align to it an English translation as follows

    <w n="la:in_loca">for parts</w>

where the prefix la: indicates that the source language is Latin. To disambiguate multiple occurrences of a given phrase in the source text, an ordinal postfix is appended, e.g. n="la:in_loca.2". An additional type attribute is inserted whenever the translation is not to be considered literal, to indicate whether it is approximate or a paraphrase, e.g.

    <w n="la:in_loca" type="approx">for parts</w>

This direct labelling avoids the additional complexity that would be incurred by the conventional TEI practice of deploying link and linkGrp tags [15]. Finally, we provide for limited rich-text rendering by respecting a small set of TEI tags that are rendered as italics or in bold, each treated the same as the <w> tag but with an additional rend attribute to distinguish them.

5. METHOD

5.1 Web delivery
Both source and translation text are delivered to the browser as XML in the TEI format detailed above, and loaded into individual frames of an HTML frameset, so that the two texts may be read side by side as is traditional for parallel texts. To this end the XML in each frame is immediately transformed using a simple XSL stylesheet so that each s, phr, or w element becomes an HTML span element of class s, phr, or w, respectively (denoted herein as span.s, span.phr and span.w). Thus the translated Latin phrase above becomes:

    <span class="w" n="la:in_loca">for parts</span>

For each span.w a hover event-handler is attached using jQuery, so that whenever the reader passes the cursor over that element it receives a color highlight. By default the highlight is green; it is yellow if type is "approx", and red if it is a paraphrase. In our system the same highlighting effect is applied in the parallel frame at any span.w that shares the same n attribute value (modulo a language prefix). Finally, any span.phr and span.s enclosing a highlighted span.w are themselves highlighted, in white and grey respectively, in order to project semantic context to the reader.

5.2 Interactive alignment
Once color highlighting is activated, each word in the text is made available for interactive alignment via mouse clicks. This is achieved in the following stages:

Figure 2: GUI: Selection by Click in the source text, the original Sardinian poem by Antonino Mura Ena
(top) Nothing is highlighted while cursor is not over a word of the text;
(middle) Cursor now hovers over the word peraula which becomes highlighted in green, with the surrounding phrase in white and
idea/sentence in lighter grey;
(lower) Following a mouse click the word peraula gains a dashed black border – which indicates that it is selected. Selection might
alternatively have been initiated in the text on the right.
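
The hover behaviour pictured in Figure 2 rests on two small rules from Section 5.1: the highlight color follows the type attribute, and elements in the two frames light up together when their n labels agree once the language prefix is set aside. The following is a minimal sketch of those two rules as plain functions, not the authors' code. The function names, the "paraphrase" attribute value, and the assumption that a language prefix is two or three lowercase letters followed by a colon are all illustrative choices that the paper does not specify.

```javascript
// Hypothetical helpers: names, the "paraphrase" value and the prefix
// pattern are assumptions made for this sketch.

// Map the type attribute of an element to its highlight color.
function highlightColor(type) {
  if (type === "approx") return "yellow";
  if (type === "paraphrase") return "red"; // assumed attribute value
  return "green"; // default: literal correspondence
}

// Drop a leading language prefix such as "la:" from an n label.
function stripLanguagePrefix(n) {
  return n.replace(/^[a-z]{2,3}:/, "");
}

// Two elements are highlighted together when their n labels agree
// modulo the language prefix. An ordinal postfix (".2") stays part
// of the label, so distinct occurrences remain distinct.
function sameAlignment(nA, nB) {
  return stripLanguagePrefix(nA) === stripLanguagePrefix(nB);
}
```

In a browser, a jQuery hover handler on each span.w could read the element's type and n attributes with .attr(), pass them to these functions, and then color every matching element in the parallel frame.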
Figure 3: GUI: Selection by Alignment of the translation (Following on from Figure 2)
(top) Moving the cursor to the right hand text highlights a chosen word, as before. The previous selection continues to be indicated by the
dashed border though it loses its highlight.
(middle) Following a mouse click, the word "word" also gains a border: it is now selected too. Further words could be selected in either
frame before continuing, or the color could be changed to indicate approximation (yellow) or paraphrase (red).
(lower) A further click with a meta-key held down invokes an alignment – with all currently selected words (here peraula and word) being
marked-up in literal correspondence (in green). Immediately, both borders are dissolved – indicating the operation is complete – and
instantly the highlighting switches on both words to indicate the newly made alignment. From now on, whenever the cursor passes over
either word peraula or word both will light up, unless a split operation is later applied to one of them.
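
The alignment step in Figure 3 comes down to computing one shared label for all selected elements. The paper describes that label as the selected source words concatenated in text order, with punctuation omitted and underscores as separators, and with a language prefix added on the translation side. The sketch below illustrates that rule under stated assumptions: the function names are hypothetical, the words are assumed to arrive already in text order, and the language code is passed in by the caller.

```javascript
// Hypothetical sketch of the merge and alignment labels.

// Remove punctuation from a single word (Unicode-aware).
function stripPunctuation(word) {
  return word.replace(/\p{P}/gu, "");
}

// Concatenate the selected words in text order, omitting punctuation
// and separating the words with underscores.
function makeLabel(words) {
  return words
    .map(stripPunctuation)
    .filter((w) => w.length > 0)
    .join("_");
}

// Alignment: the label is built from the source-frame words only.
// Elements in the translation frame receive the same label with a
// language prefix (e.g. "la" for a Latin source) attached.
function alignmentLabels(sourceWords, sourceLang) {
  const label = makeLabel(sourceWords);
  return { source: label, translation: sourceLang + ":" + label };
}
```

For the paper's Latin example, alignmentLabels(["in", "loca"], "la") gives the source label in_loca and the translation label la:in_loca. A split operation would restore each element's earlier label, which an implementation would need to keep alongside the current one.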

Figure 4: Review: by clicking on the Notes icon (not shown) words from the source text are temporarily displayed in blue next to their
aligned counterparts in the translation. Here we illustrate only the first two lines.
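
The review shown in Figure 4 can be pictured as a lookup: each aligned element in the translation is paired with the source text that carries the same label. The sketch below is only an illustration of that idea. It assumes each element has been reduced to a plain object with hypothetical n and text fields, treats an element as aligned only when its label carries a language prefix, and renders the note in square brackets where the interface uses blue text.

```javascript
// Hypothetical sketch of the Notes review: pair each aligned element
// of the translation with the source text sharing its label.
function reviewNotes(sourceElems, translationElems) {
  // Index the source elements by their n label.
  const byLabel = new Map();
  for (const el of sourceElems) {
    byLabel.set(el.n, el.text);
  }
  return translationElems.map((el) => {
    // Only labels with a language prefix (e.g. "la:") are aligned.
    const match = /^[a-z]{2,3}:(.*)$/.exec(el.n);
    if (!match) return el.text;
    const sourceText = byLabel.get(match[1]);
    return sourceText === undefined
      ? el.text
      : el.text + " [" + sourceText + "]";
  });
}
```

With the source element { n: "in_loca", text: "in loca" } and the translation element { n: "la:in_loca", text: "for parts" }, the result contains the string "for parts [in loca]".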
1. Atomic segmentation: Each separate word in the text is automatically enclosed in a span.w element, for which an n attribute is generated duplicating its textual content. For languages like Chinese each ideogram is segmented individually, rather than using just white-space boundaries. Punctuation marks always cause boundaries between span.w segments.

2. Restoring state: Any text delivered within a <w> tag is given the n attribute of that tag when it becomes a span.w element; thus state gets restored on delivery. Generally such elements contain more than one word and are the product of a previous merge operation (see point 4 below).

3. Selecting by click: Each resultant span.w element is assigned an on-click event-handler so that once clicked it draws a thin black rectangular border around its text (or removes it upon a second click). The user can thus select a group of elements one by one in one or both of the parallel texts, with the borders providing the visual feedback.

4. Merge operation: When several elements are selected they can then be merged. This is achieved by clicking upon one of them while holding down a meta-key (e.g. the Control, Alt or Command key) on the keyboard. The event-handler, this time, assigns the same n attribute value to all of the selected span.w elements. That value is the concatenation of the component words, in text order, omitting punctuation and separated by underscores.

5. Alignment: When elements from both frames are simultaneously selected for a merge it becomes an alignment operation, which is achieved by assigning a common label (the n attribute value) to each element selected. The label is computed as the concatenation of only those words from the source text frame. When it is assigned to the elements in the translation frame the language prefix is attached, as discussed earlier.

6. Split operation: When a merge operation is attempted on a previously merged group of elements, the event-handler simply splits the group back into its component elements by reassigning their previous n attribute values.

7. Review: At any time the Notes icon can be clicked, whereby words from the source text are temporarily displayed in blue next to their aligned counterparts in the translation (see Figure 4). Another way to review is simply to trace the cursor along the reading line word by word and observe the highlighting, a process that can be automatically emulated, at one word per second, by clicking on the Walkthrough button, without the need to move the mouse. In addition, once the Save button is pressed the resultant XML is displayed, and it may be reviewed or even edited before submitting it to an archive server.

6. DEMO
Our interactive color-highlight interface has been successfully used by translators to align multilingual parallel texts of poems involving the following languages: English, Russian, Chinese, Italian, Latin and Sardinian, as can be reviewed at:
http://fch.uniss.it/_MLW
Figures 2 and 3 show screenshots of the GUI in action, aligning the parallel text of a Sardinian poem and its English translation.

7. ACKNOWLEDGMENTS
Our thanks go to Gianluigi Zanetti (CRS4) and Sergio Usai.

8. REFERENCES
[1] Google Translate, accessed May 2011. http://translate.google.com.
[2] Brelstaff, G.J., Chessa, F. 1998. "Sustaining the paper metaphor with Dynamic HTML", Conference Companion, HCI 98, Ed. Jon May et al, Sheffield UK, 16-17.
[3] Bouvin, N.O., Zellweger, P.T., Gronbaek, K., Mackinlay, J.D. 2002. Fluid Annotations Through Open Hypermedia: Using and Extending Emerging Web Standards, WWW Conf., (Honolulu, Hawaii, USA. May, 2002), 160-171. DOI= http://doi.acm.org/10.1145/511446.511468.
[4] Multilingual markup demo, accessible since 2009. http://fch.uniss.it/MLM.
[5] Tiedemann, J. 2006. ISA & ICA: Two web interfaces for interactive alignment of bitexts, In Proceedings of LREC. Genova, Italy.
[6] Germann, U. 2008. Yawat: yet another word alignment tool, Proceedings of the ACL-08: HLT Demo Session, (Columbus, Ohio, USA June 2008), 22-28.
[7] jQuery JavaScript Library v1.4.2, accessed May 2011. http://jquery.com/
[8] Civera, J., Juan, A. 2007. Domain Adaptation in Statistical Machine Translation with Mixture Modeling, Proceedings of the Second Workshop on Statistical Machine Translation, (Prague, June 2007), 177-180.
[9] Behrens, C., Kashyap, V. 2002. The Emergent Semantic Web: A Consensus Approach for Deriving Semantic Knowledge on The Web. In Real world semantic web applications, Eds: Kashyap & Shklar, IOS Press, 69-90.
[10] Text Encoding Initiative Consortium, 2011. TEI P5: Guidelines for Electronic Text Encoding and Interchange. XML Version, 1.9.0. updated on Feb 21 2011. http://www.tei-c.org/release/doc/tei-p5-doc/en/html/
[11] Pelli, D. G., Tillman, K. A., Freeman, J., Su, M., Berger, T. D., & Majaj, N. J. 2007. Crowding and eccentricity determine reading rate. Journal of Vision, 7(2):20, 1-36, http://journalofvision.org/7/2/20/, doi:10.1167/7.2.20.
[12] Genzel, D., Uszkoreit, J., Och, F. 2010. "Poetic" Statistical Machine Translation: Rhyme and Meter, Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 158-166.
[13] Marinelli, P., Vitali, F., Zacchiroli, S. 2008. Towards the unification of formats for overlapping markup. The New Review of Hypermedia and Multimedia. Vol.14, No.1, 57-94.
[14] Tennison, J. 2008. XSL Pages: Grouping. accessed May 2011. http://www.jenitennison.com/xslt/grouping/index.html
[15] Boot, P. 2009. Towards a TEI-based encoding scheme for the annotation of parallel texts, Literary and Linguistic Computing, Vol.24, No.3, 347-361.