Chinese Character Decomposition for Neural MT with Multi-Word Expressions

*ADAPT Research Centre

^ Insight Centre for Data Analytics

Dublin City University, Ireland
Lifeng Han*, Gareth J. F. Jones*, Alan F. Smeaton^, and Paolo Bolzoni
Research papers at NoDaLiDa 2021 (the 23rd Nordic Conference on Computational Linguistics) and the COLING 2020 MWE-LEX workshop
Bonus takeaway: AlphaMWE multilingual corpus with MWEs
ADAPT seminar series, June 2021

www.adaptcentre.ie
Content
• Background (motivation of this work)

• Related work (Decomposition4MT)

• Our refined decomposition NMT model with MWEs

• Automatic and Crowd-source Human evaluation

• Experts’ analysis with Examples (new insight)

• => AlphaMWE (multilingual lexicon: corpus with MWEs)
Data: https://github.com/poethan/MWE4MT (subfolder radical4mt)

Background
Driven by two factors:
• Sub-character NMT:

• BPE works for English and other Western languages.

• What about Asian ideographic scripts?

• NMT bottlenecks (MWEs, low-frequency words, OOV words):

• How can multiword expression (MWE) translation be better addressed?

• MWEs “display lexical, syntactic, semantic, pragmatic and/or statistical idiomaticity”,

• e.g. kick the bucket, by and large, pull one's leg

NMT components
[Figure: NMT components: parallel corpora, trainer (neural networks) and learned model; encoding of the source text by the encoder (NNs) and decoding by the decoder (NNs), built from RNN, CNN and attention; MT outputs in the target language.]

A bigger view: where this work belongs within the NMT research paradigm
[Figure: NMT branches]
• Linguistic structure & knowledge: syntax structure & dependency
- Tree2string: Eriguchi et al. (ACL 2016)
- String2tree: Aharoni & Goldberg (ACL 2017)
- Tree2Tree? T2T NN program translation: Chen et al. (ICLR 2018)
- Dependency: Wu et al. (IJCAI 2017)
• Learning model
- Bi-directional RNN
- Attention, coverage
- All attention (Transformer, 2017)
- Pretraining language model: BERT (Google, 2018)
• Semantics & disambiguation
- MWE methodology
- Dictionary usage
- Bilingual phrase table
- …
Background
Chinese characters, example
• Semantic part + phonetic part

• Semantics: radicals

• Phonetics: related to the overall pronunciation of the character
[Figure: Chinese radical 刀 (dāo, knife) evolving from pictogram to regular script. Periods shown: Shang Dynasty (1600-1046 BC), Western Zhou Dynasty (1045-771 BC), Warring States period (476-221 BC), Han Dynasty (202 BC-220 AD), Eastern Han (from 57 AD on). Script forms shown: oracle bone script, bronze inscriptions, silk (on seal) script, regular script.]
Background
[Figure: decomposition of two characters at Level-1, Level-2, Level-3, … and full-stroke level. 鋒 (fēng) splits into a semantic part (metal) and a phonetic part (féng); 劍 (jiàn) splits into a phonetic part (qiān) and a semantic part (knife).]
Example sentence at different representation levels:
Word level: 28 / 岁 / 厨师 / 被 / 发现 / 死于 / 旧金山 / 一家 / 商场
Character: 28 岁 厨 师 被 发 现 死 于 旧 金 山 一 家 商 场
Pronunciation: èr shí bā suì chú shī bèi fā xiàn sǐ yú jiù jīn shān yī jiā shāng chǎng
Radical: 28 [radical sequence of the characters above]
English ref.: 28-Year-Old Chef Found Dead at San Francisco Mall
Background
ZH source: 年年岁岁花相似，岁岁年年人不同。
ZH pinyin: Nián nián suì suì huā xiāng sì, suì suì nián nián rén bù tóng.
EN reference: The flowers are similar each year, while people are changing every year.
EN MT output: One year spent similar, each year is different
Example of MWEs as a challenge for MT. Reference: Han et al. (LREC 2020) MultiMWE: Building a Multi-lingual Multi-Word Expression (MWE) Parallel Corpora. https://www.aclweb.org/anthology/2020.lrec-1.363/
Related work
Chinese character decomposition
• Radical embeddings as additional features for Chinese → English and Japanese-Chinese NMT:

• Our own, Han and Kuang (2018): a range of encoding models including word+character, word+radical, and word+character+radical (best), with bidirectional RNNs

• Zhang and Matsumoto (2018): radical embeddings as additional features for character-level LSTM-based NMT on Japanese → Chinese translation

• Bidirectional English-Japanese, English-Chinese and Chinese-Japanese NMT at word, character, ideograph and stroke levels:

• Zhang and Komachi (2018)

• Experiments showed that the ideograph level was best for ZH→EN MT, while the stroke level was best for JP→EN MT

• No testing of intermediate decomposition levels
Han and Kuang (2018) Incorporating Chinese radicals into neural machine translation: deeper than character level. In: 30th European Summer School in Logic, Language and Information (ESSLLI 2018) https://arxiv.org/abs/1805.01565
Refined decomposition model with MWEs
IDS files from CHISE
CHISE (CHaracter Information Service Environment) project. Comprised of 88,940 Chinese characters from CJK (Chinese, Japanese, Korean script) Unified Ideographs.
https://github.com/cjkvi/cjkvi-ids
http://www.chise.org/
[Table: example IDS entries. Columns: character, decomposition, decomposition. Rows: (lì) with decompositions tagged [G] and [T]; (jù) tagged [GTKV] and [J]; (hán) tagged [GTV] and [JK]; (yǒng) tagged [GTV] and [JK].]
Character construction operators: up-down, left-right, inside-outside, embedded.

Extraction procedure
• To obtain a level-L decomposition of a Chinese character α:

• go through the IDS file L times;

• on each pass, look up each newly generated smaller component in the IDS character list, and

• replace it with its own decomposed representation, recursively.
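The passes described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' extraction script: the table `TOY_IDS`, the function names `decompose` and `strip_idc`, and the choice to drop the layout operators afterwards are all hypothetical, and a real run would load the mapping from the CHISE IDS files instead.

```python
# Toy IDS table (hypothetical excerpt): character -> one-step decomposition.
# A real run would load this mapping from the CHISE IDS files.
TOY_IDS = {
    "盟": "⿱明皿",
    "明": "⿰日月",
    "好": "⿰女子",
}

# Ideographic Description Characters (layout operators), U+2FF0..U+2FFB.
IDC = {chr(cp) for cp in range(0x2FF0, 0x2FFC)}


def decompose(text, ids, level):
    """Run `level` passes; each pass replaces every character found in `ids`."""
    for _ in range(level):
        text = "".join(ids.get(ch, ch) for ch in text)
    return text


def strip_idc(sequence):
    """Drop the layout operators, keeping only the component characters."""
    return "".join(ch for ch in sequence if ch not in IDC)


# Show the same character at decomposition levels 0, 1 and 2.
for level in range(3):
    print(level, decompose("盟", TOY_IDS, level))
```

Characters with no entry in the table are left unchanged, so additional passes are harmless once a character is fully decomposed.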

Examples of decomposition/extraction
Shared bilingual glossaries: https://github.com/poethan/MWE4MT/tree/master/radical4mt
Zh MWE: 高尔夫球俱乐部 (golf club), 汽车散热器 (car radiator)
Rxd1: {亠口冋𠂊小二人王求亻具乐咅阝}, {氵气车龷攵执灬吅犬吅}
Rxd2: {丶一口冂口𠂊小一一人一土一氺丶亻且一乐立口阝}, {氵𠂉一乁车卄一攵扌丸灬口口犬口口}
Rxd3: {丶一口冂口𠂊小一一人一十一一亅丷八丶亻且一乐亠丷一口阝}, {氵𠂉一乁车十丨一攵扌九丶灬口口犬口口}
Level 2 generates 土 and 氺; level 3 then decomposes them further.

Adding MWEs and decomposed MWEs
Lifeng Han, Gareth J.F. Jones and Alan F. Smeaton. 2020. MultiMWE: Building a Multi-lingual Multi-Word Expression (MWE) Parallel Corpora. Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), pages 2970–2979, Marseille, 11–16 May 2020.
Evaluation
[Figures: BLEU scores with increasing learning steps]
[Figure: Human Direct Assessment]
Expert Analysis: new research trend
- BLEU has long been criticised for not reflecting real differences between high-performing MT models.
- Crowd-sourced human evaluation is not reliable either: very recent work on WMT data shows that professional translators largely disagree with crowd-sourced human rankings of MT systems (Freitag et al., 2021).
- BLEU and crowd-sourced human assessment (CSHA):
- tend to favour ‘boring’ translations;
- when lexical diversity in the MT output improves, the scores (BLEU, CSHA) get lower.
- We look at detailed translation examples from the different system outputs at 100K learning steps, assessed by a human expert who is a native speaker: the examples reflect the advantages of the decomposition models, e.g. RXD3 (and also RXD1), on MWE translation.
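To make the point about lexical diversity concrete, here is a small sketch of clipped n-gram precision, the matching component inside BLEU. It is not a full BLEU implementation (no brevity penalty, no averaging over n-gram orders), the function names are hypothetical, and the three sentences are invented for illustration rather than taken from the experiments.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, reference, n):
    """Share of candidate n-grams also found in the reference (counts clipped)."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


# Invented sentences: a literal candidate and a lexically diverse paraphrase.
reference = "the chef was found dead at a mall"
literal = "the chef was found dead at a mall"
diverse = "the cook was discovered dead in a shopping centre"

for name, cand in [("literal", literal), ("diverse", diverse)]:
    print(name, clipped_precision(cand, reference, 1), clipped_precision(cand, reference, 2))
```

The paraphrase keeps the meaning but shares few unigrams and no bigrams with the reference, so any metric built on this matching step scores it far below the literal candidate.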

Translation examples (two sentences per system):
src:
28
28
ref:
28 @-@ Year @-@ Old Chef Found Dead at San Francisco Mall
a 28 @-@ year @-@ old chef who had recently moved to San Francisco was found dead in the stairwell of a local mall this week .
rxd3:
the 28 @-@ year @-@ old chef was found dead at a San Francisco mall
a 28 @-@ year @-@ old chef who recently moved to San Francisco has been found dead on a stairwell in a local mall this week .
base:
the 28 @-@ year @-@ old chef was found dead in a shop in San Francisco
a 28 @-@ year @-@ old chef who has moved to San Francisco this week was found dead on the stairs of a local mall .
base MWE:
28 @-@ year @-@ old chef was found dead at a San Francisco mall
a 28 @-@ year @-@ old chef who recently moved to San Francisco was found dead this week at a local mall .
rxd3 MWE:
28 @-@ year @-@ old chef was found dead at a San Francisco mall
a 28 @-@ year @-@ old chef recently moved to San Francisco was found dead this week at a local mall .
rxd1:
the 28 @-@ year @-@ old chef was found dead at a San Francisco mall
a 28 @-@ year @-@ old chef recently moved to San Francisco was found dead in a local shopping mall this week .
rxd2:
the 28 @-@ year @-@ old chef was found dead in a San Francisco mall
a 28 @-@ year @-@ old San Francisco chef was found dead in a local mall this week .
Expert analysis: insights on MWEs
1) Chinese MWE 商场 (shāng chǎng) in the first sentence: correctly translated as “mall” by the rxd3 model, versus “shop” by the baseline character sequence model.
2) MWE 楼梯间 (lóutījiān) in the second sentence: correctly translated as “stairwell” by the rxd3 model, versus “stairs” by the baseline.
3) MWE 近日 (jìn rì), meaning “recently”, in the second sentence:
- totally missed by the original character sequence model, which results in a misleading, ambiguous translation of an even larger span of content, i.e., did the chef move to San Francisco (SF) “recently” or “this week”?
- correctly translated by the rxd3 model, so the overall meaning of the sentence is clear.
Expert analysis: on Multi-word Expressions
1) The improvement on MWEs is not reflected by BLEU scores:
a) because of the low percentage of MWEs in the corpus; however, MWEs are an important part of the corpus and of human languages/expressions;
b) because of MWE interpretations, lexical diversity in translation, and the reference corpus.
2) It is not reflected by crowd-sourced human assessment either, because the assessors:
a) were not well trained;
b) had no clear guidelines in most cases;
c) did not have a linguistic/translator background;
d) favour the candidate translation with n-gram matching to the source/reference.
- Whenever human expert assessment is available/possible, do it!
Expert analysis: Diagnose RXD2
1) RXD1 separates a character into its semantic and phonetic parts.
2) RXD3 decomposes further, into a more stroke-like ordered sequence.
3) RXD2 generates smaller characters that can mislead the language understanding model.
Example (from the earlier decomposition figure):
RXD2 produces the new characters 从 (cóng) and 王 (wáng) from 劍 (jiàn, “sword”) and 鋒 (fēng, “edge/sharp point”) respectively, but these carry no direct meaning from their parent characters; instead they mean “from” and “king” respectively.
Bonus corpus: https://github.com/poethan/AlphaMWE
AlphaMWE
[Figure: Procedure for constructing AlphaMWE]
AlphaMWE
Size, coverage, usage - come and join us
• Extracted all 750 English sentences that include vMWE tags

• English source: Walsh et al. (2018) https://gitlab.com/parseme/parseme_corpus_en

• Target languages covered so far: Chinese, German, Polish and Italian, with Spanish and French under editing (why not join the team?)

• Its size is comparable to that of some standard shared task data:

• development and test data sets from the annual WMT (Bojar et al., 2017) and from the NIST MT challenges have approximately 2,000 sentences for dev/testing over some years (https://www.nist.gov/programs-projects/machine-translation)

• We plan to submit it for shared tasks: multilingual/bilingual MT, NLP
Examples of AlphaMWE sentences: EN and DE/PL/ZH/IT
Plain English corpus:
The chair was comfortable, and the beer had gone slightly to his head.
I was smoking my pipe quietly by my dismantled steamer, and saw them all cutting capers in the light, with their arms lifted high, when the stout man with mustaches came tearing down to the river, a tin pail in his hand, assured me that everybody was 'behaving splendidly, splendidly, dipped about a quart of water and tore back again. (the italic was not annotated in source English)
English MWEs:
gone (slightly) to his head, cutting capers, tearing down, tore back
Target Chinese corpus:
[sourceVMWE: gone (slightly) to his head][targetVMWE: ( )]
“ ”
[sourceVMWE: cutting capers; tearing down; tore back][targetVMWE: ; ; ]
Target German corpus:
Der Stuhl war bequem, und das Bier war ihm leicht zu Kopf gestiegen. [sourceVMWE: gone (slightly) to his head][targetVMWE: (leicht) zu Kopf gestiegen]
Ich rauchte leise meine Pfeife an meinem zerlegten Dampfer und sah, wie sie alle im Licht mit hoch erhobenen Armen Luftsprünge machten, als der stämmige Mann mit Schnurrbart mit einem Blecheimer in der Hand zum Fluss hinunterkam und mir versicherte, dass sich alle prächtig, prächtig benahmen, etwa einen Liter Wasser eintauchte und wieder zurückwankte”. [sourceVMWE: cutting capers; tearing down; tore back][targetVMWE: Luftsprünge machten; hinunterkam; zurückwankte]
Target Polish corpus:
Krzesło było wygodne, a piwo lekko uderzyło mu do głowy. [sourceVMWE: gone (slightly) to his head][targetVMWE: (lekko) uderzyło mu do głowy]
Cicho paliłem swoją fajkę przy zdemontowanym parowcu i widziałem, jak wszyscy pląsają w świetle, z podniesionymi wysoko ramionami, gdy twardziel z wąsami przyszedł szybkim krokiem do rzeki, blaszany wiaderko w dłoni, zapewnił mnie, że wszyscy zachowują się wspaniale, wspaniale, nabrał około ćwiartkę wody i zawrócił szybkim krokiem”. [sourceVMWE: cutting capers; tearing down; tore back][targetVMWE: pląsają; przyszedł szybkim krokiem; zawrócił szybkim krokiem]
Target Italian corpus:
La sedia era comoda, e la birra gli aveva leggermente dato alla testa. [sourceVMWE: gone (slightly) to his head][targetVMWE: aveva (leggermente) dato alla testa]
Stavo fumando tranquillamente la pipa vicino al mio piroscafo smontato, e li ho visti tutti giocare gioiosamente alla luce, con le braccia alzate, quando l'uomo robusto con i baffi è venuto giù al fiume alacremente, un secchio di latta in mano, mi ha assicurato che tutti si stavano comportando splendidamente, splendidamente, ha preso circa un litro d'acqua ed è tornato indietro velocemente. [sourceVMWE: cutting capers; tearing down; tore back][targetVMWE: giocare gioiosamente; venuto giù alacremente; tornato indietro velocemente]
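The bracketed tags in the examples above are easy to read programmatically. The sketch below assumes each corpus line is a sentence followed by `[sourceVMWE: …]` and `[targetVMWE: …]` tags with `;` separating multiple MWEs, as in the slide examples; the names `TAG` and `parse_line` are hypothetical and this is not an official AlphaMWE loader.

```python
import re

# Matches "[sourceVMWE: ...]" or "[targetVMWE: ...]", tolerating stray spaces.
TAG = re.compile(r"\[\s*(sourceVMWE|targetVMWE)\s*:\s*([^\]]*)\]")


def parse_line(line):
    """Split one annotated line into (sentence, source MWEs, target MWEs)."""
    tags = {"sourceVMWE": [], "targetVMWE": []}
    for name, body in TAG.findall(line):
        tags[name].extend(part.strip() for part in body.split(";") if part.strip())
    sentence = TAG.sub("", line).strip()
    return sentence, tags["sourceVMWE"], tags["targetVMWE"]


# German example line from the slide.
example = (
    "Der Stuhl war bequem, und das Bier war ihm leicht zu Kopf gestiegen. "
    "[sourceVMWE: gone (slightly) to his head]"
    "[targetVMWE: (leicht) zu Kopf gestiegen]"
)
sentence, source_mwes, target_mwes = parse_line(example)
print(sentence)
print(source_mwes, target_mwes)
```

Returning the source and target MWEs as parallel lists keeps the alignment between, for instance, "cutting capers" and "Luftsprünge machten" when a sentence carries several expressions.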

News!
https://github.com/poethan/AlphaMWE/releases/tag/V1.0
References
• Our work:

• AlphaMWE: Construction of Multilingual Parallel Corpora with MWE Annotations. In Joint Workshop on Multiword Expressions and Electronic Lexicons (MWE-LEX) @COLING-2020, pages 44–57. https://www.aclweb.org/anthology/2020.mwe-1.6/

• Detection of Verbal Multi-Word Expressions via Conditional Random Fields with Syntactic Dependency Features and Semantic Re-Ranking, Proceedings
of the 13th Workshop on Multiword Expressions (MWE 2017), Valencia, Spain, April 4, 2017, 114-120 https://www.aclweb.org/anthology/W17-1715/

• MultiMWE: building a multi-lingual multi-word expression (MWE) parallel corpora. In: 12th International Conference on Language Resources and
Evaluation (LREC), 11-16 May, 2020, Marseille, France. (Virtual). https://www.aclweb.org/anthology/2020.lrec-1.363/

• Chinese Character Decomposition for Neural MT with Multi-Word Expressions. 23rd Nordic Conference on Computational Linguistics. Data available
under the subfolder 'radical4mt'. https://www.aclweb.org/anthology/2021.nodalida-main.35/

• Translation Quality Assessment: A Brief Survey on Manual and Automatic Methods. @NoDaLiDa21. https://ep.liu.se/ecp/179/003/ecp2021179003.pdf

• Based on/refer to:

• Agata Savary, et al. 2017. The PARSEME shared task on automatic identification of verbal multiword expressions. In MWE2017.

• Abigail Walsh, et al. 2018. Constructing an annotated corpus of verbal MWEs for English. In (LAW-MWE-CxG2018), pages 193–200.

• Carlos Ramisch et al. 2018. Edition 1.1 of the PARSEME shared task on automatic identification of verbal multiword expressions. In LAW-MWE-CxG-2018.

• Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation. 2021 https://arxiv.org/abs/2104.14478
We endorse the PARSEME shared task events and the corpus!

References
• MWE:

• Timothy Baldwin and Su Nam Kim. 2010. Multiword expressions. In Handbook of Natural Language Processing, Second Edition, pages 267–292. Chapman and Hall.

• Mathieu Constant, et al. 2017. Survey: Multiword expression processing: A Survey. Computational Linguistics, 43(4):837–892.

• Ivan A. Sag, et al. 2002. Multiword expressions: A pain in the neck for nlp. In Alexander Gelbukh, editor, Computational Linguistics and Intelligent Text
Processing.

• MWE corpus:

• Akihiko Kato, Hiroyuki Shindo, and Yuji Matsumoto. 2018. Construction of Large-scale English Verbal Multiword Expression Annotated Corpus. In
LREC.

• Nathan Schneider, et al. 2014. Comprehensive annotation of multiword expressions in a social web corpus. In Proceedings of the LREC.

• Veronika Vincze. 2012. Light verb constructions in the SzegedParalellFX English–Hungarian parallel corpus. In LREC.

• MT with MWE:

• Dhouha Bouamor, Nasredine Semmar, and Pierre Zweigenbaum. 2012. Identifying bilingual multi-word expressions for statistical machine
translation. In LREC.

• Patrik Lambert and Rafael E. Banchs. 2005. Data Inferred Multi-word Expressions for Statistical Machine Translation. In Proceedings of
Machine Translation Summit X, pages 396–403, Thailand.

• Xiaoqing Li, Jinghui Yan, Jiajun Zhang, and Chengqing Zong. 2019. Neural name translation improves neural machine translation. In
Machine Translation, pages 93–100, Singapore. Springer.

• Matīss Rikters and Ondřej Bojar. 2017. Paying Attention to Multi-Word Expressions in Neural Machine Translation. In Proceedings of the 16th Machine Translation Summit.

• Inguna Skadina. 2016. Multi-word expressions in english-latvian machine translation. Baltic J. Modern Computing, 4:811–825.
• Dankeschön!
• 谢谢!
• Thank you!
• Gracias!
• Grazie!
• Dziękuję Ci!
• Merci!
• Dank je!
• спасибі!
• धन्यवाद!
• Благодаря ти!
• Go raibh maith agat!
• Tak skal du have
• Takk skal du ha
• Tack
• Kiitos
• Þakka þér fyrir
• Qujan, Qujanaq, Qujanarsuaq
Quiz: which languages do you recognise? 😉

Further Reading A.I(MWEs)
• [1] Erwan Moreau, Ashjan Alsulaimani, Alfredo Maldonado, Lifeng Han, Carl Vogel and Koel
Dutta Chowdhury. Semantic Re-Ranking of CRF Label Sequences for Verbal Multi-Word
Expression Extraction. Book Chapter. Stella Markantonatou, Carlos Ramisch, Agata Savary,
and Veronika Vincze Volume Editors. Language Science Press (LangSci). pp.1-24. 2018
• [2] Alfredo Maldonado, Lifeng Han, Erwan Moreau, Ashjan Alsulaimani, Koel Dutta
Chowdhury, Carl Vogel and Qun Liu. Detection of Verbal Multi-Word Expressions via
Conditional Random Fields with Syntactic Dependency Features and Semantic Re-Ranking.
In MWE workshop with EACL 2017, Spain. (one of the three main co-authors)
• Previous related:
• [3] Lifeng Han, Xiaodong Zeng, Derek F. Wong, Lidia S. Chao. Chinese Named Entity Recognition with Graph-based Semi-supervised Learning Model. SIGHAN workshop in ACL-IJCNLP. 2015.
• [4] Lifeng Han, Derek F. Wong, Lidia S. Chao, Liangye He, et al. A Study of Chinese Word Segmentation Based on the Characteristics of Chinese. Language Processing and Knowledge in the Web - Proceedings of the International Conference of the German Society for Computational Linguistics and Language Technology.
• [5] Lifeng Han, Derek F. Wong, Lidia S. Chao. Chinese Named Entity Recognition with Conditional Random Fields in the Light of Chinese Characteristics. Proceeding of International Conference of Language Processing and Intelligent Information Systems. IIS 2013, LNCS Vol. 7912, pp. 57-68

Further Reading A.II(MT)
• [6] Lifeng Han, Shaohui Kuang. Incorporating Chinese Radicals Into Neural Machine Translation: Deeper Than Character Level. In ESSLLI-2018. August 6-17, Sofia, Bulgaria. http://doras.dcu.ie/24732/8/esslli_han_incorperating_.pdf
• [7]Lifeng Han, Derek F. Wong, Lidia S. Chao, et al. A Universal
Phrase Tagset for Multilingual Treebanks. CCL and NLP-NABD
2014, LNAI 8801, pp. 247 - 258.
• [8]Lifeng Han, Derek F. Wong, et al. Phrase Tagset Mapping for
French and English Treebanks and Its Application in Machine
Translation Evaluation. Language Processing and Knowledge in
the Web - Proceedings of the International Conference of the
German Society for Computational Linguistics and Language
Technology, (GSCL 2013), Darmstadt, Germany, on September
25-27, 2013. LNCS Vol. 8105

Further Reading A.III(MTE)
• [9]Lifeng Han. Machine Translation Evaluation Resources and Methods: A Survey.
Presented in IPRC-2018 (Ireland Postgraduate Research Conference, 8-9 November,
Dublin) pp.1-18. arXiv CS.CL(1605.04515)
• [10]Lifeng Han, Derek F. Wong, et al. Unsupervised Quality Estimation Model for
English to German Translation and Its Application in Extensive Supervised
Evaluation. The Scientific World Journal. Issue: Recent Advances in Information
Technology. ISSN:1537-744X
• [11]Lifeng Han, Derek F. Wong, et al. Language-independent Model for Machine
Translation Evaluation with Reinforced Factors. MT SUMMIT 2013. pp. 215-222.
• [12]Lifeng Han, Derek F. Wong, et al. A Description of Tunable Machine Translation
Evaluation Systems in WMT13 Metrics Task. In ACL-WMT 2013.
• [13]Lifeng Han, Derek F. Wong, et al. Quality Estimation for Machine Translation
Using the Joint Method of Evaluation Criteria and Statistical Modeling. ACL-WMT
2013.
• [14]Lifeng Han, Derek F. Wong, Lidia S. Chao. LEPOR: A Robust Evaluation Metric for
Machine Translation with Augmented Factors. Proceedings of the 24th International
Conference on Computational Linguistics (COLING 2012): Posters, pages 441-450.
