Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Security Design and Validation Research Group (SerVal)
Disciplines :
Computer science
Author, co-author :
Sun, Zeyu
Zhang, Jie
Harman, Mark
Papadakis, Mike ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > Computer Science and Communications Research Unit (CSC)
Zhang, Lu
External co-authors :
yes
Language :
English
Title :
Automatic Testing and Improvement of Machine Translation
Publication date :
2020
Event name :
42nd International Conference on Software Engineering
Event date :
from 23-5-2020 to 29-5-2020
Main work title :
International Conference on Software Engineering (ICSE)
Peer reviewed :
Peer reviewed
FnR Project :
FNR11686509 > Michail Papadakis > CODEMATES > COntinuous DEvelopment with Mutation Analysis and TESting > 01/09/2018 > 31/08/2021 > 2017