Longyue Wang
Research Fellow @ NLP Centre, Tencent AI Lab
Longyue is a senior research fellow at Tencent AI Lab. He received a B.Sc. degree in network engineering in 2011 and an M.Sc. degree in software engineering in 2014. From 2015 to 2018, he pursued a Ph.D. at Dublin City University, supervised by Prof. Qun Liu and Prof. Andy Way. In 2018, he was awarded a Ph.D. degree in computer applications and received the Best Thesis Award from the European Association for Machine Translation (EAMT).
Longyue has studied and worked across a broad range of areas in Natural Language Processing and Computational Linguistics, such as word segmentation, named entity, discourse parsing and grammatical error correction. His research interests are Machine Translation, Large Language Models, Discourse Modelling and Deep Learning. He has published 40 papers in leading NLP journals and conferences such as AIJ, ICLR and ACL. He has applied for 50 U.S., HK, Japan and China patents.
Contact | Information |
---|---|
Email: | vincentwang0229 [AT] gmail [DOT] com |
Tel: | +86 755 86013388 - 57508 |
Address: | Block B, Viseen, Shenzhen Bay Sci & Tech Ecological Park, Yuehai Subdistrict, Nanshan District, Shenzhen, China |
News
Academic Qualifications
2015 - 2018 | Ph.D. in Computer Applications, Dublin City University (DCU). Supervisors: Prof. Qun Liu and Prof. Andy Way. Thesis: Discourse-Aware Neural Machine Translation |
2011 - 2014 | M.Sc. in Software Engineering, University of Macau (UM). Supervisors: Prof. Derek F. Wong and Prof. Lidia S. Chao. Thesis: Domain Adaptation for Statistical Machine Translation |
2007 - 2011 | B.Sc. in Network Engineering, Shandong University of Science and Technology (SUST). Supervisor: Prof. Wenxue Wei |
Academic Experience
Work
- 2018/08 - Present Scientific Researcher, Tencent AI Lab, China
- 2017/05-2017/11 Postgraduate Intern, Tencent AI Lab, China
- 2014/12-2015/09 Research Assistant, Dublin City University, Ireland
- 2011/10-2014/10 Research Assistant, NLP2CT Laboratory, Macau
- 2013/09-2013/12 Postgraduate Intern, Iconic Translation Machines Ltd., Ireland
- 2013/07-2013/08 Visiting Scholar, Instituto Superior Técnico, Portugal
Project
- 2021/05-2022/05 Rhino-Bird Focused Research Program (Tencent AI Lab, Co-PI)
- 2017/12-2018/12 Discourse Machine Translation (Huawei Noah's Ark Lab - DCU Joint Project, Co-PI)
- 2014/12-2016/12 Dialogue Machine Translation (Huawei Noah's Ark Lab - DCU Joint Project, Co-PI)
- 2013/09-2013/12 Improving Domain-Specific Machine Translation (Enterprise Ireland Commercialisation Fund)
- 2013/05-2016/04 Construction of Portuguese-Chinese Parallel Treebank Based on Unsupervised Learning Approach (University of Macau Research Committee, MYRG076(Y1-L2)-FST13-WF) [Full List]
Teaching
- 2018/02-2018/06 Teaching Assistant at DCU Statistical Machine Translation (CA4012)
- 2017/01-2017/06 Teaching Assistant at DCU Introduction to Programming (CA146)
- 2017/01-2017/06 Teaching Assistant at DCU Statistical Machine Translation (CA4012)
- 2016/02-2016/06 Teaching Assistant at DCU Introduction to Programming (CA146)
- 2014/09-2014/10 Teaching Assistant at UM Natural Language Processing (SFTW462) [Full List]
Selected Publications
Please go to [Full List] or [Google Scholar] to see all my publications (* is co-first and † is corresponding author).
A Survey on Zero Pronoun Translation
Longyue Wang, Siyou Liu*, Mingzhou Xu, Linfeng Song, Shuming Shi, and Zhaopeng Tu
ACL 2023 [pdf]
New Trends in Machine Translation using Large Language Models: Case Examples with ChatGPT
Chenyang Lyu, Jitao Xu, Longyue Wang
arXiv 2023 [pdf] [blog]
Document-Level Machine Translation with Large Language Models
Longyue Wang*, Chenyang Lyu*, Tianbo Ji*, Zhirui Zhang*, Dian Yu, Shuming Shi, Zhaopeng Tu
arXiv 2023 [pdf] [data]
GuoFeng: A Benchmark for Zero Pronoun Recovery and Translation
Longyue Wang, Mingzhou Xu*, Derek F. Wong, Hongye Liu, Linfeng Song, Lidia S. Chao, Shuming Shi and Zhaopeng Tu
EMNLP 2022 [pdf] [code]
Redistributing Low-Frequency Words: Making the Most of Monolingual Data in Non-Autoregressive Translation
Liang Ding, Longyue Wang*, Shuming Shi, Dacheng Tao, Zhaopeng Tu
ACL 2022 [pdf]
Rejuvenating Low-Frequency Words: Making the Most of Parallel Data in Non-Autoregressive Translation
Liang Ding, Longyue Wang*, Xuebo Liu, Derek F. Wong, Dacheng Tao, Zhaopeng Tu
ACL 2021 [pdf]
Understanding and Improving Lexical Choice in Non-Autoregressive Translation
Liang Ding, Longyue Wang, Xuebo Liu, Derek F. Wong, Dacheng Tao, Zhaopeng Tu
ICLR 2021 [pdf]
Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning
Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Zhaopeng Tu
ICLR 2021 [pdf]
How Does Selective Mechanism Improve Self-Attention Networks?
Xinwei Geng, Longyue Wang, Xing Wang, Bing Qin, Ting Liu, and Zhaopeng Tu
ACL 2020 [pdf]
One Model to Learn Both: Zero Pronoun Prediction and Translation
Longyue Wang, Zhaopeng Tu, Xing Wang, and Shuming Shi
EMNLP 2019 [pdf]
Assessing the Ability of Self-Attention Networks to Learn Word Order
Baosong Yang, Longyue Wang, Derek Wong, Lidia S. Chao, and Zhaopeng Tu
ACL 2019 [pdf]
Convolutional Self-Attention Networks
Baosong Yang, Longyue Wang, Derek Wong, Lidia S. Chao, and Zhaopeng Tu
NAACL 2019 [pdf]
Learning to Jointly Translate and Predict Dropped Pronouns with a Shared Reconstruction Mechanism
Longyue Wang, Zhaopeng Tu, Andy Way, and Qun Liu
EMNLP 2018 [pdf]
Translating Pro-Drop Languages with Reconstruction Models
Longyue Wang, Zhaopeng Tu, Shuming Shi, Tong Zhang, Yvette Graham, Qun Liu
AAAI 2018 [pdf] [bibtex] [poster]
Exploiting Cross-Sentence Context for Neural Machine Translation
Longyue Wang, Zhaopeng Tu, Andy Way, Qun Liu
EMNLP 2017 [pdf] [bibtex] [slides]
A Novel Approach for Dropped Pronoun Translation
Longyue Wang, Zhaopeng Tu, Xiaojun Zhang, Hang Li, Andy Way and Qun Liu
NAACL-HLT 2016 [pdf] [bibtex] [slides]
Awards & Campaign
Honors and Awards
- 2023/02 IEEE Senior Member
- 2022/12 2022 Industrial Development and Innovation Talent Award
- 2022/07 NAACL 2022 Outstanding Action Editor
- 2021/12 2021 Industrial Development and Innovation Talent Award
- 2021/06 Tencent Outstanding Contributor × 2
- 2019/06 2018 EAMT Best Thesis Award (1 person/year)
- 2018/10 Dublin City University Research Day Winner
- 2017/07 Chinese Government Award for Outstanding Students Abroad (nominated)
- 2016/04 Dublin City University Award for Engagement with Business/Industry
- 2016/04 NAACL 2016 Student Travel Awards
- 2015/06 Dublin City University Studentship
- 2014/10 Excellent Graduate Student of University of Macau
- 2014/05 ACL 2014 Student Travel Awards
- 2013/09 Enterprise Ireland Commercialisation Fund Award
- 2011/09 University of Macau Graduate Assistantship
- 2011/09 University of Macau Student Fellowship
- 2011/06 Excellent Graduate Student of Shandong University of Science and Technology
- 2011/05 Excellent Graduate Student of Shandong Province
- 2007/12-2010/05 Outstanding Undergraduate Scholarship (First Prize × 3, Second Prize × 2)
Academic Competitions
Year | Name | Result |
---|---|---|
2021 | WMT: News Translation Task (Zh-En, De-En) | 1st Rank |
2020 | WMT: Chat Translation Task (De-En) | 2nd Rank |
2020 | WMT: News Translation Task (Zh-En, En-Zh) | 2nd Rank |
2020 | WMT: Biomedical Translation Task (En-De) | 1st Rank |
2020 | WMT: Biomedical Translation Task (De-En) | 2nd Rank |
2017 | AI Challenger: English-Chinese Machine Translation (Bi-weekly Interim) | 2nd Rank |
2015 | CoNLL: Shallow Discourse Parsing | 11th Rank |
2014 | CoNLL: Grammatical Error Correction | 5th Rank |
2014 | WMT: Medical Translation Task (En-De, Cz-En, Fr-En) | 1st Rank |
2014 | WMT: Medical Translation Task (De-En, En-Cz, En-Fr) | 2nd Rank |
2013 | CoNLL: Grammatical Error Correction | 3rd Rank |
2012 | CLP Bake-off: Micro-blog Word Segmentation | 4th Rank |
2012 | CLP Bake-off: Chinese Name Disambiguation | 6th Rank |
2011 | National Post-Graduate Mathematical Contest in Modeling | 2nd Prize |
2010 | US Mathematical Contest in Modeling | SP |
Resources
Corpora

Webnovel: Discourse-Level Literary Translation Corpus
A Chinese-English document-level corpus in the literary domain

mZPRT: Zero Pronoun Recovery and Translation Dataset
A benchmark containing human-annotated zero pronouns in texts from five domains

TVsub: DCU-Tencent Chinese-English Dialogue Corpus
More than two million sentence pairs were extracted from the subtitles of television episodes for machine translation.

MVsub: DCU-Huawei Chinese-English Dialogue Corpus
About one million sentence pairs were extracted from the subtitles of movies for machine translation.

UM Corpus: Multi-domain Chinese-English Data
Two million English-Chinese sentence pairs, categorized into eight different genres for machine translation.
Toolkits & Systems

Tencent Translate: Literary Translation System
The system is designed for literary translation. Choose the "Chinese-to-English" language pair and select the "Literary" domain.

Macaw-LLM
Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration.

Tencent Translate: Chinese-English Translation
The system is designed for general-domain translation. Choose the "Chinese-English" language pair and select the "General" domain.

TODAY: Hotel Booking Translation System
The system is designed to be a real-time, semantics-enhanced, task-oriented machine translation system.

iSeg: Chinese Word Segmentator for Micro-Blog Text
The Chinese word segmentor is designed for informal and user-generated text.

iSenWeb: Translation Web Interface
The interface can be used to design web-based machine translation systems based on the Moses toolkit.
Others
Professional Services
Journal
- 2021 Transactions of the Association for Computational Linguistics Action Editor
- 2021 Computational Linguistics Standing Reviewer
- 2017 Machine Translation Standing Reviewer
Conference
- 2023 AACL Senior Area Chair; EAMT Best Thesis Award PC Member; ACL Reviewer
- 2022 ACL Rolling Review Action Editor
- 2021 ACL Area Chair
- 2021 IJCAI/AAAI Senior PC Member
- 2020 ACL Area Chair
Organization
- Chinese Information Processing Society (YSSNLP) Member
- China Computer Federation (CCF) Member
- Institute of Electrical and Electronics Engineers (IEEE) Senior Member
Natural/Programming Languages
- English: IELTS 6.5 (writing 7.0)
- Mandarin Chinese: native speaker
- Cantonese Chinese: studying
- Python & Shell: 5 years
- Java & C++: 1 year
Events
- 2023/05 WMT2023 Shared Task: Discourse-Level Literary Translation (Organizer)
- 2022/08 The 18th China Conference on Machine Translation. (Invited Tutorial and Panel)
- 2022/07 DataFun Summit 2022. (Invited Talk)
- 2022/07 2022 Tencent Academic and Industrial Conference. (Invited Talk)
- 2021/10 Natural Language Processing Youth Elite Forum (Invited Talk)
- 2021/10 School of Informatics, Xiamen University (Invited Talk)
- 2021/10 NLP2CT Lab, FST, University of Macau (Invited Talk)
- 2018/01 The 17th Machine Translation Summit. (Award Talk)
- 2018/01 The 1st International Workshop on Discourse Processing. (Invited Talk)
- 2018/01 Sogou, Inc. (Invited Talk)
- 2017/11 AI: Accelerating Impact (Demo & Poster Presentation)
- 2017/11 Huawei's Video Intelligence Forum (Demo Presentation)
- 2017/10 New Tranx Information Technology Co.,Ltd. (Invited Talk)
- 2016/10 The 10th Annual Irish Human Computer Interaction Conference (Poster Presentation)
- 2016/10 The 1st Deep Learning for Machine Translation (Attendee)
- 2016/10 ADAPT Centre for Digital Content Technology (Poster Presentation)
- 2014/11 The 10th China Workshop on Machine Translation (Local Organization Committee)
- 2014/07 The 5th Lisbon Machine Learning School (Monitor)
- 2014/01 Online course of Stanford University: Machine Learning (Accomplished)
- 2013/07 The 3rd Lisbon Machine Learning School (Attendee)
- 2012/12 The 15th Oriental COCOSDA Workshop (Local Organization Committee)
Social Experiences
- 2010/08-2011/09 Deputy, 25th National Congress of the All-China Students’ Federation, Beijing
- 2010/07-2010/09 Intern, Labor and Social Security Bureau, Tsingtao, China
- 2009/05-2011/05 University Leader, Google Caring for China, Tsingtao, China
- 2009/04-2010/05 President of the Student Union of Shandong University of Science and Technology
- 2008/07-2008/09 Volunteer, Beijing 2008 Olympic Games
- 2007/08-2007/10 Volunteer, Research Center for Contemporary China
Interns
For inquiries about internships, please send your resume to vincentwang0229 AT gmail.com.
- 2022/04 ~ Present Chenyang Lyu, Ph.D. Student Dublin City University
- 2022/03 ~ Present Zefeng Du, M.Sc. Student University of Macau
- 2022/07 ~ Present Bingshuai Liu, M.Sc. Student Xiamen University
- 2021/05 ~ 2023/04 Zhihao Wang, Ph.D. Student Xiamen University
- 2021/10 ~ 2022/07 Donghuai Liu, M.Sc. Student Xiamen University
- 2021/08 ~ 2022/07 Kangjie Zheng, Ph.D. Student Peking University
- 2021/02 ~ 2021/12 Mingzhou Xu, Ph.D. Student University of Macau
- 2020/09 ~ 2021/09 Hongye Liu, B.A. Student Beijing Institute of Technology (M.Sc., Imperial College London, now Ph.D. Student, King Abdullah University of Science and Technology)
- 2020/02 ~ 2020/06 Li Ding, M.Sc. Student Hong Kong Polytechnic University (now Research Fellow, OPPO Academy)
- 2019/12 ~ 2021/07 Liang Ding, Ph.D. Student The University of Sydney (now TET Research Fellow, JD Explore Academy)
- 2019/12 ~ 2021/02 Xuebo Liu, Ph.D. Student University of Macau (now Assistant Professor, Harbin Institute of Technology Shenzhen)
- 2019/03 ~ 2020/02 Yilin Yang, Ph.D. Student Oregon State University
- 2019/03 ~ 2020/02 Xinwei Geng, Ph.D. Student Harbin Institute of Technology
- 2019/02 ~ 2020/02 Yong Wang, Ph.D. Student University of Hong Kong (now Senior Research Fellow, Lightspeed & Quantum Studios, Tencent)
- 2018/12 ~ 2019/12 Bo He (co-supervised), M.Sc. Student Nanjing University of Aeronautics and Astronautics (now Engineer, Pinduoduo)
- 2018/08 ~ 2019/02 Baosong Yang (co-supervised), Ph.D. Student University of Macau (now Algorithm Expert, Alibaba DAMO Academy)