Natural Language Processing

We aim to conduct cutting-edge research in natural language processing (NLP) and language technology and to become a regional hub for the field in Asia. Given our location, we are naturally drawn to language problems and challenges in the region that might otherwise be overlooked by the research community. Our goal is not only to create new language-agnostic tools and knowledge for low-resource languages in the region, but also to build the very best NLP technology for Vietnamese. To that end, we are advancing the state of the art in low-resource language problems, language modeling and translation, conversational AI, information extraction, and related areas.

New technology requires new fundamental research. To this end, our team collaborates with the Machine Learning team on the foundations of machine learning for NLP, such as self-supervised learning, adversarial learning, multi-task learning, graph neural networks and knowledge graphs, and with the Computer Vision team on multimodal research in vision and language.

The NLP team has helped boost the global visibility of VinAI by establishing a strong network of collaborators among prominent researchers around the world, for example at the University of Oregon in the USA, Nanyang Technological University in Singapore, and the University of Melbourne and Monash University in Australia. We have achieved substantial research output, with 34 papers published at top-tier NLP/AI conferences, including EMNLP (11 papers), ACL (3), NAACL (3), InterSpeech (8), AAAI (5), ICLR (2), NeurIPS (1) and IJCAI (1), on a wide range of topics, including but not limited to the following:

NLP Findings of ACL
Retrieving Relevant Context to Align Representations for Cross-lingual Event Detection

We study the problem of cross-lingual transfer learning for event detection (ED) where…

NLP InterSpeech Top Tier
XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech

We present XPhoneBERT, the first multilingual model pre-trained to learn phoneme representations for…

NLP CIKM
A Capsule Network-based Model for Learning Node Embeddings

In this paper, we focus on learning low-dimensional embeddings for nodes in graph-structured…

Related publications

NLP Findings of ACL
May 22, 2023

Nguyen Van Chien, Linh Van Ngo, Nguyen Huu Thien

NLP InterSpeech Top Tier
May 22, 2023

Linh The Nguyen, Thinh Pham, Dat Quoc Nguyen

NLP EMNLP Findings
October 17, 2022

Vinh Tong, Dat Quoc Nguyen, Trung Thanh Huynh, Tam Thanh Nguyen, Quoc Viet Hung Nguyen and Mathias Niepert

NLP EMNLP Findings
October 17, 2022

Viet Dac Lai*, Hieu Man*, Linh Ngo, Franck Dernoncourt and Thien Huu Nguyen

NLP EMNLP Top Tier
October 17, 2022

Minh Van Nguyen, Bonan Min, Franck Dernoncourt and Thien Huu Nguyen

VinAI Translate

Do not miss these Seminars & Workshops

Jey Han Lau

University of Melbourne

Rumour and Disinformation Detection in Online Conversations
Thu, Sep 14 2023 - 10:00 am (GMT + 7)

Tim Baldwin

Mohamed bin Zayed University of Artificial Intelligence

Fairness in Natural Language Processing
Tue, Dec 20 2022 - 02:00 pm (GMT + 7)

Anh Tuan Luu

VinAI Research

Towards Robustness Against Natural Language Adversarial Attacks
Fri, Aug 14 2020 - 03:00 pm (GMT + 7)

Released Source Code

No. | Code | Paper | Conference | Year
01. | 3D-UCaps (58, 11) | 3D-UCaps: 3D Capsules Unet for Volumetric Image Segmentation | MICCAI | 2021
02. | BARTpho (88, 7) | BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese | InterSpeech | 2021
03. | Blur-kernel-space-exploring (125, 33) | Exploring Image Deblurring via Blur Kernel Space | CVPR | 2021

Technical Blog

October 27, 2022

Nguyen Luong Tran, Duong Minh Le and Dat Quoc Nguyen

October 22, 2022

Linh The Nguyen*, Nguyen Luong Tran*, Long Doan*, Manh Luong and Dat Quoc Nguyen