EnTDA: Entity-to-Text based Data Augmentation Approach for Named Entity Recognition Tasks

Xuming Hu, Yong Jiang, Aiwei Liu, Zhongqiang Huang, Pengjun Xie, Fei Huang, Lijie Wen, Philip S. Yu

To alleviate the excessive reliance on the dependency order among entities in existing augmentation paradigms, we develop an entity-to-text (rather than text-to-entity) data augmentation method named EnTDA, which decouples the dependencies between entities by adding, deleting, replacing, and swapping entities, and we adopt the augmented data to bootstrap the generalization ability of the NER model.

Data Augmentation named-entity-recognition +1

Discrete Cross-Modal Alignment Enables Zero-Shot Speech Translation

Chen Wang, Yuchen Liu, Boxing Chen, Jiajun Zhang, Wei Luo, Zhongqiang Huang, Chengqing Zong

Existing zero-shot methods fail to align the two modalities of speech and text into a shared semantic space, resulting in much worse performance compared to the supervised ST methods.

Automatic Speech Recognition Machine Translation +3

ITA: Image-Text Alignments for Multi-Modal Named Entity Recognition

NAACL 2022 Xinyu Wang, Min Gui, Yong Jiang, Zixia Jia, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu

As text representations take the most important role in MNER, in this paper, we propose Image-text Alignments (ITA) to align image features into the textual space, so that the attention mechanism in transformer-based pretrained textual embeddings can be better utilized.

Multi-modal Named Entity Recognition named-entity-recognition

MuVER: Improving First-Stage Entity Retrieval with Multi-View Entity Representations

EMNLP 2021 Xinyin Ma, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Weiming Lu

Entity retrieval, which aims at disambiguating mentions to canonical entities from massive KBs, is essential for many tasks in natural language processing.

Entity Linking Entity Retrieval +1

Risk Minimization for Zero-shot Sequence Labeling

ACL 2021 Zechuan Hu, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu

In this paper, we propose a novel unified framework for zero-shot sequence labeling with minimum risk training and design a new decomposable risk function that models the relations between the predicted labels from the source models and the true labels.

Multi-View Cross-Lingual Structured Prediction with Minimum Supervision

ACL 2021 Zechuan Hu, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu

In structured prediction problems, cross-lingual transfer learning is an efficient way to train quality models for low-resource languages, and further improvement can be obtained by learning from multiple source languages.

Cross-Lingual Transfer Structured Prediction +1

Bridging the Domain Gap: Improve Informal Language Translation via Counterfactual Domain Adaptation

AAAI 2021 Ke Wang, Guandan Chen, Zhongqiang Huang, Xiaojun Wan, Fei Huang

Despite the near-human performances already achieved on formal texts such as news articles, neural machine translation still has difficulty in dealing with "user-generated" texts that have diverse linguistic phenomena but lack large-scale high-quality parallel corpora.

Domain Adaptation TAG +1

Improving Named Entity Recognition by External Context Retrieving and Cooperative Learning

ACL 2021 Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu

We find empirically that the contextual representations computed on the retrieval-based input view, constructed through the concatenation of a sentence and its external contexts, can achieve significantly improved performance compared to the original input view based only on the sentence.

Chinese Named Entity Recognition Chunking +1

Automatic Speech Recognition and Topic Identification for Almost-Zero-Resource Languages

Matthew Wiesner, Chunxi Liu, Lucas Ondel, Craig Harman, Vimal Manohar, Jan Trmal, Zhongqiang Huang, Najim Dehak, Sanjeev Khudanpur

Automatic speech recognition (ASR) systems often need to be developed for extremely low-resource languages to serve end-uses such as audio content categorization and search.

Automatic Speech Recognition Humanitarian +1

Transfer Learning based Dynamic Multiobjective Optimization Algorithms

Min Jiang, Zhongqiang Huang, Liming Qiu, Wenzhen Huang, Gary G. Yen

This approach takes the transfer learning method as a tool to help reuse past experience for speeding up the evolutionary process, and at the same time, any population-based multiobjective algorithm can benefit from this integration without extensive modifications.

BIG-bench Machine Learning Multiobjective Optimization +1

Statistical Machine Translation Features with Multitask Tensor Networks

IJCNLP 2015 Hendra Setiawan, Zhongqiang Huang, Jacob Devlin, Thomas Lamar, Rabih Zbib, Richard Schwartz, John Makhoul

We present a three-pronged approach to improving Statistical Machine Translation (SMT), building on recent success in the application of neural networks to SMT.

Machine Translation Tensor Networks +1

