Search Results for author: Shen Huang

Found 20 papers, 8 papers with code

CSP: Code-Switching Pre-training for Neural Machine Translation

no code implementations • EMNLP 2020 • Zhen Yang, Bojie Hu, Ambyera Han, Shen Huang, Qi Ju

Unlike traditional pre-training methods, which randomly mask some fragments of the input sentence, the proposed CSP randomly replaces some words in the source sentence with their translation words in the target language.

Decoder Machine Translation +3

Exploring Key Point Analysis with Pairwise Generation and Graph Partitioning

1 code implementation • 17 Apr 2024 • Xiao Li, Yong Jiang, Shen Huang, Pengjun Xie, Gong Cheng, Fei Huang

Our objective is to train a generative model that can simultaneously provide a score indicating the presence of a shared key point between a pair of arguments and generate that shared key point.

Argument Mining graph partitioning +2

EcomGPT-CT: Continual Pre-training of E-commerce Large Language Models with Semi-structured Data

no code implementations • 25 Dec 2023 • Shirong Ma, Shen Huang, Shulin Huang, Xiaobin Wang, Yangning Li, Hai-Tao Zheng, Pengjun Xie, Fei Huang, Yong Jiang

Experimental results demonstrate the effectiveness of continual pre-training of E-commerce LLMs and the efficacy of our devised data mixing strategy.

In-Context Learning

SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding

1 code implementation • 21 Aug 2023 • Tianyu Yu, Chengyue Jiang, Chao Lou, Shen Huang, Xiaobin Wang, Wei Liu, Jiong Cai, Yangning Li, Yinghui Li, Kewei Tu, Hai-Tao Zheng, Ningyu Zhang, Pengjun Xie, Fei Huang, Yong Jiang

However, LLMs are sometimes too footloose for natural language understanding (NLU) tasks, which always have restricted output and input formats.

Entity Typing Event Extraction +3

End-to-End Beam Retrieval for Multi-Hop Question Answering

2 code implementations • 17 Aug 2023 • Jiahao Zhang, Haiyang Zhang, Dongmei Zhang, Yong Liu, Shen Huang

This approach models the multi-hop retrieval process in an end-to-end manner by jointly optimizing an encoder and two classification heads across all hops.

Ranked #1 on Question Answering on HotpotQA

Language Modelling Large Language Model +3

EcomGPT: Instruction-tuning Large Language Models with Chain-of-Task Tasks for E-commerce

1 code implementation • 14 Aug 2023 • Yangning Li, Shirong Ma, Xiaobin Wang, Shen Huang, Chengyue Jiang, Hai-Tao Zheng, Pengjun Xie, Fei Huang, Yong Jiang

EcomInstruct scales up the data size and task diversity by constructing atomic tasks with E-commerce basic data types, such as product information and user reviews.

Instruction Following Language Modelling +2

DAMO-NLP at SemEval-2023 Task 2: A Unified Retrieval-augmented System for Multilingual Named Entity Recognition

1 code implementation • 5 May 2023 • Zeqi Tan, Shen Huang, Zixia Jia, Jiong Cai, Yinghui Li, Weiming Lu, Yueting Zhuang, Kewei Tu, Pengjun Xie, Fei Huang, Yong Jiang

Also, we discover that the limited context length causes the retrieval knowledge to be invisible to the model.

Multilingual Named Entity Recognition named-entity-recognition +4

ChatIE: Zero-Shot Information Extraction via Chatting with ChatGPT

1 code implementation • 20 Feb 2023 • Xiang Wei, Xingyu Cui, Ning Cheng, Xiaobin Wang, Xin Zhang, Shen Huang, Pengjun Xie, Jinan Xu, Yufeng Chen, Meishan Zhang, Yong Jiang, Wenjuan Han

Zero-shot information extraction (IE) aims to build IE systems from unannotated text.

Event Extraction named-entity-recognition +3

DAMO-NLP at NLPCC-2022 Task 2: Knowledge Enhanced Robust NER for Speech Entity Linking

1 code implementation • 27 Sep 2022 • Shen Huang, Yuchen Zhai, Xinwei Long, Yong Jiang, Xiaobin Wang, Yin Zhang, Pengjun Xie

Speech Entity Linking aims to recognize and disambiguate named entities in spoken languages.

Entity Linking named-entity-recognition +5

PM-MMUT: Boosted Phone-Mask Data Augmentation using Multi-Modeling Unit Training for Phonetic-Reduction-Robust E2E Speech Recognition

no code implementations • 13 Dec 2021 • Guodong Ma, Pengfei Hu, Nurmemet Yolwas, Shen Huang, Hao Huang

To boost the performance of PMT, we propose fusing a multi-modeling unit training (MMUT) architecture with PMT (PM-MMUT).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Adversarial Sample Detection for Speaker Verification by Neural Vocoders

1 code implementation • 1 Jul 2021 • Haibin Wu, Po-chun Hsu, Ji Gao, Shanshan Zhang, Shen Huang, Jian Kang, Zhiyong Wu, Helen Meng, Hung-Yi Lee

We also show that the neural vocoder adopted in the detection framework is dataset-independent.

Speaker Verification

Stacked Acoustic-and-Textual Encoding: Integrating the Pre-trained Models into Speech Translation Encoders

no code implementations • ACL 2021 • Chen Xu, Bojie Hu, Yanyang Li, Yuhao Zhang, Shen Huang, Qi Ju, Tong Xiao, Jingbo Zhu

To our knowledge, we are the first to develop an end-to-end ST system that achieves comparable or even better BLEU performance than the cascaded ST counterpart when large-scale ASR and MT data is available.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Dynamic Curriculum Learning for Low-Resource Neural Machine Translation

no code implementations • COLING 2020 • Chen Xu, Bojie Hu, Yufan Jiang, Kai Feng, Zeyang Wang, Shen Huang, Qi Ju, Tong Xiao, Jingbo Zhu

This eases training by highlighting easy samples that the current model has enough competence to learn.

Low-Resource Neural Machine Translation NMT +1

Code-switching pre-training for neural machine translation

no code implementations • 17 Sep 2020 • Zhen Yang, Bojie Hu, Ambyera Han, Shen Huang, Qi Ju

Decoder Machine Translation +3

Cognitive Representation Learning of Self-Media Online Article Quality

no code implementations • 13 Aug 2020 • Yiru Wang, Shen Huang, Gongfu Li, Qiang Deng, Dongliang Liao, Pengda Si, Yujiu Yang, Jin Xu

The automatic quality assessment of self-media online articles is a new and urgent problem, and one of great value to online recommendation and search.

Representation Learning

A Multi-oriented Chinese Keyword Spotter Guided by Text Line Detection

no code implementations • 3 Jan 2020 • Pei Xu, Shan Huang, Hongzhen Wang, Hao Song, Shen Huang, Qi Ju

Chinese keyword spotting is a challenging task as there is no visual blank for Chinese words.

Keyword Spotting Line Detection

Utterance-level end-to-end language identification using attention-based CNN-BLSTM

no code implementations • 20 Feb 2019 • Weicheng Cai, Danwei Cai, Shen Huang, Ming Li

In this paper, we present an end-to-end language identification framework, the attention-based Convolutional Neural Network-Bidirectional Long-short Term Memory (CNN-BLSTM).

Language Identification

TencentFmRD Neural Machine Translation for WMT18

no code implementations • WS 2018 • Bojie Hu, Ambyer Han, Shen Huang

Our systems are neural machine translation systems trained with our original system TenTrans.

Machine Translation NMT +1

Addressing Domain Adaptation for Chinese Word Segmentation with Global Recurrent Structure

no code implementations • IJCNLP 2017 • Shen Huang, Xu Sun, Houfeng Wang

Boundary features are widely used in traditional Chinese Word Segmentation (CWS) methods as they can utilize unlabeled data to help improve the Out-of-Vocabulary (OOV) word recognition performance.

Chinese Word Segmentation Domain Adaptation +2

Bi-LSTM Neural Networks for Chinese Grammatical Error Diagnosis

no code implementations • WS 2016 • Shen Huang, Houfeng Wang

Grammatical Error Diagnosis for Chinese has always been a challenge for both foreign learners and NLP researchers, owing to the variety of its grammar and the flexibility of its expression.

Grammatical Error Detection Sentence +1
