Search Results for author: Xiang Kong

Found 24 papers, 13 papers with code

Large Language Model-guided Document Selection

no code implementations • 7 Jun 2024 • Xiang Kong, Tom Gunter, Ruoming Pang

Filtering allows us to quality-match a model trained on the full corpus across diverse benchmarks with at most 70% of the FLOPs.

In-Context Learning · Language Modelling +1

Revisiting MoE and Dense Speed-Accuracy Comparisons for LLM Training

1 code implementation • 23 May 2024 • Xianzhi Du, Tom Gunter, Xiang Kong, Mark Lee, Zirui Wang, Aonan Zhang, Nan Du, Ruoming Pang

In this work, we revisit the settings by adopting step time as a more accurate measure of model complexity, and by determining the total compute budget under the Chinchilla compute-optimal settings.
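The compute-budget bookkeeping mentioned above can be sketched with the usual Chinchilla rules of thumb (roughly 20 training tokens per parameter, and about 6 FLOPs per parameter per token); the model size below is illustrative, not a setting from the paper:

```python
# Chinchilla rules of thumb (approximations, not values from this work):
# compute-optimal tokens ~ 20 x params, training FLOPs ~ 6 x params x tokens.
params = 1e9                 # illustrative 1B-parameter model
tokens = 20 * params         # compute-optimal token budget
flops = 6 * params * tokens  # total training compute budget
print(f"{flops:.1e}")  # 1.2e+20
```

Holding this total budget fixed is what lets step time, rather than parameter count alone, serve as the complexity measure when comparing MoE and dense models.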


Direct Large Language Model Alignment Through Self-Rewarding Contrastive Prompt Distillation

no code implementations • 19 Feb 2024 • Aiwei Liu, Haoping Bai, Zhiyun Lu, Xiang Kong, Simon Wang, Jiulong Shan, Meng Cao, Lijie Wen

In this paper, we propose a method to evaluate response preference using the output probabilities of response pairs under contrastive prompt pairs, which could achieve better performance on LLaMA2-7B and LLaMA2-13B compared to RLAIF.

Language Modelling · Large Language Model
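The preference test described in the snippet can be sketched as simple arithmetic on response log-probabilities under a contrastive prompt pair; the helper name and all numbers below are hypothetical, not the paper's actual prompts or models:

```python
def preference_score(logp_pos: float, logp_neg: float) -> float:
    """Contrast a response's log-probability under a 'desired behaviour'
    prompt against an 'undesired behaviour' prompt; higher means the
    response looks more like the desired behaviour."""
    return logp_pos - logp_neg

# Hypothetical per-response log-probs (summed over tokens) from some LM.
resp_a = preference_score(logp_pos=-12.3, logp_neg=-15.1)  # rises under good prompt
resp_b = preference_score(logp_pos=-11.0, logp_neg=-10.2)  # rises under bad prompt

preferred = "A" if resp_a > resp_b else "B"
print(preferred)  # A
```

The self-rewarding aspect is that the same model both produces the responses and scores them, so no external reward model is needed.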

Mega: Moving Average Equipped Gated Attention

6 code implementations • 21 Sep 2022 • Xuezhe Ma, Chunting Zhou, Xiang Kong, Junxian He, Liangke Gui, Graham Neubig, Jonathan May, Luke Zettlemoyer

The design choices in the Transformer attention mechanism, including weak inductive bias and quadratic computational complexity, have limited its application for modeling long sequences.

Image Classification · Inductive Bias +3
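The moving-average component that supplies Mega's stronger inductive bias can be illustrated with a scalar exponential moving average (the real model uses a learned, multi-dimensional damped EMA; this one-parameter version is only a sketch):

```python
import numpy as np

def ema(x: np.ndarray, alpha: float) -> np.ndarray:
    """Exponential moving average along the time axis:
    y_t = alpha * x_t + (1 - alpha) * y_{t-1}."""
    y = np.zeros_like(x, dtype=float)
    prev = 0.0
    for t in range(len(x)):
        prev = alpha * x[t] + (1.0 - alpha) * prev
        y[t] = prev
    return y

seq = np.array([1.0, 0.0, 0.0, 0.0])
print(ema(seq, alpha=0.5))  # [0.5, 0.25, 0.125, 0.0625]
```

Each output mixes the current input with an exponentially decaying summary of the past, biasing the subsequent gated attention toward local, position-aware context.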

Multilingual Neural Machine Translation with Deep Encoder and Multiple Shallow Decoders

no code implementations • EACL 2021 • Xiang Kong, Adithya Renduchintala, James Cross, Yuqing Tang, Jiatao Gu, Xian Li

Recent work in multilingual translation advances translation quality, surpassing bilingual baselines, using deep transformer models with increased capacity.

Decoder · Machine Translation +1

BLT: Bidirectional Layout Transformer for Controllable Layout Generation

1 code implementation • 9 Dec 2021 • Xiang Kong, Lu Jiang, Huiwen Chang, Han Zhang, Yuan Hao, Haifeng Gong, Irfan Essa

During inference, BLT first generates a draft layout from the input and then iteratively refines it into a high-quality layout by masking out low-confident attributes.
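The draft-then-refine loop can be sketched as masking out the least confident attributes each round and re-predicting them; `toy_predict`, the `(value, confidence)` tuples, and all constants below are hypothetical stand-ins for the actual layout model:

```python
def toy_predict(masked):
    # Stand-in for the layout model: fill masked slots, raise confidence.
    return {a: ((v if v is not None else 10), min(c + 0.4, 1.0))
            for a, (v, c) in masked.items()}

def refine(layout, predict, rounds=3, keep_ratio=0.5):
    """Each round, keep the most confident attributes, mask the rest,
    and re-predict them conditioned on what was kept."""
    for _ in range(rounds):
        ranked = sorted(layout, key=lambda a: layout[a][1], reverse=True)
        keep = set(ranked[: max(1, int(len(ranked) * keep_ratio))])
        masked = {a: ((v, c) if a in keep else (None, 0.0))
                  for a, (v, c) in layout.items()}
        layout = predict(masked)
    return layout

draft = {"x": (3, 0.9), "y": (7, 0.2)}  # attribute -> (value, confidence)
final = refine(draft, toy_predict)
print(final["x"][0], final["y"][0])  # 3 10
```

High-confidence attributes survive across rounds while low-confidence ones are repeatedly resampled, which is what lets the bidirectional model polish a rough draft into a coherent layout.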

Fully Non-autoregressive Neural Machine Translation: Tricks of the Trade

1 code implementation Findings (ACL) 2021 Jiatao Gu, Xiang Kong

Fully non-autoregressive neural machine translation (NAT) is proposed to predict all tokens simultaneously in a single forward pass of the network, which significantly reduces inference latency at the expense of a quality drop compared to the Transformer baseline.

Machine Translation · Translation

Incorporating a Local Translation Mechanism into Non-autoregressive Translation

1 code implementation • EMNLP 2020 • Xiang Kong, Zhisong Zhang, Eduard Hovy

In this work, we introduce a novel local autoregressive translation (LAT) mechanism into non-autoregressive translation (NAT) models so as to capture local dependencies among target outputs.

Machine Translation · Position +2

Deep Transformers with Latent Depth

1 code implementation • NeurIPS 2020 • Xian Li, Asa Cooper Stickland, Yuqing Tang, Xiang Kong

As an extension of this framework, we propose a novel method to train one shared Transformer network for multilingual machine translation with different layer selection posteriors for each language pair.

Language Modelling · Machine Translation +2
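The per-language-pair layer selection can be sketched as gating a shared stack by selection probabilities; the layers and posterior values below are toy stand-ins, not learned posteriors:

```python
# One shared stack of layers (toy functions standing in for Transformer layers).
layers = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]

posteriors = {            # hypothetical layer-selection probabilities per pair
    "en-de": [0.9, 0.8, 0.1],
    "en-fr": [0.9, 0.2, 0.7],
}

def forward(x, pair, threshold=0.5):
    for layer, p in zip(layers, posteriors[pair]):
        if p >= threshold:  # skip layers this language pair rarely selects
            x = layer(x)
    return x

print(forward(1, "en-de"))  # (1+1)*2 = 4
print(forward(1, "en-fr"))  # (1+1)-3 = -1
```

All pairs share one set of parameters, but each pair effectively runs its own sub-network, which is what makes the latent-depth formulation attractive for multilingual translation.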

A Two-Step Approach for Implicit Event Argument Detection

no code implementations • ACL 2020 • Zhisong Zhang, Xiang Kong, Zhengzhong Liu, Xuezhe Ma, Eduard Hovy

It remains a challenge to detect implicit arguments, calling for more future work of document-level modeling for this task.

Sentence · Vocal Bursts Valence Prediction

Decoupling Global and Local Representations via Invertible Generative Flows

1 code implementation • ICLR 2021 • Xuezhe Ma, Xiang Kong, Shanghang Zhang, Eduard Hovy

In this work, we propose a new generative model that is capable of automatically decoupling global and local representations of images in an entirely unsupervised setting, by embedding a generative flow in the VAE framework to model the decoder.

Decoder · Density Estimation +3

Decompressing Knowledge Graph Representations for Link Prediction

1 code implementation • 11 Nov 2019 • Xiang Kong, Xianyang Chen, Eduard Hovy

Specifically, embeddings of entities and relationships are first decompressed to a more expressive and robust space by decompressing functions, then knowledge graph embedding models are trained in this new feature space.

Knowledge Graph Embedding · Knowledge Graphs +1
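The decompress-then-score idea can be sketched with a linear decompression map feeding a DistMult-style scorer; the paper's decompressing functions and dimensions differ, and everything below (sizes, random embeddings) is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
d_small, d_big = 4, 16

# Shared decompression map from the compact storage space to the more
# expressive scoring space (a linear stand-in for the paper's functions).
D = rng.normal(size=(d_big, d_small))

h = rng.normal(size=d_small)  # compressed head-entity embedding
t = rng.normal(size=d_small)  # compressed tail-entity embedding
r = rng.normal(size=d_big)    # relation lives in the expressive space

def distmult_score(h, r, t):
    # DistMult triple score in the decompressed feature space.
    return float(np.sum((D @ h) * r * (D @ t)))

score = distmult_score(h, r, t)
print(score)
```

Only the small embeddings are stored per entity; the shared map `D` is trained once, so capacity is added without inflating the embedding table.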

Generalized Data Augmentation for Low-Resource Translation

no code implementations • ACL 2019 • Mengzhou Xia, Xiang Kong, Antonios Anastasopoulos, Graham Neubig

Translation to or from low-resource languages (LRLs) poses challenges for machine translation in terms of both adequacy and fluency.

Data Augmentation · Translation +1

MaCow: Masked Convolutional Generative Flow

2 code implementations • NeurIPS 2019 • Xuezhe Ma, Xiang Kong, Shanghang Zhang, Eduard Hovy

Flow-based generative models, conceptually attractive due to the tractability of both exact log-likelihood computation and latent-variable inference, and the efficiency of both training and sampling, have led to a number of impressive empirical successes and spawned many advanced variants and theoretical investigations.

Computational Efficiency · Density Estimation +1

An Adversarial Approach to High-Quality, Sentiment-Controlled Neural Dialogue Generation

no code implementations • 22 Jan 2019 • Xiang Kong, Bohan Li, Graham Neubig, Eduard Hovy, Yiming Yang

In this work, we propose a method for neural dialogue response generation that allows not only generating semantically reasonable responses according to the dialogue history, but also explicitly controlling the sentiment of the response via sentiment labels.

Dialogue Generation · Response Generation +1

Neural Machine Translation with Adequacy-Oriented Learning

no code implementations • 21 Nov 2018 • Xiang Kong, Zhaopeng Tu, Shuming Shi, Eduard Hovy, Tong Zhang

Although Neural Machine Translation (NMT) models have advanced state-of-the-art performance in machine translation, they still face problems such as inadequate translation.

Attribute · Machine Translation +3

Evaluating Automatic Speech Recognition Systems in Comparison With Human Perception Results Using Distinctive Feature Measures

no code implementations • 13 Dec 2016 • Xiang Kong, Jeung-Yoon Choi, Stefanie Shattuck-Hufnagel

This paper describes methods for evaluating automatic speech recognition (ASR) systems in comparison with human perception results, using measures derived from linguistic distinctive features.

Automatic Speech Recognition · Automatic Speech Recognition (ASR) +1

Performance Improvements of Probabilistic Transcript-adapted ASR with Recurrent Neural Network and Language-specific Constraints

no code implementations • 13 Dec 2016 • Xiang Kong, Preethi Jyothi, Mark Hasegawa-Johnson

Mismatched transcriptions have been proposed as a means to acquire probabilistic transcriptions from non-native speakers of a language. Prior work has demonstrated the value of these transcriptions by successfully adapting cross-lingual ASR systems for different target languages.

Cross-Lingual ASR · TAR

Landmark-based consonant voicing detection on multilingual corpora

no code implementations • 10 Nov 2016 • Xiang Kong, Xuesong Yang, Mark Hasegawa-Johnson, Jeung-Yoon Choi, Stefanie Shattuck-Hufnagel

Three consonant voicing classifiers were developed: (1) manually selected acoustic features anchored at a phonetic landmark, (2) MFCCs (either averaged across the segment or anchored at the landmark), and (3) acoustic features computed using a convolutional neural network (CNN).
