no code implementations • 18 Feb 2025 • Jiaqi Zhao, Ming Wang, Miao Zhang, Yuzhang Shang, Xuebo Liu, YaoWei Wang, Min Zhang, Liqiang Nie
Then, we conduct extensive experiments with the baselines within each class, covering models of various sizes (7B-70B), bitwidths, training levels (LLaMA1/2/3/3.1), architectures (Mixtral, DeepSeekMoE and Mamba) and modalities (LLaVA1.5 and VILA1.5) on a wide range of evaluation metrics. Through comparative analysis of the results, we summarize the strengths of each PTQ strategy and the model-size-bitwidth trade-off with respect to performance.
no code implementations • 19 Dec 2024 • Xiabin Zhou, Wenbin Wang, Minyan Zeng, Jiaxian Guo, Xuebo Liu, Li Shen, Min Zhang, Liang Ding
Efficient KV cache management in LLMs is crucial for long-context tasks like RAG and summarization.
no code implementations • 18 Dec 2024 • Yifan Lu, Yigeng Zhou, Jing Li, Yequan Wang, Xuebo Liu, Daojing He, Fangming Liu, Min Zhang
Multi-hop question answering (MHQA) poses a significant challenge for large language models (LLMs) due to the extensive knowledge demands involved.
1 code implementation • 21 Nov 2024 • Hexuan Deng, Wenxiang Jiao, Xuebo Liu, Min Zhang, Zhaopeng Tu
To address this, we propose DRPruning, which incorporates distributionally robust optimization to restore balanced performance across domains, along with further improvements to enhance robustness.
1 code implementation • 28 Oct 2024 • Hexuan Deng, Wenxiang Jiao, Xuebo Liu, Min Zhang, Zhaopeng Tu
Despite their remarkable abilities in various tasks, large language models (LLMs) still struggle with real-time information (e.g., new facts and terms) due to the knowledge cutoff in their development process.
1 code implementation • 10 Oct 2024 • Yutong Wang, Jiali Zeng, Xuebo Liu, Derek F. Wong, Fandong Meng, Jie Zhou, Min Zhang
Large language models (LLMs) have achieved reasonable quality improvements in machine translation (MT).
no code implementations • 7 Oct 2024 • Peijie Dong, Lujun Li, Xiang Liu, Zhenheng Tang, Xuebo Liu, Qiang Wang, Xiaowen Chu
Specifically, we model the ZC proxy as a symbolic equation and incorporate a unified proxy search space that encompasses existing ZC proxies, which are composed of a predefined set of mathematical symbols.
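As one plausible reading of this setup, the sketch below represents a zero-cost (ZC) proxy as a small expression tree over per-layer statistics and scores a candidate network by evaluating it layer by layer; the statistics, operator set, and aggregation are illustrative assumptions, not the paper's actual search space.

```python
# Minimal sketch of a symbolic zero-cost proxy: a proxy is an expression tree over
# per-layer statistics, evaluated per layer and summed across layers.
# The statistics and operator set here are illustrative, not the paper's exact space.
import math

UNARY_OPS = {"log": lambda x: math.log(abs(x) + 1e-8), "abs": abs}
BINARY_OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

def evaluate(expr, stats):
    """Recursively evaluate a symbolic proxy on one layer's statistics.

    expr is a nested tuple, e.g. ("mul", ("log", "weight_norm"), "grad_norm");
    stats maps statistic names to scalars for a single layer.
    """
    if isinstance(expr, str):                      # leaf: a named statistic
        return stats[expr]
    op, *args = expr
    if op in UNARY_OPS:
        return UNARY_OPS[op](evaluate(args[0], stats))
    return BINARY_OPS[op](evaluate(args[0], stats), evaluate(args[1], stats))

def proxy_score(expr, per_layer_stats):
    """Aggregate the symbolic proxy over all layers of a candidate network."""
    return sum(evaluate(expr, s) for s in per_layer_stats)

# Toy usage: two layers described only by (hypothetical) weight/gradient norms.
layers = [{"weight_norm": 3.2, "grad_norm": 0.7},
          {"weight_norm": 1.1, "grad_norm": 0.2}]
candidate = ("mul", ("log", "weight_norm"), "grad_norm")
print(proxy_score(candidate, layers))
```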
1 code implementation • 4 Oct 2024 • Tengfei Yu, Xuebo Liu, Zhiyi Hou, Liang Ding, DaCheng Tao, Min Zhang
This study aims to refine the use of speech datasets for LSM training by addressing the limitations of vanilla instruction tuning.
1 code implementation • 4 Oct 2024 • Jun Rao, Xuebo Liu, Lian Lian, Shengjun Cheng, Yunjie Liao, Min Zhang
With instruction tuning, Large Language Models (LLMs) can enhance their ability to adhere to commands.
no code implementations • 19 Sep 2024 • Jun Rao, Xuebo Liu, Zepeng Lin, Liang Ding, Jing Li, DaCheng Tao, Min Zhang
Knowledge distillation (KD) is a technique that compresses large teacher models by training smaller student models to mimic them.
1 code implementation • 12 Jun 2024 • Yutong Wang, Jiali Zeng, Xuebo Liu, Fandong Meng, Jie Zhou, Min Zhang
The evaluation results in four language directions on the WMT22 benchmark reveal the effectiveness of our approach compared to existing methods.
no code implementations • 7 May 2024 • Junchao Wu, Runzhe Zhan, Derek F. Wong, Shu Yang, Xuebo Liu, Lidia S. Chao, Min Zhang
The efficacy of a detector for large language model (LLM) generated text depends substantially on the availability of sizable training data.
1 code implementation • 29 Apr 2024 • Xinyu Ma, Xuebo Liu, Derek F. Wong, Jun Rao, Bei Li, Liang Ding, Lidia S. Chao, DaCheng Tao, Min Zhang
Experimental results show that MMT models trained on our dataset exhibit a greater ability to exploit visual information than those trained on other MMT datasets.
no code implementations • 27 Mar 2024 • Dongfang Li, Zetian Sun, Baotian Hu, Zhenyu Liu, Xinshuo Hu, Xuebo Liu, Min Zhang
Large language models have been widely adopted in natural language processing, yet they face the challenge of generating unreliable content.
1 code implementation • 26 Feb 2024 • Liangxin Liu, Xuebo Liu, Derek F. Wong, Dongfang Li, Ziyi Wang, Baotian Hu, Min Zhang
In this work, we propose a novel approach, termed SelectIT, that capitalizes on the foundational capabilities of the LLM itself.
no code implementations • 19 Feb 2024 • Hong Chen, Chengtao Lv, Liang Ding, Haotong Qin, Xiabin Zhou, Yifu Ding, Xuebo Liu, Min Zhang, Jinyang Guo, Xianglong Liu, DaCheng Tao
Large language models (LLMs) have significantly advanced the field of natural language processing, yet their expensive memory and computation consumption impedes practical deployment.
1 code implementation • 22 Jan 2024 • Keqin Peng, Liang Ding, Yancheng Yuan, Xuebo Liu, Min Zhang, Yuanxin Ouyang, DaCheng Tao
In this work, we first revisit the factors contributing to this variance from both data and model aspects, and find that the choice of demonstration is both data- and model-dependent.
1 code implementation • 5 Dec 2023 • Xinyu Ma, Xuebo Liu, Min Zhang
In multilingual translation research, the comprehension and utilization of language families are of paramount importance.
1 code implementation • CVPR 2024 • Yaofang Liu, Xiaodong Cun, Xuebo Liu, Xintao Wang, Yong Zhang, Haoxin Chen, Yang Liu, Tieyong Zeng, Raymond Chan, Ying Shan
For video generation, various open-source models and publicly available services have been developed to generate high-quality videos.
1 code implementation • 25 Jul 2023 • Hexuan Deng, Xin Zhang, Meishan Zhang, Xuebo Liu, Min Zhang
In this paper, we conduct a holistic exploration of the Universal Decompositional Semantic (UDS) Parsing.
1 code implementation • 12 Jul 2023 • Yuzhuang Xu, Shuo Wang, Peng Li, Xuebo Liu, Xiaolong Wang, Weidong Liu, Yang Liu
Although neural machine translation (NMT) models perform well in the general domain, it remains rather challenging to control their generation behavior to satisfy the requirements of different users.
1 code implementation • 24 May 2023 • Qihuang Zhong, Liang Ding, Juhua Liu, Xuebo Liu, Min Zhang, Bo Du, DaCheng Tao
Token dropping is a recently proposed strategy to speed up the pretraining of masked language models, such as BERT, by skipping the computation of a subset of the input tokens at several middle layers.
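A minimal sketch of the general token-dropping idea follows: middle layers process only a kept subset of tokens, which are merged back before the final layers. The importance scores, keep ratio, and layer split are assumptions for illustration rather than the paper's exact recipe.

```python
# Hedged sketch of token dropping: in the middle layers only a subset of "important"
# tokens is processed; the rest bypass those layers and are merged back afterwards.
# Importance here is a stand-in score, not the heuristic used in the actual method.
import torch
import torch.nn as nn

d_model, n_layers, keep_ratio = 64, 6, 0.5
layers = nn.ModuleList([
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
    for _ in range(n_layers)
])

def forward_with_token_dropping(x, importance):
    """x: (batch, seq, d_model); importance: (batch, seq) scores for keeping tokens."""
    n_keep = max(1, int(keep_ratio * x.size(1)))
    keep_idx = importance.topk(n_keep, dim=1).indices.sort(dim=1).values  # (batch, n_keep)

    h = x
    for i, layer in enumerate(layers):
        if n_layers // 3 <= i < 2 * n_layers // 3:        # middle layers: drop tokens
            kept = torch.gather(h, 1, keep_idx.unsqueeze(-1).expand(-1, -1, d_model))
            kept = layer(kept)
            h = h.scatter(1, keep_idx.unsqueeze(-1).expand(-1, -1, d_model), kept)
        else:                                              # first/last layers: full sequence
            h = layer(h)
    return h

out = forward_with_token_dropping(torch.randn(2, 16, d_model), torch.rand(2, 16))
print(out.shape)  # torch.Size([2, 16, 64])
```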
1 code implementation • 3 May 2023 • Chi Seng Cheang, Hou Pong Chan, Derek F. Wong, Xuebo Liu, Zhaocong Li, Yanming Sun, Shudong Liu, Lidia S. Chao
Moreover, the knowledge memorized by PLMs may quickly become outdated, which affects the generalization performance of PLMs on future data.
1 code implementation • 24 Mar 2023 • Keqin Peng, Liang Ding, Qihuang Zhong, Li Shen, Xuebo Liu, Min Zhang, Yuanxin Ouyang, DaCheng Tao
We show that: 1) The performance of ChatGPT depends largely on temperature, and a lower temperature usually can achieve better performance; 2) Emphasizing the task information can further improve ChatGPT's performance, particularly in complex MT tasks; 3) Introducing domain information can elicit ChatGPT's generalization ability and improve its performance in the specific domain; 4) ChatGPT tends to generate hallucinations for non-English-centric MT tasks, which can be partially addressed by our proposed prompts but still need to be highlighted for the MT/NLP community.
1 code implementation • 8 Dec 2022 • Zhaocong Li, Xuebo Liu, Derek F. Wong, Lidia S. Chao, Min Zhang
In this paper, we propose a novel transfer learning method for NMT, namely ConsistTL, which can continuously transfer knowledge from the parent model during the training of the child model.
Low-Resource Neural Machine Translation
1 code implementation • 2 Dec 2022 • Hexuan Deng, Liang Ding, Xuebo Liu, Meishan Zhang, DaCheng Tao, Min Zhang
Preliminary experiments on En-Zh and En-Ja news domain corpora demonstrate that monolingual data can significantly improve translation quality (e.g., +3.15 BLEU on En-Zh).
1 code implementation • 23 Nov 2022 • Zhijun Wang, Xuebo Liu, Min Zhang
Existing research generally treats the Chinese character as the minimum unit of representation.
Ranked #1 on Machine Translation on WMT2017 Chinese-English
1 code implementation • 3 Nov 2022 • Peiyuan Gong, Xuebo Liu, Heyan Huang, Min Zhang
Pretraining-based (PT-based) automatic evaluation metrics (e.g., BERTScore and BARTScore) have been widely used in several sentence generation tasks (e.g., machine translation and text summarization) due to their better correlation with human judgments over traditional overlap-based methods.
no code implementations • 16 Apr 2022 • Zheng Zhang, Liang Ding, Dazhao Cheng, Xuebo Liu, Min Zhang, DaCheng Tao
Data augmentation (DA) is core to achieving robust sequence-to-sequence learning on various natural language processing (NLP) tasks.
1 code implementation • ACL 2022 • Bei Li, Quan Du, Tao Zhou, Yi Jing, Shuhan Zhou, Xin Zeng, Tong Xiao, Jingbo Zhu, Xuebo Liu, Min Zhang
Inspired by this, we design a new architecture, ODE Transformer, which is analogous to the Runge-Kutta method that is well motivated in the ODE literature.
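The sketch below shows the Runge-Kutta-2 (Heun) style residual update that such an ODE view suggests, with a toy feed-forward block standing in for a Transformer sublayer; it illustrates the numerical scheme only, not the full ODE Transformer architecture.

```python
# Sketch of an RK2-style residual update: instead of y_{n+1} = y_n + F(y_n),
# the sublayer function F is evaluated twice and averaged,
#   y_{n+1} = y_n + 0.5 * (F(y_n) + F(y_n + F(y_n))).
# F here is a toy stand-in for an attention/FFN block.
import torch
import torch.nn as nn

class RK2Block(nn.Module):
    def __init__(self, d_model=64):
        super().__init__()
        self.f = nn.Sequential(nn.LayerNorm(d_model),
                               nn.Linear(d_model, d_model), nn.ReLU(),
                               nn.Linear(d_model, d_model))

    def forward(self, y):
        k1 = self.f(y)              # first function evaluation
        k2 = self.f(y + k1)         # second evaluation at the Euler estimate
        return y + 0.5 * (k1 + k2)  # RK2 (Heun) update replaces the plain residual

print(RK2Block()(torch.randn(2, 10, 64)).shape)  # torch.Size([2, 10, 64])
```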
1 code implementation • 7 Nov 2021 • Runzhe Zhan, Xuebo Liu, Derek F. Wong, Lidia S. Chao
We release 70 small and discriminative test sets for machine translation (MT) evaluation called variance-aware test sets (VAT), covering 35 translation directions from WMT16 to WMT20 competitions.
1 code implementation • Findings (EMNLP) 2021 • Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Shuming Shi, Zhaopeng Tu
Pre-training (PT) and back-translation (BT) are two simple and powerful methods to utilize monolingual data for improving the model performance of neural machine translation (NMT).
1 code implementation • ACL 2021 • Runzhe Zhan, Xuebo Liu, Derek F. Wong, Lidia S. Chao
The high-quality translation results produced by machine translation (MT) systems still pose a huge challenge for automatic evaluation.
1 code implementation • Findings (ACL) 2021 • Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Shuming Shi, Zhaopeng Tu
In response to this problem, we propose a simple and effective method named copying penalty to control the copying behaviors in decoding.
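One straightforward way such a penalty could be realized at decoding time is sketched below: tokens that also occur in the source have their next-token scores reduced by a fixed amount. The penalty form and value are illustrative assumptions, not necessarily the paper's formulation.

```python
# Hedged sketch of a copying penalty at decoding time: tokens appearing in the source
# sentence get their log-probabilities reduced, discouraging over-copying.
import torch

def apply_copying_penalty(log_probs, source_token_ids, penalty=1.0):
    """log_probs: (batch, vocab) next-token scores; source_token_ids: (batch, src_len)."""
    penalized = log_probs.clone()
    for b in range(log_probs.size(0)):
        penalized[b, source_token_ids[b]] -= penalty   # down-weight tokens from the source
    return penalized

log_probs = torch.log_softmax(torch.randn(2, 100), dim=-1)
src = torch.randint(0, 100, (2, 7))
print(apply_copying_penalty(log_probs, src).shape)  # torch.Size([2, 100])
```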
no code implementations • Findings (ACL) 2021 • Liang Ding, Longyue Wang, Xuebo Liu, Derek F. Wong, DaCheng Tao, Zhaopeng Tu
Non-autoregressive translation (NAT) significantly accelerates the inference process via predicting the entire target sequence.
1 code implementation • ACL 2021 • Liang Ding, Longyue Wang, Xuebo Liu, Derek F. Wong, DaCheng Tao, Zhaopeng Tu
Results demonstrate that the proposed approach can significantly and universally improve translation quality by reducing translation errors on low-frequency words.
1 code implementation • 2 May 2021 • Wenhai Wang, Enze Xie, Xiang Li, Xuebo Liu, Ding Liang, Zhibo Yang, Tong Lu, Chunhua Shen
By systematically comparing with existing scene text representations, we show that our kernel representation can not only describe arbitrarily-shaped text but also well distinguish adjacent text.
1 code implementation • 3 Mar 2021 • Runzhe Zhan, Xuebo Liu, Derek F. Wong, Lidia S. Chao
Meta-learning has been extensively validated as beneficial for low-resource neural machine translation (NMT).
Domain Adaptation
Low-Resource Neural Machine Translation
1 code implementation • ICLR 2021 • Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Zhaopeng Tu
Encoder layer fusion (EncoderFusion) is a technique to fuse all the encoder layers (instead of the uppermost layer) for sequence-to-sequence (Seq2Seq) models, which has proven effective on various NLP tasks.
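A minimal sketch of one common fusion variant is given below: the decoder consumes a learned, softmax-weighted sum of all encoder layer outputs instead of only the top layer. The specific fusion function is an assumption; the paper studies EncoderFusion more broadly.

```python
# Minimal sketch of encoder layer fusion: fuse all encoder layers with learned
# softmax weights instead of using only the uppermost layer as decoder memory.
import torch
import torch.nn as nn

class LayerFusion(nn.Module):
    def __init__(self, n_layers):
        super().__init__()
        self.weights = nn.Parameter(torch.zeros(n_layers))   # one scalar weight per layer

    def forward(self, layer_outputs):
        # layer_outputs: list of (batch, seq, d_model) tensors, one per encoder layer
        stacked = torch.stack(layer_outputs, dim=0)           # (n_layers, batch, seq, d)
        w = torch.softmax(self.weights, dim=0).view(-1, 1, 1, 1)
        return (w * stacked).sum(dim=0)                       # fused encoder memory

fusion = LayerFusion(n_layers=6)
outputs = [torch.randn(2, 9, 64) for _ in range(6)]
print(fusion(outputs).shape)  # torch.Size([2, 9, 64])
```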
no code implementations • ICLR 2021 • Liang Ding, Longyue Wang, Xuebo Liu, Derek F. Wong, DaCheng Tao, Zhaopeng Tu
To this end, we introduce an extra Kullback-Leibler divergence term derived by comparing the lexical choice of NAT model and that embedded in the raw data.
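A hedged sketch of such a regularizer is shown below: a KL term between the model's per-position token distribution and a prior lexical distribution assumed to be estimated from the raw data, added to the usual training loss. The prior construction and weighting here are simplifications for illustration.

```python
# Hedged sketch: add a KL-divergence term between the model's per-position token
# distribution and a lexical prior estimated from the raw (non-distilled) data.
# The prior here is assumed given; its construction is a simplification.
import torch
import torch.nn.functional as F

def kl_to_raw_prior(logits, raw_prior, weight=0.1):
    """logits: (batch, seq, vocab) model outputs; raw_prior: (vocab,) probabilities."""
    log_p = F.log_softmax(logits, dim=-1)                    # model distribution
    log_prior = raw_prior.clamp_min(1e-8).log()              # log prior, broadcast over positions
    kl = (log_p.exp() * (log_p - log_prior)).sum(dim=-1)     # KL(model || prior) per position
    return weight * kl.mean()

logits = torch.randn(2, 5, 100)
raw_prior = torch.softmax(torch.randn(100), dim=0)           # toy stand-in for raw-data frequencies
loss = F.cross_entropy(logits.view(-1, 100), torch.randint(0, 100, (10,)))
loss = loss + kl_to_raw_prior(logits, raw_prior)
print(loss.item())
```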
no code implementations • Findings of the Association for Computational Linguistics 2020 • Zilong Wang, Mingjie Zhan, Xuebo Liu, Ding Liang
The table detection and handcrafted features used in previous works cannot be applied to all forms because of their format requirements.
2 code implementations • ECCV 2020 • Wenhai Wang, Xuebo Liu, Xiaozhong Ji, Enze Xie, Ding Liang, Zhibo Yang, Tong Lu, Chunhua Shen, Ping Luo
Unlike previous works that merely employed visual features for text detection, this work proposes a novel text spotter, named Ambiguity Eliminating Text Spotter (AE TextSpotter), which learns both visual and linguistic features to significantly reduce ambiguity in text detection.
1 code implementation • ACL 2020 • Xuebo Liu, Houtim Lai, Derek F. Wong, Lidia S. Chao
We use the norm (a.k.a. length or modulus) of a word embedding as a measure of 1) the difficulty of the sentence, 2) the competence of the model, and 3) the weight of the sentence.
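A small sketch of how embedding norms could drive such a curriculum is given below: sentence difficulty is the mean norm of its word embeddings, and a linear competence schedule caps the difficulty of sentences sampled at each step. The scoring and schedule are simplified assumptions, not the paper's exact procedure.

```python
# Sketch of a norm-based curriculum: difficulty = mean embedding norm of a sentence,
# and a competence schedule limits which sentences are available at a training step.
import torch

def sentence_difficulty(token_ids, embedding):
    """Average embedding norm of the tokens in one sentence (higher = harder)."""
    vectors = embedding(torch.tensor(token_ids))
    return vectors.norm(dim=-1).mean().item()

def competence(step, total_steps, c0=0.1):
    """Simple linear competence schedule in [c0, 1]."""
    return min(1.0, c0 + (1 - c0) * step / total_steps)

embedding = torch.nn.Embedding(1000, 64)
corpus = [[3, 17, 256], [5, 999, 42, 7], [1, 2]]
difficulties = sorted(sentence_difficulty(s, embedding) for s in corpus)
c = competence(step=200, total_steps=1000)
threshold = difficulties[int(c * (len(difficulties) - 1))]
usable = [s for s in corpus if sentence_difficulty(s, embedding) <= threshold]
print(len(usable))  # number of sentences the model may currently sample
```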
4 code implementations • ECCV 2020 • Wenjia Wang, Enze Xie, Xuebo Liu, Wenhai Wang, Ding Liang, Chunhua Shen, Xiang Bai
For example, it outperforms LapSRN by over 5% and 8% in the recognition accuracy of ASTER and CRNN, respectively.
no code implementations • ACL 2019 • Xuebo Liu, Derek F. Wong, Yang Liu, Lidia S. Chao, Tong Xiao, Jingbo Zhu
For similar source and target words, their embeddings tend to share part of their features, and they cooperatively learn these common representation units.
2 code implementations • 28 Mar 2019 • Jingchao Liu, Xuebo Liu, Jie Sheng, Ding Liang, Xin Li, Qingjie Liu
Scene text detection, an essential step of scene text recognition system, is to locate text instances in natural scene images automatically.
Ranked #1 on Scene Text Detection on ICDAR 2017 MLT
7 code implementations • CVPR 2018 • Xuebo Liu, Ding Liang, Shi Yan, Dagui Chen, Yu Qiao, Junjie Yan
Incidental scene text spotting is considered one of the most difficult and valuable challenges in the document analysis community.
Ranked #4 on Scene Text Detection on ICDAR 2017 MLT