Search Results for author: Baohao Liao

Found 18 papers, 6 papers with code

Fractured Chain-of-Thought Reasoning

no code implementations 19 May 2025 Baohao Liao, Hanze Dong, Yuhui Xu, Doyen Sahoo, Christof Monz, Junnan Li, Caiming Xiong

Inference-time scaling techniques have significantly bolstered the reasoning capabilities of large language models (LLMs) by harnessing additional computational effort at inference without retraining.

Reward-Guided Speculative Decoding for Efficient LLM Reasoning

no code implementations 31 Jan 2025 Baohao Liao, Yuhui Xu, Hanze Dong, Junnan Li, Christof Monz, Silvio Savarese, Doyen Sahoo, Caiming Xiong

We introduce Reward-Guided Speculative Decoding (RSD), a novel framework aimed at improving the efficiency of inference in large language models (LLMs).
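
The excerpt only names the framework. As a rough, hypothetical sketch of the general idea behind reward-guided speculative decoding (not necessarily the exact RSD algorithm), a cheap draft model proposes each step, a reward model scores it, and the expensive target model is consulted only when the score falls below a threshold; `draft_step`, `target_step`, `reward`, and the acceptance rule below are illustrative stand-ins.

```python
# Hypothetical sketch of reward-guided speculative decoding (illustrative only).
def rsd_generate(prompt, draft_step, target_step, reward, threshold=0.7, max_steps=32):
    """Generate reasoning steps, preferring cheap draft steps whenever the
    reward model judges them good enough."""
    steps = []
    for _ in range(max_steps):
        candidate = draft_step(prompt, steps)          # cheap proposal
        if candidate is None:                          # draft signals completion
            break
        if reward(prompt, steps, candidate) >= threshold:
            steps.append(candidate)                    # accept the draft step
        else:
            steps.append(target_step(prompt, steps))   # fall back to the large model
    return steps

# Toy usage with stub models.
draft = lambda p, s: f"draft step {len(s)}" if len(s) < 4 else None
target = lambda p, s: f"careful step {len(s)}"
score = lambda p, s, c: 0.9 if len(s) % 2 == 0 else 0.1
print(rsd_generate("What is 2+2?", draft, target, score))
```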

3-in-1: 2D Rotary Adaptation for Efficient Finetuning, Efficient Batching and Composability

1 code implementation 28 Aug 2024 Baohao Liao, Christof Monz

One notable challenge involves the efficient deployment of LLMs equipped with multiple task- or user-specific adapters, particularly when different adapters are needed for distinct requests within the same batch.

Arithmetic Reasoning
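
The deployment idea hinted at by the title is that a 2D rotary adapter acts element-wise on pairs of hidden dimensions, so requests carrying different adapters can share one batch without per-adapter matrix multiplications. A minimal numpy sketch of that property, assuming a learnable rotation angle per dimension pair (the paper's exact parameterization may differ); `adapter_bank` and the user keys are hypothetical.

```python
import numpy as np

def apply_2d_rotations(h, angles):
    """Rotate consecutive dimension pairs of hidden states `h` by per-request angles.

    h:      (batch, d) hidden states, d even
    angles: (batch, d // 2) one rotation angle per dimension pair and per request
    """
    x, y = h[:, 0::2], h[:, 1::2]                # split into dimension pairs
    c, s = np.cos(angles), np.sin(angles)
    out = np.empty_like(h)
    out[:, 0::2] = c * x - s * y                 # 2x2 rotation applied element-wise
    out[:, 1::2] = s * x + c * y
    return out

# Because the adapter is element-wise, each row of the batch can carry a
# different adapter (different angles) with no extra matrix multiplications.
h = np.random.randn(4, 8)
adapter_bank = {"user_a": np.full(4, 0.1), "user_b": np.full(4, -0.3)}
angles = np.stack([adapter_bank[u] for u in ["user_a", "user_b", "user_a", "user_b"]])
print(apply_2d_rotations(h, angles).shape)       # (4, 8)
```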

IKUN for WMT24 General MT Task: LLMs Are here for Multilingual Machine Translation

no code implementations 21 Aug 2024 Baohao Liao, Christian Herold, Shahram Khadivi, Christof Monz

The systems are based on a two-stage approach: first, continuous pre-training on monolingual data in 10 languages, followed by fine-tuning on high-quality parallel data for 11 language directions.

Machine Translation, Translation
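
A hypothetical skeleton of the two-stage recipe described above; `Model`, the corpus names, and the step counts are stand-in stubs, not the authors' actual code.

```python
from itertools import cycle, islice

class Model:
    def train_step(self, batch, objective):
        pass  # placeholder for one optimizer update

def run_stage(model, corpora, objective, steps):
    """Cycle over the corpora so every language (direction) is sampled regularly."""
    for batch in islice(cycle(corpora), steps):
        model.train_step(batch, objective=objective)
    return model

model = Model()
monolingual = [f"mono_{lang}" for lang in ["en", "de", "fr"]]         # 10 languages in the paper
parallel = [f"para_{d}" for d in ["en-de", "de-en", "en-fr"]]         # 11 directions in the paper
model = run_stage(model, monolingual, objective="causal_lm", steps=100)   # stage 1
model = run_stage(model, parallel, objective="translation", steps=50)     # stage 2
```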

Is It a Free Lunch for Removing Outliers during Pretraining?

no code implementations 19 Feb 2024 Baohao Liao, Christof Monz

With the growing size of large language models, the role of quantization becomes increasingly significant.

Quantization

ApiQ: Finetuning of 2-Bit Quantized Large Language Model

1 code implementation 7 Feb 2024 Baohao Liao, Christian Herold, Shahram Khadivi, Christof Monz

Memory-efficient finetuning of large language models (LLMs) has recently attracted considerable attention as LLMs grow in size, primarily due to the constraints posed by GPU memory and the effectiveness of these methods relative to full finetuning.

Language Modeling, Language Modelling +2
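
The excerpt does not describe ApiQ itself; as a generic sketch of the memory-efficient pattern it builds on (a frozen low-bit base weight plus a small trainable low-rank adapter, in the spirit of QLoRA), using a simplified round-to-nearest quantizer:

```python
import torch
import torch.nn as nn

class QuantizedLoRALinear(nn.Module):
    """Frozen low-bit base weight + trainable low-rank adapter (illustrative sketch)."""
    def __init__(self, weight, bits=2, rank=8):
        super().__init__()
        qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1       # e.g. -2..1 for 2 bits
        scale = weight.abs().max() / max(abs(qmin), qmax)          # symmetric round-to-nearest
        q = torch.clamp(torch.round(weight / scale), qmin, qmax)
        self.register_buffer("q_weight", q.to(torch.int8))         # frozen low-bit weights
        self.register_buffer("scale", scale)
        out_f, in_f = weight.shape
        self.lora_a = nn.Parameter(torch.randn(rank, in_f) * 0.01)  # trainable
        self.lora_b = nn.Parameter(torch.zeros(out_f, rank))        # trainable

    def forward(self, x):
        w = self.q_weight.float() * self.scale                      # dequantize on the fly
        return x @ (w + self.lora_b @ self.lora_a).T

layer = QuantizedLoRALinear(torch.randn(16, 32), bits=2, rank=4)
print(layer(torch.randn(2, 32)).shape)                              # torch.Size([2, 16])
```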

ITEm: Unsupervised Image-Text Embedding Learning for eCommerce

no code implementations 22 Oct 2023 Baohao Liao, Michael Kozielski, Sanjika Hewavitharana, Jiangbo Yuan, Shahram Khadivi, Tomer Lancewicki

Teaching a model to learn embeddings from different modalities without neglecting information from the less dominant modality is challenging.

Ask Language Model to Clean Your Noisy Translation Data

no code implementations 20 Oct 2023 Quinten Bolding, Baohao Liao, Brandon James Denis, Jun Luo, Christof Monz

Lastly, experiments on C-MTNT showcased its effectiveness in evaluating the robustness of NMT models, highlighting the potential of advanced language models for data cleaning and emphasizing C-MTNT as a valuable resource.

Language Modeling, Language Modelling +3
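
An illustrative sketch of asking a language model to clean noisy sentences from a translation corpus, in the spirit of the title; `query_llm`, the prompt wording, and the stub model are hypothetical.

```python
# Hypothetical prompt-based cleaning; not the paper's exact pipeline.
CLEAN_PROMPT = (
    "Rewrite the following sentence so that noise such as typos, slang, emojis, "
    "and odd spacing is removed, while keeping the meaning unchanged.\n"
    "Sentence: {sentence}\nCleaned:"
)

def clean_corpus(sentences, query_llm):
    """Return a cleaned version of every sentence, as produced by the LLM."""
    return [query_llm(CLEAN_PROMPT.format(sentence=s)).strip() for s in sentences]

# Toy usage with a stub model that just strips a trailing emoji and punctuation.
noisy = ["gr8 thx!! 😀", "c u l8r"]
stub_llm = lambda prompt: prompt.split("Sentence: ")[1].split("\n")[0].rstrip("😀 !")
print(clean_corpus(noisy, stub_llm))
```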

Make Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning

1 code implementation NeurIPS 2023 Baohao Liao, Shaomu Tan, Christof Monz

One effective way to reduce activation memory is to use a reversible model, so that intermediate activations do not need to be cached and can instead be recomputed.

Image Classification, parameter-efficient fine-tuning +1
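
A minimal sketch of the reversible-coupling idea the excerpt refers to: because the block's inputs can be reconstructed exactly from its outputs, intermediate activations need not be cached for the backward pass. This illustrates the general reversible-model principle, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class ReversibleBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
        self.g = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, x1, x2):
        y1 = x1 + self.f(x2)      # additive coupling
        y2 = x2 + self.g(y1)
        return y1, y2

    def inverse(self, y1, y2):
        x2 = y2 - self.g(y1)      # recompute inputs from outputs
        x1 = y1 - self.f(x2)
        return x1, x2

block = ReversibleBlock(8)
x1, x2 = torch.randn(2, 8), torch.randn(2, 8)
with torch.no_grad():
    y1, y2 = block(x1, x2)
    r1, r2 = block.inverse(y1, y2)
print(torch.allclose(r1, x1, atol=1e-6), torch.allclose(r2, x2, atol=1e-6))
```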

Parameter-Efficient Fine-Tuning without Introducing New Latency

no code implementations 26 May 2023 Baohao Liao, Yan Meng, Christof Monz

Parameter-efficient fine-tuning (PEFT) of pre-trained language models has recently demonstrated remarkable achievements, effectively matching the performance of full fine-tuning while using significantly fewer trainable parameters, thereby addressing storage and communication constraints.

Federated Learning, parameter-efficient fine-tuning
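
The "without introducing new latency" property typically comes from adapters that can be folded back into the original weights before deployment. A generic low-rank example of that folding (not necessarily the paper's own parameterization):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 16, 32, 4

W = rng.standard_normal((d_out, d_in))          # frozen pre-trained weight
B = rng.standard_normal((d_out, r)) * 0.01      # trainable low-rank factors
A = rng.standard_normal((r, d_in)) * 0.01

x = rng.standard_normal((5, d_in))

# During training: base path plus adapter path (extra computation).
y_train = x @ W.T + x @ (B @ A).T

# Before deployment: fold the update into W once; inference runs a single matmul.
W_merged = W + B @ A
y_deploy = x @ W_merged.T

print(np.allclose(y_train, y_deploy))           # True: identical outputs, no extra latency
```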

Mask More and Mask Later: Efficient Pre-training of Masked Language Models by Disentangling the [MASK] Token

1 code implementation 9 Nov 2022 Baohao Liao, David Thulke, Sanjika Hewavitharana, Hermann Ney, Christof Monz

We show: (1) [MASK]s can indeed be appended at a later layer, disentangled from the word embedding; (2) contextualized information from unmasked tokens can be gathered with only a few layers.
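
A rough sketch of the scheme the sentence describes, with the layer counts, positional encodings, and gathering mechanism simplified for brevity:

```python
import torch
import torch.nn as nn

class MaskLaterEncoder(nn.Module):
    """Encode unmasked tokens alone through the early layers; append [MASK]
    positions only for the last few layers (simplified illustration)."""
    def __init__(self, dim=32, early_layers=10, late_layers=2, nhead=4):
        super().__init__()
        layer = lambda: nn.TransformerEncoderLayer(dim, nhead, batch_first=True)
        self.early = nn.ModuleList(layer() for _ in range(early_layers))
        self.late = nn.ModuleList(layer() for _ in range(late_layers))
        self.mask_emb = nn.Parameter(torch.zeros(dim))

    def forward(self, unmasked_embeds, num_masked):
        h = unmasked_embeds
        for blk in self.early:                      # most compute: unmasked tokens only
            h = blk(h)
        masks = self.mask_emb.expand(h.size(0), num_masked, -1)
        h = torch.cat([h, masks], dim=1)            # append [MASK]s at a later layer
        for blk in self.late:                       # few layers see the full sequence
            h = blk(h)
        return h

model = MaskLaterEncoder()
out = model(torch.randn(2, 20, 32), num_masked=5)   # 20 visible + 5 masked positions
print(out.shape)                                    # torch.Size([2, 25, 32])
```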

Back-translation for Large-Scale Multilingual Machine Translation

1 code implementation WMT (EMNLP) 2021 Baohao Liao, Shahram Khadivi, Sanjika Hewavitharana

Surprisingly, smaller vocabularies perform better, and the extensive monolingual English data offers a modest improvement.

Machine Translation, Translation
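
For context, back-translation itself works as in the minimal sketch below; the stub reverse model and data are illustrative, not the WMT21 system.

```python
def back_translate(monolingual_target, reverse_model):
    """Create synthetic (source, target) pairs from target-language monolingual text."""
    return [(reverse_model(sentence), sentence) for sentence in monolingual_target]

def build_training_data(real_pairs, monolingual_target, reverse_model, upsample_real=1):
    """Mix synthetic pairs with (optionally upsampled) real parallel data."""
    synthetic = back_translate(monolingual_target, reverse_model)
    return real_pairs * upsample_real + synthetic

# Toy usage with a stub reverse model (target -> source).
real = [("hallo welt", "hello world")]
mono_en = ["good morning", "see you soon"]
stub_reverse = lambda s: f"<de translation of: {s}>"
print(build_training_data(real, mono_en, stub_reverse))
```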

Unifying Input and Output Smoothing in Neural Machine Translation

no code implementations COLING 2020 Yingbo Gao, Baohao Liao, Hermann Ney

Soft contextualized data augmentation is a recent method that replaces one-hot representation of words with soft posterior distributions of an external language model, smoothing the input of neural machine translation systems.

Data Augmentation, Language Modeling +3
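
A small numpy sketch of the input-smoothing idea described above: the one-hot row for a word is mixed with the soft posterior of an external language model, so the input embedding becomes an expectation over the vocabulary. The mixing weight and shapes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, dim, seq = 100, 16, 6

E = rng.standard_normal((vocab, dim))            # embedding matrix of the NMT model
token_ids = rng.integers(0, vocab, size=seq)

one_hot = np.eye(vocab)[token_ids]               # (seq, vocab) one-hot inputs

logits = rng.standard_normal((seq, vocab))       # external LM scores at each position
posterior = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

alpha = 0.3                                      # how much to smooth toward the LM
soft_dist = (1 - alpha) * one_hot + alpha * posterior

hard_input = one_hot @ E                         # standard embedding lookup
soft_input = soft_dist @ E                       # smoothed (expected) embedding

print(hard_input.shape, soft_input.shape)        # (6, 16) (6, 16)
```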

Multi-Agent Mutual Learning at Sentence-Level and Token-Level for Neural Machine Translation

no code implementations Findings of the Association for Computational Linguistics 2020 Baohao Liao, Yingbo Gao, Hermann Ney

Mutual learning, where multiple agents learn collaboratively and teach one another, has been shown to be an effective way to distill knowledge for image classification tasks.

Image Classification, Machine Translation +2
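
A generic sketch of mutual learning between two agents (cross-entropy on the labels plus a KL term toward the peer's predictions); the paper extends this to sentence- and token-level signals for NMT, and only the basic token-level form, with an assumed weighting `beta`, is shown here.

```python
import torch
import torch.nn.functional as F

def mutual_learning_losses(logits_a, logits_b, targets, beta=0.5):
    """Return the loss for each agent: cross-entropy + KL toward the peer."""
    ce_a = F.cross_entropy(logits_a, targets)
    ce_b = F.cross_entropy(logits_b, targets)
    log_p_a, log_p_b = F.log_softmax(logits_a, -1), F.log_softmax(logits_b, -1)
    # KL(peer || self); the peer's distribution is detached so it acts as a teacher.
    kl_a = F.kl_div(log_p_a, log_p_b.detach(), reduction="batchmean", log_target=True)
    kl_b = F.kl_div(log_p_b, log_p_a.detach(), reduction="batchmean", log_target=True)
    return ce_a + beta * kl_a, ce_b + beta * kl_b

logits_a, logits_b = torch.randn(8, 50), torch.randn(8, 50)   # (tokens, vocab)
targets = torch.randint(0, 50, (8,))
loss_a, loss_b = mutual_learning_losses(logits_a, logits_b, targets)
print(loss_a.item(), loss_b.item())
```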
