Search Results for author: Wenhao Zhu

Found 16 papers, 10 papers with code

What Knowledge Is Needed? Towards Explainable Memory for kNN-MT Domain Adaptation

1 code implementation • 8 Nov 2022 • Wenhao Zhu, ShuJian Huang, Yunzhe Lv, Xin Zheng, Jiajun Chen

kNN-MT presents a new paradigm for domain adaptation by building an external datastore, which usually saves all target language token occurrences in the parallel corpus.

Domain Adaptation, NMT, +1

kNN-BOX: A Unified Framework for Nearest Neighbor Generation

1 code implementation • 27 Feb 2023 • Wenhao Zhu, Qianfeng Zhao, Yunzhe Lv, ShuJian Huang, Siheng Zhao, Sizhe Liu, Jiajun Chen

Augmenting the base neural model with a token-level symbolic datastore is a novel generation paradigm and has achieved promising results in machine translation (MT).

Machine Translation, Paraphrase Generation, +4

Question Translation Training for Better Multilingual Reasoning

1 code implementation • 15 Jan 2024 • Wenhao Zhu, ShuJian Huang, Fei Yuan, Shuaijie She, Jiajun Chen, Alexandra Birch

A typical solution is to translate instruction data into all languages of interest, and then train on the resulting multilingual data, which is called translate-training.

Mathematical Reasoning, Translation

Extrapolating Large Language Models to Non-English by Aligning Languages

2 code implementations • 9 Aug 2023 • Wenhao Zhu, Yunzhe Lv, Qingxiu Dong, Fei Yuan, Jingjing Xu, ShuJian Huang, Lingpeng Kong, Jiajun Chen, Lei LI

We start from targeting individual languages by performing cross-lingual instruction-tuning (CoIT) on LLaMA, i.e., tuning it with translation task data and cross-lingual general task data to obtain cross-lingual models (x-LLaMAs), and formulate underlying scaling laws to investigate the advantages of using scalable translation data.

Translation

MAPO: Advancing Multilingual Reasoning through Multilingual Alignment-as-Preference Optimization

1 code implementation • 12 Jan 2024 • Shuaijie She, Wei Zou, ShuJian Huang, Wenhao Zhu, Xiang Liu, Xiang Geng, Jiajun Chen

To enhance reasoning abilities in non-dominant languages, we propose a Multilingual-Alignment-as-Preference Optimization framework (MAPO), aiming to align the reasoning processes in other languages with the dominant language.

Mathematical Reasoning

INK: Injecting kNN Knowledge in Nearest Neighbor Machine Translation

1 code implementation • 10 Jun 2023 • Wenhao Zhu, Jingjing Xu, ShuJian Huang, Lingpeng Kong, Jiajun Chen

We propose an effective training framework INK to directly smooth the representation space via adjusting representations of kNN neighbors with a small number of new parameters.

Machine Translation, Translation

Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model

1 code implementation • 2 Aug 2023 • Kanzhi Cheng, Wenpo Song, Zheng Ma, Wenhao Zhu, Zixuan Zhu, Jianbing Zhang

Considering that Vision-Language Pre-Training (VLP) models master massive such knowledge from large-scale web-harvested data, it is promising to utilize the generalizability of VLP models to incorporate knowledge into image descriptions.

Hallucination, Image Captioning, +2

Lego-MT: Learning Detachable Models for Massively Multilingual Machine Translation

1 code implementation • 20 Dec 2022 • Fei Yuan, Yinquan Lu, Wenhao Zhu, Lingpeng Kong, Lei LI, Yu Qiao, Jingjing Xu

To address the needs of learning representations for all languages in a unified space, we propose a novel efficient training recipe, upon which we build an effective detachable model, Lego-MT.

Machine Translation, Translation

FGraDA: A Dataset and Benchmark for Fine-Grained Domain Adaptation in Machine Translation

1 code implementation • LREC 2022 • Wenhao Zhu, ShuJian Huang, Tong Pu, Pingxuan Huang, Xu Zhang, Jian Yu, Wei Chen, Yanfeng Wang, Jiajun Chen

Previous research for adapting a general neural machine translation (NMT) model into a specific domain usually neglects the diversity in translation within the same domain, which is a core problem for domain adaptation in real-world scenarios.

Autonomous Vehicles, Domain Adaptation, +3

Gradient Episodic Memory with a Soft Constraint for Continual Learning

no code implementations • 16 Nov 2020 • Guannan Hu, Wu Zhang, Hu Ding, Wenhao Zhu

Catastrophic forgetting in continual learning is a common destructive phenomenon in gradient-based neural networks that learn sequential tasks, and it is very different from forgetting in humans, who can learn and accumulate knowledge throughout their whole lives.

Continual Learning

Hierarchical Transformer for Scalable Graph Learning

no code implementations • 4 May 2023 • Wenhao Zhu, Tianyu Wen, Guojie Song, Xiaojun Ma, Liang Wang

Graph Transformer is gaining increasing attention in the field of machine learning and has demonstrated state-of-the-art performance on benchmarks for graph representation learning.

Graph Learning, Graph Representation Learning

On Structural Expressive Power of Graph Transformers

no code implementations • 23 May 2023 • Wenhao Zhu, Tianyu Wen, Guojie Song, Liang Wang, Bo Zheng

Graph Transformer has recently received wide attention in the research community with its outstanding performance, yet its structural expressive power has not been well analyzed.

SIFTER: A Task-specific Alignment Strategy for Enhancing Sentence Embeddings

no code implementations • 21 Jun 2023 • Chao Yu, Wenhao Zhu, Chaoming Liu, XiaoYu Zhang, Qiuhong Zhai

This indicates that different downstream tasks have different levels of sensitivity to sentence components.

Sentence, Sentence Embeddings, +1

PMNN: Physical Model-driven Neural Network for solving time-fractional differential equations

no code implementations • 7 Oct 2023 • Zhiying Ma, Jie Hou, Wenhao Zhu, Yaxin Peng, Ying Li

It establishes a temporal iteration scheme based on physical model-driven neural networks which effectively combines deep neural networks (DNNs) with interpolation approximation of fractional derivatives.

A parameter-free clustering algorithm for missing datasets

no code implementations • 8 Apr 2024 • Qi Li, Xianjun Zeng, Shuliang Wang, Wenhao Zhu, Shijie Ruan, Zhimeng Yuan

Missing datasets, in which some objects have missing values in certain dimensions, are prevalent in the real world.

Clustering, Imputation, +1
