Search Results for author: Zhihua Wu

Found 19 papers, 14 papers with code

Code Comparison Tuning for Code Large Language Models

no code implementations • 28 Mar 2024 • Yufan Jiang, Qiaozhi He, Xiaomin Zhuang, Zhihua Wu

We present Code Comparison Tuning (CCT), a simple and effective tuning method for code large language models (Code LLMs) to better handle subtle code errors.
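
As a rough illustration of the comparison-style tuning idea described above (not the paper's actual data pipeline), one could pair a correct snippet with a subtly perturbed copy and train the model to tell them apart; the function name and the single-operator perturbation below are assumptions:

    # Illustrative only: build one comparison-tuning example by pairing correct code
    # with a copy containing a subtle single-operator bug (the perturbation is assumed).
    def make_comparison_pair(correct_code: str) -> dict:
        buggy_code = correct_code.replace("<=", "<", 1)   # inject a subtle boundary bug
        prompt = (
            "Which snippet contains a bug?\n"
            f"Snippet A:\n{correct_code}\n\nSnippet B:\n{buggy_code}\n"
        )
        return {"prompt": prompt, "target": "Snippet B"}

    example = make_comparison_pair(
        "def clamp(x, lo, hi):\n    return max(lo, min(x, hi)) if lo <= hi else x"
    )
    print(example["prompt"])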

Bug fixing

DeepAdaIn-Net: Deep Adaptive Device-Edge Collaborative Inference for Augmented Reality

no code implementations • IEEE Journal of Selected Topics in Signal Processing 2023 • Li Wang, Xin Wu, Yi Zhang, Xinyun Zhang, Lianming Xu, Zhihua Wu, Aiguo Fei

Specifically, DeepAdaIn-Net encompasses a partition point selection (PPS) module, a high feature compression learning (HFCL) module, a bandwidth-aware feature configuration (BaFC) module, and a feature consistency compensation (FCC) module.
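
A minimal sketch, not taken from the paper, of the idea behind partition point selection: choose the layer at which splitting the network between device and edge minimizes estimated end-to-end latency under the current bandwidth. All timings, sizes, and names here are assumptions:

    # Illustrative partition-point selection: pick the split layer minimizing
    # device compute + feature transfer + edge compute latency (made-up numbers).
    def select_partition_point(device_ms, edge_ms, feature_kb, bandwidth_kbps):
        best_layer, best_latency = None, float("inf")
        for layer in range(len(device_ms) + 1):
            latency = (
                sum(device_ms[:layer])                             # first `layer` layers on device
                + (feature_kb[layer] * 8) / bandwidth_kbps * 1000  # transfer intermediate feature (ms)
                + sum(edge_ms[layer:])                             # remaining layers on the edge server
            )
            if latency < best_latency:
                best_layer, best_latency = layer, latency
        return best_layer, best_latency

    # Example with 4 layers; feature_kb has len(layers)+1 entries (input ... output).
    print(select_partition_point([5, 8, 12, 6], [1, 2, 3, 2], [600, 300, 150, 80, 4], 2000))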

Collaborative Inference Feature Compression +2

TA-MoE: Topology-Aware Large Scale Mixture-of-Expert Training

1 code implementation • 20 Feb 2023 • Chang Chen, Min Li, Zhihua Wu, dianhai yu, Chao Yang

In this paper, we propose TA-MoE, a topology-aware routing strategy for large-scale MoE training, from a model-system co-design perspective, which can dynamically adjust the MoE dispatch pattern according to the network topology.
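
A minimal sketch, not the paper's implementation, of what topology-aware dispatch could look like: expert selection is biased toward experts whose communication cost from the sending rank is low. The cost matrix and the bias strength alpha are assumptions:

    import numpy as np

    # Illustrative topology-aware dispatch: penalize experts that are "far" from the
    # current rank in the interconnect topology (cost matrix and alpha are assumptions).
    def topology_aware_dispatch(gate_logits, rank, comm_cost, alpha=0.1):
        # gate_logits: [num_tokens, num_experts]; comm_cost: [num_ranks, num_experts]
        biased = gate_logits - alpha * comm_cost[rank]           # cheaper experts get a boost
        probs = np.exp(biased - biased.max(axis=1, keepdims=True))
        probs /= probs.sum(axis=1, keepdims=True)
        return probs.argmax(axis=1)                              # top-1 expert per token

    logits = np.random.randn(4, 8)
    cost = np.random.rand(2, 8) * 5.0
    print(topology_aware_dispatch(logits, rank=0, comm_cost=cost))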

HelixFold: An Efficient Implementation of AlphaFold2 using PaddlePaddle

1 code implementation • 12 Jul 2022 • Guoxia Wang, Xiaomin Fang, Zhihua Wu, Yiqun Liu, Yang Xue, Yingfei Xiang, dianhai yu, Fan Wang, Yanjun Ma

Due to the complex model architecture and large memory consumption, it requires lots of computational resources and time to implement the training and inference of AlphaFold2 from scratch.

Protein Structure Prediction

SE-MoE: A Scalable and Efficient Mixture-of-Experts Distributed Training and Inference System

1 code implementation • 20 May 2022 • Liang Shen, Zhihua Wu, Weibao Gong, Hongxiang Hao, Yangfan Bai, HuaChao Wu, Xinxuan Wu, Jiang Bian, Haoyi Xiong, dianhai yu, Yanjun Ma

With the increasing diversity of ML infrastructure, distributed training over heterogeneous computing systems is desired to facilitate the production of big models.

Distributed Computing

Nebula-I: A General Framework for Collaboratively Training Deep Learning Models on Low-Bandwidth Cloud Clusters

1 code implementation • 19 May 2022 • Yang Xiang, Zhihua Wu, Weibao Gong, Siyu Ding, Xianjie Mo, Yuang Liu, Shuohuan Wang, Peng Liu, Yongshuai Hou, Long Li, Bin Wang, Shaohuai Shi, Yaqian Han, Yue Yu, Ge Li, Yu Sun, Yanjun Ma, dianhai yu

We took natural language processing (NLP) as an example to show how Nebula-I works in different training phases, including: a) pre-training a multilingual language model using two remote clusters; and b) fine-tuning a machine translation model using knowledge distilled from pre-trained models; together, these phases cover the most popular paradigm of recent deep learning.
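
A minimal sketch, not from the paper, of the distillation step mentioned above: a student model is trained to match the softened output distribution of a pre-trained teacher in addition to the hard labels. The temperature and mixing weight are assumptions:

    import numpy as np

    # Illustrative knowledge-distillation loss: the student matches the teacher's
    # softened outputs plus the hard labels (temperature and alpha are assumptions).
    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def distillation_loss(student_logits, teacher_logits, hard_labels, T=2.0, alpha=0.5):
        soft_t = softmax(teacher_logits / T)
        soft_s = softmax(student_logits / T)
        kd = -(soft_t * np.log(soft_s + 1e-9)).sum(axis=-1).mean() * (T * T)   # soft-target term
        ce = -np.log(softmax(student_logits)[np.arange(len(hard_labels)), hard_labels] + 1e-9).mean()
        return alpha * kd + (1 - alpha) * ce

    print(distillation_loss(np.random.randn(3, 10), np.random.randn(3, 10), np.array([1, 4, 7])))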

Cross-Lingual Natural Language Inference Distributed Computing +2

ERNIE-ViLG: Unified Generative Pre-training for Bidirectional Vision-Language Generation

2 code implementations • 31 Dec 2021 • Han Zhang, Weichong Yin, Yewei Fang, Lanxin Li, Boqiang Duan, Zhihua Wu, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

To explore the landscape of large-scale pre-training for bidirectional text-image generation, we train a 10-billion-parameter ERNIE-ViLG model on a large-scale dataset of 145 million (Chinese) image-text pairs, which achieves state-of-the-art performance on both text-to-image and image-to-text tasks, obtaining an FID of 7.9 on MS-COCO for text-to-image synthesis and best results on COCO-CN and AIC-ICC for image captioning.

Image Captioning Quantization +2

End-to-end Adaptive Distributed Training on PaddlePaddle

1 code implementation • 6 Dec 2021 • Yulong Ao, Zhihua Wu, dianhai yu, Weibao Gong, Zhiqing Kui, Minxu Zhang, Zilingfeng Ye, Liang Shen, Yanjun Ma, Tian Wu, Haifeng Wang, Wei Zeng, Chao Yang

The experiments demonstrate that our framework can satisfy various requirements from the diversity of applications and the heterogeneity of resources with highly competitive performance.

Language Modelling Recommendation Systems +1

PaddleRec

1 code implementation • WSDM 2021 • Wenhui Zhang, Zhihua Wu, Haofeng Yin

A quick-start tool for search & recommendation algorithms based on PaddlePaddle, offering a complete recommendation system solution for beginners, developers, and researchers.

Click-Through Rate Prediction Recommendation Systems

PLATO-XL: Exploring the Large-scale Pre-training of Dialogue Generation

3 code implementations • 20 Sep 2021 • Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang, Wenquan Wu, Zhihua Wu, Zhen Guo, Hua Lu, Xinxian Huang, Xin Tian, Xinchao Xu, Yingzhan Lin, Zheng-Yu Niu

To explore the limit of dialogue generation pre-training, we present the models of PLATO-XL with up to 11 billion parameters, trained on both Chinese and English social media conversations.

Dialogue Generation

Distributed Non-Convex Optimization with Sublinear Speedup under Intermittent Client Availability

1 code implementation • 18 Feb 2020 • Yikai Yan, Chaoyue Niu, Yucheng Ding, Zhenzhe Zheng, Fan Wu, Guihai Chen, Shaojie Tang, Zhihua Wu

In this work, we consider a practical and ubiquitous issue when deploying federated learning in mobile environments: intermittent client availability, where the set of eligible clients may change during the training process.
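
A minimal sketch, not the paper's algorithm, of a federated averaging round in which only an intermittently available subset of clients participates; the Bernoulli availability model and the toy least-squares objective are assumptions:

    import numpy as np

    # Illustrative FedAvg round under intermittent client availability
    # (availability model and local objective are assumptions, not the paper's setting).
    def fedavg_round(global_w, client_data, lr=0.1, avail_prob=0.5, rng=np.random):
        available = [c for c in client_data if rng.random() < avail_prob]
        if not available:
            return global_w                                # no eligible client this round
        updates = []
        for x, y in available:
            grad = 2 * x.T @ (x @ global_w - y) / len(y)   # local least-squares gradient
            updates.append(global_w - lr * grad)
        return np.mean(updates, axis=0)                    # aggregate only the available clients

    rng = np.random.default_rng(0)
    clients = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(5)]
    w = np.zeros(3)
    for _ in range(10):
        w = fedavg_round(w, clients, rng=rng)
    print(w)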

Benchmarking Federated Learning

Secure Federated Submodel Learning

1 code implementation • 6 Nov 2019 • Chaoyue Niu, Fan Wu, Shaojie Tang, Lifeng Hua, Rongfei Jia, Chengfei Lv, Zhihua Wu, Guihai Chen

Nevertheless, the "position" of a client's truly required submodel corresponds to her private data, and its disclosure to the cloud server during interactions inevitably breaks the tenet of federated learning.
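
A minimal sketch, not from the paper, illustrating why the submodel's "position" is sensitive: the client downloads only the embedding rows for items in her local data, so the requested row indices themselves reveal that data to the server. Sizes and names are assumptions:

    import numpy as np

    # Illustrative (insecure) submodel retrieval: the row indices a client requests
    # are exactly the item ids in her private data, so sending them in the clear
    # leaks that data to the server (toy sizes).
    full_embedding = np.random.randn(100_000, 16)       # server-side full embedding table

    def request_submodel(private_item_ids):
        index = sorted(set(private_item_ids))           # the "position" of the submodel
        return index, full_embedding[index]             # only the rows needed for local training

    index, submodel = request_submodel([42, 7, 42, 99_999])
    print(index, submodel.shape)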

Federated Learning Position

From Server-Based to Client-Based Machine Learning: A Comprehensive Survey

no code implementations • 18 Sep 2019 • Renjie Gu, Chaoyue Niu, Fan Wu, Guihai Chen, Chun Hu, Chengfei Lyu, Zhihua Wu

Another benefit is bandwidth reduction, because various kinds of local data can be involved in the training process without being uploaded.

BIG-bench Machine Learning
