Search Results for author: dianhai yu

Found 38 papers, 26 papers with code

A Framework for Cost-Effective and Self-Adaptive LLM Shaking and Recovery Mechanism

no code implementations 12 Mar 2024 Zhiyu Chen, Yu Li, Suochao Zhang, Jingbo Zhou, Jiwen Zhou, Chenfu Bao, dianhai yu

As Large Language Models (LLMs) achieve great success in real-world applications, an increasing number of users are seeking to develop and deploy their customized LLMs through cloud services.

Privacy Preserving

Spectral Heterogeneous Graph Convolutions via Positive Noncommutative Polynomials

2 code implementations 31 May 2023 Mingguo He, Zhewei Wei, Shikun Feng, Zhengjie Huang, Weibin Li, Yu Sun, dianhai yu

These spatial-based HGNNs neglect the utilization of spectral graph convolutions, which are the foundation of Graph Convolutional Networks (GCN) on homogeneous graphs.

Graph Learning Node Classification +1
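
As a rough illustration only (not the paper's construction, which builds positive noncommutative polynomial filters for heterogeneous graphs), the sketch below shows the basic spectral-polynomial view of graph convolution that the snippet refers to, assuming NumPy and a toy homogeneous graph:

```python
import numpy as np

def normalized_adjacency(adj: np.ndarray) -> np.ndarray:
    """Symmetrically normalized adjacency D^-1/2 (A + I) D^-1/2, as used by GCN."""
    a_hat = adj + np.eye(adj.shape[0])
    deg = a_hat.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    return d_inv_sqrt @ a_hat @ d_inv_sqrt

def polynomial_spectral_filter(adj, x, weights):
    """Apply a polynomial filter sum_k w_k * A_hat^k @ x to node features x.

    This is the generic spectral-polynomial view of graph convolution;
    in practice the weights would be learned.
    """
    a_hat = normalized_adjacency(adj)
    out = np.zeros_like(x)
    propagated = x.copy()
    for w in weights:            # w_0 * x + w_1 * A_hat @ x + w_2 * A_hat^2 @ x + ...
        out += w * propagated
        propagated = a_hat @ propagated
    return out

# Toy usage: a 4-node path graph, 2-dim features, order-2 filter.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = np.random.rand(4, 2)
y = polynomial_spectral_filter(adj, x, weights=[0.5, 0.3, 0.2])
```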

TA-MoE: Topology-Aware Large Scale Mixture-of-Expert Training

1 code implementation 20 Feb 2023 Chang Chen, Min Li, Zhihua Wu, dianhai yu, Chao Yang

In this paper, we propose TA-MoE, a topology-aware routing strategy for large-scale MoE training from a model-system co-design perspective, which can dynamically adjust the MoE dispatch pattern according to the network topology.
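
As a hedged sketch of the general idea, not TA-MoE's actual routing algorithm, the toy dispatcher below biases top-1 expert selection toward experts hosted on the sender's own node; the function name and the local_bonus knob are illustrative assumptions:

```python
import numpy as np

def topology_biased_dispatch(gate_logits, expert_node, worker_node, local_bonus=0.5):
    """Toy top-1 MoE dispatch that prefers experts on the sender's own node.

    gate_logits: (num_tokens, num_experts) raw gating scores.
    expert_node: (num_experts,) node id hosting each expert.
    worker_node: node id of the worker that owns these tokens.
    local_bonus: additive preference for intra-node experts (hypothetical knob).
    """
    bias = np.where(expert_node == worker_node, local_bonus, 0.0)
    return np.argmax(gate_logits + bias, axis=1)   # chosen expert per token

# Toy usage: 6 tokens, 4 experts spread over 2 nodes, tokens living on node 0.
logits = np.random.randn(6, 4)
assignment = topology_biased_dispatch(logits,
                                      expert_node=np.array([0, 0, 1, 1]),
                                      worker_node=0)
```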

PP-YOLOE-R: An Efficient Anchor-Free Rotated Object Detector

2 code implementations 4 Nov 2022 Xinxin Wang, Guanzhong Wang, Qingqing Dang, Yi Liu, Xiaoguang Hu, dianhai yu

With multi-scale training and testing, PP-YOLOE-R-l and PP-YOLOE-R-x further improve the detection precision to 80.02 and 80.73 mAP.

Object object-detection +3

PP-StructureV2: A Stronger Document Analysis System

1 code implementation 11 Oct 2022 Chenxia Li, Ruoyu Guo, Jun Zhou, Mengtao An, Yuning Du, Lingfeng Zhu, Yi Liu, Xiaoguang Hu, dianhai yu

For the Table Recognition model, we utilize PP-LCNet, CSP-PAN and SLAHead to optimize the backbone module, feature fusion module and decoding module, respectively, which improves the table structure accuracy by 6% with comparable inference speed.

Key Information Extraction Knowledge Distillation +3

ERNIE-mmLayout: Multi-grained MultiModal Transformer for Document Understanding

no code implementations 18 Sep 2022 Wenjin Wang, Zhengjie Huang, Bin Luo, Qianglong Chen, Qiming Peng, Yinxu Pan, Weichong Yin, Shikun Feng, Yu Sun, dianhai yu, Yin Zhang

First, a document graph is proposed to model complex relationships among multi-grained multimodal elements, in which salient visual regions are detected by a cluster-based method.

Common Sense Reasoning document understanding +1
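
A minimal toy sketch of a cluster-based document graph, assuming scikit-learn's KMeans and using box centers as stand-ins for the multimodal element features; the real ERNIE-mmLayout pipeline differs in how salient regions are detected and how edges are typed:

```python
import numpy as np
from sklearn.cluster import KMeans

def build_document_graph(box_centers, n_regions=3):
    """Toy document graph: cluster word/box centers into coarse regions,
    then connect fine-grained elements that fall into the same region.

    box_centers: (num_elements, 2) x/y centers of fine-grained elements.
    Returns (region id per element, list of undirected edges).
    """
    region_id = KMeans(n_clusters=n_regions, n_init=10).fit_predict(box_centers)
    edges = [(i, j)
             for i in range(len(region_id))
             for j in range(i + 1, len(region_id))
             if region_id[i] == region_id[j]]
    return region_id, edges

# Toy usage: 12 fine-grained elements on a page.
centers = np.random.rand(12, 2)
regions, edges = build_document_graph(centers)
```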

Large-scale Knowledge Distillation with Elastic Heterogeneous Computing Resources

1 code implementation 14 Jul 2022 Ji Liu, daxiang dong, Xi Wang, An Qin, Xingjian Li, Patrick Valduriez, Dejing Dou, dianhai yu

Although more layers and more parameters generally improve the accuracy of the models, such big models generally have high computational complexity and require large amounts of memory, which exceeds the capacity of small devices for inference and leads to long training times.

Knowledge Distillation

HelixFold: An Efficient Implementation of AlphaFold2 using PaddlePaddle

1 code implementation 12 Jul 2022 Guoxia Wang, Xiaomin Fang, Zhihua Wu, Yiqun Liu, Yang Xue, Yingfei Xiang, dianhai yu, Fan Wang, Yanjun Ma

Due to the complex model architecture and large memory consumption, it requires lots of computational resources and time to implement the training and inference of AlphaFold2 from scratch.

Protein Structure Prediction

PP-OCRv3: More Attempts for the Improvement of Ultra Lightweight OCR System

1 code implementation 7 Jun 2022 Chenxia Li, Weiwei Liu, Ruoyu Guo, Xiaoting Yin, Kaitao Jiang, Yongkun Du, Yuning Du, Lingfeng Zhu, Baohua Lai, Xiaoguang Hu, dianhai yu, Yanjun Ma

For the text recognizer, the base model is replaced from CRNN to SVTR, and we introduce the lightweight text recognition network SVTR LCNet, guided training of CTC by attention, the data augmentation strategy TextConAug, a better pre-trained model obtained by self-supervised TextRotNet, UDML, and UIM to accelerate the model and improve its performance.

Data Augmentation Optical Character Recognition +2

SE-MoE: A Scalable and Efficient Mixture-of-Experts Distributed Training and Inference System

1 code implementation 20 May 2022 Liang Shen, Zhihua Wu, Weibao Gong, Hongxiang Hao, Yangfan Bai, HuaChao Wu, Xinxuan Wu, Jiang Bian, Haoyi Xiong, dianhai yu, Yanjun Ma

With the increasing diversity of ML infrastructures nowadays, distributed training over heterogeneous computing systems is desired to facilitate the production of big models.

Distributed Computing

Nebula-I: A General Framework for Collaboratively Training Deep Learning Models on Low-Bandwidth Cloud Clusters

1 code implementation 19 May 2022 Yang Xiang, Zhihua Wu, Weibao Gong, Siyu Ding, Xianjie Mo, Yuang Liu, Shuohuan Wang, Peng Liu, Yongshuai Hou, Long Li, Bin Wang, Shaohuai Shi, Yaqian Han, Yue Yu, Ge Li, Yu Sun, Yanjun Ma, dianhai yu

We took natural language processing (NLP) as an example to show how Nebula-I works in different training phases that include: a) pre-training a multilingual language model using two remote clusters; and b) fine-tuning a machine translation model using knowledge distilled from pre-trained models, which run through the most popular paradigm of recent deep learning.

Cross-Lingual Natural Language Inference Distributed Computing +2

Simple and Effective Relation-based Embedding Propagation for Knowledge Representation Learning

1 code implementation 13 May 2022 Huijuan Wang, Siming Dai, Weiyue Su, Hui Zhong, Zeyang Fang, Zhengjie Huang, Shikun Feng, Zeyu Chen, Yu Sun, dianhai yu

Notably, it brings an average relative improvement of about 10% to triplet-based embedding methods on OGBL-WikiKG2 and takes only 5%-83% of the time to achieve results comparable to the state-of-the-art GC-OTE.

Knowledge Graphs Relation +1
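
As a hedged illustration of relation-aware propagation, not the paper's exact REP operator, the sketch below performs one propagation step in which each head entity aggregates messages from its tails through a translational relation model; the alpha mixing weight is an assumption:

```python
import numpy as np

def propagate_entity_embeddings(entity_emb, relation_emb, triples, alpha=0.5):
    """One toy propagation step over a knowledge graph.

    Each head entity receives a message from its tails through the relation,
    modelled translationally here as (tail - relation); the paper's propagation
    operator differs, this only illustrates relation-aware propagation.
    """
    messages = np.zeros_like(entity_emb)
    counts = np.zeros(entity_emb.shape[0])
    for h, r, t in triples:
        messages[h] += entity_emb[t] - relation_emb[r]
        counts[h] += 1
    counts = np.maximum(counts, 1)                      # avoid division by zero
    return (1 - alpha) * entity_emb + alpha * messages / counts[:, None]

# Toy usage: 4 entities, 2 relations, embedding dim 8.
ent = np.random.randn(4, 8)
rel = np.random.randn(2, 8)
triples = [(0, 0, 1), (0, 1, 2), (3, 0, 2)]
ent_next = propagate_entity_embeddings(ent, rel, triples)
```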

End-to-end Adaptive Distributed Training on PaddlePaddle

1 code implementation 6 Dec 2021 Yulong Ao, Zhihua Wu, dianhai yu, Weibao Gong, Zhiqing Kui, Minxu Zhang, Zilingfeng Ye, Liang Shen, Yanjun Ma, Tian Wu, Haifeng Wang, Wei Zeng, Chao Yang

The experiments demonstrate that our framework can satisfy various requirements from the diversity of applications and the heterogeneity of resources with highly competitive performance.

Language Modelling Recommendation Systems +1

Exploiting Cross-Modal Prediction and Relation Consistency for Semi-Supervised Image Captioning

no code implementations 22 Oct 2021 Yang Yang, Hongchen Wei, HengShu Zhu, dianhai yu, Hui Xiong, Jian Yang

In detail, considering that the heterogeneous gap between modalities makes it difficult to use the global embedding directly for supervision, CPRC instead transforms both the raw image and the corresponding generated sentence into the shared semantic space and measures the generated sentence from two aspects: 1) Prediction consistency.

Image Captioning Informativeness +2
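
A minimal sketch of the prediction-consistency idea, assuming PyTorch; the projection heads, feature dimensions, and the use of a KL divergence are illustrative assumptions rather than CPRC's actual implementation:

```python
import torch
import torch.nn.functional as F

# Hypothetical dimensions; these heads stand in for whatever encoders map
# images and generated sentences into the shared semantic space.
img_proj = torch.nn.Linear(2048, 256)      # image feature -> shared space
txt_proj = torch.nn.Linear(768, 256)       # sentence feature -> shared space
classifier = torch.nn.Linear(256, 10)      # shared space -> label distribution

def prediction_consistency_loss(img_feat, sent_feat):
    """KL divergence between predictions made from the image and from the
    generated sentence, both mapped into the shared semantic space."""
    p_img = F.log_softmax(classifier(img_proj(img_feat)), dim=-1)
    p_txt = F.softmax(classifier(txt_proj(sent_feat)), dim=-1)
    return F.kl_div(p_img, p_txt, reduction="batchmean")

# Toy usage with random features for a batch of 4 image-sentence pairs.
loss = prediction_consistency_loss(torch.randn(4, 2048), torch.randn(4, 768))
```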

PP-LCNet: A Lightweight CPU Convolutional Neural Network

8 code implementations 17 Sep 2021 Cheng Cui, Tingquan Gao, Shengyu Wei, Yuning Du, Ruoyu Guo, Shuilong Dong, Bin Lu, Ying Zhou, Xueying Lv, Qiwen Liu, Xiaoguang Hu, dianhai yu, Yanjun Ma

We propose a lightweight CPU network based on the MKLDNN acceleration strategy, named PP-LCNet, which improves the performance of lightweight models on multiple tasks.

Image Classification object-detection +2
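
As context rather than the actual PP-LCNet code, the block below is the generic depthwise-separable building block that lightweight CPU networks of this kind are assembled from, assuming PyTorch; the activation choice and shapes are assumptions, and the paper adds further tricks (e.g. larger kernels and SE modules) on top:

```python
import torch
import torch.nn as nn

class DepthwiseSeparableBlock(nn.Module):
    """Depthwise conv followed by a pointwise conv, the standard building
    block of lightweight CPU-oriented networks."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.Hardswish()          # h-swish style activation, an assumption here

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

# Toy usage: halve the spatial resolution while widening the channels.
block = DepthwiseSeparableBlock(32, 64, stride=2)
out = block(torch.randn(1, 32, 56, 56))    # -> (1, 64, 28, 28)
```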

PP-YOLOv2: A Practical Object Detector

1 code implementation 21 Apr 2021 Xin Huang, Xinxin Wang, Wenyu Lv, Xiaying Bai, Xiang Long, Kaipeng Deng, Qingqing Dang, Shumin Han, Qiwen Liu, Xiaoguang Hu, dianhai yu, Yanjun Ma, Osamu Yoshie

To meet these two concerns, we comprehensively evaluate a collection of existing refinements to improve the performance of PP-YOLO while keeping the inference time almost unchanged.

Object Real-Time Object Detection

Distilling Knowledge from Pre-trained Language Models via Text Smoothing

no code implementations 8 May 2020 Xing Wu, Yibing Liu, Xiangyang Zhou, dianhai yu

As an alternative, we propose a new method for BERT distillation, i.e., asking the teacher to generate smoothed word ids, rather than labels, for teaching the student model in knowledge distillation.

Knowledge Distillation Language Modelling
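
A minimal sketch of the text-smoothing idea, assuming PyTorch: the teacher's MLM logits are turned into a per-token distribution over the vocabulary ("smoothed word ids") and the student is trained against those soft targets; the temperature parameter and exact loss form are assumptions:

```python
import torch
import torch.nn.functional as F

def text_smoothing_targets(teacher_logits, temperature=1.0):
    """Turn teacher MLM logits into smoothed word ids: a probability
    distribution over the vocabulary for every token position, used in
    place of the single hard word id."""
    return F.softmax(teacher_logits / temperature, dim=-1)

def soft_target_loss(student_logits, smoothed_targets):
    """Cross-entropy of the student's predictions against the smoothed targets."""
    log_probs = F.log_softmax(student_logits, dim=-1)
    return -(smoothed_targets * log_probs).sum(dim=-1).mean()

# Toy usage: batch of 2 sentences, 5 tokens each, vocabulary of 100 words.
teacher_logits = torch.randn(2, 5, 100)    # stand-in for BERT MLM outputs
student_logits = torch.randn(2, 5, 100, requires_grad=True)
loss = soft_target_loss(student_logits, text_smoothing_targets(teacher_logits))
loss.backward()
```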

RLTM: An Efficient Neural IR Framework for Long Documents

no code implementations 22 Jun 2019 Chen Zheng, Yu Sun, Shengxian Wan, dianhai yu

This paper proposes a novel End-to-End neural ranking framework called Reinforced Long Text Matching (RLTM) which matches a query with long documents efficiently and effectively.

Information Retrieval Retrieval +2
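
As a very rough, non-RL sketch of coarse-to-fine matching over a long document, assuming NumPy: a cheap scorer selects candidate sentences and only the selection is scored more carefully; RLTM itself learns the selection with reinforcement learning, which is omitted here, and the scoring functions below are placeholders:

```python
import numpy as np

def cheap_sentence_scores(query_vec, sentence_vecs):
    """Fast, coarse relevance scores (dot products) used to pick candidate
    sentences from a long document."""
    return sentence_vecs @ query_vec

def match_long_document(query_vec, sentence_vecs, k=3):
    """Select the top-k sentences with the cheap scorer, then compute a finer
    relevance score (here simply averaged cosine similarity) on the selection."""
    coarse = cheap_sentence_scores(query_vec, sentence_vecs)
    selected = np.argsort(coarse)[-k:]
    fine = np.mean([np.dot(query_vec, sentence_vecs[i]) /
                    (np.linalg.norm(query_vec) * np.linalg.norm(sentence_vecs[i]))
                    for i in selected])
    return selected, fine

# Toy usage: a 20-sentence document with 16-dim sentence vectors.
q = np.random.rand(16)
doc = np.random.rand(20, 16)
picked, score = match_long_document(q, doc)
```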

A New Method of Region Embedding for Text Classification

1 code implementation ICLR 2018 chao qiao, Bo Huang, guocheng niu, daren li, daxiang dong, wei he, dianhai yu, Hua Wu

In this paper, we propose a new method of learning and utilizing task-specific distributed representations of n-grams, referred to as “region embeddings”.

General Classification text-classification +1
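
A hedged toy sketch of a word-context region embedding, assuming NumPy: each word in a fixed-size region is reweighted by the center word's local context unit and the results are max-pooled into one region vector; the indexing and pooling details are assumptions, not the paper's exact recipe:

```python
import numpy as np

def region_embedding(word_embs, context_units, center, radius=1):
    """Toy region embedding for the window of words around `center`.

    Each word in the window is weighted (element-wise) by the center word's
    local context unit for that relative offset, and the weighted vectors
    are max-pooled into a single fixed-size region vector.
    """
    projected = []
    for offset in range(-radius, radius + 1):
        pos = center + offset
        if 0 <= pos < word_embs.shape[0]:
            unit = context_units[center, offset + radius]   # (dim,) weights
            projected.append(unit * word_embs[pos])
    return np.max(np.stack(projected), axis=0)

# Toy usage: a 6-word sentence, 8-dim embeddings, window radius 1.
vocab_embs = np.random.rand(6, 8)
units = np.random.rand(6, 3, 8)            # per-word local context units
region_vec = region_embedding(vocab_embs, units, center=2)
```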
