Search Results for author: Weixin Liu

Found 7 papers, 5 papers with code

East: Efficient and Accurate Secure Transformer Framework for Inference

no code implementations • 19 Aug 2023 • Yuanchao Ding, Hua Guo, Yewei Guan, Weixin Liu, Jiarong Huo, Zhenyu Guan, Xiyong Zhang

Secure protocols for non-linear functions are crucial in privacy-preserving Transformer inference, which are not well studied.

Privacy Preserving

Paper
Add Code

ERNIE 3.0 Tiny: Frustratingly Simple Method to Improve Task-Agnostic Distillation Generalization

1 code implementation • 9 Jan 2023 • Weixin Liu, Xuyi Chen, Jiaxiang Liu, Shikun Feng, Yu Sun, Hao Tian, Hua Wu

Experimental results demonstrate that our method yields a student with much better generalization, significantly outperforms existing baselines, and establishes a new state-of-the-art result on in-domain, out-domain, and low-resource datasets in the setting of task-agnostic distillation.

Knowledge Distillation Language Modelling +1

11,299

Paper
Code

Learning Weakly-Supervised Contrastive Representations

1 code implementation • ICLR 2022 • Yao-Hung Hubert Tsai, Tianqin Li, Weixin Liu, Peiyuan Liao, Ruslan Salakhutdinov, Louis-Philippe Morency

The first stage is to cluster data according to its auxiliary information.

Contrastive Learning Representation Learning

Paper
Code

ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation

3 code implementations • 23 Dec 2021 • Shuohuan Wang, Yu Sun, Yang Xiang, Zhihua Wu, Siyu Ding, Weibao Gong, Shikun Feng, Junyuan Shang, Yanbin Zhao, Chao Pang, Jiaxiang Liu, Xuyi Chen, Yuxiang Lu, Weixin Liu, Xi Wang, Yangfan Bai, Qiuliang Chen, Li Zhao, Shiyong Li, Peng Sun, dianhai yu, Yanjun Ma, Hao Tian, Hua Wu, Tian Wu, Wei Zeng, Ge Li, Wen Gao, Haifeng Wang

A unified framework named ERNIE 3. 0 was recently proposed for pre-training large-scale knowledge enhanced models and trained a model with 10 billion parameters.

Language Modelling

11,418

Paper
Code

ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation

2 code implementations • 5 Jul 2021 • Yu Sun, Shuohuan Wang, Shikun Feng, Siyu Ding, Chao Pang, Junyuan Shang, Jiaxiang Liu, Xuyi Chen, Yanbin Zhao, Yuxiang Lu, Weixin Liu, Zhihua Wu, Weibao Gong, Jianzhong Liang, Zhizhou Shang, Peng Sun, Wei Liu, Xuan Ouyang, dianhai yu, Hao Tian, Hua Wu, Haifeng Wang

We trained the model with 10 billion parameters on a 4TB corpus consisting of plain texts and a large-scale knowledge graph.

Few-Shot Learning Natural Language Understanding +2

11,418

Paper
Code

Integrating Auxiliary Information in Self-supervised Learning

no code implementations • 5 Jun 2021 • Yao-Hung Hubert Tsai, Tianqin Li, Weixin Liu, Peiyuan Liao, Ruslan Salakhutdinov, Louis-Philippe Morency

Our approach contributes as follows: 1) Comparing to conventional self-supervised representations, the auxiliary-information-infused self-supervised representations bring the performance closer to the supervised representations; 2) The presented Cl-InfoNCE can also work with unsupervised constructed clusters (e. g., k-means clusters) and outperform strong clustering-based self-supervised learning approaches, such as the Prototypical Contrastive Learning (PCL) method; 3) We show that Cl-InfoNCE may be a better approach to leverage the data clustering information, by comparing it to the baseline approach - learning to predict the clustering assignments with cross-entropy loss.

Clustering Contrastive Learning +1

Paper
Add Code

ERNIE-Tiny : A Progressive Distillation Framework for Pretrained Transformer Compression

1 code implementation • 4 Jun 2021 • Weiyue Su, Xuyi Chen, Shikun Feng, Jiaxiang Liu, Weixin Liu, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

Specifically, the first stage, General Distillation, performs distillation with guidance from pretrained teacher, gerenal data and latent distillation loss.

Knowledge Distillation

11,418

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.