Search Results for author: Jun Yao

Found 10 papers, 2 papers with code

HW-TSC’s Participation in the WMT 2021 Efficiency Shared Task

no code implementations • WMT (EMNLP) 2021 • Hengchao Shang, Ting Hu, Daimeng Wei, Zongyao Li, Jianfei Feng, Zhengzhe Yu, Jiaxin Guo, Shaojun Li, Lizhi Lei, Shimin Tao, Hao Yang, Jun Yao, Ying Qin

This paper presents the submission of Huawei Translation Services Center (HW-TSC) to the WMT 2021 Efficiency Shared Task.

Quantization • Sentence • +1

IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact

1 code implementation • 2 Mar 2024 • Ruikang Liu, Haoli Bai, Haokun Lin, Yuening Li, Han Gao, Zhengzhuo Xu, Lu Hou, Jun Yao, Chun Yuan

Large language models (LLMs) excel in natural language processing but demand intensive computation.

Language Modelling • Large Language Model • +1

Machine Learning Insides OptVerse AI Solver: Design Principles and Applications

no code implementations • 11 Jan 2024 • Xijun Li, Fangzhou Zhu, Hui-Ling Zhen, Weilin Luo, Meng Lu, Yimin Huang, Zhenan Fan, Zirui Zhou, Yufei Kuang, Zhihai Wang, Zijie Geng, Yang Li, Haoyang Liu, Zhiwu An, Muming Yang, Jianshu Li, Jie Wang, Junchi Yan, Defeng Sun, Tao Zhong, Yong Zhang, Jia Zeng, Mingxuan Yuan, Jianye Hao, Jun Yao, Kun Mao

To this end, we present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI Solver, which aims to mitigate the scarcity of real-world mathematical programming instances, and to surpass the capabilities of traditional optimization techniques.

Decision Making • Management

PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation

no code implementations • 27 Dec 2023 • Yunhe Wang, Hanting Chen, Yehui Tang, Tianyu Guo, Kai Han, Ying Nie, Xutao Wang, Hailin Hu, Zheyuan Bai, Yun Wang, Fangcheng Liu, Zhicheng Liu, Jianyuan Guo, Sinan Zeng, Yinchen Zhang, Qinghua Xu, Qun Liu, Jun Yao, Chao Xu, Dacheng Tao

We then demonstrate, through carefully designed ablations, that the proposed approach significantly enhances model nonlinearity; thus, we present a new efficient model architecture, namely PanGu-π.

Language Modelling

PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing

no code implementations • 20 Mar 2023 • Xiaozhe Ren, Pingyi Zhou, Xinfan Meng, Xinjing Huang, Yadao Wang, Weichao Wang, Pengfei Li, Xiaoda Zhang, Alexander Podolskiy, Grigory Arshinov, Andrey Bout, Irina Piontkovskaya, Jiansheng Wei, Xin Jiang, Teng Su, Qun Liu, Jun Yao

In this work, we develop a system that trained a trillion-parameter language model on a cluster of Ascend 910 AI processors and MindSpore framework, and present the language model with 1.085T parameters named PanGu-Σ.

Code Generation • Language Modelling • +4

ArCL: Enhancing Contrastive Learning with Augmentation-Robust Representations

no code implementations • 2 Mar 2023 • Xuyang Zhao, Tianqi Du, Yisen Wang, Jun Yao, Weiran Huang

Moreover, we show that contrastive learning fails to learn domain-invariant features, which limits its transferability.

Contrastive Learning • Data Augmentation • +1

When Noisy Labels Meet Long Tail Dilemmas: A Representation Calibration Method

no code implementations • ICCV 2023 • Manyi Zhang, Xuyang Zhao, Jun Yao, Chun Yuan, Weiran Huang

In this paper, to handle the problem and address the limitations of prior works, we propose a representation calibration method, RCAL.

Contrastive Learning • Learning with noisy labels • +1

Deep Stock Trading: A Hierarchical Reinforcement Learning Framework for Portfolio Optimization and Order Execution

no code implementations • 23 Dec 2020 • Rundong Wang, Hongxin Wei, Bo An, Zhouyan Feng, Jun Yao

Portfolio management via reinforcement learning is at the forefront of fintech research, which explores how to optimally reallocate a fund into different financial assets over the long term by trial-and-error.

Hierarchical Reinforcement Learning • Management • +2

Correction of Faulty Background Knowledge based on Condition Aware and Revise Transformer for Question Answering

no code implementations • 30 Jun 2020 • Xinyan Zhao, Xiao Feng, Haoming Zhong, Jun Yao, Huanhuan Chen

CAR-Transformer (1) revises each condition value based on the whole conversation and the original condition values, and (2) encodes the revised conditions and uses the condition embedding to select an answer.

Question Answering

Sub-Architecture Ensemble Pruning in Neural Architecture Search

1 code implementation • 1 Oct 2019 • Yijun Bian, Qingquan Song, Mengnan Du, Jun Yao, Huanhuan Chen, Xia Hu

Neural architecture search (NAS) has gained increasing attention in recent years due to its flexibility and remarkable capability to reduce the burden of neural network design.

Ensemble Learning • Ensemble Pruning • +1
