Search Results for author: Cheng Cheng

Found 38 papers, 12 papers with code

Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On

no code implementations • 11 Jul 2024 • Liang Zeng, Liangjun Zhong, Liang Zhao, Tianwen Wei, Liu Yang, Jujie He, Cheng Cheng, Rui Hu, Yang Liu, Shuicheng Yan, Han Fang, Yahui Zhou

In this paper, we investigate the underlying factors that potentially enhance the mathematical reasoning capabilities of large language models (LLMs).

GSM8K, Math +1

Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models

1 code implementation • 3 Jun 2024 • Tianwen Wei, Bo Zhu, Liang Zhao, Cheng Cheng, Biye Li, Weiwei Lü, Peng Cheng, Jianhao Zhang, XiaoYu Zhang, Liang Zeng, Xiaokun Wang, Yutuan Ma, Rui Hu, Shuicheng Yan, Han Fang, Yahui Zhou

In this technical report, we introduce the training methodologies implemented in the development of Skywork-MoE, a high-performance mixture-of-experts (MoE) large language model (LLM) with 146 billion parameters and 16 experts.

Language Modelling, Large Language Model

Activating Wider Areas in Image Super-Resolution

1 code implementation • 13 Mar 2024 • Cheng Cheng, Hang Wang, Hongbin Sun

The prevalence of convolution neural networks (CNNs) and vision transformers (ViTs) has markedly revolutionized the area of single-image super-resolution (SISR).

Image Super-Resolution

100 Gbps Indoor Access and 4.8 Gbps Outdoor Point-to-Point LiFi Transmission Systems using Laser-based Light Sources

no code implementations • 25 Feb 2024 • Cheng Cheng, Sovan Das, Stefan Videv, Adrian Spark, Sina Babadi, Aravindh Krishnamoorthy, Changmin Lee, Daniel Grieder, Kathleen Hartnett, Paul Rudy, James Raring, Marzieh Najafi, Vasilis K. Papanikolaou, Robert Schober, Harald Haas

In this paper, we demonstrate the communication capabilities of light-fidelity (LiFi) systems based on high-brightness and high-bandwidth integrated laser-based sources in a surface mount device (SMD) packaging platform.

Meta-Adapter: An Online Few-shot Learner for Vision-Language Model

1 code implementation • NeurIPS 2023 • Cheng Cheng, Lin Song, Ruoyi Xue, Hang Wang, Hongbin Sun, Yixiao Ge, Ying Shan

Without bells and whistles, our approach outperforms the state-of-the-art online few-shot learning method by an average of 3.6% on eight image classification datasets with higher inference speed.

Few-Shot Learning, Image Classification +3

Graph Propagation Transformer for Graph Representation Learning

1 code implementation • 19 May 2023 • Zhe Chen, Hao Tan, Tao Wang, Tianrun Shen, Tong Lu, Qiuying Peng, Cheng Cheng, Yue Qi

The core insight of our method is to fully consider the information propagation among nodes and edges in a graph when building the attention module in the transformer blocks.

Ranked #2 on Graph Regression on PCQM4M-LSC (Validation MAE metric)

Graph Learning, Graph Property Prediction +3

Individual Fairness under Uncertainty

no code implementations • 16 Feb 2023 • Wenbin Zhang, Zichong Wang, Juyong Kim, Cheng Cheng, Thomas Oommen, Pradeep Ravikumar, Jeremy Weiss

Algorithmic fairness, the research field of making machine learning (ML) algorithms fair, is an established area in ML.


Graph Fourier transform based on singular value decomposition of directed Laplacian

no code implementations • 12 May 2022 • Yang Chen, Cheng Cheng, Qiyu Sun

The proposed GFT is consistent with the conventional GFT in the undirected graph setting, and on directed circulant graphs, the proposed GFT is the classical discrete Fourier transform, up to some rotation, permutation and phase adjustment.

Wiener filters on graphs and distributed polynomial approximation algorithms

no code implementations • 9 May 2022 • Cong Zheng, Cheng Cheng, Qiyu Sun

In this paper, we consider Wiener filters to reconstruct deterministic and (wide-band) stationary graph signals from their observations corrupted by random noises, and we propose distributed algorithms to implement Wiener filters and inverse filters on networks in which agents are equipped with a data processing subsystem for limited data storage and computation power, and with a one-hop communication subsystem for direct data exchange only with their adjacent agents.


MetaBalance: Improving Multi-Task Recommendations via Adapting Gradient Magnitudes of Auxiliary Tasks

1 code implementation • 14 Mar 2022 • Yun He, Xue Feng, Cheng Cheng, Geng Ji, Yunsong Guo, James Caverlee

Specifically, in each training iteration and adaptively for each part of the network, the gradient of an auxiliary loss is carefully reduced or enlarged to have a closer magnitude to the gradient of the target loss, preventing auxiliary tasks from being so strong that they dominate the target task or too weak to help the target task.

Learning to Adapt to Light

no code implementations • 16 Feb 2022 • Kai-Fu Yang, Cheng Cheng, Shi-Xuan Zhao, Xian-Shi Zhang, Yong-Jie Li

Light adaptation or brightness correction is a key step in improving the contrast and visual appeal of an image.

Image Enhancement, Tone Mapping

A recurrent neural network approach for remaining useful life prediction utilizing a novel trend features construction method

no code implementations • 10 Dec 2021 • Sen Zhao, Yong Zhang, Shang Wang, Beitong Zhou, Cheng Cheng

Data-driven methods for remaining useful life (RUL) prediction normally learn features from a fixed window size of a priori of degradation, which may lead to less accurate prediction results on different datasets because of the variance of local features.

OneFlow: Redesign the Distributed Deep Learning Framework from Scratch

1 code implementation • 28 Oct 2021 • Jinhui Yuan, Xinqi Li, Cheng Cheng, Juncheng Liu, Ran Guo, Shenghang Cai, Chi Yao, Fei Yang, Xiaodong Yi, Chuan Wu, Haoran Zhang, Jie Zhao

Aiming at a simple, neat redesign of distributed deep learning frameworks for various parallelism paradigms, we present OneFlow, a novel distributed training framework based on an SBP (split, broadcast and partial-value) abstraction and the actor model.

Dictionary Learning Using Rank-One Atomic Decomposition (ROAD)

no code implementations • 25 Oct 2021 • Cheng Cheng, Wei Dai

Dictionary learning aims at seeking a dictionary under which the training data can be sparsely represented.

Dictionary Learning

Dictionary Learning with Convex Update (ROMD)

no code implementations • 13 Oct 2021 • Cheng Cheng, Wei Dai

Typical methods for dictionary update focus on refining both dictionary atoms and their corresponding sparse coefficients by using the sparsity patterns obtained from the sparse coding stage, and hence it is a non-convex bilinear inverse problem.

Dictionary Learning

Short-and-Sparse Deconvolution Via Rank-One Constrained Optimization (ROCO)

no code implementations • 5 Oct 2021 • Cheng Cheng, Wei Dai

In the literature, formulations of blind deconvolution are either a convex program via a matrix lifting of convolution, or a bilinear Lasso.

DA-DRN: Degradation-Aware Deep Retinex Network for Low-Light Image Enhancement

no code implementations • 5 Oct 2021 • Xinxu Wei, Xianshi Zhang, Shisen Wang, Cheng Cheng, Yanlin Huang, KaiFu Yang, YongJie Li

We propose a Degradation-Aware Module (DA Module) which can guide the training process of the decomposer and enable the decomposer to be a restorer during the training phase without additional computational cost in the test phase.

Low-Light Image Enhancement

BLNet: A Fast Deep Learning Framework for Low-Light Image Enhancement with Noise Removal and Color Restoration

no code implementations • 30 Jun 2021 • Xinxu Wei, Xianshi Zhang, Shisen Wang, Cheng Cheng, Yanlin Huang, KaiFu Yang, YongJie Li

We propose a Noise and Color Bias Control module (NCBC Module) that contains a convolutional neural network and two loss functions (noise loss and color loss).

Low-Light Image Enhancement

Design Diversity for Improving Efficiency and Reducing Risk in Oil and Gas Well Stimulation under Uncertain Reservoir Conditions

no code implementations • 27 Oct 2020 • Cheng Cheng

Hydraulic fracturing stimulates fracture swarm in reservoir formation through pressurized injection fluid.


Pixel-Face: A Large-Scale, High-Resolution Benchmark for 3D Face Reconstruction

no code implementations • 28 Aug 2020 • Jiangjing Lyu, Xiaobo Li, Xiangyu Zhu, Cheng Cheng

It is also a challenging task due to the lack of high-quality datasets that can fuel current deep learning-based methods.

3D Face Reconstruction

Adaptive support driven Bayesian reweighted algorithm for sparse signal recovery

no code implementations • 10 Aug 2020 • Junlin Li, Wei Zhou, Cheng Cheng

For example, sparse Bayesian learning (SBL) was proposed to learn major features from a dictionary of basis functions, which makes identified models interpretable.

feature selection, Sparse Learning

Predicting Mortality Risk in Viral and Unspecified Pneumonia to Assist Clinicians with COVID-19 ECMO Planning

1 code implementation • 2 Jun 2020 • Helen Zhou, Cheng Cheng, Zachary C. Lipton, George H. Chen, Jeremy C. Weiss

Finally, the PEER score is provided in the form of a nomogram for direct calculation of patient risk, and can be used to highlight at-risk patients among critical care patients eligible for ECMO.


SUOD: Accelerating Large-Scale Unsupervised Heterogeneous Outlier Detection

1 code implementation • 11 Mar 2020 • Yue Zhao, Xiyang Hu, Cheng Cheng, Cong Wang, Changlin Wan, Wen Wang, Jianing Yang, Haoping Bai, Zheng Li, Cao Xiao, Yunlong Wang, Zhi Qiao, Jimeng Sun, Leman Akoglu

Outlier detection (OD) is a key machine learning (ML) task for identifying abnormal objects from general samples with numerous high-stake applications including fraud detection and intrusion detection.

Dimensionality Reduction, Fraud Detection +2

Ensemble emotion recognizing with multiple modal physiological signals

no code implementations • 1 Jan 2020 • Jing Zhang, Yong Zhang, Suhua Zhan, Cheng Cheng

Multiple physiological signals fusing models, building the uniform classification model by means of consistent and complementary information from different emotions to improve recognition performance.

Classification, EEG +3

Knowledge Distillation in Document Retrieval

no code implementations • 11 Nov 2019 • Siamak Shakeri, Abhinav Sethy, Cheng Cheng

In this paper we show that knowledge distillation can be used to encourage a model that generates claim independent document encodings to mimic the behavior of a more complex model which generates claim dependent encodings.

Knowledge Distillation, Retrieval

Combining Machine Learning Models using combo Library

1 code implementation • 21 Sep 2019 • Yue Zhao, Xuejian Wang, Cheng Cheng, Xueying Ding

Model combination, often regarded as a key sub-field of ensemble learning, has been widely used in both academic research and industry applications.

Anomaly Detection, BIG-bench Machine Learning +2

Oceanic Eddy Identification Using an AI Scheme

no code implementations • Remote Sensing 2019 • Guangjun Xu, Cheng Cheng, Wenxian Yang, Wenhong Xie, Lingmei Kong, Renlong Hang, Furong Ma, Changming Dong, Jingsong Yang

Oceanic eddies play an important role in global energy and material transport, and contribute greatly to nutrient and phytoplankton distribution.

Scene Parsing

A Novel GAN-based Fault Diagnosis Approach for Imbalanced Industrial Time Series

no code implementations • 1 Apr 2019 • Wenqian Jiang, Cheng Cheng, Beitong Zhou, Guijun Ma, Ye Yuan

This paper proposes a novel fault diagnosis approach based on generative adversarial networks (GAN) for imbalanced industrial time series where normal samples are much larger than failure cases.

Decoder, Time Series +1

Wasserstein Distance based Deep Adversarial Transfer Learning for Intelligent Fault Diagnosis

no code implementations • 2 Mar 2019 • Cheng Cheng, Beitong Zhou, Guijun Ma, Dongrui Wu, Ye Yuan

However, for diverse working conditions in the industry, deep learning suffers two difficulties: one is that the well-defined (source domain) and new (target domain) datasets are with different feature distributions; another one is the fact that insufficient or no labelled data in target domain significantly reduce the accuracy of fault diagnosis.

Transfer Learning

A General End-to-end Diagnosis Framework for Manufacturing Systems

no code implementations • 17 Dec 2018 • Ye Yuan, Guijun Ma, Cheng Cheng, Beitong Zhou, Huan Zhao, Hai-Tao Zhang, Han Ding

A central challenge in manufacturing sector lies in the requirement of a general framework to ensure satisfied diagnosis and monitoring performances in different manufacturing applications.


A deep learning-based remaining useful life prediction approach for bearings

1 code implementation • 8 Dec 2018 • Cheng Cheng, Guijun Ma, Yong Zhang, Mingyang Sun, Fei Teng, Han Ding, Ye Yuan

In industrial applications, nearly half the failures of motors are caused by the degradation of rolling element bearings (REBs).

A Deep Regression Architecture With Two-Stage Re-Initialization for High Performance Facial Landmark Detection

1 code implementation • CVPR 2017 • Jiangjing Lv, Xiaohu Shao, Junliang Xing, Cheng Cheng, Xi Zhou

At the global stage, given an image with a rough face detection result, the full face region is firstly re-initialized by a supervised spatial transformer network to a canonical shape state and then trained to regress a coarse landmark estimation.

Face Detection, Facial Landmark Detection +1
