Search Results for author: Kaiyong Zhao

Found 13 papers, 7 with code

FusionAI: Decentralized Training and Deploying LLMs with Massive Consumer-Level GPUs

no code implementations • 3 Sep 2023 • Zhenheng Tang, Yuxin Wang, Xin He, Longteng Zhang, Xinglin Pan, Qiang Wang, Rongfei Zeng, Kaiyong Zhao, Shaohuai Shi, Bingsheng He, Xiaowen Chu

The rapid growth of memory and computation requirements of large language models (LLMs) has outpaced the development of hardware, hindering people who lack large-scale high-end GPUs from training or deploying LLMs.

Scheduling

Rethinking Disparity: A Depth Range Free Multi-View Stereo Based on Disparity

1 code implementation • 30 Nov 2022 • Qingsong Yan, Qiang Wang, Kaiyong Zhao, Bo Li, Xiaowen Chu, Fei Deng

Existing learning-based multi-view stereo (MVS) methods rely on the depth range to build the 3D cost volume and may fail when the range is too large or unreliable.
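
The depth-range dependence is easy to see with a small sketch (illustrative, not the paper's code): a plane-sweep cost volume spends a fixed budget of depth hypotheses across the given range, so a range that is too wide makes the sampling coarse. The uniform inverse-depth sampling below is a common MVS convention, assumed here purely for illustration.

```python
import numpy as np

def depth_hypotheses(d_min, d_max, num_planes=64):
    """Sample candidate depths for a plane-sweep cost volume.

    Uniform sampling in inverse depth, a common MVS convention
    (illustrative; not the sampling used in the paper).
    """
    inv_depth = np.linspace(1.0 / d_max, 1.0 / d_min, num_planes)
    return 1.0 / inv_depth

# A tight, reliable range gives finely spaced hypotheses...
print(np.abs(np.diff(depth_hypotheses(1.0, 2.0))).max())
# ...while a very wide (or wrong) range spreads the same budget thinly,
# which is the failure mode a disparity-based formulation sidesteps.
print(np.abs(np.diff(depth_hypotheses(0.5, 100.0))).max())
```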

SphereDepth: Panorama Depth Estimation from Spherical Domain

no code implementations • 29 Aug 2022 • Qingsong Yan, Qiang Wang, Kaiyong Zhao, Bo Li, Xiaowen Chu, Fei Deng

A panorama image simultaneously captures complete information about the surrounding environment and has many advantages in virtual tourism, games, robotics, etc.

Depth Estimation

EASNet: Searching Elastic and Accurate Network Architecture for Stereo Matching

1 code implementation • 20 Jul 2022 • Qiang Wang, Shaohuai Shi, Kaiyong Zhao, Xiaowen Chu

However, existing NAS studies on dense prediction tasks, especially stereo matching, still cannot be deployed efficiently and effectively on devices with different computing capabilities.

Image Classification · Neural Architecture Search +3

FADNet++: Real-Time and Accurate Disparity Estimation with Configurable Networks

no code implementations • 6 Oct 2021 • Qiang Wang, Shaohuai Shi, Shizhen Zheng, Kaiyong Zhao, Xiaowen Chu

The disparity estimation problem tends to be addressed by DNNs, which achieve much better prediction accuracy than traditional hand-crafted feature-based methods.

Disparity Estimation

Layer-wise Adaptive Gradient Sparsification for Distributed Deep Learning with Convergence Guarantees

no code implementations • 20 Nov 2019 • Shaohuai Shi, Zhenheng Tang, Qiang Wang, Kaiyong Zhao, Xiaowen Chu

To reduce the long training time of large deep neural network (DNN) models, distributed synchronous stochastic gradient descent (S-SGD) is commonly used on a cluster of workers.

Distributed Optimization
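
For intuition, here is a minimal sketch of per-layer top-k gradient sparsification, the building block this paper analyzes; the adaptive layer-wise selection and the convergence guarantees are the paper's contribution and are not reproduced here. The `density` parameter and the residual handling are illustrative assumptions.

```python
import numpy as np

def topk_sparsify(grad, density=0.01):
    """Keep only the largest-magnitude entries of one layer's gradient."""
    flat = grad.ravel()
    k = max(1, int(density * flat.size))
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    sparse = np.zeros_like(flat)
    sparse[idx] = flat[idx]
    sparse = sparse.reshape(grad.shape)
    return sparse, grad - sparse  # sparsified gradient and local residual

# In S-SGD each worker would sparsify its per-layer gradients before
# communication and fold the residual into the next iteration's gradient.
layer_grad = np.random.randn(256, 128)
sparse_grad, residual = topk_sparsify(layer_grad, density=0.01)
```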

Benchmarking the Performance and Energy Efficiency of AI Accelerators for AI Training

no code implementations • 15 Sep 2019 • Yuxin Wang, Qiang Wang, Shaohuai Shi, Xin He, Zhenheng Tang, Kaiyong Zhao, Xiaowen Chu

Different from existing end-to-end benchmarks, which only present the training time, we investigate the impact of hardware, the vendor's software library, and the deep learning framework on the performance and energy consumption of AI training.

Benchmarking
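
As a rough illustration of how training energy can be measured, the sketch below polls GPU power draw with nvidia-smi and integrates it over a window; this is a generic technique, not necessarily the paper's methodology, and it assumes an NVIDIA GPU with nvidia-smi on the PATH.

```python
import subprocess
import time

def sample_gpu_power():
    """Read instantaneous power draw (watts) of each visible GPU."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=power.draw",
         "--format=csv,noheader,nounits"])
    return [float(v) for v in out.decode().split()]

# Poll during a (here: 10-second) measurement window and integrate with
# the rectangle rule; in practice this would run in a background thread
# alongside the actual training loop.
interval, t_end, samples = 1.0, time.time() + 10.0, []
while time.time() < t_end:
    samples.append(sum(sample_gpu_power()))
    time.sleep(interval)
print(f"approx. energy: {sum(samples) * interval:.0f} J")
```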

AutoML: A Survey of the State-of-the-Art

2 code implementations • 2 Aug 2019 • Xin He, Kaiyong Zhao, Xiaowen Chu

Deep learning (DL) techniques have penetrated all aspects of our lives and brought us great convenience.

Feature Engineering · Hyperparameter Optimization +1

A Distributed Synchronous SGD Algorithm with Global Top-$k$ Sparsification for Low Bandwidth Networks

1 code implementation • 14 Jan 2019 • Shaohuai Shi, Qiang Wang, Kaiyong Zhao, Zhenheng Tang, Yuxin Wang, Xiang Huang, Xiaowen Chu

Current methods that use AllGather to accumulate the sparse gradients have a communication complexity of $O(kP)$, where $P$ is the number of workers, which is inefficient on low bandwidth networks with a large number of workers.
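
The communication argument can be made concrete with a toy sketch (illustrative, not the paper's implementation): with AllGather, every worker receives all $P$ workers' $k$-sparse gradients, i.e. up to $kP$ (index, value) pairs, whereas the paper's global top-$k$ approach keeps only $k$ values globally.

```python
def allgather_accumulate(worker_sparse):
    """Naive AllGather-style accumulation of per-worker top-k gradients.

    Each worker receives k (index, value) pairs from each of the P
    workers, an O(kP) communication volume per worker; the accumulated
    result can contain up to kP non-zero coordinates.
    """
    accumulated = {}
    for pairs in worker_sparse:          # one k-sized message per worker
        for idx, val in pairs:
            accumulated[idx] = accumulated.get(idx, 0.0) + val
    return accumulated

# Two workers (P = 2), k = 2 selected coordinates each.
w0 = [(3, 0.9), (7, -0.5)]
w1 = [(3, 0.4), (1, 0.8)]
print(allgather_accumulate([w0, w1]))    # {3: 1.3, 7: -0.5, 1: 0.8}
```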
