1 code implementation • 1 Dec 2023 • Afifa Khaled, Chao Li, Jia Ning, Kun He
Normalization techniques have been widely used in deep learning for their ability to enable higher learning rates and to reduce sensitivity to initialization.
1 code implementation • 27 Oct 2023 • Houwen Peng, Kan Wu, Yixuan Wei, Guoshuai Zhao, Yuxiang Yang, Ze Liu, Yifan Xiong, Ziyue Yang, Bolin Ni, Jingcheng Hu, Ruihang Li, Miaosen Zhang, Chen Li, Jia Ning, Ruizhe Wang, Zheng Zhang, Shuguang Liu, Joe Chau, Han Hu, Peng Cheng
In this paper, we explore FP8 low-bit data formats for efficient training of large language models (LLMs).
no code implementations • 9 Feb 2023 • Qi Chen, Chao Li, Jia Ning, Stephen Lin, Kun He
Inspired by the property that effective receptive fields (ERFs) typically exhibit a Gaussian distribution, we propose a Gaussian Mask convolutional kernel (GMConv) in this work.
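A minimal sketch of the GMConv idea, not the authors' implementation: a standard convolution kernel is modulated by a Gaussian mask so the effective receptive field concentrates near the kernel center. The class name, the per-layer learnable sigma, and the isotropic mask are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaussianMaskedConv2d(nn.Module):
    """Conv2d whose kernel is modulated by an isotropic Gaussian mask
    (rough sketch of the GMConv idea; sigma is learned per layer here,
    which is an assumption, not necessarily the paper's design)."""

    def __init__(self, in_ch, out_ch, kernel_size=5, padding=2):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, padding=padding, bias=False)
        self.log_sigma = nn.Parameter(torch.zeros(1))  # learnable spread
        # Fixed grid of squared distances from the kernel center.
        coords = torch.arange(kernel_size) - (kernel_size - 1) / 2
        yy, xx = torch.meshgrid(coords, coords, indexing="ij")
        self.register_buffer("dist2", xx ** 2 + yy ** 2)

    def forward(self, x):
        sigma = self.log_sigma.exp()
        mask = torch.exp(-self.dist2 / (2 * sigma ** 2))  # (k, k)
        weight = self.conv.weight * mask                  # broadcast over (out, in, k, k)
        return F.conv2d(x, weight, padding=self.conv.padding[0])


x = torch.randn(1, 16, 32, 32)
layer = GaussianMaskedConv2d(16, 32)
print(layer(x).shape)  # torch.Size([1, 32, 32, 32])
```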
1 code implementation • ICCV 2023 • Jia Ning, Chen Li, Zheng Zhang, Zigang Geng, Qi Dai, Kun He, Han Hu
With these new techniques and other designs, we show that the proposed general-purpose task-solver can perform both instance segmentation and depth estimation well.
Ranked #20 on Monocular Depth Estimation on NYU-Depth V2
no code implementations • 10 Apr 2022 • Chao Li, Jia Ning, Han Hu, Kun He
Differentiable architecture search (DARTS) has attracted much attention due to its simplicity and significant improvement in efficiency.
20 code implementations • CVPR 2022 • Ze Liu, Han Hu, Yutong Lin, Zhuliang Yao, Zhenda Xie, Yixuan Wei, Jia Ning, Yue Cao, Zheng Zhang, Li Dong, Furu Wei, Baining Guo
Three main techniques are proposed: 1) a residual-post-norm method combined with cosine attention to improve training stability; 2) a log-spaced continuous position bias method to effectively transfer models pre-trained on low-resolution images to downstream tasks with high-resolution inputs; 3) a self-supervised pre-training method, SimMIM, to reduce the need for vast amounts of labeled images.
Ranked #4 on Image Classification on ImageNet V2 (using extra training data)
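As a rough illustration of the cosine attention mentioned in the Swin Transformer V2 entry above, here is a minimal sketch: attention logits are the cosine similarity between queries and keys scaled by a learnable per-head temperature rather than 1/sqrt(d). The tensor shapes, clamping constant, and temperature parameterization are assumptions, not the paper's exact code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def cosine_attention(q, k, v, logit_scale):
    """Scaled cosine attention sketch: cosine similarity between
    L2-normalized queries and keys, multiplied by a learnable,
    clamped per-head temperature, then softmax-weighted over values.

    q, k, v: (batch, heads, tokens, dim)."""
    q = F.normalize(q, dim=-1)
    k = F.normalize(k, dim=-1)
    scale = torch.clamp(logit_scale, max=4.6052).exp()  # cap at log(100) ~ 4.6052
    attn = (q @ k.transpose(-2, -1)) * scale
    return attn.softmax(dim=-1) @ v


B, H, N, D = 2, 4, 49, 32
q, k, v = (torch.randn(B, H, N, D) for _ in range(3))
logit_scale = nn.Parameter(torch.log(10 * torch.ones(H, 1, 1)))  # per-head temperature
out = cosine_attention(q, k, v, logit_scale)
print(out.shape)  # torch.Size([2, 4, 49, 32])
```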
14 code implementations • CVPR 2022 • Ze Liu, Jia Ning, Yue Cao, Yixuan Wei, Zheng Zhang, Stephen Lin, Han Hu
The vision community is witnessing a modeling shift from CNNs to Transformers, where pure Transformer architectures have attained top accuracy on the major video recognition benchmarks.
Ranked #28 on Action Classification on Kinetics-600 (using extra training data)