Search Results for author: Zhenda Xie

Found 20 papers, 17 papers with code

DeepSeek-VL: Towards Real-World Vision-Language Understanding

2 code implementations • 8 Mar 2024 • Haoyu Lu, Wen Liu, Bo Zhang, Bingxuan Wang, Kai Dong, Bo Liu, Jingxiang Sun, Tongzheng Ren, Zhuoshu Li, Hao Yang, Yaofeng Sun, Chengqi Deng, Hanwei Xu, Zhenda Xie, Chong Ruan

The DeepSeek-VL family (both 1. 3B and 7B models) showcases superior user experiences as a vision-language chatbot in real-world applications, achieving state-of-the-art or competitive performance across a wide range of visual-language benchmarks at the same model size while maintaining robust performance on language-centric benchmarks.

Ranked #28 on Visual Question Answering on MM-Vet

Chatbot Language Modelling +3

1,497

Paper
Code

DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

1 code implementation • 25 Jan 2024 • Daya Guo, Qihao Zhu, Dejian Yang, Zhenda Xie, Kai Dong, Wentao Zhang, Guanting Chen, Xiao Bi, Y. Wu, Y. K. Li, Fuli Luo, Yingfei Xiong, Wenfeng Liang

The rapid development of large language models has revolutionized code intelligence in software development.

Ranked #11 on Code Generation on HumanEval

16k Code Generation +2

5,297

Paper
Code

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

1 code implementation • 11 Jan 2024 • Damai Dai, Chengqi Deng, Chenggang Zhao, R. X. Xu, Huazuo Gao, Deli Chen, Jiashi Li, Wangding Zeng, Xingkai Yu, Y. Wu, Zhenda Xie, Y. K. Li, Panpan Huang, Fuli Luo, Chong Ruan, Zhifang Sui, Wenfeng Liang

Subsequently, we scale up DeepSeekMoE to 16B parameters and show that it achieves comparable performance with LLaMA2 7B, with only about 40% of computations.

Language Modelling Large Language Model

815

Paper
Code

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

1 code implementation • 5 Jan 2024 • DeepSeek-AI, :, Xiao Bi, Deli Chen, Guanting Chen, Shanhuang Chen, Damai Dai, Chengqi Deng, Honghui Ding, Kai Dong, Qiushi Du, Zhe Fu, Huazuo Gao, Kaige Gao, Wenjun Gao, Ruiqi Ge, Kang Guan, Daya Guo, JianZhong Guo, Guangbo Hao, Zhewen Hao, Ying He, Wenjie Hu, Panpan Huang, Erhang Li, Guowei Li, Jiashi Li, Yao Li, Y. K. Li, Wenfeng Liang, Fangyun Lin, A. X. Liu, Bo Liu, Wen Liu, Xiaodong Liu, Xin Liu, Yiyuan Liu, Haoyu Lu, Shanghao Lu, Fuli Luo, Shirong Ma, Xiaotao Nie, Tian Pei, Yishi Piao, Junjie Qiu, Hui Qu, Tongzheng Ren, Zehui Ren, Chong Ruan, Zhangli Sha, Zhihong Shao, Junxiao Song, Xuecheng Su, Jingxiang Sun, Yaofeng Sun, Minghui Tang, Bingxuan Wang, Peiyi Wang, Shiyu Wang, Yaohui Wang, Yongji Wang, Tong Wu, Y. Wu, Xin Xie, Zhenda Xie, Ziwei Xie, Yiliang Xiong, Hanwei Xu, R. X. Xu, Yanhong Xu, Dejian Yang, Yuxiang You, Shuiping Yu, Xingkai Yu, B. Zhang, Haowei Zhang, Lecong Zhang, Liyue Zhang, Mingchuan Zhang, Minghua Zhang, Wentao Zhang, Yichao Zhang, Chenggang Zhao, Yao Zhao, Shangyan Zhou, Shunfeng Zhou, Qihao Zhu, Yuheng Zou

The rapid development of open-source large language models (LLMs) has been truly remarkable.

1,119

Paper
Code

DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior

1 code implementation • 25 Oct 2023 • Jingxiang Sun, Bo Zhang, Ruizhi Shao, Lizhen Wang, Wen Liu, Zhenda Xie, Yebin Liu

The score distillation from this 3D-aware diffusion prior provides view-consistent guidance for the scene.

3D Generation

1,793

Paper
Code

High speed free-space optical communication using standard fiber communication component without optical amplification

no code implementations • 27 Feb 2023 • Yao Zhang, Hua-Ying Liu, Xiaoyi Liu, Peng Xu, Xiang Dong, Pengfei Fan, Xiaohui Tian, Hua Yu, Dong Pan, Zhijun Yin, Guilu Long, Shi-Ning Zhu, Zhenda Xie

Free-space optical communication (FSO) can achieve fast, secure and license-free communication without need for physical cables, making it a cost-effective, energy-efficient and flexible solution when the fiber connection is unavailable.

Paper
Add Code

iCLIP: Bridging Image Classification and Contrastive Language-Image Pre-Training for Visual Recognition

no code implementations • CVPR 2023 • Yixuan Wei, Yue Cao, Zheng Zhang, Houwen Peng, Zhuliang Yao, Zhenda Xie, Han Hu, Baining Guo

This paper presents a method that effectively combines two prevalent visual recognition methods, i. e., image classification and contrastive language-image pre-training, dubbed iCLIP.

Classification Image Classification +2

Paper
Add Code

Improving CLIP Fine-tuning Performance

1 code implementation • ICCV 2023 • Yixuan Wei, Han Hu, Zhenda Xie, Ze Liu, Zheng Zhang, Yue Cao, Jianmin Bao, Dong Chen, Baining Guo

Experiments suggest that the feature map distillation approach significantly boosts the fine-tuning performance of CLIP models on several typical downstream vision tasks.

object-detection Object Detection +1

217

Paper
Code

On Data Scaling in Masked Image Modeling

1 code implementation • CVPR 2023 • Zhenda Xie, Zheng Zhang, Yue Cao, Yutong Lin, Yixuan Wei, Qi Dai, Han Hu

Our study reveals that: (i) Masked image modeling is also demanding on larger data.

Self-Supervised Learning

12,930

Paper
Code

Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via Feature Distillation

1 code implementation • 27 May 2022 • Yixuan Wei, Han Hu, Zhenda Xie, Zheng Zhang, Yue Cao, Jianmin Bao, Dong Chen, Baining Guo

These properties, which we aggregately refer to as optimization friendliness, are identified and analyzed by a set of attention- and optimization-related diagnosis tools.

Ranked #2 on Instance Segmentation on COCO test-dev (using extra training data)

Contrastive Learning Image Classification +5

217

Paper
Code

Revealing the Dark Secrets of Masked Image Modeling

1 code implementation • CVPR 2023 • Zhenda Xie, Zigang Geng, Jingcheng Hu, Zheng Zhang, Han Hu, Yue Cao

In this paper, we compare MIM with the long-dominant supervised pre-trained models from two perspectives, the visualizations and the experiments, to uncover their key representational differences.

Ranked #3 on Depth Estimation on NYU-Depth V2

Inductive Bias Monocular Depth Estimation +3

153

Paper
Code

iCAR: Bridging Image Classification and Image-text Alignment for Visual Recognition

no code implementations • 22 Apr 2022 • Yixuan Wei, Yue Cao, Zheng Zhang, Zhuliang Yao, Zhenda Xie, Han Hu, Baining Guo

Second, we convert the image classification problem from learning parametric category classifier weights to learning a text encoder as a meta network to generate category classifier weights.

Action Recognition Classification +7

Paper
Add Code

SimMIM: A Simple Framework for Masked Image Modeling

4 code implementations • CVPR 2022 • Zhenda Xie, Zheng Zhang, Yue Cao, Yutong Lin, Jianmin Bao, Zhuliang Yao, Qi Dai, Han Hu

We also leverage this approach to facilitate the training of a 3B model (SwinV2-G), that by $40\times$ less data than that in previous practice, we achieve the state-of-the-art on four representative vision benchmarks.

Ranked #10 on Self-Supervised Image Classification on ImageNet (finetuned)

Representation Learning Self-Supervised Image Classification +1

868

Paper
Code

Swin Transformer V2: Scaling Up Capacity and Resolution

19 code implementations • CVPR 2022 • Ze Liu, Han Hu, Yutong Lin, Zhuliang Yao, Zhenda Xie, Yixuan Wei, Jia Ning, Yue Cao, Zheng Zhang, Li Dong, Furu Wei, Baining Guo

Three main techniques are proposed: 1) a residual-post-norm method combined with cosine attention to improve training stability; 2) A log-spaced continuous position bias method to effectively transfer models pre-trained using low-resolution images to downstream tasks with high-resolution inputs; 3) A self-supervised pre-training method, SimMIM, to reduce the needs of vast labeled images.

Ranked #4 on Image Classification on ImageNet V2 (using extra training data)

Action Classification Image Classification +3

29,735

Paper
Code

Breaking Shortcut: Exploring Fully Convolutional Cycle-Consistency for Video Correspondence Learning

1 code implementation • 12 May 2021 • Yansong Tang, Zhenyu Jiang, Zhenda Xie, Yue Cao, Zheng Zhang, Philip H. S. Torr, Han Hu

Previous cycle-consistency correspondence learning methods usually leverage image patches for training.

Landmark Tracking Pose Tracking +3

Paper
Code

Self-Supervised Learning with Swin Transformers

6 code implementations • 10 May 2021 • Zhenda Xie, Yutong Lin, Zhuliang Yao, Zheng Zhang, Qi Dai, Yue Cao, Han Hu

We are witnessing a modeling shift from CNN to Transformers in computer vision.

Ranked #75 on Self-Supervised Image Classification on ImageNet

object-detection Object Detection +3

12,930

Paper
Code

Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning

7 code implementations • CVPR 2021 • Zhenda Xie, Yutong Lin, Zheng Zhang, Yue Cao, Stephen Lin, Han Hu

We argue that the power of contrastive learning has yet to be fully unleashed, as current methods are trained only on instance-level pretext tasks, leading to representations that may be sub-optimal for downstream tasks requiring dense pixel predictions.

Contrastive Learning object-detection +3

328

Paper
Code

Parametric Instance Classification for Unsupervised Visual Feature Learning

1 code implementation • NeurIPS 2020 • Yue Cao, Zhenda Xie, Bin Liu, Yutong Lin, Zheng Zhang, Han Hu

This paper presents parametric instance classification (PIC) for unsupervised visual feature learning.

Classification General Classification

Paper
Code

Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation

1 code implementation • ECCV 2020 • Zhenda Xie, Zheng Zhang, Xizhou Zhu, Gao Huang, Stephen Lin

In the feature maps of CNNs, there commonly exists considerable spatial redundancy that leads to much repetitive processing.

Paper
Code

Local Relation Networks for Image Recognition

3 code implementations • ICCV 2019 • Han Hu, Zheng Zhang, Zhenda Xie, Stephen Lin

The convolution layer has been the dominant feature extractor in computer vision for years.

Ranked #864 on Image Classification on ImageNet

General Classification Image Classification +2

12,930

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.