Search Results for author: Houwen Peng

Found 32 papers, 26 papers with code

Common 7B Language Models Already Possess Strong Math Capabilities

no code implementations • 7 Mar 2024 • Chen Li, Weiqi Wang, Jingcheng Hu, Yixuan Wei, Nanning Zheng, Han Hu, Zheng Zhang, Houwen Peng

This paper shows that the LLaMA-2 7B model with common pre-training already exhibits strong mathematical abilities, as evidenced by its impressive accuracy of 97. 7% and 72. 0% on the GSM8K and MATH benchmarks, respectively, when selecting the best response from 256 random generations.

GSM8K Math

Paper
Add Code

FP8-LM: Training FP8 Large Language Models

1 code implementation • 27 Oct 2023 • Houwen Peng, Kan Wu, Yixuan Wei, Guoshuai Zhao, Yuxiang Yang, Ze Liu, Yifan Xiong, Ziyue Yang, Bolin Ni, Jingcheng Hu, Ruihang Li, Miaosen Zhang, Chen Li, Jia Ning, Ruizhe Wang, Zheng Zhang, Shuguang Liu, Joe Chau, Han Hu, Peng Cheng

In this paper, we explore FP8 low-bit data formats for efficient training of large language models (LLMs).

458

Paper
Code

TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance

1 code implementation • ICCV 2023 • Kan Wu, Houwen Peng, Zhenghong Zhou, Bin Xiao, Mengchen Liu, Lu Yuan, Hong Xuan, Michael Valenzuela, Xi, Chen, Xinggang Wang, Hongyang Chao, Han Hu

In this paper, we propose a novel cross-modal distillation method, called TinyCLIP, for large-scale language-image pre-trained models.

1,566

Paper
Code

Exploring Lightweight Hierarchical Vision Transformers for Efficient Visual Tracking

no code implementations • ICCV 2023 • Ben Kang, Xin Chen, Dong Wang, Houwen Peng, Huchuan Lu

The Bridge Module incorporates the high-level information of deep features into the shallow large-resolution features.

Position Visual Tracking

Paper
Add Code

EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention

3 code implementations • CVPR 2023 • Xinyu Liu, Houwen Peng, Ningxin Zheng, Yuqing Yang, Han Hu, Yixuan Yuan

Comprehensive experiments demonstrate EfficientViT outperforms existing efficient models, striking a good trade-off between speed and accuracy.

29,846

Paper
Code

Unified Sequence-to-Sequence Learning for Single- and Multi-Modal Visual Object Tracking

1 code implementation • CVPR 2023 • Xin Chen, Ben Kang, Jiawen Zhu, Dong Wang, Houwen Peng, Huchuan Lu

In this paper, we introduce a new sequence-to-sequence learning framework for RGB-based and multi-modal object tracking.

Ranked #1 on Rgb-T Tracking on LasHeR

Decoder Object +2

Paper
Code

iCLIP: Bridging Image Classification and Contrastive Language-Image Pre-Training for Visual Recognition

no code implementations • CVPR 2023 • Yixuan Wei, Yue Cao, Zheng Zhang, Houwen Peng, Zhuliang Yao, Zhenda Xie, Han Hu, Baining Guo

This paper presents a method that effectively combines two prevalent visual recognition methods, i. e., image classification and contrastive language-image pre-training, dubbed iCLIP.

Classification Image Classification +2

Paper
Add Code

Attentive Mask CLIP

1 code implementation • ICCV 2023 • Yifan Yang, Weiquan Huang, Yixuan Wei, Houwen Peng, Xinyang Jiang, Huiqiang Jiang, Fangyun Wei, Yin Wang, Han Hu, Lili Qiu, Yuqing Yang

To address this issue, we propose an attentive token removal approach for CLIP training, which retains tokens with a high semantic correlation to the text description.

Contrastive Learning Retrieval +1

Paper
Code

Expanding Language-Image Pretrained Models for General Video Recognition

2 code implementations • 4 Aug 2022 • Bolin Ni, Houwen Peng, Minghao Chen, Songyang Zhang, Gaofeng Meng, Jianlong Fu, Shiming Xiang, Haibin Ling

Extensive experiments demonstrate that our approach is effective and can be generalized to different video recognition scenarios.

Ranked #8 on Zero-Shot Action Recognition on Kinetics

Action Classification Action Recognition +3

932

Paper
Code

TinyViT: Fast Pretraining Distillation for Small Vision Transformers

2 code implementations • 21 Jul 2022 • Kan Wu, Jinnian Zhang, Houwen Peng, Mengchen Liu, Bin Xiao, Jianlong Fu, Lu Yuan

It achieves a top-1 accuracy of 84. 8% on ImageNet-1k with only 21M parameters, being comparable to Swin-B pretrained on ImageNet-21k while using 4. 2 times fewer parameters.

Ranked #135 on Image Classification on ImageNet

Image Classification Knowledge Distillation

29,846

Paper
Code

PointNeXt: Revisiting PointNet++ with Improved Training and Scaling Strategies

3 code implementations • 9 Jun 2022 • Guocheng Qian, Yuchen Li, Houwen Peng, Jinjie Mai, Hasan Abed Al Kader Hammoud, Mohamed Elhoseiny, Bernard Ghanem

In this work, we revisit the classical PointNet++ through a systematic study of model training and scaling strategies, and offer two major contributions.

Ranked #3 on 3D Semantic Segmentation on OpenTrench3D

3D Classification 3D Part Segmentation +3

703

Paper
Code

MiniViT: Compressing Vision Transformers with Weight Multiplexing

2 code implementations • CVPR 2022 • Jinnian Zhang, Houwen Peng, Kan Wu, Mengchen Liu, Bin Xiao, Jianlong Fu, Lu Yuan

The central idea of MiniViT is to multiplex the weights of consecutive transformer blocks.

Ranked #212 on Image Classification on ImageNet (using extra training data)

Image Classification

1,565

Paper
Code

Searching the Search Space of Vision Transformer

2 code implementations • NeurIPS 2021 • Minghao Chen, Kan Wu, Bolin Ni, Houwen Peng, Bei Liu, Jianlong Fu, Hongyang Chao, Haibin Ling

Vision Transformer has shown great visual representation power in substantial vision tasks such as recognition and detection, and thus been attracting fast-growing efforts on manually designing more effective architectures.

Neural Architecture Search object-detection +4

1,565

Paper
Code

Learning to Track Objects from Unlabeled Videos

1 code implementation • ICCV 2021 • Jilai Zheng, Chao Ma, Houwen Peng, Xiaokang Yang

In this paper, we propose to learn an Unsupervised Single Object Tracker (USOT) from scratch.

Object Discovery Optical Flow Estimation

Paper
Code

Rethinking and Improving Relative Position Encoding for Vision Transformer

1 code implementation • ICCV 2021 • Kan Wu, Houwen Peng, Minghao Chen, Jianlong Fu, Hongyang Chao

We then propose new relative position encoding methods dedicated to 2D images, called image RPE (iRPE).

Ranked #140 on Object Detection on COCO minival

Image Classification Object Detection +1

1,565

Paper
Code

AutoFormer: Searching Transformers for Visual Recognition

2 code implementations • ICCV 2021 • Minghao Chen, Houwen Peng, Jianlong Fu, Haibin Ling

Specifically, the performance of these subnets with weights inherited from the supernet is comparable to those retrained from scratch.

Ranked #1 on Fine-Grained Image Classification on Oxford 102 Flowers (Top 1 Accuracy metric)

AutoML Fine-Grained Image Classification

1,565

Paper
Code

Probing Inter-modality: Visual Parsing with Self-Attention for Vision-Language Pre-training

no code implementations • NeurIPS 2021 • Hongwei Xue, Yupan Huang, Bei Liu, Houwen Peng, Jianlong Fu, Houqiang Li, Jiebo Luo

To tackle this, we propose a fully Transformer visual embedding for VLP to better learn visual relation and further promote inter-modal alignment.

Question Answering Relation +5

Paper
Add Code

Probing Inter-modality: Visual Parsing with Self-Attention for Vision-and-Language Pre-training

no code implementations • NeurIPS 2021 • Hongwei Xue, Yupan Huang, Bei Liu, Houwen Peng, Jianlong Fu, Houqiang Li, Jiebo Luo

To tackle this, we propose a fully Transformer visual embedding for VLP to better learn visual relation and further promote inter-modal alignment.

Question Answering Relation +3

Paper
Add Code

LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search

1 code implementation • CVPR 2021 • Bin Yan, Houwen Peng, Kan Wu, Dong Wang, Jianlong Fu, Huchuan Lu

Object tracking has achieved significant progress over the past few years.

Neural Architecture Search Object +1

381

Paper
Code

One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search Space Shrinking

1 code implementation • CVPR 2021 • Minghao Chen, Houwen Peng, Jianlong Fu, Haibin Ling

In this paper, we propose a one-shot neural ensemble architecture search (NEAS) solution that addresses the two challenges.

Neural Architecture Search

Paper
Code

Learning Spatio-Temporal Transformer for Visual Tracking

1 code implementation • ICCV 2021 • Bin Yan, Houwen Peng, Jianlong Fu, Dong Wang, Huchuan Lu

In this paper, we present a new tracking architecture with an encoder-decoder transformer as the key component.

Ranked #19 on Visual Object Tracking on TrackingNet

Decoder Visual Object Tracking +1

612

Paper
Code

Multi-Scale 2D Temporal Adjacent Networks for Moment Localization with Natural Language

1 code implementation • 4 Dec 2020 • Songyang Zhang, Houwen Peng, Jianlong Fu, Yijuan Lu, Jiebo Luo

It is a challenging problem because a target moment may take place in the context of other temporal moments in the untrimmed video.

932

Paper
Code

Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search

2 code implementations • NeurIPS 2020 • Houwen Peng, Hao Du, Hongyuan Yu, Qi Li, Jing Liao, Jianlong Fu

The experiments on ImageNet verify such path distillation method can improve the convergence ratio and performance of the hypernetwork, as well as boosting the training of subnetworks.

Neural Architecture Search object-detection +1

1,565

Paper
Code

Revisiting Anchor Mechanisms for Temporal Action Localization

1 code implementation • 22 Aug 2020 • Le Yang, Houwen Peng, Dingwen Zhang, Jianlong Fu, Junwei Han

To address this problem, this paper proposes a novel anchor-free action localization module that assists action localization by temporal points.

Temporal Action Localization

Paper
Code

Towards Accurate Pixel-wise Object Tracking by Attention Retrieval

1 code implementation • 6 Aug 2020 • Zhipeng Zhang, Bing Li, Weiming Hu, Houwen Peng

We first build a look-up-table (LUT) with the ground-truth mask in the starting frame, and then retrieves the LUT to obtain an attention map for spatial constraints.

Object Object Tracking +2

603

Paper
Code

Cyclic Differentiable Architecture Search

3 code implementations • 18 Jun 2020 • Hongyuan Yu, Houwen Peng, Yan Huang, Jianlong Fu, Hao Du, Liang Wang, Haibin Ling

First, the search network generates an initial architecture for evaluation, and the weights of the evaluation network are optimized.

Ranked #17 on Neural Architecture Search on NAS-Bench-201, CIFAR-10

Neural Architecture Search

1,565

Paper
Code

Ocean: Object-aware Anchor-free Tracking

4 code implementations • ECCV 2020 • Zhipeng Zhang, Houwen Peng, Jianlong Fu, Bing Li, Weiming Hu

In this paper, we propose a novel object-aware anchor-free network to address this issue.

Ranked #2 on Visual Object Tracking on VOT2019

Object Visual Object Tracking

748

Paper
Code

A Transductive Approach for Video Object Segmentation

1 code implementation • CVPR 2020 • Yizhuo Zhang, Zhirong Wu, Houwen Peng, Stephen Lin

Semi-supervised video object segmentation aims to separate a target object from a video sequence, given the mask in the first frame.

Ranked #15 on Semi-Supervised Video Object Segmentation on DAVIS (no YouTube-VOS training)

Instance Segmentation Object +4

155

Paper
Code

Learning 2D Temporal Adjacent Networks for Moment Localization with Natural Language

3 code implementations • 8 Dec 2019 • Songyang Zhang, Houwen Peng, Jianlong Fu, Jiebo Luo

We address the problem of retrieving a specific moment from an untrimmed video by a query sentence.

Sentence

932

Paper
Code

Learning Sparse 2D Temporal Adjacent Networks for Temporal Action Localization

2 code implementations • 8 Dec 2019 • Songyang Zhang, Houwen Peng, Le Yang, Jianlong Fu, Jiebo Luo

In this report, we introduce the Winner method for HACS Temporal Action Localization Challenge 2019.

Temporal Action Localization

932

Paper
Code

Deeper and Wider Siamese Networks for Real-Time Visual Tracking

5 code implementations • CVPR 2019 • Zhipeng Zhang, Houwen Peng

Siamese networks have drawn great attention in visual tracking because of their balanced accuracy and speed.

Ranked #2 on Visual Object Tracking on VOT2017

Real-Time Visual Tracking Visual Object Tracking

748

Paper
Code

Illumination Estimation Based on Bilayer Sparse Coding

no code implementations • CVPR 2013 • Bing Li, Weihua Xiong, Weiming Hu, Houwen Peng

In this paper, we propose a novel bilayer sparse coding model for illumination estimation that considers image similarity in terms of both low level color distribution and high level image scene content simultaneously.

Color Constancy

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.