Search Results for author: Wuyang Chen

Found 26 papers, 20 papers with code

Transferable and Principled Efficiency for Open-Vocabulary Segmentation

2 code implementations • 11 Apr 2024 • Jingxuan Xu, Wuyang Chen, Yao Zhao, Yunchao Wei

In the context of efficient OVS, we target achieving performance that is comparable to or even better than prior OVS works based on large vision-language foundation models, by utilizing smaller models that incur lower training costs.

Model Compression

Principled Architecture-aware Scaling of Hyperparameters

1 code implementation • 27 Feb 2024 • Wuyang Chen, Junru Wu, Zhangyang Wang, Boris Hanin

However, most designs or optimization methods are agnostic to the choice of network structures, and thus largely ignore the impact of neural architectures on hyperparameters.

AutoML

Data-Efficient Operator Learning via Unsupervised Pretraining and In-Context Learning

no code implementations • 24 Feb 2024 • Wuyang Chen, Jialin Song, Pu Ren, Shashank Subramanian, Dmitriy Morozov, Michael W. Mahoney

To reduce the need for training data with simulated solutions, we pretrain neural operators on unlabeled PDE data using reconstruction-based proxy tasks.

In-Context Learning • Operator learning

Mixture-of-Experts Meets Instruction Tuning: A Winning Combination for Large Language Models

no code implementations • 24 May 2023 • Sheng Shen, Le Hou, Yanqi Zhou, Nan Du, Shayne Longpre, Jason Wei, Hyung Won Chung, Barret Zoph, William Fedus, Xinyun Chen, Tu Vu, Yuexin Wu, Wuyang Chen, Albert Webson, Yunxuan Li, Vincent Zhao, Hongkun Yu, Kurt Keutzer, Trevor Darrell, Denny Zhou

Sparse Mixture-of-Experts (MoE) is a neural architecture design that can be utilized to add learnable parameters to Large Language Models (LLMs) without increasing inference cost.

Zero-shot Generalization

Lifelong Language Pretraining with Distribution-Specialized Experts

no code implementations • 20 May 2023 • Wuyang Chen, Yanqi Zhou, Nan Du, Yanping Huang, James Laudon, Zhifeng Chen, Claire Cui

Compared to existing lifelong learning approaches, Lifelong-MoE achieves better few-shot performance on 19 downstream NLP tasks.

Data-Model-Circuit Tri-Design for Ultra-Light Video Intelligence on Edge Devices

no code implementations • 16 Oct 2022 • Yimeng Zhang, Akshay Karkal Kamath, Qiucheng Wu, Zhiwen Fan, Wuyang Chen, Zhangyang Wang, Shiyu Chang, Sijia Liu, Cong Hao

In this paper, we propose a data-model-hardware tri-design framework for high-throughput, low-cost, and high-accuracy multi-object tracking (MOT) on High-Definition (HD) video stream.

Model Compression • Multi-Object Tracking

Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis

2 code implementations • 11 May 2022 • Wuyang Chen, Wei Huang, Xinyu Gong, Boris Hanin, Zhangyang Wang

Advanced deep neural networks (DNNs), designed by either human or AutoML algorithms, are growing increasingly complex.

Neural Architecture Search

Auto-scaling Vision Transformers without Training

1 code implementation • ICLR 2022 • Wuyang Chen, Wei Huang, Xianzhi Du, Xiaodan Song, Zhangyang Wang, Denny Zhou

The motivation comes from two pain spots: 1) the lack of efficient and principled methods for designing and scaling ViTs; 2) the tremendous computational cost of training ViT that is much heavier than its convolution counterpart.

A Simple Single-Scale Vision Transformer for Object Localization and Instance Segmentation

3 code implementations • 17 Dec 2021 • Wuyang Chen, Xianzhi Du, Fan Yang, Lucas Beyer, Xiaohua Zhai, Tsung-Yi Lin, Huizhong Chen, Jing Li, Xiaodan Song, Zhangyang Wang, Denny Zhou

In this paper, we comprehensively study three architecture design choices on ViT -- spatial reduction, doubled channels, and multiscale features -- and demonstrate that a vanilla ViT architecture can fulfill this goal without handcrafting multiscale features, maintaining the original ViT design philosophy.

Image Classification • Instance Segmentation • +6

Learning Pruning-Friendly Networks via Frank-Wolfe: One-Shot, Any-Sparsity, And No Retraining

1 code implementation • ICLR 2022 • Lu Miao, Xiaolong Luo, Tianlong Chen, Wuyang Chen, Dong Liu, Zhangyang Wang

Conventional methods often require (iterative) pruning followed by re-training, which not only incurs large overhead beyond the original DNN training but also can be sensitive to retraining hyperparameters.

Font Completion and Manipulation by Cycling Between Multi-Modality Representations

1 code implementation • 30 Aug 2021 • Ye Yuan, Wuyang Chen, Zhaowen Wang, Matthew Fisher, Zhifei Zhang, Zhangyang Wang, Hailin Jin

The novel graph constructor maps a glyph's latent code to its graph representation that matches expert knowledge, which is trained to help the translation task.

Image-to-Image Translation • Representation Learning • +2

Understanding and Accelerating Neural Architecture Search with Training-Free and Theory-Grounded Metrics

1 code implementation • 26 Aug 2021 • Wuyang Chen, Xinyu Gong, Junru Wu, Yunchao Wei, Humphrey Shi, Zhicheng Yan, Yi Yang, Zhangyang Wang

This work targets designing a principled and unified training-free framework for Neural Architecture Search (NAS), with high performance, low cost, and in-depth interpretation.

Neural Architecture Search

DANCE: DAta-Network Co-optimization for Efficient Segmentation Model Training and Inference

no code implementations • 16 Jul 2021 • Chaojian Li, Wuyang Chen, Yuchen Gu, Tianlong Chen, Yonggan Fu, Zhangyang Wang, Yingyan Lin

Semantic segmentation for scene understanding is nowadays widely demanded, raising significant challenges for the algorithm efficiency, especially its applications on resource-limited platforms.

Scene Understanding • Segmentation • +1

Contrastive Syn-to-Real Generalization

2 code implementations • ICLR 2021 • Wuyang Chen, Zhiding Yu, Shalini De Mello, Sifei Liu, Jose M. Alvarez, Zhangyang Wang, Anima Anandkumar

Training on synthetic data can be beneficial for label or data-scarce scenarios.

Domain Generalization • Inductive Bias

Learning to Optimize: A Primer and A Benchmark

1 code implementation • 23 Mar 2021 • Tianlong Chen, Xiaohan Chen, Wuyang Chen, Howard Heaton, Jialin Liu, Zhangyang Wang, Wotao Yin

It automates the design of an optimization method based on its performance on a set of training problems.

Benchmarking

Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective

4 code implementations • ICLR 2021 • Wuyang Chen, Xinyu Gong, Zhangyang Wang

Can we select the best neural architectures without involving any training and eliminate a drastic portion of the search cost?

Ranked #4 on Neural Architecture Search on NATS-Bench Topology, CIFAR-10

Neural Architecture Search

Sandwich Batch Normalization: A Drop-In Replacement for Feature Distribution Heterogeneity

1 code implementation • 22 Feb 2021 • Xinyu Gong, Wuyang Chen, Tianlong Chen, Zhangyang Wang

We present Sandwich Batch Normalization (SaBN), a frustratingly easy improvement of Batch Normalization (BN) with only a few lines of code changes.

Ranked #20 on Neural Architecture Search on NAS-Bench-201, CIFAR-100

Adversarial Defense • Conditional Image Generation • +2

AutoPose: Searching Multi-Scale Branch Aggregation for Pose Estimation

no code implementations • 16 Aug 2020 • Xinyu Gong, Wuyang Chen, Yifan Jiang, Ye Yuan, Xian-Ming Liu, Qian Zhang, Yuan Li, Zhangyang Wang

Such simplification limits the fusion of information at different scales and fails to maintain high-resolution representations.

2D Human Pose Estimation • Neural Architecture Search • +1

Automated Synthetic-to-Real Generalization

1 code implementation • ICML 2020 • Wuyang Chen, Zhiding Yu, Zhangyang Wang, Anima Anandkumar

Models trained on synthetic images often face degraded generalization to real data.

Domain Adaptation

Self-PU: Self Boosted and Calibrated Positive-Unlabeled Training

1 code implementation • ICML 2020 • Xuxi Chen, Wuyang Chen, Tianlong Chen, Ye Yuan, Chen Gong, Kewei Chen, Zhangyang Wang

Many real-world applications have to tackle the Positive-Unlabeled (PU) learning problem, i.e., learning binary classifiers from a large amount of unlabeled data and a few labeled positive examples.

AutoGAN-Distiller: Searching to Compress Generative Adversarial Networks

3 code implementations • ICML 2020 • Yonggan Fu, Wuyang Chen, Haotao Wang, Haoran Li, Yingyan Lin, Zhangyang Wang

Inspired by the recent success of AutoML in deep compression, we introduce AutoML to GAN compression and develop an AutoGAN-Distiller (AGD) framework.

AutoML • Knowledge Distillation • +2

FasterSeg: Searching for Faster Real-time Semantic Segmentation

2 code implementations • ICLR 2020 • Wuyang Chen, Xinyu Gong, Xian-Ming Liu, Qian Zhang, Yuan Li, Zhangyang Wang

We present FasterSeg, an automatically designed semantic segmentation network with not only state-of-the-art performance but also faster speed than current methods.

Ranked #1 on Semantic Segmentation on BDD

Neural Architecture Search • Real-Time Semantic Segmentation • +1

In Defense of the Triplet Loss Again: Learning Robust Person Re-Identification with Fast Approximated Triplet Loss and Label Distillation

1 code implementation • 17 Dec 2019 • Ye Yuan, Wuyang Chen, Yang Yang, Zhangyang Wang

This work addresses the above two shortcomings of triplet loss, extending its effectiveness to large-scale ReID datasets with potentially noisy labels.

Person Re-Identification