Search Results for author: Zhenguo Li

Found 172 papers, 65 papers with code

CVT-xRF: Contrastive In-Voxel Transformer for 3D Consistent Radiance Fields from Sparse Inputs

no code implementations 25 Mar 2024 Yingji Zhong, Lanqing Hong, Zhenguo Li, Dan Xu

While existing works mainly consider ray-level consistency to construct 2D learning regularization based on rendered color, depth, or semantics on image planes, in this paper we propose a novel approach that models 3D spatial field consistency to improve NeRF's performance with sparse inputs.

Novel View Synthesis

DriveCoT: Integrating Chain-of-Thought Reasoning with End-to-End Driving

no code implementations 25 Mar 2024 Tianqi Wang, Enze Xie, Ruihang Chu, Zhenguo Li, Ping Luo

We utilize the challenging driving scenarios from the CARLA leaderboard 2.0, which involve high-speed driving and lane-changing, and propose a rule-based expert policy to control the vehicle and generate ground truth labels for its reasoning process across different driving aspects and the final decisions.

Editing Massive Concepts in Text-to-Image Diffusion Models

1 code implementation 20 Mar 2024 Tianwei Xiong, Enze Xie, Yue Wu, Zhenguo Li, Xihui Liu

We further propose a comprehensive benchmark, named ImageNet Concept Editing Benchmark (ICEB), for evaluating massive concept editing for T2I models with two subtasks, free-form prompts, massive concept categories, and extensive evaluation metrics.

Model Editing

Efficient Transferability Assessment for Selection of Pre-trained Detectors

no code implementations 14 Mar 2024 Zhao Wang, Aoxue Li, Zhenguo Li, Qi Dou

Given this zoo, we adopt 7 target datasets from 5 diverse domains as the downstream target tasks for evaluation.

Open-Vocabulary Object Detection with Meta Prompt Representation and Instance Contrastive Optimization

no code implementations 14 Mar 2024 Zhao Wang, Aoxue Li, Fengwei Zhou, Zhenguo Li, Qi Dou

Without using knowledge distillation, ensemble model or extra training data during detector training, our proposed MIC outperforms previous SOTA methods trained with these complex techniques on LVIS.

Contrastive Learning Knowledge Distillation +2

Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation

no code implementations 14 Mar 2024 Yunhao Gou, Kai Chen, Zhili Liu, Lanqing Hong, Hang Xu, Zhenguo Li, Dit-Yan Yeung, James T. Kwok, Yu Zhang

Multimodal large language models (MLLMs) have shown impressive reasoning abilities, which, however, are also more vulnerable to jailbreak attacks than their LLM predecessors.

Optical Character Recognition (OCR)

Accelerating Diffusion Sampling with Optimized Time Steps

no code implementations 27 Feb 2024 Shuchen Xue, Zhaoqiang Liu, Fei Chen, Shifeng Zhang, Tianyang Hu, Enze Xie, Zhenguo Li

While this is a significant development, most sampling methods still employ uniform time steps, which is not optimal when using a small number of steps.

Image Generation

The Surprising Effectiveness of Skip-Tuning in Diffusion Sampling

no code implementations 23 Feb 2024 Jiajun Ma, Shuchen Xue, Tianyang Hu, Wenjia Wang, Zhaoqiang Liu, Zhenguo Li, Zhi-Ming Ma, Kenji Kawaguchi

Surprisingly, the improvement persists when we increase the number of sampling steps and can even surpass the best result from EDM-2 (1.58) with only 39 NFEs (1.57).

Image Generation

On the Expressive Power of a Variant of the Looped Transformer

no code implementations 21 Feb 2024 Yihang Gao, Chuanyang Zheng, Enze Xie, Han Shi, Tianyang Hu, Yu Li, Michael K. Ng, Zhenguo Li, Zhaoqiang Liu

Previous works try to explain this from the perspectives of expressive power and capability, arguing that standard transformers are capable of performing some algorithms.

MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data

1 code implementation 14 Feb 2024 Yinya Huang, Xiaohan Lin, Zhengying Liu, Qingxing Cao, Huajian Xin, Haiming Wang, Zhenguo Li, Linqi Song, Xiaodan Liang

Recent large language models (LLMs) have witnessed significant advancement in various tasks, including mathematical reasoning and theorem proving.

Automated Theorem Proving Language Modelling +3

Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models

1 code implementation 12 Feb 2024 Jiacheng Ye, Shansan Gong, Liheng Chen, Lin Zheng, Jiahui Gao, Han Shi, Chuan Wu, Zhenguo Li, Wei Bi, Lingpeng Kong

This work explores the integration of diffusion models and Chain-of-Thought (CoT), a well-established technique to improve the reasoning ability in autoregressive language models.

Math

Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation

no code implementations 28 Jan 2024 Zhenyu Wang, Enze Xie, Aoxue Li, Zhongdao Wang, Xihui Liu, Zhenguo Li

Given a complex text prompt containing multiple concepts including objects, attributes, and relationships, the LLM agent initially decomposes it, which entails the extraction of individual objects, their associated attributes, and the prediction of a coherent scene layout.

Attribute Language Modelling +3

CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects

no code implementations 18 Jan 2024 Zhao Wang, Aoxue Li, Enze Xie, Lingting Zhu, Yong Guo, Qi Dou, Zhenguo Li

Customized text-to-video generation aims to generate high-quality videos guided by text prompts and subject references.

Object Text-to-Video Generation +1

PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models

1 code implementation 10 Jan 2024 Junsong Chen, Yue Wu, Simian Luo, Enze Xie, Sayak Paul, Ping Luo, Hang Zhao, Zhenguo Li

As a state-of-the-art, open-source image generation model, PIXART-δ offers a promising alternative to the Stable Diffusion family of models, contributing significantly to text-to-image synthesis.

Image Generation

SERF: Fine-Grained Interactive 3D Segmentation and Editing with Radiance Fields

no code implementations 26 Dec 2023 Kaichen Zhou, Lanqing Hong, Enze Xie, Yongxin Yang, Zhenguo Li, Wei Zhang

Although significant progress has been made in the field of 2D-based interactive editing, fine-grained 3D-based interactive editing remains relatively unexplored.

Interactive Segmentation Segmentation

G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model

1 code implementation 18 Dec 2023 Jiahui Gao, Renjie Pi, Jipeng Zhang, Jiacheng Ye, Wanjun Zhong, YuFei Wang, Lanqing Hong, Jianhua Han, Hang Xu, Zhenguo Li, Lingpeng Kong

We first analyze the limitations of current Multimodal Large Language Models (MLLMs) in this area: they struggle to accurately comprehend basic geometric elements and their relationships.

Language Modelling Large Language Model

Fast Training of Diffusion Transformer with Extreme Masking for 3D Point Clouds Generation

no code implementations 12 Dec 2023 Shentong Mo, Enze Xie, Yue Wu, Junsong Chen, Matthias Nießner, Zhenguo Li

Motivated by the inherent redundancy of 3D compared to 2D, we propose FastDiT-3D, a novel masked diffusion transformer tailored for efficient 3D point cloud generation, which greatly reduces training costs.

Denoising Point Cloud Generation

Drag-A-Video: Non-rigid Video Editing with Point-based Interaction

no code implementations 5 Dec 2023 Yao Teng, Enze Xie, Yue Wu, Haoyu Han, Zhenguo Li, Xihui Liu

In this paper, we propose a new diffusion-based method for interactive point-based video manipulation, called Drag-A-Video.

Denoising Point Tracking +1

Animate124: Animating One Image to 4D Dynamic Scene

no code implementations 24 Nov 2023 Yuyang Zhao, Zhiwen Yan, Enze Xie, Lanqing Hong, Zhenguo Li, Gim Hee Lee

We introduce Animate124 (Animate-one-image-to-4D), the first work to animate a single in-the-wild image into 3D video through textual motion descriptions, an underexplored problem with significant applications.

Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis

no code implementations 16 Oct 2023 Kai Chen, Chunwei Wang, Kuo Yang, Jianhua Han, Lanqing Hong, Fei Mi, Hang Xu, Zhengying Liu, Wenyong Huang, Zhenguo Li, Dit-Yan Yeung, Lifeng Shang, Xin Jiang, Qun Liu

The rapid development of large language models (LLMs) has not only provided numerous opportunities but also presented significant challenges.

Instruction Following

Robustness May be More Brittle than We Think under Different Degrees of Distribution Shifts

no code implementations 10 Oct 2023 Kaican Li, Yifan Zhang, Lanqing Hong, Zhenguo Li, Nevin L. Zhang

This indicates that while pre-trained representations may help improve downstream in-distribution performance, they could have minimal or even adverse effects on generalization in certain OOD scenarios of the downstream task if not used properly.

Implicit Concept Removal of Diffusion Models

no code implementations 9 Oct 2023 Zhili Liu, Kai Chen, Yifan Zhang, Jianhua Han, Lanqing Hong, Hang Xu, Zhenguo Li, Dit-Yan Yeung, James Kwok

To address this, we utilize the intrinsic geometric characteristics of implicit concepts and present the Geom-Erasing, a novel concept removal method based on geometric-driven control.

BYOM: Building Your Own Multi-Task Model For Free

no code implementations 3 Oct 2023 Weisen Jiang, Baijiong Lin, Han Shi, Yu Zhang, Zhenguo Li, James T. Kwok

Recently, various merging methods have been proposed to build a multi-task model from task-specific finetuned models without retraining.

DriveGPT4: Interpretable End-to-end Autonomous Driving via Large Language Model

no code implementations 2 Oct 2023 Zhenhua Xu, Yujia Zhang, Enze Xie, Zhen Zhao, Yong Guo, Kwan-Yee K. Wong, Zhenguo Li, Hengshuang Zhao

Multimodal large language models (MLLMs) have emerged as a prominent area of interest within the research community, given their proficiency in handling and reasoning with non-textual data, including images and videos.

Autonomous Driving Language Modelling +2

LEGO-Prover: Neural Theorem Proving with Growing Libraries

1 code implementation 1 Oct 2023 Haiming Wang, Huajian Xin, Chuanyang Zheng, Lin Li, Zhengying Liu, Qingxing Cao, Yinya Huang, Jing Xiong, Han Shi, Enze Xie, Jian Yin, Zhenguo Li, Heng Liao, Xiaodan Liang

Our ablation study indicates that these newly added skills are indeed helpful for proving theorems, resulting in an improvement from a success rate of 47.1% to 50.4%.

Ranked #1 on Automated Theorem Proving on miniF2F-test (Pass@100 metric)

Automated Theorem Proving

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

2 code implementations 30 Sep 2023 Junsong Chen, Jincheng Yu, Chongjian Ge, Lewei Yao, Enze Xie, Yue Wu, Zhongdao Wang, James Kwok, Ping Luo, Huchuan Lu, Zhenguo Li

We hope PIXART-α will provide new insights to the AIGC community and startups to accelerate building their own high-quality yet low-cost generative models from scratch.

Image Generation Language Modelling

Lyra: Orchestrating Dual Correction in Automated Theorem Proving

1 code implementation 27 Sep 2023 Chuanyang Zheng, Haiming Wang, Enze Xie, Zhengying Liu, Jiankai Sun, Huajian Xin, Jianhao Shen, Zhenguo Li, Yu Li

In addition, we introduce Conjecture Correction, an error feedback mechanism designed to interact with the prover and refine formal proof conjectures using prover error messages.

Ranked #1 on Automated Theorem Proving on miniF2F-test (Pass@100 metric)

Automated Theorem Proving Hallucination

MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models

1 code implementation 21 Sep 2023 Longhui Yu, Weisen Jiang, Han Shi, Jincheng Yu, Zhengying Liu, Yu Zhang, James T. Kwok, Zhenguo Li, Adrian Weller, Weiyang Liu

Our MetaMath-7B model achieves 66.4% on GSM8K and 19.4% on MATH, exceeding the state-of-the-art models of the same size by 11.5% and 8.7%.

Ranked #53 on Arithmetic Reasoning on GSM8K (using extra training data)

Arithmetic Reasoning GSM8K +4

SA-Solver: Stochastic Adams Solver for Fast Sampling of Diffusion Models

1 code implementation NeurIPS 2023 Shuchen Xue, Mingyang Yi, Weijian Luo, Shifeng Zhang, Jiacheng Sun, Zhenguo Li, Zhi-Ming Ma

Based on our analysis, we propose SA-Solver, which is an improved efficient stochastic Adams method for solving diffusion SDE to generate data with high quality.

Image Generation

Forward-Backward Reasoning in Large Language Models for Mathematical Verification

no code implementations 15 Aug 2023 Weisen Jiang, Han Shi, Longhui Yu, Zhengying Liu, Yu Zhang, Zhenguo Li, James T. Kwok

Instead of using forward or backward reasoning alone, we propose FOBAR to combine FOrward and BAckward Reasoning for verification.

Mathematical Reasoning

A Causal Framework to Unify Common Domain Generalization Approaches

no code implementations 13 Jul 2023 Nevin L. Zhang, Kaican Li, Han Gao, Weiyan Xie, Zhi Lin, Zhenguo Li, Luning Wang, Yongxiang Huang

Domain generalization (DG) is about learning models that generalize well to new domains that are related to, but different from, the training domain(s).

Domain Generalization

T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation

1 code implementation NeurIPS 2023 Kaiyi Huang, Kaiyue Sun, Enze Xie, Zhenguo Li, Xihui Liu

Despite the stunning ability to generate high-quality images by recent text-to-image models, current approaches often struggle to effectively compose objects with different attributes and relationships into a complex and coherent scene.

Attribute Text-to-Image Generation

DiffFlow: A Unified SDE Framework for Score-Based Diffusion Models and Generative Adversarial Networks

no code implementations 5 Jul 2023 Jingwei Zhang, Han Shi, Jincheng Yu, Enze Xie, Zhenguo Li

Generative models can be categorized into two types: explicit generative models that define explicit density forms and allow exact likelihood inference, such as score-based diffusion models (SDMs) and normalizing flows; implicit generative models that directly learn a transformation from the prior to the data distribution, such as generative adversarial nets (GANs).

Denoising

Training Energy-Based Models with Diffusion Contrastive Divergences

no code implementations 4 Jul 2023 Weijian Luo, Hao Jiang, Tianyang Hu, Jiacheng Sun, Zhenguo Li, Zhihua Zhang

In image generation experiments, the proposed DCD is capable of training an energy-based model for generating the CelebA 32×32 dataset, which is comparable to existing EBMs.

Image Denoising Image Generation

GeoDiffusion: Text-Prompted Geometric Control for Object Detection Data Generation

no code implementations 7 Jun 2023 Kai Chen, Enze Xie, Zhe Chen, Yibo Wang, Lanqing Hong, Zhenguo Li, Dit-Yan Yeung

Diffusion models have attracted significant attention due to their remarkable ability to create content and generate data for tasks like image classification.

Image Classification Layout-to-Image Generation +2

Explore and Exploit the Diverse Knowledge in Model Zoo for Domain Generalization

no code implementations 5 Jun 2023 Yimeng Chen, Tianyang Hu, Fengwei Zhou, Zhenguo Li, ZhiMing Ma

The proliferation of pretrained models, as a result of advancements in pretraining techniques, has led to the emergence of a vast zoo of publicly available models.

Domain Generalization Out-of-Distribution Generalization

Diff-Instruct: A Universal Approach for Transferring Knowledge From Pre-trained Diffusion Models

no code implementations NeurIPS 2023 Weijian Luo, Tianyang Hu, Shifeng Zhang, Jiacheng Sun, Zhenguo Li, Zhihua Zhang

To demonstrate the effectiveness and universality of Diff-Instruct, we consider two scenarios: distilling pre-trained diffusion models and refining existing GAN models.

On the Generalization of Diffusion Model

no code implementations 24 May 2023 Mingyang Yi, Jiacheng Sun, Zhenguo Li

To understand this contradiction, we empirically verify the difference between the sufficiently trained diffusion model and the empirical optima.

ConsistentNeRF: Enhancing Neural Radiance Fields with 3D Consistency for Sparse View Synthesis

1 code implementation 18 May 2023 Shoukang Hu, Kaichen Zhou, Kaiyu Li, Longhui Yu, Lanqing Hong, Tianyang Hu, Zhenguo Li, Gim Hee Lee, Ziwei Liu

In this paper, we propose ConsistentNeRF, a method that leverages depth information to regularize both multi-view and single-view 3D consistency among pixels.

3D Reconstruction SSIM

Make-A-Protagonist: Generic Video Editing with An Ensemble of Experts

no code implementations 15 May 2023 Yuyang Zhao, Enze Xie, Lanqing Hong, Zhenguo Li, Gim Hee Lee

Text-driven image and video diffusion models have achieved unprecedented success in generating realistic and diverse content.

Denoising Video Editing +1

Boosting Visual-Language Models by Exploiting Hard Samples

1 code implementation 9 May 2023 Haonan Wang, Minbin Huang, Runhui Huang, Lanqing Hong, Hang Xu, Tianyang Hu, Xiaodan Liang, Zhenguo Li, Hong Cheng, Kenji Kawaguchi

In this work, we present HELIP, a cost-effective strategy tailored to enhance the performance of existing CLIP models without the need for training a model from scratch or collecting additional data.

Retrieval Zero-Shot Learning

Progressive-Hint Prompting Improves Reasoning in Large Language Models

1 code implementation 19 Apr 2023 Chuanyang Zheng, Zhengying Liu, Enze Xie, Zhenguo Li, Yu Li

The performance of Large Language Models (LLMs) in reasoning tasks depends heavily on prompt design, with Chain-of-Thought (CoT) and self-consistency being critical methods that enhance this ability.

Arithmetic Reasoning GSM8K +2

MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation

1 code implementation 19 Apr 2023 Chongjian Ge, Junsong Chen, Enze Xie, Zhongdao Wang, Lanqing Hong, Huchuan Lu, Zhenguo Li, Ping Luo

These queries are then processed iteratively by a BEV-Evolving decoder, which selectively aggregates deep features from either LiDAR, cameras, or both modalities.

3D Object Detection Autonomous Driving +3

DiffFit: Unlocking Transferability of Large Diffusion Models via Simple Parameter-Efficient Fine-Tuning

1 code implementation ICCV 2023 Enze Xie, Lewei Yao, Han Shi, Zhili Liu, Daquan Zhou, Zhaoqiang Liu, Jiawei Li, Zhenguo Li

This paper proposes DiffFit, a parameter-efficient strategy for fine-tuning large pre-trained diffusion models that enables fast adaptation to new domains.

Efficient Diffusion Personalization

DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via Word-Region Alignment

no code implementations CVPR 2023 Lewei Yao, Jianhua Han, Xiaodan Liang, Dan Xu, Wei Zhang, Zhenguo Li, Hang Xu

This paper presents DetCLIPv2, an efficient and scalable training framework that incorporates large-scale image-text pairs to achieve open-vocabulary object detection (OVD).

Language Modelling object-detection +1

DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving

no code implementations 3 Apr 2023 Tianqi Wang, Sukmin Kim, Wenxuan Ji, Enze Xie, Chongjian Ge, Junsong Chen, Zhenguo Li, Ping Luo

In addition, we propose a new task, end-to-end motion and accident prediction, which can be used to directly evaluate the accident prediction ability for different autonomous driving algorithms.

3D Object Detection Autonomous Driving +1

Fair-CDA: Continuous and Directional Augmentation for Group Fairness

no code implementations 1 Apr 2023 Rui Sun, Fengwei Zhou, Zhenhua Dong, Chuanlong Xie, Lanqing Hong, Jiawei Li, Rui Zhang, Zhen Li, Zhenguo Li

By adjusting the perturbation strength in the direction of the paths, our proposed augmentation is controllable and auditable.

Data Augmentation Disentanglement +1

Mixed Autoencoder for Self-supervised Visual Representation Learning

1 code implementation CVPR 2023 Kai Chen, Zhili Liu, Lanqing Hong, Hang Xu, Zhenguo Li, Dit-Yan Yeung

Specifically, our MixedAE outperforms MAE by +0.3% accuracy, +1.7 mIoU and +0.9 AP on ImageNet-1K, ADE20K and COCO respectively with a standard ViT-Base.

Contrastive Learning Data Augmentation +1

ContraNeRF: Generalizable Neural Radiance Fields for Synthetic-to-real Novel View Synthesis via Contrastive Learning

no code implementations CVPR 2023 Hao Yang, Lanqing Hong, Aoxue Li, Tianyang Hu, Zhenguo Li, Gim Hee Lee, LiWei Wang

In this work, we first investigate the effects of synthetic data in synthetic-to-real novel view synthesis and surprisingly observe that models trained with synthetic data tend to produce sharper but less accurate volume densities.

Contrastive Learning Generalizable Novel View Synthesis +2

Entity-Level Text-Guided Image Manipulation

1 code implementation 22 Feb 2023 Yikai Wang, Jianan Wang, Guansong Lu, Hang Xu, Zhenguo Li, Wei Zhang, Yanwei Fu

In the image manipulation phase, SeMani adopts a generative model to synthesize new images conditioned on the entity-irrelevant regions and target text descriptions.

Denoising Image Manipulation

MetaBEV: Solving Sensor Failures for 3D Detection and Map Segmentation

no code implementations ICCV 2023 Chongjian Ge, Junsong Chen, Enze Xie, Zhongdao Wang, Lanqing Hong, Huchuan Lu, Zhenguo Li, Ping Luo

These queries are then processed iteratively by a BEV-Evolving decoder, which selectively aggregates deep features from either LiDAR, cameras, or both modalities.

3D Object Detection Autonomous Driving +3

Boosting Out-of-Distribution Detection with Multiple Pre-trained Models

1 code implementation 24 Dec 2022 Feng Xue, Zi He, Chuanlong Xie, Falong Tan, Zhenguo Li

This advance raises a natural question: Can we leverage the diversity of multiple pre-trained models to improve the performance of post hoc detection methods?

Out-of-Distribution Detection Out of Distribution (OOD) Detection

Dual-Curriculum Teacher for Domain-Inconsistent Object Detection in Autonomous Driving

no code implementations 17 Oct 2022 Longhui Yu, Yifan Zhang, Lanqing Hong, Fei Chen, Zhenguo Li

Specifically, DucTeacher consists of two curriculums, i.e., (1) a domain evolving curriculum that seeks to learn from the data progressively to handle data distribution discrepancy by estimating the similarity between domains, and (2) a distribution matching curriculum that seeks to estimate the class distribution for each unlabeled domain to handle class distribution shifts.

Autonomous Driving object-detection +2

ZooD: Exploiting Model Zoo for Out-of-Distribution Generalization

no code implementations 17 Oct 2022 Qishi Dong, Awais Muhammad, Fengwei Zhou, Chuanlong Xie, Tianyang Hu, Yongxin Yang, Sung-Ho Bae, Zhenguo Li

We evaluate our paradigm on a diverse model zoo consisting of 35 models for various OoD tasks and demonstrate: (i) model ranking is better correlated with fine-tuning ranking than previous methods and up to 9859x faster than brute-force fine-tuning; (ii) OoD generalization after model ensemble with feature selection outperforms the state-of-the-art methods, and the accuracy on the most challenging task, DomainNet, is improved from 46.5% to 50.6%.

feature selection Out-of-Distribution Generalization

DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection

no code implementations 20 Sep 2022 Lewei Yao, Jianhua Han, Youpeng Wen, Xiaodan Liang, Dan Xu, Wei Zhang, Zhenguo Li, Chunjing Xu, Hang Xu

We further design a concept dictionary (with descriptions) from various online sources and detection datasets to provide prior knowledge for each concept.

object-detection Open World Object Detection

DevNet: Self-supervised Monocular Depth Learning via Density Volume Construction

1 code implementation 14 Sep 2022 Kaichen Zhou, Lanqing Hong, Changhao Chen, Hang Xu, Chaoqiang Ye, Qingyong Hu, Zhenguo Li

Self-supervised depth learning from monocular images normally relies on the 2D pixel-wise photometric relation between temporally adjacent image frames.

Depth Estimation

Breaking Correlation Shift via Conditional Invariant Regularizer

no code implementations 14 Jul 2022 Mingyang Yi, Ruoyu Wang, Jiachen Sun, Zhenguo Li, Zhi-Ming Ma

The correlation shift is caused by the spurious attributes that correlate to the class label, as the correlation between them may vary in training and test data.

Learning to Prove Trigonometric Identities

no code implementations 14 Jul 2022 Zhou Liu, YuJun Li, Zhengying Liu, Lin Li, Zhenguo Li

We define the normalized form of trigonometric identities, design a set of rules for the proof and put forward a method which can generate theoretically infinite trigonometric identities.

Automated Theorem Proving Imitation Learning

PILC: Practical Image Lossless Compression with an End-to-end GPU Oriented Neural Framework

no code implementations CVPR 2022 Ning Kang, Shanzhao Qiu, Shifeng Zhang, Zhenguo Li, Shutao Xia

Generative-model-based lossless image compression algorithms have seen great success in improving compression ratio.

CO^3: Cooperative Unsupervised 3D Representation Learning for Autonomous Driving

1 code implementation 8 Jun 2022 Runjian Chen, Yao Mu, Runsen Xu, Wenqi Shao, Chenhan Jiang, Hang Xu, Zhenguo Li, Ping Luo

In this paper, we propose CO^3, namely Cooperative Contrastive Learning and Contextual Shape Prediction, to learn 3D representation for outdoor-scene point clouds in an unsupervised manner.

Autonomous Driving Contrastive Learning +1

Task-Customized Self-Supervised Pre-training with Scalable Dynamic Routing

no code implementations 26 May 2022 Zhili Liu, Jianhua Han, Lanqing Hong, Hang Xu, Kai Chen, Chunjing Xu, Zhenguo Li

On the other hand, for existing SSL methods, it is burdensome and infeasible to use different downstream-task-customized datasets in pre-training for different tasks.

Self-Supervised Learning

Self-Guided Noise-Free Data Generation for Efficient Zero-Shot Learning

2 code implementations 25 May 2022 Jiahui Gao, Renjie Pi, Yong Lin, Hang Xu, Jiacheng Ye, Zhiyong Wu, Weizhong Zhang, Xiaodan Liang, Zhenguo Li, Lingpeng Kong

In this paradigm, the synthesized data from the PLM acts as the carrier of knowledge, which is used to train a task-specific model with orders of magnitude fewer parameters than the PLM, achieving both higher performance and efficiency than prompt-based zero-shot learning methods on PLMs.

text-classification Text Classification +1

ManiTrans: Entity-Level Text-Guided Image Manipulation via Token-wise Semantic Alignment and Generation

1 code implementation CVPR 2022 Jianan Wang, Guansong Lu, Hang Xu, Zhenguo Li, Chunjing Xu, Yanwei Fu

Existing text-guided image manipulation methods aim to modify the appearance of the image or to edit a few objects in a virtual or simple scenario, which is far from practical application.

Image Generation Image Manipulation

Generalizing Few-Shot NAS with Gradient Matching

1 code implementation ICLR 2022 Shoukang Hu, Ruochen Wang, Lanqing Hong, Zhenguo Li, Cho-Jui Hsieh, Jiashi Feng

Efficient performance estimation of architectures drawn from large search spaces is essential to Neural Architecture Search.

Neural Architecture Search

CODA: A Real-World Road Corner Case Dataset for Object Detection in Autonomous Driving

no code implementations 15 Mar 2022 Kaican Li, Kai Chen, Haoyu Wang, Lanqing Hong, Chaoqiang Ye, Jianhua Han, Yukuai Chen, Wei Zhang, Chunjing Xu, Dit-Yan Yeung, Xiaodan Liang, Zhenguo Li, Hang Xu

One main reason that impedes the development of truly reliable self-driving systems is the lack of public datasets for evaluating the performance of object detectors on corner cases.

Autonomous Driving Object +2

Memory Replay with Data Compression for Continual Learning

1 code implementation ICLR 2022 Liyuan Wang, Xingxing Zhang, Kuo Yang, Longhui Yu, Chongxuan Li, Lanqing Hong, Shifeng Zhang, Zhenguo Li, Yi Zhong, Jun Zhu

In this work, we propose memory replay with data compression (MRDC) to reduce the storage cost of old training samples and thus increase the number of them that can be stored in the memory buffer.

Autonomous Driving Class Incremental Learning +5

Semi-Supervised Object Detection via Multi-Instance Alignment With Global Class Prototypes

no code implementations CVPR 2022 Aoxue Li, Peng Yuan, Zhenguo Li

Semi-supervised object detection (SSOD) aims to improve the generalization ability of object detectors with large-scale unlabeled images.

object-detection Object Detection +1

Long-tail Recognition via Compositional Knowledge Transfer

no code implementations CVPR 2022 Sarah Parisot, Pedro M. Esperanca, Steven McDonagh, Tamas J. Madarasz, Yongxin Yang, Zhenguo Li

In this work, we introduce a novel strategy for long-tail recognition that addresses the tail classes' few-shot problem via training-free knowledge transfer.

Transfer Learning

Layer-Parallel Training of Residual Networks with Auxiliary-Variable Networks

no code implementations 10 Dec 2021 Qi Sun, Hexin Dong, Zewei Chen, Jiacheng Sun, Zhenguo Li, Bin Dong

Gradient-based methods for the distributed training of residual networks (ResNets) typically require a forward pass of the input data, followed by back-propagating the error gradient to update model parameters, which becomes time-consuming as the network goes deeper.

Data Augmentation

Understanding Square Loss in Training Overparametrized Neural Network Classifiers

no code implementations 7 Dec 2021 Tianyang Hu, Jun Wang, Wenjia Wang, Zhenguo Li

Compared to cross-entropy, square loss has comparable generalization error but noticeable advantages in robustness and model calibration.

MixACM: Mixup-Based Robustness Transfer via Distillation of Activated Channel Maps

no code implementations NeurIPS 2021 Muhammad Awais, Fengwei Zhou, Chuanlong Xie, Jiawei Li, Sung-Ho Bae, Zhenguo Li

First, we theoretically show the transferability of robustness from an adversarially trained teacher model to a student model with the help of mixup augmentation.

Transfer Learning

FILIP: Fine-grained Interactive Language-Image Pre-Training

1 code implementation ICLR 2022 Lewei Yao, Runhui Huang, Lu Hou, Guansong Lu, Minzhe Niu, Hang Xu, Xiaodan Liang, Zhenguo Li, Xin Jiang, Chunjing Xu

In this paper, we introduce a large-scale Fine-grained Interactive Language-Image Pre-training (FILIP) to achieve finer-level alignment through a cross-modal late interaction mechanism, which uses a token-wise maximum similarity between visual and textual tokens to guide the contrastive objective.

Image Classification Retrieval +2
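
The token-wise maximum similarity mentioned in the FILIP snippet above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of cosine similarity, and the averaging over tokens are assumptions made for illustration, and the token vectors are toy values rather than model outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def late_interaction_similarity(image_tokens, text_tokens):
    """Return (image-to-text, text-to-image) similarity scores.

    Assumption of this sketch: each token keeps only its maximum similarity
    to the tokens of the other modality, and the per-token maxima are averaged.
    """
    image_to_text = sum(
        max(cosine(patch, word) for word in text_tokens) for patch in image_tokens
    ) / len(image_tokens)
    text_to_image = sum(
        max(cosine(word, patch) for patch in image_tokens) for word in text_tokens
    ) / len(text_tokens)
    return image_to_text, text_to_image


# Toy example: two image tokens and one text token (hypothetical values).
image_tokens = [[1.0, 0.0], [0.0, 1.0]]
text_tokens = [[1.0, 0.0]]
i2t, t2i = late_interaction_similarity(image_tokens, text_tokens)
print(i2t, t2i)
```

In a contrastive setup, scores like these would stand in for the single global image-text similarity; the two directions differ because each token takes its maximum over the other modality's tokens.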

AIM: Automatic Interaction Machine for Click-Through Rate Prediction

1 code implementation 5 Nov 2021 Chenxu Zhu, Bo Chen, Weinan Zhang, Jincai Lai, Ruiming Tang, Xiuqiang He, Zhenguo Li, Yong Yu

To address these three issues mentioned above, we propose Automatic Interaction Machine (AIM) with three core components, namely, Feature Interaction Search (FIS), Interaction Function Search (IFS) and Embedding Dimension Search (EDS), to select significant feature interactions, appropriate interaction functions and necessary embedding dimensions automatically in a unified framework.

Click-Through Rate Prediction

OSOA: One-Shot Online Adaptation of Deep Generative Models for Lossless Compression

no code implementations NeurIPS 2021 Chen Zhang, Shifeng Zhang, Fabio Maria Carlucci, Zhenguo Li

To eliminate the requirement of saving separate models for different target datasets, we propose a novel setting that starts from a pretrained deep generative model and compresses the data batches while adapting the model with a dynamical system for only one epoch.

Density Estimation

iFlow: Numerically Invertible Flows for Efficient Lossless Compression via a Uniform Coder

no code implementations NeurIPS 2021 Shifeng Zhang, Ning Kang, Tom Ryder, Zhenguo Li

In this paper, we discuss lossless compression using normalizing flows, which have demonstrated a great capacity for achieving high compression ratios.

Image Compression

Rethinking Adversarial Transferability from a Data Distribution Perspective

no code implementations ICLR 2022 Yao Zhu, Jiacheng Sun, Zhenguo Li

Adversarial transferability enables attackers to generate adversarial examples from the source model to attack the target model, which has raised security concerns about the deployment of DNNs in practice.

Adversarial Attack

Nonlinear ICA Using Volume-Preserving Transformations

no code implementations ICLR 2022 Xiaojiang Yang, Yi Wang, Jiacheng Sun, Xing Zhang, Shifeng Zhang, Zhenguo Li, Junchi Yan

Nonlinear ICA is a fundamental problem in machine learning, aiming to identify the underlying independent components (sources) from data which is assumed to be a nonlinear function (mixing function) of these sources.

How Well Does Self-Supervised Pre-Training Perform with Streaming ImageNet?

no code implementations NeurIPS Workshop ImageNet_PPF 2021 Dapeng Hu, Shipeng Yan, Qizhengqiu Lu, Lanqing Hong, Hailin Hu, Yifan Zhang, Zhenguo Li, Xinchao Wang, Jiashi Feng

Prior works on self-supervised pre-training focus on the joint training scenario, where massive unlabeled data are assumed to be given as input all at once, and only then is a learner trained.

Self-Supervised Learning

Layer-Parallel Training of Residual Networks with Auxiliary Variables

no code implementations NeurIPS Workshop DLDE 2021 Qi Sun, Hexin Dong, Zewei Chen, Weizhen Dian, Jiacheng Sun, Yitong Sun, Zhenguo Li, Bin Dong

The backpropagation algorithm is indispensable for training modern residual networks (ResNets) and tends to be time-consuming due to its inherent algorithmic locking.

Data Augmentation

NAS-OoD: Neural Architecture Search for Out-of-Distribution Generalization

1 code implementation ICCV 2021 Haoyue Bai, Fengwei Zhou, Lanqing Hong, Nanyang Ye, S.-H. Gary Chan, Zhenguo Li

In this work, we propose robust Neural Architecture Search for OoD generalization (NAS-OoD), which optimizes the architecture with respect to its performance on generated OoD data by gradient descent.

Domain Generalization Neural Architecture Search +1

Adversarial Robustness for Unsupervised Domain Adaptation

no code implementations ICCV 2021 Muhammad Awais, Fengwei Zhou, Hang Xu, Lanqing Hong, Ping Luo, Sung-Ho Bae, Zhenguo Li

Extensive Unsupervised Domain Adaptation (UDA) studies have shown great success in practice by learning transferable representations across a labeled source domain and an unlabeled target domain with deep models.

Adversarial Robustness Unsupervised Domain Adaptation

MultiSiam: Self-supervised Multi-instance Siamese Representation Learning for Autonomous Driving

1 code implementation ICCV 2021 Kai Chen, Lanqing Hong, Hang Xu, Zhenguo Li, Dit-Yan Yeung

By pre-training on SODA10M, a large-scale autonomous driving dataset, MultiSiam exceeds the ImageNet pre-trained MoCo-v2, demonstrating the potential of domain-specific pre-training.

Autonomous Driving Image Clustering +2

Towards Understanding the Generative Capability of Adversarially Robust Classifiers

no code implementations ICCV 2021 Yao Zhu, Jiacheng Ma, Jiacheng Sun, Zewei Chen, Rongxin Jiang, Zhenguo Li

We find that adversarial training contributes to obtaining an energy function that is flat and has low energy around the real data, which is the key for generative capability.

Image Generation

G-DetKD: Towards General Distillation Framework for Object Detectors via Contrastive and Semantic-guided Feature Imitation

no code implementations ICCV 2021 Lewei Yao, Renjie Pi, Hang Xu, Wei Zhang, Zhenguo Li, Tong Zhang

In this paper, we investigate the knowledge distillation (KD) strategy for object detection and propose an effective framework applicable to both homogeneous and heterogeneous student-teacher pairs.

Knowledge Distillation object-detection +1

AutoBERT-Zero: Evolving BERT Backbone from Scratch

no code implementations 15 Jul 2021 Jiahui Gao, Hang Xu, Han Shi, Xiaozhe Ren, Philip L. H. Yu, Xiaodan Liang, Xin Jiang, Zhenguo Li

Transformer-based pre-trained language models like BERT and its variants have recently achieved promising performance in various natural language processing (NLP) tasks.

Inductive Bias Language Modelling +3

One Million Scenes for Autonomous Driving: ONCE Dataset

1 code implementation 21 Jun 2021 Jiageng Mao, Minzhe Niu, Chenhan Jiang, Hanxue Liang, Jingheng Chen, Xiaodan Liang, Yamin Li, Chaoqiang Ye, Wei Zhang, Zhenguo Li, Jie Yu, Hang Xu, Chunjing Xu

To facilitate future research on exploiting unlabeled data for 3D detection, we additionally provide a benchmark in which we reproduce and evaluate a variety of self-supervised and semi-supervised methods on the ONCE dataset.

3D Object Detection Autonomous Driving +1

SODA10M: A Large-Scale 2D Self/Semi-Supervised Object Detection Dataset for Autonomous Driving

no code implementations 21 Jun 2021 Jianhua Han, Xiwen Liang, Hang Xu, Kai Chen, Lanqing Hong, Jiageng Mao, Chaoqiang Ye, Wei Zhang, Zhenguo Li, Xiaodan Liang, Chunjing Xu

Experiments show that SODA10M can serve as a promising pre-training dataset for different self-supervised learning methods, which gives superior performance when fine-tuning on different downstream tasks (i.e., detection, semantic/instance segmentation) in the autonomous driving domain.

Autonomous Driving Instance Segmentation +5

Transformation Invariant Few-Shot Object Detection

no code implementations CVPR 2021 Aoxue Li, Zhenguo Li

To this end, we propose a simple yet effective Transformation Invariant Principle (TIP) that can be flexibly applied to various meta-learning models for boosting the detection performance on novel class objects.

Few-Shot Object Detection Meta-Learning +2

Adversarial Invariant Learning

1 code implementation CVPR 2021 Nanyang Ye, Jingxuan Tang, Huayu Deng, Xiao-Yun Zhou, Qianxiao Li, Zhenguo Li, Guang-Zhong Yang, Zhanxing Zhu

To the best of our knowledge, this is one of the first works to adopt a differentiable environment splitting method to enable stable predictions across environments without environment index information, which achieves state-of-the-art performance on datasets with strong spurious correlation, such as Colored MNIST.

Domain Generalization Out-of-Distribution Generalization

Contextualizing Meta-Learning via Learning to Decompose

1 code implementation 15 Jun 2021 Han-Jia Ye, Da-Wei Zhou, Lanqing Hong, Zhenguo Li, Xiu-Shen Wei, De-Chuan Zhan

To this end, we propose Learning to Decompose Network (LeadNet) to contextualize the meta-learned "support-to-target" strategy, leveraging the context of instances with one or mixed latent attributes in a support set.

Attribute Few-Shot Image Classification +1

Towards a Theoretical Framework of Out-of-Distribution Generalization

no code implementations NeurIPS 2021 Haotian Ye, Chuanlong Xie, Tianle Cai, Ruichen Li, Zhenguo Li, LiWei Wang

We also introduce a new concept of expansion function, which characterizes to what extent the variance is amplified in the test domains over the training domains, and therefore give a quantitative meaning of invariant features.

Domain Generalization Model Selection +1

Joint-DetNAS: Upgrade Your Detector with NAS, Pruning and Dynamic Distillation

no code implementations CVPR 2021 Lewei Yao, Renjie Pi, Hang Xu, Wei Zhang, Zhenguo Li, Tong Zhang

For student morphism, a weight inheritance strategy is adopted, allowing the student to flexibly update its architecture while fully utilizing the predecessor's weights, which considerably accelerates the search. To facilitate dynamic distillation, an elastic teacher pool is trained via an integrated progressive shrinking strategy, from which teacher detectors can be sampled without additional cost in subsequent searches.

Knowledge Distillation Neural Architecture Search +2

TransNAS-Bench-101: Improving Transferability and Generalizability of Cross-Task Neural Architecture Search

2 code implementations CVPR 2021 Yawen Duan, Xin Chen, Hang Xu, Zewei Chen, Xiaodan Liang, Tong Zhang, Zhenguo Li

While existing NAS methods mostly design architectures on a single task, algorithms that look beyond single-task search are surging to pursue a more efficient and universal solution across various tasks.

Neural Architecture Search Transfer Learning

BWCP: Probabilistic Learning-to-Prune Channels for ConvNets via Batch Whitening

no code implementations 13 May 2021 Wenqi Shao, Hang Yu, Zhaoyang Zhang, Hang Xu, Zhenguo Li, Ping Luo

To address this problem, we develop a probability-based pruning algorithm, called batch whitening channel pruning (BWCP), which can stochastically discard unimportant channels by modeling the probability of a channel being activated.

How Well Does Self-Supervised Pre-Training Perform with Streaming Data?

no code implementations ICLR 2022 Dapeng Hu, Shipeng Yan, Qizhengqiu Lu, Lanqing Hong, Hailin Hu, Yifan Zhang, Zhenguo Li, Xinchao Wang, Jiashi Feng

Prior works on self-supervised pre-training focus on the joint training scenario, where massive unlabeled data are assumed to be given as input all at once, and only then is a learner trained.

Representation Learning Self-Supervised Learning

SparseBERT: Rethinking the Importance Analysis in Self-attention

1 code implementation 25 Feb 2021 Han Shi, Jiahui Gao, Xiaozhe Ren, Hang Xu, Xiaodan Liang, Zhenguo Li, James T. Kwok

A surprising result is that diagonal elements in the attention map are the least important compared with other attention positions.

Loss Function Discovery for Object Detection via Convergence-Simulation Driven Search

1 code implementation ICLR 2021 Peidong Liu, Gengwei Zhang, Bochao Wang, Hang Xu, Xiaodan Liang, Yong Jiang, Zhenguo Li

For object detection, the well-established classification and regression loss functions have been carefully designed by considering diverse learning challenges.

Model Optimization object-detection +1

DetCo: Unsupervised Contrastive Learning for Object Detection

2 code implementations ICCV 2021 Enze Xie, Jian Ding, Wenhai Wang, Xiaohang Zhan, Hang Xu, Peize Sun, Zhenguo Li, Ping Luo

Unlike most recent methods that focused on improving accuracy of image classification, we present a novel contrastive learning approach, named DetCo, which fully explores the contrasts between global image and local image patches to learn discriminative representations for object detection.

Contrastive Learning Image Classification +2

Relaxed Conditional Image Transfer for Semi-supervised Domain Adaptation

no code implementations 5 Jan 2021 Qijun Luo, Zhili Liu, Lanqing Hong, Chongxuan Li, Kuo Yang, Liyuan Wang, Fengwei Zhou, Guilin Li, Zhenguo Li, Jun Zhu

Semi-supervised domain adaptation (SSDA), which aims to learn models in a partially labeled target domain with the assistance of a fully labeled source domain, has attracted increasing attention in recent years.

Domain Adaptation Semi-supervised Domain Adaptation

NASOA: Towards Faster Task-oriented Online Fine-tuning

no code implementations 1 Jan 2021 Hang Xu, Ning Kang, Gengwei Zhang, Xiaodan Liang, Zhenguo Li

The resulting model zoo is more training efficient than SOTA NAS models, e.g., 6x faster than RegNetY-16GF and 1.7x faster than EfficientNetB3.

Cloud Computing Neural Architecture Search

Optimal Designs of Gaussian Processes with Budgets for Hyperparameter Optimization

no code implementations 1 Jan 2021 Yimin Huang, YuJun Li, Zhenguo Li, Zhihua Zhang

Moreover, comparisons between different initial designs with the same model show the advantage of the proposed optimal design.

Gaussian Processes Hyperparameter Optimization

SAD: Saliency Adversarial Defense without Adversarial Training

no code implementations 1 Jan 2021 Yao Zhu, Jiacheng Sun, Zewei Chen, Zhenguo Li

We justify the algorithm with a linear model, showing that the added saliency maps pull data away from their closest decision boundary.

Adversarial Defense

TransNAS-Bench-101: Improving Transferrability and Generalizability of Cross-Task Neural Architecture Search

2 code implementations 1 Jan 2021 Yawen Duan, Xin Chen, Hang Xu, Zewei Chen, Xiaodan Liang, Tong Zhang, Zhenguo Li

While existing NAS methods mostly design architectures on one single task, algorithms that look beyond single-task search are surging to pursue a more efficient and universal solution across various tasks.

Neural Architecture Search Transfer Learning

Exploring Geometry-Aware Contrast and Clustering Harmonization for Self-Supervised 3D Object Detection

no code implementations ICCV 2021 Hanxue Liang, Chenhan Jiang, Dapeng Feng, Xin Chen, Hang Xu, Xiaodan Liang, Wei Zhang, Zhenguo Li, Luc van Gool

Here we present a novel self-supervised 3D object detection framework, named GCC-3D, that seamlessly integrates geometry-aware contrast and clustering harmonization to lift unsupervised 3D representation learning.

3D Object Detection Clustering +4

Fewmatch: Dynamic Prototype Refinement for Semi-Supervised Few-Shot Learning

no code implementations 1 Jan 2021 Xu Lan, Steven McDonagh, Shaogang Gong, Jiali Wang, Zhenguo Li, Sarah Parisot

Semi-Supervised Few-shot Learning (SS-FSL) investigates the benefit of incorporating unlabelled data in few-shot settings.

Few-Shot Learning Pseudo Label

DiffAutoML: Differentiable Joint Optimization for Efficient End-to-End Automated Machine Learning

no code implementations 1 Jan 2021 Kaichen Zhou, Lanqing Hong, Fengwei Zhou, Binxin Ru, Zhenguo Li, Trigoni Niki, Jiashi Feng

Our method performs co-optimization of the neural architectures, training hyper-parameters and data augmentation policies in an end-to-end fashion without the need of model retraining.

BIG-bench Machine Learning Computational Efficiency +2

An Embedding Learning Framework for Numerical Features in CTR Prediction

1 code implementation 16 Dec 2020 Huifeng Guo, Bo Chen, Ruiming Tang, Weinan Zhang, Zhenguo Li, Xiuqiang He

In this paper, we propose a novel embedding learning framework for numerical features in CTR prediction (AutoDis) with high model capacity, end-to-end training and unique representation properties preserved.

Click-Through Rate Prediction Feature Engineering +1

Batch Group Normalization

no code implementations 4 Dec 2020 Xiao-Yun Zhou, Jiacheng Sun, Nanyang Ye, Xu Lan, Qijun Luo, Bo-Lin Lai, Pedro Esperanca, Guang-Zhong Yang, Zhenguo Li

Among previous normalization methods, Batch Normalization (BN) performs well at medium and large batch sizes and generalizes well to multiple vision tasks, while its performance degrades significantly at small batch sizes.

Few-Shot Learning Image Classification +2

MOFA: Modular Factorial Design for Hyperparameter Optimization

no code implementations 18 Nov 2020 Bo Xiong, Yimin Huang, Hanrong Ye, Steffen Staab, Zhenguo Li

MOFA pursues several rounds of HPO, where each round alternates between exploration of hyperparameter space by factorial design and exploitation of evaluation results by factorial analysis.

Hyperparameter Optimization Model Selection

A Practical Layer-Parallel Training Algorithm for Residual Networks

no code implementations 3 Sep 2020 Qi Sun, Hexin Dong, Zewei Chen, Weizhen Dian, Jiacheng Sun, Yitong Sun, Zhenguo Li, Bin Dong

Gradient-based algorithms for training ResNets typically require a forward pass of the input data, followed by back-propagating the objective gradient to update parameters, which are time-consuming for deep ResNets.

Data Augmentation

CurveLane-NAS: Unifying Lane-Sensitive Architecture Search and Adaptive Point Blending

1 code implementation ECCV 2020 Hang Xu, Shaoju Wang, Xinyue Cai, Wei Zhang, Xiaodan Liang, Zhenguo Li

In this paper, we propose a novel lane-sensitive architecture search framework named CurveLane-NAS to automatically capture both long-ranged coherent and accurate short-range curve information while unifying both architecture search and post-processing on curve lane predictions via point blending.

Autonomous Driving Lane Detection

AABO: Adaptive Anchor Box Optimization for Object Detection via Bayesian Sub-sampling

no code implementations ECCV 2020 Wenshuo Ma, Tingzhong Tian, Hang Xu, Yimin Huang, Zhenguo Li

By carefully analyzing the existing bounding box patterns on the feature hierarchy, we design a flexible and tight hyper-parameter space for anchor configurations.

Bayesian Optimization object-detection +1

An Asymptotically Optimal Multi-Armed Bandit Algorithm and Hyperparameter Optimization

1 code implementation 11 Jul 2020 Yimin Huang, Yu-Jun Li, Hanrong Ye, Zhenguo Li, Zhihua Zhang

The evaluation of hyperparameters, neural architectures, or data augmentation policies becomes a critical model selection problem in advanced deep learning with a large hyperparameter search space.

Bayesian Optimization Data Augmentation +6

Decoder-free Robustness Disentanglement without (Additional) Supervision

no code implementations 2 Jul 2020 Yifei Wang, Dan Peng, Furui Liu, Zhenguo Li, Zhitang Chen, Jiansheng Yang

Adversarial Training (AT) is proposed to alleviate the adversarial vulnerability of machine learning models by extracting only robust features from the input, which, however, inevitably leads to severe accuracy reduction as it discards the non-robust yet useful features.

BIG-bench Machine Learning Disentanglement

New Interpretations of Normalization Methods in Deep Learning

no code implementations 16 Jun 2020 Jiacheng Sun, Xiangyong Cao, Hanwen Liang, Weiran Huang, Zewei Chen, Zhenguo Li

In recent years, a variety of normalization methods have been proposed to help train neural networks, such as batch normalization (BN), layer normalization (LN), weight normalization (WN), group normalization (GN), etc.

LEMMA

Risk Variance Penalization

no code implementations 13 Jun 2020 Chuanlong Xie, Haotian Ye, Fei Chen, Yue Liu, Rui Sun, Zhenguo Li

The key of the out-of-distribution (OOD) generalization is to generalize invariance from training domains to target domains.

Boosting Few-Shot Learning With Adaptive Margin Loss

no code implementations CVPR 2020 Aoxue Li, Weiran Huang, Xu Lan, Jiashi Feng, Zhenguo Li, Li-Wei Wang

Few-shot learning (FSL) has attracted increasing attention in recent years but remains challenging, due to the intrinsic difficulty in learning to generalize from a few examples.

Few-Shot Image Classification Few-Shot Learning +2

Rethinking Performance Estimation in Neural Architecture Search

1 code implementation CVPR 2020 Xiawu Zheng, Rongrong Ji, Qiang Wang, Qixiang Ye, Zhenguo Li, Yonghong Tian, Qi Tian

In this paper, we provide a novel yet systematic rethinking of PE in a resource constrained regime, termed budgeted PE (BPE), which precisely and effectively estimates the performance of an architecture sampled from an architecture space.

Neural Architecture Search

AutoFIS: Automatic Feature Interaction Selection in Factorization Models for Click-Through Rate Prediction

4 code implementations 25 Mar 2020 Bin Liu, Chenxu Zhu, Guilin Li, Wei-Nan Zhang, Jincai Lai, Ruiming Tang, Xiuqiang He, Zhenguo Li, Yong Yu

By implementing a regularized optimizer over the architecture parameters, the model can automatically identify and remove the redundant feature interactions during the training process of the model.

Click-Through Rate Prediction Recommendation Systems

EHSOD: CAM-Guided End-to-end Hybrid-Supervised Object Detection with Cascade Refinement

no code implementations 18 Feb 2020 Linpu Fang, Hang Xu, Zhili Liu, Sarah Parisot, Zhenguo Li

In this paper, we study the hybrid-supervised object detection problem, aiming to train a high-quality detector with only a limited amount of fully annotated data while fully exploiting cheap data with image-level labels.

Object object-detection +1

Universal-RCNN: Universal Object Detector via Transferable Graph R-CNN

no code implementations 18 Feb 2020 Hang Xu, Linpu Fang, Xiaodan Liang, Wenxiong Kang, Zhenguo Li

Finally, an InterDomain Transfer Module is proposed to exploit diverse transfer dependencies across all domains and enhance the regional feature representation by attending and transferring semantic contexts globally.

Object object-detection +2

Multi-objective Neural Architecture Search via Non-stationary Policy Gradient

no code implementations 23 Jan 2020 Zewei Chen, Fengwei Zhou, George Trimponias, Zhenguo Li

Despite recent progress, the problem of approximating the full Pareto front accurately and efficiently remains challenging.

Neural Architecture Search Reinforcement Learning (RL)

MetaSelector: Meta-Learning for Recommendation with User-Level Adaptive Model Selection

no code implementations 22 Jan 2020 Mi Luo, Fei Chen, Pengxiang Cheng, Zhenhua Dong, Xiuqiang He, Jiashi Feng, Zhenguo Li

Recommender systems often face heterogeneous datasets containing highly personalized historical data of users, where no single model could give the best recommendation for every user.

Meta-Learning Model Selection +1

Meta-Learning PAC-Bayes Priors in Model Averaging

no code implementations 24 Dec 2019 Yimin Huang, Weiran Huang, Liang Li, Zhenguo Li

In this paper, we mainly consider the scenario in which we have a common model set used for model averaging instead of selecting a single final model via a model selection procedure to account for this model's uncertainty to improve the reliability and accuracy of inferences.

Meta-Learning Model Selection

SM-NAS: Structural-to-Modular Neural Architecture Search for Object Detection

no code implementations 22 Nov 2019 Lewei Yao, Hang Xu, Wei Zhang, Xiaodan Liang, Zhenguo Li

In this paper, we present a two-stage coarse-to-fine searching strategy named Structural-to-Modular NAS (SM-NAS) for searching a GPU-friendly design of both an efficient combination of modules and better modular-level architecture for object detection.

Neural Architecture Search Object +2

Bridging the Gap between Sample-based and One-shot Neural Architecture Search with BONAS

1 code implementation NeurIPS 2020 Han Shi, Renjie Pi, Hang Xu, Zhenguo Li, James T. Kwok, Tong Zhang

In this work, we propose BONAS (Bayesian Optimized Neural Architecture Search), a sample-based NAS framework which is accelerated using weight-sharing to evaluate multiple related architectures simultaneously.

Bayesian Optimization Neural Architecture Search

Hierarchical Neural Architecture Search via Operator Clustering

1 code implementation 26 Sep 2019 Guilin Li, Xing Zhang, Zitong Wang, Matthias Tan, Jiashi Feng, Zhenguo Li, Tong Zhang

Recently, the efficiency of automatic neural architecture design has been significantly improved by gradient-based search methods such as DARTS.

Clustering Neural Architecture Search

Multi-objective Neural Architecture Search via Predictive Network Performance Optimization

no code implementations 25 Sep 2019 Han Shi, Renjie Pi, Hang Xu, Zhenguo Li, James T. Kwok, Tong Zhang

Inspired by the nature of the graph structure of a neural network, we propose BOGCN-NAS, a NAS algorithm using Bayesian Optimization with Graph Convolutional Network (GCN) predictor.

Bayesian Optimization Neural Architecture Search

DARTS+: Improved Differentiable Architecture Search with Early Stopping

no code implementations 13 Sep 2019 Hanwen Liang, Shifeng Zhang, Jiacheng Sun, Xingqiu He, Weiran Huang, Kechen Zhuang, Zhenguo Li

Therefore, we propose a simple and effective algorithm, named "DARTS+", to avoid the collapse and improve the original DARTS, by "early stopping" the search procedure when meeting a certain criterion.

Meta Reinforcement Learning with Task Embedding and Shared Policy

2 code implementations 16 May 2019 Lin Lan, Zhenguo Li, Xiaohong Guan, Pinghui Wang

Despite significant progress, deep reinforcement learning (RL) suffers from data-inefficiency and limited generalization.

Meta-Learning Meta Reinforcement Learning +2

Formulating Camera-Adaptive Color Constancy as a Few-shot Meta-Learning Problem

no code implementations 28 Nov 2018 Steven McDonagh, Sarah Parisot, Fengwei Zhou, Xing Zhang, Ales Leonardis, Zhenguo Li, Gregory Slabaugh

In this work, we propose a new approach that affords fast adaptation to previously unseen cameras, and robustness to changes in capture device by leveraging annotated samples across different cameras and datasets.

Few-Shot Camera-Adaptive Color Constancy Meta-Learning

DeepFM: An End-to-End Wide & Deep Learning Framework for CTR Prediction

8 code implementations 12 Apr 2018 Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, Xiuqiang He, Zhenhua Dong

In this paper, we study two instances of DeepFM where its "deep" component is DNN and PNN respectively, which we denote as DeepFM-D and DeepFM-P. Comprehensive experiments are conducted to demonstrate the effectiveness of DeepFM-D and DeepFM-P over the existing models for CTR prediction, on both benchmark data and commercial data.

Click-Through Rate Prediction Feature Engineering +1

Federated Meta-Learning with Fast Convergence and Efficient Communication

1 code implementation 22 Feb 2018 Fei Chen, Mi Luo, Zhenhua Dong, Zhenguo Li, Xiuqiang He

Statistical and systematic challenges in collaboratively training machine learning models across distributed networks of mobile devices have been the bottlenecks in the real-world application of federated learning.

Federated Learning Meta-Learning +1

Deep Meta-Learning: Learning to Learn in the Concept Space

no code implementations 10 Feb 2018 Fengwei Zhou, Bin Wu, Zhenguo Li

Few-shot learning remains challenging for meta-learning that learns a learning algorithm (meta-learner) from many related tasks.

Few-Shot Learning

Graph Edge Partitioning via Neighborhood Heuristic

1 code implementation 13 Aug 2017 Chenzi Zhang, Fan Wei, Qin Liu, Zhihao Gavin Tang, Zhenguo Li

We provide a worst-case upper bound of replication factor for our heuristic on general graphs.

Meta-SGD: Learning to Learn Quickly for Few-Shot Learning

9 code implementations 31 Jul 2017 Zhenguo Li, Fengwei Zhou, Fei Chen, Hang Li

In contrast, meta-learning learns from many related tasks a meta-learner that can learn a new task more accurately and faster with fewer examples, where the choice of meta-learners is crucial.

Few-Shot Learning reinforcement-learning +1

New Insights Into Laplacian Similarity Search

no code implementations CVPR 2015 Xiao-Ming Wu, Zhenguo Li, Shih-Fu Chang

Graph-based computer vision applications rely critically on similarity metrics which compute the pairwise similarity between any pair of vertices on graphs.

Image Retrieval Retrieval

Locally Linear Hashing for Extracting Non-Linear Manifolds

no code implementations CVPR 2014 Go Irie, Zhenguo Li, Xiao-Ming Wu, Shih-Fu Chang

Previous efforts in hashing intend to preserve data variance or pairwise affinity, but neither is adequate in capturing the manifold structures hidden in most visual data.

Quantization

Analyzing the Harmonic Structure in Graph-Based Learning

no code implementations NeurIPS 2013 Xiao-Ming Wu, Zhenguo Li, Shih-Fu Chang

We show that, either explicitly or implicitly, various well-known graph-based models exhibit a common significant harmonic structure in their target functions: the value of a vertex is approximately the weighted average of the values of its adjacent neighbors.
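
The harmonic property described in this excerpt can be illustrated on a toy graph. The sketch below is not the paper's analysis: it assumes a small weighted graph with two vertices held at known values, and repeatedly sets every other vertex to the weighted average of its neighbors until the values settle. The function name and the example graph are hypothetical.

```python
def harmonic_extension(weights, fixed, sweeps=500):
    """weights: dict vertex -> dict neighbor -> positive edge weight.
    fixed: dict vertex -> known value that is never updated.
    Returns dict vertex -> value after the averaging sweeps."""
    values = {v: fixed.get(v, 0.0) for v in weights}
    for _ in range(sweeps):
        for v, nbrs in weights.items():
            if v in fixed:
                continue  # boundary vertices keep their given values
            total = sum(nbrs.values())
            # Weighted average of the neighbors' current values.
            values[v] = sum(w * values[n] for n, w in nbrs.items()) / total
    return values


# Path graph 0 - 1 - 2 - 3 with unit weights (assumed example).
path = {
    0: {1: 1.0},
    1: {0: 1.0, 2: 1.0},
    2: {1: 1.0, 3: 1.0},
    3: {2: 1.0},
}
values = harmonic_extension(path, fixed={0: 1.0, 3: 0.0})
```

On this path graph the interior values interpolate linearly between the two fixed endpoints, which is the exact (rather than approximate) form of the averaging property.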

Learning with Partially Absorbing Random Walks

no code implementations NeurIPS 2012 Xiao-Ming Wu, Zhenguo Li, Anthony M. So, John Wright, Shih-Fu Chang

We prove that under proper absorption rates, a random walk starting from a set $\mathcal{S}$ of low conductance will be mostly absorbed in $\mathcal{S}$.

Fast Graph Laplacian Regularized Kernel Learning via Semidefinite–Quadratic–Linear Programming

no code implementations NeurIPS 2009 Xiao-Ming Wu, Anthony M. So, Zhenguo Li, Shuo-Yen R. Li

In this paper, we show that a large class of kernel learning problems can be reformulated as semidefinite-quadratic-linear programs (SQLPs), which only contain a simple positive semidefinite constraint, a second-order cone constraint and a number of linear constraints.

Computational Efficiency Constrained Clustering +1
