Search Results for author: Jiayuan Fan

Found 29 papers, 12 papers with code

Scene123: One Prompt to 3D Scene Generation via Video-Assisted and Consistency-Enhanced MAE

no code implementations 10 Aug 2024 Yiying Yang, Fukun Yin, Jiayuan Fan, Xin Chen, Wanzhang Li, Gang Yu

As Artificial Intelligence Generated Content (AIGC) advances, a variety of methods have been developed to generate text, images, videos, and 3D objects from single or multimodal inputs, contributing efforts to emulate human-like cognitive content creation.

Scene Generation Video Generation

Lightweight Model Pre-training via Language Guided Knowledge Distillation

1 code implementation 17 Jun 2024 Mingsheng Li, Lin Zhang, Mingzhen Zhu, Zilong Huang, Gang Yu, Jiayuan Fan, Tao Chen

In this paper, for the first time, we introduce language guidance to the distillation process and propose a new method named Language-Guided Distillation (LGD) system, which uses category names of the target downstream task to help refine the knowledge transferred between the teacher and student.

Knowledge Distillation

DualMamba: A Lightweight Spectral-Spatial Mamba-Convolution Network for Hyperspectral Image Classification

no code implementations 11 Jun 2024 Jiamu Sheng, Jingyi Zhou, Jiong Wang, Peng Ye, Jiayuan Fan

Finally, the adaptive global-local fusion is proposed to dynamically combine global Mamba features and local convolution features for a global-local spectral-spatial representation.

Classification Hyperspectral Image Classification +1

Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision

no code implementations 5 Jun 2024 Minglei Li, Peng Ye, Yongqi Huang, Lin Zhang, Tao Chen, Tong He, Jiayuan Fan, Wanli Ouyang

Parameter-efficient fine-tuning (PEFT) has become increasingly important as foundation models continue to grow in both popularity and size.

3D Classification parameter-efficient fine-tuning

MotionChain: Conversational Motion Controllers via Multimodal Prompts

1 code implementation 2 Apr 2024 Biao Jiang, Xin Chen, Chi Zhang, Fukun Yin, Zhuoyuan Li, Gang Yu, Jiayuan Fan

However, this proficiency remains largely unexplored in other multimodal generative models, particularly in human motion models.

Language Modeling Language Modelling +1

BridgeNet: Comprehensive and Effective Feature Interactions via Bridge Feature for Multi-task Dense Predictions

no code implementations 21 Dec 2023 Jingdong Zhang, Jiayuan Fan, Peng Ye, Bo Zhang, Hancheng Ye, Baopu Li, Yancheng Cai, Tao Chen

Multi-task dense prediction aims at handling multiple pixel-wise prediction tasks within a unified network simultaneously for visual scene understanding.

Decoder Multi-Task Learning +1

LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning

1 code implementation 30 Nov 2023 Sijin Chen, Xin Chen, Chi Zhang, Mingsheng Li, Gang Yu, Hao Fei, Hongyuan Zhu, Jiayuan Fan, Tao Chen

However, developing LMMs that can comprehend, reason, and plan in complex and diverse 3D environments remains a challenging topic, especially considering the demand for understanding permutation-invariant point cloud 3D representations of the 3D scene.

3D dense captioning Dense Captioning +1

ShapeGPT: 3D Shape Generation with A Unified Multi-modal Language Model

no code implementations 29 Nov 2023 Fukun Yin, Xin Chen, Chi Zhang, Biao Jiang, Zibo Zhao, Jiayuan Fan, Gang Yu, Taihao Li, Tao Chen

The advent of large language models, enabling flexibility through instruction-driven approaches, has revolutionized many traditional generative tasks, but large models for 3D data, particularly in comprehensively handling 3D shapes with other modalities, are still under-explored.

3D Shape Generation Language Modeling +2

VQ-NeRF: Vector Quantization Enhances Implicit Neural Representations

no code implementations 23 Oct 2023 Yiying Yang, Wen Liu, Fukun Yin, Xin Chen, Gang Yu, Jiayuan Fan, Tao Chen

Recent advancements in implicit neural representations have contributed to high-fidelity surface reconstruction and photorealistic novel view synthesis.

Decoder NeRF +3

Exploring Multi-Timestep Multi-Stage Diffusion Features for Hyperspectral Image Classification

1 code implementation 15 Jun 2023 Jingyi Zhou, Jiamu Sheng, Jiayuan Fan, Peng Ye, Tong He, Bin Wang, Tao Chen

To address this issue, we propose a novel diffusion-based feature learning framework that explores Multi-Timestep Multi-Stage Diffusion features for HSI classification for the first time, called MTMSD.

Classification Hyperspectral Image Classification

Boost Vision Transformer with GPU-Friendly Sparsity and Quantization

no code implementations CVPR 2023 Chong Yu, Tao Chen, Zhongxue Gan, Jiayuan Fan

Moreover, GPUSQ-ViT can boost actual deployment performance by 1.39-1.79 times and 3.22-3.43 times of latency and throughput on A100 GPU, and 1.57-1.69 times and 2.11-2.51 times improvement of latency and throughput on AGX Orin.

Benchmarking Knowledge Distillation +1

What Makes for Effective Few-shot Point Cloud Classification?

1 code implementation 31 Mar 2023 Chuangguan Ye, Hongyuan Zhu, Yongbin Liao, Yanggang Zhang, Tao Chen, Jiayuan Fan

Due to the emergence of powerful computing resources and large-scale annotated datasets, deep learning has seen wide applications in our daily life.

Benchmarking Classification +2

Performance-aware Approximation of Global Channel Pruning for Multitask CNNs

2 code implementations 21 Mar 2023 Hancheng Ye, Bo Zhang, Tao Chen, Jiayuan Fan, Bin Wang

Global channel pruning (GCP) aims to remove a subset of channels (filters) across different layers from a deep model without hurting the performance.

Model Compression

A2S-NAS: Asymmetric Spectral-Spatial Neural Architecture Search For Hyperspectral Image Classification

no code implementations 23 Feb 2023 Lin Zhan, Jiayuan Fan, Peng Ye, JianJian Cao

To address the above issues, we propose a multi-stage search architecture in order to overcome asymmetric spectral-spatial dimensions and capture significant features.

Hyperspectral Image Classification Neural Architecture Search

JNDMix: JND-Based Data Augmentation for No-reference Image Quality Assessment

no code implementations 20 Feb 2023 Jiamu Sheng, Jiayuan Fan, Peng Ye, JianJian Cao

Despite substantial progress in no-reference image quality assessment (NR-IQA), previous training models often suffer from over-fitting due to the limited scale of used datasets, resulting in model performance bottlenecks.

Data Augmentation NR-IQA

A Large-Scale Outdoor Multi-modal Dataset and Benchmark for Novel View Synthesis and Implicit Scene Reconstruction

no code implementations ICCV 2023 Chongshan Lu, Fukun Yin, Xin Chen, Tao Chen, Gang Yu, Jiayuan Fan

Meanwhile, a new benchmark for several outdoor NeRF-based tasks is established, such as novel view synthesis, surface reconstruction, and multi-modal NeRF.

NeRF Novel View Synthesis +1

Instance-aware Model Ensemble With Distillation For Unsupervised Domain Adaptation

no code implementations 15 Nov 2022 Weimin Wu, Jiayuan Fan, Tao Chen, Hancheng Ye, Bo Zhang, Baopu Li

To enhance the model adaptability between domains and reduce the computational cost when deploying the ensemble model, we propose a novel framework, namely Instance-aware Model Ensemble With Distillation (IMED), which fuses multiple UDA component models adaptively according to different instances and distills these components into a small model.

Knowledge Distillation Unsupervised Domain Adaptation

$β$-DARTS: Beta-Decay Regularization for Differentiable Architecture Search

1 code implementation 3 Mar 2022 Peng Ye, Baopu Li, Yikang Li, Tao Chen, Jiayuan Fan, Wanli Ouyang

Neural Architecture Search (NAS) has attracted increasingly more attention in recent years because of its capability to design deep neural networks automatically.

Neural Architecture Search

b-DARTS: Beta-Decay Regularization for Differentiable Architecture Search

1 code implementation CVPR 2022 Peng Ye, Baopu Li, Yikang Li, Tao Chen, Jiayuan Fan, Wanli Ouyang

Neural Architecture Search (NAS) has attracted increasingly more attention in recent years because of its capability to design deep neural networks automatically.

Neural Architecture Search

Point Cloud Instance Segmentation with Semi-supervised Bounding-Box Mining

1 code implementation 30 Nov 2021 Yongbin Liao, Hongyuan Zhu, Yanggang Zhang, Chuangguan Ye, Tao Chen, Jiayuan Fan

For stage two, the bounding box proposals with SPCR are grouped into some subsets, and the instance masks are mined inside each subset with a novel semantic propagation module and a property consistency graph module.

Instance Segmentation Semantic Segmentation

Densely Semantic Enhancement for Domain Adaptive Region-free Detectors

no code implementations 30 Aug 2021 Bo Zhang, Tao Chen, Bin Wang, Xiaofeng Wu, Liming Zhang, Jiayuan Fan

Unsupervised domain adaptive object detection aims to adapt a well-trained detector from its original source domain with rich labeled data to a new target domain with unlabeled data.

object-detection Object Detection +1

Object-aware Long-short-range Spatial Alignment for Few-Shot Fine-Grained Image Classification

no code implementations 30 Aug 2021 Yike Wu, Bo Zhang, Gang Yu, Weixi Zhang, Bin Wang, Tao Chen, Jiayuan Fan

The goal of few-shot fine-grained image classification is to recognize rarely seen fine-grained objects in the query set, given only a few samples of this class in the support set.

Fine-Grained Image Classification Object +3

EADNet: Efficient Asymmetric Dilated Network for Semantic Segmentation

no code implementations 16 Mar 2021 Qihang Yang, Tao Chen, Jiayuan Fan, Ye Lu, Chongyan Zuo, Qinghua Chi

Due to real-time image semantic segmentation needs on power-constrained edge devices, there has been an increasing desire to design lightweight semantic segmentation neural networks that simultaneously reduce computational cost and increase inference speed.

Segmentation Semantic Segmentation

PIDNet: An Efficient Network for Dynamic Pedestrian Intrusion Detection

no code implementations 1 Sep 2020 Jingchen Sun, Jiming Chen, Tao Chen, Jiayuan Fan, Shibo He

Vision-based dynamic pedestrian intrusion detection (PID), which judges whether pedestrians intrude into an area-of-interest (AoI) viewed by a moving camera, is an important task in mobile surveillance.

Feature Compression Intrusion Detection

Coarse-to-Fine Gaze Redirection with Numerical and Pictorial Guidance

1 code implementation 7 Apr 2020 Jingjing Chen, Jichao Zhang, Enver Sangineto, Jiayuan Fan, Tao Chen, Nicu Sebe

In this paper, we propose to alleviate these problems by means of a novel gaze redirection framework which exploits both a numerical and a pictorial direction guidance, jointly with a coarse-to-fine learning strategy.

gaze redirection Image Generation
