Search Results for author: Jiayuan Fan

Found 24 papers, 10 papers with code

LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning

1 code implementation • 30 Nov 2023 • Sijin Chen, Xin Chen, Chi Zhang, Mingsheng Li, Gang Yu, Hao Fei, Hongyuan Zhu, Jiayuan Fan, Tao Chen

However, developing LMMs that can comprehend, reason, and plan in complex and diverse 3D environments remains a challenging topic, especially considering the demand for understanding permutation-invariant point cloud 3D representations of the 3D scene.

3D dense captioning Dense Captioning +1

Performance-aware Approximation of Global Channel Pruning for Multitask CNNs

1 code implementation • 21 Mar 2023 • Hancheng Ye, Bo Zhang, Tao Chen, Jiayuan Fan, Bin Wang

Global channel pruning (GCP) aims to remove a subset of channels (filters) across different layers from a deep model without hurting the performance.

Model Compression

$β$-DARTS: Beta-Decay Regularization for Differentiable Architecture Search

1 code implementation • 3 Mar 2022 • Peng Ye, Baopu Li, Yikang Li, Tao Chen, Jiayuan Fan, Wanli Ouyang

Neural Architecture Search (NAS) has attracted increasing attention in recent years because of its capability to design deep neural networks automatically.

Ranked #1 on Neural Architecture Search on NAS-Bench-201, CIFAR-100

Neural Architecture Search

$β$-DARTS: Beta-Decay Regularization for Differentiable Architecture Search

1 code implementation • CVPR 2022 • Peng Ye, Baopu Li, Yikang Li, Tao Chen, Jiayuan Fan, Wanli Ouyang

Neural Architecture Search (NAS) has attracted increasing attention in recent years because of its capability to design deep neural networks automatically.

Neural Architecture Search

ShapeGPT: 3D Shape Generation with A Unified Multi-modal Language Model

1 code implementation • 29 Nov 2023 • Fukun Yin, Xin Chen, Chi Zhang, Biao Jiang, Zibo Zhao, Jiayuan Fan, Gang Yu, Taihao Li, Tao Chen

The advent of large language models, enabling flexibility through instruction-driven approaches, has revolutionized many traditional generative tasks, but large models for 3D data, particularly in comprehensively handling 3D shapes with other modalities, are still under-explored.

3D Shape Generation Language Modelling +1

MotionChain: Conversational Motion Controllers via Multimodal Prompts

1 code implementation • 2 Apr 2024 • Biao Jiang, Xin Chen, Chi Zhang, Fukun Yin, Zhuoyuan Li, Gang Yu, Jiayuan Fan

However, this proficiency remains largely unexplored in other multimodal generative models, particularly in human motion models.

Language Modelling

Learning Cross-Image Object Semantic Relation in Transformer for Few-Shot Fine-Grained Image Classification

1 code implementation • 2 Jul 2022 • Bo Zhang, Jiakang Yuan, Baopu Li, Tao Chen, Jiayuan Fan, Botian Shi

Few-shot fine-grained learning aims to classify a query image into one of a set of support categories with fine-grained differences.

Fine-Grained Image Classification Object +1

Point Cloud Instance Segmentation with Semi-supervised Bounding-Box Mining

1 code implementation • 30 Nov 2021 • Yongbin Liao, Hongyuan Zhu, Yanggang Zhang, Chuangguan Ye, Tao Chen, Jiayuan Fan

For stage two, the bounding box proposals with SPCR are grouped into some subsets, and the instance masks are mined inside each subset with a novel semantic propagation module and a property consistency graph module.

Instance Segmentation Semantic Segmentation

What Makes for Effective Few-shot Point Cloud Classification?

1 code implementation • 31 Mar 2023 • Chuangguan Ye, Hongyuan Zhu, Yongbin Liao, Yanggang Zhang, Tao Chen, Jiayuan Fan

Due to the emergence of powerful computing resources and large-scale annotated datasets, deep learning has seen wide applications in our daily life.

Benchmarking Classification +2

Coarse-to-Fine Gaze Redirection with Numerical and Pictorial Guidance

1 code implementation • 7 Apr 2020 • Jingjing Chen, Jichao Zhang, Enver Sangineto, Jiayuan Fan, Tao Chen, Nicu Sebe

In this paper, we propose to alleviate these problems by means of a novel gaze redirection framework which exploits both a numerical and a pictorial direction guidance, jointly with a coarse-to-fine learning strategy.

gaze redirection Image Generation

PIDNet: An Efficient Network for Dynamic Pedestrian Intrusion Detection

no code implementations • 1 Sep 2020 • Jingchen Sun, Jiming Chen, Tao Chen, Jiayuan Fan, Shibo He

Vision-based dynamic pedestrian intrusion detection (PID), which judges from a moving camera whether pedestrians intrude into an area-of-interest (AoI), is an important task in mobile surveillance.

Feature Compression Intrusion Detection

EADNet: Efficient Asymmetric Dilated Network for Semantic Segmentation

no code implementations • 16 Mar 2021 • Qihang Yang, Tao Chen, Jiayuan Fan, Ye Lu, Chongyan Zuo, Qinghua Chi

Because power-constrained edge devices need real-time image semantic segmentation, there has been an increasing desire to design lightweight semantic segmentation neural networks that simultaneously reduce computational cost and increase inference speed.

Segmentation Semantic Segmentation

Object-aware Long-short-range Spatial Alignment for Few-Shot Fine-Grained Image Classification

no code implementations • 30 Aug 2021 • Yike Wu, Bo Zhang, Gang Yu, Weixi Zhang, Bin Wang, Tao Chen, Jiayuan Fan

The goal of few-shot fine-grained image classification is to recognize rarely seen fine-grained objects in the query set, given only a few samples of this class in the support set.

Fine-Grained Image Classification Object +3

Densely Semantic Enhancement for Domain Adaptive Region-free Detectors

no code implementations • 30 Aug 2021 • Bo Zhang, Tao Chen, Bin Wang, Xiaofeng Wu, Liming Zhang, Jiayuan Fan

Unsupervised domain adaptive object detection aims to adapt a well-trained detector from its original source domain with rich labeled data to a new target domain with unlabeled data.

object-detection Object Detection +1

Efficient Joint-Dimensional Search with Solution Space Regularization for Real-Time Semantic Segmentation

no code implementations • 10 Aug 2022 • Peng Ye, Baopu Li, Tao Chen, Jiayuan Fan, Zhen Mei, Chen Lin, Chongyan Zuo, Qinghua Chi, Wanli Ouyang

In this paper, we intend to search for an optimal network structure that can run in real-time for this problem.

Real-Time Semantic Segmentation

Instance-aware Model Ensemble With Distillation For Unsupervised Domain Adaptation

no code implementations • 15 Nov 2022 • Weimin Wu, Jiayuan Fan, Tao Chen, Hancheng Ye, Bo Zhang, Baopu Li

To enhance the model adaptability between domains and reduce the computational cost when deploying the ensemble model, we propose a novel framework, namely Instance-aware Model Ensemble With Distillation (IMED), which fuses multiple UDA component models adaptively according to different instances and distills these components into a small model.

Knowledge Distillation Unsupervised Domain Adaptation

A Large-Scale Outdoor Multi-modal Dataset and Benchmark for Novel View Synthesis and Implicit Scene Reconstruction

no code implementations • ICCV 2023 • Chongshan Lu, Fukun Yin, Xin Chen, Tao Chen, Gang Yu, Jiayuan Fan

Meanwhile, a new benchmark for several outdoor NeRF-based tasks is established, such as novel view synthesis, surface reconstruction, and multi-modal NeRF.

Novel View Synthesis Surface Reconstruction

JNDMix: JND-Based Data Augmentation for No-reference Image Quality Assessment

no code implementations • 20 Feb 2023 • Jiamu Sheng, Jiayuan Fan, Peng Ye, JianJian Cao

Despite substantial progress in no-reference image quality assessment (NR-IQA), previously trained models often suffer from over-fitting due to the limited scale of the datasets used, resulting in model performance bottlenecks.

Data Augmentation No-Reference Image Quality Assessment +1

A2S-NAS: Asymmetric Spectral-Spatial Neural Architecture Search For Hyperspectral Image Classification

no code implementations • 23 Feb 2023 • Lin Zhan, Jiayuan Fan, Peng Ye, JianJian Cao

To address the above issues, we propose a multi-stage search architecture in order to overcome asymmetric spectral-spatial dimensions and capture significant features.

Hyperspectral Image Classification Neural Architecture Search

Boost Vision Transformer with GPU-Friendly Sparsity and Quantization

no code implementations • CVPR 2023 • Chong Yu, Tao Chen, Zhongxue Gan, Jiayuan Fan

Moreover, GPUSQ-ViT can boost actual deployment performance by 1.39-1.79 times and 3.22-3.43 times in latency and throughput on the A100 GPU, and by 1.57-1.69 times and 2.11-2.51 times in latency and throughput on AGX Orin.

Benchmarking Knowledge Distillation +1

When Hyperspectral Image Classification Meets Diffusion Models: An Unsupervised Feature Learning Framework

no code implementations • 15 Jun 2023 • Jingyi Zhou, Jiamu Sheng, Jiayuan Fan, Peng Ye, Tong He, Bin Wang, Tao Chen

Learning effective spectral-spatial features is important for the hyperspectral image (HSI) classification task, but the majority of existing HSI classification methods still suffer from modeling complex spectral-spatial relations and characterizing low-level details and high-level semantics comprehensively.

Classification Hyperspectral Image Classification

VQ-NeRF: Vector Quantization Enhances Implicit Neural Representations

no code implementations • 23 Oct 2023 • Yiying Yang, Wen Liu, Fukun Yin, Xin Chen, Gang Yu, Jiayuan Fan, Tao Chen

Recent advancements in implicit neural representations have contributed to high-fidelity surface reconstruction and photorealistic novel view synthesis.

Novel View Synthesis Quantization +1

Rethinking of Feature Interaction for Multi-task Learning on Dense Prediction

no code implementations • 21 Dec 2023 • Jingdong Zhang, Jiayuan Fan, Peng Ye, Bo Zhang, Hancheng Ye, Baopu Li, Yancheng Cai, Tao Chen

In this work, we propose to learn a comprehensive intermediate feature globally from both task-generic and task-specific features, and we reveal an important fact: this intermediate feature, namely the bridge feature, is a good solution to the above issues.

Multi-Task Learning

MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies

no code implementations • 3 Mar 2024 • Zhende Song, Chenchen Wang, Jiamu Sheng, Chi Zhang, Gang Yu, Jiayuan Fan, Tao Chen

The development of multimodal models has marked a significant step forward in how machines understand videos.

Video Understanding
