Search Results for author: Jiayuan Fan

Found 24 papers, 10 papers with code

LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning

1 code implementation 30 Nov 2023 Sijin Chen, Xin Chen, Chi Zhang, Mingsheng Li, Gang Yu, Hao Fei, Hongyuan Zhu, Jiayuan Fan, Tao Chen

However, developing LMMs that can comprehend, reason, and plan in complex and diverse 3D environments remains a challenging topic, especially considering the demand for understanding permutation-invariant point cloud 3D representations of the 3D scene.

3D dense captioning Dense Captioning +1

Performance-aware Approximation of Global Channel Pruning for Multitask CNNs

1 code implementation 21 Mar 2023 Hancheng Ye, Bo Zhang, Tao Chen, Jiayuan Fan, Bin Wang

Global channel pruning (GCP) aims to remove a subset of channels (filters) across different layers from a deep model without hurting the performance.

Model Compression

$β$-DARTS: Beta-Decay Regularization for Differentiable Architecture Search

1 code implementation 3 Mar 2022 Peng Ye, Baopu Li, Yikang Li, Tao Chen, Jiayuan Fan, Wanli Ouyang

Neural Architecture Search (NAS) has attracted increasingly more attention in recent years because of its capability to design deep neural networks automatically.

Neural Architecture Search

β-DARTS: Beta-Decay Regularization for Differentiable Architecture Search

1 code implementation CVPR 2022 Peng Ye, Baopu Li, Yikang Li, Tao Chen, Jiayuan Fan, Wanli Ouyang

Neural Architecture Search (NAS) has attracted increasingly more attention in recent years because of its capability to design deep neural networks automatically.

Neural Architecture Search

ShapeGPT: 3D Shape Generation with A Unified Multi-modal Language Model

1 code implementation 29 Nov 2023 Fukun Yin, Xin Chen, Chi Zhang, Biao Jiang, Zibo Zhao, Jiayuan Fan, Gang Yu, Taihao Li, Tao Chen

The advent of large language models, enabling flexibility through instruction-driven approaches, has revolutionized many traditional generative tasks, but large models for 3D data, particularly in comprehensively handling 3D shapes with other modalities, are still under-explored.

3D Shape Generation Language Modelling +1

MotionChain: Conversational Motion Controllers via Multimodal Prompts

1 code implementation 2 Apr 2024 Biao Jiang, Xin Chen, Chi Zhang, Fukun Yin, Zhuoyuan Li, Gang Yu, Jiayuan Fan

However, this proficiency remains largely unexplored in other multimodal generative models, particularly in human motion models.

Language Modelling

Point Cloud Instance Segmentation with Semi-supervised Bounding-Box Mining

1 code implementation 30 Nov 2021 Yongbin Liao, Hongyuan Zhu, Yanggang Zhang, Chuangguan Ye, Tao Chen, Jiayuan Fan

For stage two, the bounding box proposals with SPCR are grouped into some subsets, and the instance masks are mined inside each subset with a novel semantic propagation module and a property consistency graph module.

Instance Segmentation Semantic Segmentation

What Makes for Effective Few-shot Point Cloud Classification?

1 code implementation 31 Mar 2023 Chuangguan Ye, Hongyuan Zhu, Yongbin Liao, Yanggang Zhang, Tao Chen, Jiayuan Fan

Due to the emergence of powerful computing resources and large-scale annotated datasets, deep learning has seen wide applications in our daily life.

Benchmarking Classification +2

Coarse-to-Fine Gaze Redirection with Numerical and Pictorial Guidance

1 code implementation 7 Apr 2020 Jingjing Chen, Jichao Zhang, Enver Sangineto, Jiayuan Fan, Tao Chen, Nicu Sebe

In this paper, we propose to alleviate these problems by means of a novel gaze redirection framework which exploits both a numerical and a pictorial direction guidance, jointly with a coarse-to-fine learning strategy.

gaze redirection Image Generation

PIDNet: An Efficient Network for Dynamic Pedestrian Intrusion Detection

no code implementations 1 Sep 2020 Jingchen Sun, Jiming Chen, Tao Chen, Jiayuan Fan, Shibo He

Vision-based dynamic pedestrian intrusion detection (PID), judging whether pedestrians intrude an area-of-interest (AoI) by a moving camera, is an important task in mobile surveillance.

Feature Compression Intrusion Detection

EADNet: Efficient Asymmetric Dilated Network for Semantic Segmentation

no code implementations 16 Mar 2021 Qihang Yang, Tao Chen, Jiayuan Fan, Ye Lu, Chongyan Zuo, Qinghua Chi

Due to real-time image semantic segmentation needs on power-constrained edge devices, there has been an increasing desire to design lightweight semantic segmentation neural networks, to simultaneously reduce computational cost and increase inference speed.

Segmentation Semantic Segmentation

Object-aware Long-short-range Spatial Alignment for Few-Shot Fine-Grained Image Classification

no code implementations 30 Aug 2021 Yike Wu, Bo Zhang, Gang Yu, Weixi Zhang, Bin Wang, Tao Chen, Jiayuan Fan

The goal of few-shot fine-grained image classification is to recognize rarely seen fine-grained objects in the query set, given only a few samples of this class in the support set.

Fine-Grained Image Classification Object +3

Densely Semantic Enhancement for Domain Adaptive Region-free Detectors

no code implementations 30 Aug 2021 Bo Zhang, Tao Chen, Bin Wang, Xiaofeng Wu, Liming Zhang, Jiayuan Fan

Unsupervised domain adaptive object detection aims to adapt a well-trained detector from its original source domain with rich labeled data to a new target domain with unlabeled data.

object-detection Object Detection +1

Instance-aware Model Ensemble With Distillation For Unsupervised Domain Adaptation

no code implementations 15 Nov 2022 Weimin Wu, Jiayuan Fan, Tao Chen, Hancheng Ye, Bo Zhang, Baopu Li

To enhance the model adaptability between domains and reduce the computational cost when deploying the ensemble model, we propose a novel framework, namely Instance-aware Model Ensemble With Distillation (IMED), which fuses multiple UDA component models adaptively according to different instances and distills these components into a small model.

Knowledge Distillation Unsupervised Domain Adaptation

JNDMix: JND-Based Data Augmentation for No-reference Image Quality Assessment

no code implementations 20 Feb 2023 Jiamu Sheng, Jiayuan Fan, Peng Ye, JianJian Cao

Despite substantial progress in no-reference image quality assessment (NR-IQA), previous training models often suffer from over-fitting due to the limited scale of used datasets, resulting in model performance bottlenecks.

Data Augmentation No-Reference Image Quality Assessment +1

A2S-NAS: Asymmetric Spectral-Spatial Neural Architecture Search For Hyperspectral Image Classification

no code implementations 23 Feb 2023 Lin Zhan, Jiayuan Fan, Peng Ye, JianJian Cao

To address the above issues, we propose a multi-stage search architecture in order to overcome asymmetric spectral-spatial dimensions and capture significant features.

Hyperspectral Image Classification Neural Architecture Search

Boost Vision Transformer with GPU-Friendly Sparsity and Quantization

no code implementations CVPR 2023 Chong Yu, Tao Chen, Zhongxue Gan, Jiayuan Fan

Moreover, GPUSQ-ViT can boost actual deployment performance by 1.39-1.79 times and 3.22-3.43 times of latency and throughput on A100 GPU, and 1.57-1.69 times and 2.11-2.51 times improvement of latency and throughput on AGX Orin.

Benchmarking Knowledge Distillation +1

When Hyperspectral Image Classification Meets Diffusion Models: An Unsupervised Feature Learning Framework

no code implementations 15 Jun 2023 Jingyi Zhou, Jiamu Sheng, Jiayuan Fan, Peng Ye, Tong He, Bin Wang, Tao Chen

Learning effective spectral-spatial features is important for the hyperspectral image (HSI) classification task, but the majority of existing HSI classification methods still suffer from modeling complex spectral-spatial relations and characterizing low-level details and high-level semantics comprehensively.

Classification Hyperspectral Image Classification

VQ-NeRF: Vector Quantization Enhances Implicit Neural Representations

no code implementations 23 Oct 2023 Yiying Yang, Wen Liu, Fukun Yin, Xin Chen, Gang Yu, Jiayuan Fan, Tao Chen

Recent advancements in implicit neural representations have contributed to high-fidelity surface reconstruction and photorealistic novel view synthesis.

Novel View Synthesis Quantization +1

Rethinking of Feature Interaction for Multi-task Learning on Dense Prediction

no code implementations 21 Dec 2023 Jingdong Zhang, Jiayuan Fan, Peng Ye, Bo Zhang, Hancheng Ye, Baopu Li, Yancheng Cai, Tao Chen

In this work, we propose to learn a comprehensive intermediate feature globally from both task-generic and task-specific features, and we reveal an important fact: this intermediate feature, namely the bridge feature, is a good solution to the above issues.

Multi-Task Learning

MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies

no code implementations 3 Mar 2024 Zhende Song, Chenchen Wang, Jiamu Sheng, Chi Zhang, Gang Yu, Jiayuan Fan, Tao Chen

The development of multimodal models has marked a significant step forward in how machines understand videos.

Video Understanding
