Search Results for author: Yue Fan

Found 23 papers, 10 papers with code

Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting

no code implementations • 22 Mar 2024 • Jun Guo, Xiaojian Ma, Yue Fan, Huaping Liu, Qing Li

Open-vocabulary 3D scene understanding presents a significant challenge in computer vision, with wide-ranging applications in embodied agents and augmented reality systems.

Scene Understanding Segmentation +2

VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding

no code implementations • 18 Mar 2024 • Yue Fan, Xiaojian Ma, Rujie Wu, Yuntao Du, Jiaqi Li, Zhi Gao, Qing Li

We explore how reconciling several foundation models (large language models and vision-language models) with a novel unified memory mechanism could tackle the challenging video understanding problem, especially capturing the long-term temporal relations in lengthy videos.

Video Understanding

Parameter-Efficient Fine-Tuning for Pre-Trained Vision Models: A Survey

1 code implementation • 3 Feb 2024 • Yi Xin, Siqi Luo, Haodi Zhou, Junlong Du, Xiaohong Liu, Yue Fan, Qing Li, Yuntao Du

Large-scale pre-trained vision models (PVMs) have shown great potential for adaptability across various downstream vision tasks.

Transfer Learning

344

Muffin or Chihuahua? Challenging Large Vision-Language Models with Multipanel VQA

no code implementations • 29 Jan 2024 • Yue Fan, Jing Gu, Kaiwen Zhou, Qianqi Yan, Shan Jiang, Ching-Chen Kuo, Xinze Guan, Xin Eric Wang

Our evaluation shows that questions in the MultipanelVQA benchmark pose significant challenges to the state-of-the-art Large Vision Language Models (LVLMs) tested, even though humans can attain approximately 99% accuracy on these questions.

Benchmarking Image Comprehension +4

SSB: Simple but Strong Baseline for Boosting Performance of Open-Set Semi-Supervised Learning

1 code implementation • ICCV 2023 • Yue Fan, Anna Kukleva, Dengxin Dai, Bernt Schiele

In experiments, SSB greatly improves both inlier classification and outlier detection performance, outperforming existing methods by a large margin.

Multi-Task Learning Outlier Detection

LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models

1 code implementation • 5 Oct 2023 • Saaket Agashe, Yue Fan, Anthony Reyna, Xin Eric Wang

In this study, we introduce a new LLM-Coordination Benchmark aimed at a detailed analysis of LLMs within the context of Pure Coordination Games, where participating agents need to cooperate for the most gain.

Multiple-choice Question Answering

Factored-NeuS: Reconstructing Surfaces, Illumination, and Materials of Possibly Glossy Objects

no code implementations • 29 May 2023 • Yue Fan, Ivan Skorokhodov, Oleg Voynov, Savva Ignatyev, Evgeny Burnaev, Peter Wonka, Yiqun Wang

We develop a method that recovers the surface, materials, and illumination of a scene from its posed multi-view images.

Inverse Rendering Surface Reconstruction

R2H: Building Multimodal Navigation Helpers that Respond to Help Requests

no code implementations • 23 May 2023 • Yue Fan, Jing Gu, Kaizhi Zheng, Xin Eric Wang

Intelligent navigation-helper agents are critical as they can navigate users in unknown areas through environmental awareness and conversational ability, serving as potential accessibility tools for individuals with disabilities.

Benchmarking Language Modelling +3

SoftMatch: Addressing the Quantity-Quality Trade-off in Semi-supervised Learning

4 code implementations • 26 Jan 2023 • Hao Chen, Ran Tao, Yue Fan, Yidong Wang, Jindong Wang, Bernt Schiele, Xing Xie, Bhiksha Raj, Marios Savvides

The critical challenge of Semi-Supervised Learning (SSL) is how to effectively leverage the limited labeled data and massive unlabeled data to improve the model's generalization performance.

imbalanced classification

1,264

An Embarrassingly Simple Baseline for Imbalanced Semi-Supervised Learning

no code implementations • 20 Nov 2022 • Hao Chen, Yue Fan, Yidong Wang, Jindong Wang, Bernt Schiele, Xing Xie, Marios Savvides, Bhiksha Raj

While standard SSL assumes uniform data distribution, we consider a more realistic and challenging setting called imbalanced SSL, where imbalanced class distributions occur in both labeled and unlabeled data.

Pseudo Label

JARVIS: A Neuro-Symbolic Commonsense Reasoning Framework for Conversational Embodied Agents

no code implementations • 28 Aug 2022 • Kaizhi Zheng, Kaiwen Zhou, Jing Gu, Yue Fan, Jialu Wang, Zonglin Di, Xuehai He, Xin Eric Wang

Building a conversational embodied agent to execute real-life tasks has been a long-standing yet quite challenging research goal, as it requires effective human-agent communication, multi-modal understanding, long-range sequential decision making, etc.

Action Generation Common Sense Reasoning +1

USB: A Unified Semi-supervised Learning Benchmark for Classification

4 code implementations • 12 Aug 2022 • Yidong Wang, Hao Chen, Yue Fan, Wang Sun, Ran Tao, Wenxin Hou, RenJie Wang, Linyi Yang, Zhi Zhou, Lan-Zhe Guo, Heli Qi, Zhen Wu, Yu-Feng Li, Satoshi Nakamura, Wei Ye, Marios Savvides, Bhiksha Raj, Takahiro Shinozaki, Bernt Schiele, Jindong Wang, Xing Xie, Yue Zhang

We further provide the pre-trained versions of the state-of-the-art neural models for CV tasks to make the cost affordable for further tuning.

Ranked #2 on Semi-Supervised Image Classification on CIFAR-100, 400 Labels

General Classification Semi-Supervised Image Classification

1,176

Aerial Vision-and-Dialog Navigation

2 code implementations • 24 May 2022 • Yue Fan, Winson Chen, Tongzhou Jiang, Chun Zhou, Yi Zhang, Xin Eric Wang

To this end, we introduce Aerial Vision-and-Dialog Navigation (AVDN), to navigate a drone via natural language conversation.

Navigate

FreeMatch: Self-adaptive Thresholding for Semi-supervised Learning

4 code implementations • 15 May 2022 • Yidong Wang, Hao Chen, Qiang Heng, Wenxin Hou, Yue Fan, Zhen Wu, Jindong Wang, Marios Savvides, Takahiro Shinozaki, Bhiksha Raj, Bernt Schiele, Xing Xie

Semi-supervised Learning (SSL) has witnessed great success owing to the impressive performances brought by various methods based on pseudo labeling and consistency regularization.

Ranked #1 on Semi-Supervised Image Classification on CIFAR-10, 40 Labels

Fairness Semi-Supervised Image Classification

1,264

Revisiting Consistency Regularization for Semi-Supervised Learning

no code implementations • 10 Dec 2021 • Yue Fan, Anna Kukleva, Bernt Schiele

Generally, the aim is to train a model that is invariant to various data augmentations.

CoSSL: Co-Learning of Representation and Classifier for Imbalanced Semi-Supervised Learning

1 code implementation • CVPR 2022 • Yue Fan, Dengxin Dai, Anna Kukleva, Bernt Schiele

In this paper, we propose a novel co-learning framework (CoSSL) with decoupled representation learning and classifier learning for imbalanced SSL.

Representation Learning

Multi-Vector Embedding on Networks with Taxonomies

no code implementations • 29 Sep 2021 • Yue Fan, Xiuli Ma

Networks serve as efficient tools to describe close relationships among nodes.

Network Embedding

Learn by Observation: Imitation Learning for Drone Patrolling from Videos of A Human Navigator

no code implementations • 30 Aug 2020 • Yue Fan, Shilei Chu, Wei Zhang, Ran Song, Yibin Li

Extensive experiments are conducted to demonstrate the accuracy of the proposed imitating learning process as well as the reliability of the holistic system for autonomous drone navigation.

Drone navigation Imitation Learning

Analyzing the Dependency of ConvNets on Spatial Information

no code implementations • 5 Feb 2020 • Yue Fan, Yongqin Xian, Max Maria Losch, Bernt Schiele

In this paper, we are pushing the envelope and aim to further investigate the reliance on spatial information.

Image Classification Object Recognition

CN-CELEB: a challenging Chinese speaker recognition dataset

2 code implementations • 31 Oct 2019 • Yue Fan, Jiawen Kang, Lantian Li, Kaicheng Li, Haolin Chen, Sitong Cheng, Pengyuan Zhang, Ziya Zhou, Yunqi Cai, Dong Wang

These datasets tend to deliver overly optimistic performance and do not meet the needs of research on speaker recognition in unconstrained conditions.

Speaker Recognition

Tag2Vec: Learning Tag Representations in Tag Networks

no code implementations • 19 Apr 2019 • Junshan Wang, Zhicong Lu, Guojie Song, Yue Fan, Lun Du, Wei Lin

Network embedding is a method to learn low-dimensional representation vectors for nodes in complex networks.

Network Embedding TAG

Parameter-Free Spatial Attention Network for Person Re-Identification

3 code implementations • 29 Nov 2018 • Haoran Wang, Yue Fan, Zexin Wang, Licheng Jiao, Bernt Schiele

We propose a novel architecture for Person Re-Identification, based on a novel parameter-free spatial attention layer introducing spatial relations among the feature map activations back to the model.

Ranked #20 on Person Re-Identification on DukeMTMC-reID

Person Re-Identification

130

Feature vector regularization in machine learning

no code implementations • 19 Dec 2012 • Yue Fan, Louise Raphael, Mark Kon

Such feature vector regularization inherits a property from function denoising on ${\bf R}^n$, in that accuracy is non-monotonic in the denoising (regularization) parameter $\alpha$.

BIG-bench Machine Learning Denoising +2
