Search Results for author: Yue Fan

Found 23 papers, 10 papers with code

Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting

no code implementations 22 Mar 2024 Jun Guo, Xiaojian Ma, Yue Fan, Huaping Liu, Qing Li

Open-vocabulary 3D scene understanding presents a significant challenge in computer vision, with wide-ranging applications in embodied agents and augmented reality systems.

Scene Understanding Segmentation +2

VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding

no code implementations 18 Mar 2024 Yue Fan, Xiaojian Ma, Rujie Wu, Yuntao Du, Jiaqi Li, Zhi Gao, Qing Li

We explore how reconciling several foundation models (large language models and vision-language models) with a novel unified memory mechanism could tackle the challenging video understanding problem, especially capturing the long-term temporal relations in lengthy videos.

Video Understanding

Parameter-Efficient Fine-Tuning for Pre-Trained Vision Models: A Survey

1 code implementation 3 Feb 2024 Yi Xin, Siqi Luo, Haodi Zhou, Junlong Du, Xiaohong Liu, Yue Fan, Qing Li, Yuntao Du

Large-scale pre-trained vision models (PVMs) have shown great potential for adaptability across various downstream vision tasks.

Transfer Learning

Muffin or Chihuahua? Challenging Large Vision-Language Models with Multipanel VQA

no code implementations 29 Jan 2024 Yue Fan, Jing Gu, Kaiwen Zhou, Qianqi Yan, Shan Jiang, Ching-Chen Kuo, Xinze Guan, Xin Eric Wang

Our evaluation shows that questions in the MultipanelVQA benchmark pose significant challenges to the state-of-the-art Large Vision Language Models (LVLMs) tested, even though humans can attain approximately 99% accuracy on these questions.

Benchmarking Image Comprehension +4

SSB: Simple but Strong Baseline for Boosting Performance of Open-Set Semi-Supervised Learning

1 code implementation ICCV 2023 Yue Fan, Anna Kukleva, Dengxin Dai, Bernt Schiele

In experiments, SSB greatly improves both inlier classification and outlier detection performance, outperforming existing methods by a large margin.

Multi-Task Learning Outlier Detection

LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models

1 code implementation 5 Oct 2023 Saaket Agashe, Yue Fan, Anthony Reyna, Xin Eric Wang

In this study, we introduce a new LLM-Coordination Benchmark aimed at a detailed analysis of LLMs within the context of Pure Coordination Games, where participating agents need to cooperate for the most gain.

Multiple-choice Question Answering

R2H: Building Multimodal Navigation Helpers that Respond to Help Requests

no code implementations 23 May 2023 Yue Fan, Jing Gu, Kaizhi Zheng, Xin Eric Wang

Intelligent navigation-helper agents are critical as they can navigate users in unknown areas through environmental awareness and conversational ability, serving as potential accessibility tools for individuals with disabilities.

Benchmarking Language Modelling +3

SoftMatch: Addressing the Quantity-Quality Trade-off in Semi-supervised Learning

4 code implementations 26 Jan 2023 Hao Chen, Ran Tao, Yue Fan, Yidong Wang, Jindong Wang, Bernt Schiele, Xing Xie, Bhiksha Raj, Marios Savvides

The critical challenge of Semi-Supervised Learning (SSL) is how to effectively leverage the limited labeled data and massive unlabeled data to improve the model's generalization performance.

imbalanced classification

An Embarrassingly Simple Baseline for Imbalanced Semi-Supervised Learning

no code implementations 20 Nov 2022 Hao Chen, Yue Fan, Yidong Wang, Jindong Wang, Bernt Schiele, Xing Xie, Marios Savvides, Bhiksha Raj

While standard SSL assumes uniform data distribution, we consider a more realistic and challenging setting called imbalanced SSL, where imbalanced class distributions occur in both labeled and unlabeled data.

Pseudo Label

JARVIS: A Neuro-Symbolic Commonsense Reasoning Framework for Conversational Embodied Agents

no code implementations 28 Aug 2022 Kaizhi Zheng, Kaiwen Zhou, Jing Gu, Yue Fan, Jialu Wang, Zonglin Di, Xuehai He, Xin Eric Wang

Building a conversational embodied agent to execute real-life tasks has been a long-standing yet quite challenging research goal, as it requires effective human-agent communication, multi-modal understanding, long-range sequential decision making, etc.

Action Generation Common Sense Reasoning +1

Aerial Vision-and-Dialog Navigation

2 code implementations 24 May 2022 Yue Fan, Winson Chen, Tongzhou Jiang, Chun Zhou, Yi Zhang, Xin Eric Wang

To this end, we introduce Aerial Vision-and-Dialog Navigation (AVDN), to navigate a drone via natural language conversation.

Navigate

FreeMatch: Self-adaptive Thresholding for Semi-supervised Learning

4 code implementations 15 May 2022 Yidong Wang, Hao Chen, Qiang Heng, Wenxin Hou, Yue Fan, Zhen Wu, Jindong Wang, Marios Savvides, Takahiro Shinozaki, Bhiksha Raj, Bernt Schiele, Xing Xie

Semi-supervised Learning (SSL) has witnessed great success owing to the impressive performances brought by various methods based on pseudo labeling and consistency regularization.

Fairness Semi-Supervised Image Classification

Revisiting Consistency Regularization for Semi-Supervised Learning

no code implementations 10 Dec 2021 Yue Fan, Anna Kukleva, Bernt Schiele

Generally, the aim is to train a model that is invariant to various data augmentations.

CoSSL: Co-Learning of Representation and Classifier for Imbalanced Semi-Supervised Learning

1 code implementation CVPR 2022 Yue Fan, Dengxin Dai, Anna Kukleva, Bernt Schiele

In this paper, we propose a novel co-learning framework (CoSSL) with decoupled representation learning and classifier learning for imbalanced SSL.

Representation Learning

Multi-Vector Embedding on Networks with Taxonomies

no code implementations 29 Sep 2021 Yue Fan, Xiuli Ma

Networks serve as efficient tools to describe close relationships among nodes.

Network Embedding

Learn by Observation: Imitation Learning for Drone Patrolling from Videos of A Human Navigator

no code implementations 30 Aug 2020 Yue Fan, Shilei Chu, Wei Zhang, Ran Song, Yibin Li

Extensive experiments are conducted to demonstrate the accuracy of the proposed imitation learning process as well as the reliability of the holistic system for autonomous drone navigation.

Drone navigation Imitation Learning

Analyzing the Dependency of ConvNets on Spatial Information

no code implementations 5 Feb 2020 Yue Fan, Yongqin Xian, Max Maria Losch, Bernt Schiele

In this paper, we are pushing the envelope and aim to further investigate the reliance on spatial information.

Image Classification Object Recognition

CN-CELEB: a challenging Chinese speaker recognition dataset

2 code implementations 31 Oct 2019 Yue Fan, Jiawen Kang, Lantian Li, Kaicheng Li, Haolin Chen, Sitong Cheng, Pengyuan Zhang, Ziya Zhou, Yunqi Cai, Dong Wang

These datasets tend to deliver over-optimistic performance and do not meet the needs of research on speaker recognition in unconstrained conditions.

Speaker Recognition

Tag2Vec: Learning Tag Representations in Tag Networks

no code implementations 19 Apr 2019 Junshan Wang, Zhicong Lu, Guojie Song, Yue Fan, Lun Du, Wei Lin

Network embedding is a method to learn low-dimensional representation vectors for nodes in complex networks.

Network Embedding TAG

Parameter-Free Spatial Attention Network for Person Re-Identification

3 code implementations 29 Nov 2018 Haoran Wang, Yue Fan, Zexin Wang, Licheng Jiao, Bernt Schiele

We propose a novel architecture for Person Re-Identification, based on a novel parameter-free spatial attention layer introducing spatial relations among the feature map activations back to the model.

Person Re-Identification

Feature vector regularization in machine learning

no code implementations 19 Dec 2012 Yue Fan, Louise Raphael, Mark Kon

Such feature vector regularization inherits a property from function denoising on ${\bf R}^n$, in that accuracy is non-monotonic in the denoising (regularization) parameter $\alpha$.

BIG-bench Machine Learning Denoising +2
