Search Results for author: Shengyi Qian

Found 11 papers, 5 papers with code

LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent

1 code implementation 21 Sep 2023 Jianing Yang, Xuweiyi Chen, Shengyi Qian, Nikhil Madaan, Madhavan Iyengar, David F. Fouhey, Joyce Chai

While existing approaches often rely on extensive labeled data or exhibit limitations in handling complex language queries, we propose LLM-Grounder, a novel zero-shot, open-vocabulary, Large Language Model (LLM)-based 3D visual grounding pipeline.

Language Modelling, Large Language Model, +3

Pitfalls in Link Prediction with Graph Neural Networks: Understanding the Impact of Target-link Inclusion & Better Practices

no code implementations 1 Jun 2023 Jing Zhu, YuHang Zhou, Vassilis N. Ioannidis, Shengyi Qian, Wei Ai, Xiang Song, Danai Koutra

While Graph Neural Networks (GNNs) are remarkably successful in a variety of high-impact applications, we demonstrate that, in link prediction, the common practice of including the edges being predicted in the graph at training and/or test time has an outsized impact on the performance of low-degree nodes.

Link Prediction, Node Classification

Understanding 3D Object Interaction from a Single Image

1 code implementation ICCV 2023 Shengyi Qian, David F. Fouhey

Humans can easily understand a single image as depicting multiple potential objects permitting interaction.

Object

Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation

2 code implementations ICCV 2023 Ziyang Chen, Shengyi Qian, Andrew Owens

In this paper, we use these cues to solve a problem we call Sound Localization from Motion (SLfM): jointly estimating camera rotation and localizing sound sources.

Understanding 3D Object Articulation in Internet Videos

no code implementations CVPR 2022 Shengyi Qian, Linyi Jin, Chris Rockwell, Siyi Chen, David F. Fouhey

We propose to investigate detecting and characterizing the 3D planar articulation of objects from ordinary videos.

Object

Planar Surface Reconstruction from Sparse Views

1 code implementation ICCV 2021 Linyi Jin, Shengyi Qian, Andrew Owens, David F. Fouhey

The paper studies planar surface reconstruction of indoor scenes from two views with unknown camera poses.

Surface Reconstruction

Associative3D: Volumetric Reconstruction from Sparse Views

1 code implementation ECCV 2020 Shengyi Qian, Linyi Jin, David F. Fouhey

This information is then jointly reasoned over to produce the most likely explanation of the scene.

3D Volumetric Reconstruction

OASIS: A Large-Scale Dataset for Single Image 3D in the Wild

no code implementations CVPR 2020 Weifeng Chen, Shengyi Qian, David Fan, Noriyuki Kojima, Max Hamilton, Jia Deng

Single-view 3D is the task of recovering 3D properties such as depth and surface normals from a single image.