Search Results for author: Yian Wang

Found 9 papers, 3 papers with code

Silver medal Solution for Image Matching Challenge 2024

no code implementations4 Nov 2024 Yian Wang

Image Matching Challenge 2024 is a competition focused on building 3D maps from diverse image sets, requiring participants to solve fundamental computer vision challenges in image matching across varying angles, lighting, and seasonal changes.

Keypoint Detection

OS-ATLAS: A Foundation Action Model for Generalist GUI Agents

1 code implementation30 Oct 2024 Zhiyong Wu, Zhenyu Wu, Fangzhi Xu, Yian Wang, Qiushi Sun, Chengyou Jia, Kanzhi Cheng, Zichen Ding, Liheng Chen, Paul Pu Liang, Yu Qiao

Existing efforts in building GUI agents heavily rely on the availability of robust commercial Vision-Language Models (VLMs) such as GPT-4o and GeminiProVision.

Natural Language Visual Grounding

Point Mamba: A Novel Point Cloud Backbone Based on State Space Model with Octree-Based Ordering Strategy

1 code implementation11 Mar 2024 Jiuming Liu, Ruiji Yu, Yian Wang, Yu Zheng, Tianchen Deng, Weicai Ye, Hesheng Wang

In this paper, we propose a novel SSM-based point cloud processing backbone, named Point Mamba, with a causality-aware ordering mechanism.

Mamba Semantic Segmentation

MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World

no code implementations CVPR 2024 Yining Hong, Zishuo Zheng, Peihao Chen, Yian Wang, Junyan Li, Chuang Gan

Human beings possess the capability to multiply a melange of multisensory cues while actively exploring and interacting with the 3D world.

Language Modelling Large Language Model

Towards Generalist Robots: A Promising Paradigm via Generative Simulation

no code implementations17 May 2023 Zhou Xian, Theophile Gervet, Zhenjia Xu, Yi-Ling Qiao, Tsun-Hsuan Wang, Yian Wang

This document serves as a position paper that outlines the authors' vision for a potential pathway towards generalist robots.

Scene Generation

AdaAfford: Learning to Adapt Manipulation Affordance for 3D Articulated Objects via Few-shot Interactions

no code implementations1 Dec 2021 Yian Wang, Ruihai Wu, Kaichun Mo, Jiaqi Ke, Qingnan Fan, Leonidas Guibas, Hao Dong

Perceiving and interacting with 3D articulated objects, such as cabinets, doors, and faucets, pose particular challenges for future home-assistant robots performing daily tasks in human environments.

Friction

VAT-Mart: Learning Visual Action Trajectory Proposals for Manipulating 3D ARTiculated Objects

no code implementations ICLR 2022 Ruihai Wu, Yan Zhao, Kaichun Mo, Zizheng Guo, Yian Wang, Tianhao Wu, Qingnan Fan, Xuelin Chen, Leonidas Guibas, Hao Dong

In this paper, we propose object-centric actionable visual priors as a novel perception-interaction handshaking point that the perception system outputs more actionable guidance than kinematic structure estimation, by predicting dense geometry-aware, interaction-aware, and task-aware visual action affordance and trajectory proposals.

Cannot find the paper you are looking for? You can Submit a new open access paper.