Search Results for author: Xiaoqi Li

Found 24 papers, 11 papers with code

Human-centered In-building Embodied Delivery Benchmark

1 code implementation • 25 Jun 2024 • Zhuoqun Xu, Yang Liu, Xiaoqi Li, Jiyao Zhang, Hao Dong

This environment also includes autonomous human characters and robots with grasping and mobility capabilities, as well as a large number of interactive items.

SpatialBot: Precise Spatial Understanding with Vision Language Models

1 code implementation • 19 Jun 2024 • Wenxiao Cai, Iaroslav Ponomarenko, Jianhao Yuan, Xiaoqi Li, Wankou Yang, Hao Dong, Bo Zhao

Vision Language Models (VLMs) have achieved impressive performance in 2D image understanding; however, they still struggle with spatial understanding, which is the foundation of Embodied AI.

AIC MLLM: Autonomous Interactive Correction MLLM for Robust Robotic Manipulation

no code implementations • 17 Jun 2024 • Chuyan Xiong, Chengyu Shen, Xiaoqi Li, Kaichen Zhou, Jeremy Liu, Ruiping Wang, Hao Dong

The ability to reflect on and correct failures is crucial for robotic systems to interact stably with real-life objects. Observing the generalization and reasoning capabilities of Multimodal Large Language Models (MLLMs), previous approaches have aimed to use these models to enhance robotic systems accordingly. However, these methods typically focus on high-level planning corrections using an additional MLLM, with limited use of failed samples to correct low-level contact poses, where failures are particularly likely during articulated object manipulation. To address this gap, we propose an Autonomous Interactive Correction (AIC) MLLM, which makes use of previous low-level interaction experiences to correct SE(3) pose predictions for articulated objects.

Pose Prediction, Test-time Adaptation

GasTrace: Detecting Sandwich Attack Malicious Accounts in Ethereum

no code implementations • 30 May 2024 • Zekai Liu, Xiaoqi Li, Hongli Peng, Wenkai Li

The openness and transparency of Ethereum transaction data make it easy for any entity to exploit the data and execute malicious attacks.

Classification, Graph Attention

NaturalVLM: Leveraging Fine-grained Natural Language for Affordance-Guided Visual Manipulation

no code implementations • 13 Mar 2024 • Ran Xu, Yan Shen, Xiaoqi Li, Ruihai Wu, Hao Dong

To address these challenges, we introduce a comprehensive benchmark, NrVLM, comprising 15 distinct manipulation tasks, containing over 4500 episodes meticulously annotated with fine-grained language instructions.

Robot Manipulation

ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation

no code implementations • CVPR 2024 • Xiaoqi Li, Mingxu Zhang, Yiran Geng, Haoran Geng, Yuxing Long, Yan Shen, Renrui Zhang, Jiaming Liu, Hao Dong

By fine-tuning the injected adapters, we preserve the inherent common sense and reasoning ability of the MLLMs while equipping them with the ability for manipulation.

Common Sense Reasoning, Language Modelling, +5 more

LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding

no code implementations • 21 Dec 2023 • Senqiao Yang, Jiaming Liu, Ray Zhang, Mingjie Pan, Zoey Guo, Xiaoqi Li, Zehui Chen, Peng Gao, Yandong Guo, Shanghang Zhang

In this paper, we introduce LiDAR-LLM, which takes raw LiDAR data as input and harnesses the remarkable reasoning capabilities of LLMs to gain a comprehensive understanding of outdoor 3D scenes.

Instruction Following, Language Modelling, +2 more

ImageManip: Image-based Robotic Manipulation with Affordance-guided Next View Selection

no code implementations • 13 Oct 2023 • Xiaoqi Li, Yanzi Wang, Yan Shen, Ponomarenko Iaroslav, Haoran Lu, Qianxu Wang, Boshi An, Jiaming Liu, Hao Dong

This framework is designed to capture multiple perspectives of the target object and infer depth information to complement its geometry.

Object, Robot Manipulation

Distribution-Aware Continual Test-Time Adaptation for Semantic Segmentation

1 code implementation • 24 Sep 2023 • Jiayi Ni, Senqiao Yang, Ran Xu, Jiaming Liu, Xiaoqi Li, Wenyu Jiao, Zehui Chen, Yi Liu, Shanghang Zhang

In this paper, we propose a distribution-aware tuning (DAT) method to make the semantic segmentation CTTA efficient and practical in real-world applications.

Autonomous Driving, Semantic Segmentation, +1 more

Discuss Before Moving: Visual Language Navigation via Multi-expert Discussions

no code implementations • 20 Sep 2023 • Yuxing Long, Xiaoqi Li, Wenzhe Cai, Hao Dong

Results on the representative VLN task R2R show that our method surpasses the leading zero-shot VLN model by a large margin on all metrics.

Language Modelling, Large Language Model

RenderOcc: Vision-Centric 3D Occupancy Prediction with 2D Rendering Supervision

1 code implementation • 18 Sep 2023 • Mingjie Pan, Jiaming Liu, Renrui Zhang, Peixiang Huang, Xiaoqi Li, Bing Wang, Hongwei Xie, Li Liu, Shanghang Zhang

3D occupancy prediction, which quantifies 3D scenes into grid cells with semantic labels, holds significant promise in the fields of robot perception and autonomous driving.

Autonomous Driving

An Overview of AI and Blockchain Integration for Privacy-Preserving

no code implementations • 6 May 2023 • Zongwei Li, Dechao Kong, Yuanzheng Niu, Hongli Peng, Xiaoqi Li, Wenkai Li

In conclusion, this paper outlines the future directions of privacy protection technologies emerging from AI and blockchain integration, including enhancing efficiency and security to achieve more comprehensive privacy protection.

De-identification, Management, +1 more

BEVUDA: Multi-geometric Space Alignments for Domain Adaptive BEV 3D Object Detection

no code implementations • 30 Nov 2022 • Jiaming Liu, Rongyu Zhang, Xiaoqi Li, Xiaowei Chi, Zehui Chen, Ming Lu, Yandong Guo, Shanghang Zhang

In this paper, we propose a Multi-space Alignment Teacher-Student (MATS) framework to ease the domain shift accumulation, which consists of a Depth-Aware Teacher (DAT) and a Geometric-space Aligned Student (GAS) model.

3D Object Detection, Autonomous Driving, +4 more

Unsupervised Spike Depth Estimation via Cross-modality Cross-domain Knowledge Transfer

1 code implementation • 26 Aug 2022 • Jiaming Liu, Qizhe Zhang, Xiaoqi Li, Jianing Li, Guanqun Wang, Ming Lu, Tiejun Huang, Shanghang Zhang

Neuromorphic spike data, an upcoming modality with high temporal resolution, has shown promising potential in autonomous driving by mitigating the challenges posed by high-velocity motion blur.

Autonomous Driving, Depth Estimation, +2 more

Efficient Meta-Tuning for Content-aware Neural Video Delivery

1 code implementation • 20 Jul 2022 • Xiaoqi Li, Jiaming Liu, Shizun Wang, Cheng Lyu, Ming Lu, Yurong Chen, Anbang Yao, Yandong Guo, Shanghang Zhang

Our method significantly reduces the computational cost and achieves even better performance, paving the way for applying neural video delivery techniques to practical applications.

Super-Resolution

Adaptive Patch Exiting for Scalable Single Image Super-Resolution

1 code implementation • 22 Mar 2022 • Shizun Wang, Jiaming Liu, Kaixin Chen, Xiaoqi Li, Ming Lu, Yandong Guo

Once the incremental capacity is below the threshold, the patch can exit at the specific layer.

Image Super-Resolution

A State-of-the-art Survey of U-Net in Microscopic Image Analysis: from Simple Usage to Structure Mortification

no code implementations • 14 Feb 2022 • Jian Wu, Wanli Liu, Chen Li, Tao Jiang, Islam Mohammad Shariful, Hongzan Sun, Xiaoqi Li, Xintong Li, Xinyu Huang, Marcin Grzegorzek

Image analysis technology is used to address the shortcomings of traditional manual methods in disease analysis, wastewater treatment, and environmental change monitoring, and convolutional neural networks (CNNs) play an important role in microscopic image analysis.

Image Segmentation, Segmentation, +1 more

What Can Machine Vision Do for Lymphatic Histopathology Image Analysis: A Comprehensive Review

no code implementations • 21 Jan 2022 • Xiaoqi Li, HaoYuan Chen, Chen Li, Md Mamunur Rahaman, Xintong Li, Jian Wu, Xiaoyan Li, Hongzan Sun, Marcin Grzegorzek

In the past ten years, the computing power of machine vision (MV) has been continuously improved, and image analysis algorithms have developed rapidly.

SamplingAug: On the Importance of Patch Sampling Augmentation for Single Image Super-Resolution

1 code implementation • 30 Nov 2021 • Shizun Wang, Ming Lu, Kaixin Chen, Jiaming Liu, Xiaoqi Li, Chuang Zhang, Ming Wu

However, existing methods mostly train the DNNs on uniformly sampled LR-HR patch pairs, so they fail to fully exploit informative patches within the image.

Data Augmentation, Image Super-Resolution

CLUE: Towards Discovering Locked Cryptocurrencies in Ethereum

no code implementations • 2 Dec 2020 • Xiaoqi Li, Ting Chen, Xiapu Luo, Chenxu Wang

Because locked cryptocurrencies can never be controlled by users, avoiding interaction with the accounts discovered by CLUE, and thereby not repeating the same mistakes, can help users save money.

Cryptography and Security
