Search Results for author: Xinyu Liu

Found 33 papers, 13 papers with code

EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention

3 code implementations CVPR 2023 Xinyu Liu, Houwen Peng, Ningxin Zheng, Yuqing Yang, Han Hu, Yixuan Yuan

Comprehensive experiments demonstrate EfficientViT outperforms existing efficient models, striking a good trade-off between speed and accuracy.

ADAPT: Action-aware Driving Caption Transformer

1 code implementation 1 Feb 2023 Bu Jin, Xinyu Liu, Yupeng Zheng, Pengfei Li, Hao Zhao, Tong Zhang, Yuhang Zheng, Guyue Zhou, Jingjing Liu

To bridge the gap, we propose an end-to-end transformer-based architecture, ADAPT (Action-aware Driving cAPtion Transformer), which provides user-friendly natural language narrations and reasoning for each decision making step of autonomous vehicular control and action.

Autonomous Driving Decision Making

SIGMA: Semantic-complete Graph Matching for Domain Adaptive Object Detection

1 code implementation CVPR 2022 Wuyang Li, Xinyu Liu, Yixuan Yuan

To overcome these challenges, we propose a novel SemantIc-complete Graph MAtching (SIGMA) framework for DAOD, which completes mismatched semantics and reformulates the adaptation with graph matching.

Graph Matching Hallucination +2

Delving into Shape-aware Zero-shot Semantic Segmentation

1 code implementation CVPR 2023 Xinyu Liu, Beiwen Tian, Zhen Wang, Rui Wang, Kehua Sheng, Bo Zhang, Hao Zhao, Guyue Zhou

Thanks to the impressive progress of large-scale vision-language pretraining, recent recognition models can classify arbitrary objects in a zero-shot and open-set manner, with a surprisingly high accuracy.

Image Segmentation Segmentation +2

Towards Robust Adaptive Object Detection under Noisy Annotations

1 code implementation CVPR 2022 Xinyu Liu, Wuyang Li, Qiushi Yang, Baopu Li, Yixuan Yuan

Domain Adaptive Object Detection (DAOD) models a joint distribution of images and labels from an annotated source domain and learns a domain-invariant transformation to estimate the target labels with the given target domain images.

Object object-detection +1

UN-SAM: Universal Prompt-Free Segmentation for Generalized Nuclei Images

1 code implementation 26 Feb 2024 Zhen Chen, Qing Xu, Xinyu Liu, Yixuan Yuan

Moreover, to unleash the generalization capability of SAM across a variety of nuclei images, we devise a Domain-adaptive Tuning Encoder (DT-Encoder) to seamlessly harmonize visual features with domain-common and domain-specific knowledge, and further devise a Domain Query-enhanced Decoder (DQ-Decoder) by leveraging learnable domain queries for segmentation decoding in different nuclei domains.

Segmentation Semantic Segmentation

Learning for Vehicle-to-Vehicle Cooperative Perception under Lossy Communication

1 code implementation 16 Dec 2022 Jinlong Li, Runsheng Xu, Xinyu Liu, Jin Ma, Zicheng Chi, Jiaqi Ma, Hongkai Yu

Due to the beneficial Vehicle-to-Vehicle (V2V) communication, the deep learning based features from other agents can be shared to the ego vehicle so as to improve the perception of the ego vehicle.

3D Object Detection object-detection

Maintaining Reasoning Consistency in Compositional Visual Question Answering

1 code implementation CVPR 2022 Chenchen Jing, Yunde Jia, Yuwei Wu, Xinyu Liu, Qi Wu

Existing VQA models can answer a compositional question well, but cannot work well in terms of reasoning consistency in answering the compositional question and its sub-questions.

Question Answering Visual Question Answering

Learning Point-wise Abstaining Penalty for Point Cloud Anomaly Detection

1 code implementation 19 Sep 2023 Shaocong Xu, Pengfei Li, Xinyu Liu, Qianpu Sun, Yang Li, Shihui Guo, Zhen Wang, Bo Jiang, Rui Wang, Kehua Sheng, Bo Zhang, Hao Zhao

We demonstrate that learning different abstaining penalties, apart from point-wise penalty, for different types of (synthesized) outliers can further improve the performance.

Anomaly Detection Autonomous Driving +1

Showing Your Work Doesn't Always Work

1 code implementation ACL 2020 Raphael Tang, Jaejun Lee, Ji Xin, Xinyu Liu, Yao-Liang Yu, Jimmy Lin

In natural language processing, a recently popular line of work explores how to best report the experimental results of neural networks.

Deep least-squares methods: an unsupervised learning-based numerical method for solving elliptic PDEs

1 code implementation 5 Nov 2019 Zhiqiang Cai, Jingshuang Chen, Min Liu, Xinyu Liu

This paper studies an unsupervised deep learning-based numerical approach for solving partial differential equations (PDEs).

Exploring Gradient Flow Based Saliency for DNN Model Compression

1 code implementation 24 Oct 2021 Xinyu Liu, Baopu Li, Zhen Chen, Yixuan Yuan

Model pruning aims to reduce the deep neural network (DNN) model size or computational overhead.

Image Classification Image Denoising +1

Dex-Net 2.0: Deep Learning to Plan Robust Grasps with Synthetic Point Clouds and Analytic Grasp Metrics

no code implementations 27 Mar 2017 Jeffrey Mahler, Jacky Liang, Sherdil Niyaz, Michael Laskey, Richard Doan, Xinyu Liu, Juan Aparicio Ojea, Ken Goldberg

To reduce data collection time for deep learning of robust robotic grasp plans, we explore training from a synthetic dataset of 6.7 million point clouds, grasps, and analytic grasp metrics generated from thousands of 3D models from Dex-Net 1.0 in randomized poses on a table.

Robotics

Dex-Net 3.0: Computing Robust Robot Vacuum Suction Grasp Targets in Point Clouds using a New Analytic Model and Deep Learning

no code implementations 19 Sep 2017 Jeffrey Mahler, Matthew Matl, Xinyu Liu, Albert Li, David Gealy, Ken Goldberg

Vacuum-based end effectors are widely used in industry and are often preferred over parallel-jaw and multifinger grippers due to their ability to lift objects with a single point of contact.

Robotics

Spelling Error Correction Using a Nested RNN Model and Pseudo Training Data

no code implementations 1 Nov 2018 Hao Li, Yang Wang, Xinyu Liu, Zhichao Sheng, Si Wei

We propose a nested recurrent neural network (nested RNN) model for English spelling error correction and generate pseudo data based on phonetic similarity to train it.

Feature Engineering

Learning Efficient Lexically-Constrained Neural Machine Translation with External Memory

no code implementations 31 Jan 2019 Ya Li, Xinyu Liu, Dan Liu, Xueqiang Zhang, Junhua Liu

Recent years have witnessed dramatic progress in neural machine translation (NMT); however, the method of manually guiding the translation procedure remains to be better explored.

Machine Translation NMT +2

TanhExp: A Smooth Activation Function with High Convergence Speed for Lightweight Neural Networks

no code implementations 22 Mar 2020 Xinyu Liu, Xiaoguang Di

Lightweight or mobile neural networks used for real-time computer vision tasks contain fewer parameters than normal networks, which leads to constrained performance.

Image Classification

Review of data analysis in vision inspection of power lines with an in-depth discussion of deep learning technology

no code implementations 22 Mar 2020 Xinyu Liu, Xiren Miao, Hao Jiang, Jing Chen

With the aim of providing a comprehensive overview for researchers who are interested in developing a deep-learning-based analysis system for power lines inspection data, this paper conducts a thorough review of the current literature and identifies the challenges for future research.

object-detection Small Object Detection

3D RegNet: Deep Learning Model for COVID-19 Diagnosis on Chest CT Image

no code implementations 8 Jul 2021 Haibo Qi, YuHan Wang, Xinyu Liu

In this paper, a 3D-RegNet-based neural network is proposed for diagnosing the physical condition of patients with coronavirus (Covid-19) infection.

COVID-19 Diagnosis

Generalizing to New Domains by Mapping Natural Language to Lifted LTL

no code implementations 11 Oct 2021 Eric Hsiung, Hiloni Mehta, Junchi Chu, Xinyu Liu, Roma Patel, Stefanie Tellex, George Konidaris

We compare our method of mapping natural language task specifications to intermediate contextual queries against state-of-the-art CopyNet models capable of translating natural language to LTL, by evaluating whether correct LTL for manipulation and navigation task specifications can be output, and show that our method outperforms the CopyNet model on unseen object references.

Neural Architecture Searching for Facial Attributes-based Depression Recognition

no code implementations 24 Jan 2022 Mingzhe Chen, Xi Xiao, Bin Zhang, Xinyu Liu, Runiu Lu

In this paper, we propose to extend Neural Architecture Search (NAS) technique for designing an optimal model for multiple facial attributes-based depression recognition, which can be efficiently and robustly implemented in a small dataset.

Attribute Neural Architecture Search +1

Instruction-ViT: Multi-Modal Prompts for Instruction Learning in ViT

no code implementations 29 Apr 2023 Zhenxiang Xiao, Yuzhong Chen, Lu Zhang, Junjie Yao, Zihao Wu, Xiaowei Yu, Yi Pan, Lin Zhao, Chong Ma, Xinyu Liu, Wei Liu, Xiang Li, Yixuan Yuan, Dinggang Shen, Dajiang Zhu, Tianming Liu, Xi Jiang

Prompts have been proven to play a crucial role in large language models, and in recent years, vision models have also been using prompts to improve scalability for multiple downstream tasks.

Image Classification

Towards Robust Aspect-based Sentiment Analysis through Non-counterfactual Augmentations

no code implementations 24 Jun 2023 Xinyu Liu, Yan Ding, Kaikai An, Chunyang Xiao, Pranava Madhyastha, Tong Xiao, Jingbo Zhu

While state-of-the-art NLP models have demonstrated excellent performance for aspect based sentiment analysis (ABSA), substantial evidence has been presented on their lack of robustness.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +2

Deep Transfer Learning for Intelligent Vehicle Perception: a Survey

no code implementations 26 Jun 2023 Xinyu Liu, Jinlong Li, Jin Ma, Huiming Sun, Zhigang Xu, Tianyun Zhang, Hongkai Yu

To the best of our knowledge, this paper represents the first comprehensive survey on the topic of the deep transfer learning for intelligent vehicle perception.

Autonomous Driving Decision Making +2

Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos

no code implementations 27 Jun 2023 Chiori Hori, Puyuan Peng, David Harwath, Xinyu Liu, Kei Ota, Siddarth Jain, Radu Corcodel, Devesh Jha, Diego Romeres, Jonathan Le Roux

This paper introduces a method for robot action sequence generation from instruction videos using (1) an audio-visual Transformer that converts audio-visual features and instruction speech to a sequence of robot actions called dynamic movement primitives (DMPs) and (2) style-transfer-based training that employs multi-task learning with video captioning and weakly-supervised learning with a semantic classifier to exploit unpaired video-action data.

Multi-Task Learning Scene Understanding +3

S2R-ViT for Multi-Agent Cooperative Perception: Bridging the Gap from Simulation to Reality

no code implementations 16 Jul 2023 Jinlong Li, Runsheng Xu, Xinyu Liu, Baolu Li, Qin Zou, Jiaqi Ma, Hongkai Yu

We investigate the effects of these two types of domain gaps and propose a novel uncertainty-aware vision transformer to effectively relieve the Deployment Gap and an agent-based feature adaptation module with inter-agent and ego-agent discriminators to reduce the Feature Gap.

3D Object Detection object-detection +1

AdvGPS: Adversarial GPS for Multi-Agent Perception Attack

no code implementations 30 Jan 2024 Jinlong Li, Baolu Li, Xinyu Liu, Jianwu Fang, Felix Juefei-Xu, Qing Guo, Hongkai Yu

The multi-agent perception system collects visual data from sensors located on various agents and leverages their relative poses determined by GPS signals to effectively fuse information, mitigating the limitations of single-agent sensing, such as occlusion.

Adversarial Attack object-detection +1

Breaking Data Silos: Cross-Domain Learning for Multi-Agent Perception from Independent Private Sources

no code implementations 6 Feb 2024 Jinlong Li, Baolu Li, Xinyu Liu, Runsheng Xu, Jiaqi Ma, Hongkai Yu

However, the data source used to train the various agents is independent and private in each company, leading to a Distribution Gap between the different private data used to train distinct agents in a multi-agent perception system.

3D Object Detection object-detection

Endora: Video Generation Models as Endoscopy Simulators

no code implementations 17 Mar 2024 Chenxin Li, Hengyu Liu, Yifan Liu, Brandon Y. Feng, Wuyang Li, Xinyu Liu, Zhen Chen, Jing Shao, Yixuan Yuan

In a nutshell, Endora marks a notable breakthrough in the deployment of generative AI for clinical endoscopy research, setting a substantial stage for further advances in medical content generation.

Data Augmentation Video Generation
