Search Results for author: Xinyu Liu

Found 62 papers, 25 papers with code

VFX Creator: Animated Visual Effect Generation with Controllable Diffusion Transformer

no code implementations 9 Feb 2025 Xinyu Liu, Ailing Zeng, Wei Xue, Harry Yang, Wenhan Luo, Qifeng Liu, Yike Guo

Crafting magic and illusions is one of the most thrilling aspects of filmmaking, with visual effects (VFX) serving as the powerhouse behind unforgettable cinematic experiences.

Image Animation Instance Segmentation +1

Linear $Q$-Learning Does Not Diverge: Convergence Rates to a Bounded Set

no code implementations 31 Jan 2025 Xinyu Liu, Zixuan Xie, Shangtong Zhang

As a side product, we also use this general result to establish the $L^2$ convergence rate of tabular $Q$-learning with an $\epsilon$-softmax behavior policy, for which we rely on a novel pseudo-contraction property of the weighted Bellman optimality operator.

Q-Learning

Semantic Consistency Regularization with Large Language Models for Semi-supervised Sentiment Analysis

no code implementations 29 Jan 2025 Kunrong Li, Xinyu Liu, Zhen Chen

Inspired by the ability of pretrained Large Language Models (LLMs) in following instructions and generating coherent text, we propose a Semantic Consistency Regularization with Large Language Models (SCR) framework for semi-supervised sentiment analysis.

Semi-Supervised Text Classification Sentiment Analysis +1

Panoramic Interests: Stylistic-Content Aware Personalized Headline Generation

1 code implementation the ACM Web Conference 2025 Junhong Lian, Xiang Ao, Xinyu Liu, Yang Liu, Qing He

Prevailing methods focus on user-oriented content preferences, but most of them overlook the fact that diverse stylistic preferences are integral to users' panoramic interests, leading to suboptimal personalization.

Contrastive Learning Headline Generation +3

GePBench: Evaluating Fundamental Geometric Perception for Multimodal Large Language Models

no code implementations 30 Dec 2024 Shangyu Xing, Changhao Xiang, Yuteng Han, Yifan Yue, Zhen Wu, Xinyu Liu, Zhangtai Wu, Fei Zhao, Xinyu Dai

To address this limitation, we introduce GePBench, a novel benchmark designed to assess the geometric perception capabilities of MLLMs.

Almost Sure Convergence Rates and Concentration of Stochastic Approximation and Reinforcement Learning with Markovian Noise

no code implementations 20 Nov 2024 Xiaochi Qian, Zixuan Xie, Xinyu Liu, Shangtong Zhang

As applications, we provide the first almost sure convergence rate for $Q$-learning with Markovian samples without count-based learning rates.

Q-Learning

EITNet: An IoT-Enhanced Framework for Real-Time Basketball Action Recognition

no code implementations 13 Oct 2024 Jingyu Liu, Xinyu Liu, Mingzhe Qu, Tianyi Lyu

To overcome these challenges, we propose the EITNet model, a deep learning framework that combines EfficientDet for object detection, I3D for spatiotemporal feature extraction, and TimeSformer for temporal analysis, all integrated with IoT technology for seamless real-time data collection and processing.

Action Recognition object-detection +2

Forgetting Curve: A Reliable Method for Evaluating Memorization Capability for Long-context Models

1 code implementation 7 Oct 2024 Xinyu Liu, Runsong Zhao, Pengcheng Huang, Chunyang Xiao, Bei Li, Jingang Wang, Tong Xiao, Jingbo Zhu

We provide an extensive survey for limitations in this work and propose a new method called forgetting curve to measure the memorization capability of long-context models.

Memorization

More Effective LLM Compressed Tokens with Uniformly Spread Position Identifiers and Compression Loss

no code implementations 22 Sep 2024 Runsong Zhao, Pengcheng Huang, Xinyu Liu, Chunyang Xiao, Tong Xiao, Jingbo Zhu

Compressing Transformer inputs into compressed tokens allows running LLMs with improved speed and cost efficiency.

Position

SoccerNet 2024 Challenges Results

1 code implementation 16 Sep 2024 Anthony Cioppa, Silvio Giancola, Vladimir Somers, Victor Joos, Floriane Magera, Jan Held, Seyed Abolfazl Ghasemzadeh, Xin Zhou, Karolina Seweryn, Mateusz Kowalczyk, Zuzanna Mróz, Szymon Łukasik, Michał Hałoń, Hassan Mkhallati, Adrien Deliège, Carlos Hinojosa, Karen Sanchez, Amir M. Mansourian, Pierre Miralles, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Adam Gorski, Albert Clapés, Andrei Boiarov, Anton Afanasiev, Artur Xarles, Atom Scott, Byoungkwon Lim, Calvin Yeung, Cristian Gonzalez, Dominic Rüfenacht, Enzo Pacilio, Fabian Deuser, Faisal Sami Altawijri, Francisco Cachón, Hankyul Kim, Haobo Wang, Hyeonmin Choe, Hyunwoo J Kim, Il-Min Kim, Jae-Mo Kang, Jamshid Tursunboev, Jian Yang, Jihwan Hong, JiMin Lee, Jing Zhang, Junseok Lee, Kexin Zhang, Konrad Habel, Licheng Jiao, Linyi Li, Marc Gutiérrez-Pérez, Marcelo Ortega, Menglong Li, Milosz Lopatto, Nikita Kasatkin, Nikolay Nemtsev, Norbert Oswald, Oleg Udin, Pavel Kononov, Pei Geng, Saad Ghazai Alotaibi, Sehyung Kim, Sergei Ulasen, Sergio Escalera, Shanshan Zhang, Shuyuan Yang, Sunghwan Moon, Thomas B. Moeslund, Vasyl Shandyba, Vladimir Golovkin, Wei Dai, WonTaek Chung, Xinyu Liu, Yongqiang Zhu, Youngseo Kim, Yuan Li, Yuting Yang, Yuxuan Xiao, Zehua Cheng, Zhihao LI

The SoccerNet 2024 challenges represent the fourth annual video understanding challenges organized by the SoccerNet team.

Action Spotting Dense Video Captioning +2

Unveiling Context-Related Anomalies: Knowledge Graph Empowered Decoupling of Scene and Action for Human-Related Video Anomaly Detection

no code implementations 5 Sep 2024 Chenglizhao Chen, Xinyu Liu, Mengke Song, Luming Li, Xu Yu, Shanchen Pang

In short, current methods struggle to integrate low-level visual and high-level action features, leading to poor anomaly detection in varied and complex scenes.

Anomaly Detection Video Anomaly Detection

NDP: Next Distribution Prediction as a More Broad Target

no code implementations 30 Aug 2024 Junhao Ruan, Abudukeyumu Abudula, Xinyu Liu, Bei Li, Yinqiao Li, Chenglong Wang, Yuchun Fan, Yuan Ge, Tong Xiao, Jingbo Zhu

In our work, we extend the critique of NTP, highlighting its limitation also due to training with a narrow objective: the prediction of a sub-optimal one-hot distribution.

Data Compression Domain Adaptation +2

AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems

1 code implementation 27 Aug 2024 Chi-Min Chan, Jianxuan Yu, Weize Chen, Chunyang Jiang, Xinyu Liu, Weijie Shi, Zhiyuan Liu, Wei Xue, Yike Guo

However, configuring an MAS for a task remains challenging, with performance only observable post-execution.

MTFinEval: A Multi-domain Chinese Financial Benchmark with Eurypalynous questions

no code implementations 20 Aug 2024 Xinyu Liu, Ke Jin

In this paper, we have compiled a new benchmark, MTFinEval, focusing on the LLMs' basic knowledge of economics, which can always be used as a basis for judgment.

LSVOS Challenge 3rd Place Report: SAM2 and Cutie based VOS

no code implementations 20 Aug 2024 Xinyu Liu, Jing Zhang, Kexin Zhang, Xu Liu, Lingling Li

Video Object Segmentation (VOS) presents several challenges, including object occlusion and fragmentation, the dis-appearance and re-appearance of objects, and tracking specific objects within crowded scenes.

Instance Segmentation Object +5

DiffRect: Latent Diffusion Label Rectification for Semi-supervised Medical Image Segmentation

1 code implementation 13 Jul 2024 Xinyu Liu, Wuyang Li, Yixuan Yuan

DiffRect first utilizes a Label Context Calibration Module (LCC) to calibrate the biased relationship between classes by learning the category-wise correlation in pseudo labels, then applies a Latent Feature Rectification Module (LFR) on the latent space to formulate and align the pseudo label distributions of different levels via latent diffusion.

Denoising Image Segmentation +4

GTP-4o: Modality-prompted Heterogeneous Graph Learning for Omni-modal Biomedical Representation

no code implementations 8 Jul 2024 Chenxin Li, Xinyu Liu, Cheng Wang, Yifan Liu, Weihao Yu, Jing Shao, Yixuan Yuan

To tackle these, we propose an innovative Modality-prompted Heterogeneous Graph for Omnimodal Learning (GTP-4o), which embeds the numerous disparate clinical modalities into a unified representation, completes the deficient embedding of missing modality and reformulates the cross-modal learning with a graph-based aggregation.

Benchmarking Graph Embedding +2

3rd Place Solution for MOSE Track in CVPR 2024 PVUW workshop: Complex Video Object Segmentation

no code implementations 6 Jun 2024 Xinyu Liu, Jing Zhang, Kexin Zhang, Yuting Yang, Licheng Jiao, Shuyuan Yang

Video Object Segmentation (VOS) is a vital task in computer vision, focusing on distinguishing foreground objects from the background across video frames.

Object Position +4

U-KAN Makes Strong Backbone for Medical Image Segmentation and Generation

2 code implementations 5 Jun 2024 Chenxin Li, Xinyu Liu, Wuyang Li, Cheng Wang, Hengyu Liu, Yifan Liu, Zhen Chen, Yixuan Yuan

We further delved into the potential of U-KAN as an alternative U-Net noise predictor in diffusion models, demonstrating its applicability in generating task-oriented model architectures.

Image Segmentation Kolmogorov-Arnold Networks +3

A Fourier Approach to the Parameter Estimation Problem for One-dimensional Gaussian Mixture Models

no code implementations 19 Apr 2024 Xinyu Liu, Hai Zhang

Second, we reveal that there exists a fundamental limit to the problem of estimating the number of Gaussian components or model order in the mixture model if the number of i.i.d. samples is finite.

A Structure-Guided Gauss-Newton Method for Shallow ReLU Neural Network

no code implementations 7 Apr 2024 Zhiqiang Cai, Tong Ding, Min Liu, Xinyu Liu, Jianlin Xia

In this paper, we propose a structure-guided Gauss-Newton (SgGN) method for solving least squares problems using a shallow ReLU neural network.

Light the Night: A Multi-Condition Diffusion Framework for Unpaired Low-Light Enhancement in Autonomous Driving

no code implementations CVPR 2024 Jinlong Li, Baolu Li, Zhengzhong Tu, Xinyu Liu, Qing Guo, Felix Juefei-Xu, Runsheng Xu, Hongkai Yu

Vision-centric perception systems for autonomous driving have gained considerable attention recently due to their cost-effectiveness and scalability, especially compared to LiDAR-based systems.

Autonomous Driving

TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

1 code implementation 28 Mar 2024 Bu Jin, Yupeng Zheng, Pengfei Li, Weize Li, Yuhang Zheng, Sujie Hu, Xinyu Liu, Jinwei Zhu, Zhijie Yan, Haiyang Sun, Kun Zhan, Peng Jia, Xiaoxiao Long, Yilun Chen, Hao Zhao

However, the exploration of 3D dense captioning in outdoor scenes is hindered by two major challenges: 1) the domain gap between indoor and outdoor scenes, such as dynamics and sparse visual inputs, makes it difficult to directly adapt existing indoor methods; 2) the lack of data with comprehensive box-caption pair annotations specifically tailored for outdoor scenes.

3D dense captioning Dense Captioning

Endora: Video Generation Models as Endoscopy Simulators

no code implementations 17 Mar 2024 Chenxin Li, Hengyu Liu, Yifan Liu, Brandon Y. Feng, Wuyang Li, Xinyu Liu, Zhen Chen, Jing Shao, Yixuan Yuan

In a nutshell, Endora marks a notable breakthrough in the deployment of generative AI for clinical endoscopy research, setting a substantial stage for further advances in medical content generation.

Data Augmentation Video Generation

V2X-DGW: Domain Generalization for Multi-agent Perception under Adverse Weather Conditions

no code implementations 17 Mar 2024 Baolu Li, Jinlong Li, Xinyu Liu, Runsheng Xu, Zhengzhong Tu, Jiacheng Guo, Xiaopeng Li, Hongkai Yu

In this paper, we propose a Domain Generalization based approach, named V2X-DGW, for LiDAR-based 3D object detection in multi-agent perception systems under adverse weather conditions.

3D Object Detection Domain Generalization +2

UN-SAM: Universal Prompt-Free Segmentation for Generalized Nuclei Images

1 code implementation 26 Feb 2024 Zhen Chen, Qing Xu, Xinyu Liu, Yixuan Yuan

Moreover, to unleash the generalization capability of SAM across a variety of nuclei images, we devise a Domain-adaptive Tuning Encoder (DT-Encoder) to seamlessly harmonize visual features with domain-common and domain-specific knowledge, and further devise a Domain Query-enhanced Decoder (DQ-Decoder) by leveraging learnable domain queries for segmentation decoding in different nuclei domains.

Decoder Segmentation +1

Breaking Data Silos: Cross-Domain Learning for Multi-Agent Perception from Independent Private Sources

1 code implementation 6 Feb 2024 Jinlong Li, Baolu Li, Xinyu Liu, Runsheng Xu, Jiaqi Ma, Hongkai Yu

However, the data source to train the various agents is independent and private in each company, leading to the Distribution Gap of different private data for training distinct agents in multi-agent perception system.

3D Object Detection object-detection

AdvGPS: Adversarial GPS for Multi-Agent Perception Attack

1 code implementation 30 Jan 2024 Jinlong Li, Baolu Li, Xinyu Liu, Jianwu Fang, Felix Juefei-Xu, Qing Guo, Hongkai Yu

The multi-agent perception system collects visual data from sensors located on various agents and leverages their relative poses determined by GPS signals to effectively fuse information, mitigating the limitations of single-agent sensing, such as occlusion.

Adversarial Attack object-detection +1

LiON: Learning Point-wise Abstaining Penalty for LiDAR Outlier DetectioN Using Diverse Synthetic Data

2 code implementations 19 Sep 2023 Shaocong Xu, Pengfei Li, Qianpu Sun, Xinyu Liu, Yang Li, Shihui Guo, Zhen Wang, Bo Jiang, Rui Wang, Kehua Sheng, Bo Zhang, Li Jiang, Hao Zhao, Yilun Chen

We demonstrate that learning different abstaining penalties, apart from point-wise penalty, for different types of (synthesized) outliers can further improve the performance.

Anomaly Detection Autonomous Driving +2

Domain Adaptation based Object Detection for Autonomous Driving in Foggy and Rainy Weather

1 code implementation 18 Jul 2023 Jinlong Li, Runsheng Xu, Xinyu Liu, Jin Ma, Baolu Li, Qin Zou, Jiaqi Ma, Hongkai Yu

To bridge the domain gap and improve the performance of object detection in foggy and rainy weather, this paper presents a novel framework for domain-adaptive object detection.

Autonomous Driving Data Augmentation +4

S2R-ViT for Multi-Agent Cooperative Perception: Bridging the Gap from Simulation to Reality

no code implementations 16 Jul 2023 Jinlong Li, Runsheng Xu, Xinyu Liu, Baolu Li, Qin Zou, Jiaqi Ma, Hongkai Yu

We investigate the effects of these two types of domain gaps and propose a novel uncertainty-aware vision transformer to effectively relieve the Deployment Gap and an agent-based feature adaptation module with inter-agent and ego-agent discriminators to reduce the Feature Gap.

3D Object Detection object-detection +1

Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos

no code implementations 27 Jun 2023 Chiori Hori, Puyuan Peng, David Harwath, Xinyu Liu, Kei Ota, Siddarth Jain, Radu Corcodel, Devesh Jha, Diego Romeres, Jonathan Le Roux

This paper introduces a method for robot action sequence generation from instruction videos using (1) an audio-visual Transformer that converts audio-visual features and instruction speech to a sequence of robot actions called dynamic movement primitives (DMPs) and (2) style-transfer-based training that employs multi-task learning with video captioning and weakly-supervised learning with a semantic classifier to exploit unpaired video-action data.

Multi-Task Learning Scene Understanding +3

Deep Transfer Learning for Intelligent Vehicle Perception: a Survey

no code implementations 26 Jun 2023 Xinyu Liu, Jinlong Li, Jin Ma, Huiming Sun, Zhigang Xu, Tianyun Zhang, Hongkai Yu

To the best of our knowledge, this paper represents the first comprehensive survey on the topic of the deep transfer learning for intelligent vehicle perception.

Autonomous Driving Decision Making +4

Towards Robust Aspect-based Sentiment Analysis through Non-counterfactual Augmentations

no code implementations 24 Jun 2023 Xinyu Liu, Yan Ding, Kaikai An, Chunyang Xiao, Pranava Madhyastha, Tong Xiao, Jingbo Zhu

While state-of-the-art NLP models have demonstrated excellent performance for aspect based sentiment analysis (ABSA), substantial evidence has been presented on their lack of robustness.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +2

EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention

3 code implementations CVPR 2023 Xinyu Liu, Houwen Peng, Ningxin Zheng, Yuqing Yang, Han Hu, Yixuan Yuan

Comprehensive experiments demonstrate EfficientViT outperforms existing efficient models, striking a good trade-off between speed and accuracy.

Instruction-ViT: Multi-Modal Prompts for Instruction Learning in ViT

no code implementations 29 Apr 2023 Zhenxiang Xiao, Yuzhong Chen, Lu Zhang, Junjie Yao, Zihao Wu, Xiaowei Yu, Yi Pan, Lin Zhao, Chong Ma, Xinyu Liu, Wei Liu, Xiang Li, Yixuan Yuan, Dinggang Shen, Dajiang Zhu, Tianming Liu, Xi Jiang

Prompts have been proven to play a crucial role in large language models, and in recent years, vision models have also been using prompts to improve scalability for multiple downstream tasks.

Image Classification

Delving into Shape-aware Zero-shot Semantic Segmentation

1 code implementation CVPR 2023 Xinyu Liu, Beiwen Tian, Zhen Wang, Rui Wang, Kehua Sheng, Bo Zhang, Hao Zhao, Guyue Zhou

Thanks to the impressive progress of large-scale vision-language pretraining, recent recognition models can classify arbitrary objects in a zero-shot and open-set manner, with a surprisingly high accuracy.

Image Segmentation Segmentation +2

ADAPT: Action-aware Driving Caption Transformer

1 code implementation 1 Feb 2023 Bu Jin, Xinyu Liu, Yupeng Zheng, Pengfei Li, Hao Zhao, Tong Zhang, Yuhang Zheng, Guyue Zhou, Jingjing Liu

To bridge the gap, we propose an end-to-end transformer-based architecture, ADAPT (Action-aware Driving cAPtion Transformer), which provides user-friendly natural language narrations and reasoning for each decision making step of autonomous vehicular control and action.

Autonomous Driving Decision Making

Learning for Vehicle-to-Vehicle Cooperative Perception under Lossy Communication

1 code implementation 16 Dec 2022 Jinlong Li, Runsheng Xu, Xinyu Liu, Jin Ma, Zicheng Chi, Jiaqi Ma, Hongkai Yu

Thanks to Vehicle-to-Vehicle (V2V) communication, deep learning based features from other agents can be shared with the ego vehicle to improve its perception.

3D Object Detection object-detection

Towards Robust Adaptive Object Detection under Noisy Annotations

1 code implementation CVPR 2022 Xinyu Liu, Wuyang Li, Qiushi Yang, Baopu Li, Yixuan Yuan

Domain Adaptive Object Detection (DAOD) models a joint distribution of images and labels from an annotated source domain and learns a domain-invariant transformation to estimate the target labels with the given target domain images.

Object object-detection +1

SIGMA: Semantic-complete Graph Matching for Domain Adaptive Object Detection

1 code implementation CVPR 2022 Wuyang Li, Xinyu Liu, Yixuan Yuan

To overcome these challenges, we propose a novel SemantIc-complete Graph MAtching (SIGMA) framework for DAOD, which completes mismatched semantics and reformulates the adaptation with graph matching.

Graph Matching Hallucination +2

Neural Architecture Searching for Facial Attributes-based Depression Recognition

no code implementations 24 Jan 2022 Mingzhe Chen, Xi Xiao, Bin Zhang, Xinyu Liu, Runiu Lu

In this paper, we propose to extend Neural Architecture Search (NAS) technique for designing an optimal model for multiple facial attributes-based depression recognition, which can be efficiently and robustly implemented in a small dataset.

Attribute Neural Architecture Search +1

Maintaining Reasoning Consistency in Compositional Visual Question Answering

1 code implementation CVPR 2022 Chenchen Jing, Yunde Jia, Yuwei Wu, Xinyu Liu, Qi Wu

Existing VQA models can answer a compositional question well, but cannot work well in terms of reasoning consistency in answering the compositional question and its sub-questions.

Question Answering Visual Question Answering

Exploring Gradient Flow Based Saliency for DNN Model Compression

1 code implementation 24 Oct 2021 Xinyu Liu, Baopu Li, Zhen Chen, Yixuan Yuan

Model pruning aims to reduce the deep neural network (DNN) model size or computational overhead.

Image Classification Image Denoising +1

Generalizing to New Domains by Mapping Natural Language to Lifted LTL

no code implementations 11 Oct 2021 Eric Hsiung, Hiloni Mehta, Junchi Chu, Xinyu Liu, Roma Patel, Stefanie Tellex, George Konidaris

We compare our method of mapping natural language task specifications to intermediate contextual queries against state-of-the-art CopyNet models capable of translating natural language to LTL, by evaluating whether correct LTL for manipulation and navigation task specifications can be output, and show that our method outperforms the CopyNet model on unseen object references.

3D RegNet: Deep Learning Model for COVID-19 Diagnosis on Chest CT Image

no code implementations 8 Jul 2021 Haibo Qi, YuHan Wang, Xinyu Liu

In this paper, a 3D-RegNet-based neural network is proposed for diagnosing the physical condition of patients with coronavirus (Covid-19) infection.

COVID-19 Diagnosis

Showing Your Work Doesn't Always Work

1 code implementation ACL 2020 Raphael Tang, Jaejun Lee, Ji Xin, Xinyu Liu, Yao-Liang Yu, Jimmy Lin

In natural language processing, a recently popular line of work explores how to best report the experimental results of neural networks.

Review of data analysis in vision inspection of power lines with an in-depth discussion of deep learning technology

no code implementations 22 Mar 2020 Xinyu Liu, Xiren Miao, Hao Jiang, Jing Chen

With the aim of providing a comprehensive overview for researchers who are interested in developing a deep-learning-based analysis system for power lines inspection data, this paper conducts a thorough review of the current literature and identifies the challenges for future research.

Fault Diagnosis object-detection +1

TanhExp: A Smooth Activation Function with High Convergence Speed for Lightweight Neural Networks

no code implementations 22 Mar 2020 Xinyu Liu, Xiaoguang Di

Lightweight or mobile neural networks used for real-time computer vision tasks contain fewer parameters than normal networks, which leads to constrained performance.

Image Classification

Deep least-squares methods: an unsupervised learning-based numerical method for solving elliptic PDEs

1 code implementation 5 Nov 2019 Zhiqiang Cai, Jingshuang Chen, Min Liu, Xinyu Liu

This paper studies an unsupervised deep learning-based numerical approach for solving partial differential equations (PDEs).

Learning Efficient Lexically-Constrained Neural Machine Translation with External Memory

no code implementations 31 Jan 2019 Ya Li, Xinyu Liu, Dan Liu, Xueqiang Zhang, Junhua Liu

Recent years have witnessed dramatic progress in neural machine translation (NMT); however, the method of manually guiding the translation procedure remains to be better explored.

Machine Translation NMT +2

Spelling Error Correction Using a Nested RNN Model and Pseudo Training Data

no code implementations 1 Nov 2018 Hao Li, Yang Wang, Xinyu Liu, Zhichao Sheng, Si Wei

We propose a nested recurrent neural network (nested RNN) model for English spelling error correction and generate pseudo data based on phonetic similarity to train it.

Feature Engineering

Dex-Net 3.0: Computing Robust Robot Vacuum Suction Grasp Targets in Point Clouds using a New Analytic Model and Deep Learning

no code implementations 19 Sep 2017 Jeffrey Mahler, Matthew Matl, Xinyu Liu, Albert Li, David Gealy, Ken Goldberg

Vacuum-based end effectors are widely used in industry and are often preferred over parallel-jaw and multifinger grippers due to their ability to lift objects with a single point of contact.

Robotics

Dex-Net 2.0: Deep Learning to Plan Robust Grasps with Synthetic Point Clouds and Analytic Grasp Metrics

no code implementations 27 Mar 2017 Jeffrey Mahler, Jacky Liang, Sherdil Niyaz, Michael Laskey, Richard Doan, Xinyu Liu, Juan Aparicio Ojea, Ken Goldberg

To reduce data collection time for deep learning of robust robotic grasp plans, we explore training from a synthetic dataset of 6.7 million point clouds, grasps, and analytic grasp metrics generated from thousands of 3D models from Dex-Net 1.0 in randomized poses on a table.

Robotics
