Search Results for author: Lei Ke

Found 24 papers, 21 papers with code

M$^3$PC: Test-time Model Predictive Control for Pretrained Masked Trajectory Model

1 code implementation · 7 Dec 2024 · Kehan Wen, Yutong Hu, Yao Mu, Lei Ke

Recent work in Offline Reinforcement Learning (RL) has shown that a unified Transformer trained under a masked auto-encoding objective can effectively capture the relationships between different modalities (e.g., states, actions, rewards) within given trajectory datasets.
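The abstract above centers on masked auto-encoding over trajectory tokens. Below is a minimal, hypothetical sketch of such an objective, assuming trajectories are already embedded into interleaved state/action/reward tokens; it is not the M$^3$PC code, and the dimensions and masking ratio are illustrative.

```python
# Hypothetical sketch of a masked trajectory modeling objective (not the M^3PC code).
# Assumes trajectories are tokenized into interleaved (state, action, reward) embeddings.
import torch
import torch.nn as nn

class MaskedTrajectoryModel(nn.Module):
    def __init__(self, token_dim=64, n_layers=2, n_heads=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=token_dim, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.mask_token = nn.Parameter(torch.zeros(token_dim))  # learned [MASK] embedding
        self.head = nn.Linear(token_dim, token_dim)             # reconstructs masked tokens

    def forward(self, tokens, mask_ratio=0.5):
        # tokens: (batch, seq_len, token_dim) interleaved state/action/reward embeddings
        b, t, d = tokens.shape
        mask = torch.rand(b, t, device=tokens.device) < mask_ratio
        corrupted = torch.where(mask.unsqueeze(-1), self.mask_token.expand(b, t, d), tokens)
        recon = self.head(self.encoder(corrupted))
        # reconstruction loss only on the masked positions
        return ((recon - tokens) ** 2)[mask].mean()

model = MaskedTrajectoryModel()
loss = model(torch.randn(8, 30, 64))   # 8 trajectories, 30 tokens each
loss.backward()
```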

D4RL · Model Predictive Control +1

Video Depth without Video Models

no code implementations · 28 Nov 2024 · Bingxin Ke, Dominik Narnhofer, Shengyu Huang, Lei Ke, Torben Peters, Katerina Fragkiadaki, Anton Obukhov, Konrad Schindler

Video depth estimation lifts monocular video clips to 3D by inferring dense depth at every frame.

Depth Estimation

SLAck: Semantic, Location, and Appearance Aware Open-Vocabulary Tracking

1 code implementation · 17 Sep 2024 · Siyuan Li, Lei Ke, Yung-Hsu Yang, Luigi Piccinelli, Mattia Segù, Martin Danelljan, Luc van Gool

Due to the complexity of motion patterns in large-vocabulary scenarios and the unstable classification of novel objects, existing methods either ignore motion and semantic cues or apply them heuristically in the final matching steps.
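As a toy illustration of the cues named in the title (not the learned SLAck architecture, which fuses them jointly), one could blend semantic, location, and appearance similarities into a single association matrix and match detections to tracks with the Hungarian algorithm; the weights and features below are made up.

```python
# Toy illustration of fusing semantic, location, and appearance cues into one
# association matrix for tracking, then matching with the Hungarian algorithm.
import numpy as np
from scipy.optimize import linear_sum_assignment

def cosine_sim(a, b):
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

def associate(tracks, detections, w_sem=0.3, w_loc=0.3, w_app=0.4):
    # Each track/detection carries a semantic embedding, a box center, and an appearance embedding.
    sem = cosine_sim(tracks["sem"], detections["sem"])
    app = cosine_sim(tracks["app"], detections["app"])
    dist = np.linalg.norm(tracks["center"][:, None] - detections["center"][None, :], axis=-1)
    loc = 1.0 / (1.0 + dist)                     # closer centers -> higher similarity
    score = w_sem * sem + w_loc * loc + w_app * app
    rows, cols = linear_sum_assignment(-score)   # maximize total similarity
    return list(zip(rows, cols))

tracks = {"sem": np.random.rand(3, 16), "app": np.random.rand(3, 32), "center": np.random.rand(3, 2)}
dets   = {"sem": np.random.rand(4, 16), "app": np.random.rand(4, 32), "center": np.random.rand(4, 2)}
print(associate(tracks, dets))
```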

Multiple Object Tracking

Matching Anything by Segmenting Anything

1 code implementation · CVPR 2024 · Siyuan Li, Lei Ke, Martin Danelljan, Luigi Piccinelli, Mattia Segu, Luc van Gool, Fisher Yu

The robust association of the same objects across video frames in complex scenes is crucial for many applications, especially Multiple Object Tracking (MOT).

Domain Generalization · Multiple Object Tracking +2

DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos

1 code implementation · 3 May 2024 · Wen-Hsuan Chu, Lei Ke, Katerina Fragkiadaki

There are two challenges in this direction: first, rendering error gradients are often insufficient to recover fast object motion, and second, view-predictive generative models work much better for objects than for whole scenes, so score distillation objectives cannot currently be applied directly at the scene level.

Depth Estimation · Depth Prediction +4

LLM-Seg: Bridging Image Segmentation and Large Language Model Reasoning

1 code implementation · 12 Apr 2024 · Junchi Wang, Lei Ke

In this work, we delve into reasoning segmentation, a novel task that enables a segmentation system to reason about and interpret implicit user intentions via large language model reasoning and then segment the corresponding target.
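A hypothetical sketch of what such a reason-then-segment pipeline can look like is given below; the StubLLM and StubSegmenter classes are placeholders, not the LLM-Seg API.

```python
# Hypothetical sketch (not the LLM-Seg implementation) of reasoning segmentation:
# an LLM resolves the implicit request into an explicit target, then a promptable
# segmenter is queried with that target. Stub classes stand in for the real models.
class StubLLM:
    def complete(self, prompt: str) -> str:
        return "the knife on the table"          # stand-in for an actual LLM call

class StubSegmenter:
    def segment(self, image, text_prompt: str):
        return f"<mask for '{text_prompt}'>"     # stand-in for a text-promptable segmenter

def reasoning_segmentation(image, implicit_query, llm, segmenter):
    # 1) Turn the implicit intention into an explicit referring phrase.
    phrase = llm.complete(f"User request: '{implicit_query}'. Name the object to segment.")
    # 2) Segment the named target.
    return phrase, segmenter.segment(image, text_prompt=phrase)

print(reasoning_segmentation("photo.jpg", "something to cut bread with", StubLLM(), StubSegmenter()))
```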

Image Segmentation · Language Modelling +4

Gaussian Grouping: Segment and Edit Anything in 3D Scenes

1 code implementation · 1 Dec 2023 · Mingqiao Ye, Martin Danelljan, Fisher Yu, Lei Ke

To address this issue, we propose Gaussian Grouping, which extends Gaussian Splatting to jointly reconstruct and segment anything in open-world 3D scenes.
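One way to picture "jointly reconstruct and segment" is to attach a compact identity feature to every Gaussian alongside the usual splatting parameters and classify it into a group. The sketch below is illustrative only; it omits the differentiable rendering and mask supervision that drive the real training, and all sizes are arbitrary.

```python
# Illustrative sketch only: attaching a per-Gaussian identity feature for grouping
# on top of standard 3D Gaussian Splatting parameters.
import torch
import torch.nn as nn

class GroupedGaussians(nn.Module):
    def __init__(self, n_gaussians=1000, id_dim=16, n_groups=20):
        super().__init__()
        # Standard splatting parameters (positions, opacity; scales/rotations/colors omitted)...
        self.xyz = nn.Parameter(torch.randn(n_gaussians, 3))
        self.opacity = nn.Parameter(torch.zeros(n_gaussians, 1))
        # ...plus a compact identity embedding per Gaussian, optimized jointly.
        self.identity = nn.Parameter(torch.randn(n_gaussians, id_dim) * 0.01)
        self.classifier = nn.Linear(id_dim, n_groups)  # maps identity features to group logits

    def group_ids(self):
        # Hard group assignment per Gaussian (used e.g. to select or edit one object).
        return self.classifier(self.identity).argmax(dim=-1)

scene = GroupedGaussians()
print(scene.group_ids().shape)   # torch.Size([1000])
```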

Colorization · Novel View Synthesis +3

Stable Segment Anything Model

1 code implementation · 27 Nov 2023 · Qi Fan, Xin Tao, Lei Ke, Mingqiao Ye, Yuan Zhang, Pengfei Wan, Zhongyuan Wang, Yu-Wing Tai, Chi-Keung Tang

Thus, our solution, termed Stable-SAM, offers several advantages: 1) improved segmentation stability for SAM across a wide range of prompt qualities, while 2) retaining SAM's powerful promptable segmentation efficiency and generality, with 3) minimal learnable parameters (0.08M) and fast adaptation (one training epoch).

Segmentation

Cascade-DETR: Delving into High-Quality Universal Object Detection

1 code implementation · ICCV 2023 · Mingqiao Ye, Lei Ke, Siyuan Li, Yu-Wing Tai, Chi-Keung Tang, Martin Danelljan, Fisher Yu

While dominating on the COCO benchmark, recent Transformer-based detection methods are not competitive in diverse domains.

Decoder · Object +3

Segment Anything Meets Point Tracking

1 code implementation · 3 Jul 2023 · Frano Rajič, Lei Ke, Yu-Wing Tai, Chi-Keung Tang, Martin Danelljan, Fisher Yu

The Segment Anything Model (SAM) has established itself as a powerful zero-shot image segmentation model, enabled by efficient point-centric annotation and prompt-based models.
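Combining a promptable segmenter with point tracking can be pictured as the loop sketched below: track a handful of query points across frames, then re-prompt the segmenter with them in every frame. The tracker and segmenter here are stand-ins, not the SAM-PT API.

```python
# Hypothetical sketch of the "segment anything meets point tracking" idea:
# propagate query points through the video with a point tracker, then prompt an
# image segmenter with those points on every frame. Both models are stand-ins.
import numpy as np

def track_points(prev_points, frame):
    # Stand-in for a long-term point tracker.
    return prev_points + np.random.randn(*prev_points.shape) * 0.5

def segment_from_points(frame, points):
    # Stand-in for a promptable segmenter queried with positive point prompts.
    return {"prompt_points": points.round(1).tolist()}

def video_object_segmentation(frames, init_points):
    points, masks = init_points, []
    for frame in frames:
        points = track_points(points, frame)               # 1) propagate the query points
        masks.append(segment_from_points(frame, points))   # 2) re-prompt the segmenter
    return masks

frames = [np.zeros((64, 64, 3)) for _ in range(3)]
print(video_object_segmentation(frames, np.array([[10.0, 12.0], [30.0, 31.0]])))
```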

Interactive Video Object Segmentation · Object +5

Occlusion-Aware Instance Segmentation via BiLayer Network Architectures

1 code implementation · 8 Aug 2022 · Lei Ke, Yu-Wing Tai, Chi-Keung Tang

Unlike previous instance segmentation methods, we model image formation as a composition of two overlapping layers, and propose Bilayer Convolutional Network (BCNet), where the top layer detects occluding objects (occluders) and the bottom layer infers partially occluded instances (occludees).
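A minimal sketch of the bilayer idea (shared RoI features, a top occluder branch whose features condition a bottom occludee branch) is shown below; it is a simplification under assumed feature sizes, not the BCNet head.

```python
# Minimal, hypothetical sketch of the bilayer decoupling described above (not the
# BCNet implementation): a top branch predicts the occluder mask, and its features
# feed a bottom branch that predicts the occludee mask.
import torch
import torch.nn as nn

class BilayerMaskHead(nn.Module):
    def __init__(self, in_ch=256):
        super().__init__()
        self.occluder_branch = nn.Sequential(nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.ReLU())
        self.occludee_branch = nn.Sequential(nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.ReLU())
        self.occluder_mask = nn.Conv2d(in_ch, 1, 1)
        self.occludee_mask = nn.Conv2d(in_ch, 1, 1)

    def forward(self, roi_feat):
        top = self.occluder_branch(roi_feat)            # top layer: occluding objects
        bottom = self.occludee_branch(roi_feat + top)   # bottom layer sees occluder features
        return self.occluder_mask(top), self.occludee_mask(bottom)

head = BilayerMaskHead()
occluder, occludee = head(torch.randn(2, 256, 14, 14))
print(occluder.shape, occludee.shape)   # two (2, 1, 14, 14) mask logits
```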

Instance Segmentation · Segmentation +2

Video Mask Transfiner for High-Quality Video Instance Segmentation

1 code implementation · 28 Jul 2022 · Lei Ke, Henghui Ding, Martin Danelljan, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu

While Video Instance Segmentation (VIS) has seen rapid progress, current approaches struggle to predict high-quality masks with accurate boundary details.

Instance Segmentation · Semantic Segmentation +2

Occlusion-Aware Video Object Inpainting

no code implementations · ICCV 2021 · Lei Ke, Yu-Wing Tai, Chi-Keung Tang

To facilitate this new research, we construct the first large-scale video object inpainting benchmark YouTube-VOI to provide realistic occlusion scenarios with both occluded and visible object masks available.

Object · Texture Synthesis +1

Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers

1 code implementation · CVPR 2021 · Lei Ke, Yu-Wing Tai, Chi-Keung Tang

Segmenting highly-overlapping objects is challenging, because typically no distinction is made between real object contours and occlusion boundaries.

Amodal Instance Segmentation · Boundary Detection +4

Cascaded deep monocular 3D human pose estimation with evolutionary training data

1 code implementation · CVPR 2020 · Shichao Li, Lei Ke, Kevin Pratama, Yu-Wing Tai, Chi-Keung Tang, Kwang-Ting Cheng

End-to-end deep representation learning has achieved remarkable accuracy for monocular 3D human pose estimation, yet these models may fail for unseen poses with limited and fixed training data.
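The "evolutionary training data" in the title suggests synthesizing new poses by recombining existing ones. The toy sketch below shows crossover and mutation over joint coordinates; the actual method also keeps generated skeletons anatomically plausible, which this sketch ignores, and the skeleton size is an assumption.

```python
# Toy illustration of evolving new 3D training poses by crossover and mutation,
# in the spirit of the "evolutionary training data" idea above (not the paper's code).
import numpy as np

N_JOINTS = 17  # assumed skeleton size (e.g. Human3.6M-style)

def crossover(pose_a, pose_b):
    # Swap a random subset of joints between two parent poses.
    take_from_a = np.random.rand(N_JOINTS) < 0.5
    return np.where(take_from_a[:, None], pose_a, pose_b)

def mutate(pose, sigma=0.02):
    # Small random perturbation of joint positions (in meters).
    return pose + np.random.randn(*pose.shape) * sigma

parents = np.random.rand(2, N_JOINTS, 3)
child = mutate(crossover(parents[0], parents[1]))
print(child.shape)   # (17, 3): a new synthetic training pose
```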

Data Augmentation · Monocular 3D Human Pose Estimation +3

Reflective Decoding Network for Image Captioning

no code implementations · ICCV 2019 · Lei Ke, Wenjie Pei, Ruiyu Li, Xiaoyong Shen, Yu-Wing Tai

State-of-the-art image captioning methods mostly focus on improving visual features, while less attention has been paid to utilizing the inherent properties of language to boost captioning performance.

Decoder · Image Captioning +2

Memory-Attended Recurrent Network for Video Captioning

1 code implementation · CVPR 2019 · Wenjie Pei, Jiyuan Zhang, Xiangrong Wang, Lei Ke, Xiaoyong Shen, Yu-Wing Tai

Typical techniques for video captioning follow the encoder-decoder framework, which can only focus on one source video being processed.
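The memory-attended idea can be pictured as the decoder also attending over a vocabulary-indexed memory of visual contexts collected across many videos. The snippet below is a loose, hypothetical illustration with random tensors, not the MARN model; the memory layout and sizes are assumptions.

```python
# Hypothetical sketch of attending over a cross-video word memory during decoding
# (not the MARN code): each vocabulary word stores visual contexts gathered from
# many training videos, and the decoder attends over that memory.
import torch
import torch.nn.functional as F

vocab_size, mem_slots, feat_dim = 1000, 5, 128
# memory[w] holds feature vectors of visual contexts in which word w appeared across videos
memory = torch.randn(vocab_size, mem_slots, feat_dim)

def memory_attend(word_id, decoder_state):
    slots = memory[word_id]                          # (mem_slots, feat_dim)
    attn = F.softmax(slots @ decoder_state, dim=0)   # relevance of each stored context
    return attn @ slots                              # memory summary for this word

summary = memory_attend(word_id=42, decoder_state=torch.randn(feat_dim))
print(summary.shape)   # torch.Size([128])
```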

Decoder · Video Captioning
