no code implementations • 20 Mar 2025 • Zhihang Liu, Chen-Wei Xie, Pandeng Li, Liming Zhao, Longxiang Tang, Yun Zheng, Chuanbin Liu, Hongtao Xie
Specifically, the instruction condition is injected into the grouped visual tokens at the local level and into the learnable tokens at the global level, and an attention mechanism then performs the conditional compression.
1 code implementation • 5 Mar 2025 • Nianzu Yang, Pandeng Li, Liming Zhao, Yang Li, Chen-Wei Xie, Yehui Tang, Xudong Lu, Zhihang Liu, Yun Zheng, Yu Liu, Junchi Yan
Trained from scratch using only a basic MSE diffusion loss for reconstruction, along with a KL term and an LPIPS perceptual loss, CDT achieves state-of-the-art performance in video reconstruction tasks with just single-step sampling, as extensive experiments demonstrate.
no code implementations • 12 Dec 2024 • Mingda Jia, Liming Zhao, Ge Li, Yun Zheng
To enhance the capabilities of object detectors for HOI detection, we present a dual-branch framework named ContextHOI, which efficiently captures both object detection features and spatial contexts.
no code implementations • 11 Dec 2024 • Mingda Jia, Liming Zhao, Ge Li, Yun Zheng
Human-object interaction (HOI) detectors with the popular query-transformer architecture have achieved promising performance.
no code implementations • 10 Nov 2024 • Pingyu Wu, Kai Zhu, Yu Liu, Liming Zhao, Wei Zhai, Yang Cao, Zheng-Jun Zha
Specifically, the KTC architecture divides the latent space into two branches: one half completely inherits the compression prior of keyframes from a lower-dimensional image VAE, while the other half performs temporal compression through 3D group causal convolution, reducing temporal-spatial conflicts and accelerating the convergence of the video VAE.
no code implementations • 25 Sep 2023 • Liming Zhao, Aman Agrawal, Patrick Rebentrost
Restricted Boltzmann Machines (RBMs) are widely used probabilistic undirected graphical models with visible and latent nodes, playing an important role in statistics and machine learning.
no code implementations • 17 Sep 2023 • Liming Zhao, Naixu Guo, Ming-Xing Luo, Patrick Rebentrost
Several works consider subclasses of quantum states that can be learned with polynomial sample complexity, such as stabilizer states or high-temperature Gibbs states.
1 code implementation • NeurIPS 2023 • Pandeng Li, Chen-Wei Xie, Hongtao Xie, Liming Zhao, Lei Zhang, Yun Zheng, Deli Zhao, Yongdong Zhang
Video moment retrieval pursues an efficient and generalized solution to identify the specific temporal segments within an untrimmed video that correspond to a given language description.
1 code implementation • ICCV 2023 • Pandeng Li, Chen-Wei Xie, Liming Zhao, Hongtao Xie, Jiannan Ge, Yun Zheng, Deli Zhao, Yongdong Zhang
In the event-sentence prototype matching phase, we design a temporal prototype generation mechanism to associate intra-frame objects and interact inter-frame temporal relations.
no code implementations • ECCV 2020 • Lele Cheng, Xiangzeng Zhou, Liming Zhao, Dangwei Li, Hong Shang, Yun Zheng, Pan Pan, Yinghui Xu
In many real-world datasets, like WebVision, the performance of DNN-based classifiers is often limited by noisily labeled data.
no code implementations • 19 Feb 2019 • Yunpu Ma, Volker Tresp, Liming Zhao, Yuyi Wang
In this work, we propose the first quantum Ansätze for the statistical relational learning on knowledge graphs using parametric quantum circuits.
no code implementations • CVPR 2018 • Fangfang Wang, Liming Zhao, Xi Li, Xinchao Wang, DaCheng Tao
Localizing text in the wild is challenging when the targets have complicated geometric layouts, such as random orientations and large aspect ratios.
1 code implementation • ICCV 2017 • Liming Zhao, Xi Li, Jingdong Wang, Yueting Zhuang
In this paper, we address the problem of person re-identification, which refers to associating persons captured by different cameras.
Ranked #112 on Person Re-Identification on Market-1501
4 code implementations • 23 Nov 2016 • Liming Zhao, Jingdong Wang, Xi Li, Zhuowen Tu, Wen-Jun Zeng
A deep residual network, built by stacking a sequence of residual blocks, is easy to train, because identity mappings skip residual branches and thus improve information flow.
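The identity-skip idea in the snippet above can be illustrated with a minimal sketch. This is not the paper's architecture: the function names `residual_block` and `stack`, and the toy branches, are hypothetical and chosen only to show that each block computes `x + F(x)`, so the input always has a direct path past the residual branch.

```python
# Minimal sketch of stacked residual blocks on plain Python lists.
# Assumption: each "branch" is any function mapping a vector to a vector
# of the same length; real networks would use learned layers here.

def residual_block(x, branch):
    """Return x + branch(x): the identity path carries x past the branch."""
    fx = branch(x)
    return [xi + fi for xi, fi in zip(x, fx)]

def stack(x, branches):
    """Apply a sequence of residual blocks, one per branch."""
    for branch in branches:
        x = residual_block(x, branch)
    return x

# Toy branches (hypothetical): one outputs zeros, one scales its input.
zero_branch = lambda v: [0.0 for _ in v]
scale_branch = lambda v: [0.5 * vi for vi in v]

x = [1.0, -2.0, 3.0]
out_identity = stack(x, [zero_branch] * 4)   # branches contribute nothing
out_scaled = stack(x, [scale_branch] * 2)    # each block multiplies by 1.5
```

When every branch outputs zero, the whole stack reduces to the identity, which is the sense in which the skip connection keeps information flowing regardless of what the branches do.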
no code implementations • 19 Oct 2015 • Xi Li, Liming Zhao, Lina Wei, Ming-Hsuan Yang, Fei Wu, Yueting Zhuang, Haibin Ling, Jingdong Wang
A key problem in salient object detection is how to effectively model the semantic properties of salient objects in a data-driven manner.
no code implementations • 4 Dec 2014 • Liming Zhao, Xi Li, Jun Xiao, Fei Wu, Yueting Zhuang
As an important and challenging problem in computer vision and graphics, keypoint-based object tracking is typically formulated in a spatio-temporal statistical learning framework.