no code implementations • 7 Jan 2025 • Zhengnan Sun, Zhaotai Shi, Jiayin Chen, Qingtao Liu, Yu Cui, Qi Ye, Jiming Chen
In this paper, we introduce VTAO-BiManip, a novel framework that combines visual-tactile-action pretraining with object understanding, facilitating curriculum reinforcement learning for human-like bimanual manipulation.
1 code implementation • 1 May 2024 • Yu Cui, Feng Liu, Pengbo Wang, Bohao Wang, Heng Tang, Yi Wan, Jun Wang, Jiawei Chen
Owing to their powerful semantic reasoning capabilities, Large Language Models (LLMs) have been effectively utilized as recommenders, achieving impressive performance.
1 code implementation • 12 Dec 2023 • Jian Zhu, Yu Cui, Zhangmin Huang, Xingyu Li, Lei Liu, Lingfang Zeng, Li-Rong Dai
Furthermore, an adaptive confidence multi-view network is employed to measure the confidence of each view and then fuse multi-view features through a weighted summation.
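Confidence-weighted fusion of this kind can be sketched as a softmax over per-view confidence scores followed by a weighted sum of the view features. This is a minimal illustration of the general mechanism, not the paper's actual network; the function names and interface are assumptions.

```python
import numpy as np

def fuse_multiview(features, confidence_logits):
    """Weighted-summation fusion sketch (illustrative, not the paper's model).

    features: list of per-view feature vectors, each of shape (d,)
    confidence_logits: array of per-view scalar confidence scores
    """
    logits = np.asarray(confidence_logits, dtype=float)
    # Softmax over views turns raw confidences into fusion weights.
    w = np.exp(logits - logits.max())
    w /= w.sum()
    # Weighted sum of view features; low-confidence views contribute less.
    return sum(wi * np.asarray(f, dtype=float) for wi, f in zip(w, features))
```

With equal confidences this reduces to a simple mean of the views; a view with a much higher confidence score dominates the fused feature.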
no code implementations • 26 Aug 2023 • Jian Zhu, Wen Cheng, Yu Cui, Chang Tang, Yuyang Dai, Yong Li, Lingfang Zeng
Hash representation learning of multi-view heterogeneous data is the key to improving the accuracy of multimedia retrieval.
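Hash representation learning generally means mapping continuous features to compact binary codes that are compared by Hamming distance, which makes large-scale retrieval fast. A minimal sign-projection sketch of that idea follows; the projection here is a fixed illustrative matrix, whereas the paper learns the mapping.

```python
import numpy as np

def hash_codes(features, projection):
    # Project features and binarize with sign to get +1/-1 hash codes
    # (illustrative scheme, not the paper's learned hashing network).
    return np.sign(np.asarray(features) @ projection)

def hamming_distance(a, b):
    # Retrieval compares codes by counting differing bits.
    return int(np.sum(a != b))
```

Nearest neighbors under Hamming distance on the codes then stand in for expensive similarity search in the original feature space.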
no code implementations • 13 Apr 2023 • Jian Zhu, Zhangmin Huang, Xiaohu Ruan, Yu Cui, Yongli Cheng, Lingfang Zeng
However, existing methods fail to effectively fuse the multi-view features and utilize the metric information provided by the dissimilar samples, leading to limited retrieval precision.
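Metric information from dissimilar samples is commonly exploited with a margin-based (triplet-style) objective that pushes negatives away while pulling positives close. The sketch below shows that standard formulation; the paper's exact loss may differ.

```python
import numpy as np

def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    # Standard triplet objective: penalize when the dissimilar (negative)
    # sample is not at least `margin` farther from the anchor than the
    # similar (positive) sample.
    d_pos = np.linalg.norm(np.asarray(anchor) - np.asarray(positive))
    d_neg = np.linalg.norm(np.asarray(anchor) - np.asarray(negative))
    return max(0.0, d_pos - d_neg + margin)
```

When the negative already lies well beyond the margin the loss is zero, so only informative dissimilar samples drive the embedding updates.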
no code implementations • 18 Jun 2022 • Yu Cui, Moshiur Farazi
Visual Relationship Detection (VRD) impels a computer vision model to 'see' beyond an individual object instance and 'understand' how different objects in a scene are related.