no code implementations • 13 Apr 2024 • Chenming Shang, Hengyuan Zhang, Hao Wen, Yujiu Yang
Multimodal deep neural networks, represented by CLIP, have enabled rich downstream applications owing to their excellent performance, making the understanding of CLIP's decision-making process an essential research topic.
2 code implementations • 10 Jan 2024 • Yuanchun Li, Hao Wen, Weijun Wang, Xiangyu Li, Yizhen Yuan, Guohong Liu, Jiacheng Liu, Wenxing Xu, Xiang Wang, Yi Sun, Rui Kong, Yile Wang, Hanfei Geng, Jian Luan, Xuefeng Jin, Zilong Ye, Guanjing Xiong, Fan Zhang, Xiang Li, Mengwei Xu, Zhijun Li, Peng Li, Yang Liu, Ya-Qin Zhang, Yunxin Liu
Next, we discuss several key challenges in achieving intelligent, efficient, and secure Personal LLM Agents, followed by a comprehensive survey of representative solutions to these challenges.
no code implementations • 11 Dec 2023 • Zehuan Huang, Hao Wen, Junting Dong, Yaohui Wang, Yangguang Li, Xinyuan Chen, Yan-Pei Cao, Ding Liang, Yu Qiao, Bo Dai, Lu Sheng
Generating multiview images from a single view facilitates the rapid generation of a 3D mesh conditioned on a single image.
no code implementations • 6 Dec 2023 • Hao Wen, Jakob Zeitler, Connor Rupnow
To minimize station downtime and maximize experimental throughput, it is practical to run experiments in asynchronous parallel, in which multiple experiments are being performed at once in different stages.
no code implementations • 29 Aug 2023 • Wenxing Xu, Yuanchun Li, Jiacheng Liu, Yi Sun, Zhengyang Cao, Yixuan Li, Hao Wen, Yunxin Liu
Unlike cloud-based deep learning models that are often large and uniform, edge-deployed models usually demand customization for domain-specific tasks and resource-limited environments.
no code implementations • 29 Aug 2023 • Hao Wen, Yuanchun Li, Guohong Liu, Shanhui Zhao, Tao Yu, Toby Jia-Jun Li, Shiqi Jiang, Yunhao Liu, Yaqin Zhang, Yunxin Liu
Mobile task automation is an attractive technique that aims to enable voice-based hands-free user interaction with smartphones.
1 code implementation • ICCV 2023 • Zhiqiang Shen, Xiaoxiao Sheng, Hehe Fan, Longguang Wang, Yulan Guo, Qiong Liu, Hao Wen, Xi Zhou
In this paper, we propose a Masked Spatio-Temporal Structure Prediction (MaST-Pre) method to capture the structure of point cloud videos without human annotations.
no code implementations • 14 Aug 2023 • Hao Wen, Jie Wang, Xiaodong Qiao
The recognition of abstracts is crucial for effectively locating the content and clarifying the article.
1 code implementation • 23 Jun 2023 • Xiaogang Peng, Xiao Zhou, Yikai Luo, Hao Wen, Yu Ding, Zizhao Wu
We believe that the proposed MI-Motion benchmark dataset and baseline will facilitate future research in this area, ultimately leading to better understanding and modeling of multi-person interactions.
1 code implementation • 30 May 2023 • Xiaogang Peng, Hao Wen, Yikai Luo, Xiao Zhou, Keyang Yu, Ping Yang, Zizhao Wu
To overcome this, we propose HyperVD, a novel framework that learns snippet embeddings in hyperbolic space to improve model discrimination.
no code implementations • 14 Apr 2023 • Hao Wen, Hongming Wang, Jiaxuan Liu, Yuanchun Li
Given a natural language description of a desired task, DroidBot-GPT can automatically generate and execute actions that navigate the app to complete the task.
no code implementations • 13 Mar 2023 • Hao Wen, Yuanchun Li, Zunshuai Zhang, Shiqi Jiang, Xiaozhou Ye, Ye Ouyang, Ya-Qin Zhang, Yunxin Liu
Model elastification generates a high-quality search space of model architectures with the guidance of a developer-specified oracle model.
1 code implementation • 6 Mar 2023 • Hao Wen, Jingsu Kang
Aim: The George B. Moody PhysioNet Challenge 2022 raised problems of heart murmur detection and related abnormal cardiac function identification from phonocardiograms (PCGs).
no code implementations • CVPR 2023 • Hao Wen, Jing Huang, Huili Cui, Haozhe Lin, Yukun Lai, Lu Fang, Kun Li
However, existing methods cannot deal with large scenes containing hundreds of people, which pose the challenges of a large number of people, large variations in human scale, and complex spatial distribution.
no code implementations • 9 Dec 2022 • Xinzhe Ni, Yong Liu, Hao Wen, Yatai Ji, Jing Xiao, Yujiu Yang
Then, in the visual flow, visual prototypes are computed by, for example, a Temporal-Relational CrossTransformer (TRX) module.
1 code implementation • 30 Jul 2022 • Hao Wen, Yunze Liu, Jingwei Huang, Bo Duan, Li Yi
This paper proposes a 4D backbone for long-term point cloud video understanding.
1 code implementation • 1 Jul 2021 • Xiongjie Chen, Hao Wen, Yunpeng Li
Differentiable particle filters provide a flexible mechanism to adaptively train dynamic and measurement models by learning from observed data.
1 code implementation • 11 Nov 2020 • Hao Wen, Xiongjie Chen, Georgios Papagiannis, Conghui Hu, Yunpeng Li
Recent advances in incorporating neural networks into particle filters provide the desired flexibility to apply particle filters in large-scale real-world applications.
no code implementations • Aerospace Science and Technology 2020 • Shidong Xu, Hao Wen, Zheng Huang, Dongping Jin
In the meantime, the asymmetric constraint on tether tension is handled in controller design and stability analysis such that tether tension can be kept within a prescribed positive range during the deployment.