Search Results for author: Hao Wen

Found 19 papers, 8 papers with code

Understanding Multimodal Deep Neural Networks: A Concept Selection View

no code implementations • 13 Apr 2024 • Chenming Shang, Hengyuan Zhang, Hao Wen, Yujiu Yang

The multimodal deep neural networks, represented by CLIP, have generated rich downstream applications owing to their excellent performance, thus making understanding the decision-making process of CLIP an essential research topic.

Decision Making

Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security

2 code implementations • 10 Jan 2024 • Yuanchun Li, Hao Wen, Weijun Wang, Xiangyu Li, Yizhen Yuan, Guohong Liu, Jiacheng Liu, Wenxing Xu, Xiang Wang, Yi Sun, Rui Kong, Yile Wang, Hanfei Geng, Jian Luan, Xuefeng Jin, Zilong Ye, Guanjing Xiong, Fan Zhang, Xiang Li, Mengwei Xu, Zhijun Li, Peng Li, Yang Liu, Ya-Qin Zhang, Yunxin Liu

Next, we discuss several key challenges to achieve intelligent, efficient and secure Personal LLM Agents, followed by a comprehensive survey of representative solutions to address these challenges.

EpiDiff: Enhancing Multi-View Synthesis via Localized Epipolar-Constrained Diffusion

no code implementations • 11 Dec 2023 • Zehuan Huang, Hao Wen, Junting Dong, Yaohui Wang, Yangguang Li, Xinyuan Chen, Yan-Pei Cao, Ding Liang, Yu Qiao, Bo Dai, Lu Sheng

Generating multiview images from a single view facilitates the rapid generation of a 3D mesh conditioned on a single image.

SSIM

Search Strategies for Self-driving Laboratories with Pending Experiments

no code implementations • 6 Dec 2023 • Hao Wen, Jakob Zeitler, Connor Rupnow

To minimize station downtime and maximize experimental throughput, it is practical to run experiments in asynchronous parallel, in which multiple experiments are being performed at once in different stages.

Bayesian Optimisation

Generative Model for Models: Rapid DNN Customization for Diverse Tasks and Resource Constraints

no code implementations • 29 Aug 2023 • Wenxing Xu, Yuanchun Li, Jiacheng Liu, Yi Sun, Zhengyang Cao, Yixuan Li, Hao Wen, Yunxin Liu

Unlike cloud-based deep learning models that are often large and uniform, edge-deployed models usually demand customization for domain-specific tasks and resource-limited environments.

Image Classification, object-detection, +1

AutoDroid: LLM-powered Task Automation in Android

no code implementations • 29 Aug 2023 • Hao Wen, Yuanchun Li, Guohong Liu, Shanhui Zhao, Tao Yu, Toby Jia-Jun Li, Shiqi Jiang, Yunhao Liu, Yaqin Zhang, Yunxin Liu

Mobile task automation is an attractive technique that aims to enable voice-based hands-free user interaction with smartphones.

Language Modelling

Masked Spatio-Temporal Structure Prediction for Self-supervised Learning on Point Cloud Videos

1 code implementation • ICCV 2023 • Zhiqiang Shen, Xiaoxiao Sheng, Hehe Fan, Longguang Wang, Yulan Guo, Qiong Liu, Hao Wen, Xi Zhou

In this paper, we propose a Masked Spatio-Temporal Structure Prediction (MaST-Pre) method to capture the structure of point cloud videos without human annotations.

point cloud video understanding, Self-Supervised Learning, +1

A Novel Ehanced Move Recognition Algorithm Based on Pre-trained Models with Positional Embeddings

no code implementations • 14 Aug 2023 • Hao Wen, Jie Wang, Xiaodong Qiao

The recognition of abstracts is crucial for effectively locating the content and clarifying the article.

Position

The MI-Motion Dataset and Benchmark for 3D Multi-Person Motion Prediction

1 code implementation • 23 Jun 2023 • Xiaogang Peng, Xiao Zhou, Yikai Luo, Hao Wen, Yu Ding, Zizhao Wu

We believe that the proposed MI-Motion benchmark dataset and baseline will facilitate future research in this area, ultimately leading to better understanding and modeling of multi-person interactions.

motion prediction

Learning Weakly Supervised Audio-Visual Violence Detection in Hyperbolic Space

1 code implementation • 30 May 2023 • Xiaogang Peng, Hao Wen, Yikai Luo, Xiao Zhou, Keyang Yu, Ping Yang, Zizhao Wu

To overcome this, we propose HyperVD, a novel framework that learns snippet embeddings in hyperbolic space to improve model discrimination.

Ranked #1 on Anomaly Detection In Surveillance Videos on XD-Violence

Anomaly Detection In Surveillance Videos

DroidBot-GPT: GPT-powered UI Automation for Android

no code implementations • 14 Apr 2023 • Hao Wen, Hongming Wang, Jiaxuan Liu, Yuanchun Li

Given a natural language description of a desired task, DroidBot-GPT can automatically generate and execute actions that navigate the app to complete the task.

Navigate

AdaptiveNet: Post-deployment Neural Architecture Adaptation for Diverse Edge Environments

no code implementations • 13 Mar 2023 • Hao Wen, Yuanchun Li, Zunshuai Zhang, Shiqi Jiang, Xiaozhou Ye, Ye Ouyang, Ya-Qin Zhang, Yunxin Liu

Model elastification generates a high-quality search space of model architectures with the guidance of a developer-specified oracle model.

valid

Searching for Effective Neural Network Architectures for Heart Murmur Detection from Phonocardiogram

1 code implementation • 6 Mar 2023 • Hao Wen, Jingsu Kang

Aim: The George B. Moody PhysioNet Challenge 2022 raised problems of heart murmur detection and related abnormal cardiac function identification from phonocardiograms (PCGs).

Model Selection, Multi-Task Learning

Crowd3D: Towards Hundreds of People Reconstruction from a Single Image

no code implementations • CVPR 2023 • Hao Wen, Jing Huang, Huili Cui, Haozhe Lin, Yukun Lai, Lu Fang, Kun Li

However, existing methods cannot deal with large scenes containing hundreds of people, which encounter the challenges of large number of people, large variations in human scale, and complex spatial distribution.

Multimodal Prototype-Enhanced Network for Few-Shot Action Recognition

no code implementations • 9 Dec 2022 • Xinzhe Ni, Yong Liu, Hao Wen, Yatai Ji, Jing Xiao, Yujiu Yang

Then in the visual flow, visual prototypes are computed by a Temporal-Relational CrossTransformer (TRX) module for example.

Few-Shot action recognition, Few Shot Action Recognition, +1

Point Primitive Transformer for Long-Term 4D Point Cloud Video Understanding

1 code implementation • 30 Jul 2022 • Hao Wen, Yunze Liu, Jingwei Huang, Bo Duan, Li Yi

This paper proposes a 4D backbone for long-term point cloud video understanding.

point cloud video understanding, Video Understanding

Differentiable Particle Filters through Conditional Normalizing Flow

1 code implementation • 1 Jul 2021 • Xiongjie Chen, Hao Wen, Yunpeng Li

Differentiable particle filters provide a flexible mechanism to adaptively train dynamic and measurement models by learning from observed data.

Visual Tracking

End-To-End Semi-supervised Learning for Differentiable Particle Filters

1 code implementation • 11 Nov 2020 • Hao Wen, Xiongjie Chen, Georgios Papagiannis, Conghui Hu, Yunpeng Li

Recent advances in incorporating neural networks into particle filters provide the desired flexibility to apply particle filters in large-scale real-world applications.

A fuzzy control scheme for deployment of space tethered system with tension constraint

no code implementations • Aerospace Science and Technology 2020 • Shidong Xu, Hao Wen, Zheng Huang, Dongping Jin

In the meantime, the asymmetric constraint on tether tension is handled in controller design and stability analysis such that tether tension can be kept within a prescribed positive range during the deployment.
