Search Results for author: HongYu Zhou

Found 21 papers, 7 papers with code

Safe Non-Stochastic Control of Control-Affine Systems: An Online Convex Optimization Approach

no code implementations • 28 Sep 2023 • HongYu Zhou, Yichen Song, Vasileios Tzoumas

We study how to safely control nonlinear control-affine systems that are corrupted with bounded non-stochastic noise, i.e., noise that is unknown a priori and that is not necessarily governed by a stochastic model.

Collision Avoidance

DreamLLM: Synergistic Multimodal Comprehension and Creation

1 code implementation • 20 Sep 2023 • Runpei Dong, Chunrui Han, Yuang Peng, Zekun Qi, Zheng Ge, Jinrong Yang, Liang Zhao, Jianjian Sun, HongYu Zhou, Haoran Wei, Xiangwen Kong, Xiangyu Zhang, Kaisheng Ma, Li Yi

This paper presents DreamLLM, a learning framework that first achieves versatile Multimodal Large Language Models (MLLMs) empowered with frequently overlooked synergy between multimodal comprehension and creation.

Ranked #1 on Visual Question Answering on MMBench (GPT-3.5 score metric)

multimodal generation, Visual Question Answering, +2

Safe Non-Stochastic Control of Linear Dynamical Systems

1 code implementation • 23 Aug 2023 • HongYu Zhou, Vasileios Tzoumas

We study the problem of safe control of linear dynamical systems corrupted with non-stochastic noise, and provide an algorithm that guarantees (i) zero constraint violation of convex time-varying constraints, and (ii) bounded dynamic regret, i.e., bounded suboptimality against an optimal clairvoyant controller that knows the future noise a priori.

Collision Avoidance

ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning

no code implementations • 18 Jul 2023 • Liang Zhao, En Yu, Zheng Ge, Jinrong Yang, Haoran Wei, HongYu Zhou, Jianjian Sun, Yuang Peng, Runpei Dong, Chunrui Han, Xiangyu Zhang

Based on precise referring instruction, we propose ChatSpot, a unified end-to-end multimodal large language model that supports diverse forms of interactivity including mouse clicks, drag-and-drop, and drawing boxes, which provides a more flexible and seamless interactive experience.

Instruction Following, Language Modelling, +1

GMM: Delving into Gradient Aware and Model Perceive Depth Mining for Monocular 3D Detection

no code implementations • 30 Jun 2023 • Weixin Mao, Jinrong Yang, Zheng Ge, Lin Song, HongYu Zhou, Tiezheng Mao, Zeming Li, Osamu Yoshie

In light of the success of sample mining techniques in 2D object detection, we propose a simple yet effective mining strategy for improving depth perception in 3D object detection.

3D Object Detection, Depth Estimation, +3

Exploring Recurrent Long-term Temporal Fusion for Multi-view 3D Perception

no code implementations • 10 Mar 2023 • Chunrui Han, Jinrong Yang, Jianjian Sun, Zheng Ge, Runpei Dong, HongYu Zhou, Weixin Mao, Yuang Peng, Xiangyu Zhang

In this paper, we explore an embarrassingly simple long-term recurrent fusion strategy built upon the LSS-based methods and find it already able to enjoy the merits from both sides, i.e., rich long-term information and efficient fusion pipeline.

motion prediction, object-detection, +1

A Comprehensive Survey on Multimodal Recommender Systems: Taxonomy, Evaluation, and Future Directions

2 code implementations • 9 Feb 2023 • HongYu Zhou, Xin Zhou, Zhiwei Zeng, Lingzi Zhang, Zhiqi Shen

Recommendation systems have become popular and effective tools to help users discover their interesting items by modeling the user preference and item property based on implicit interactions (e.g., purchasing and clicking).

Multimodal Recommendation

Enhancing Dyadic Relations with Homogeneous Graphs for Multimodal Recommendation

1 code implementation • 28 Jan 2023 • HongYu Zhou, Xin Zhou, Lingzi Zhang, Zhiqi Shen

On top of the finding, we propose a model that enhances the dyadic relations by learning Dual RepresentAtions of both users and items via constructing homogeneous Graphs for multimOdal recommeNdation.

Graph Learning, Multimodal Recommendation

Efficient Online Learning with Memory via Frank-Wolfe Optimization: Algorithms with Bounded Dynamic Regret and Applications to Control

no code implementations • 2 Jan 2023 • HongYu Zhou, Zirui Xu, Vasileios Tzoumas

In this paper, we enable projection-free online learning within the framework of Online Convex Optimization with Memory (OCO-M) -- OCO-M captures how the history of decisions affects the current outcome by allowing the online learning loss functions to depend on both current and past decisions.

Time Series, Time Series Prediction

MatrixVT: Efficient Multi-Camera to BEV Transformation for 3D Perception

2 code implementations • ICCV 2023 • HongYu Zhou, Zheng Ge, Zeming Li, Xiangyu Zhang

This paper proposes an efficient multi-camera to Bird's-Eye-View (BEV) view transformation method for 3D perception, dubbed MatrixVT.

Ranked #2 on Bird's-Eye View Semantic Segmentation on nuScenes (IoU lane - 224x480 - 100x100 at 0.5 metric)

Autonomous Driving, Bird's-Eye View Semantic Segmentation, +2

Online Submodular Coordination with Bounded Tracking Regret: Theory, Algorithm, and Applications to Multi-Robot Coordination

no code implementations • 26 Sep 2022 • Zirui Xu, HongYu Zhou, Vasileios Tzoumas

We are motivated by the future of autonomy that involves multiple robots coordinating in dynamic, unstructured, and adversarial environments to complete complex tasks such as target tracking, environmental mapping, and area monitoring.

PersDet: Monocular 3D Detection in Perspective Bird's-Eye-View

no code implementations • 19 Aug 2022 • HongYu Zhou, Zheng Ge, Weixin Mao, Zeming Li

To address this problem, we revisit the generation of BEV representation and propose detecting objects in perspective BEV -- a new BEV representation that does not require feature sampling.

Autonomous Driving, object-detection, +1

Safe Control of Partially-Observed Linear Time-Varying Systems with Minimal Worst-Case Dynamic Regret

no code implementations • 18 Aug 2022 • HongYu Zhou, Vasileios Tzoumas

We present safe control of partially-observed linear time-varying systems in the presence of unknown and unpredictable process and measurement noise.

Bootstrap Latent Representations for Multi-modal Recommendation

2 code implementations • 13 Jul 2022 • Xin Zhou, HongYu Zhou, Yong Liu, Zhiwei Zeng, Chunyan Miao, Pengwei Wang, Yuan You, Feijun Jiang

Besides the user-item interaction graph, existing state-of-the-art methods usually use auxiliary graphs (e.g., user-user or item-item relation graph) to augment the learned representations of users and/or items.

Dense Teacher: Dense Pseudo-Labels for Semi-supervised Object Detection

2 code implementations • 6 Jul 2022 • HongYu Zhou, Zheng Ge, Songtao Liu, Weixin Mao, Zeming Li, Haiyan Yu, Jian Sun

To date, the most powerful semi-supervised object detectors (SS-OD) are based on pseudo-boxes, which need a sequence of post-processing with fine-tuned hyper-parameters.

object-detection, Object Detection, +2

Transformer for Polyp Detection

no code implementations • 14 Oct 2021 • Shijie Liu, HongYu Zhou, Xiaozhou Shi, Junwen Pan

In recent years, as the Transformer has performed increasingly well on NLP tasks, many researchers have ported the Transformer structure to vision tasks, bridging the gap between NLP and CV tasks.

Grouptron: Dynamic Multi-Scale Graph Convolutional Networks for Group-Aware Dense Crowd Trajectory Forecasting

no code implementations • 29 Sep 2021 • Rui Zhou, HongYu Zhou, Huidong Gao, Masayoshi Tomizuka, Jiachen Li, Zhuo Xu

Accurate, long-term forecasting of pedestrian trajectories in highly dynamic and interactive scenes is a long-standing challenge.

Trajectory Forecasting

RECIST-Net: Lesion detection via grouping keypoints on RECIST-based annotation

no code implementations • 19 Jul 2021 • Cong Xie, Shilei Cao, Dong Wei, HongYu Zhou, Kai Ma, Xianli Zhang, Buyue Qian, Liansheng Wang, Yefeng Zheng

Universal lesion detection in computed tomography (CT) images is an important yet challenging task due to the large variations in lesion type, size, shape, and appearance.

Computed Tomography (CT), Lesion Detection, +1

High-Performance FPGA-based Accelerator for Bayesian Neural Networks

no code implementations • 12 May 2021 • Hongxiang Fan, Martin Ferianc, Miguel Rodrigues, HongYu Zhou, Xinyu Niu, Wayne Luk

Neural networks (NNs) have demonstrated their potential in a wide range of applications such as image recognition, decision making or recommendation systems.

Autonomous Vehicles, Bayesian Inference, +3
