1 code implementation • 9 Oct 2024 • Junyan Lin, Haoran Chen, Dawei Zhu, Xiaoyu Shen
However, there is still considerable debate on constructing MLLM architectures, particularly regarding the selection of appropriate connectors for perception tasks of varying granularities.
no code implementations • 8 Aug 2024 • Weilong Chen, Wenxuan Xu, Haoran Chen, Xinran Zhang, Zhijin Qin, Yanru Zhang, Zhu Han
Underwater communication is essential for environmental monitoring, marine biology research, and underwater exploration.
1 code implementation • 10 Jun 2024 • Chensen Huang, Guibo Zhu, Xuepeng Wang, Yifei Luo, Guojing Ge, Haoran Chen, Dong Yi, Jinqiao Wang
To extend the context length of Transformer-based large language models (LLMs) and improve comprehension capabilities, we often face limitations due to computational resources and bounded memory storage capacity.
no code implementations • 23 May 2024 • Haoran Chen, Micah Goldblum, Zuxuan Wu, Yu-Gang Jiang
A common problem in continual learning is the classification layer's bias towards the most recent task.
1 code implementation • 16 Oct 2023 • Zhen Xing, Qijun Feng, Haoran Chen, Qi Dai, Han Hu, Hang Xu, Zuxuan Wu, Yu-Gang Jiang
However, existing surveys mainly focus on diffusion models in the context of image generation, with few up-to-date reviews on their application in the video domain.
2 code implementations • 11 Sep 2023 • Haoran Chen, Kenneth Blomqvist, Francesco Milano, Roland Siegwart
In this paper, we propose what is, to the best of our knowledge, the first algorithm for open-vocabulary panoptic segmentation in 3D scenes.
6 code implementations • 23 Aug 2023 • Qing Xu, Wenwei Kuang, Zeyu Zhang, Xueyao Bao, Haoran Chen, Wenting Duan
Compared to the Segment Anything Model, SPPNet shows roughly 20 times faster inference, with 1/70 of the parameters and computational cost.
1 code implementation • 13 Mar 2023 • Haoran Chen, Zuxuan Wu, Xintong Han, Menglin Jia, Yu-Gang Jiang
Current research on continual learning mainly focuses on relieving catastrophic forgetting, and most of its success comes at the cost of limiting performance on newly incoming tasks.
1 code implementation • NeurIPS 2023 • Haoran Chen, Xintong Han, Zuxuan Wu, Yu-Gang Jiang
Most existing methods for unsupervised domain adaptation (UDA) rely on a shared network to extract domain-invariant features.
Multi-Source Unsupervised Domain Adaptation • Unsupervised Domain Adaptation
no code implementations • 7 Apr 2021 • Zhaoyi Wan, Haoran Chen, Jielei Zhang, Wentao Jiang, Cong Yao, Jiebo Luo
In this paper, we address the problem of makeup transfer, which aims at transplanting the makeup from the reference face to the source face while preserving the identity of the source.
1 code implementation • 12 Feb 2021 • Haoran Chen, Jianmin Li, Simone Frintrop, Xiaolin Hu
We cleaned the MSR-VTT annotations by removing these problems, then tested several typical video captioning models on the cleaned dataset.
1 code implementation • 16 Jan 2020 • Haoran Chen, Jianmin Li, Xiaolin Hu
Video captioning is an advanced multi-modal task which aims to describe a video clip using a natural language sentence.
no code implementations • 28 Dec 2019 • Zhaoyi Wan, Minghang He, Haoran Chen, Xiang Bai, Cong Yao
Driven by deep learning and the large volume of data, scene text recognition has evolved rapidly in recent years.
Ranked #19 on Scene Text Recognition on ICDAR2015
2 code implementations • 31 Aug 2019 • Haoran Chen, Ke Lin, Alexander Maye, Jianming Li, Xiaolin Hu
Given the features of a video, recurrent neural networks can be used to automatically generate a caption for the video.
no code implementations • 15 Aug 2019 • Yingzhong Shi, Zhaohong Deng, Haoran Chen, Kup-Sze Choi, Shitong Wang
Data stream classification methods demonstrate promising performance on a single data stream by exploring the cohesion in the data stream.
no code implementations • 12 Aug 2019 • Zhaohong Deng, Chen Cui, Peng Xu, Ling Liang, Haoran Chen, Te Zhang, Shitong Wang
How to effectively exploit the relationship between different views using the characteristics of multi-view data has become a crucial challenge.
no code implementations • 27 Apr 2017 • Boyue Wang, Yongli Hu, Junbin Gao, Yanfeng Sun, Haoran Chen, Bao-Cai Yin
Learning on Grassmann manifolds has become popular in many computer vision tasks, owing to its strong capability to extract discriminative information from image sets and videos.
no code implementations • 21 Sep 2016 • Haoran Chen, Yanfeng Sun, Junbin Gao, Yongli Hu, Bao-Cai Yin
Partial least squares regression (PLSR) has been a popular technique to explore the linear relationship between two datasets.
no code implementations • 7 Dec 2015 • Haoran Chen, Yanfeng Sun, Junbin Gao, Yongli Hu
The paper addresses the problem of optimizing a class of composite functions on Riemannian manifolds and proposes a new first-order optimization algorithm (FOA) with a fast convergence rate.
no code implementations • 8 Dec 2011 • Keqin Liu, Tianshuo Zheng, Haoran Chen
The multi-armed bandit (MAB) problem is a widely studied model in the field of operations research for sequential decision making and reinforcement learning.