Search Results for author: Haoxin Zhang

Found 4 papers, 0 papers with code

From Image to Video, what do we need in multimodal LLMs?

no code implementations • 18 Apr 2024 • Suyuan Huang, Haoxin Zhang, Yan Gao, Yao Hu, Zengchang Qin

Multimodal Large Language Models (MLLMs) have demonstrated profound capabilities in understanding multimodal information, covering from Image LLMs to the more complex Video LLMs.

Video Understanding

NoteLLM: A Retrievable Large Language Model for Note Recommendation

no code implementations • 4 Mar 2024 • Chao Zhang, Shiwei Wu, Haoxin Zhang, Tong Xu, Yan Gao, Yao Hu, Di wu, Enhong Chen

Indeed, learning to generate hashtags/categories can potentially enhance note embeddings, both of which compress key note information into limited content.

Contrastive Learning, Language Modelling, +1

Single-Channel EEG Based Arousal Level Estimation Using Multitaper Spectrum Estimation at Low-Power Wearable Devices

no code implementations • 31 Jul 2021 • Berken Utku Demirel, Ivan Skelin, Haoxin Zhang, Jack J. Lin, Mohammad Abdullah Al Faruque

This paper proposes a novel lightweight method using the multitaper power spectrum to estimate arousal levels at wearable devices.

EEG

A multi-branch convolutional neural network for detecting double JPEG compression

no code implementations • 16 Oct 2017 • Bin Li, Hu Luo, Haoxin Zhang, Shunquan Tan, Zhongzhou Ji

In this paper, we present a CNN solution by using raw DCT (discrete cosine transformation) coefficients from JPEG images as input.
