Search Results for author: BoYu Chen

Found 25 papers, 12 papers with code

Super Encoding Network: Recursive Association of Multi-Modal Encoders for Video Understanding

no code implementations • 9 Jun 2025 • BoYu Chen, Siran Chen, Kunchang Li, Qinglin Xu, Yu Qiao, Yali Wang

To fill this gap, we propose a unified Super Encoding Network (SEN) for video understanding, which builds up such distinct interactions through recursive association of multi-modal encoders in the foundation models.

Contrastive Learning, Video Editing, +1

Blend the Separated: Mixture of Synergistic Experts for Data-Scarcity Drug-Target Interaction Prediction

1 code implementation • 20 Mar 2025 • Xinlong Zhai, Chunchen Wang, Ruijia Wang, Jiazheng Kang, Shujie Li, BoYu Chen, Tengfei Ma, Zikai Zhou, Cheng Yang, Chuan Shi

Extensive experiments on 3 real-world datasets under different extents of input data scarcity and/or label scarcity demonstrate that our model outperforms the state of the art significantly and steadily, with a maximum improvement of 53.53%.

Drug Discovery

LVAgent: Long Video Understanding by Multi-Round Dynamical Collaboration of MLLM Agents

no code implementations • 13 Mar 2025 • BoYu Chen, Zhengrong Yue, Siran Chen, Zikang Wang, Yang Liu, Peng Li, Yali Wang

In order to better address long video tasks, we introduce LVAgent, the first framework enabling multi-round dynamic collaboration of MLLM agents in long video understanding.

Computational Efficiency, Optical Character Recognition (OCR), +2

PathRAG: Pruning Graph-based Retrieval Augmented Generation with Relational Paths

1 code implementation • 18 Feb 2025 • BoYu Chen, Zirui Guo, Zidan Yang, Yuluo Chen, Junze Chen, Zhenghao Liu, Chuan Shi, Cheng Yang

Typical RAG approaches split the text database into chunks, organizing them in a flat structure for efficient searches.

RAG, Retrieval, +1

Medical Image Quality Assessment based on Probability of Necessity and Sufficiency

no code implementations • 10 Oct 2024 • BoYu Chen, Ameenat L. Solebo, Weiye Bao, Paul Taylor

To that end, we propose an MIQA framework based on a concept from causal inference: Probability of Necessity and Sufficiency (PNS).

Causal Inference, Image Quality Assessment, +1

Seeking the Sufficiency and Necessity Causal Features in Multimodal Representation Learning

no code implementations • 29 Aug 2024 • BoYu Chen, Junjie Liu, Zhu Li, Mengyue Yang

We address these challenges by first conceptualizing multimodal representations as comprising modality-invariant and modality-specific components.

Representation Learning

Advancing Cell Detection in Anterior Segment Optical Coherence Tomography Images

1 code implementation • 25 Jun 2024 • BoYu Chen, Ameenat L. Solebo, Paul Taylor

This framework consists of a zero-shot chamber segmentation module and a cell detection module.

Cell Detection

JEN-1 DreamStyler: Customized Musical Concept Learning via Pivotal Parameters Tuning

no code implementations • 18 Jun 2024 • BoYu Chen, Peike Li, Yao Yao, Alex Wang

In this paper, we propose a novel method for customized text-to-music generation, which can capture a concept from a two-minute piece of reference music and generate a new piece of music conforming to that concept.

Music Generation, Text-to-Music Generation

JEN-1 Composer: A Unified Framework for High-Fidelity Multi-Track Music Generation

1 code implementation • 29 Oct 2023 • Yao Yao, Peike Li, BoYu Chen, Alex Wang

With rapid advances in generative artificial intelligence, the text-to-music synthesis task has emerged as a promising direction for music generation.

Music Generation

Ternary Singular Value Decomposition as a Better Parameterized Form in Linear Mapping

1 code implementation • 15 Aug 2023 • BoYu Chen, Hanxuan Chen, Jiao He, Fengyu Sun, Shangling Jui

We present a simple yet novel parameterized form of linear mapping that achieves remarkable network compression performance: a pseudo SVD called Ternary SVD (TSVD).

Form, Language Modeling, +3

JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models

2 code implementations • 9 Aug 2023 • Peike Li, BoYu Chen, Yao Yao, Yikai Wang, Allen Wang, Alex Wang

Despite the task's significance, prevailing generative models exhibit limitations in music quality, computational efficiency, and generalization.

Computational Efficiency, In-Context Learning, +2

Modify Training Directions in Function Space to Reduce Generalization Error

no code implementations • 25 Jul 2023 • Yi Yu, Wenlian Lu, BoYu Chen

We propose theoretical analyses of a modified natural gradient descent method in the neural network function space based on the eigendecompositions of neural tangent kernel and Fisher information matrix.

FedDKD: Federated Learning with Decentralized Knowledge Distillation

no code implementations • 2 May 2022 • Xinjia Li, BoYu Chen, Wenlian Lu

FedDKD introduces a decentralized knowledge distillation (DKD) module that distills the knowledge of the local models to train the global model by approaching the neural network map average, based on the divergence metric defined in the loss function, rather than only averaging parameters as done in the literature.

Federated Learning, Knowledge Distillation

Backbone is All Your Need: A Simplified Architecture for Visual Object Tracking

1 code implementation • 10 Mar 2022 • BoYu Chen, Peixia Li, Lei Bai, Lei Qiao, Qiuhong Shen, Bo Li, Weihao Gan, Wei Wu, Wanli Ouyang

Exploiting a general-purpose neural architecture to replace hand-wired designs or inductive biases has recently drawn extensive interest.

All, Visual Object Tracking

Deps-SAN: Neural Machine Translation with Dependency-Scaled Self-Attention Network

no code implementations • 23 Nov 2021 • Ru Peng, Nankai Lin, Yi Fang, Shengyi Jiang, Tianyong Hao, BoYu Chen, Junbo Zhao

However, subsequent research pointed out that, limited by the uncontrolled nature of attention computation, the NMT model requires external syntax to capture deep syntactic awareness.

Machine Translation, NMT, +1

BN-NAS: Neural Architecture Search with Batch Normalization

1 code implementation • ICCV 2021 • BoYu Chen, Peixia Li, Baopu Li, Chen Lin, Chuming Li, Ming Sun, Junjie Yan, Wanli Ouyang

We present neural architecture search with Batch Normalization (BN-NAS) to accelerate neural architecture search (NAS).

Neural Architecture Search

PSViT: Better Vision Transformer via Token Pooling and Attention Sharing

no code implementations • 7 Aug 2021 • BoYu Chen, Peixia Li, Baopu Li, Chuming Li, Lei Bai, Chen Lin, Ming Sun, Junjie Yan, Wanli Ouyang

Then, a compact set of the possible combinations of different token pooling and attention sharing mechanisms is constructed.

Real-time 'Actor-Critic' Tracking

no code implementations • ECCV 2018 • Boyu Chen, Dong Wang, Peixia Li, Shuang Wang, Huchuan Lu

In this work, we propose a novel tracking algorithm with real-time performance based on the ‘Actor-Critic’ framework.

Reinforcement Learning, Visual Tracking

Meta-Learning with Hessian-Free Approach in Deep Neural Nets Training

1 code implementation • 22 May 2018 • Boyu Chen, Wenlian Lu, Ernest Fokoue

Meta-learning is a promising approach to efficient training of deep neural nets and has attracted increasing interest in recent years.

Meta-Learning
