Search Results for author: Yan Zhong

Found 19 papers, 6 papers with code

VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?

1 code implementation • 29 May 2025 • Yuanxin Liu, Kun Ouyang, HaoNing Wu, Yi Liu, Lin Sui, Xinhao Li, Yan Zhong, Y. Charles, Xinyu Zhou, Xu Sun

Recent studies have shown that long chain-of-thought (CoT) reasoning can significantly enhance the performance of large language models (LLMs) on complex tasks.

Video Understanding

Twin Co-Adaptive Dialogue for Progressive Image Generation

no code implementations • 21 Apr 2025 • Jianhui Wang, Yangfan He, Yan Zhong, Xinyuan Song, Jiayi Su, Yuheng Feng, Hongyang He, Wenyu Zhu, Xinhang Yuan, Kuan Lu, Menghao Huo, Miao Zhang, Keqin Li, Jiaqi Chen, Tianyu Shi, Xueqian Wang

Modern text-to-image generation systems have enabled the creation of remarkably realistic and high-quality visuals, yet they often falter when handling the inherent ambiguities in user prompts.

Text-to-Image Generation

Kimi-VL Technical Report

1 code implementation • 10 Apr 2025 • Kimi Team, Angang Du, Bohong Yin, Bowei Xing, Bowen Qu, Bowen Wang, Cheng Chen, Chenlin Zhang, Chenzhuang Du, Chu Wei, Congcong Wang, Dehao Zhang, Dikang Du, Dongliang Wang, Enming Yuan, Enzhe Lu, Fang Li, Flood Sung, Guangda Wei, Guokun Lai, Han Zhu, Hao Ding, Hao Hu, Hao Yang, Hao Zhang, HaoNing Wu, Haotian Yao, Haoyu Lu, Heng Wang, Hongcheng Gao, Huabin Zheng, Jiaming Li, Jianlin Su, Jianzhou Wang, Jiaqi Deng, Jiezhong Qiu, Jin Xie, Jinhong Wang, Jingyuan Liu, Junjie Yan, Kun Ouyang, Liang Chen, Lin Sui, Longhui Yu, Mengfan Dong, Mengnan Dong, Nuo Xu, Pengyu Cheng, Qizheng Gu, Runjie Zhou, Shaowei Liu, Sihan Cao, Tao Yu, Tianhui Song, Tongtong Bai, Wei Song, Weiran He, Weixiao Huang, Weixin Xu, Xiaokun Yuan, Xingcheng Yao, Xingzhe Wu, Xinxing Zu, Xinyu Zhou, Xinyuan Wang, Y. Charles, Yan Zhong, Yang Li, Yangyang Hu, Yanru Chen, Yejie Wang, Yibo Liu, Yibo Miao, Yidao Qin, Yimin Chen, Yiping Bao, Yiqin Wang, Yongsheng Kang, Yuanxin Liu, Yulun Du, Yuxin Wu, Yuzhi Wang, Yuzi Yan, Zaida Zhou, Zhaowei Li, Zhejun Jiang, Zheng Zhang, Zhilin Yang, Zhiqi Huang, Zihao Huang, Zijia Zhao, Ziwei Chen, Zongyu Lin

We present Kimi-VL, an efficient open-source Mixture-of-Experts (MoE) vision-language model (VLM) that offers advanced multimodal reasoning, long-context understanding, and strong agent capabilities - all while activating only 2.8B parameters in its language decoder (Kimi-VL-A3B).

Long-Context Understanding Mathematical Reasoning +4
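The 2.8B figure refers to activated, not total, parameters, which is the defining property of a Mixture-of-Experts decoder. The sketch below is a generic top-k MoE routing layer in PyTorch, not Kimi-VL's implementation; the expert count, hidden sizes, and router are illustrative assumptions. It shows why only the selected experts' weights participate in each token's forward pass.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Toy top-k Mixture-of-Experts feed-forward layer (illustrative only)."""

    def __init__(self, dim: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, num_experts)  # scores every expert for each token
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (num_tokens, dim)
        weights, idx = self.router(x).topk(self.top_k, dim=-1)  # top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e  # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

moe = TopKMoE(dim=64)
print(moe(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```

With 8 experts and top-2 routing, each token touches roughly a quarter of the feed-forward parameters, which is how a large total parameter count coexists with a small activated one.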

Enhancing Intent Understanding for Ambiguous prompt: A Human-Machine Co-Adaption Strategy

no code implementations • 25 Jan 2025 • Yangfan He, Jianhui Wang, Yijin Wang, Kun Li, Yan Zhong, Xinyuan Song, Li Sun, Jingyuan Lu, Miao Zhang, Tianyu Shi, Xinhang Yuan, Kuan Lu, Menghao Huo, Keqin Li, Jiaqi Chen

While some methods focus on enhancing prompts so that the generated images fit user needs, the model still struggles to understand users' real needs, especially for non-expert users.

Image Generation Language Modeling +2

RaSeRec: Retrieval-Augmented Sequential Recommendation

1 code implementation • 24 Dec 2024 • Xinping Zhao, Baotian Hu, Yan Zhong, Shouzheng Huang, Zihao Zheng, Meng Wang, Haofen Wang, Min Zhang

Although prevailing supervised and self-supervised learning (SSL)-augmented sequential recommendation (SeRec) models have achieved improved performance with powerful neural network architectures, we argue that they still suffer from two limitations: (1) Preference Drift, where models trained on past data can hardly accommodate evolving user preferences; and (2) Implicit Memory, where head patterns dominate parametric learning, making it harder to recall long tails.

Retrieval +2

SEER: Self-Aligned Evidence Extraction for Retrieval-Augmented Generation

no code implementations • 15 Oct 2024 • Xinping Zhao, Dongfang Li, Yan Zhong, Boren Hu, Yibin Chen, Baotian Hu, Min Zhang

Recent studies in Retrieval-Augmented Generation (RAG) have investigated extracting evidence from retrieved passages to reduce computational costs and enhance the final RAG performance, yet it remains challenging.

Chunking RAG +3

FunnelRAG: A Coarse-to-Fine Progressive Retrieval Paradigm for RAG

no code implementations • 14 Oct 2024 • Xinping Zhao, Yan Zhong, Zetian Sun, Xinshuo Hu, Zhenyu Liu, Dongfang Li, Baotian Hu, Min Zhang

In this work, we propose a progressive retrieval paradigm with coarse-to-fine granularity for RAG, termed FunnelRAG, so as to balance effectiveness and efficiency.

RAG Retrieval +1
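For intuition, a coarse-to-fine retrieval funnel can be sketched as a cheap first-stage filter followed by an expensive re-ranker over the survivors. This is a generic illustration under assumed scoring functions, not the FunnelRAG implementation; `cheap_score` and `expensive_score` are hypothetical stand-ins for, say, lexical recall and cross-encoder re-ranking.

```python
from typing import Callable, List

def funnel_retrieve(query: str,
                    corpus: List[str],
                    cheap_score: Callable[[str, str], float],
                    expensive_score: Callable[[str, str], float],
                    coarse_k: int = 100,
                    fine_k: int = 5) -> List[str]:
    # Stage 1: coarse, low-cost filtering over the whole corpus.
    coarse = sorted(corpus, key=lambda d: cheap_score(query, d), reverse=True)[:coarse_k]
    # Stage 2: fine, high-cost re-ranking over the small surviving set.
    return sorted(coarse, key=lambda d: expensive_score(query, d), reverse=True)[:fine_k]

# Toy usage with a trivially simple word-overlap scorer for both stages.
docs = ["retrieval augmented generation", "state space models", "spiking neural networks"]
overlap = lambda q, d: len(set(q.split()) & set(d.split()))
print(funnel_retrieve("progressive retrieval for generation", docs, overlap, overlap,
                      coarse_k=2, fine_k=1))
```

The efficiency/effectiveness trade-off comes from running the expensive scorer on only `coarse_k` candidates instead of the full corpus.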

SPikE-SSM: A Sparse, Precise, and Efficient Spiking State Space Model for Long Sequences Learning

no code implementations • 7 Oct 2024 • Yan Zhong, Ruoyu Zhao, Chao Wang, Qinghai Guo, JianGuo Zhang, Zhichao Lu, Luziwei Leng

However, applying highly capable SSMs to SNNs for long-sequence learning poses three major challenges: (1) the membrane potential is determined by the neuron's past spiking history, which reduces efficiency for sequence modeling in parallel computing scenarios.

Computational Efficiency State Space Models
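The parallelism issue in challenge (1) comes from stateful neuron dynamics. A minimal leaky integrate-and-fire (LIF) recurrence, sketched below with assumed decay and threshold values (not the SPikE-SSM formulation), makes the sequential dependence explicit: each step needs the previous membrane potential and spike before it can proceed, unlike a linear state-space scan.

```python
import numpy as np

def lif_forward(inputs, beta=0.9, threshold=1.0):
    """inputs: (T,) input current; returns the spike train of the same length."""
    v, spikes = 0.0, []
    for x_t in inputs:
        v = beta * v + x_t           # leaky integration of the input current
        s_t = float(v >= threshold)  # fire if the potential crosses the threshold
        v = v - s_t * threshold      # soft reset after a spike
        spikes.append(s_t)
    return np.array(spikes)

print(lif_forward(np.random.rand(20)))
```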

MSSDA: Multi-Sub-Source Adaptation for Diabetic Foot Neuropathy Recognition

no code implementations • 21 Sep 2024 • Yan Zhong, Zhixin Yan, Yi Xie, Shibin Wu, Huaidong Zhang, Lin Shu, Peiru Zhou

To advance DFN research, we have collected a novel dataset comprising continuous plantar pressure data to recognize diabetic foot neuropathy.

Domain Adaptation

PhysMamba: State Space Duality Model for Remote Physiological Measurement

no code implementations • 2 Aug 2024 • Zhixin Yan, Yan Zhong, Hongbin Xu, Wenjun Zhang, Shangru Yi, Lin Shu, Wenxiong Kang

Remote Photoplethysmography (rPPG) enables non-contact physiological signal extraction from facial videos, offering applications in psychological state analysis, medical assistance, and face anti-spoofing.

State Space Models

Unlock the Power of Algorithm Features: A Generalization Analysis for Algorithm Selection

no code implementations • 18 May 2024 • Xingyu Wu, Yan Zhong, Jibin Wu, Yuxiao Huang, Sheng-hao Wu, Kay Chen Tan

In the algorithm selection research, the discussion surrounding algorithm features has been significantly overshadowed by the emphasis on problem features.

Inductive Learning

Beyond Score Changes: Adversarial Attack on No-Reference Image Quality Assessment from Two Perspectives

no code implementations • 20 Apr 2024 • Chenxi Yang, Yujia Liu, Dingquan Li, Yan Zhong, Tingting Jiang

Meanwhile, it is important to note that correlation measures, such as ranking correlation, play a significant role in NR-IQA tasks.

Adversarial Attack NR-IQA
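For concreteness, ranking correlation in NR-IQA evaluation is usually reported as Spearman's rank-order correlation (SROCC) between predicted quality scores and subjective mean opinion scores; the numbers below are made up for illustration.

```python
from scipy.stats import spearmanr

predicted = [0.71, 0.42, 0.88, 0.35, 0.60]   # model-predicted quality scores
subjective = [3.9, 2.8, 4.6, 2.1, 3.5]       # subjective mean opinion scores (MOS)

srocc, p_value = spearmanr(predicted, subjective)
print(f"SROCC = {srocc:.3f}")  # 1.0 here: the predicted ranking matches the MOS ranking
```

An attack that leaves absolute scores close but reshuffles their ranking can therefore still degrade the metric that NR-IQA models are judged on.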

Large Language Model-Enhanced Algorithm Selection: Towards Comprehensive Algorithm Representation

1 code implementation • 22 Nov 2023 • Xingyu Wu, Yan Zhong, Jibin Wu, Bingbing Jiang, Kay Chen Tan

The high-dimensional algorithm representation extracted by the LLM, after passing through a feature selection module, is combined with the problem representation and fed to the similarity calculation module.

AutoML feature selection +3
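A minimal sketch of that pipeline, under illustrative assumptions (the embeddings, the feature-selection indices, and cosine similarity below are hypothetical stand-ins, not the paper's modules):

```python
import numpy as np

def select_features(x: np.ndarray, keep_idx: np.ndarray) -> np.ndarray:
    """Keep only the dimensions chosen by a (hypothetical) feature selection module."""
    return x[keep_idx]

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

rng = np.random.default_rng(0)
algo_embedding = rng.normal(size=4096)      # high-dimensional LLM representation of an algorithm
problem_embedding = rng.normal(size=256)    # representation of the problem instance
keep_idx = rng.choice(4096, size=256, replace=False)  # stand-in for learned feature selection

score = cosine_similarity(select_features(algo_embedding, keep_idx), problem_embedding)
print(f"algorithm-problem match score: {score:.3f}")
```

In practice the highest-scoring algorithm across a candidate portfolio would be selected for the given problem instance.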

RCP-RF: A Comprehensive Road-car-pedestrian Risk Management Framework based on Driving Risk Potential Field

1 code implementation • 4 May 2023 • Shuhang Tan, Zhiling Wang, Yan Zhong

Recent years have witnessed a proliferation of traffic accidents, which has prompted extensive research on Automated Vehicle (AV) technologies to reduce accidents, especially on risk assessment frameworks for AVs.

Management

TC-SKNet with GridMask for Low-complexity Classification of Acoustic Scene

no code implementations • 5 Oct 2022 • Luyuan Xie, Yan Zhong, Lin Yang, Zhaoyu Yan, Zhonghai Wu, Junjie Wang

In our experiments, the performance gain brought by GridMask is larger than that of spectrum augmentation in acoustic scene classification (ASC).

AutoML Data Augmentation

Multi-label Causal Variable Discovery: Learning Common Causal Variables and Label-specific Causal Variables

no code implementations • 9 Nov 2020 • Xingyu Wu, Bingbing Jiang, Yan Zhong, Huanhuan Chen

Analyzing these mechanisms, we establish theoretical properties of common causal variables, based on which a discovery and distinguishing algorithm is designed to identify these two types of variables.

feature selection
