1 code implementation • 10 Mar 2025 • Zhangquan Chen, Xufang Luo, Dongsheng Li
Visual understanding is inherently intention-driven: humans selectively focus on different regions of a scene based on their goals.
no code implementations • 8 Feb 2025 • Zhuoshi Pan, Qianhui Wu, Huiqiang Jiang, Xufang Luo, Hao Cheng, Dongsheng Li, Yuqing Yang, Chin-Yew Lin, H. Vicky Zhao, Lili Qiu, Jianfeng Gao
To deliver coherent and personalized experiences in long-term conversations, existing approaches typically perform retrieval augmented response generation by constructing memory banks from conversation history at either the turn-level, session-level, or through summarization techniques. In this paper, we present two key findings: (1) The granularity of the memory unit matters: turn-level, session-level, and summarization-based methods each exhibit limitations in both memory retrieval accuracy and the semantic quality of the retrieved content.
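As a minimal sketch of the granularity choices being compared (the helper functions and example data below are illustrative, not from the paper), the same conversation yields very different memory units at the turn level versus the session level:

```python
# Hypothetical sketch: building memory units from a conversation at
# different granularities. Retrieval quality depends on this choice.

def turn_level_units(conversation):
    # One memory unit per (user, assistant) turn: precise but fragmented.
    return [f"user: {u}\nassistant: {a}" for u, a in conversation]

def session_level_units(sessions):
    # One memory unit per whole session: coherent but coarse, so
    # retrieval may pull in much irrelevant text alongside the answer.
    return ["\n".join(f"user: {u}\nassistant: {a}" for u, a in s)
            for s in sessions]

conversation = [("Any trip plans?", "I'm visiting Kyoto in May."),
                ("What do you do?", "I'm a nurse at a children's hospital.")]
print(turn_level_units(conversation))
print(session_level_units([conversation]))
```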
1 code implementation • 16 Jan 2025 • Zhihe Yang, Xufang Luo, Dongqi Han, Yunjian Xu, Dongsheng Li
Hallucination remains a major challenge for Large Vision-Language Models (LVLMs).
1 code implementation • 13 Dec 2024 • Yucheng Li, Huiqiang Jiang, Qianhui Wu, Xufang Luo, Surin Ahn, Chengruidong Zhang, Amir H. Abdi, Dongsheng Li, Jianfeng Gao, Yuqing Yang, Lili Qiu
To address these challenges, optimizations for long-context inference have been developed, centered around the KV cache.
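As background for why the KV cache is the natural optimization target (a minimal single-head sketch, not the paper's system): during autoregressive decoding, keys and values of past tokens are cached so each new token attends over stored tensors instead of recomputing the whole prefix.

```python
import torch

# Minimal single-head KV-cache sketch (illustrative, not the paper's system).
d = 64
Wq, Wk, Wv = (torch.randn(d, d) for _ in range(3))
cache_k, cache_v = [], []

def decode_step(x):
    # x: (d,) embedding of the newest token.
    q = x @ Wq
    cache_k.append(x @ Wk)    # keys/values of past tokens are computed once...
    cache_v.append(x @ Wv)
    K = torch.stack(cache_k)  # ...and reused at every later step, so the
    V = torch.stack(cache_v)  # per-step cost is O(seq_len), not O(seq_len^2).
    attn = torch.softmax(q @ K.T / d ** 0.5, dim=-1)
    return attn @ V

for _ in range(4):
    decode_step(torch.randn(d))
print(len(cache_k), "cached key/value pairs")
```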
1 code implementation • 7 Nov 2024 • Weiquan Huang, Aoqi Wu, Yifan Yang, Xufang Luo, Yuqing Yang, Liang Hu, Qi Dai, Xiyang Dai, Dongdong Chen, Chong Luo, Lili Qiu
CLIP is a foundational multimodal model that aligns image and text features into a shared space using contrastive learning on large-scale image-text pairs.
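For reference, a minimal sketch of the symmetric contrastive (InfoNCE-style) objective underlying CLIP-style alignment; the random features below stand in for encoder outputs:

```python
import torch
import torch.nn.functional as F

# Minimal sketch of CLIP's symmetric contrastive objective on one batch.
B, d = 8, 512
image_feats = F.normalize(torch.randn(B, d), dim=-1)  # stand-in encoder output
text_feats = F.normalize(torch.randn(B, d), dim=-1)
temperature = 0.07

logits = image_feats @ text_feats.T / temperature  # (B, B) similarity matrix
targets = torch.arange(B)                          # matching pairs on diagonal
loss = (F.cross_entropy(logits, targets)           # image -> text
        + F.cross_entropy(logits.T, targets)) / 2  # text -> image
print(loss.item())
```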
2 code implementations • 2 Jul 2024 • Huiqiang Jiang, Yucheng Li, Chengruidong Zhang, Qianhui Wu, Xufang Luo, Surin Ahn, Zhenhua Han, Amir H. Abdi, Dongsheng Li, Chin-Yew Lin, Yuqing Yang, Lili Qiu
With the pattern and sparse indices, we perform efficient sparse attention calculations via our optimized GPU kernels to significantly reduce the latency in the pre-filling stage of long-context LLMs.
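A toy dense-math illustration of index-based sparse attention (the paper's contribution lies in the pattern search and optimized GPU kernels, which this does not reproduce): scores are kept only at selected sparse indices before the softmax.

```python
import torch

# Toy dense reference for index-based sparse attention; the selection
# rule here (top-k estimated scores per query) is a hypothetical stand-in.
n, d = 16, 32
q, k, v = (torch.randn(n, d) for _ in range(3))

est = q @ k.T
idx = est.topk(4, dim=-1).indices            # (n, 4) sparse indices per query

scores = torch.full((n, n), float("-inf"))
scores.scatter_(1, idx, est.gather(1, idx))  # keep only the selected entries
out = torch.softmax(scores / d ** 0.5, dim=-1) @ v
print(out.shape)
```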
1 code implementation • 4 Jun 2024 • Yijiong Yu, Huiqiang Jiang, Xufang Luo, Qianhui Wu, Chin-Yew Lin, Dongsheng Li, Yuqing Yang, Yongfeng Huang, Lili Qiu
This paper first explores the micro-level manifestations of position bias, concluding that attention weights are a micro-level expression of position bias.
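One simple probe of this micro-level signature (illustrative only, not the paper's analysis) is to average the attention mass each context position receives:

```python
import torch

# Illustrative probe: mean attention mass received per key position.
# A "lost in the middle" model would show a U-shaped profile here.
n = 12
attn = torch.softmax(torch.randn(4, n, n), dim=-1)  # (heads, query, key)
per_position = attn.mean(dim=(0, 1))                # mass per key position
print(per_position)  # peaks at early/late positions indicate position bias
```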
no code implementations • 19 Apr 2024 • Shentong Mo, Xufang Luo, Yansen Wang, Dongsheng Li
Visual task adaptation has been demonstrated to be effective in adapting pre-trained Vision Transformers (ViTs) to general downstream visual tasks using specialized learnable layers or tokens.
1 code implementation • 2 Apr 2024 • Zhiyuan He, Aashish Gottipati, Lili Qiu, Xufang Luo, Kenuo Xu, Yuqing Yang, Francis Y. Yan
We introduce NADA, the first framework to autonomously design network algorithms by leveraging the generative capabilities of large language models (LLMs).
no code implementations • 1 Apr 2024 • Zilong Wang, Xufang Luo, Xinyang Jiang, Dongsheng Li, Lili Qiu
This study proposes a novel evaluation framework that uses large language models (LLMs) to compare and assess radiology reports.
1 code implementation • 19 Mar 2024 • Zhuoshi Pan, Qianhui Wu, Huiqiang Jiang, Menglin Xia, Xufang Luo, Jue Zhang, Qingwei Lin, Victor Rühle, Yuqing Yang, Chin-Yew Lin, H. Vicky Zhao, Lili Qiu, Dongmei Zhang
Additionally, our model is 3x-6x faster than existing prompt compression methods, while accelerating the end-to-end latency by 1.6x-2.9x with compression ratios of 2x-5x.
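Schematically, ratio-controlled prompt compression scores each token and keeps the top fraction; the random scorer below is only a placeholder for a learned per-token importance model:

```python
import random

# Schematic of ratio-controlled prompt compression: score tokens, keep
# the top fraction in original order. The random scorer is a stand-in.
def compress(tokens, ratio):
    scored = sorted(((random.random(), i) for i, _ in enumerate(tokens)),
                    reverse=True)
    k = max(1, int(len(tokens) * ratio))
    keep = sorted(i for _, i in scored[:k])  # preserve original token order
    return [tokens[i] for i in keep]

prompt = "please summarize the following meeting notes in two sentences".split()
print(compress(prompt, ratio=0.5))  # 2x compression keeps roughly half
```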
no code implementations • 27 Feb 2024 • Shentong Mo, Yansen Wang, Xufang Luo, Dongsheng Li
Visual Prompt Tuning (VPT) techniques have gained prominence for their capacity to adapt pre-trained Vision Transformers (ViTs) to downstream visual tasks using specialized learnable tokens termed prompts.
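A minimal sketch of the core VPT mechanism (shapes and names are illustrative): learnable prompt tokens are prepended to the patch embeddings while the pretrained backbone stays frozen.

```python
import torch
import torch.nn as nn

# Minimal sketch of (shallow) Visual Prompt Tuning: learnable prompt
# tokens are prepended to patch embeddings; the backbone is frozen.
class PromptedViT(nn.Module):
    def __init__(self, dim=768, num_prompts=10, num_classes=100):
        super().__init__()
        self.prompts = nn.Parameter(torch.randn(1, num_prompts, dim) * 0.02)
        self.encoder = nn.TransformerEncoderLayer(dim, nhead=12,
                                                  batch_first=True)
        for p in self.encoder.parameters():
            p.requires_grad = False     # pretrained weights stay frozen
        self.head = nn.Linear(dim, num_classes)  # head and prompts train

    def forward(self, patch_embeddings):          # (B, N, D)
        B = patch_embeddings.size(0)
        x = torch.cat([self.prompts.expand(B, -1, -1), patch_embeddings],
                      dim=1)
        return self.head(self.encoder(x).mean(dim=1))

print(PromptedViT()(torch.randn(2, 196, 768)).shape)  # torch.Size([2, 100])
```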
no code implementations • 10 Dec 2023 • William Wei Wang, Dongqi Han, Xufang Luo, Yifei Shen, Charles Ling, Boyu Wang, Dongsheng Li
Empowering embodied agents, such as robots, with Artificial Intelligence (AI) has become increasingly important in recent years.
no code implementations • 24 Nov 2023 • Jie Lian, Xufang Luo, Caihua Shan, Dongqi Han, Varut Vardhanabhuti, Dongsheng Li
However, selecting the appropriate edge feature to define patient similarity and construct the graph is challenging, given that each patient is described by high-dimensional features from diverse sources.
no code implementations • 24 Nov 2023 • Xiaoxuan He, Yifan Yang, Xinyang Jiang, Xufang Luo, Haoji Hu, Siyun Zhao, Dongsheng Li, Yuqing Yang, Lili Qiu
To overcome the aforementioned challenges, we propose a Unified Medical Image Pre-training framework, namely UniMedI, which utilizes diagnostic reports as a common semantic space to create unified representations for diverse modalities of medical images (especially for 2D and 3D images).
2 code implementations • 10 Oct 2023 • Huiqiang Jiang, Qianhui Wu, Xufang Luo, Dongsheng Li, Chin-Yew Lin, Yuqing Yang, Lili Qiu
In long context scenarios, large language models (LLMs) face three main challenges: higher computational cost, performance reduction, and position bias.
no code implementations • 2 Jul 2023 • Ziyue Li, Yuchen Fang, You Li, Kan Ren, Yansen Wang, Xufang Luo, Juanyong Duan, Congrui Huang, Dongsheng Li, Lili Qiu
Timely detection of seizures in newborn infants using electroencephalogram (EEG) is a common yet life-saving practice in the Neonatal Intensive Care Unit (NICU).
no code implementations • 2 Jul 2023 • Ruiwen Zhou, Minghuan Liu, Kan Ren, Xufang Luo, Weinan Zhang, Dongsheng Li
Because applicable policies must account for risk, risk-sensitive reinforcement learning (RSRL) has been recognized as an important research direction.
no code implementations • 14 Mar 2023 • Han Zheng, Xufang Luo, Pengfei Wei, Xuan Song, Dongsheng Li, Jing Jiang
In this paper, we consider an offline-to-online setting in which the agent is first trained on the offline dataset and then fine-tuned online, and we propose a framework called Adaptive Policy Learning that effectively takes advantage of both offline and online data.
no code implementations • 17 Jun 2022 • Kerong Wang, Hanye Zhao, Xufang Luo, Kan Ren, Weinan Zhang, Dongsheng Li
Offline reinforcement learning (RL) aims at learning policies from previously collected static trajectory data without interacting with the real environment.
no code implementations • 19 May 2022 • Zhengyu Yang, Kan Ren, Xufang Luo, Minghuan Liu, Weiqing Liu, Jiang Bian, Weinan Zhang, Dongsheng Li
Motivated by the strong accuracy and generalization of ensemble methods in supervised learning (SL), we design a robust and broadly applicable method named Ensemble Proximal Policy Optimization (EPPO), which learns ensemble policies in an end-to-end manner.
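A toy sketch of the ensemble acting step (not the paper's full EPPO algorithm): several policy heads share a trunk, and the ensemble policy averages their action distributions; PPO's clipped objective then trains everything end-to-end.

```python
import torch
import torch.nn as nn

# Toy sketch (not the full EPPO algorithm): K policy heads share a
# trunk, and the ensemble policy averages their action distributions.
class EnsemblePolicy(nn.Module):
    def __init__(self, obs_dim=8, act_dim=4, k=5):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh())
        self.heads = nn.ModuleList(nn.Linear(64, act_dim) for _ in range(k))

    def forward(self, obs):
        h = self.trunk(obs)
        probs = torch.stack([head(h).softmax(-1) for head in self.heads])
        return probs.mean(dim=0)  # ensemble distribution used for acting

pi = EnsemblePolicy()
dist = torch.distributions.Categorical(pi(torch.randn(1, 8)))
print(dist.sample())  # PPO's clipped loss would be applied end-to-end
```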
1 code implementation • 17 Feb 2022 • Che Wang, Xufang Luo, Keith Ross, Dongsheng Li
We propose VRL3, a powerful data-driven framework with a simple design for solving challenging visual deep reinforcement learning (DRL) tasks.
no code implementations • ICLR 2022 • Dongqi Han, Tadashi Kozuno, Xufang Luo, Zhao-Yun Chen, Kenji Doya, Yuqing Yang, Dongsheng Li
How to make intelligent decisions is a central problem in machine learning and cognitive science.
no code implementations • 29 Sep 2021 • Zhengyu Yang, Kan Ren, Xufang Luo, Weiqing Liu, Jiang Bian, Weinan Zhang, Dongsheng Li
Ensemble learning, which consistently improves prediction performance in supervised learning, has drawn increasing attention in reinforcement learning (RL).
no code implementations • 29 Sep 2021 • Han Zheng, Xufang Luo, Pengfei Wei, Xuan Song, Dongsheng Li, Jing Jiang
Specifically, we explicitly consider the difference between online and offline data and apply an adaptive update scheme accordingly, i.e., a pessimistic update strategy for the offline dataset and a greedy or non-pessimistic update scheme for the online dataset.
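Schematically (simplified, with an illustrative uncertainty penalty), the adaptive scheme amounts to switching the Q-learning target on the data source:

```python
import torch

# Schematic of an adaptive target: pessimistic for offline transitions,
# greedy for online ones. beta and the penalty form are illustrative.
def q_target(reward, next_q_values, offline, gamma=0.99, beta=0.5):
    greedy = next_q_values.max(dim=-1).values
    # Pessimism: subtract an uncertainty penalty (here, the spread
    # across actions) only when the transition comes from offline data.
    penalty = beta * next_q_values.std(dim=-1)
    return reward + gamma * torch.where(offline, greedy - penalty, greedy)

r = torch.tensor([1.0, 1.0])
nq = torch.tensor([[0.2, 0.9, 0.4], [0.2, 0.9, 0.4]])
src = torch.tensor([True, False])   # first transition is offline
print(q_target(r, nq, src))         # offline target is discounted more
```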
no code implementations • 25 Sep 2019 • Xufang Luo, Qi Meng, Wei Chen, Tie-Yan Liu
Hence, new algorithms that optimize directly in the path space (which is proven to be PSI) were developed, such as Stochastic Gradient Descent (SGD) in the path space, and SGD in the path space was shown to be superior to SGD in the weight space.
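The motivation is easy to verify numerically (a minimal sketch, reading PSI as positively scale-invariant): a path value, the product of weights along a path through a ReLU network, is unchanged by the positive rescalings that leave the network's function unchanged, while individual weights are not:

```python
# Minimal sketch: path values (products of weights along a path) are
# invariant under the positive rescaling that preserves a ReLU
# network's function, while the individual weights change.
w_in, w_out = 0.8, -1.5
c = 3.0
print(w_in * w_out)              # path value before rescaling
print((w_in * c) * (w_out / c))  # identical after rescaling
# Optimizing path values (the path space) removes this redundancy
# that weight-space SGD must navigate.
```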
no code implementations • 27 Sep 2018 • Xufang Luo, Qi Meng, Di He, Wei Chen, Yunhong Wang, Tie-Yan Liu
Based on our observations, we formally define expressiveness of the state extractor as the rank of the matrix composed by representations.
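Under this definition, expressiveness is directly computable: stack a batch of state representations into a matrix and take its rank (a minimal sketch with a stand-in extractor):

```python
import numpy as np

# Minimal sketch: expressiveness of a state extractor measured as the
# rank of the matrix whose rows are representations of sampled states.
rng = np.random.default_rng(0)
states = rng.normal(size=(256, 32))   # sampled observations

def extractor(s):
    W = rng.normal(size=(32, 64))
    return np.maximum(s @ W, 0.0)     # stand-in ReLU feature map

reps = extractor(states)              # (256, 64) representation matrix
print("expressiveness (rank):", np.linalg.matrix_rank(reps))
```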