no code implementations • 16 Feb 2025 • Fei Yu, Yingru Li, Benyou Wang
Value model-guided search is effective in steering the generation but suffers from scaling flaws: Its superiority diminishes with larger sample sizes, underperforming non-search baselines.
no code implementations • 24 Jan 2025 • Zhe Xiang, Fei Yu, Quan Deng, Yuandi Li, Zhiguo Wan
This approach prioritizes high-level semantic information, improving robustness and reducing redundancy across modalities like text, speech, and images.
1 code implementation • 9 Jan 2025 • Qingyu Ren, Jie Zeng, Qianyu He, Jiaqing Liang, Yanghua Xiao, Weikang Zhou, Zeye Sun, Fei Yu
It is crucial for large language models (LLMs) to follow instructions that involve multiple constraints.
no code implementations • 30 Dec 2024 • Junxiao Xue, Quan Deng, Fei Yu, Yanhao Wang, Jun Wang, Yuehua Li
Multimodal large language models (MLLMs), such as GPT-4o, Gemini, LLaVA, and Flamingo, have made significant progress in integrating visual and textual modalities, excelling in tasks like visual question answering (VQA), image captioning, and content retrieval.
no code implementations • 16 Dec 2024 • Jianqing Zhu, Huang Huang, Zhihang Lin, Juhao Liang, Zhengyang Tang, Khalid Almubarak, Abdulmohsen Alharthik, Bang An, Juncai He, Xiangbo Wu, Fei Yu, Junying Chen, Zhuoheng Ma, Yuhao Du, He Zhang, Emad A. Alghamdi, Lian Zhang, Ruoyu Sun, Haizhou Li, Benyou Wang, Jinchao Xu
This paper addresses the critical need for democratizing large language models (LLMs) in the Arab world, a region that has seen slower progress in developing models comparable to state-of-the-art offerings like GPT-4 or ChatGPT 3.5, due to a predominant focus on mainstream languages (e.g., English and Chinese).
no code implementations • 9 Dec 2024 • Fei Yu, Zhe Xiang, Nan Che, Zhuoran Zhang, Yuandi Li, Junxiao Xue, Zhiguo Wan
Existing methods often focus on single modality tasks and fail to handle multimodal stream data, such as video and audio, and their corresponding tasks.
no code implementations • 4 Nov 2024 • Yuandi Li, Zhe Xiang, Fei Yu, Zhangshuang Guan, Hui Ji, Zhiguo Wan, Cheng Feng
This letter introduces MMTrustSC, a novel framework designed to address these challenges by enhancing the security and reliability of multimodal communication.
no code implementations • 14 Oct 2024 • Chenglin Li, Qianglong Chen, Zhi Li, Feng Tao, Yicheng Li, Hao Chen, Fei Yu, Yin Zhang
With tree search and evaluation models, it can efficiently guide each instruction to evolve into a high-quality form, aiding in instruction fine-tuning.
no code implementations • 16 Sep 2024 • Fa-Ting Hong, Yunfei Liu, Yu Li, Changyin Zhou, Fei Yu, Dan Xu
Audio-driven talking head synthesis strives to generate lifelike video portraits from provided audio.
1 code implementation • 28 Aug 2024 • Haowen Hou, Fei Ma, Binwen Bai, Xinxin Zhu, Fei Yu
Large Language Models (LLMs) have garnered widespread attention due to their remarkable performance across various tasks.
1 code implementation • 26 Jul 2024 • Fangze Lin, Ying He, Fei Yu
We first pre-train a model using large-scale expert data.
1 code implementation • 9 Jul 2024 • Yiying Wang, Xiaojing Li, Binzhu Wang, Yueyang Zhou, Yingru Lin, Han Ji, Hong Chen, Jinshi Zhang, Fei Yu, Zewei Zhao, Song Jin, Renji Gong, Wanqing Xu
In domain-specific applications, GPT-4, augmented with precise prompts or Retrieval-Augmented Generation (RAG), shows notable potential but faces the critical tri-lemma of performance, cost, and data privacy.
no code implementations • 1 Jul 2024 • Sirui Xia, Xintao Wang, Jiaqing Liang, Yifei Zhang, Weikang Zhou, Jiaji Deng, Fei Yu, Yanghua Xiao
Retrieval-Augmented Generation (RAG) has been widely adopted to enhance Large Language Models (LLMs) in knowledge-intensive tasks.
1 code implementation • 1 Jul 2024 • Yuxuan Wang, Yijun Liu, Fei Yu, Chen Huang, Kexin Li, Zhiguo Wan, Wanxiang Che
Our in-depth category-level analysis reveals a lack of Chinese cultural knowledge in existing VLMs.
no code implementations • 28 Jun 2024 • Youhua Xia, Tiehua Zhang, Jiong Jin, Ying He, Fei Yu
Efficient data transmission scheduling within vehicular environments poses a significant challenge due to the high mobility of such networks.
no code implementations • 29 Apr 2024 • Dingjie Song, Shunian Chen, Guiming Hardy Chen, Fei Yu, Xiang Wan, Benyou Wang
Despite the advancements and impressive performance of Multimodal Large Language Models (MLLMs) on benchmarks, their effectiveness in real-world, long-context, and multi-image tasks is unclear due to the benchmarks' limited scope.
no code implementations • 19 Jan 2024 • Hao Qian, Hongting Zhou, Qian Zhao, Hao Chen, Hongxiang Yao, Jingwei Wang, Ziqi Liu, Fei Yu, Zhiqiang Zhang, Jun Zhou
The stock market is a crucial component of the financial system, but predicting the movement of stock prices is challenging due to the dynamic and intricate relations arising from various aspects such as economic indicators, financial reports, global news, and investor sentiment.
1 code implementation • 16 Nov 2023 • Fei Yu, Anningzhe Gao, Benyou Wang
These findings offer a novel perspective on the role of outcome supervision in training value models for multi-step reasoning tasks and provide theoretical justification for its advantage in value estimation for guided decoding.
Ranked #45 on Arithmetic Reasoning on GSM8K
no code implementations • 20 Oct 2023 • Xabi Azagirre, Akshay Balwally, Guillaume Candeli, Nicholas Chamandy, Benjamin Han, Alona King, Hyungjun Lee, Martin Loncaric, Sebastien Martin, Vijay Narasiman, Zhiwei Qin, Baptiste Richard, Sara Smoot, Sean Taylor, Garrett van Ryzin, Di Wu, Fei Yu, Alex Zamoshchin
This change was the first documented implementation of a ridesharing matching algorithm that can learn and improve in real time.
no code implementations • 7 Oct 2023 • Zhixuan Chu, Huaiyu Guo, Xinyuan Zhou, Yijia Wang, Fei Yu, Hong Chen, Wanqing Xu, Xin Lu, Qing Cui, Longfei Li, Jun Zhou, Sheng Li
Large language models (LLMs) show promise for natural language tasks but struggle when applied directly to complex domains like finance.
1 code implementation • 21 Sep 2023 • Huang Huang, Fei Yu, Jianqing Zhu, Xuening Sun, Hao Cheng, Dingjie Song, Zhihong Chen, Abdulmohsen Alharthi, Bang An, Juncai He, Ziche Liu, Zhiyi Zhang, Junying Chen, Jianquan Li, Benyou Wang, Lian Zhang, Ruoyu Sun, Xiang Wan, Haizhou Li, Jinchao Xu
This paper is devoted to the development of a localized Large Language Model (LLM) specifically for Arabic, a language imbued with unique cultural characteristics inadequately addressed by current mainstream models.
no code implementations • 30 Aug 2023 • Nan Che, Chenrui Liu, Fei Yu
In particular, the field of sound event recognition lacks a public, common dataset for indoor environmental sound scenes.
no code implementations • 24 Aug 2023 • Puning Zhao, Fei Yu, Zhiguo Wan
Federated learning systems are susceptible to adversarial attacks.
no code implementations • 24 Jul 2023 • Helal El-Zaatari, Fei Yu, Michael R Kosorok
Statistical analysis of social networks provides valuable insights into complex network interactions across various scientific disciplines.
no code implementations • ICCV 2023 • Yunfei Liu, Lijian Lin, Fei Yu, Changyin Zhou, Yu Li
Audio-driven portrait animation aims to synthesize portrait videos that are conditioned by given audio.
2 code implementations • 24 May 2023 • Hongbo Zhang, Junying Chen, Feng Jiang, Fei Yu, Zhihong Chen, Jianquan Li, Guiming Chen, Xiangbo Wu, Zhiyi Zhang, Qingying Xiao, Xiang Wan, Benyou Wang, Haizhou Li
Experimental results demonstrate that HuatuoGPT achieves state-of-the-art results in performing medical consultation among open-source LLMs in GPT-4 evaluation, human evaluation, and medical benchmark datasets.
1 code implementation • 20 Apr 2023 • Zhihong Chen, Feng Jiang, Junying Chen, Tiannan Wang, Fei Yu, Guiming Chen, Hongbo Zhang, Juhao Liang, Chen Zhang, Zhiyi Zhang, Jianquan Li, Xiang Wan, Benyou Wang, Haizhou Li
This paper presents our efforts to democratize ChatGPT across languages.
1 code implementation • 26 Mar 2023 • Fei Yu, Hongbo Zhang, Prayag Tiwari, Benyou Wang
This survey paper proposes a clearer view of natural language reasoning in the field of Natural Language Processing (NLP), both conceptually and practically.
no code implementations • ICCV 2023 • Tianke Zhang, Xuangeng Chu, Yunfei Liu, Lijian Lin, Zhendong Yang, Zhengzhuo Xu, Chengkun Cao, Fei Yu, Changyin Zhou, Chun Yuan, Yu Li
However, the current deep learning-based methods face significant challenges in achieving accurate reconstruction with disentangled facial parameters and ensuring temporal stability in single-frame methods for 3D face tracking on video data.
1 code implementation • 17 May 2022 • Hexin Dong, ZiFan Chen, Mingze Yuan, Yutong Xie, Jie Zhao, Fei Yu, Bin Dong, Li Zhang
Therefore, we propose a method called region-aware metric learning (RAML), which first separates the regions of the images and generates region-aware features for further metric learning.
no code implementations • 29 Sep 2021 • Hexin Dong, Fei Yu, Jie Zhao, Bin Dong, Li Zhang
This paper proposes an unsupervised cross-modality domain adaptation approach based on pixel alignment and self-training.
no code implementations • 8 Apr 2021 • Mo Zhang, Fei Yu, Jie Zhao, Li Zhang, Quanzheng Li
Blood vessel segmentation is crucial for many diagnostic and research applications.
no code implementations • 30 Jun 2020 • Fei Yu, Jiji Tang, Weichong Yin, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang
Thus, ERNIE-ViL can learn the joint representations characterizing the alignments of the detailed semantics across vision and language.
Ranked #2 on Visual Question Answering (VQA) on VCR (QA-R) test
no code implementations • 6 Nov 2019 • Fei Yu, Feiyi Fan, Shouxu Jiang, Kaiping Zheng
In this paper, a novel group recommendation method, called attentive geo-social group recommendation, is proposed to recommend to the target user both activity locations and a group of users who may join the activities.
1 code implementation • 4 Nov 2019 • Jie Zhao, Lei Dai, Mo Zhang, Fei Yu, Meng Li, Hongfeng Li, Wenjia Wang, Li Zhang
The experimental results show that the PGU-net+ achieves higher accuracy than the previous state-of-the-art methods on cervical nuclei segmentation.
no code implementations • 26 Jul 2019 • Rongchang Xie, Fei Yu, Jiachao Wang, Yizhou Wang, Li Zhang
In recent years, object detection has shown impressive results using supervised deep learning, but it remains challenging in a cross-domain environment.
no code implementations • 26 Jul 2019 • Fei Yu, Jie Zhao, Yanjun Gong, Zhi Wang, Yuxi Li, Fan Yang, Bin Dong, Quanzheng Li, Li Zhang
Segmenting coronary arteries is challenging, as classic unsupervised methods fail to produce satisfactory results and modern supervised learning (deep learning) requires manual annotation, which is often time-consuming and can sometimes be infeasible.
no code implementations • 30 Jul 2014 • Fei Yu, Michal Rybar, Caroline Uhler, Stephen E. Fienberg
Following the publication of an attack on genome-wide association studies (GWAS) data proposed by Homer et al., considerable attention has been given to developing methods for releasing GWAS data in a privacy-preserving way.