no code implementations • 9 Feb 2025 • Wanqi Yang, Yanda Li, Meng Fang, Ling Chen
Understanding temporal dynamics is critical for conversational agents, enabling effective content analysis and informed decision-making.
1 code implementation • 22 Nov 2024 • Wanqi Yang, Yanda Li, Meng Fang, Yunchao Wei, Tianyi Zhou, Ling Chen
We evaluate six state-of-the-art LLMs with voice interaction capabilities, including Gemini-1.5-Pro, GPT-4o, and others, using three distinct evaluation methods on the CAA benchmark.
1 code implementation • 4 Nov 2024 • Biao Wu, Yanda Li, Meng Fang, Zirui Song, Zhiwei Zhang, Yunchao Wei, Ling Chen
This survey provides a comprehensive review of mobile agent technologies, focusing on recent advancements that enhance real-time adaptability and multimodal interaction.
no code implementations • 25 Sep 2024 • Wanqi Yang, Yanda Li, Meng Fang, Ling Chen
Time-Sensitive Question Answering (TSQA) demands the effective utilization of specific temporal contexts, encompassing multiple time-evolving facts, to address time-sensitive questions.
no code implementations • 5 Aug 2024 • Yanda Li, Chi Zhang, Wanqi Yang, Bin Fu, Pei Cheng, Xin Chen, Ling Chen, Yunchao Wei
In the deployment phase, RAG technology enables efficient retrieval from, and updating of, this knowledge base, thereby empowering the agent to perform tasks effectively and accurately.
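A rough illustration of the retrieve-and-update loop described above (a minimal sketch, not the paper's implementation: the `KnowledgeBase` class, the toy character-trigram embedding, and the example entries are all assumptions; a real agent would use a learned text encoder):

```python
# Minimal retrieval-augmented lookup over an in-memory knowledge base.
# The embedding function is a deterministic stand-in for a learned encoder.
import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy embedding: hash character trigrams into a fixed-size vector."""
    vec = np.zeros(dim)
    for i in range(len(text) - 2):
        vec[hash(text[i:i + 3]) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

class KnowledgeBase:
    def __init__(self):
        self.entries = []  # list of (text, embedding) pairs

    def update(self, text: str) -> None:
        """Add a new piece of task knowledge to the store."""
        self.entries.append((text, embed(text)))

    def retrieve(self, query: str, k: int = 3) -> list:
        """Return the k stored entries most similar to the query."""
        q = embed(query)
        scored = sorted(self.entries, key=lambda e: -float(e[1] @ q))
        return [text for text, _ in scored[:k]]

kb = KnowledgeBase()
kb.update("To open the settings app, tap the gear icon on the home screen.")
kb.update("Screenshots are saved under DCIM/Screenshots.")
print(kb.retrieve("how do I open settings?", k=1))
```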
no code implementations • 17 Jul 2024 • Wanqi Yang, Yunqiu Xu, Yanda Li, Kunze Wang, Binbin Huang, Ling Chen
In this study, we explore an emerging research area, Continual Learning for Temporal-Sensitive Question Answering (CLTSQA).
no code implementations • 17 Jun 2024 • Zepeng Ding, Ruiyang Ke, Wenhao Huang, Guochao Jiang, Yanda Li, Deqing Yang, Jiaqing Liang
Existing research on large language models (LLMs) shows that they can solve information extraction tasks through multi-step planning.
no code implementations • 27 May 2024 • Dixuan Wang, Yanda Li, Junyuan Jiang, Zepeng Ding, Guochao Jiang, Jiaqing Liang, Deqing Yang
Our empirical results reveal that our ADT is highly effective in challenging the tokenization of leading LLMs, including GPT-4o, Llama-3, Qwen2.5-max, and others, thus degrading these LLMs' capabilities.
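The underlying effect, that adversarially chosen character sequences fragment into many low-frequency tokens, can be seen with an off-the-shelf tokenizer (a sketch using OpenAI's `tiktoken`; the example strings are illustrative and not drawn from the ADT data):

```python
# Illustration of how awkwardly segmented strings inflate token counts.
# Requires: pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # tokenizer used by GPT-4-class models

for text in ["straightforward sentence", "str aightf orwar dsen tence"]:
    tokens = enc.encode(text)
    print(f"{text!r} -> {len(tokens)} tokens: {tokens}")
```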
no code implementations • 28 Apr 2024 • Zirui Song, Yaohang Li, Meng Fang, Yanda Li, Zhenhao Chen, Zecheng Shi, Yuan Huang, Xiuying Chen, Ling Chen
To address this, we propose the Multi-Modal Agent Collaboration framework (MMAC-Copilot), which leverages the collective expertise of diverse agents to enhance interaction with applications.
no code implementations • 4 Apr 2024 • Yanda Li, Dixuan Wang, Jiaqing Liang, Guochao Jiang, Qianyu He, Yanghua Xiao, Deqing Yang
Large Language Models (LLMs) have demonstrated good performance on many reasoning tasks, but they still struggle with more complicated ones, such as logical reasoning.
1 code implementation • 20 Aug 2023 • Yanda Li, Chi Zhang, Gang Yu, Zhibin Wang, Bin Fu, Guosheng Lin, Chunhua Shen, Ling Chen, Yunchao Wei
However, these datasets often exhibit domain bias, potentially constraining the generative capabilities of the models.
Ranked #149 on Visual Question Answering on MM-Vet
1 code implementation • 3 Apr 2023 • Yanda Li, Zilong Huang, Gang Yu, Ling Chen, Yunchao Wei, Jianbo Jiao
The pre-training task is designed in a manner similar to image matting, where a random trimap and alpha matte are generated to achieve an image disentanglement objective.
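A rough sketch of how a random trimap can be derived from an alpha matte for such a pre-training objective (the thresholds, the random band width, and the `random_trimap` helper are illustrative assumptions, not the paper's settings):

```python
# Generate a trimap (foreground / background / unknown) from an alpha matte.
import numpy as np
from scipy.ndimage import binary_erosion

def random_trimap(alpha: np.ndarray, max_width: int = 10) -> np.ndarray:
    """alpha in [0, 1]; returns a trimap with 0=bg, 128=unknown, 255=fg."""
    fg = alpha > 0.95
    bg = alpha < 0.05
    width = np.random.randint(1, max_width + 1)     # random unknown-band width
    fg_core = binary_erosion(fg, iterations=width)  # shrink certain foreground
    bg_core = binary_erosion(bg, iterations=width)  # shrink certain background
    trimap = np.full(alpha.shape, 128, dtype=np.uint8)
    trimap[fg_core] = 255
    trimap[bg_core] = 0
    return trimap

# Example: a synthetic circular alpha matte
yy, xx = np.mgrid[0:128, 0:128]
alpha = ((xx - 64) ** 2 + (yy - 64) ** 2 < 40 ** 2).astype(np.float32)
print(np.unique(random_trimap(alpha)))  # -> [0 128 255]
```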