no code implementations • 31 Mar 2025 • Jiaxin Wu, Ting Zhang, Rubing Chen, WengYu Zhang, Chen Jason Zhang, XiaoYong Wei, Li Qing
Current molecular understanding approaches predominantly focus on the descriptive aspect of human perception, providing broad, topic-level insights.
no code implementations • 18 Feb 2025 • Tianyi Zhang, WengYu Zhang, Xulu Zhang, Jiaxin Wu, Xiao-Yong Wei, Jiannong Cao, Qing Li
Accurate human localization is crucial for various applications, especially in the Metaverse era.
no code implementations • 14 Feb 2025 • Xulu Zhang, XiaoYong Wei, Jinlin Wu, Jiaxin Wu, Zhaoxiang Zhang, Zhen Lei, Qing Li
Recursive Self-Improvement (RSI) enables intelligence systems to autonomously refine their capabilities.
no code implementations • 20 Dec 2024 • Jiaxin Wu, Chong-Wah Ngo, Xiao-Yong Wei, Qing Li
The generated queries retrieve rank lists that differ from the one returned by the original query.
no code implementations • 20 Dec 2024 • Jiaxin Wu, Yiyang Jiang, Xiao-Yong Wei, Qing Li
Specifically, we provide the video captions generated by the LLaVA-NeXT-Video model and the video subtitles with timestamps as context, and ask GPT-4 to generate step captions for the given medical query.
no code implementations • 20 Dec 2024 • Jiaxin Wu, WengYu Zhang, Xiao-Yong Wei, Qing Li
In this paper, we present our methods and results for the Video-To-Text (VTT) task at TRECVid 2024, exploring the capabilities of Vision-Language Models (VLMs) like LLaVA and LLaVA-NeXT-Video in generating natural language descriptions for video content.
no code implementations • 31 Jul 2024 • Rubing Chen, Xulu Zhang, Jiaxin Wu, Wenqi Fan, Xiao-Yong Wei, Qing Li
We propose a multi-layer knowledge pyramid approach within the RAG framework to achieve a better balance between precision and recall.
no code implementations • 30 Jul 2024 • Tianyi Zhang, WengYu Zhang, Xulu Zhang, Jiaxin Wu, Xiao-Yong Wei, Jiannong Cao, Qing Li
Accurate human localization is crucial for various applications, especially in the Metaverse era.
no code implementations • 11 Jul 2024 • Jiaxin Wu, Yizhou Yu, Hong-Yu Zhou
In this work, we benchmark popular UE methods with different model sizes on medical question-answering datasets.
no code implementations • 9 Apr 2024 • Jiaxin Wu, Chong-Wah Ngo, Wing-Kwong Chan
Experimental results show that integrating the proposed elements doubles the R@1 performance of the AVS method on the MSRVTT dataset and improves the xinfAP on the TRECVid AVS query sets for 2016-2023 (eight years) by margins ranging from 2% to 77%, with an average of about 20%.
1 code implementation • 19 Feb 2024 • Jiaxin Wu, Chong-Wah Ngo
Answering queries with semantic concepts has long been the mainstream approach for video search.
no code implementations • 29th International Conference on Multimedia Modeling 2023 • Zhixin Ma, Jiaxin Wu, Weixiong Loo, Chong-Wah Ngo
With these improvements, the interactive system only searches a subset of video datasets relevant to a query while being able to quickly perform Bayesian updates with systematic planning to recommend the most probable candidates that can potentially lead to minimum iteration rounds.
no code implementations • 3 Jul 2022 • Jiaxin Wu, Pingfeng Wang
As for enhancing the designs, designing a robust system has become challenging due to the increasing scale of modern systems and their complicated underlying physical constraints.
1 code implementation • 1 Jul 2022 • Jiaxin Wu, Chong-Wah Ngo, Wing-Kwong Chan, Zhijian Hou
Cross-modal representation learning has become a new normal for bridging the semantic gap between text and visual data.
no code implementations • 13 Mar 2022 • Jiaxin Wu, Xin Chen, Sobhan Badakhshan, Jie Zhang, Pingfeng Wang
Establishing cleaner energy generation, and thereby improving the sustainability of the power system, is a crucial task in this century, and one of the key strategies being pursued is to shift the dependence on fossil fuel to renewable technologies such as wind, solar, and nuclear.