1 code implementation • 28 Nov 2024 • Xinran Wang, Haiwen Zhang, Baoteng Li, Kongming Liang, Hao Sun, Zhongjiang He, Zhanyu Ma, Jun Guo
Dimension Tailor can not only improve the quality of object details but also offer flexibility in including or excluding specific dimensions based on user preferences.
1 code implementation • 1 Sep 2024 • Haojian Huang, Chuanyu Qin, Zhe Liu, Kaijing Ma, Jin Chen, Han Fang, Chao Ban, Hao Sun, Zhongjiang He
This method effectively integrates local and global feature-neighborhood (F-N) structures for robust decision-making.
no code implementations • 14 Aug 2024 • Kaijing Ma, Han Fang, Xianghao Zang, Chao Ban, Lanxiang Zhou, Zhongjiang He, Yongxiang Li, Hao Sun, Zerun Feng, Xingsong Hou
Video Moment Retrieval, which aims to locate in-context video moments according to a natural language query, is an essential task for cross-modal grounding.
no code implementations • 11 Jul 2024 • Zhenhe Wu, Zhongqiu Li, Jie Zhang, Mengxiang Li, Yu Zhao, Ruiyu Fang, Zhongjiang He, Xuelong Li, Zhoujun Li, Shuangyong Song
Large language models (LLMs) with in-context learning have significantly improved the performance of text-to-SQL task.
no code implementations • 3 Jul 2024 • Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Chao Wang, Xinzhang Liu, Zihan Wang, Yu Zhao, Xin Wang, Yuyao Huang, Shuangyong Song, Yongxiang Li, Zheng Zhang, Bo Zhao, Aixin Sun, Yequan Wang, Zhongjiang He, Zhongyuan Wang, Xuelong Li, Tiejun Huang
Large Language Models (LLMs) represent a significant stride toward Artificial General Intelligence.
no code implementations • 25 Apr 2024 • Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Chao Wang, Xinzhang Liu, Zihan Wang, Yu Zhao, Xin Wang, Yuyao Huang, Shuangyong Song, Yongxiang Li, Zheng Zhang, Bo Zhao, Aixin Sun, Yequan Wang, Zhongjiang He, Zhongyuan Wang, Xuelong Li, Tiejun Huang
Large language models (LLMs) have showcased profound capabilities in language understanding and generation, facilitating a wide array of applications.
no code implementations • 18 Apr 2024 • Han Fang, Xianghao Zang, Chao Ban, Zerun Feng, Lanxiang Zhou, Zhongjiang He, Yongxiang Li, Hao Sun
Text-video retrieval aims to find the most relevant cross-modal samples for a given query.
no code implementations • 16 Mar 2024 • Zihan Wang, Jiayu Xiao, Mengxiang Li, Zhongjiang He, Yongxiang Li, Chao Wang, Shuangyong Song
In our dynamic world where data arrives in a continuous stream, continual learning enables us to incrementally add new tasks/domains without the need to retrain from scratch.
no code implementations • 8 Jan 2024 • Zhongjiang He, Zihan Wang, Xinzhang Liu, Shixuan Liu, Yitong Yao, Yuyao Huang, Xuelong Li, Yongxiang Li, Zhonghao Che, Zhaoxi Zhang, Yan Wang, Xin Wang, Luwen Pu, Huinan Xu, Ruiyu Fang, Yu Zhao, Jie Zhang, Xiaomeng Huang, Zhilong Lu, Jiaxin Peng, Wenjun Zheng, Shiquan Wang, Bingkai Yang, Xuewei he, Zhuoru Jiang, Qiyi Xie, Yanhan Zhang, Zhongqiu Li, Lingling Shi, Weiwei Fu, Yin Zhang, Zilu Huang, Sishi Xiong, Yuxiang Zhang, Chao Wang, Shuangyong Song
Subsequently, the model undergoes fine-tuning to align with human preferences, following a detailed methodology that we describe.
1 code implementation • 25 May 2023 • Shilin Yan, Renrui Zhang, Ziyu Guo, Wenchao Chen, Wei zhang, Hongyang Li, Yu Qiao, Hao Dong, Zhongjiang He, Peng Gao
In this paper, we propose MUTR, a Multi-modal Unified Temporal transformer for Referring video object segmentation.