no code implementations • Findings (NAACL) 2022 • Xiaozhi Zhu, Tianyong Hao, Sijie Cheng, Fu Lee Wang, Hai Liu
Pretrained language models such as BERT have been successfully applied to a wide range of natural language processing tasks and also achieved impressive performance in document reranking tasks.
1 code implementation • 26 Mar 2025 • Zhicheng Guo, Sijie Cheng, Yuchen Niu, Hao Wang, Sicheng Zhou, Wenbing Huang, Yang Liu
The rapid advancement of large language models (LLMs) has spurred significant interest in tool learning, where LLMs are augmented with external tools to tackle complex tasks.
1 code implementation • 15 Oct 2024 • Sijie Cheng, Kechen Fang, Yangyang Yu, Sicheng Zhou, Bohao Li, Ye Tian, Tingguang Li, Lei Han, Yang Liu
In conclusion, VidEgoThink reflects a research trend towards employing MLLMs for egocentric vision, akin to human capabilities, enabling active observation and interaction in complex real-world environments.
1 code implementation • 30 May 2024 • Jinliang Zheng, Jianxiong Li, Sijie Cheng, Yinan Zheng, Jiaming Li, Jihao Liu, Yu Liu, Jingjing Liu, Xianyuan Zhan
To achieve more accurate and nuanced multimodal instruction following, we introduce Instruction-guided Visual Masking (IVM), a new versatile visual grounding model that is compatible with diverse multimodal models, such as LMMs and robot models.
Ranked #1 on Visual Question Answering on V*bench
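A rough sketch of the instruction-guided masking idea described above, not the released IVM code: it assumes a hypothetical `predict_relevance_mask` model that returns a per-pixel relevance map conditioned on the instruction, and simply suppresses low-relevance regions before the image reaches a downstream multimodal model.

```python
# Minimal sketch (not the IVM implementation): given an instruction-conditioned
# relevance map in [0, 1], gray out image regions irrelevant to the instruction
# before passing the image to a downstream multimodal model.
import numpy as np

def apply_visual_mask(image: np.ndarray, mask: np.ndarray, fill: float = 0.5) -> np.ndarray:
    """image: HxWx3 float array in [0, 1]; mask: HxW relevance map in [0, 1]."""
    mask3 = mask[..., None]                      # broadcast relevance over channels
    return image * mask3 + fill * (1.0 - mask3)  # keep relevant pixels, gray out the rest

# mask = predict_relevance_mask(image, instruction)   # hypothetical IVM call
# masked = apply_visual_mask(image, mask)
# answer = multimodal_model(masked, instruction)      # e.g. an LMM or a robot policy
```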
1 code implementation • 24 May 2024 • Chunjiang Ge, Sijie Cheng, ZiMing Wang, Jiale Yuan, Yuan Gao, Jun Song, Shiji Song, Gao Huang, Bo Zheng
To enhance the capabilities of ConvLLaVA, we propose two critical optimizations.
Ranked #78 on Visual Question Answering on MM-Vet
4 code implementations • 12 Mar 2024 • Zhicheng Guo, Sijie Cheng, Hao Wang, Shihao Liang, Yujia Qin, Peng Li, Zhiyuan Liu, Maosong Sun, Yang Liu
The virtual API server contains a caching system and API simulators, which complement each other to mitigate changes in API status.
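A minimal sketch of the cache-plus-simulator idea behind the virtual API server: serve a cached real response when one exists, otherwise fall back to a simulated response. The function names (`call_virtual_api`, `simulate_api`) are placeholders for illustration, not the StableToolBench API.

```python
# Sketch only: cached responses keep evaluation stable when real APIs change;
# an API simulator (e.g. an LLM prompted with the API documentation) covers cache misses.
import hashlib, json

def call_virtual_api(api_name: str, args: dict, cache: dict, simulate_api) -> dict:
    key = hashlib.sha256(json.dumps([api_name, args], sort_keys=True).encode()).hexdigest()
    if key in cache:                          # 1) caching system: replay a stored real response
        return cache[key]
    response = simulate_api(api_name, args)   # 2) API simulator: hypothetical fallback call
    cache[key] = response                     # store the simulated result for reproducibility
    return response
```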
1 code implementation • 28 Feb 2024 • Jianxiong Li, Jinliang Zheng, Yinan Zheng, Liyuan Mao, Xiao Hu, Sijie Cheng, Haoyi Niu, Jihao Liu, Yu Liu, Jingjing Liu, Ya-Qin Zhang, Xianyuan Zhan
Multimodal pretraining is an effective strategy for the trinity of goals of representation learning in autonomous robots: 1) extracting both local and global task progressions; 2) enforcing temporal consistency of visual representation; 3) capturing trajectory-level language grounding.
Ranked #1 on Contrastive Learning on 10,000 People - Human Pose Recognition Data (using extra training data)
1 code implementation • 23 Feb 2024 • Xiaolong Wang, Yile Wang, Sijie Cheng, Peng Li, Yang Liu
Recent work has made a preliminary attempt to use large language models (LLMs) to solve the stance detection task, showing promising results.
1 code implementation • 22 Jan 2024 • Yile Wang, Sijie Cheng, Zixin Sun, Peng Li, Yang Liu
We propose symbol-to-language (S2L), a tuning-free method that enables large language models to solve symbol-related problems with information expressed in natural language.
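As a rough illustration of the symbol-to-language idea (not the paper's released code), symbolic inputs can be verbalized into plain English before prompting an LLM; the `llm` callable and the tiny relation-table format below are invented for the example.

```python
# Sketch: convert a symbolic problem (here, a small relation table) into natural
# language, then let the language model reason over the verbalized form.
def table_to_language(rows):
    # rows like [("Alice", "older_than", "Bob"), ...] -> English sentences
    return " ".join(f"{h} is {r.replace('_', ' ')} {t}." for h, r, t in rows)

def solve_with_s2l(rows, question, llm):
    prompt = f"{table_to_language(rows)}\nQuestion: {question}\nAnswer:"
    return llm(prompt)  # placeholder for any instruction-following LLM call
```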
1 code implementation • CVPR 2024 • Sijie Cheng, Zhicheng Guo, Jingwen Wu, Kechen Fang, Peng Li, Huaping Liu, Yang Liu
However, the capability of VLMs to "think" from a first-person perspective, a crucial attribute for advancing autonomous agents and robotics, remains largely unexplored.
1 code implementation • 20 Sep 2023 • Guan Wang, Sijie Cheng, Xianyuan Zhan, Xiangang Li, Sen Song, Yang Liu
Specifically, we consider the general SFT training data, consisting of a small amount of expert data mixed with a large proportion of sub-optimal data, without any preference labels.
Ranked #81 on Arithmetic Reasoning on GSM8K
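A hypothetical, minimal sketch of one way to exploit such mixed-quality data without preference labels: tag each example with its source and condition or weight the fine-tuning loss accordingly. The condition tokens and weights below are illustrative assumptions, not the OpenChat training code.

```python
# Sketch: mark each example with its data source and down-weight sub-optimal data,
# so a large noisy pool can be used without explicit preference labels.
import torch
import torch.nn.functional as F

SOURCE_TOKEN = {"expert": "<expert>", "suboptimal": "<general>"}   # hypothetical condition tokens
SOURCE_WEIGHT = {"expert": 1.0, "suboptimal": 0.3}                 # illustrative loss weights

def weighted_sft_loss(model, tokenizer, prompt, response, source):
    text = f"{SOURCE_TOKEN[source]} {prompt} {response}"
    ids = tokenizer(text, return_tensors="pt").input_ids
    logits = model(ids).logits[:, :-1]                             # next-token prediction
    loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)), ids[:, 1:].reshape(-1))
    return SOURCE_WEIGHT[source] * loss
```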
no code implementations • 6 Jul 2023 • Li Jiang, Sijie Cheng, JieLin Qiu, Haoran Xu, Wai Kin Chan, Zhao Ding
The prevalent use of benchmarks in current offline reinforcement learning (RL) research has led to a neglect of the imbalance of real-world dataset distributions in the development of models.
1 code implementation • 28 May 2023 • Zhicheng Guo, Sijie Cheng, Yile Wang, Peng Li, Yang Liu
There are two main challenges to leveraging retrieval-augmented methods for NKI tasks: 1) the demand for diverse relevance score functions and 2) the dilemma between training cost and task performance.
1 code implementation • 27 May 2023 • Xuanjie Fang, Sijie Cheng, Yang Liu, Wei Wang
Pre-trained language models (PLMs) have been widely used to underpin various downstream tasks.
1 code implementation • 10 May 2023 • Jiangjie Chen, Wei Shi, Ziquan Fu, Sijie Cheng, Lei LI, Yanghua Xiao
Large language models (LLMs) have been widely studied for their ability to store and utilize positive knowledge.
no code implementations • 21 Nov 2022 • Sijie Cheng, Zhiyong Wu, Jiangjie Chen, Zhixing Li, Yang Liu, Lingpeng Kong
The major difficulty is finding the conflict point, where the statement contradicts the real world.
1 code implementation • 28 Mar 2022 • Sijie Cheng, Zhouhong Gu, Bang Liu, Rui Xie, Wei Wu, Yanghua Xiao
Specifically, i) to fully exploit user behavioral information, we extract candidate hyponymy relations that match user interests from query-click concepts; ii) to enhance the semantic information of new concepts and better detect hyponymy relations, we model concepts and relations through both user-generated content and structural information in existing taxonomies and user click logs, by leveraging Pre-trained Language Models and Graph Neural Network combined with Contrastive Learning; iii) to reduce the cost of dataset construction and overcome data skews, we construct a high-quality and balanced training dataset from existing taxonomy with no supervision.
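The contrastive component mentioned in ii) can be illustrated with a generic InfoNCE objective over concept embeddings; this sketch shows only the standard loss, not the paper's full PLM + GNN architecture.

```python
# Sketch: pull embeddings of matched concept pairs together and push apart
# in-batch negatives (standard InfoNCE), as used in contrastive representation learning.
import torch
import torch.nn.functional as F

def info_nce(anchor: torch.Tensor, positive: torch.Tensor, temperature: float = 0.07):
    """anchor, positive: (batch, dim) embeddings of matched concept pairs."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature                    # similarity of every anchor to every positive
    labels = torch.arange(a.size(0), device=a.device)   # diagonal entries are the true pairs
    return F.cross_entropy(logits, labels)
```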
1 code implementation • ACL 2022 • Qianyu He, Sijie Cheng, Zhixu Li, Rui Xie, Yanghua Xiao
In this paper, we investigate the ability of PLMs in simile interpretation by designing a novel task named Simile Property Probing, i.e., to let the PLMs infer the shared properties of similes.
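The probing setup can be illustrated with a masked-language-model query in which the shared property of a simile is masked out; the example sentence is invented and this is not the paper's evaluation code.

```python
# Sketch: ask a pretrained masked LM to fill in the shared property of a simile.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for cand in fill("The old detective was as [MASK] as a fox."):
    print(cand["token_str"], round(cand["score"], 3))  # hope to see properties such as "cunning"
```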
1 code implementation • 10 Dec 2021 • Jiangjie Chen, Chun Gan, Sijie Cheng, Hao Zhou, Yanghua Xiao, Lei LI
We also propose a new metric to alleviate the shortcomings of current automatic metrics and better evaluate the trade-off.
no code implementations • 21 Oct 2021 • Sijie Cheng, Jingwen Wu, Yanghua Xiao, Yang Liu
Today, data is often scattered across billions of resource-constrained edge devices with security and privacy constraints.
no code implementations • Findings (ACL) 2021 • Leyang Cui, Sijie Cheng, Yu Wu, Yue Zhang
We quantitatively investigate the presence of structural commonsense cues in BERT when solving commonsense tasks, and the importance of such cues for the model prediction.
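One common way to look for such cues, not necessarily the authors' exact protocol, is to inspect BERT's attention weights between related tokens; the sketch below pulls the attention matrices from a HuggingFace BERT and reads off one token-pair weight.

```python
# Sketch: extract attention weights between two tokens to check whether BERT
# attends along edges that correspond to commonsense relations.
import torch
from transformers import BertTokenizer, BertModel

tok = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)

inputs = tok("Birds can fly because they have wings.", return_tensors="pt")
with torch.no_grad():
    attentions = model(**inputs).attentions          # tuple: layers x (batch, heads, seq, seq)

tokens = tok.convert_ids_to_tokens(inputs["input_ids"][0])
i, j = tokens.index("birds"), tokens.index("wings")
weight = attentions[-1][0, :, i, j].mean()           # mean over heads in the last layer
print(f"attention birds -> wings (last layer): {weight:.3f}")
```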