Search Results for author: Tengtao Song

Found 5 papers, 0 papers with code

"See the World, Discover Knowledge": A Chinese Factuality Evaluation for Large Vision Language Models

no code implementations • 17 Feb 2025 • Jihao Gu, Yingyao Wang, Pi Bu, Chen Wang, ZiMing Wang, Tengtao Song, Donglai Wei, Jiale Yuan, Yingxiu Zhao, Yancheng He, Shilong Li, Jiaheng Liu, Meng Cao, Jun Song, Yingshui Tan, Xiang Li, Wenbo Su, Zhicheng Zheng, Xiaoyong Zhu, Bo Zheng

The evaluation of factual accuracy in large vision language models (LVLMs) has lagged behind their rapid development, making it challenging to fully reflect these models' knowledge capacity and reliability.

Object Recognition • Question Answering • +1

Video Referring Expression Comprehension via Transformer with Content-conditioned Query

no code implementations • 25 Oct 2023 • Ji Jiang, Meng Cao, Tengtao Song, Long Chen, Yi Wang, Yuexian Zou

Video Referring Expression Comprehension (REC) aims to localize a target object in videos based on a natural language query.

cross-modal alignment • Referring Expression • +2

Improve Retrieval-based Dialogue System via Syntax-Informed Attention

no code implementations • 12 Mar 2023 • Tengtao Song, Nuo Chen, Ji Jiang, Zhihong Zhu, Yuexian Zou

Since incorporating syntactic information, such as dependency structures, into neural models can promote a better understanding of sentences, such methods have been widely used in NLP tasks.

Retrieval • Sentence

A Dynamic Graph Interactive Framework with Label-Semantic Injection for Spoken Language Understanding

no code implementations • 8 Nov 2022 • Zhihong Zhu, Weiyuan Xu, Xuxin Cheng, Tengtao Song, Yuexian Zou

Joint models for multi-intent detection and slot filling are gaining increasing traction since they more closely reflect complicated real-world scenarios.

Intent Detection • Semantic Frame Parsing • +3

Video Referring Expression Comprehension via Transformer with Content-aware Query

no code implementations • 6 Oct 2022 • Ji Jiang, Meng Cao, Tengtao Song, Yuexian Zou

To this end, we introduce two new datasets (i.e., VID-Entity and VidSTG-Entity) by augmenting the VIDSentence and VidSTG datasets with the explicitly referred words in the whole sentence, respectively.

cross-modal alignment • Referring Expression • +2
