1 code implementation • 30 May 2025 • Duo Zheng, Shijia Huang, Yanyang Li, LiWei Wang
Our approach employs a 3D visual geometry encoder that extracts 3D prior information from video sequences.
1 code implementation • 6 Dec 2024 • Yanyang Li, Tin Long Wong, Cheung To Hung, Jianqiao Zhao, Duo Zheng, Ka Wai Liu, Michael R. Lyu, LiWei Wang
Recent advances in large language models (LLMs) have shown significant promise, yet their evaluation raises concerns, particularly regarding data contamination due to the lack of access to proprietary training data.
1 code implementation • CVPR 2025 • Duo Zheng, Shijia Huang, LiWei Wang
Efforts to enhance MLLMs, such as incorporating point cloud features, have been made, yet a considerable gap remains between the models' learned representations and the inherent complexity of 3D scenes.
Ranked #2 on 3D Question Answering (3D-QA) on SQA3D
2 code implementations • CVPR 2024 • Duo Zheng, Shijia Huang, Lin Zhao, Yiwu Zhong, LiWei Wang
We conduct extensive experiments to evaluate the performance and generalizability of our model.
3D Question Answering (3D-QA)
Embodied Question Answering
1 code implementation • 9 Aug 2023 • Yanyang Li, Jianqiao Zhao, Duo Zheng, Zi-Yuan Hu, Zhi Chen, Xiaohui Su, Yongfeng Huang, Shijia Huang, Dahua Lin, Michael R. Lyu, LiWei Wang
With the continuous emergence of Chinese Large Language Models (LLMs), how to evaluate a model's capabilities has become an increasingly significant issue.
no code implementations • 16 May 2023 • Jiaan Wang, Fandong Meng, Duo Zheng, Yunlong Liang, Zhixu Li, Jianfeng Qu, Jie Zhou
In this paper, we aim to unify MLS and CLS into a more general setting, i.e., many-to-many summarization (M2MS), where a single model can process documents in any language and generate their summaries in any language.
1 code implementation • 24 Oct 2022 • Duo Zheng, Tao Kong, Ya Jing, Jiaan Wang, Xiaojie Wang
Additionally, IRTF can generate pseudo input regions for the REC task, providing a uniform way for REC and REG to share an identical representation space.
no code implementations • 23 Mar 2022 • Jiaan Wang, Fandong Meng, Duo Zheng, Yunlong Liang, Zhixu Li, Jianfeng Qu, Jie Zhou
Cross-lingual summarization is the task of generating a summary in one language (e.g., English) for the given document(s) in a different language (e.g., Chinese).
1 code implementation • 16 Mar 2022 • Duo Zheng, Fandong Meng, Qingyi Si, Hairun Fan, Zipeng Xu, Jie Zhou, Fangxiang Feng, Xiaojie Wang
Visual dialog has made great progress since various vision-oriented goals were introduced into the conversation, notably in GuessWhich and GuessWhat, where the image is visible to only one of, or to both of, the questioner and the answerer, respectively.
2 code implementations • 11 Feb 2022 • Jiaan Wang, Fandong Meng, Ziyao Lu, Duo Zheng, Zhixu Li, Jianfeng Qu, Jie Zhou
We present ClidSum, a benchmark dataset for building cross-lingual summarization systems on dialogue documents.
1 code implementation • 24 Nov 2021 • Jiaan Wang, Zhixu Li, Tingyi Zhang, Duo Zheng, Jianfeng Qu, An Liu, Lei Zhao, Zhigang Chen
We also introduce a knowledge-enhanced summarizer that uses both live commentaries and the knowledge to generate sports news.
1 code implementation • Findings (EMNLP) 2021 • Duo Zheng, Zipeng Xu, Fandong Meng, Xiaojie Wang, Jiaan Wang, Jie Zhou
To enhance the VD Questioner: 1) we propose a Related entity enhanced Questioner (ReeQ) that generates questions under the guidance of related entities and learns an entity-based questioning strategy from human dialogs; 2) we propose an Augmented Guesser (AugG), a strong guesser optimized specifically for the VD setting.
1 code implementation • 12 Jul 2021 • Zipeng Xu, Fandong Meng, Xiaojie Wang, Duo Zheng, Chenxu Lv, Jie Zhou
In Reinforcement Learning, it is crucial to represent states and assign rewards based on the action-caused transitions of states.