Search Results for author: Zhaoyang Liu

Found 22 papers, 17 papers with code

What is Wrong with Perplexity for Long-context Language Modeling?

1 code implementation31 Oct 2024 Lizhe Fang, Yifei Wang, Zhaoyang Liu, Chenheng Zhang, Stefanie Jegelka, Jinyang Gao, Bolin Ding, Yisen Wang

To address this, we propose \textbf{LongPPL}, a novel metric that focuses on key tokens by employing a long-short context contrastive method to identify them.

Document Summarization In-Context Learning +3

MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions

1 code implementation30 Jul 2024 Xiaowei Chi, Yatian Wang, Aosong Cheng, Pengjun Fang, Zeyue Tian, Yingqing He, Zhaoyang Liu, Xingqun Qi, Jiahao Pan, Rongyu Zhang, Mengfei Li, Ruibin Yuan, Yanbing Jiang, Wei Xue, Wenhan Luo, Qifeng Chen, Shanghang Zhang, Qifeng Liu, Yike Guo

To fulfill this gap, we present MMTrail, a large-scale multi-modality video-language dataset incorporating more than 20M trailer clips with visual captions, and 2M high-quality clips with multimodal captions.

Audio Generation Image to Video Generation +2

VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks

1 code implementation12 Jun 2024 Jiannan Wu, Muyan Zhong, Sen Xing, Zeqiang Lai, Zhaoyang Liu, Zhe Chen, Wenhai Wang, Xizhou Zhu, Lewei Lu, Tong Lu, Ping Luo, Yu Qiao, Jifeng Dai

It not only allows flexible transmission of task information and gradient feedback between the MLLM and multiple downstream decoders but also effectively resolves training conflicts in multi-tasking scenarios.

Image Generation Language Modeling +7

Paths of A Million People: Extracting Life Trajectories from Wikipedia

1 code implementation25 May 2024 Ying Zhang, Xiaofeng Li, Zhaoyang Liu, Haipeng Zhang

The life trajectories of notable people have been studied to pinpoint the times and places of significant events such as birth, death, education, marriage, competition, work, speeches, scientific discoveries, artistic achievements, and battles.

Contrastive Learning Human Dynamics

ControlLLM: Augment Language Models with Tools by Searching on Graphs

1 code implementation26 Oct 2023 Zhaoyang Liu, Zeqiang Lai, Zhangwei Gao, Erfei Cui, Ziheng Li, Xizhou Zhu, Lewei Lu, Qifeng Chen, Yu Qiao, Jifeng Dai, Wenhai Wang

We present ControlLLM, a novel framework that enables large language models (LLMs) to utilize multi-modal tools for solving complex real-world tasks.

Scheduling

InternGPT: Solving Vision-Centric Tasks by Interacting with ChatGPT Beyond Language

2 code implementations9 May 2023 Zhaoyang Liu, Yinan He, Wenhai Wang, Weiyun Wang, Yi Wang, Shoufa Chen, Qinglong Zhang, Zeqiang Lai, Yang Yang, Qingyun Li, Jiashuo Yu, Kunchang Li, Zhe Chen, Xue Yang, Xizhou Zhu, Yali Wang, LiMin Wang, Ping Luo, Jifeng Dai, Yu Qiao

Different from existing interactive systems that rely on pure language, by incorporating pointing instructions, the proposed iGPT significantly improves the efficiency of communication between users and chatbots, as well as the accuracy of chatbots in vision-centric tasks, especially in complicated visual scenarios where the number of objects is greater than 2.

Language Modelling

VLG: General Video Recognition with Web Textual Knowledge

1 code implementation3 Dec 2022 Jintao Lin, Zhaoyang Liu, Wenhai Wang, Wayne Wu, LiMin Wang

Our VLG is first pre-trained on video and language datasets to learn a shared feature space, and then devises a flexible bi-modal attention head to collaborate high-level semantic concepts under different settings.

Video Recognition

MotionBERT: A Unified Perspective on Learning Human Motion Representations

1 code implementation ICCV 2023 Wentao Zhu, Xiaoxuan Ma, Zhaoyang Liu, Libin Liu, Wayne Wu, Yizhou Wang

We present a unified perspective on tackling various human-centric video tasks by learning human motion representations from large-scale and heterogeneous data resources.

 Ranked #1 on Monocular 3D Human Pose Estimation on Human3.6M (using extra training data)

3D Pose Estimation Action Recognition +4

Submission to Generic Event Boundary Detection Challenge@CVPR 2022: Local Context Modeling and Global Boundary Decoding Approach

no code implementations30 Jun 2022 Jiaqi Tang, Zhaoyang Liu, Jing Tan, Chen Qian, Wayne Wu, LiMin Wang

Local context modeling sub-network is proposed to perceive diverse patterns of generic event boundaries, and it generates powerful video representations and reliable boundary confidence.

Boundary Detection Generic Event Boundary Detection +1

Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing

2 code implementations25 Apr 2022 Haoyue Cheng, Zhaoyang Liu, Hang Zhou, Chen Qian, Wayne Wu, LiMin Wang

This paper focuses on the weakly-supervised audio-visual video parsing task, which aims to recognize all events belonging to each modality and localize their temporal boundaries.

Denoising valid

Progressive Attention on Multi-Level Dense Difference Maps for Generic Event Boundary Detection

3 code implementations CVPR 2022 Jiaqi Tang, Zhaoyang Liu, Chen Qian, Wayne Wu, LiMin Wang

Generic event boundary detection is an important yet challenging task in video understanding, which aims at detecting the moments where humans naturally perceive event boundaries.

Boundary Detection Diversity +2

CausCF: Causal Collaborative Filtering for RecommendationEffect Estimation

no code implementations28 May 2021 Xu Xie, Zhaoyang Liu, Shiwen Wu, Fei Sun, Cihang Liu, Jiawei Chen, Jinyang Gao, Bin Cui, Bolin Ding

It is based on the idea that similar users not only have a similar taste on items, but also have similar treatment effect under recommendations.

Collaborative Filtering Recommendation Systems

PURE: An Uncertainty-aware Recommendation Framework for Maximizing Expected Posterior Utility of Platform

no code implementations1 Jan 2021 Haokun Chen, Zhaoyang Liu, Chen Xu, Ziqian Chen, Jinyang Gao, Bolin Ding

In this paper, we propose a novel recommendation framework which effectively utilizes the information of user uncertainty over different item dimensions and explicitly takes into consideration the impact of display policy on user in order to achieve maximal expected posterior utility for the platform.

Contrastive Learning for Sequential Recommendation

1 code implementation27 Oct 2020 Xu Xie, Fei Sun, Zhaoyang Liu, Shiwen Wu, Jinyang Gao, Bolin Ding, Bin Cui

Sequential recommendation methods play a crucial role in modern recommender systems because of their ability to capture a user's dynamic interest from her/his historical interactions.

Contrastive Learning Data Augmentation +1

Cannot find the paper you are looking for? You can Submit a new open access paper.