1 code implementation • 17 Sep 2024 • Simon Yu, Liangyu Chen, Sara Ahmadian, Marzieh Fadaee
This work addresses the question: How can we determine the optimal subset of data for effective training?
no code implementations • 25 Aug 2024 • Liangyu Chen, Zihao Yue, Boshen Xu, Qin Jin
Audio-Visual Source Localization (AVSL) aims to localize the source of sound within a video.
no code implementations • 1 Jul 2024 • Yubo Ma, Yuhang Zang, Liangyu Chen, Meiqi Chen, Yizhu Jiao, Xinze Li, Xinyuan Lu, Ziyu Liu, Yan Ma, Xiaoyi Dong, Pan Zhang, Liangming Pan, Yu-Gang Jiang, Jiaqi Wang, Yixin Cao, Aixin Sun
Moreover, 33. 2% of the questions are cross-page questions requiring evidence across multiple pages.
no code implementations • 15 Apr 2024 • Ziniu Zhang, Shulin Tian, Liangyu Chen, Ziwei Liu
To answer this question, we present MMInA, a multihop and multimodal benchmark to evaluate the embodied agents for compositional Internet tasks, with several appealing properties: 1) Evolving real-world multimodal websites.
no code implementations • 30 Mar 2024 • Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, Felix Friedrich, Prateek Yadav, Tanmay Laud, Vu Minh Chien, Terry Yue Zhuo, Diganta Misra, Ben Bogin, Xuan-Son Vu, Marzena Karpinska, Arnav Varma Dantuluri, Wojciech Kusa, Tommaso Furlanello, Rio Yokota, Niklas Muennighoff, Suhas Pai, Tosin Adewumi, Veronika Laippala, Xiaozhe Yao, Adalberto Junior, Alpay Ariyak, Aleksandr Drozd, Jordan Clive, Kshitij Gupta, Liangyu Chen, Qi Sun, Ken Tsui, Noah Persaud, Nour Fahmy, Tianlong Chen, Mohit Bansal, Nicolo Monti, Tai Dang, Ziyang Luo, Tien-Tung Bui, Roberto Navigli, Virendra Mehta, Matthew Blumberg, Victor May, Huu Nguyen, Sampo Pyysalo
Pretrained language models underpin several AI applications, but their high computational cost for training limits accessibility.
no code implementations • 5 Jan 2024 • Na Liu, Liangyu Chen, Xiaoyu Tian, Wei Zou, Kaijiang Chen, Ming Cui
This paper introduces RAISE (Reasoning and Acting through Scratchpad and Examples), an advanced architecture enhancing the integration of Large Language Models (LLMs) like GPT-4 into conversational agents.
no code implementations • CVPR 2024 • Ruiqi Wu, Liangyu Chen, Tong Yang, Chunle Guo, Chongyi Li, Xiangyu Zhang
In this paper we present a few-shot text-to-video framework LAMP which enables a text-to-image diffusion model to Learn A specific Motion Pattern with 8 16 videos on a single GPU.
no code implementations • 6 Dec 2023 • Linze Li, Sunqi Fan, Hengjun Pu, Zhaodong Bing, Yao Tang, Tianzhu Ye, Tong Yang, Liangyu Chen, Jiajun Liang
Our method's efficacy has been validated on multiple representative DreamBooth and LoRA models, delivering substantial improvements over the original outcomes in terms of facial fidelity, text-to-image editability, and video motion.
3 code implementations • CVPR 2023 • Jingkang Yang, Wenxuan Peng, Xiangtai Li, Zujin Guo, Liangyu Chen, Bo Li, Zheng Ma, Kaiyang Zhou, Wayne Zhang, Chen Change Loy, Ziwei Liu
PVSG relates to the existing video scene graph generation (VidSGG) problem, which focuses on temporal interactions between humans and objects grounded with bounding boxes in videos.
no code implementations • 27 Oct 2023 • Xiaoyu Tian, Liangyu Chen, Na Liu, Yaxuan Liu, Wei Zou, Kaijiang Chen, Ming Cui
The fast thinking model serves as the primary interface for external interactions and initial response generation, evaluating the necessity for engaging the slow thinking model based on the complexity of the complete response.
1 code implementation • 16 Oct 2023 • Ruiqi Wu, Liangyu Chen, Tong Yang, Chunle Guo, Chongyi Li, Xiangyu Zhang
Specifically, we design a first-frame-conditioned pipeline that uses an off-the-shelf text-to-image model for content generation so that our tuned video diffusion model mainly focuses on motion learning.
1 code implementation • 28 Jul 2023 • Cheng Wen, Xianghui Sun, Shuaijiang Zhao, Xiaoquan Fang, Liangyu Chen, Wei Zou
This paper presents the development and evaluation of ChatHome, a domain-specific language model (DSLM) designed for the intricate field of home renovation.
no code implementations • 25 Jul 2023 • Bo Li, Haotian Liu, Liangyu Chen, Yong Jae Lee, Chunyuan Li, Ziwei Liu
Advancements in large pre-trained generative models have expanded their potential as effective data generators in visual recognition.
2 code implementations • 8 Jun 2023 • Bo Li, Yuanhan Zhang, Liangyu Chen, Jinghao Wang, Fanyi Pu, Jingkang Yang, Chunyuan Li, Ziwei Liu
We release the MIMIC-IT dataset, instruction-response collection pipeline, benchmarks, and the Otter model.
Ranked #20 on Visual Question Answering on MM-Vet v2
1 code implementation • 5 May 2023 • Bo Li, Yuanhan Zhang, Liangyu Chen, Jinghao Wang, Jingkang Yang, Ziwei Liu
Large language models (LLMs) have demonstrated significant universal capabilities as few/zero-shot learners in various tasks due to their pre-training on vast amounts of text data, as exemplified by GPT-3, which boosted to InstrctGPT and ChatGPT, effectively following natural language instructions to accomplish real-world tasks.
Ranked #8 on Visual Question Answering on BenchLMM
1 code implementation • 5 Oct 2022 • Liangyu Chen, Yutong Bai, Siyu Huang, Yongyi Lu, Bihan Wen, Alan L. Yuille, Zongwei Zhou
However, we uncover a striking contradiction to this promise: active learning fails to select data as efficiently as random selection at the first few choices.
no code implementations • 21 Aug 2022 • Xiaolu Wang, Ziqi Ding, Liangyu Chen
In this paper, K12 math problems taken as the research object, the LABS model based on label-semantic attention and multi-label smoothing combining textual features is proposed to improve the automatic tagging of knowledge points for math problems.
no code implementations • Findings (NAACL) 2022 • Yunjie Ji, Liangyu Chen, Chenxiao Dou, Baochang Ma, Xiangang Li
Machine Reading Comprehension with Unanswerable Questions is a difficult NLP task, challenged by the questions which can not be answered from passages.
no code implementations • SemEval (NAACL) 2022 • Yong Deng, Chenxiao Dou, Liangyu Chen, Deqiang Miao, Xianghui Sun, Baochang Ma, Xiangang Li
PCL detection task is aimed at identifying and categorizing language that is patronizing or condescending towards vulnerable communities in the general media. Compared to other NLP tasks of paragraph classification, the negative language presented in the PCL detection task is usually more implicit and subtle to be recognized, making the performance of common text-classification approaches disappointed.
Ranked #1 on Multi-label Condescension Detection on DPM
Binary Condescension Detection Multi-Label Classification +1
4 code implementations • 19 Apr 2022 • Xiaojie Chu, Liangyu Chen, Wenqing Yu
This paper inherits a strong and simple image restoration model, NAFNet, for single-view feature extraction and extends it by adding cross attention modules to fuse features between views to adapt to binocular scenarios.
11 code implementations • 10 Apr 2022 • Liangyu Chen, Xiaojie Chu, Xiangyu Zhang, Jian Sun
Although there have been significant advances in the field of image restoration recently, the system complexity of the state-of-the-art (SOTA) methods is increasing as well, which may hinder the convenient analysis and comparison of methods.
Ranked #1 on Deblurring on MSU BASED
2 code implementations • 8 Dec 2021 • Xiaojie Chu, Liangyu Chen, Chengpeng Chen, Xin Lu
Our TLC converts global operations to local ones only during inference so that they aggregate features within local spatial regions rather than the entire large images.
Ranked #1 on Color Image Denoising on Urban100 sigma30
1 code implementation • 17 May 2021 • Andrey Ignatov, Kim Byeoung-su, Radu Timofte, Angeline Pouget, Fenglong Song, Cheng Li, Shuai Xiao, Zhongqian Fu, Matteo Maggioni, Yibin Huang, Shen Cheng, Xin Lu, Yifeng Zhou, Liangyu Chen, Donghao Liu, Xiangyu Zhang, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Minsu Kwon, Myungje Lee, Jaeyoon Yoo, Changbeom Kang, Shinjo Wang, Bin Huang, Tianbao Zhou, Shuai Liu, Lei Lei, Chaoyu Feng, Liguang Huang, Zhikun Lei, Feifei Chen
A detailed description of all models developed in the challenge is provided in this paper.
no code implementations • 14 May 2021 • Hao Du, Melissa Min-Szu Yao, Liangyu Chen, Wing P. Chan, Mengling Feng
In this study, we proposed a multi-task deep graph convolutional network (GCN) method for the automatic characterization of morphology and distribution of microcalcifications in mammograms.
2 code implementations • 13 May 2021 • Liangyu Chen, Xin Lu, Jie Zhang, Xiaojie Chu, Chengpeng Chen
Specifically, we present a novel block: Half Instance Normalization Block (HIN Block), to boost the performance of image restoration networks.
Ranked #3 on Single Image Deraining on Test2800
1 code implementation • CVPR 2021 • Liangyu Chen, Tong Yang, Xiangyu Zhang, Wei zhang, Jian Sun
We propose a novel point annotated setting for the weakly semi-supervised object detection task, in which the dataset comprises small fully annotated images and large weakly annotated images by points.