1 code implementation • 11 Dec 2024 • Shijian Wang, Linxin Song, Jieyu Zhang, Ryotaro Shimizu, Ao Luo, Li Yao, Cunjian Chen, Julian McAuley, Hanqian Wu
Models tuned on our augmented dataset achieve the best overall performance when compared with MLMs of the same scale tuned on datasets up to 75 times larger, highlighting the importance of instruction templates in MLM training.
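For a concrete sense of what instruction-template augmentation of this kind can look like, here is a minimal Python sketch; the template strings and field names are hypothetical placeholders, not the templates actually used in the paper.

```python
import random

# Hypothetical instruction templates; the paper's actual templates differ.
TEMPLATES = [
    "Question: {question}\nAnswer the question based on the image.",
    "{question}\nGive a short answer.",
    "Look at the image and respond: {question}",
]

def augment_with_templates(samples, n_variants=3, seed=0):
    """Expand each (question, answer) pair into several template-formatted
    training examples, so the model sees diverse instruction phrasings."""
    rng = random.Random(seed)
    augmented = []
    for sample in samples:
        for template in rng.sample(TEMPLATES, k=min(n_variants, len(TEMPLATES))):
            augmented.append({
                "image": sample["image"],
                "prompt": template.format(question=sample["question"]),
                "answer": sample["answer"],
            })
    return augmented

data = [{"image": "img_001.jpg", "question": "What color is the car?", "answer": "Red"}]
print(augment_with_templates(data))
```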
no code implementations • 29 Nov 2024 • Xinjie Cui, Yuezun Li, Ao Luo, Jiaran Zhou, Junyu Dong
We describe the Forensics Adapter, an adapter network designed to transform CLIP into an effective and generalizable face forgery detector.
1 code implementation • 26 Sep 2024 • Kunyu Peng, Di Wen, Kailun Yang, Ao Luo, Yufan Chen, Jia Fu, M. Saquib Sarfraz, Alina Roitberg, Rainer Stiefelhagen
In this paper, we observe that an adaptive domain scheduler benefits OSDG more than prefixed sequential or random domain schedulers.
no code implementations • 29 Aug 2024 • Yangxiang Zhang, Yuezun Li, Ao Luo, Jiaran Zhou, Junyu Dong
In this paper, we describe an efficient two-stream architecture for real-time image manipulation detection.
no code implementations • 18 Jul 2024 • Jianwei Zhao, Xin Li, Fan Yang, Qiang Zhai, Ao Luo, Zicheng Jiao, Hong Cheng
Detecting objects seamlessly blended into their surroundings represents a complex task for both human cognitive capabilities and advanced artificial intelligence algorithms.
1 code implementation • 12 Jul 2024 • Hai Jiang, Ao Luo, Xiaohong Liu, Songchen Han, Shuaicheng Liu
In this paper, we propose LightenDiffusion, an unsupervised framework that incorporates physically explainable Retinex theory into diffusion models for low-light image enhancement.
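As background for the Retinex assumption the method builds on, the following is a minimal classical sketch that splits an image into reflectance and illumination using a Gaussian-blur illumination estimate. LightenDiffusion's learned, diffusion-based decomposition is more involved, so treat this only as an illustration of the I = R × L assumption, with all parameters chosen arbitrarily.

```python
import torch
import torch.nn.functional as F

def gaussian_kernel(size=15, sigma=5.0):
    coords = torch.arange(size, dtype=torch.float32) - size // 2
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    kernel = torch.outer(g, g)
    return (kernel / kernel.sum()).view(1, 1, size, size)

def retinex_decompose(img, eps=1e-4):
    """Split an image (B, 3, H, W) in [0, 1] into illumination and reflectance
    under the Retinex assumption I = R * L, estimating L with a Gaussian blur."""
    kernel = gaussian_kernel().to(img.device).repeat(img.shape[1], 1, 1, 1)
    illumination = F.conv2d(img + eps, kernel, padding=kernel.shape[-1] // 2,
                            groups=img.shape[1])
    reflectance = (img + eps) / (illumination + eps)
    return reflectance.clamp(0, 1), illumination

low_light = torch.rand(1, 3, 64, 64) * 0.2   # dummy dark image
R, L = retinex_decompose(low_light)
```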
no code implementations • 29 May 2024 • Linxin Song, Jiale Liu, Jieyu Zhang, Shaokun Zhang, Ao Luo, Shijian Wang, Qingyun Wu, Chi Wang
Leveraging multiple large language model (LLM) agents has been shown to be a promising approach for tackling complex tasks, while the effective design of multiple agents for a particular application remains an art.
1 code implementation • CVPR 2024 • Tianhao Zhou, Haipeng Li, Ziyi Wang, Ao Luo, Chen-Lin Zhang, Jiajun Li, Bing Zeng, Shuaicheng Liu
Image stitching from different captures often results in non-rectangular boundaries, which are generally considered unappealing.
1 code implementation • 18 Jan 2024 • Linxin Song, Yan Cui, Ao Luo, Freddy Lecue, Irene Li
Transformer-based models excel in various natural language processing (NLP) tasks, attracting countless efforts to explain their inner workings.
1 code implementation • CVPR 2024 • Ao Luo, Xin Li, Fan Yang, Jiangyu Liu, Haoqiang Fan, Shuaicheng Liu
To optimize accuracy and efficiency, our FlowDiffuser incorporates a novel Conditional Recurrent Denoising Decoder (Conditional-RDD), streamlining the flow estimation process.
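The exact Conditional-RDD architecture is not reproduced here; the snippet below is only a toy conditional recurrent refinement loop that captures the general idea of iteratively denoising a flow field conditioned on context features. All module names and layer sizes are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class RecurrentDenoiser(nn.Module):
    """Toy conditional recurrent denoising step: a ConvGRU-like cell refines a
    noisy flow field over several iterations, conditioned on context features.
    The update rule and dimensions are illustrative, not FlowDiffuser's."""
    def __init__(self, ctx_dim=128, hidden_dim=96):
        super().__init__()
        self.encode = nn.Conv2d(2 + ctx_dim, hidden_dim, 3, padding=1)
        self.gru = nn.Conv2d(hidden_dim * 2, hidden_dim, 3, padding=1)
        self.to_delta = nn.Conv2d(hidden_dim, 2, 3, padding=1)

    def forward(self, noisy_flow, context, steps=4):
        h = torch.zeros_like(context[:, : self.gru.out_channels])
        flow = noisy_flow
        for _ in range(steps):
            x = torch.relu(self.encode(torch.cat([flow, context], dim=1)))
            h = torch.tanh(self.gru(torch.cat([x, h], dim=1)))
            flow = flow + self.to_delta(h)   # iterative denoising update
        return flow

model = RecurrentDenoiser()
flow0 = torch.randn(1, 2, 32, 32)   # noisy initial flow
ctx = torch.randn(1, 128, 32, 32)   # image/context features
refined = model(flow0, ctx)
```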
1 code implementation • CVPR 2024 • Xinglong Luo, Ao Luo, Zhengning Wang, Chunyu Lin, Bing Zeng, Shuaicheng Liu
In this paper, we explore the problem of event-based meshflow estimation, a novel task that involves predicting a spatially smooth, sparse motion field from event cameras.
1 code implementation • ICCV 2023 • Ao Luo, Fan Yang, Xin Li, Lang Nie, Chunyu Lin, Haoqiang Fan, Shuaicheng Liu
Moreover, for reliable motion analysis, we provide a new Gaussian-Guided Attention Module (GGAM), which not only inherits properties of the Gaussian distribution to naturally focus on the neighboring fields of each point, but also emphasizes contextually related regions during matching.
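A rough sketch of how a Gaussian spatial prior can be folded into attention: scores are biased toward spatially nearby points before the softmax. The function and coordinate handling below are assumptions made for illustration, not the paper's actual GGAM.

```python
import torch

def gaussian_guided_attention(q, k, v, coords, sigma=2.0):
    """Scaled dot-product attention with an additive Gaussian spatial prior:
    each point attends more strongly to spatially nearby points.
    q, k, v: (B, N, C); coords: (N, 2) grid positions."""
    scale = q.shape[-1] ** -0.5
    scores = torch.einsum("bnc,bmc->bnm", q, k) * scale
    dist2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)   # (N, N)
    scores = scores - dist2 / (2 * sigma ** 2)   # log-domain Gaussian bias
    attn = scores.softmax(dim=-1)
    return torch.einsum("bnm,bmc->bnc", attn, v)

B, H, W, C = 1, 8, 8, 32
coords = torch.stack(torch.meshgrid(torch.arange(H), torch.arange(W),
                                    indexing="ij"), dim=-1).reshape(-1, 2).float()
q = k = v = torch.randn(B, H * W, C)
out = gaussian_guided_attention(q, k, v, coords)
```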
no code implementations • 24 Aug 2023 • Ao Luo, Linxin Song, Keisuke Nonaka, Kyohei Unno, Heming Sun, Masayuki Goto, Jiro Katto
In recent years, the task of learned point cloud compression has gained prominence.
1 code implementation • 1 Jun 2023 • Hai Jiang, Ao Luo, Songchen Han, Haoqiang Fan, Shuaicheng Liu
Diffusion models have achieved promising results in image restoration tasks, yet suffer from long inference times, excessive computational resource consumption, and unstable restoration.
Ranked #2 on Low-Light Image Enhancement on LOLv2
no code implementations • ICCV 2023 • Xinglong Luo, Kunming Luo, Ao Luo, Zhengning Wang, Ping Tan, Shuaicheng Liu
Previous datasets are created either by capturing real scenes with event cameras or by synthesizing from images with pasted foreground objects.
1 code implementation • ICCV 2023 • Changxing Deng, Ao Luo, Haibin Huang, Shaodan Ma, Jiangyu Liu, Shuaicheng Liu
In this paper, we propose a novel framework for optical flow estimation that achieves a good balance between performance and efficiency.
1 code implementation • 22 Jul 2022 • Yunhui Han, Kunming Luo, Ao Luo, Jiangyu Liu, Haoqiang Fan, Guiming Luo, Shuaicheng Liu
Specifically, we first estimate optical flow between a pair of video frames, and then synthesize a new image from this pair based on the predicted flow.
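The standard way to realize this "synthesize a new image from the pair based on the predicted flow" step is backward warping with bilinear sampling; the self-contained sketch below assumes this common formulation rather than the paper's exact synthesis operator.

```python
import torch
import torch.nn.functional as F

def warp_with_flow(src, flow):
    """Backward-warp a source frame (B, C, H, W) with a flow field (B, 2, H, W)
    using bilinear sampling; a standard way to synthesize a new view from a
    frame pair plus predicted optical flow."""
    B, _, H, W = src.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().to(src.device)   # (2, H, W)
    coords = grid[None] + flow                                   # follow the flow
    # normalize to [-1, 1] for grid_sample
    coords_x = 2.0 * coords[:, 0] / (W - 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / (H - 1) - 1.0
    sample_grid = torch.stack((coords_x, coords_y), dim=-1)      # (B, H, W, 2)
    return F.grid_sample(src, sample_grid, align_corners=True)

frame1 = torch.rand(1, 3, 64, 64)
flow = torch.zeros(1, 2, 64, 64)   # zero flow reproduces frame1
synth = warp_with_flow(frame1, flow)
```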
no code implementations • 21 Jun 2022 • Ao Luo, Heming Sun, Jinming Liu, Jiro Katto
Learned Image Compression (LIC) has attracted growing attention in recent years.
1 code implementation • 1 Jun 2022 • Yan Zeng, Wangchunshu Zhou, Ao Luo, Ziming Cheng, Xinsong Zhang
To this end, the cross-view language modeling framework considers both multi-modal data (i.e., image-caption pairs) and multi-lingual data (i.e., parallel sentence pairs) as two different views of the same object, and trains the model to align the two views by maximizing the mutual information between them with conditional masked language modeling and contrastive learning.
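The contrastive ingredient of such cross-view alignment is commonly an InfoNCE-style objective over paired views in a batch; the generic sketch below illustrates only that ingredient, not the paper's full cross-view language modeling objective.

```python
import torch
import torch.nn.functional as F

def info_nce(view_a, view_b, temperature=0.07):
    """Symmetric InfoNCE-style contrastive loss that pulls paired views together
    and pushes apart mismatched pairs within a batch."""
    a = F.normalize(view_a, dim=-1)
    b = F.normalize(view_b, dim=-1)
    logits = a @ b.t() / temperature            # (B, B) similarity matrix
    targets = torch.arange(a.shape[0], device=a.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

# e.g. image embeddings and caption embeddings, or two languages of a sentence
img_emb = torch.randn(8, 256)
txt_emb = torch.randn(8, 256)
loss = info_nce(img_emb, txt_emb)
```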
1 code implementation • 8 Feb 2022 • Ao Luo, Fan Yang, Kunming Luo, Xin Li, Haoqiang Fan, Shuaicheng Liu
Our key idea is to decouple the context reasoning from the matching procedure, and exploit scene information to effectively assist motion estimation by learning to reason over the adaptive graph.
1 code implementation • CVPR 2022 • Ao Luo, Fan Yang, Xin Li, Shuaicheng Liu
Optical flow is a fundamental method used for quantitative motion estimation on the image plane.
1 code implementation • CVPR 2021 • Xin Li, Deng-Ping Fan, Fan Yang, Ao Luo, Hong Cheng, Zicheng Liu
We address this problem with a novel Probabilistic Model Distillation (PMD) approach, which transfers knowledge learned by a probabilistic teacher model on synthetic data to a static student model using unlabeled real image pairs.
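One simple way to distill a probabilistic teacher is to treat each teacher output as a Gaussian and train the student with the corresponding negative log-likelihood, so that high-uncertainty teacher predictions contribute less. The sketch below illustrates this idea under that assumption; it is not the paper's exact PMD loss.

```python
import torch

def distillation_loss(teacher_mean, teacher_logvar, student_pred):
    """Distill a probabilistic teacher into a deterministic student on unlabeled
    data: treat the teacher output as a per-pixel Gaussian and penalize the
    student with the Gaussian negative log-likelihood, down-weighting regions
    where the teacher is uncertain."""
    var = teacher_logvar.exp()
    nll = 0.5 * ((student_pred - teacher_mean) ** 2 / var + teacher_logvar)
    return nll.mean()

# dummy dense-prediction outputs on an unlabeled real image pair
teacher_mean = torch.randn(1, 2, 32, 32)      # e.g. predicted correspondence field
teacher_logvar = torch.randn(1, 2, 32, 32)
student_pred = torch.randn(1, 2, 32, 32, requires_grad=True)
loss = distillation_loss(teacher_mean, teacher_logvar, student_pred)
loss.backward()
```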
no code implementations • 8 Apr 2021 • Kunming Luo, Ao Luo, Chuan Wang, Haoqiang Fan, Shuaicheng Liu
Equipped with these two modules, our method achieves the best performance for unsupervised optical flow estimation on multiple leading benchmarks, including MPI-Sintel, KITTI 2012 and KITTI 2015.
1 code implementation • EMNLP 2021 • Xingyu Chen, Zihan Zhao, Lu Chen, Danyang Zhang, Jiabao Ji, Ao Luo, Yuxuan Xiong, Kai Yu
In this paper, we introduce the task of structural reading comprehension (SRC) on the web.
1 code implementation • ICCV 2021 • Fan Yang, Qiang Zhai, Xin Li, Rui Huang, Ao Luo, Hong Cheng, Deng-Ping Fan
Spotting objects that are visually adapted to their surroundings is challenging for both humans and AI.
1 code implementation • ECCV 2020 • Ao Luo, Xin Li, Fan Yang, Zhicheng Jiao, Hong Cheng, Siwei Lyu
Current works either simply distill prior knowledge from the corresponding depth map for handling the RGB image or blindly fuse color and geometric information to generate coarse depth-aware representations, hindering the performance of RGB-D saliency detectors. In this work, we introduce Cascade Graph Neural Networks (Cas-Gnn), a unified framework which is capable of comprehensively distilling and reasoning the mutual benefits between these two data sources through a set of cascade graphs, to learn powerful representations for RGB-D salient object detection.
Ranked #6 on RGB-D Salient Object Detection on NJU2K
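To make the cross-modal graph-reasoning idea more tangible, here is a toy round of message passing between RGB-derived and depth-derived node features; all module names and dimensions are invented for illustration and do not reproduce the Cas-Gnn architecture.

```python
import torch
import torch.nn as nn

class CrossModalReasoning(nn.Module):
    """Toy message passing between RGB-derived and depth-derived node features,
    illustrating graph-based mutual distillation between the two modalities.
    A single shared update cell and one round of messages keep the sketch small."""
    def __init__(self, dim=64):
        super().__init__()
        self.rgb_to_depth = nn.Linear(dim, dim)
        self.depth_to_rgb = nn.Linear(dim, dim)
        self.update = nn.GRUCell(dim, dim)

    def forward(self, rgb_nodes, depth_nodes):
        # rgb_nodes, depth_nodes: (N, dim) node features from each modality
        msg_to_rgb = torch.relu(self.depth_to_rgb(depth_nodes)).mean(0, keepdim=True)
        msg_to_depth = torch.relu(self.rgb_to_depth(rgb_nodes)).mean(0, keepdim=True)
        rgb_nodes = self.update(msg_to_rgb.expand_as(rgb_nodes), rgb_nodes)
        depth_nodes = self.update(msg_to_depth.expand_as(depth_nodes), depth_nodes)
        return rgb_nodes, depth_nodes

rgb_nodes = torch.randn(16, 64)
depth_nodes = torch.randn(16, 64)
rgb_out, depth_out = CrossModalReasoning()(rgb_nodes, depth_nodes)
```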
no code implementations • 22 Jul 2020 • Ao Luo, Ning Xie, Zhijia Tao, Feng Jiang
Short-form mobile video applications such as TikTok have become hugely popular all over the world.
no code implementations • 31 Jan 2020 • Ao Luo, Fan Yang, Xin Li, Dong Nie, Zhicheng Jiao, Shangchen Zhou, Hong Cheng
In this paper, we present a novel network structure called Hybrid Graph Neural Network (HyGnn), which aims to relieve this problem by interweaving multi-scale features for crowd density and its auxiliary task (localization), and performing joint reasoning over a graph.
no code implementations • 19 Oct 2019 • Yuezun Li, Ao Luo, Siwei Lyu
In this paper, we describe a fast and light-weight portrait segmentation method based on a new highly light-weight backbone (HLB) architecture.