Search Results for author: Ao Luo

Found 29 papers, 18 papers with code

Template Matters: Understanding the Role of Instruction Templates in Multimodal Language Model Evaluation and Training

1 code implementation 11 Dec 2024 Shijian Wang, Linxin Song, Jieyu Zhang, Ryotaro Shimizu, Ao Luo, Li Yao, Cunjian Chen, Julian McAuley, Hanqian Wu

Models tuned on our augmented dataset achieve the best overall performance when compared with MLMs of the same scale tuned on datasets up to 75 times the size of our augmented dataset, highlighting the importance of instruction templates in MLM training.

Language Modelling

Forensics Adapter: Adapting CLIP for Generalizable Face Forgery Detection

no code implementations 29 Nov 2024 Xinjie Cui, Yuezun Li, Ao Luo, Jiaran Zhou, Junyu Dong

We describe the Forensics Adapter, an adapter network designed to transform CLIP into an effective and generalizable face forgery detector.

Advancing Open-Set Domain Generalization Using Evidential Bi-Level Hardest Domain Scheduler

1 code implementation 26 Sep 2024 Kunyu Peng, Di Wen, Kailun Yang, Ao Luo, Yufan Chen, Jia Fu, M. Saquib Sarfraz, Alina Roitberg, Rainer Stiefelhagen

In this paper, we observe that an adaptive domain scheduler benefits OSDG more than pre-fixed sequential and random domain schedulers.

Data Augmentation, Domain Generalization, +1

FocusDiffuser: Perceiving Local Disparities for Camouflaged Object Detection

no code implementations 18 Jul 2024 Jianwei Zhao, Xin Li, Fan Yang, Qiang Zhai, Ao Luo, Zicheng Jiao, Hong Cheng

Detecting objects seamlessly blended into their surroundings represents a complex task for both human cognitive capabilities and advanced artificial intelligence algorithms.

Denoising, Object, +2

LightenDiffusion: Unsupervised Low-Light Image Enhancement with Latent-Retinex Diffusion Models

1 code implementation 12 Jul 2024 Hai Jiang, Ao Luo, Xiaohong Liu, Songchen Han, Shuaicheng Liu

In this paper, we propose a diffusion-based unsupervised framework that incorporates physically explainable Retinex theory with diffusion models for low-light image enhancement, named LightenDiffusion.

Low-Light Image Enhancement

Adaptive In-conversation Team Building for Language Model Agents

no code implementations 29 May 2024 Linxin Song, Jiale Liu, Jieyu Zhang, Shaokun Zhang, Ao Luo, Shijian Wang, Qingyun Wu, Chi Wang

Leveraging multiple large language model (LLM) agents has been shown to be a promising approach for tackling complex tasks, while the effective design of multiple agents for a particular application remains an art.

Diversity, Language Modelling, +2

RecDiffusion: Rectangling for Image Stitching with Diffusion Models

1 code implementation CVPR 2024 Tianhao Zhou, Haipeng Li, Ziyi Wang, Ao Luo, Chen-Lin Zhang, Jiajun Li, Bing Zeng, Shuaicheng Liu

Image stitching from different captures often results in non-rectangular boundaries, which are generally considered unappealing.

Image Stitching

Better Explain Transformers by Illuminating Important Information

1 code implementation 18 Jan 2024 Linxin Song, Yan Cui, Ao Luo, Freddy Lecue, Irene Li

Transformer-based models excel in various natural language processing (NLP) tasks, attracting countless efforts to explain their inner workings.

Question Answering

FlowDiffuser: Advancing Optical Flow Estimation with Diffusion Models

1 code implementation CVPR 2024 Ao Luo, Xin Li, Fan Yang, Jiangyu Liu, Haoqiang Fan, Shuaicheng Liu

To optimize accuracy and efficiency, our FlowDiffuser incorporates a novel Conditional Recurrent Denoising Decoder (Conditional-RDD), streamlining the flow estimation process.

Decoder, Denoising, +1

Efficient Meshflow and Optical Flow Estimation from Event Cameras

1 code implementation CVPR 2024 Xinglong Luo, Ao Luo, Zhengning Wang, Chunyu Lin, Bing Zeng, Shuaicheng Liu

In this paper, we explore the problem of event-based meshflow estimation, a novel task that involves predicting a spatially smooth sparse motion field from event cameras.

Decoder, Optical Flow Estimation

GAFlow: Incorporating Gaussian Attention into Optical Flow

1 code implementation ICCV 2023 Ao Luo, Fan Yang, Xin Li, Lang Nie, Chunyu Lin, Haoqiang Fan, Shuaicheng Liu

Moreover, for reliable motion analysis, we provide a new Gaussian-Guided Attention Module (GGAM) which not only inherits properties from Gaussian distribution to instinctively revolve around the neighbor fields of each point but also is empowered to put the emphasis on contextually related regions during matching.

Optical Flow Estimation, Representation Learning

SCP: Spherical-Coordinate-based Learned Point Cloud Compression

no code implementations 24 Aug 2023 Ao Luo, Linxin Song, Keisuke Nonaka, Kyohei Unno, Heming Sun, Masayuki Goto, Jiro Katto

In recent years, the task of learned point cloud compression has gained prominence.

Low-Light Image Enhancement with Wavelet-based Diffusion Models

1 code implementation 1 Jun 2023 Hai Jiang, Ao Luo, Songchen Han, Haoqiang Fan, Shuaicheng Liu

Diffusion models have achieved promising results in image restoration tasks, yet they are time-consuming, consume excessive computational resources, and produce unstable restorations.

Denoising, Face Detection, +2

Learning Optical Flow from Event Camera with Rendered Dataset

no code implementations ICCV 2023 Xinglong Luo, Kunming Luo, Ao Luo, Zhengning Wang, Ping Tan, Shuaicheng Liu

Previous datasets are created by either capturing real scenes by event cameras or synthesizing from images with pasted foreground objects.

Optical Flow Estimation

Explicit Motion Disentangling for Efficient Optical Flow Estimation

1 code implementation ICCV 2023 Changxing Deng, Ao Luo, Haibin Huang, Shaodan Ma, Jiangyu Liu, Shuaicheng Liu

In this paper, we propose a novel framework for optical flow estimation that achieves a good balance between performance and efficiency.

Decoder, Motion Estimation, +1

RealFlow: EM-based Realistic Optical Flow Dataset Generation from Videos

1 code implementation 22 Jul 2022 Yunhui Han, Kunming Luo, Ao Luo, Jiangyu Liu, Haoqiang Fan, Guiming Luo, Shuaicheng Liu

Specifically, we first estimate optical flow between a pair of video frames, and then synthesize a new image from this pair based on the predicted flow.

Image Generation, Optical Flow Estimation

Memory-Efficient Learned Image Compression with Pruned Hyperprior Module

no code implementations 21 Jun 2022 Ao Luo, Heming Sun, Jinming Liu, Jiro Katto

Learned Image Compression (LIC) has become increasingly popular in recent years.

Image Compression

Cross-View Language Modeling: Towards Unified Cross-Lingual Cross-Modal Pre-training

1 code implementation 1 Jun 2022 Yan Zeng, Wangchunshu Zhou, Ao Luo, Ziming Cheng, Xinsong Zhang

To this end, the cross-view language modeling framework considers both multi-modal data (i.e., image-caption pairs) and multi-lingual data (i.e., parallel sentence pairs) as two different views of the same object, and trains the model to align the two views by maximizing the mutual information between them with conditional masked language modeling and contrastive learning.

Contrastive Learning, Image-text Retrieval, +9

Learning Optical Flow with Adaptive Graph Reasoning

1 code implementation 8 Feb 2022 Ao Luo, Fan Yang, Kunming Luo, Xin Li, Haoqiang Fan, Shuaicheng Liu

Our key idea is to decouple the context reasoning from the matching procedure, and exploit scene information to effectively assist motion estimation by learning to reason over the adaptive graph.

Motion Estimation, Optical Flow Estimation, +1

Probabilistic Model Distillation for Semantic Correspondence

1 code implementation CVPR 2021 Xin Li, Deng-Ping Fan, Fan Yang, Ao Luo, Hong Cheng, Zicheng Liu

We address this problem with the use of a novel Probabilistic Model Distillation (PMD) approach which transfers knowledge learned by a probabilistic teacher model on synthetic data to a static student model with the use of unlabeled real image pairs.

Representation Learning, Semantic correspondence

ASFlow: Unsupervised Optical Flow Learning with Adaptive Pyramid Sampling

no code implementations 8 Apr 2021 Kunming Luo, Ao Luo, Chuan Wang, Haoqiang Fan, Shuaicheng Liu

Equipped with these two modules, our method achieves the best performance for unsupervised optical flow estimation on multiple leading benchmarks, including MPI-Sintel, KITTI 2012 and KITTI 2015.

Optical Flow Estimation

Cascade Graph Neural Networks for RGB-D Salient Object Detection

1 code implementation ECCV 2020 Ao Luo, Xin Li, Fan Yang, Zhicheng Jiao, Hong Cheng, Siwei Lyu

Current works either simply distill prior knowledge from the corresponding depth map for handling the RGB-image or blindly fuse color and geometric information to generate the coarse depth-aware representations, hindering the performance of RGB-D saliency detectors. In this work, we introduce Cascade Graph Neural Networks (Cas-Gnn), a unified framework which is capable of comprehensively distilling and reasoning the mutual benefits between these two data sources through a set of cascade graphs, to learn powerful representations for RGB-D salient object detection.

Object, object-detection, +3

Deep-VFX: Deep Action Recognition Driven VFX for Short Video

no code implementations 22 Jul 2020 Ao Luo, Ning Xie, Zhijia Tao, Feng Jiang

Short-form mobile video applications, such as TikTok, are popular all over the world.

Action Recognition, Template Matching

Hybrid Graph Neural Networks for Crowd Counting

no code implementations 31 Jan 2020 Ao Luo, Fan Yang, Xin Li, Dong Nie, Zhicheng Jiao, Shangchen Zhou, Hong Cheng

In this paper, we present a novel network structure called Hybrid Graph Neural Network (HyGnn), which aims to relieve the problem by interweaving the multi-scale features for crowd density and its auxiliary task (localization) and performing joint reasoning over a graph.

Crowd Counting, Graph Neural Network

Fast Portrait Segmentation with Highly Light-weight Network

no code implementations 19 Oct 2019 Yuezun Li, Ao Luo, Siwei Lyu

In this paper, we describe a fast and light-weight portrait segmentation method based on a new highly light-weight backbone (HLB) architecture.

Portrait Segmentation, Segmentation
