ForumSum: A Multi-Speaker Conversation Summarization Dataset

We also show that using a conversational corpus for pre-training improves the quality of the chat summarization model.

Abstractive Text Summarization

PixelLM: Pixel Reasoning with Large Multimodal Model

no code implementations 4 Dec 2023 Zhongwei Ren, Zhicheng Huang, Yunchao Wei, Yao Zhao, Dongmei Fu, Jiashi Feng, Xiaojie Jin

PixelLM excels across various pixel-level image reasoning and understanding tasks, outperforming well-established methods in multiple benchmarks, including MUSE, single- and multi-referring segmentation.


On What Basis? Predicting Text Preference Via Structured Comparative Reasoning

Comparative reasoning plays a crucial role in text preference prediction; however, large language models (LLMs) often demonstrate inconsistencies in their reasoning.


Unleashing the potential of GNNs via Bi-directional Knowledge Transfer

no code implementations 26 Oct 2023 Shuai Zheng, Zhizhe Liu, Zhenfeng Zhu, Xingxing Zhang, JianXin Li, Yao Zhao

On this basis, BiKT not only allows us to acquire knowledge from both the GNN and its derived model but promotes each other by injecting the knowledge into the other.

Domain Adaptation Representation Learning +1

WeatherDepth: Curriculum Contrastive Learning for Self-Supervised Depth Estimation under Adverse Weather Conditions

no code implementations 9 Oct 2023 Jiyuan Wang, Chunyu Lin, Lang Nie, Shujun Huang, Yao Zhao, Xing Pan, Rui Ai

Concretely, we first present a progressive curriculum learning scheme with three simple-to-complex curricula to gradually adapt the model from clear to relatively adverse, and then to adverse weather scenes.

Contrastive Learning Depth Estimation +1

Learning Mask-aware CLIP Representations for Zero-Shot Segmentation

1 code implementation NeurIPS 2023 Siyu Jiao, Yunchao Wei, YaoWei Wang, Yao Zhao, Humphrey Shi

However, in the paper, we reveal that CLIP is insensitive to different mask proposals and tends to produce similar predictions for various mask proposals of the same image.

Zero Shot Segmentation

IBVC: Interpolation-driven B-frame Video Compression

no code implementations 25 Sep 2023 Meiqin Liu, Chenming Xu, Chao Yao, Weisi Lin, Yao Zhao

Learned B-frame video compression aims to adopt bi-directional motion estimation and motion compensation (MEMC) coding for middle frame reconstruction.

Motion Compensation Motion Estimation +4

Unified Frequency-Assisted Transformer Framework for Detecting and Grounding Multi-Modal Manipulation

no code implementations 18 Sep 2023 Huan Liu, Zichang Tan, Qiang Chen, Yunchao Wei, Yao Zhao, Jingdong Wang

Moreover, to address the semantic conflicts between image and frequency domains, the forgery-aware mutual module is developed to further enable the effective interaction of disparate image and frequency features, resulting in aligned and comprehensive visual forgery representations.


Statistical Rejection Sampling Improves Preference Optimization

DPO's lack of a reward model constrains its ability to sample preference pairs from the optimal policy, and SLiC is restricted to sampling preference pairs only from the SFT policy.

Language Modelling Large Language Model

SDDNet: Style-guided Dual-layer Disentanglement Network for Shadow Detection

1 code implementation 17 Aug 2023 Runmin Cong, Yuchen Guan, Jinpeng Chen, Wei Zhang, Yao Zhao, Sam Kwong

Despite significant progress in shadow detection, current methods still struggle with the adverse impact of background color, which may lead to errors when shadows are present on complex backgrounds.

Disentanglement Shadow Detection

Frequency Perception Network for Camouflaged Object Detection

Camouflaged object detection (COD) aims to accurately detect objects hidden in the surrounding environment.

object-detection Object Detection

Group Pose: A Simple Baseline for End-to-End Multi-person Pose Estimation

2 code implementations ICCV 2023 Huan Liu, Qiang Chen, Zichang Tan, Jiang-Jiang Liu, Jian Wang, Xiangbo Su, Xiaolong Li, Kun Yao, Junyu Han, Errui Ding, Yao Zhao, Jingdong Wang

State-of-the-art solutions adopt the DETR-like framework, and mainly develop the complex decoder, e.g., regarding pose estimation as keypoint box detection and combining with human detection in ED-Pose, hierarchically predicting with pose decoder and joint (keypoint) decoder in PETR.

Human Detection Multi-Person Pose Estimation

CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation

1 code implementation 14 Aug 2023 Hongguang Zhu, Yunchao Wei, Xiaodan Liang, Chunjie Zhang, Yao Zhao

Regarding the growing nature of real-world data, such an offline training paradigm on ever-expanding data is unsustainable, because models lack the continual learning ability to accumulate knowledge constantly.

Continual Learning Continual Pretraining

CLE Diffusion: Controllable Light Enhancement Diffusion Model

Low light enhancement has gained increasing importance with the rapid development of visual creation and editing.

You Can Mask More For Extremely Low-Bitrate Image Compression

Extensive experiments have demonstrated that our approach outperforms recent state-of-the-art methods in R-D performance, visual quality, and downstream applications, at very low bitrates.

Image Compression

Exploring Resolution Fields for Scalable Image Compression with Uncertainty Guidance

1 code implementation 15 Jun 2023 Dongyi Zhang, Feng Li, Man Liu, Runmin Cong, Huihui Bai, Meng Wang, Yao Zhao

In this work, we explore the potential of resolution fields in scalable image compression and propose the reciprocal pyramid network (RPN) that fulfills the need for more adaptable and versatile compression.

Image Compression

NPVForensics: Jointing Non-critical Phonemes and Visemes for Deepfake Detection

no code implementations 12 Jun 2023 Yu Chen, Yang Yu, Rongrong Ni, Yao Zhao, Haoliang Li

Next, we design a phoneme-viseme awareness module for cross-modal feature fusion and representation alignment, so that the modality gap can be reduced and the intrinsic complementarity of the two modalities can be better explored.

DeepFake Detection Face Swapping

SLiC-HF: Sequence Likelihood Calibration with Human Feedback

no code implementations 17 May 2023 Yao Zhao, Rishabh Joshi, Tianqi Liu, Misha Khalman, Mohammad Saleh, Peter J. Liu

Past work has often relied on Reinforcement Learning from Human Feedback (RLHF), which optimizes the language model using reward scores assigned from a reward model trained on human preference data.

Language Modelling Offline RL

Learning Robust Deep Equilibrium Models

no code implementations 25 Apr 2023 Haoyu Chu, Shikui Wei, Ting Liu, Yao Zhao

Deep equilibrium (DEQ) models have emerged as a promising class of implicit layer models in deep learning, which abandon traditional depth by solving for the fixed points of a single nonlinear layer.

Adversarial Defense Adversarial Robustness

Progressive Semantic-Visual Mutual Adaption for Generalized Zero-Shot Learning

1 code implementation CVPR 2023 Man Liu, Feng Li, Chunjie Zhang, Yunchao Wei, Huihui Bai, Yao Zhao

Generalized Zero-Shot Learning (GZSL) identifies unseen categories by knowledge transferred from the seen domain, relying on the intrinsic interactions between visual and semantic information.

Generalized Zero-Shot Learning

Deep Learning for Camera Calibration and Beyond: A Survey

In this paper, we provide a comprehensive survey of learning-based camera calibration techniques, by analyzing their strengths and limitations.

Camera Calibration

Global Knowledge Calibration for Fast Open-Vocabulary Segmentation

Recent advancements in pre-trained vision-language models, such as CLIP, have enabled the segmentation of arbitrary concepts solely from textual inputs, a process commonly referred to as open-vocabulary semantic segmentation (OVS).

Knowledge Distillation Open Vocabulary Semantic Segmentation +4

SigVIC: Spatial Importance Guided Variable-Rate Image Compression

no code implementations 16 Mar 2023 Jiaming Liang, Meiqin Liu, Chao Yao, Chunyu Lin, Yao Zhao

Variable-rate mechanism has improved the flexibility and efficiency of learning-based image compression that trains multiple models for different rate-distortion tradeoffs.

Image Compression

Disentangling Orthogonal Planes for Indoor Panoramic Room Layout Estimation with Cross-Scale Distortion Awareness

Based on the Manhattan World assumption, most existing indoor layout estimation schemes focus on recovering layouts from vertically compressed 1D sequences.

Room Layout Estimation

Unsupervised OmniMVS: Efficient Omnidirectional Depth Inference via Establishing Pseudo-Stereo Supervision

In this paper, we propose the first unsupervised omnidirectional MVS framework based on multiple fisheye images.

Parallax-Tolerant Unsupervised Deep Image Stitching

1 code implementation ICCV 2023 Lang Nie, Chunyu Lin, Kang Liao, Shuaicheng Liu, Yao Zhao

First, we propose a robust and flexible warp to model the image registration from global homography to local thin-plate spline motion.

Image Registration Image Stitching

Spatiotemporal Deformation Perception for Fisheye Video Rectification

1 code implementation 8 Feb 2023 Shangrong Yang, Chunyu Lin, Kang Liao, Yao Zhao

Subsequently, we observe that the inter-frame optical flow of the video is facilitated to perceive the local spatial deformation of the fisheye video.

Optical Flow Estimation

Dual Diffusion Architecture for Fisheye Image Rectification: Synthetic-to-Real Generalization

no code implementations 26 Jan 2023 Shangrong Yang, Chunyu Lin, Kang Liao, Yao Zhao

To this end, we propose a Dual Diffusion Architecture (DDA) for the fisheye rectification with a better generalization ability.


RecRecNet: Rectangling Rectified Wide-Angle Images by Thin-Plate Spline Model and DoF-based Curriculum Learning

1 code implementation ICCV 2023 Kang Liao, Lang Nie, Chunyu Lin, Zishuo Zheng, Yao Zhao

In this work, we explore constructing a win-win representation on both content and boundary by contributing a new learning model, i.e., Rectangling Rectification Network (RecRecNet).

Innovating Real Fisheye Image Correction with Dual Diffusion Architecture

Fisheye image rectification is hindered by synthetic models producing poor results for real-world correction.


Learning To Segment Every Referring Object Point by Point

1 code implementation CVPR 2023 Mengxue Qu, Yu Wu, Yunchao Wei, Wu Liu, Xiaodan Liang, Yao Zhao

Extensive experiments show that our model achieves 52.06% in terms of accuracy (versus 58.93% in fully supervised setting) on RefCOCO+@testA, when only using 1% of the mask annotations.

Referring Expression Referring Expression Segmentation

An In-Depth Exploration of Person Re-Identification and Gait Recognition in Cloth-Changing Conditions

1 code implementation CVPR 2023 Weijia Li, Saihui Hou, Chunjie Zhang, Chunshui Cao, Xu Liu, Yongzhen Huang, Yao Zhao

For the cloth-changing problem, video-based ReID is rarely studied due to the lack of a suitable cloth-changing benchmark, and gait recognition is often researched under controlled conditions.

Gait Recognition Person Re-Identification

CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation

1 code implementation ICCV 2023 Hongguang Zhu, Yunchao Wei, Xiaodan Liang, Chunjie Zhang, Yao Zhao

Regarding the growing nature of real-world data, such an offline training paradigm on ever-expanding data is unsustainable, because models lack the continual learning ability to accumulate knowledge constantly.

Continual Learning Continual Pretraining

Improving the Robustness of Summarization Models by Detecting and Removing Input Noise

We present a large empirical study quantifying the sometimes severe loss in performance (up to 12 ROUGE-1 points) from different types of input noise for a range of datasets and model sizes.

Abstractive Text Summarization

Fully and Weakly Supervised Referring Expression Segmentation with End-to-End Learning

no code implementations 17 Dec 2022 Hui Li, MingJie Sun, Jimin Xiao, Eng Gee Lim, Yao Zhao

To validate our framework on a weakly-supervised setting, we annotated three RES benchmark datasets (RefCOCO, RefCOCO+ and RefCOCOg) with click annotations. Our method is simple but surprisingly effective, outperforming all previous state-of-the-art RES methods on fully- and weakly-supervised settings by a large margin.

Referring Expression Referring Expression Segmentation +2

Node-oriented Spectral Filtering for Graph Neural Networks

no code implementations 7 Dec 2022 Shuai Zheng, Zhenfeng Zhu, Zhizhe Liu, Youru Li, Yao Zhao

Graph neural networks (GNNs) have shown remarkable performance on homophilic graph data while being far less impressive when handling non-homophilic graph data due to the inherent low-pass filtering property of GNNs.

Mask Matching Transformer for Few-Shot Segmentation

Typical methods follow the paradigm to firstly learn prototypical features from support images and then match query features in pixel-level to obtain segmentation results.

Few-Shot Semantic Segmentation Segmentation

Bridging Component Learning with Degradation Modelling for Blind Image Super-Resolution

1 code implementation 3 Dec 2022 Yixuan Wu, Feng Li, Huihui Bai, Weisi Lin, Runmin Cong, Yao Zhao

In this paper, we analyze the degradation of a high-resolution (HR) image from image intrinsic components according to a degradation-based formulation model.

Image Super-Resolution

HGV4Risk: Hierarchical Global View-guided Sequence Representation Learning for Risk Prediction

1 code implementation 15 Nov 2022 Youru Li, Zhenfeng Zhu, Xiaobo Guo, Shaoshuai Li, Yuchen Yang, Yao Zhao

Moreover, the hierarchical representations at both instance level and channel level can be coordinated by the heterogeneous information aggregation under the guidance of global view.

Graph Embedding Representation Learning +1

FF2: A Feature Fusion Two-Stream Framework for Punctuation Restoration

no code implementations 9 Nov 2022 Yangjun Wu, Kebin Fang, Yao Zhao, Hao Zhang, Lifeng Shi, Mengqi Zhang

To accomplish punctuation restoration, most existing methods focus on introducing extra information (e.g., part-of-speech) or addressing the class imbalance problem.

Language Modelling Punctuation Restoration +1

Temporal Consistency Learning of inter-frames for Video Super-Resolution

A spatio-temporal stability module is designed to learn the self-alignment from inter-frames.

Video Super-Resolution

Revisiting Simple Regret: Fast Rates for Returning a Good Arm

no code implementations 30 Oct 2022 Yao Zhao, Connor James Stephens, Csaba Szepesvári, Kwang-Sung Jun

Simple regret is a natural and parameter-free performance criterion for pure exploration in multi-armed bandits yet is less popular than the probability of missing the best arm or an $\epsilon$-good arm, perhaps due to lack of easy ways to characterize it.

Multi-Armed Bandits
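
For context on the entry above, simple regret is commonly defined as the gap between the best arm's mean reward and the mean reward of the arm the algorithm returns. The notation below ($K$ arms with means $\mu_i$, and $J_T$ for the arm returned after $T$ pulls) is an assumption chosen for illustration and is not taken from the listing:

```latex
\mathrm{SR}_T \;=\; \mu^{*} - \mathbb{E}\left[\mu_{J_T}\right],
\qquad \mu^{*} = \max_{1 \le i \le K} \mu_i .
```

Under the same notation, an $\epsilon$-good arm is any arm $i$ with $\mu^{*} - \mu_i \le \epsilon$.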

PSNet: Parallel Symmetric Network for Video Salient Object Detection

no code implementations 12 Oct 2022 Runmin Cong, Weiyu Song, Jianjun Lei, Guanghui Yue, Yao Zhao, Sam Kwong

Finally, we use the Importance Perception Fusion (IPF) module to fuse the features from two parallel branches according to their different importance in different scenarios.

object-detection Optical Flow Estimation +3

Does Thermal Really Always Matter for RGB-T Salient Object Detection?

In addition, considering the role of thermal modality, we set up different cross-modality interaction mechanisms in the encoding phase and the decoding phase.

object-detection Object Detection +2

CIR-Net: Cross-modality Interaction and Refinement for RGB-D Salient Object Detection

3 code implementations 6 Oct 2022 Runmin Cong, Qinwei Lin, Chen Zhang, Chongyi Li, Xiaochun Cao, Qingming Huang, Yao Zhao

Focusing on the issue of how to effectively capture and utilize cross-modality information in RGB-D salient object detection (SOD) task, we present a convolutional neural network (CNN) model, named CIR-Net, based on the novel cross-modality interaction and refinement.

object-detection RGB-D Salient Object Detection +1

Out-of-Distribution Detection and Selective Generation for Conditional Language Models

Furthermore, the space of potential low-quality outputs is larger as arbitrary text can be generated and it is important to know when to trust the generated output.

Abstractive Text Summarization Out-of-Distribution Detection +1

A Weakly Supervised Learning Framework for Salient Object Detection via Hybrid Labels

3 code implementations 7 Sep 2022 Runmin Cong, Qi Qin, Chen Zhang, Qiuping Jiang, Shiqi Wang, Yao Zhao, Sam Kwong

In this paper, we focus on a new weakly-supervised SOD task under hybrid labels, where the supervision labels include a large number of coarse labels generated by the traditional unsupervised method and a small number of real labels.

object-detection RGB Salient Object Detection +3

Boundary Guided Semantic Learning for Real-time COVID-19 Lung Infection Segmentation System

1 code implementation 7 Sep 2022 Runmin Cong, Yumo Zhang, Ning Yang, Haisheng Li, Xueqi Zhang, Ruochen Li, Zewen Chen, Yao Zhao, Sam Kwong

The coronavirus disease 2019 (COVID-19) continues to have a negative impact on healthcare systems around the world, though the vaccines have been developed and national vaccination coverage rate is steadily increasing.

HVS-Inspired Signal Degradation Network for Just Noticeable Difference Estimation

1 code implementation 16 Aug 2022 Jian Jin, Yuan Xue, Xingxing Zhang, Lili Meng, Yao Zhao, Weisi Lin

However, they have a major drawback that the generated JND is assessed in the real-world signal domain instead of in the perceptual domain in the human brain.

Investigating Efficiently Extending Transformers for Long Input Summarization

1 code implementation 8 Aug 2022 Jason Phang, Yao Zhao, Peter J. Liu

While large pretrained Transformer models have proven highly capable at tackling natural language tasks, handling long sequence inputs continues to be a significant challenge.

Ranked #2 on Long-range modeling on SCROLLS (GovRep metric)

Long-range modeling Text Summarization

Neural Contourlet Network for Monocular 360 Depth Estimation

For a monocular 360 image, depth estimation is challenging because the distortion increases along the latitude.

Depth Estimation

SMART: Sentences as Basic Units for Text Evaluation

no code implementations 1 Aug 2022 Reinald Kim Amplayo, Peter J. Liu, Yao Zhao, Shashi Narayan

Specifically, we treat sentences as basic units of matching instead of tokens, and use a sentence matching function to soft-match candidate and reference sentences.

Text Generation
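
The sentence-level soft matching described in the entry above can be illustrated with a small sketch. This is a simplified illustration, not the metric defined in the paper: the matcher `token_f1` and the aggregation in `sentence_soft_match` are hypothetical names and choices introduced here, and the paper's actual matching functions and scoring may differ.

```python
from typing import Callable, List


def token_f1(a: str, b: str) -> float:
    """Hypothetical sentence matcher: F1 over unique lowercase tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    overlap = len(ta & tb)
    if overlap == 0:
        return 0.0
    precision = overlap / len(ta)
    recall = overlap / len(tb)
    return 2 * precision * recall / (precision + recall)


def sentence_soft_match(
    candidate: List[str],
    reference: List[str],
    match: Callable[[str, str], float] = token_f1,
) -> float:
    """F-measure in which sentences, not tokens, are the units being matched."""
    if not candidate or not reference:
        return 0.0
    # Precision: each candidate sentence is scored against its best reference sentence.
    precision = sum(max(match(c, r) for r in reference) for c in candidate) / len(candidate)
    # Recall: each reference sentence is scored against its best candidate sentence.
    recall = sum(max(match(r, c) for c in candidate) for r in reference) / len(reference)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because the matcher returns a score in [0, 1] rather than a hard yes/no, a candidate sentence that only partly overlaps a reference sentence still earns partial credit.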

SiRi: A Simple Selective Retraining Mechanism for Transformer-based Visual Grounding

1 code implementation 27 Jul 2022 Mengxue Qu, Yu Wu, Wu Liu, Qiqi Gong, Xiaodan Liang, Olga Russakovsky, Yao Zhao, Yunchao Wei

Particularly, SiRi conveys a significant principle to the research of visual grounding, i.e., a better initialized vision-language encoder would help the model converge to a better local minimum, advancing the performance accordingly.

Visual Grounding

BCS-Net: Boundary, Context and Semantic for Automatic COVID-19 Lung Infection Segmentation from CT Images

The spread of COVID-19 has brought a huge disaster to the world, and the automatic segmentation of infection regions can help doctors to make diagnosis quickly and reduce workload.


Deep Rotation Correction without Angle Prior

To this end, we leverage a neural network to predict the optical flows that can warp the tilted images to be perceptually horizontal.

Optical Flow Estimation

FishFormer: Annulus Slicing-based Transformer for Fisheye Rectification with Efficacy Domain Exploration

no code implementations 5 Jul 2022 Shangrong Yang, Chunyu Lin, Kang Liao, Yao Zhao

To leverage these two characteristics, we introduced FishFormer that processes the fisheye image as a sequence to enhance global and local perception.

FisheyeEX: Polar Outpainting for Extending the FoV of Fisheye Lens

1 code implementation 12 Jun 2022 Kang Liao, Chunyu Lin, Yunchao Wei, Yao Zhao

For the distortion synthesis, we propose a spiral distortion-aware perception module, in which the learning path keeps consistent with the distortion prior of the fisheye image.

Image Outpainting

JNMR: Joint Non-linear Motion Regression for Video Frame Interpolation

1 code implementation 9 Jun 2022 Meiqin Liu, Chenming Xu, Chao Yao, Chunyu Lin, Yao Zhao

Video frame interpolation (VFI) aims to generate predictive frames by warping learnable motions from the bidirectional historical references.

Motion Estimation regression +1

TALM: Tool Augmented Language Models

no code implementations 24 May 2022 Aaron Parisi, Yao Zhao, Noah Fiedel

Transformer based language models (LMs) demonstrate increasing performance with scale across a wide variety of tasks.


Global-and-Local Collaborative Learning for Co-Salient Object Detection

2 code implementations 19 Apr 2022 Runmin Cong, Ning Yang, Chongyi Li, Huazhu Fu, Yao Zhao, Qingming Huang, Sam Kwong

In this paper, we propose a global-and-local collaborative learning architecture, which includes a global correspondence modeling (GCM) and a local correspondence modeling (LCM) to capture comprehensive inter-image corresponding relationship among different images from the global and local perspectives.

Co-Salient Object Detection object-detection +1

Cylin-Painting: Seamless 360° Panoramic Image Outpainting and Beyond with Cylinder-Style Convolutions

1 code implementation 18 Apr 2022 Kang Liao, Xiangyu Xu, Chunyu Lin, Wenqi Ren, Yunchao Wei, Yao Zhao

Motivated by this analysis, we present a Cylin-Painting framework that involves meaningful collaborations between inpainting and outpainting and efficiently fuses the different arrangements, with a view to leveraging their complementary benefits on a consistent and seamless cylinder.

Depth Estimation Image Outpainting +3

Towards Reliable Image Outpainting: Learning Structure-Aware Multimodal Fusion with Depth Guidance

no code implementations 12 Apr 2022 Lei Zhang, Kang Liao, Chunyu Lin, Yao Zhao

Concretely, we propose a Depth-Guided Outpainting Network to model different feature representations of two modalities and learn the structure-aware cross-modal fusion.

Image Outpainting

A Well-Composed Text is Half Done! Composition Sampling for Diverse Conditional Generation

We propose Composition Sampling, a simple but effective method to generate diverse outputs for conditional generation of higher quality compared to previous stochastic decoding strategies.

Question Generation Question-Generation

A Context-Aware Feature Fusion Framework for Punctuation Restoration

1 code implementation 23 Mar 2022 Yangjun Wu, Kebin Fang, Yao Zhao

To accomplish the punctuation restoration task, most existing approaches focused on leveraging extra information (e.g., part-of-speech tags) or addressing the class imbalance problem.

Punctuation Restoration

PanoFormer: Panorama Transformer for Indoor 360 Depth Estimation

In particular, we divide patches on the spherical tangent domain into tokens to reduce the negative effect of panoramic distortions.

Depth Estimation Semantic Segmentation

Multi-modal Graph Learning for Disease Prediction

1 code implementation 11 Mar 2022 Shuai Zheng, Zhenfeng Zhu, Zhizhe Liu, Zhenyu Guo, Yang Liu, Yuchen Yang, Yao Zhao

For disease prediction tasks, most existing graph-based methods tend to define the graph manually based on a specified modality (e.g., demographic information), and then integrate other modalities to obtain the patient representation by Graph Representation Learning (GRL).

Disease Prediction Graph Learning +1

Improving Neural ODEs via Knowledge Distillation

no code implementations 10 Mar 2022 Haoyu Chu, Shikui Wei, Qiming Lu, Yao Zhao

We propose a new training based on knowledge distillation to construct more powerful and robust Neural ODEs fitting image recognition tasks.

Knowledge Distillation

Deep Rectangling for Image Stitching: A Learning Baseline

In this paper, we address these issues by proposing the first deep learning solution to image rectangling.

Image Stitching

ACTIVE: Augmentation-Free Graph Contrastive Learning for Partial Multi-View Clustering

no code implementations 1 Mar 2022 Yiming Wang, Dongxia Chang, Zhiqiang Fu, Jie Wen, Yao Zhao

In this paper, we propose an augmentation-free graph contrastive learning framework, namely ACTIVE, to solve the problem of partial multi-view clustering.

Clustering Contrastive Learning

Toward a More Populous Online Platform: The Economic Impacts of Compensated Reviews

no code implementations 26 Jan 2022 Peng Li, Arim Park, Soohyun Cho, Yao Zhao

In this paper, we study the effect of compensated reviews on non-compensated reviews by utilizing online reviews on 1,240 auto shipping companies over a ten-year period from a transportation website.

text-classification Text Classification

Trustworthy Knowledge Graph Completion Based on Multi-sourced Noisy Data

In this paper, we propose a new trustworthy method that exploits facts for a KG based on multi-sourced noisy data and existing facts in the KG.

Auto-Weighted Layer Representation Based View Synthesis Distortion Estimation for 3-D Video Coding

Experimental results show that the VSD can be accurately estimated with the weights learnt by the nonlinear mapping function once its associated S-VSDs are available.

RRNet: Relational Reasoning Network with Parallel Multi-scale Attention for Salient Object Detection in Optical Remote Sensing Images

2 code implementations 27 Oct 2021 Runmin Cong, Yumo Zhang, Leyuan Fang, Jun Li, Yao Zhao, Sam Kwong

Salient object detection (SOD) for optical remote sensing images (RSIs) aims at locating and extracting visually distinctive objects/regions from the optical RSIs.

object-detection Object Detection +2

Cross-modality Discrepant Interaction Network for RGB-D Salient Object Detection

1 code implementation 4 Aug 2021 Chen Zhang, Runmin Cong, Qinwei Lin, Lin Ma, Feng Li, Yao Zhao, Sam Kwong

For the cross-modality interaction in feature encoder, existing methods either indiscriminately treat RGB and depth modalities, or only habitually utilize depth cues as auxiliary information of the RGB branch.

object-detection RGB-D Salient Object Detection +1

Dynamic Feature Regularized Loss for Weakly Supervised Semantic Segmentation

no code implementations 3 Aug 2021 Bingfeng Zhang, Jimin Xiao, Yao Zhao

In this paper, we propose a new regularized loss which utilizes both shallow and deep features that are dynamically updated in order to aggregate sufficient information to represent the relationship of different pixels.

Weakly supervised Semantic Segmentation Weakly-Supervised Semantic Segmentation

BridgeNet: A Joint Learning Network of Depth Map Super-Resolution and Monocular Depth Estimation

The other is the content guidance bridge (CGBdg) designed for the depth map reconstruction process, which provides the content guidance learned from DSR task for MDE task.

Depth Map Super-Resolution Monocular Depth Estimation +1

Depth-Aware Multi-Grid Deep Homography Estimation with Contextual Correlation

1 code implementation 6 Jul 2021 Lang Nie, Chunyu Lin, Kang Liao, Shuaicheng Liu, Yao Zhao

Homography estimation is an important task in computer vision applications, such as image stitching, video stabilization, and camera calibration.

Camera Calibration Homography Estimation +2

Double Low-Rank Representation With Projection Distance Penalty for Clustering

no code implementations CVPR 2021 Zhiqiang Fu, Yao Zhao, Dongxia Chang, Xingxing Zhang, Yiming Wang

This paper presents a novel, simple yet robust self-representation method, i.e., Double Low-Rank Representation with Projection Distance penalty (DLRRPD) for clustering.


Affinity Attention Graph Neural Network for Weakly Supervised Semantic Segmentation

1 code implementation 8 Jun 2021 Bingfeng Zhang, Jimin Xiao, Jianbo Jiao, Yunchao Wei, Yao Zhao

More importantly, our approach can be readily applied to bounding box supervised instance segmentation task or other weakly supervised semantic segmentation tasks, with state-of-the-art or comparable performance among almost all weakly supervised tasks on PASCAL VOC or COCO dataset.

Box-supervised Instance Segmentation Model Optimization +3

Consistent Multiple Graph Embedding for Multi-View Clustering

no code implementations11 May 2021 Yiming Wang, Dongxia Chang, Zhiqiang Fu, Yao Zhao

Specifically, a multiple graph auto-encoder (M-GAE) is designed to flexibly encode the complementary information of multi-view data using a multi-graph attention fusion encoder.

Clustering Graph Attention +1

Seeing All From a Few: Nodes Selection Using Graph Pooling for Graph Clustering

no code implementations30 Apr 2021 Yiming Wang, Dongxia Chang, Zhiqian Fu, Yao Zhao

This paper is the first attempt to employ a graph pooling technique for node clustering, and we propose a novel dual graph embedding network (DGEN), which is designed as a two-step graph encoder connected by a graph pooling layer to learn the graph embedding.

Clustering Graph Clustering +2

Auto-weighted low-rank representation for clustering

no code implementations26 Apr 2021 Zhiqiang Fu, Yao Zhao, Dongxia Chang, Xingxing Zhang, Yiming Wang

In this paper, a novel unsupervised low-rank representation model, i.e., Auto-weighted Low-Rank Representation (ALRR), is proposed to construct a more favorable similarity graph (SG) for clustering.

Clustering Representation Learning

Planning with Learned Entity Prompts for Abstractive Summarization

no code implementations15 Apr 2021 Shashi Narayan, Yao Zhao, Joshua Maynez, Gonçalo Simoes, Vitaly Nikolaev, Ryan Mcdonald

Moreover, we demonstrate empirically that planning with entity chains provides a mechanism to control hallucinations in abstractive summaries.

Abstractive Text Summarization Specificity +1

Progressively Complementary Network for Fisheye Image Rectification Using Appearance Flow

1 code implementation CVPR 2021 Shangrong Yang, Chunyu Lin, Kang Liao, Chunjie Zhang, Yao Zhao

We embed a correction layer in skip-connection and leverage the appearance flows in different layers to pre-correct the image features.

Margin Preserving Self-paced Contrastive Learning Towards Domain Adaptation for Medical Image Segmentation

1 code implementation15 Mar 2021 Zhizhe Liu, Zhenfeng Zhu, Shuai Zheng, Yang Liu, Jiayu Zhou, Yao Zhao

To bridge the gap between the source and target domains in unsupervised domain adaptation (UDA), the most common strategy puts focus on matching the marginal distributions in the feature space through adversarial learning.

Cardiac Segmentation Contrastive Learning +4

Just Noticeable Difference for Deep Machine Vision

no code implementations16 Feb 2021 Jian Jin, Xingxing Zhang, Xin Fu, Huan Zhang, Weisi Lin, Jian Lou, Yao Zhao

Experimental results on image classification demonstrate that we successfully find the JND for deep machine vision.

Image Classification Neural Network Security +1

Image Splicing Detection, Localization and Attribution via JPEG Primary Quantization Matrix Estimation and Clustering

no code implementations2 Feb 2021 Yakun Niu, Benedetta Tondi, Yao Zhao, Rongrong Ni, Mauro Barni

We assume that both the spliced regions and the background image have undergone a double JPEG compression, and use a local estimate of the primary quantization matrix to distinguish between spliced regions taken from different sources.

Clustering Quantization

Efficient video integrity analysis through container characterization

no code implementations26 Jan 2021 Pengpeng Yang, Daniele Baracchi, Massimo Iuliani, Dasara Shullani, Rongrong Ni, Yao Zhao, Alessandro Piva

Furthermore, it is capable of correctly identifying the operating system of the source device for most of the tampered videos.

Multi-Level Curriculum for Training a Distortion-Aware Barrel Distortion Rectification Model

no code implementations ICCV 2021 Kang Liao, Chunyu Lin, Lixin Liao, Yao Zhao, Weiyao Lin

In this paper, inspired by curriculum learning, we analyze the barrel distortion rectification task in a progressive and meaningful manner.

Towards Complete Scene and Regular Shape for Distortion Rectification by Curve-Aware Extrapolation

no code implementations ICCV 2021 Kang Liao, Chunyu Lin, Yunchao Wei, Feng Li, Shangrong Yang, Yao Zhao

To our knowledge, we are the first to tackle the challenging rectification via outpainting, and our curve-aware strategy can reach a rectification construction with complete content and regular shape.

Learning Edge-Preserved Image Stitching from Large-Baseline Deep Homography

no code implementations11 Dec 2020 Lang Nie, Chunyu Lin, Kang Liao, Yao Zhao

In this paper, we propose an image stitching learning framework, which consists of a large-baseline deep homography module and an edge-preserved deformation module.

Image Stitching

Towards Natural Robustness Against Adversarial Examples

no code implementations4 Dec 2020 Haoyu Chu, Shikui Wei, Yao Zhao

Thus, Neural ODEs have natural robustness against adversarial examples.

Adversarial Attack

Dense Attention Fluid Network for Salient Object Detection in Optical Remote Sensing Images

3 code implementations26 Nov 2020 Qijian Zhang, Runmin Cong, Chongyi Li, Ming-Ming Cheng, Yuming Fang, Xiaochun Cao, Yao Zhao, Sam Kwong

Despite the remarkable advances in visual saliency analysis for natural scene images (NSIs), salient object detection (SOD) for optical remote sensing images (RSIs) still remains an open and challenging problem.

object-detection Object Detection +1

CoADNet: Collaborative Aggregation-and-Distribution Networks for Co-Salient Object Detection

1 code implementation NeurIPS 2020 Qijian Zhang, Runmin Cong, Junhui Hou, Chongyi Li, Yao Zhao

In the first stage, we propose a group-attentional semantic aggregation module that models inter-image relationships to generate the group-wise semantic representations.

Co-Salient Object Detection object-detection +1

Learning Deep Interleaved Networks with Asymmetric Co-Attention for Image Restoration

1 code implementation29 Oct 2020 Feng Li, Runmin Cong, Huihui Bai, Yifan He, Yao Zhao, Ce Zhu

In this paper, we present a deep interleaved network (DIN) that learns how information at different states should be combined for high-quality (HQ) image reconstruction.

Deblurring Image Deblurring +2

Mining Generalized Features for Detecting AI-Manipulated Fake Faces

no code implementations27 Oct 2020 Yang Yu, Rongrong Ni, Yao Zhao

Recently, AI-manipulated face techniques have developed rapidly and constantly, which has raised new security issues in society.

A Parallel Down-Up Fusion Network for Salient Object Detection in Optical Remote Sensing Images

no code implementations2 Oct 2020 Chongyi Li, Runmin Cong, Chunle Guo, Hua Li, Chunjie Zhang, Feng Zheng, Yao Zhao

In this paper, we propose a novel Parallel Down-up Fusion network (PDF-Net) for SOD in optical RSIs, which takes full advantage of the in-path low- and high-level features and cross-path multi-resolution features to distinguish diversely scaled salient objects and suppress the cluttered backgrounds.

object-detection Object Detection +1

Taking Modality-free Human Identification as Zero-shot Learning

no code implementations2 Oct 2020 Zhizhe Liu, Xingxing Zhang, Zhenfeng Zhu, Shuai Zheng, Yao Zhao, Jian Cheng

There have been numerous methods proposed for human identification, such as face identification, person re-identification, and gait identification.

Event Detection Face Identification +3

A Deep Ordinal Distortion Estimation Approach for Distortion Rectification

no code implementations21 Jul 2020 Kang Liao, Chunyu Lin, Yao Zhao

Distortion widely exists in images captured by popular wide-angle cameras and fisheye cameras.

Pseudo-LiDAR Point Cloud Interpolation Based on 3D Motion Representation and Spatial Supervision

no code implementations20 Jun 2020 Haojie Liu, Kang Liao, Chunyu Lin, Yao Zhao, Yulan Guo

Pseudo-LiDAR point cloud interpolation is a novel and challenging task in the field of autonomous driving, which aims to address the frequency mismatching problem between camera and LiDAR.

Autonomous Driving Optical Flow Estimation

SEAL: Segment-wise Extractive-Abstractive Long-form Text Summarization

no code implementations18 Jun 2020 Yao Zhao, Mohammad Saleh, Peter J. Liu

Most prior work in the sequence-to-sequence paradigm focused on datasets with input sequence lengths in the hundreds of tokens due to the computational constraints of common RNN and Transformer architectures.

Abstractive Text Summarization

Referring Image Segmentation by Generative Adversarial Learning

no code implementations IEEE 2020 Shuang Qiu, Yao Zhao, Jianbo Jiao, Yunchao Wei, Shikui Wei

To this end, we propose to train the referring image segmentation model in a generative adversarial fashion, which well addresses the distribution similarity problem.

Image Segmentation Referring Expression +3

Fast Template Matching and Update for Video Object Tracking and Segmentation

1 code implementation CVPR 2020 Mingjie Sun, Jimin Xiao, Eng Gee Lim, Bingfeng Zhang, Yao Zhao

Specifically, the reinforcement learning agent learns to decide whether to update the target template according to the quality of the predicted result.

reinforcement-learning Reinforcement Learning (RL) +5

From Anchor Generation to Distribution Alignment: Learning a Discriminative Embedding Space for Zero-Shot Recognition

no code implementations10 Feb 2020 Fuzhen Li, Zhenfeng Zhu, Xingxing Zhang, Jian Cheng, Yao Zhao

In zero-shot learning (ZSL), the samples to be classified are usually projected into side information templates such as attributes.

Zero-Shot Learning

Deep Optimized Multiple Description Image Coding via Scalar Quantization Learning

2 code implementations12 Jan 2020 Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao

In this paper, we introduce a deep multiple description coding (MDC) framework optimized by minimizing multiple description (MD) compressive loss.


Concurrently Extrapolating and Interpolating Networks for Continuous Model Generation

1 code implementation12 Jan 2020 Lijun Zhao, Jinjing Zhang, Fan Zhang, Anhong Wang, Huihui Bai, Yao Zhao

Most deep image smoothing operators are always trained repetitively when different explicit structure-texture pairs are employed as label images for each algorithm configured with different parameters.

image smoothing

PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization

16 code implementations ICML 2020 Jingqing Zhang, Yao Zhao, Mohammad Saleh, Peter J. Liu

Recent work pre-training Transformers with self-supervised objectives on large text corpora has shown great success when fine-tuned on downstream NLP tasks including text summarization.

Abstractive Text Summarization

To See in the Dark: N2DGAN for Background Modeling in Nighttime Scene

no code implementations12 Dec 2019 Zhenfeng Zhu, Yingying Meng, Deqiang Kong, Xingxing Zhang, Yandong Guo, Yao Zhao

Due to the deteriorated conditions of illumination lack and uneven lighting, nighttime images have lower contrast and higher noise than their daytime counterparts of the same scene, which seriously limits the performance of conventional background modeling methods.

Distribution-induced Bidirectional Generative Adversarial Network for Graph Representation Learning

1 code implementation CVPR 2020 Shuai Zheng, Zhenfeng Zhu, Xingxing Zhang, Zhizhe Liu, Jian Cheng, Yao Zhao

Graph representation learning aims to encode all nodes of a graph into low-dimensional vectors that will serve as input to many computer vision tasks.

Graph Representation Learning

Progressive Sample Mining and Representation Learning for One-Shot Person Re-identification with Adversarial Samples

1 code implementation2 Nov 2019 Hui Li, Jimin Xiao, Ming-Jie Sun, Eng Gee Lim, Yao Zhao

To tackle this problem, we propose to iteratively guess pseudo labels for the unlabeled image samples, which are later used to update the re-identification model together with the labelled samples.

Person Re-Identification Pseudo Label +1

ATZSL: Defensive Zero-Shot Recognition in the Presence of Adversaries

no code implementations24 Oct 2019 Xingxing Zhang, Shupeng Gui, Zhenfeng Zhu, Yao Zhao, Ji Liu

In this paper, we make an initial attempt and propose a generic formulation to provide a systematic solution (named ATZSL) for learning a robust ZSL model.

Image Captioning Object Recognition +2

Hierarchical Prototype Learning for Zero-Shot Recognition

no code implementations24 Oct 2019 Xingxing Zhang, Shupeng Gui, Zhenfeng Zhu, Yao Zhao, Ji Liu

Specifically, HPL is able to obtain discriminability on both seen and unseen class domains by learning visual prototypes respectively under the transductive setting.

Image Captioning Object Recognition +2

ProLFA: Representative Prototype Selection for Local Feature Aggregation

1 code implementation24 Oct 2019 Xingxing Zhang, Zhenfeng Zhu, Yao Zhao

Given a set of hand-crafted local features, acquiring a global representation via aggregation is a promising technique to boost computational efficiency and improve task performance.

Prototype Selection

Convolutional Prototype Learning for Zero-Shot Recognition

no code implementations22 Oct 2019 Zhizhe Liu, Xingxing Zhang, Zhenfeng Zhu, Shuai Zheng, Yao Zhao, Jian Cheng

The key to ZSL is to transfer knowledge from the seen to the unseen classes via auxiliary class attribute vectors.

Image Captioning Object Recognition +2

PLIN: A Network for Pseudo-LiDAR Point Cloud Interpolation

no code implementations16 Sep 2019 Haojie Liu, Kang Liao, Chunyu Lin, Yao Zhao, Yulan Guo

In this paper, we propose a novel Pseudo-LiDAR interpolation network (PLIN) to increase the frequency of LiDAR sensors.

Autonomous Driving

Segmentation Mask Guided End-to-End Person Search

1 code implementation27 Aug 2019 Dingyuan Zheng, Jimin Xiao, Kai-Zhu Huang, Yao Zhao

Person search aims to search for a target person among multiple images recorded by multiple surveillance cameras, which faces various challenges from both pedestrian detection and person re-identification.

Pedestrian Detection Person Re-Identification +2

Primary quantization matrix estimation of double compressed JPEG images via CNN

1 code implementation9 Aug 2019 Yakun Niu, Benedetta Tondi, Yao Zhao, Mauro Barni

Available model-based techniques for the estimation of the primary quantization matrix in double-compressed JPEG images work only under specific conditions regarding the relationship between the first and second compression quality factors, and the alignment of the first and second JPEG compression grids.


Edge Heuristic GAN for Non-uniform Blind Deblurring

no code implementations11 Jul 2019 Shuai Zheng, Zhenfeng Zhu, Jian Cheng, Yandong Guo, Yao Zhao

Non-uniform blur, mainly caused by camera shake and motions of multiple objects, is one of the most common causes of image quality degradation.


Architecture Selection via the Trade-off Between Accuracy and Robustness

no code implementations4 Jun 2019 Zhun Deng, Cynthia Dwork, Jialiang Wang, Yao Zhao

We provide a general framework for characterizing the trade-off between accuracy and robustness in supervised learning.

Adversarial Attack

EA-LSTM: Evolutionary Attention-based LSTM for Time Series Prediction

no code implementations9 Nov 2018 Youru Li, Zhenfeng Zhu, Deqiang Kong, Hua Han, Yao Zhao

To address this issue, an evolutionary attention-based LSTM training with competitive random search is proposed for multivariate time series prediction.

Time Series Time Series Prediction

Deep Multiple Description Coding by Learning Scalar Quantization

1 code implementation5 Nov 2018 Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao

Secondly, two entropy estimation networks are learned to estimate the informative amounts of the quantized tensors, which can further supervise the learning of multiple description encoder network to represent the input image delicately.


Paragraph-level Neural Question Generation with Maxout Pointer and Gated Self-attention Networks

1 code implementation EMNLP 2018 Yao Zhao, Xiaochuan Ni, Yuanyuan Ding, Qifa Ke

Long text has posed challenges for sequence-to-sequence neural models in question generation: worse performance was reported when using the whole paragraph (with multiple sentences) as the input.

Question Answering Question Generation +3

Virtual Codec Supervised Re-Sampling Network for Image Compression

1 code implementation22 Jun 2018 Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao

In order to train the RSN network and the IDN network together in an end-to-end fashion, our VCN network imitates the projection from the re-sampled vectors to the IDN-decoded image.

Dimensionality Reduction Image Compression +1

Adversarial Attacks and Defences Competition

1 code implementation31 Mar 2018 Alexey Kurakin, Ian Goodfellow, Samy Bengio, Yinpeng Dong, Fangzhou Liao, Ming Liang, Tianyu Pang, Jun Zhu, Xiaolin Hu, Cihang Xie, Jian-Yu Wang, Zhishuai Zhang, Zhou Ren, Alan Yuille, Sangxia Huang, Yao Zhao, Yuzhe Zhao, Zhonglin Han, Junjiajia Long, Yerkebulan Berdibekov, Takuya Akiba, Seiya Tokui, Motoki Abe

To accelerate research on adversarial examples and robustness of machine learning classifiers, Google Brain organized a NIPS 2017 competition that encouraged researchers to develop new methods to generate adversarial examples as well as to develop new ways to defend against them.

BIG-bench Machine Learning

Security Consideration For Deep Learning-Based Image Forensics

no code implementations29 Mar 2018 Wei Zhao, Pengpeng Yang, Rongrong Ni, Yao Zhao, Haorui Wu

Instead of improving it, in this paper, the safety of deep learning based methods in the field of image forensics is taken into account.

Image Forensics

Non-Local Graph-Based Prediction For Reversible Data Hiding In Images

no code implementations20 Feb 2018 Qi Chang, Gene Cheung, Yao Zhao, Xiaolong Li, Rongrong Ni

If sufficiently smooth, we pose a maximum a posteriori (MAP) problem using either a quadratic Laplacian regularizer or a graph total variation (GTV) term as signal prior.

Mixed-Resolution Image Representation and Compression with Convolutional Neural Networks

no code implementations2 Feb 2018 Lijun Zhao, Huihui Bai, Feng Li, Anhong Wang, Yao Zhao

Firstly, given one input image, feature description neural network (FDNN) is used to generate a new representation of this image, so that this image representation can be more efficiently compressed by standard codec, as compared to the input image.

Image Compression Quantization

Secure Detection of Image Manipulation by means of Random Feature Selection

no code implementations2 Feb 2018 Zhipeng Chen, Benedetta Tondi, Xiaolong Li, Rongrong Ni, Yao Zhao, Mauro Barni

We address the problem of data-driven image manipulation detection in the presence of an attacker with limited knowledge about the detector.

Cryptography and Security

Multiple Description Convolutional Neural Networks for Image Compression

no code implementations20 Jan 2018 Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao

Thirdly, multiple description virtual codec network (MDVCN) is proposed to bridge the gap between MDGN network and MDRN network in order to train an end-to-end MDC framework.

Image Compression

Learning a Virtual Codec Based on Deep Convolutional Neural Network to Compress Image

1 code implementation16 Dec 2017 Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao

Due to the challenge of directly learning a non-linear function for a standard codec based on convolutional neural network, we propose to learn a virtual codec neural network to approximate the projection from the valid description image to the post-processed compressed image, so that the gradient could be efficiently back-propagated from the post-processing neural network to the feature description neural network during training.

Blocking Image Compression +2

Simultaneously Color-Depth Super-Resolution with Conditional Generative Adversarial Network

no code implementations30 Aug 2017 Lijun Zhao, Huihui Bai, Jie Liang, Bing Zeng, Anhong Wang, Yao Zhao

Firstly, given the low-resolution depth image and low-resolution color image, a generative network is proposed to leverage mutual information of color image and depth image to enhance each other in consideration of the geometry structural dependency of color-depth image in the same scene.

Edge Detection image smoothing +4

Local Activity-tuned Image Filtering for Noise Removal and Image Smoothing

no code implementations9 Jul 2017 Lijun Zhao, Jie Liang, Huihui Bai, Lili Meng, Anhong Wang, Yao Zhao

Both frameworks employ the division of gradient and the local activity measurement to achieve noise removal.

Image Denoising image smoothing

Object Region Mining with Adversarial Erasing: A Simple Classification to Semantic Segmentation Approach

no code implementations CVPR 2017 Yunchao Wei, Jiashi Feng, Xiaodan Liang, Ming-Ming Cheng, Yao Zhao, Shuicheng Yan

We investigate a principled way to progressively mine discriminative object regions using classification networks to address the weakly-supervised semantic segmentation problem.

Classification General Classification +4

Source Camera Identification Based On Content-Adaptive Fusion Network

no code implementations15 Mar 2017 Pengpeng Yang, Wei Zhao, Rongrong Ni, Yao Zhao

In this paper, we propose a solution to identify the source camera of the small-size images: content-adaptive fusion network.

A New Evaluation Protocol and Benchmarking Results for Extendable Cross-media Retrieval

no code implementations10 Mar 2017 Ruoyu Liu, Yao Zhao, Liang Zheng, Shikui Wei, Yi Yang

Additionally, a trivial solution, i.e., directly using the predicted class label for cross-media retrieval, is tested.

Benchmarking Image Retrieval +2

Camera Fingerprint: A New Perspective for Identifying User's Identity

no code implementations25 Oct 2016 Xiang Jiang, Shikui Wei, Ruizhen Zhao, Yao Zhao, Xindong Wu

The underlying assumption is that multiple accounts belonging to the same person contain the same or similar camera fingerprint information.

Product Recommendation

STC: A Simple to Complex Framework for Weakly-supervised Semantic Segmentation

1 code implementation10 Sep 2015 Yunchao Wei, Xiaodan Liang, Yunpeng Chen, Xiaohui Shen, Ming-Ming Cheng, Jiashi Feng, Yao Zhao, Shuicheng Yan

Then, a better network called Enhanced-DCNN is learned with supervision from the predicted segmentation masks of simple images based on the Initial-DCNN as well as the image-level annotations.

object-detection RGB Salient Object Detection +4

Indexing of CNN Features for Large Scale Image Search

no code implementations2 Aug 2015 Ruoyu Liu, Yao Zhao, Shikui Wei, Yi Yang

Convolutional neural network (CNN) features can give a good description of image content and usually represent images with unique global vectors.

Clustering Image Retrieval +2

Modality-dependent Cross-media Retrieval

no code implementations22 Jun 2015 Yunchao Wei, Yao Zhao, Zhenfeng Zhu, Shikui Wei, Yanhui Xiao, Jiashi Feng, Shuicheng Yan

Specifically, by jointly optimizing the correlation between images and text and the linear regression from one modal space (image or text) to the semantic space, two couples of mappings are learned to project images and text from their original feature spaces into two common latent subspaces (one for I2T and the other for T2I).


CNN: Single-label to Multi-label

no code implementations22 Jun 2014 Yunchao Wei, Wei Xia, Junshi Huang, Bingbing Ni, Jian Dong, Yao Zhao, Shuicheng Yan

Convolutional Neural Networks (CNNs) have demonstrated promising performance in single-label image classification tasks.

Image Classification

Kernel Reconstruction ICA for Sparse Representation

no code implementations9 Apr 2013 Yanhui Xiao, Zhenfeng Zhu, Yao Zhao

However, ICA is not only sensitive to whitening but also has difficulty learning an over-complete basis.

Image Classification

