Search Results for author: Yan Yan

Found 160 papers, 49 papers with code

SECOND: Sparsely Embedded Convolutional Detection

1 code implementation • Sensors 2018 • Yan Yan, Yuxing Mao, Bo Li

LiDAR-based or RGB-D-based object detection is used in numerous applications, ranging from autonomous driving to robot vision.

3D Object Detection Autonomous Driving +2

1,687

Paper
Code

MIM4DD: Mutual Information Maximization for Dataset Distillation

1 code implementation • NeurIPS 2023 • Yuzhang Shang, Zhihang Yuan, Yan Yan

Thus, we introduce mutual information (MI) as the metric to quantify the shared information between the synthetic and the real datasets, and devise MIM4DD numerically maximizing the MI via a newly designed optimizable objective within a contrastive learning framework to update the synthetic dataset.

Contrastive Learning

1,153

Paper
Code

Style Aggregated Network for Facial Landmark Detection

1 code implementation • CVPR 2018 • Xuanyi Dong, Yan Yan, Wanli Ouyang, Yi Yang

In this work, we propose a style-aggregated approach to deal with the large intrinsic variance of image styles for facial landmark detection.

Ranked #2 on Facial Landmark Detection on AFLW-Front (Mean NME metric)

Face Alignment Facial Landmark Detection

917

Paper
Code

Attention-Guided Generative Adversarial Networks for Unsupervised Image-to-Image Translation

8 code implementations • 28 Mar 2019 • Hao Tang, Dan Xu, Nicu Sebe, Yan Yan

To handle the limitation, in this paper we propose a novel Attention-Guided Generative Adversarial Network (AGGAN), which can detect the most discriminative semantic object and minimize changes of unwanted part for semantic manipulation problems without using extra data and models.

Ranked #1 on Facial Expression Translation on CelebA

Generative Adversarial Network Translation +1

621

Paper
Code

Multi-Channel Attention Selection GAN with Cascaded Semantic Guidance for Cross-View Image Translation

3 code implementations • CVPR 2019 • Hao Tang, Dan Xu, Nicu Sebe, Yanzhi Wang, Jason J. Corso, Yan Yan

In this paper, we propose a novel approach named Multi-Channel Attention SelectionGAN (SelectionGAN) that makes it possible to generate images of natural scenes in arbitrary viewpoints, based on an image of the scene and a novel semantic map.

Ranked #1 on Cross-View Image-to-Image Translation on Dayton (64×64) - aerial-to-ground

Bird View Synthesis Cross-View Image-to-Image Translation +1

458

Paper
Code

Large-scale Robust Deep AUC Maximization: A New Surrogate Loss and Empirical Studies on Medical Image Classification

4 code implementations • ICCV 2021 • Zhuoning Yuan, Yan Yan, Milan Sonka, Tianbao Yang

Our studies demonstrate that the proposed DAM method improves the performance of optimizing cross-entropy loss by a large margin, and also achieves better performance than optimizing the existing AUC square loss on these medical image classification tasks.

Ranked #2 on Multi-Label Classification on CheXpert

General Classification Graph Property Prediction +3

270

Paper
Code

Unsupervised High-Resolution Portrait Gaze Correction and Animation

1 code implementation • 1 Jul 2022 • Jichao Zhang, Jingjing Chen, Hao Tang, Enver Sangineto, Peng Wu, Yan Yan, Nicu Sebe, Wei Wang

Solving this problem using an unsupervised method remains an open problem, especially for high-resolution face images in the wild, which are not easy to annotate with gaze and head pose labels.

Image Inpainting Vocal Bursts Intensity Prediction

190

Paper
Code

GestureGAN for Hand Gesture-to-Gesture Translation in the Wild

1 code implementation • 14 Aug 2018 • Hao Tang, Wei Wang, Dan Xu, Yan Yan, Nicu Sebe

Therefore, this task requires a high-level understanding of the mapping between the input source gesture and the output target gesture.

Ranked #1 on Gesture-to-Gesture Translation on NTU Hand Digit

Data Augmentation Generative Adversarial Network +2

173

Paper
Code

Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo Collection

1 code implementation • CVPR 2021 • Zhenyu Zhang, Yanhao Ge, Renwang Chen, Ying Tai, Yan Yan, Jian Yang, Chengjie Wang, Jilin Li, Feiyue Huang

Non-parametric face modeling aims to reconstruct 3D face only from images without shape assumptions.

3D Face Modelling Attribute

151

Paper
Code

Discrete Contrastive Diffusion for Cross-Modal Music and Image Generation

1 code implementation • 15 Jun 2022 • Ye Zhu, Yu Wu, Kyle Olszewski, Jian Ren, Sergey Tulyakov, Yan Yan

Diffusion probabilistic models (DPMs) have become a popular approach to conditional generation, due to their promising results and support for cross-modal synthesis.

Contrastive Learning Denoising +2

151

Paper
Code

LLM Inference Unveiled: Survey and Roofline Model Insights

2 code implementations • 26 Feb 2024 • Zhihang Yuan, Yuzhang Shang, Yang Zhou, Zhen Dong, Zhe Zhou, Chenhao Xue, Bingzhe Wu, Zhikai Li, Qingyi Gu, Yong Jae Lee, Yan Yan, Beidi Chen, Guangyu Sun, Kurt Keutzer

Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on roofline model for systematic analysis of LLM inference techniques.

Knowledge Distillation Language Modelling +3

148

Paper
Code

Dual In-painting Model for Unsupervised Gaze Correction and Animation in the Wild

1 code implementation • 9 Aug 2020 • Jichao Zhang, Jingjing Chen, Hao Tang, Wei Wang, Yan Yan, Enver Sangineto, Nicu Sebe

In this paper we address the problem of unsupervised gaze correction in the wild, presenting a solution that works without the need for precise annotations of the gaze angle and the head pose.

146

Paper
Code

Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation

2 code implementations • CVPR 2020 • Hao Tang, Dan Xu, Yan Yan, Philip H. S. Torr, Nicu Sebe

To tackle this issue, in this work we consider learning the scene generation in a local context, and correspondingly design a local class-specific generative network with semantic maps as a guidance, which separately constructs and learns sub-generators concentrating on the generation of different classes, and is able to provide more scene details.

Ranked #2 on Cross-View Image-to-Image Translation on Dayton (256×256) - aerial-to-ground

Image Generation Scene Generation

145

Paper
Code

Post-training Quantization on Diffusion Models

1 code implementation • CVPR 2023 • Yuzhang Shang, Zhihang Yuan, Bin Xie, Bingzhe Wu, Yan Yan

These approaches define a forward diffusion process for transforming data into noise and a backward denoising process for sampling data from noise.

Denoising Noise Estimation +1

100

Paper
Code

Quantized GAN for Complex Music Generation from Dance Videos

1 code implementation • 1 Apr 2022 • Ye Zhu, Kyle Olszewski, Yu Wu, Panos Achlioptas, Menglei Chai, Yan Yan, Sergey Tulyakov

We present Dance2Music-GAN (D2M-GAN), a novel adversarial multi-modal framework that generates complex musical samples conditioned on dance videos.

Music Generation

Paper
Code

Cycle In Cycle Generative Adversarial Networks for Keypoint-Guided Image Generation

1 code implementation • 2 Aug 2019 • Hao Tang, Dan Xu, Gaowen Liu, Wei Wang, Nicu Sebe, Yan Yan

In this work, we propose a novel Cycle In Cycle Generative Adversarial Network (C$^2$GAN) for the task of keypoint-guided image generation.

Generative Adversarial Network Image Generation

Paper
Code

Dual Generator Generative Adversarial Networks for Multi-Domain Image-to-Image Translation

1 code implementation • 14 Jan 2019 • Hao Tang, Dan Xu, Wei Wang, Yan Yan, Nicu Sebe

State-of-the-art methods for image-to-image translation with Generative Adversarial Networks (GANs) can learn a mapping from one domain to another domain using unpaired image data.

Generative Adversarial Network Image-to-Image Translation +1

Paper
Code

Deep Micro-Dictionary Learning and Coding Network

1 code implementation • 11 Sep 2018 • Hao Tang, Heng Wei, Wei Xiao, Wei Wang, Dan Xu, Yan Yan, Nicu Sebe

In this paper, we propose a novel Deep Micro-Dictionary Learning and Coding Network (DDLCN).

Dictionary Learning

Paper
Code

Cross-View Panorama Image Synthesis

1 code implementation • 22 Mar 2022 • Songsong Wu, Hao Tang, Xiao-Yuan Jing, Haifeng Zhao, Jianjun Qian, Nicu Sebe, Yan Yan

In this paper, we tackle the problem of synthesizing a ground-view panorama image conditioned on a top-view aerial image, which is a challenging problem due to the large gap between the two image domains with different view-points.

Image Generation

Paper
Code

ASVD: Activation-aware Singular Value Decomposition for Compressing Large Language Models

1 code implementation • 10 Dec 2023 • Zhihang Yuan, Yuzhang Shang, Yue Song, Qiang Wu, Yan Yan, Guangyu Sun

This paper explores a new post-hoc training-free compression paradigm for compressing Large Language Models (LLMs) to facilitate their wider adoption in various computing environments.

Paper
Code

TSTTC: A Large-Scale Dataset for Time-to-Contact Estimation in Driving Scenarios

1 code implementation • 4 Sep 2023 • Yuheng Shi, Zehao Huang, Yan Yan, Naiyan Wang, Xiaojie Guo

Time-to-Contact (TTC) estimation is a critical task for assessing collision risk and is widely used in various driver assistance and autonomous driving systems.

Autonomous Driving Neural Rendering

Paper
Code

QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning

1 code implementation • 6 Feb 2024 • Haoxuan Wang, Yuzhang Shang, Zhihang Yuan, Junyi Wu, Yan Yan

Diffusion models have achieved remarkable success in image generation tasks, yet their practical deployment is restrained by the high memory and time consumption.

Image Generation Model Compression +1

Paper
Code

Cascade Attention Guided Residue Learning GAN for Cross-Modal Translation

1 code implementation • 3 Jul 2019 • Bin Duan, Wei Wang, Hao Tang, Hugo Latapie, Yan Yan

However, in machine learning, this cross-modal learning is a nontrivial task because different modalities have no homogeneous properties.

BIG-bench Machine Learning Translation

Paper
Code

Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos

1 code implementation • 12 Oct 2021 • Zongmeng Zhang, Xianjing Han, Xuemeng Song, Yan Yan, Liqiang Nie

Towards this end, in this work, we propose a Multi-modal Interaction Graph Convolutional Network (MIGCN), which jointly explores the complex intra-modal relations and inter-modal interactions residing in the video and sentence query to facilitate the understanding and semantic correspondence capture of the video and sentence query.

Semantic correspondence Semantic Similarity +2

Paper
Code

Learn-to-Decompose: Cascaded Decomposition Network for Cross-Domain Few-Shot Facial Expression Recognition

1 code implementation • 16 Jul 2022 • Xinyi Zou, Yan Yan, Jing-Hao Xue, Si Chen, Hanzi Wang

Extensive experiments on both in-the-lab and in-the-wild compound expression datasets demonstrate the superiority of our proposed CDNet against several state-of-the-art FSL methods.

cross-domain few-shot learning Facial Expression Recognition +1

Paper
Code

A Simple and Effective Framework for Pairwise Deep Metric Learning

1 code implementation • ECCV 2020 • Qi Qi, Yan Yan, Xiaoyu Wang, Tianbao Yang

To tackle this issue, we propose a simple and effective framework to sample pairs in a batch of data for updating the model.

Binary Classification Metric Learning

Paper
Code

Network Binarization via Contrastive Learning

1 code implementation • 6 Jul 2022 • Yuzhang Shang, Dan Xu, Ziliang Zong, Liqiang Nie, Yan Yan

Neural network binarization accelerates deep models by quantizing their weights and activations into 1-bit.

Binarization Contrastive Learning +2

Paper
Code

Person Re-Identification via Recurrent Feature Aggregation

1 code implementation • 23 Jan 2017 • Yichao Yan, Bingbing Ni, Zhichao Song, Chao Ma, Yan Yan, Xiaokang Yang

We address the person re-identification problem by effectively exploiting a globally discriminative feature representation from a sequence of tracked human regions/patches.

Patch Matching Person Re-Identification

Paper
Code

Attribute-Guided Sketch Generation

1 code implementation • 28 Jan 2019 • Hao Tang, Xinya Chen, Wei Wang, Dan Xu, Jason J. Corso, Nicu Sebe, Yan Yan

To this end, we propose a novel Attribute-Guided Sketch Generative Adversarial Network (ASGAN) which is an end-to-end framework and contains two pairs of generators and discriminators, one of which is used to generate faces with attributes while the other one is employed for image-to-sketch translation.

Attribute Generative Adversarial Network +1

Paper
Code

Segmenting Objects in Day and Night: Edge-Conditioned CNN for Thermal Image Semantic Segmentation

1 code implementation • 24 Jul 2019 • Chenglong Li, Wei Xia, Yan Yan, Bin Luo, Jin Tang

These advantages of thermal infrared cameras make the segmentation of semantic objects in day and night.

Segmentation Semantic Segmentation

Paper
Code

Seeing your sleep stage: cross-modal distillation from EEG to infrared video

1 code implementation • 11 Aug 2022 • Jianan Han, Shaoxing Zhang, Aidong Men, Yang Liu, Ziming Yao, Yan Yan, Qingchao Chen

$S^3VE$ is a large-scale dataset including synchronized infrared video and EEG signal for sleep stage classification, including 105 subjects and 154,573 video clips that is more than 1100 hours long.

EEG

Paper
Code

Towards Saner Deep Image Registration

1 code implementation • ICCV 2023 • Bin Duan, Ming Zhong, Yan Yan

Moreover, we derive a set of theoretical guarantees for our sanity-checked image registration method, with experimental results supporting our theoretical findings and their effectiveness in increasing the sanity of models without sacrificing any performance.

Image Registration

Paper
Code

Recognizing Emotions From Abstract Paintings Using Non-Linear Matrix Completion

1 code implementation • CVPR 2016 • Xavier Alameda-Pineda, Elisa Ricci, Yan Yan, Nicu Sebe

A very popular approach for transductive multi-label recognition under linear classification settings is matrix completion.

General Classification Matrix Completion +1

Paper
Code

Causal-DFQ: Causality Guided Data-free Network Quantization

1 code implementation • ICCV 2023 • Yuzhang Shang, Bingxin Xu, Gaowen Liu, Ramana Kompella, Yan Yan

Inspired by the causal understanding, we propose the Causality-guided Data-free Network Quantization method, Causal-DFQ, to eliminate the reliance on data via approaching an equilibrium of causality-driven intervened distributions.

Data Free Quantization Neural Network Compression

Paper
Code

High-Order Structure Based Middle-Feature Learning for Visible-Infrared Person Re-Identification

1 code implementation • 13 Dec 2023 • Liuxiang Qiu, Si Chen, Yan Yan, Jing-Hao Xue, Da-Han Wang, Shunzhi Zhu

Existing VI-ReID methods ignore high-order structure information of features while being relatively difficult to learn a reasonable common feature space due to the large modality discrepancy between VIS and IR images.

Person Re-Identification

Paper
Code

Improving Uncertainty Quantification of Deep Classifiers via Neighborhood Conformal Prediction: Novel Algorithm and Theoretical Analysis

1 code implementation • 19 Mar 2023 • Subhankar Ghosh, Taha Belkhouja, Yan Yan, Janardhan Rao Doppa

Safe deployment of deep neural networks in high-stake real-world applications requires theoretically sound uncertainty quantification.

Conformal Prediction Uncertainty Quantification

Paper
Code

Describing Unseen Videos via Multi-Modal Cooperative Dialog Agents

1 code implementation • ECCV 2020 • Ye Zhu, Yu Wu, Yi Yang, Yan Yan

With the arising concerns for the AI systems provided with direct access to abundant sensitive information, researchers seek to develop more reliable AI with implicit information sources.

Video Description

Paper
Code

Parallel Blockwise Knowledge Distillation for Deep Neural Network Compression

1 code implementation • 5 Dec 2020 • Cody Blakeney, Xiaomin Li, Yan Yan, Ziliang Zong

The experimental results running on an AMD server with four Geforce RTX 2080Ti GPUs show that our algorithm can achieve 3x speedup plus 19% energy savings on VGG distillation, and 3.5x speedup plus 29% energy savings on ResNet distillation, both with negligible accuracy loss.

Knowledge Distillation Neural Network Compression +3

Paper
Code

Saying the Unseen: Video Descriptions via Dialog Agents

1 code implementation • 26 Jun 2021 • Ye Zhu, Yu Wu, Yi Yang, Yan Yan

Current vision and language tasks usually take complete visual data (e.g., raw images or videos) as input; however, practical scenarios often involve situations where part of the visual information becomes inaccessible for various reasons, e.g., a restricted view with a fixed camera or an intentional vision block for security concerns.

Transfer Learning

Paper
Code

Lipschitz Continuity Retained Binary Neural Network

1 code implementation • 13 Jul 2022 • Yuzhang Shang, Dan Xu, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan

Relying on the premise that the performance of a binary neural network can be largely restored with eliminated quantization error between full-precision weight vectors and their corresponding binary vectors, existing works of network binarization frequently adopt the idea of model robustness to reach the aforementioned objective.

Binarization Quantization

Paper
Code

Supplementing Missing Visions via Dialog for Scene Graph Generations

1 code implementation • 23 Apr 2022 • Zhenghao Zhao, Ye Zhu, Xiaoguang Zhu, Yuzhang Shang, Yan Yan

Most current AI systems rely on the premise that the input visual data are sufficient to achieve competitive performance in various computer vision tasks.

Graph Generation Scene Graph Generation

Paper
Code

Dynamic Time Warping based Adversarial Framework for Time-Series Domain

1 code implementation • 9 Jul 2022 • Taha Belkhouja, Yan Yan, Janardhan Rao Doppa

Despite the rapid progress on research in adversarial robustness of deep neural networks (DNNs), there is little principled work for the time-series domain.

Adversarial Robustness Dynamic Time Warping +2

Paper
Code

Out-of-Distribution Detection in Time-Series Domain: A Novel Seasonal Ratio Scoring Approach

1 code implementation • 9 Jul 2022 • Taha Belkhouja, Yan Yan, Janardhan Rao Doppa

Experiments on diverse real-world benchmarks demonstrate that the SRS method is well-suited for time-series OOD detection when compared to baseline methods.

Out-of-Distribution Detection Out of Distribution (OOD) Detection +2

Paper
Code

Adversarial-Metric Learning for Audio-Visual Cross-Modal Matching

1 code implementation • IEEE Transactions on Multimedia 2021 • Aihua Zheng, Menglan Hu, Bo Jiang, Yan Huang, Yan Yan, Bin Luo

AML aims to generate a modality-independent representation for each person in each modality via adversarial learning, while simultaneously learning a robust similarity measure for cross-modality matching via metric learning.

audio-visual learning Metric Learning +1

Paper
Code

3D Cross-Pseudo Supervision (3D-CPS): A semi-supervised nnU-Net architecture for abdominal organ segmentation

1 code implementation • 19 Sep 2022 • Yongzhi Huang, Hanwen Zhang, Yan Yan, Haseeb Hassan

Large curated datasets are necessary, but annotating medical images is a time-consuming, laborious, and expensive process.

Organ Segmentation

Paper
Code

Spatial-Contextual Discrepancy Information Compensation for GAN Inversion

1 code implementation • 12 Dec 2023 • Ziqiang Zhang, Yan Yan, Jing-Hao Xue, Hanzi Wang

SDIC follows a "compensate-and-edit" paradigm and successfully bridges the gap in image details between the original image and the reconstructed/edited image.

Paper
Code

Simon Says: Evaluating and Mitigating Bias in Pruned Neural Networks with Knowledge Distillation

1 code implementation • 15 Jun 2021 • Cody Blakeney, Nathaniel Huish, Yan Yan, Ziliang Zong

In recent years the ubiquitous deployment of AI has posed great concerns in regards to algorithmic bias, discrimination, and fairness.

Fairness Knowledge Distillation

Paper
Code

Cross-modal Knowledge Distillation for Vision-to-Sensor Action Recognition

1 code implementation • 8 Oct 2021 • Jianyuan Ni, Raunak Sarbajna, Yang Liu, Anne H. H. Ngu, Yan Yan

Human activity recognition (HAR) based on multi-modal approach has been recently shown to improve the accuracy performance of HAR.

Action Recognition Human Activity Recognition +3

Paper
Code

Distribution-based Label Space Transformation for Multi-label Learning

no code implementations • 15 May 2018 • Zongting Lyu, Yan Yan, Fei Wu

This endows DLST the capability to handle label set sparsity and training data sparsity in multi-label learning problems.

Dimensionality Reduction Multi-Label Learning

Paper
Add Code

Multi-task Learning of Cascaded CNN for Facial Attribute Classification

no code implementations • 3 May 2018 • Ni Zhuang, Yan Yan, Si Chen, Hanzi Wang

In order to address the above problems, we propose a novel multi-task learning of cascaded convolutional neural network method, termed MCFA, for predicting multiple facial attributes simultaneously.

Attribute Classification +5

Paper
Add Code

Multi-label Learning Based Deep Transfer Neural Network for Facial Attribute Classification

no code implementations • 3 May 2018 • Ni Zhuang, Yan Yan, Si Chen, Hanzi Wang, Chunhua Shen

To address the above problem, we propose a novel deep transfer neural network method based on multi-label learning for facial attribute classification, termed FMTNet, which consists of three sub-networks: the Face detection Network (FNet), the Multi-label learning Network (MNet) and the Transfer learning Network (TNet).

Attribute Classification +6

Paper
Add Code

Superpixel-guided Two-view Deterministic Geometric Model Fitting

no code implementations • 3 May 2018 • Guobao Xiao, Hanzi Wang, Yan Yan, David Suter

Specifically, SDF includes three main parts: a deterministic sampling algorithm, a model hypothesis updating strategy and a novel model selection algorithm.

Model Selection Superpixels +1

Paper
Add Code

A Fast Face Detection Method via Convolutional Neural Network

no code implementations • 27 Mar 2018 • Guanjun Guo, Hanzi Wang, Yan Yan, Jin Zheng, Bo Li

Current face or object detection methods via convolutional neural network (such as OverFeat, R-CNN and DenseNet) explicitly extract multi-scale features based on an image pyramid.

Face Detection object-detection +1

Paper
Add Code

A New Target-specific Object Proposal Generation Method for Visual Tracking

no code implementations • 27 Mar 2018 • Guanjun Guo, Hanzi Wang, Yan Yan, Hong-Yuan Mark Liao, Bo Li

Then, we apply the proposed TOPG method to the task of visual tracking and propose a TOPG-based tracker (called as TOPGT), where TOPG is used as a sample selection strategy to select a small number of high-quality target candidates from the generated object proposals.

Object Object Proposal Generation +1

Paper
Add Code

Deep Adversarial Attention Alignment for Unsupervised Domain Adaptation: the Benefit of Target Expectation Maximization

no code implementations • ECCV 2018 • Guoliang Kang, Liang Zheng, Yan Yan, Yi Yang

Second, we estimate the posterior label distribution of the unlabeled data for target network training.

Unsupervised Domain Adaptation

Paper
Add Code

Searching for Representative Modes on Hypergraphs for Robust Geometric Model Fitting

no code implementations • 4 Feb 2018 • Hanzi Wang, Guobao Xiao, Yan Yan, David Suter

We cast the task of geometric model fitting as a representative mode-seeking problem on hypergraphs.

Paper
Add Code

Automatic Image Cropping for Visual Aesthetic Enhancement Using Deep Neural Networks and Cascaded Regression

no code implementations • 25 Dec 2017 • Guanjun Guo, Hanzi Wang, Chunhua Shen, Yan Yan, Hong-Yuan Mark Liao

The deep CNN model is then designed to extract features from several image cropping datasets, upon which the cropping bounding boxes are predicted by the proposed CCR method.

Image Cropping regression

Paper
Add Code

Revisiting Graph Construction for Fast Image Segmentation

no code implementations • 18 Feb 2017 • Zizhao Zhang, Fuyong Xing, Hanzi Wang, Yan Yan, Ying Huang, Xiaoshuang Shi, Lin Yang

In this paper, we propose a simple but effective method for fast image segmentation.

Clustering graph construction +4

Paper
Add Code

From Query-By-Keyword to Query-By-Example: LinkedIn Talent Search Approach

no code implementations • 3 Sep 2017 • Viet Ha-Thuc, Yan Yan, Xianren Wu, Vijay Dialani, Abhishek Gupta, Shakti Sinha

One key challenge in talent search is to translate complex criteria of a hiring position into a search query, while it is relatively easy for a searcher to list examples of suitable candidates for a given position.

Position

Paper
Add Code

Optimizing Gross Merchandise Volume via DNN-MAB Dynamic Ranking Paradigm

no code implementations • 14 Aug 2017 • Yan Yan, Wentao Guo, Meng Zhao, Jinghe Hu, Weipeng P. Yan

With the transition from people's traditional 'brick-and-mortar' shopping to online mobile shopping patterns in the Web 2.0 era, the recommender system plays a critical role in E-Commerce and E-Retails.

Recommendation Systems

Paper
Add Code

Object Discovery via Cohesion Measurement

no code implementations • 28 Apr 2017 • Guanjun Guo, Hanzi Wang, Wan-Lei Zhao, Yan Yan, Xuelong Li

Based on the new Cohesion Measurement, a novel object discovery method is proposed to discover objects latent in an image by utilizing the eigenvectors of the affinity matrix.

Clustering Image Segmentation +5

Paper
Add Code

Learn to Model Motion from Blurry Footages

no code implementations • 19 Apr 2017 • Wenbin Li, Da Chen, Zhihan Lv, Yan Yan, Darren Cosker

It is difficult to recover the motion field from a real-world footage given a mixture of camera shake and other photometric effects.

Optical Flow Estimation

Paper
Add Code

Homotopy Smoothing for Non-Smooth Problems with Lower Complexity than $O(1/ε)$

no code implementations • NeurIPS 2016 • Yi Xu, Yan Yan, Qihang Lin, Tianbao Yang

In this work, we will show that the proposed HOPS achieved a lower iteration complexity of $\widetilde O(1/\epsilon^{1-\theta})$, where $\widetilde O()$ suppresses a logarithmic factor.

Paper
Add Code

Superpixel-based Two-view Deterministic Fitting for Multiple-structure Data

no code implementations • 20 Jul 2016 • Guobao Xiao, Hanzi Wang, Yan Yan, David Suter

The feature appearances are beneficial to reduce the computational complexity for deterministic fitting methods.

Model Selection Superpixels +1

Paper
Add Code

Learning Hough Regression Models via Bridge Partial Least Squares for Object Detection

no code implementations • 26 Mar 2016 • Jianyu Tang, Hanzi Wang, Yan Yan

And the appropriate value of the only parameter used in PLS (i.e., the number of latent components) can be determined by using a cross-validation procedure.

Clustering Object +3

Paper
Add Code

Mode-Seeking on Hypergraphs for Robust Geometric Model Fitting

no code implementations • ICCV 2015 • Hanzi Wang, Guobao Xiao, Yan Yan, David Suter

In addition to the mode seeking algorithm, MSH includes a similarity measure between vertices on the hypergraph and a weight-aware sampling technique.

Paper
Add Code

An Effective Unconstrained Correlation Filter and Its Kernelization for Face Recognition

no code implementations • 25 Mar 2016 • Yan Yan, Hanzi Wang, Cuihua Li, Chenhui Yang, Bineng Zhong

In this paper, an effective unconstrained correlation filter called Unconstrained Optimal Origin Tradeoff Filter (UOOTF) is presented and applied to robust face recognition.

Face Recognition Robust Face Recognition

Paper
Add Code

Quadratic Projection Based Feature Extraction with Its Application to Biometric Recognition

no code implementations • 25 Mar 2016 • Yan Yan, Hanzi Wang, Si Chen, Xiaochun Cao, David Zhang

This paper presents a novel quadratic projection based feature extraction framework, where a set of quadratic matrices is learned to distinguish each class from all other classes.

Paper
Add Code

Multi-Subregion Based Correlation Filter Bank for Robust Face Recognition

no code implementations • 24 Mar 2016 • Yan Yan, Hanzi Wang, David Suter

In this paper, we propose an effective feature extraction algorithm, called Multi-Subregion based Correlation Filter Bank (MS-CFB), for robust face recognition.

Face Recognition Robust Face Recognition

Paper
Add Code

Search by Ideal Candidates: Next Generation of Talent Search at LinkedIn

no code implementations • 26 Feb 2016 • Viet Ha-Thuc, Ye Xu, Satya Pradeep Kanduri, Xianren Wu, Vijay Dialani, Yan Yan, Abhishek Gupta, Shakti Sinha

This new system only needs the searcher to input one or several examples of suitable candidates for the position.

Position

Paper
Add Code

Learning Deep Representations of Appearance and Motion for Anomalous Event Detection

no code implementations • 6 Oct 2015 • Dan Xu, Elisa Ricci, Yan Yan, Jingkuan Song, Nicu Sebe

We present a novel unsupervised deep learning framework for anomalous event detection in complex video scenes.

Anomaly Detection Denoising +1

Paper
Add Code

Efficient Semidefinite Spectral Clustering via Lagrange Duality

no code implementations • 22 Feb 2014 • Yan Yan, Chunhua Shen, Hanzi Wang

constraint for spectral clustering.

Clustering

Paper
Add Code

A Unified Analysis of Stochastic Momentum Methods for Deep Learning

no code implementations • 30 Aug 2018 • Yan Yan, Tianbao Yang, Zhe Li, Qihang Lin, Yi Yang

However, their theoretical analysis of convergence of the training objective and the generalization error for prediction is still under-explored.

Paper
Add Code

Learning Discriminators as Energy Networks in Adversarial Learning

no code implementations • ICLR 2019 • Pingbo Pan, Yan Yan, Tianbao Yang, Yi Yang

In this work, we propose to refine the predictions of structured prediction models by effectively integrating discriminative models into the prediction.

Image Segmentation Multi-Label Classification +2

Paper
Add Code

DSNet: Deep and Shallow Feature Learning for Efficient Visual Tracking

no code implementations • 6 Nov 2018 • Qiangqiang Wu, Yan Yan, Yanjie Liang, Yi Liu, Hanzi Wang

In recent years, Discriminative Correlation Filter (DCF) based tracking methods have achieved great success in visual tracking.

Image Classification Visual Tracking

Paper
Add Code

Stagewise Training Accelerates Convergence of Testing Error Over SGD

no code implementations • NeurIPS 2019 • Zhuoning Yuan, Yan Yan, Rong Jin, Tianbao Yang

For convex loss functions and two classes of "nice-behaviored" non-convex objectives that are close to a convex function, we establish faster convergence of stagewise training than the vanilla SGD under the PL condition on both training error and testing error.

Paper
Add Code

Homotopy Smoothing for Non-Smooth Problems with Lower Complexity than $O(1/\epsilon)$

no code implementations • NeurIPS 2016 • Yi Xu, Yan Yan, Qihang Lin, Tianbao Yang

To the best of our knowledge, this is the lowest iteration complexity achieved so far for the considered non-smooth optimization problems without strong convexity assumption.

Paper
Add Code

Exploit the Unknown Gradually: One-Shot Video-Based Person Re-Identification by Stepwise Learning

no code implementations • CVPR 2018 • Yu Wu, Yutian Lin, Xuanyi Dong, Yan Yan, Wanli Ouyang, Yi Yang

We focus on the one-shot learning for video-based person re-Identification (re-ID).

One-Shot Learning Pedestrian Detection +1

Paper
Add Code

Optimal Graph Learning With Partial Tags and Multiple Features for Image and Video Annotation

no code implementations • CVPR 2015 • Lianli Gao, Jingkuan Song, Feiping Nie, Yan Yan, Nicu Sebe, Heng Tao Shen

In multimedia annotation, due to the time constraints and the tediousness of manual tagging, it is quite common to utilize both tagged and untagged data to improve the performance of supervised learning when only limited tagged training data are available.

graph construction Graph Learning

Recurrent Face Aging

no code implementations • CVPR 2016 • Wei Wang, Zhen Cui, Yan Yan, Jiashi Feng, Shuicheng Yan, Xiangbo Shu, Nicu Sebe

Modeling the aging process of the human face is important for cross-age face verification and recognition.

Face Verification

Weakly Supervised Actor-Action Segmentation via Robust Multi-Task Ranking

no code implementations • CVPR 2017 • Yan Yan, Chenliang Xu, Dawen Cai, Jason J. Corso

However, current methods for detailed understanding of actor and action have significant limitations: they require large amounts of finely labeled data, and they fail to capture any internal relationship among actors and actions.

Action Classification Action Segmentation +2

Localize Me Anywhere, Anytime: A Multi-Task Point-Retrieval Approach

no code implementations • ICCV 2015 • Guoyu Lu, Yan Yan, Li Ren, Jingkuan Song, Nicu Sebe, Chandra Kambhamettu

The main contribution of our paper is that we use a 3D model reconstructed by a short video as the query to realize 3D-to-3D localization under a multi-task point retrieval framework.

Image-Based Localization Multi-Task Learning +1

Stochastic Primal-Dual Algorithms with Faster Convergence than $O(1/\sqrt{T})$ for Problems without Bilinear Structure

no code implementations • 23 Apr 2019 • Yan Yan, Yi Xu, Qihang Lin, Lijun Zhang, Tianbao Yang

The main contribution of this paper is the design and analysis of new stochastic primal-dual algorithms that use a mixture of stochastic gradient updates and a logarithmic number of deterministic dual updates for solving a family of convex-concave problems with no bilinear structure assumed.

Joint Learning of Self-Representation and Indicator for Multi-View Image Clustering

no code implementations • 11 May 2019 • Songsong Wu, Zhiqiang Lu, Hao Tang, Yan Yan, Songhao Zhu, Xiao-Yuan Jing, Zuoyong Li

Multi-view subspace clustering aims to divide a set of multisource data into several groups according to their underlying subspace structure.

Clustering Multi-view Subspace Clustering

Structured Discriminative Tensor Dictionary Learning for Unsupervised Domain Adaptation

no code implementations • 11 May 2019 • Songsong Wu, Yan Yan, Hao Tang, Jianjun Qian, Jian Zhang, Xiao-Yuan Jing

However, the number of labeled source samples is always limited in practice due to expensive annotation costs, which leads to sub-optimal performance.

Dictionary Learning Pseudo Label +1

Expression Conditional GAN for Facial Expression-to-Expression Translation

no code implementations • 14 May 2019 • Hao Tang, Wei Wang, Songsong Wu, Xinya Chen, Dan Xu, Nicu Sebe, Yan Yan

In this paper, we focus on the facial expression translation task and propose a novel Expression Conditional GAN (ECGAN) which can learn the mapping from one image domain to another one based on an additional expression attribute.

Attribute Facial expression generation +2

GazeCorrection: Self-Guided Eye Manipulation in the wild using Self-Supervised Generative Adversarial Networks

no code implementations • arXiv 2019 • Jichao Zhang, Meng Sun, Jingjing Chen, Hao Tang, Yan Yan, Xueying Qin, Nicu Sebe

Gaze correction aims to redirect the person's gaze into the camera by manipulating the eye region, and it can be considered as a specific image resynthesis problem.

Resynthesis

Pattern-Affinitive Propagation across Depth, Surface Normal and Semantic Segmentation

no code implementations • CVPR 2019 • Zhen-Yu Zhang, Zhen Cui, Chunyan Xu, Yan Yan, Nicu Sebe, Jian Yang

In this paper, we propose a novel Pattern-Affinitive Propagation (PAP) framework to jointly predict depth, surface normal and semantic segmentation.

Ranked #51 on Semantic Segmentation on NYU Depth v2

Monocular Depth Estimation Semantic Segmentation

Hallucinated Adversarial Learning for Robust Visual Tracking

no code implementations • 17 Jun 2019 • Qiangqiang Wu, Zhihui Chen, Lin Cheng, Yan Yan, Bo Li, Hanzi Wang

Incorporating such an ability to hallucinate diverse new samples of the tracked instance can help the trackers alleviate the over-fitting problem in the low-data tracking regime.

Visual Tracking

Hierarchical Bayesian Personalized Recommendation: A Case Study and Beyond

no code implementations • 20 Aug 2019 • Zitao Liu, Zhexuan Xu, Yan Yan

Items in modern recommender systems are often organized in hierarchical structures.

Recommendation Systems Variational Inference

Stochastic Optimization for Non-convex Inf-Projection Problems

no code implementations • ICML 2020 • Yan Yan, Yi Xu, Lijun Zhang, Xiaoyu Wang, Tianbao Yang

In this paper, we study a family of non-convex and possibly non-smooth inf-projection minimization problems, where the target objective function is equal to minimization of a joint function over another variable.

Stochastic Optimization

Time-weighted Attentional Session-Aware Recommender System

no code implementations • 12 Sep 2019 • Mei Wang, Weizhi Li, Yan Yan

Session-based Recurrent Neural Networks (RNNs) are gaining increasing popularity for recommendation tasks, due to the high autocorrelation of users' behavior in the latest session and the effectiveness of RNNs in capturing sequence order information.

Collaborative Filtering Recommendation Systems

Adversarial Partial Multi-Label Learning

no code implementations • 15 Sep 2019 • Yan Yan, Yuhong Guo

Partial multi-label learning (PML), which tackles the problem of learning multi-label prediction models from instances with overcomplete noisy annotations, has recently started gaining attention from the research community.

Generative Adversarial Network Multi-Label Learning

Adversarial Paritial Multi-label Learning

no code implementations • ICLR 2020 • Yan Yan, Yuhong Guo

Generative Adversarial Network Multi-Label Learning

Joint Deep Learning of Facial Expression Synthesis and Recognition

no code implementations • 6 Feb 2020 • Yan Yan, Ying Huang, Si Chen, Chunhua Shen, Hanzi Wang

Firstly, a facial expression synthesis generative adversarial network (FESGAN) is pre-trained to generate facial images with different facial expressions.

Facial Expression Recognition Facial Expression Recognition (FER) +1

Adaptive Deep Metric Embeddings for Person Re-Identification under Occlusions

no code implementations • 7 Feb 2020 • Wanxiang Yang, Yan Yan, Si Chen

In this paper, we propose a novel person ReID method, which learns the spatial dependencies between the local regions and extracts the discriminative feature representation of the pedestrian image based on Long Short-Term Memory (LSTM), dealing with the problem of occlusions.

Person Re-Identification

Object-Adaptive LSTM Network for Real-time Visual Tracking with Adversarial Data Augmentation

no code implementations • 7 Feb 2020 • Yihan Du, Yan Yan, Si Chen, Yang Hua

This strategy efficiently filters out some irrelevant proposals and avoids the redundant computation for feature extraction, which enables our method to operate faster than conventional classification-based tracking methods.

Computational Efficiency Data Augmentation +3

Deep Multi-task Multi-label CNN for Effective Facial Attribute Classification

no code implementations • 10 Feb 2020 • Longbiao Mao, Yan Yan, Jing-Hao Xue, Hanzi Wang

Two different network architectures are respectively designed to extract features for two groups of attributes, and a novel dynamic weighting scheme is proposed to automatically assign the loss weight to each facial attribute during training.

Attribute Face Detection +5

Exocentric to Egocentric Image Generation via Parallel Generative Adversarial Network

no code implementations • 8 Feb 2020 • Gaowen Liu, Hao Tang, Hugo Latapie, Yan Yan

In this paper, we investigate exocentric (third-person) view to egocentric (first-person) view image generation.

Generative Adversarial Network Image Generation

Hypergraph Optimization for Multi-structural Geometric Model Fitting

no code implementations • 13 Feb 2020 • Shuyuan Lin, Guobao Xiao, Yan Yan, David Suter, Hanzi Wang

Recently, some hypergraph-based methods have been proposed to deal with the problem of model fitting in computer vision, mainly due to the superior capability of hypergraph to represent the complex relationship between data points.

Clustering

Optimal Epoch Stochastic Gradient Descent Ascent Methods for Min-Max Optimization

no code implementations • NeurIPS 2020 • Yan Yan, Yi Xu, Qihang Lin, Wei Liu, Tianbao Yang

In this paper, we bridge this gap by providing a sharp analysis of epoch-wise stochastic gradient descent ascent method (referred to as Epoch-GDA) for solving strongly convex strongly concave (SCSC) min-max problems, without imposing any additional assumption about smoothness or the function's structure.

LEMMA

Learning Object Scale With Click Supervision for Object Detection

no code implementations • 20 Feb 2020 • Liao Zhang, Yan Yan, Lin Cheng, Hanzi Wang

Finally, we fuse these CAMs together to generate pseudo ground-truths and train a fully-supervised object detector with these ground-truths.

Object object-detection +1

Revisiting SGD with Increasingly Weighted Averaging: Optimization and Generalization Perspectives

no code implementations • 9 Mar 2020 • Zhishuai Guo, Yan Yan, Tianbao Yang

It remains unclear how these averaging schemes affect the convergence of both optimization error and generalization error (two equally important components of testing error) for non-strongly convex objectives, including non-convex problems.

Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes

no code implementations • 11 Mar 2020 • Genshun Dong, Yan Yan, Chunhua Shen, Hanzi Wang

Meanwhile, a Spatial detail-Preserving Network (SPN) with shallow convolutional layers is designed to generate high-resolution feature maps preserving the detailed spatial information.

Image Segmentation Segmentation +2

Incorporating Multiple Cluster Centers for Multi-Label Learning

no code implementations • 17 Apr 2020 • Senlin Shu, Fengmao Lv, Yan Yan, Li Li, Shuo He, Jun He

In this article, we propose to leverage the data augmentation technique to improve the performance of multi-label learning.

Clustering Data Augmentation +1

Multi-Level Generative Models for Partial Label Learning with Non-random Label Noise

no code implementations • 11 May 2020 • Yan Yan, Yuhong Guo

Partial label (PL) learning tackles the problem where each training instance is associated with a set of candidate labels that include both the true label and irrelevant noise labels.

Denoising Partial Label Learning

Fast Objective & Duality Gap Convergence for Non-Convex Strongly-Concave Min-Max Problems with PL Condition

no code implementations • 12 Jun 2020 • Zhishuai Guo, Yan Yan, Zhuoning Yuan, Tianbao Yang

However, most of the existing algorithms are slow in practice, and their analysis revolves around the convergence to a nearly stationary point. We consider leveraging the Polyak-Lojasiewicz (PL) condition to design faster stochastic algorithms with stronger convergence guarantee.

Nearly Optimal Robust Method for Convex Compositional Problems with Heavy-Tailed Noise

no code implementations • 17 Jun 2020 • Yan Yan, Xin Man, Tianbao Yang

In this paper, we propose robust stochastic algorithms for solving convex compositional problems of the form $f(\mathbb{E}_\xi g(\cdot; \xi)) + r(\cdot)$ by establishing sub-Gaussian confidence bounds under weak assumptions about the tails of noise distribution, i.e., heavy-tailed noise with bounded second-order moments.

Correlation filter tracking with adaptive proposal selection for accurate scale estimation

no code implementations • 14 Jul 2020 • Luo Xiong, Yanjie Liang, Yan Yan, Hanzi Wang

In this paper, we propose an adaptive proposal selection algorithm which can generate a small number of high-quality proposals to handle the problem of scale variations for visual object tracking.

Visual Object Tracking

Audio-Visual Event Localization via Recursive Fusion by Joint Co-Attention

no code implementations • 14 Aug 2020 • Bin Duan, Hao Tang, Wei Wang, Ziliang Zong, Guowei Yang, Yan Yan

Recent works have shown that attention mechanism is beneficial to the fusion process.

audio-visual event localization valid

Hierarchical HMM for Eye Movement Classification

no code implementations • 18 Aug 2020 • Ye Zhu, Yan Yan, Oleg Komogortsev

In this work, we tackle the problem of ternary eye movement classification, which aims to separate fixations, saccades and smooth pursuits from the raw eye positional data.

Classification General Classification

Revisiting Optical Flow Estimation in 360 Videos

no code implementations • 15 Oct 2020 • Keshav Bhandari, Ziliang Zong, Yan Yan

Second, we refine the network by training with augmented data in a supervised manner.

Data Augmentation Domain Adaptation +1

Egok360: A 360 Egocentric Kinetic Human Activity Video Dataset

no code implementations • 15 Oct 2020 • Keshav Bhandari, Mario A. DeLaGarza, Ziliang Zong, Hugo Latapie, Yan Yan

To bridge this gap, in this paper we propose a novel Egocentric (first-person) 360$^\circ$ Kinetic human activity video dataset (EgoK360).

Egocentric Activity Recognition Video Understanding

Robust Visual Tracking via Statistical Positive Sample Generation and Gradient Aware Learning

no code implementations • 9 Nov 2020 • Lijian Lin, Haosheng Chen, Yanjie Liang, Yan Yan, Hanzi Wang

In this paper, we propose a robust tracking method via Statistical Positive sample generation and Gradient Aware learning (SPGA) to address the above two limitations.

Visual Tracking

Photometric and Spectroscopic Study of Flares on Ross 15

no code implementations • 15 Sep 2020 • Jian-Ying Bai, Ali Esamdin, Xing Gao, Yan Yan, Juan-Juan Ren

We conducted photometric and spectroscopic observations for Ross 15 in order to further study the flare properties of this less observed flare star.

Solar and Stellar Astrophysics High Energy Astrophysical Phenomena

How does stock market reflect the change in economic demand? A study on the industry-specific volatility spillover networks of China's stock market during the outbreak of COVID-19

no code implementations • 15 Jul 2020 • Fu Qiao, Yan Yan

At the beginning of the outbreak of COVID-19, in China's stock market, spillover effects from industry indices of sectors meeting the investment demand to those meeting the consumption demands rose significantly.

Hierarchical Representation via Message Propagation for Robust Model Fitting

no code implementations • 29 Dec 2020 • Shuyuan Lin, Xing Wang, Guobao Xiao, Yan Yan, Hanzi Wang

In this paper, we propose a novel hierarchical representation via message propagation (HRMP) method for robust model fitting, which simultaneously takes advantage of both the consensus analysis and the preference analysis to estimate the parameters of multiple model instances from data corrupted by outliers.

Learning Audio-Visual Correlations from Variational Cross-Modal Generation

no code implementations • 5 Feb 2021 • Ye Zhu, Yu Wu, Hugo Latapie, Yi Yang, Yan Yan

People can easily imagine the potential sound while seeing an event.

Retrieval

A Metamodel and Framework for Artificial General Intelligence From Theory to Practice

no code implementations • 11 Feb 2021 • Hugo Latapie, Ozkan Kilic, Gaowen Liu, Yan Yan, Ramana Kompella, Pei Wang, Kristinn R. Thorisson, Adam Lawrence, Yuhong Sun, Jayanth Srinivasa

This paper introduces a new metamodel-based knowledge representation that significantly improves autonomous learning and adaptation.

BIG-bench Machine Learning Federated Learning +4

Feature Decomposition and Reconstruction Learning for Effective Facial Expression Recognition

no code implementations • CVPR 2021 • Delian Ruan, Yan Yan, Shenqi Lai, Zhenhua Chai, Chunhua Shen, Hanzi Wang

In this paper, we propose a novel Feature Decomposition and Reconstruction Learning (FDRL) method for effective facial expression recognition.

Facial Expression Recognition Facial Expression Recognition (FER) +1

Learning Spatial-Semantic Relationship for Facial Attribute Recognition With Limited Labeled Data

no code implementations • CVPR 2021 • Ying Shu, Yan Yan, Si Chen, Jing-Hao Xue, Chunhua Shen, Hanzi Wang

First, three auxiliary tasks, consisting of a Patch Rotation Task (PRT), a Patch Segmentation Task (PST), and a Patch Classification Task (PCT), are jointly developed to learn the spatial-semantic relationship from large-scale unlabeled facial data.

Ranked #3 on Facial Attribute Classification on LFWA

Attribute Facial Attribute Classification +1

Cross-View Exocentric to Egocentric Video Synthesis

no code implementations • 7 Jul 2021 • Gaowen Liu, Hao Tang, Hugo Latapie, Jason Corso, Yan Yan

Particularly, we propose a novel Bi-directional Spatial Temporal Attention Fusion Generative Adversarial Network (STA-GAN) to learn both spatial and temporal information to generate egocentric video sequences from the exocentric view.

Generative Adversarial Network Video Generation

Lipschitz Continuity Guided Knowledge Distillation

no code implementations • ICCV 2021 • Yuzhang Shang, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan

Knowledge distillation has become one of the most important model compression techniques by distilling knowledge from larger teacher networks to smaller student ones.

Knowledge Distillation Model Compression +2

Contrastive Mutual Information Maximization for Binary Neural Networks

no code implementations • 29 Sep 2021 • Yuzhang Shang, Dan Xu, Ziliang Zong, Liqiang Nie, Yan Yan

Neural network binarization accelerates deep models by quantizing their weights and activations into 1-bit.

Binarization Contrastive Learning +2

Measure Twice, Cut Once: Quantifying Bias and Fairness in Deep Neural Networks

no code implementations • 8 Oct 2021 • Cody Blakeney, Gentry Atkinson, Nathaniel Huish, Yan Yan, Vangelis Metris, Ziliang Zong

Algorithmic bias is of increasing concern, both to the research community, and society at large.

Fairness Multi-class Classification

Event Data Association via Robust Model Fitting for Event-based Object Tracking

no code implementations • 25 Oct 2021 • Haosheng Chen, Shuyuan Lin, Yan Yan, Hanzi Wang, Xinbo Gao

In EDA, we first asynchronously fuse the event data based on its information entropy.

Model Selection Object Tracking

When Facial Expression Recognition Meets Few-Shot Learning: A Joint and Alternate Learning Framework

no code implementations • 18 Jan 2022 • Xinyi Zou, Yan Yan, Jing-Hao Xue, Si Chen, Hanzi Wang

To alleviate the problem of limited base classes in our FER task, we propose a novel Emotion Guided Similarity Network (EGS-Net), consisting of an emotion branch and a similarity branch, based on a two-stage learning framework.

cross-domain few-shot learning Facial Expression Recognition +1

Win the Lottery Ticket via Fourier Analysis: Frequencies Guided Network Pruning

no code implementations • 30 Jan 2022 • Yuzhang Shang, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan

Extensive experiments on CIFAR-10 and CIFAR-100 demonstrate the superiority of our novel Fourier analysis based MBP compared to other traditional MBP algorithms.

Knowledge Distillation Network Pruning

Skeleton Sequence and RGB Frame Based Multi-Modality Feature Fusion Network for Action Recognition

no code implementations • 23 Feb 2022 • Xiaoguang Zhu, Ye Zhu, Haoyu Wang, Honglin Wen, Yan Yan, Peilin Liu

To solve the problem, we propose a multi-modality feature fusion network to combine the modalities of the skeleton sequence and RGB frame instead of the RGB video, as the key information contained by the combination of skeleton sequence and RGB frame is close to that of the skeleton sequence and RGB video.

Action Recognition

Deep Multi-Branch Aggregation Network for Real-Time Semantic Segmentation in Street Scenes

no code implementations • 8 Mar 2022 • Xi Weng, Yan Yan, Genshun Dong, Chang Shu, Biao Wang, Hanzi Wang, Ji Zhang

This shows that DMA-Net provides a good tradeoff between segmentation quality and speed for semantic segmentation in street scenes.

Real-Time Semantic Segmentation Segmentation

Stage-Aware Feature Alignment Network for Real-Time Semantic Segmentation of Street Scenes

no code implementations • 8 Mar 2022 • Xi Weng, Yan Yan, Si Chen, Jing-Hao Xue, Hanzi Wang

In this paper, we present a novel Stage-aware Feature Alignment Network (SFANet) based on the encoder-decoder structure for real-time semantic segmentation of street scenes.

Real-Time Semantic Segmentation Segmentation

Topological EEG Nonlinear Dynamics Analysis for Emotion Recognition

no code implementations • 14 Mar 2022 • Yan Yan, Xuankun Wu, Chengdong Li, Yini He, Zhicheng Zhang, Huihui Li, Ang Li, Lei Wang

The proposed work is the first investigation in the emotion recognition oriented EEG topological feature analysis, which brought a novel insight into the brain neural system nonlinear dynamics analysis and feature extraction.

Arousal Estimation Dominance Estimation +6

Deep Transfer Learning with Graph Neural Network for Sensor-Based Human Activity Recognition

no code implementations • 14 Mar 2022 • Yan Yan, Tianzheng Liao, Jinjin Zhao, Jiahong Wang, Liang Ma, Wei Lv, Jing Xiong, Lei Wang

Given this observation, we devised a graph-inspired deep learning approach toward the sensor-based HAR tasks, which was further used to build a deep transfer learning model toward giving a tentative solution for these two challenging problems.

Few-Shot Learning Human Activity Recognition +1

HiTPR: Hierarchical Transformer for Place Recognition in Point Cloud

no code implementations • 12 Apr 2022 • Zhixing Hou, Yan Yan, Chengzhong Xu, Hui Kong

In the SRT, we extract the local feature for each point cell.

Ranked #18 on Point Cloud Retrieval on Oxford RobotCar (LiDAR 4096 points)

Loop Closure Detection Point Cloud Retrieval

A Proposal-Based Paradigm for Self-Supervised Sound Source Localization in Videos

no code implementations • CVPR 2022 • Hanyu Xuan, Zhiliang Wu, Jian Yang, Yan Yan, Xavier Alameda-Pineda

Humans can easily recognize where and how the sound is produced via watching a scene and listening to corresponding audio cues.

Multiple Instance Learning

Training Robust Deep Models for Time-Series Domain: Novel Algorithms and Theoretical Analysis

1 code implementation • 9 Jul 2022 • Taha Belkhouja, Yan Yan, Janardhan Rao Doppa

Despite the success of deep neural networks (DNNs) for real-world applications over time-series data such as mobile health, little is known about how to train robust DNNs for time-series domain due to its unique characteristics compared to images and text data.

Data Augmentation Dynamic Time Warping +3

MLP-GAN for Brain Vessel Image Segmentation

no code implementations • 17 Jul 2022 • Bin Xie, Hao Tang, Bin Duan, Dawen Cai, Yan Yan

Brain vessel image segmentation can be used as a promising biomarker for better prevention and treatment of different diseases.

Generative Adversarial Network Image Segmentation +2

Visual Perturbation-aware Collaborative Learning for Overcoming the Language Prior Problem

no code implementations • 24 Jul 2022 • Yudong Han, Liqiang Nie, Jianhua Yin, Jianlong Wu, Yan Yan

Several studies have recently pointed out that existing Visual Question Answering (VQA) models suffer heavily from the language prior problem, which refers to capturing superficial statistical correlations between the question type and the answer while ignoring the image contents.

Question Answering Visual Question Answering

Learning Omnidirectional Flow in 360-degree Video via Siamese Representation

no code implementations • 7 Aug 2022 • Keshav Bhandari, Bin Duan, Gaowen Liu, Hugo Latapie, Ziliang Zong, Yan Yan

Optical flow estimation in omnidirectional videos faces two significant issues: the lack of benchmark datasets and the challenge of adapting perspective video-based methods to accommodate the omnidirectional nature.

Optical Flow Estimation Representation Learning

Semi-Supervised Video Inpainting with Cycle Consistency Constraints

no code implementations • CVPR 2023 • Zhiliang Wu, Hanyu Xuan, Changchang Sun, Kang Zhang, Yan Yan

Specifically, in this work, we propose an end-to-end trainable framework consisting of completion network and mask prediction network, which are designed to generate corrupted contents of the current frame using the known mask and decide the regions to be filled of the next frame, respectively.

Video Inpainting

Progressive Cross-modal Knowledge Distillation for Human Action Recognition

no code implementations • 17 Aug 2022 • Jianyuan Ni, Anne H. H. Ngu, Yan Yan

However, the accuracy of wearable sensor-based HAR still lags far behind that of systems based on visual modalities (i.e., RGB video, skeleton, and depth).

Action Recognition Knowledge Distillation +3

DPTNet: A Dual-Path Transformer Architecture for Scene Text Detection

no code implementations • 21 Aug 2022 • Jingyu Lin, Jie Jiang, Yan Yan, Chunchao Guo, Hongfa Wang, Wei Liu, Hanzi Wang

We further propose a parallel design that integrates the convolutional network with a powerful self-attention mechanism to provide complementary clues between the attention path and convolutional path.

Scene Text Detection Text Detection

Vision+X: A Survey on Multimodal Learning in the Light of Data

no code implementations • 5 Oct 2022 • Ye Zhu, Yu Wu, Nicu Sebe, Yan Yan

We are perceiving and communicating with the world in a multisensory manner, where different information sources are sophisticatedly processed and interpreted by separate parts of the human brain to constitute a complex, yet harmonious and unified sensing system.

Representation Learning

Few-shot Medical Image Segmentation with Cycle-resemblance Attention

no code implementations • 7 Dec 2022 • Hao Ding, Changchang Sun, Hao Tang, Dawen Cai, Yan Yan

Recently, due to the increasing requirements of medical imaging applications and the professional requirements of annotating medical images, few-shot learning has gained increasing attention in the medical image semantic segmentation field.

Few-Shot Learning Image Segmentation +4

Optical Flow Estimation in 360$^\circ$ Videos: Dataset, Model and Application

no code implementations • 27 Jan 2023 • Bin Duan, Keshav Bhandari, Gaowen Liu, Yan Yan

Moreover, we present a novel Siamese representation Learning framework for Omnidirectional Flow (SLOF) estimation, which is trained in a contrastive manner via a hybrid loss that combines siamese contrastive and optical flow losses.

Egocentric Activity Recognition Optical Flow Estimation +1

BPT: Binary Point Cloud Transformer for Place Recognition

no code implementations • 2 Mar 2023 • Zhixing Hou, Yuzhang Shang, Tian Gao, Yan Yan

To solve this issue, we propose a binary point cloud transformer for place recognition.

MRCN: A Novel Modality Restitution and Compensation Network for Visible-Infrared Person Re-identification

no code implementations • 26 Mar 2023 • Yukang Zhang, Yan Yan, Jie Li, Hanzi Wang

Furthermore, to better disentangle the modality-relevant features and the modality-irrelevant features, we propose a novel Center-Quadruplet Causal (CQC) loss to encourage the network to effectively learn the modality-relevant features and the modality-irrelevant features.

Person Re-Identification

Deep Stereo Video Inpainting

no code implementations • CVPR 2023 • Zhiliang Wu, Changchang Sun, Hanyu Xuan, Yan Yan

Stereo video inpainting aims to fill the missing regions on the left and right views of the stereo video with plausible content simultaneously.

Video Inpainting

A Decision Making Framework for Recommended Maintenance of Road Segments

no code implementations • 19 Jul 2023 • Haoyu Sun, Yan Yan

Due to limited budgets allocated for road maintenance projects in various countries, road management departments face difficulties in making scientific maintenance decisions.

Decision Making Management

Probabilistically robust conformal prediction

no code implementations • 31 Jul 2023 • Subhankar Ghosh, Yuanjie Shi, Taha Belkhouja, Yan Yan, Jana Doppa, Brian Jones

We propose a novel adaptive PRCP (aPRCP) algorithm to achieve probabilistically robust coverage.

Conformal Prediction

The Compatibility between the Pangu Weather Forecasting Model and Meteorological Operational Data

no code implementations • 7 Aug 2023 • Wencong Cheng, Yan Yan, Jiangjiang Xia, Qi Liu, Chang Qu, Zhigang Wang

Recently, multiple data-driven models based on machine learning for weather forecasting have emerged.

Weather Forecasting

Unseen Image Synthesis with Diffusion Models

no code implementations • 13 Oct 2023 • Ye Zhu, Yu Wu, Zhiwei Deng, Olga Russakovsky, Yan Yan

While the current trend in the generative field is scaling up towards larger models and more training data for generalized domain representations, we go the opposite direction in this work by synthesizing unseen domain images without additional training.

Denoising Image Generation

Frequency Domain Nuances Mining for Visible-Infrared Person Re-identification

no code implementations • 4 Jan 2024 • Yukang Zhang, Yang Lu, Yan Yan, Hanzi Wang, Xuelong Li

Specifically, we propose a novel Frequency Domain Nuances Mining (FDNM) method to explore the cross-modality frequency domain information, which mainly includes an amplitude guided phase (AGP) module and an amplitude nuances mining (ANM) module.

Face Recognition Person Re-Identification

Online Multi-spectral Neuron Tracing

no code implementations • 10 Mar 2024 • Bin Duan, Yuzhang Shang, Dawen Cai, Yan Yan

In this paper, we propose an online multi-spectral neuron tracing method with uniquely designed modules, where no offline training is required.

FBPT: A Fully Binary Point Transformer

no code implementations • 15 Mar 2024 • Zhixing Hou, Yuzhang Shang, Yan Yan

This paper presents a novel Fully Binary Point Cloud Transformer (FBPT) model which has the potential to be widely applied and expanded in the fields of robotics and mobile devices.

Binarization Point Cloud Classification

Token Transformation Matters: Towards Faithful Post-hoc Explanation for Vision Transformer

no code implementations • 21 Mar 2024 • Junyi Wu, Bin Duan, Weitai Kang, Hao Tang, Yan Yan

To incorporate the influence of token transformation into interpretation, we propose TokenTM, a novel post-hoc explanation method that utilizes our introduced measurement of token transformation effects.

MaskSAM: Towards Auto-prompt SAM with Mask Classification for Medical Image Segmentation

no code implementations • 21 Mar 2024 • Bin Xie, Hao Tang, Bin Duan, Dawen Cai, Yan Yan

Each pair of auxiliary mask and box prompts, which can solve the requirements of extra prompts, is associated with class label predictions by the sum of the auxiliary classifier token and the learnable global classifier tokens in the mask decoder of SAM to solve the predictions of semantic labels.

Image Segmentation Medical Image Segmentation +2

LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models

no code implementations • 22 Mar 2024 • Yuzhang Shang, Mu Cai, Bingxin Xu, Yong Jae Lee, Yan Yan

Based on this, we propose PruMerge, a novel adaptive visual token reduction approach, which largely reduces the number of visual tokens while maintaining comparable model performance.

Language Modelling Large Language Model +3

On the Faithfulness of Vision Transformer Explanations

no code implementations • 1 Apr 2024 • Junyi Wu, Weitai Kang, Hao Tang, Yuan Hong, Yan Yan

In contrast, our proposed SaCo offers a reliable faithfulness measurement, establishing a robust metric for interpretations.

Versatile Navigation under Partial Observability via Value-guided Diffusion Policy

no code implementations • 1 Apr 2024 • Gengyu Zhang, Hao Tang, Yan Yan

To address these deficiencies, we propose a versatile diffusion-based approach for both 2D and 3D route planning under partial observability.

Autonomous Driving Semantic Segmentation
