no code implementations • 2 Feb 2025 • Bin Xie, Hao Tang, Dawen Cai, Yan Yan, Gady Agam
We design a multi-scale prompt generator combined with the image encoder in SAM to generate auxiliary masks.
no code implementations • 3 Jan 2025 • Hu Ding, Yan Yan, Yang Lu, Jing-Hao Xue, Hanzi Wang
This indicates the superiority of hypergraph modeling for uncertainty estimation and label refinement on the personalized federated FER task.
Facial Expression Recognition
Facial Expression Recognition (FER)
1 code implementation • 3 Jan 2025 • Ruikang Chen, Yan Yan, Jing-Hao Xue, Yang Lu, Hanzi Wang
However, obtaining correct annotations is extremely hard if not impossible for large-scale X-ray images, where item overlapping is ubiquitous. As a result, X-ray images are easily contaminated with noisy annotations, leading to performance deterioration of existing methods. In this paper, we address the challenging problem of training a robust prohibited item detector under noisy annotations (including both category noise and bounding box noise) from a novel perspective of data augmentation, and propose an effective label-aware mixed patch paste augmentation method (Mix-Paste).
1 code implementation • 21 Dec 2024 • Changchang Sun, Ren Wang, Yihua Zhang, Jinghan Jia, Jiancheng Liu, Gaowen Liu, Sijia Liu, Yan Yan
Machine unlearning (MU), which seeks to erase the influence of specific unwanted data from already-trained models, is becoming increasingly vital in model editing, particularly to comply with evolving data regulations like the "right to be forgotten".
no code implementations • 18 Dec 2024 • Zhihang Yuan, Yuzhang Shang, Hanling Zhang, Tongcheng Fang, Rui Xie, Bingxin Xu, Yan Yan, Shengen Yan, Guohao Dai, Yu Wang
Our approach not only enhances computational efficiency but also aligns naturally with image generation principles by operating in continuous token space and following a hierarchical generation process from coarse to fine details.
no code implementations • 18 Dec 2024 • Hanbin Hong, Shenao Yan, Shuya Feng, Yan Yan, Yuan Hong
Active Learning (AL) represents a crucial methodology within machine learning, emphasizing the identification and utilization of the most informative samples for efficient model training.
no code implementations • 10 Dec 2024 • Bo-Wen Zhang, Yan Yan, Boxiang Yang, Yifei Xue, Guang Liu
While scaling laws optimize training configurations for large language models (LLMs) through experiments on smaller or early-stage models, they fail to predict emergent abilities due to the absence of such capabilities in these models.
no code implementations • 5 Dec 2024 • Sanjoeng Wong, Yan Yan
Meanwhile, the inter-OL module introduces the wavelet decomposition-based adversarial learning block and the objectness block, effectively reducing the modality discrepancy and transferring the objectness knowledge learned from natural images with box annotations to X-ray images.
no code implementations • 1 Dec 2024 • Yan Yan, Yuanchi Ma
Finally, we employ a large model to generate summaries of each plot segment and produce the overall outline.
no code implementations • 23 Nov 2024 • Bingxin Xu, Yuzhang Shang, Yunhao Ge, Qian Lou, Yan Yan
Large Multimodal Models (LMMs) have demonstrated impressive capabilities in visual-language tasks but face significant deployment challenges due to their high computational demands.
no code implementations • 18 Nov 2024 • Hanyu Guo, Wanchuan Yu, Suzhou Que, Kaiwen Du, Yan Yan, Hanzi Wang
In this paper, we propose a novel Dual Motion-Guided Attention Learning method (called DMGAL) for few-shot action recognition, aiming to learn the spatio-temporal relationships from the video-specific to the task-specific level.
no code implementations • 3 Nov 2024 • Kaiang Wen, Bin Xie, Bin Duan, Yan Yan
The Mamba-based architecture seamlessly integrates the local feature extraction power of convolutional layers with the long-range dependency modeling capabilities of Mamba.
1 code implementation • 30 Sep 2024 • Weitai Kang, Haifeng Huang, Yuzhang Shang, Mubarak Shah, Yan Yan
RIG generates two key types of instruction data: 1) the Adversarial Instruction-following data, which features mixed negative and positive samples to enhance the model's discriminative understanding.
no code implementations • 19 Sep 2024 • Yuzhang Shang, Bingxin Xu, Weitai Kang, Mu Cai, Yuheng Li, Zehao Wen, Zhen Dong, Kurt Keutzer, Yong Jae Lee, Yan Yan
In this paper, we first identify the primary challenges in interpolating Video-LLMs: (1) the video encoder and modality alignment projector are fixed, preventing the integration of additional frames into Video-LLMs, and (2) the LLM backbone is limited in its content length capabilities, which complicates the processing of an increased number of video tokens.
no code implementations • 5 Sep 2024 • Qianlong Xiang, Miao Zhang, Yuzhang Shang, Jianlong Wu, Yan Yan, Liqiang Nie
Furthermore, considering that the source data is either inaccessible or too enormous to store for current generative models, we introduce a new paradigm for their distillation without source data, termed Data-Free Knowledge Distillation for Diffusion Models (DKDM).
1 code implementation • 24 Aug 2024 • Zhenghao Zhao, Haoxuan Wang, Yuzhang Shang, Kai Wang, Yan Yan
It reduces the distance between the student and the biased expert trajectories and prevents the tail class bias from being distilled to the synthetic dataset.
1 code implementation • 9 Jul 2024 • Zhenghao Zhao, Yuzhang Shang, Junyi Wu, Yan Yan
In addition, we introduce a novel pipeline for dataset quantization, utilizing feature space from the final stage of dataset quantization to generate more precise dataset bins.
no code implementations • 3 Jul 2024 • Weitai Kang, Mengxue Qu, Yunchao Wei, Yan Yan
Building upon this, ACTRESS consists of an active sampling strategy and a selective retraining strategy.
1 code implementation • 3 Jul 2024 • Weitai Kang, Gaowen Liu, Mubarak Shah, Yan Yan
Specifically, we propose the Multi-layer Multi-task Encoder-Decoder as the target grounding stage, where we learn a regression query and multiple segmentation queries to ground the target by regression and segmentation of the box in each decoding layer, respectively.
no code implementations • 3 Jul 2024 • Weitai Kang, Luowei Zhou, Junyi Wu, Changchang Sun, Yan Yan
Building upon this, we further propose a novel framework named Attention-Driven Constraint Balancing (AttBalance) to optimize the behavior of visual features within language-relevant regions.
1 code implementation • 10 Jun 2024 • Yuanjie Shi, Subhankar Ghosh, Taha Belkhouja, Janardhan Rao Doppa, Yan Yan
In contrast to the standard class-conditional CP (CCP) method that uniformly thresholds the class-wise conformity score for each class, the augmented label rank calibration step allows RC3P to selectively iterate this class-wise thresholding subroutine only for a subset of classes whose class-wise top-k error is small.
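As a rough illustration of the class-wise thresholding that the abstract refers to, the following is a minimal sketch of the standard class-conditional CP baseline only; it does not implement RC3P or its label rank calibration step. The function names, the nonconformity-score convention (higher means less conforming), and the fallback to an infinite threshold for classes with too few calibration points are assumptions made for this example.

```python
import math


def class_thresholds(cal_scores, cal_labels, alpha):
    """Compute one conformal threshold per class from calibration data.

    cal_scores[i] is the nonconformity score of calibration example i,
    evaluated at its true label cal_labels[i] (higher = less conforming).
    """
    by_class = {}
    for score, label in zip(cal_scores, cal_labels):
        by_class.setdefault(label, []).append(score)

    thresholds = {}
    for label, scores in by_class.items():
        scores.sort()
        n = len(scores)
        # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
        k = math.ceil((n + 1) * (1 - alpha))
        # With too few calibration points the rank exceeds n; fall back to
        # an infinite threshold so the class is always included.
        thresholds[label] = scores[k - 1] if k <= n else math.inf
    return thresholds


def prediction_set(scores_per_class, thresholds):
    """Return every class whose score does not exceed its own threshold."""
    return [
        label
        for label, score in scores_per_class.items()
        if score <= thresholds.get(label, math.inf)
    ]
```

Because each class gets its own threshold, classes with few calibration examples tend to receive loose thresholds and therefore inflate the prediction sets, which is the kind of cost a selective scheme would try to reduce.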
no code implementations • 31 May 2024 • Mohammed Amine Gharsallaoui, Bhupinderjeet Singh, Supriya Savalkar, Aryan Deshwal, Yan Yan, Ananth Kalyanaraman, Kirti Rajagopalan, Janardhan Rao Doppa
Predicting the spatiotemporal variation in streamflow along with uncertainty quantification enables decision-making for sustainable management of scarce water resources.
no code implementations • 29 May 2024 • Qin Yang, Meisam Mohammad, Han Wang, Ali Payani, Ashish Kundu, Kai Shu, Yan Yan, Yuan Hong
To address such limitations, we propose a novel Language Model-based Optimal Differential Privacy (LMO-DP) mechanism, which takes the first step to enable the tight composition of accurately fine-tuning (large) language models with a sub-optimal DP mechanism, even in strong privacy regimes (e.g., $0.1 \leq \epsilon < 3$).
no code implementations • 28 May 2024 • Weitai Kang, Mengxue Qu, Jyoti Kini, Yunchao Wei, Mubarak Shah, Yan Yan
To achieve detection based on human intention, it relies on humans to observe the scene, reason out the target that aligns with their intention ("pillow" in this case), and finally provide a reference to the AI system, such as "A pillow on the couch".
1 code implementation • 25 May 2024 • Junyi Wu, Haoxuan Wang, Yuzhang Shang, Mubarak Shah, Yan Yan
SSC extends this approach by dynamically adjusting the balanced salience to capture the temporal variations in activation.
no code implementations • CVPR 2024 • Yuzhang Shang, Dan Xu, Gaowen Liu, Ramana Rao Kompella, Yan Yan
Moreover, we introduce a knowledge distillation mechanism to correct the direction of information flow in backward propagation.
1 code implementation • 14 May 2024 • Ziquan Liu, Yufei Cui, Yan Yan, Yi Xu, Xiangyang Ji, Xue Liu, Antoni B. Chan
In safety-critical applications such as medical imaging and autonomous driving, where decisions have profound implications for patient health and road safety, it is imperative to maintain both high adversarial robustness to protect against potential adversarial attacks and reliable uncertainty quantification in decision-making.
no code implementations • 14 Apr 2024 • Jianyuan Ni, Hao Tang, Syed Tousiful Haque, Yan Yan, Anne H. H. Ngu
We begin by presenting the recent sensor modalities as well as deep learning approaches in HAR.
no code implementations • CVPR 2024 • Gengyu Zhang, Hao Tang, Yan Yan
To address these deficiencies, we propose a versatile diffusion-based approach for both 2D and 3D route planning under partial observability.
no code implementations • CVPR 2024 • Junyi Wu, Weitai Kang, Hao Tang, Yuan Hong, Yan Yan
In contrast, our proposed SaCo offers a reliable faithfulness measurement, establishing a robust metric for interpretations.
1 code implementation • 22 Mar 2024 • Yuzhang Shang, Mu Cai, Bingxin Xu, Yong Jae Lee, Yan Yan
In response, we propose PruMerge, a novel adaptive visual token reduction strategy that significantly reduces the number of visual tokens without compromising the performance of LMMs.
no code implementations • 21 Mar 2024 • Bin Xie, Hao Tang, Bin Duan, Dawen Cai, Yan Yan
Each pair of auxiliary mask and box prompts, which can solve the requirements of extra prompts, is associated with class label predictions by the sum of the auxiliary classifier token and the learnable global classifier tokens in the mask decoder of SAM to solve the predictions of semantic labels.
no code implementations • CVPR 2024 • Junyi Wu, Bin Duan, Weitai Kang, Hao Tang, Yan Yan
To incorporate the influence of token transformation into interpretation, we propose TokenTM, a novel post-hoc explanation method that utilizes our introduced measurement of token transformation effects.
no code implementations • 15 Mar 2024 • Zhixing Hou, Yuzhang Shang, Yan Yan
This paper presents a novel Fully Binary Point Cloud Transformer (FBPT) model which has the potential to be widely applied and expanded in the fields of robotics and mobile devices.
no code implementations • 10 Mar 2024 • Bin Duan, Yuzhang Shang, Dawen Cai, Yan Yan
In this paper, we propose an online multi-spectral neuron tracing method with uniquely designed modules, where no offline training is required.
2 code implementations • 26 Feb 2024 • Zhihang Yuan, Yuzhang Shang, Yang Zhou, Zhen Dong, Zhe Zhou, Chenhao Xue, Bingzhe Wu, Zhikai Li, Qingyi Gu, Yong Jae Lee, Yan Yan, Beidi Chen, Guangyu Sun, Kurt Keutzer
Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on the roofline model for systematic analysis of LLM inference techniques.
1 code implementation • 6 Feb 2024 • Haoxuan Wang, Yuzhang Shang, Zhihang Yuan, Junyi Wu, Junchi Yan, Yan Yan
We empirically verify that our approach modifies the activation distribution and provides meaningful temporal information, facilitating easier and more accurate quantization.
no code implementations • 4 Jan 2024 • Yukang Zhang, Yang Lu, Yan Yan, Hanzi Wang, Xuelong Li
Specifically, we propose a novel Frequency Domain Nuances Mining (FDNM) method to explore the cross-modality frequency domain information, which mainly includes an amplitude guided phase (AGP) module and an amplitude nuances mining (ANM) module.
no code implementations • CVPR 2024 • Yuzhang Shang, Gaowen Liu, Ramana Rao Kompella, Yan Yan
We aim to calibrate the quantized activations by maximizing the mutual information between the pre- and post-quantized activations.
1 code implementation • CVPR 2024 • Yuxuan Zhou, Xudong Yan, Zhi-Qi Cheng, Yan Yan, Qi Dai, Xian-Sheng Hua
To remedy this we propose a two-fold strategy: (1) We introduce an innovative approach that encodes bone connectivity by harnessing the power of graph distances to describe the physical topology; we further incorporate action-specific topological representation via persistent homology analysis to depict systemic dynamics.
Ranked #6 on Skeleton Based Action Recognition on NTU RGB+D 120
1 code implementation • NeurIPS 2023 • Yuzhang Shang, Zhihang Yuan, Yan Yan
Thus, we introduce mutual information (MI) as the metric to quantify the shared information between the synthetic and the real datasets, and devise MIM4DD numerically maximizing the MI via a newly designed optimizable objective within a contrastive learning framework to update the synthetic dataset.
1 code implementation • 13 Dec 2023 • Liuxiang Qiu, Si Chen, Yan Yan, Jing-Hao Xue, Da-Han Wang, Shunzhi Zhu
Existing VI-ReID methods ignore high-order structure information of features and struggle to learn a reasonable common feature space because of the large modality discrepancy between VIS and IR images.
1 code implementation • 12 Dec 2023 • Ziqiang Zhang, Yan Yan, Jing-Hao Xue, Hanzi Wang
SDIC follows a "compensate-and-edit" paradigm and successfully bridges the gap in image details between the original image and the reconstructed/edited image.
1 code implementation • 10 Dec 2023 • Zhihang Yuan, Yuzhang Shang, Yue Song, Qiang Wu, Yan Yan, Guangyu Sun
Based on the success of the low-rank decomposition of projection matrices in the self-attention module, we further introduce ASVD to compress the KV cache.
no code implementations • 13 Oct 2023 • Ye Zhu, Yu Wu, Duo Xu, Zhiwei Deng, Yan Yan, Olga Russakovsky
In this work, we study the generalization properties of diffusion models in a few-shot setup, introduce a novel tuning-free paradigm to synthesize the target out-of-domain (OOD) data, and demonstrate its advantages compared to existing methods in data-sparse scenarios with large domain gaps.
1 code implementation • ICCV 2023 • Yuzhang Shang, Bingxin Xu, Gaowen Liu, Ramana Kompella, Yan Yan
Inspired by the causal understanding, we propose the Causality-guided Data-free Network Quantization method, Causal-DFQ, to eliminate the reliance on data via approaching an equilibrium of causality-driven intervened distributions.
1 code implementation • 4 Sep 2023 • Yuheng Shi, Zehao Huang, Yan Yan, Naiyan Wang, Xiaojie Guo
Time-to-Contact (TTC) estimation is a critical task for assessing collision risk and is widely used in various driver assistance and autonomous driving systems.
no code implementations • 7 Aug 2023 • Wencong Cheng, Yan Yan, Jiangjiang Xia, Qi Liu, Chang Qu, Zhigang Wang
Recently, multiple data-driven models based on machine learning for weather forecasting have emerged.
no code implementations • 31 Jul 2023 • Subhankar Ghosh, Yuanjie Shi, Taha Belkhouja, Yan Yan, Jana Doppa, Brian Jones
We propose a novel adaptive PRCP (aPRCP) algorithm to achieve probabilistically robust coverage.
1 code implementation • ICCV 2023 • Bin Duan, Ming Zhong, Yan Yan
Moreover, we derive a set of theoretical guarantees for our sanity-checked image registration method, with experimental results supporting our theoretical findings and their effectiveness in increasing the sanity of models without sacrificing any performance.
no code implementations • 19 Jul 2023 • Haoyu Sun, Yan Yan
Due to limited budgets allocated for road maintenance projects in various countries, road management departments face difficulties in making scientific maintenance decisions.
no code implementations • 26 Mar 2023 • Yukang Zhang, Yan Yan, Jie Li, Hanzi Wang
Furthermore, to better disentangle the modality-relevant features and the modality-irrelevant features, we propose a novel Center-Quadruplet Causal (CQC) loss to encourage the network to effectively learn the modality-relevant features and the modality-irrelevant features.
1 code implementation • 19 Mar 2023 • Subhankar Ghosh, Taha Belkhouja, Yan Yan, Janardhan Rao Doppa
Safe deployment of deep neural networks in high-stake real-world applications requires theoretically sound uncertainty quantification.
no code implementations • 2 Mar 2023 • Zhixing Hou, Yuzhang Shang, Tian Gao, Yan Yan
To solve this issue, we propose a binary point cloud transformer for place recognition.
1 code implementation • NeurIPS 2023 • Ye Zhu, Yu Wu, Zhiwei Deng, Olga Russakovsky, Yan Yan
Applying pre-trained generative denoising diffusion models (DDMs) for downstream tasks such as image semantic editing usually requires either fine-tuning DDMs or learning auxiliary editing networks in the existing literature.
no code implementations • 27 Jan 2023 • Bin Duan, Keshav Bhandari, Gaowen Liu, Yan Yan
Moreover, we present a novel Siamese representation Learning framework for Omnidirectional Flow (SLOF) estimation, which is trained in a contrastive manner via a hybrid loss that combines siamese contrastive and optical flow losses.
no code implementations • CVPR 2023 • Zhiliang Wu, Changchang Sun, Hanyu Xuan, Yan Yan
Stereo video inpainting aims to fill the missing regions on the left and right views of the stereo video with plausible content simultaneously.
no code implementations • 7 Dec 2022 • Hao Ding, Changchang Sun, Hao Tang, Dawen Cai, Yan Yan
Recently, due to the increasing requirements of medical imaging applications and the professional requirements of annotating medical images, few-shot learning has gained increasing attention in the medical image semantic segmentation field.
1 code implementation • CVPR 2023 • Yuzhang Shang, Zhihang Yuan, Bin Xie, Bingzhe Wu, Yan Yan
These approaches define a forward diffusion process for transforming data into noise and a backward denoising process for sampling data from noise.
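To make the forward (data-to-noise) half of this description concrete, here is a minimal sketch of the widely used closed-form Gaussian noising step, in which x_t is drawn directly from x_0. The linear variance schedule, its endpoint values, and all function names are assumptions chosen for illustration; they are not taken from the paper listed above.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def alpha_bars(betas):
    """Cumulative products of (1 - beta_t), one value per timestep."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def forward_sample(x0, t, abar, rng):
    """Draw x_t ~ N(sqrt(abar_t) * x0, (1 - abar_t) * I) in one shot."""
    signal = math.sqrt(abar[t])
    noise = math.sqrt(1.0 - abar[t])
    return [signal * x + noise * rng.gauss(0.0, 1.0) for x in x0]


if __name__ == "__main__":
    rng = random.Random(0)
    abar = alpha_bars(linear_betas(1000))
    x0 = [1.0, -0.5, 0.25]
    # Early steps stay close to the data; late steps are almost pure noise.
    print(forward_sample(x0, 0, abar, rng))
    print(forward_sample(x0, 999, abar, rng))
```

The backward denoising process is the learned part and is deliberately left out; a model would be trained to undo these noising steps one at a time.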
no code implementations • 5 Oct 2022 • Ye Zhu, Yu Wu, Nicu Sebe, Yan Yan
We are perceiving and communicating with the world in a multisensory manner, where different information sources are sophisticatedly processed and interpreted by separate parts of the human brain to constitute a complex, yet harmonious and unified sensing system.
1 code implementation • 19 Sep 2022 • Yongzhi Huang, Hanwen Zhang, Yan Yan, Haseeb Hassan
Large curated datasets are necessary, but annotating medical images is a time-consuming, laborious, and expensive process.
no code implementations • 21 Aug 2022 • Jingyu Lin, Jie Jiang, Yan Yan, Chunchao Guo, Hongfa Wang, Wei Liu, Hanzi Wang
We further propose a parallel design that integrates the convolutional network with a powerful self-attention mechanism to provide complementary clues between the attention path and convolutional path.
no code implementations • 17 Aug 2022 • Jianyuan Ni, Anne H. H. Ngu, Yan Yan
However, the accuracy of wearable sensor-based HAR is still far behind that of systems based on visual modalities (i.e., RGB video, skeleton, and depth).
no code implementations • CVPR 2023 • Zhiliang Wu, Hanyu Xuan, Changchang Sun, Kang Zhang, Yan Yan
Specifically, in this work, we propose an end-to-end trainable framework consisting of completion network and mask prediction network, which are designed to generate corrupted contents of the current frame using the known mask and decide the regions to be filled of the next frame, respectively.
1 code implementation • 11 Aug 2022 • Jianan Han, Shaoxing Zhang, Aidong Men, Yang Liu, Ziming Yao, Yan Yan, Qingchao Chen
$S^3VE$ is a large-scale dataset of synchronized infrared video and EEG signals for sleep stage classification, comprising 105 subjects and 154,573 video clips totaling more than 1100 hours.
no code implementations • 7 Aug 2022 • Keshav Bhandari, Bin Duan, Gaowen Liu, Hugo Latapie, Ziliang Zong, Yan Yan
Optical flow estimation in omnidirectional videos faces two significant issues: the lack of benchmark datasets and the challenge of adapting perspective video-based methods to accommodate the omnidirectional nature.
no code implementations • 24 Jul 2022 • Yudong Han, Liqiang Nie, Jianhua Yin, Jianlong Wu, Yan Yan
Several studies have recently pointed out that existing Visual Question Answering (VQA) models suffer heavily from the language prior problem, which refers to capturing superficial statistical correlations between the question type and the answer while ignoring the image contents.
no code implementations • 17 Jul 2022 • Bin Xie, Hao Tang, Bin Duan, Dawen Cai, Yan Yan
Brain vessel image segmentation can be used as a promising biomarker for better prevention and treatment of different diseases.
1 code implementation • 16 Jul 2022 • Xinyi Zou, Yan Yan, Jing-Hao Xue, Si Chen, Hanzi Wang
Extensive experiments on both in-the-lab and in-the-wild compound expression datasets demonstrate the superiority of our proposed CDNet against several state-of-the-art FSL methods.
cross-domain few-shot learning
Facial Expression Recognition
1 code implementation • 13 Jul 2022 • Yuzhang Shang, Dan Xu, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan
Relying on the premise that the performance of a binary neural network can be largely restored with eliminated quantization error between full-precision weight vectors and their corresponding binary vectors, existing works of network binarization frequently adopt the idea of model robustness to reach the aforementioned objective.
1 code implementation • 9 Jul 2022 • Taha Belkhouja, Yan Yan, Janardhan Rao Doppa
Experiments on diverse real-world benchmarks demonstrate that the SRS method is well-suited for time-series OOD detection when compared to baseline methods.
Out-of-Distribution Detection
Out of Distribution (OOD) Detection
1 code implementation • 9 Jul 2022 • Taha Belkhouja, Yan Yan, Janardhan Rao Doppa
Despite the success of deep neural networks (DNNs) for real-world applications over time-series data such as mobile health, little is known about how to train robust DNNs for time-series domain due to its unique characteristics compared to images and text data.
1 code implementation • 9 Jul 2022 • Taha Belkhouja, Yan Yan, Janardhan Rao Doppa
Despite the rapid progress on research in adversarial robustness of deep neural networks (DNNs), there is little principled work for the time-series domain.
1 code implementation • 6 Jul 2022 • Yuzhang Shang, Dan Xu, Ziliang Zong, Liqiang Nie, Yan Yan
Neural network binarization accelerates deep models by quantizing their weights and activations into 1-bit.
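As a small illustration of what quantizing weights to 1-bit can look like, the sketch below maps a weight vector to its signs plus a single scaling factor. Using the mean absolute value as the scale is a common choice that minimizes the squared error for sign-based binary codes; it is an assumption for this example and not necessarily the scheme used in the paper listed above. The function names are hypothetical.

```python
def binarize(weights):
    """Approximate a weight vector as alpha * sign(w).

    alpha = mean(|w|) minimizes the squared error ||w - alpha * b||^2
    when b is fixed to sign(w).
    """
    alpha = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return alpha, signs


def quantization_error(weights, alpha, signs):
    """Squared error between full-precision and binarized weights."""
    return sum((w - alpha * s) ** 2 for w, s in zip(weights, signs))


if __name__ == "__main__":
    w = [0.5, -1.5, 1.0, -1.0]
    alpha, signs = binarize(w)
    print(alpha, signs, quantization_error(w, alpha, signs))
```

Only the signs need to be stored per weight, which is where the memory and compute savings come from; the quantization error is the quantity that binarization methods typically try to keep small.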
1 code implementation • 1 Jul 2022 • Jichao Zhang, Jingjing Chen, Hao Tang, Enver Sangineto, Peng Wu, Yan Yan, Nicu Sebe, Wei Wang
Solving this problem using an unsupervised method remains an open problem, especially for high-resolution face images in the wild, which are not easy to annotate with gaze and head pose labels.
1 code implementation • 15 Jun 2022 • Ye Zhu, Yu Wu, Kyle Olszewski, Jian Ren, Sergey Tulyakov, Yan Yan
Diffusion probabilistic models (DPMs) have become a popular approach to conditional generation, due to their promising results and support for cross-modal synthesis.
1 code implementation • 23 Apr 2022 • Zhenghao Zhao, Ye Zhu, Xiaoguang Zhu, Yuzhang Shang, Yan Yan
Most current AI systems rely on the premise that the input visual data are sufficient to achieve competitive performance in various computer vision tasks.
no code implementations • 12 Apr 2022 • Zhixing Hou, Yan Yan, Chengzhong Xu, Hui Kong
In the SRT, we extract the local feature for each point cell.
1 code implementation • 1 Apr 2022 • Ye Zhu, Kyle Olszewski, Yu Wu, Panos Achlioptas, Menglei Chai, Yan Yan, Sergey Tulyakov
We present Dance2Music-GAN (D2M-GAN), a novel adversarial multi-modal framework that generates complex musical samples conditioned on dance videos.
1 code implementation • 22 Mar 2022 • Songsong Wu, Hao Tang, Xiao-Yuan Jing, Haifeng Zhao, Jianjun Qian, Nicu Sebe, Yan Yan
In this paper, we tackle the problem of synthesizing a ground-view panorama image conditioned on a top-view aerial image, which is a challenging problem due to the large gap between the two image domains with different view-points.
no code implementations • 14 Mar 2022 • Yan Yan, Tianzheng Liao, Jinjin Zhao, Jiahong Wang, Liang Ma, Wei Lv, Jing Xiong, Lei Wang
Given this observation, we devised a graph-inspired deep learning approach toward the sensor-based HAR tasks, which was further used to build a deep transfer learning model toward giving a tentative solution for these two challenging problems.
no code implementations • 14 Mar 2022 • Yan Yan, Xuankun Wu, Chengdong Li, Yini He, Zhicheng Zhang, Huihui Li, Ang Li, Lei Wang
The proposed work is the first investigation in the emotion recognition oriented EEG topological feature analysis, which brought a novel insight into the brain neural system nonlinear dynamics analysis and feature extraction.
no code implementations • 8 Mar 2022 • Xi Weng, Yan Yan, Genshun Dong, Chang Shu, Biao Wang, Hanzi Wang, Ji Zhang
This shows that DMA-Net provides a good tradeoff between segmentation quality and speed for semantic segmentation in street scenes.
no code implementations • 8 Mar 2022 • Xi Weng, Yan Yan, Si Chen, Jing-Hao Xue, Hanzi Wang
In this paper, we present a novel Stage-aware Feature Alignment Network (SFANet) based on the encoder-decoder structure for real-time semantic segmentation of street scenes.
no code implementations • 23 Feb 2022 • Xiaoguang Zhu, Ye Zhu, Haoyu Wang, Honglin Wen, Yan Yan, Peilin Liu
To solve the problem, we propose a multi-modality feature fusion network to combine the modalities of the skeleton sequence and RGB frame instead of the RGB video, as the key information contained by the combination of skeleton sequence and RGB frame is close to that of the skeleton sequence and RGB video.
no code implementations • 30 Jan 2022 • Yuzhang Shang, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan
Extensive experiments on CIFAR-10 and CIFAR-100 demonstrate the superiority of our novel Fourier analysis based MBP compared to other traditional MBP algorithms.
no code implementations • 18 Jan 2022 • Xinyi Zou, Yan Yan, Jing-Hao Xue, Si Chen, Hanzi Wang
To alleviate the problem of limited base classes in our FER task, we propose a novel Emotion Guided Similarity Network (EGS-Net), consisting of an emotion branch and a similarity branch, based on a two-stage learning framework.
cross-domain few-shot learning
Facial Expression Recognition
no code implementations • CVPR 2022 • Hanyu Xuan, Zhiliang Wu, Jian Yang, Yan Yan, Xavier Alameda-Pineda
Humans can easily recognize where and how the sound is produced via watching a scene and listening to corresponding audio cues.
no code implementations • 25 Oct 2021 • Haosheng Chen, Shuyuan Lin, Yan Yan, Hanzi Wang, Xinbo Gao
In EDA, we first asynchronously fuse the event data based on its information entropy.
1 code implementation • 12 Oct 2021 • Zongmeng Zhang, Xianjing Han, Xuemeng Song, Yan Yan, Liqiang Nie
Towards this end, in this work, we propose a Multi-modal Interaction Graph Convolutional Network (MIGCN), which jointly explores the complex intra-modal relations and inter-modal interactions residing in the video and sentence query to facilitate the understanding and semantic correspondence capture of the video and sentence query.
no code implementations • 8 Oct 2021 • Cody Blakeney, Gentry Atkinson, Nathaniel Huish, Yan Yan, Vangelis Metris, Ziliang Zong
Algorithmic bias is of increasing concern, both to the research community, and society at large.
1 code implementation • 8 Oct 2021 • Jianyuan Ni, Raunak Sarbajna, Yang Liu, Anne H. H. Ngu, Yan Yan
Human activity recognition (HAR) based on a multi-modal approach has recently been shown to improve accuracy.
no code implementations • 29 Sep 2021 • Yuzhang Shang, Dan Xu, Ziliang Zong, Liqiang Nie, Yan Yan
Neural network binarization accelerates deep models by quantizing their weights and activations into 1-bit.
no code implementations • ICCV 2021 • Yuzhang Shang, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan
Knowledge distillation has become one of the most important model compression techniques by distilling knowledge from larger teacher networks to smaller student ones.
no code implementations • 7 Jul 2021 • Gaowen Liu, Hao Tang, Hugo Latapie, Jason Corso, Yan Yan
Particularly, we propose a novel Bi-directional Spatial Temporal Attention Fusion Generative Adversarial Network (STA-GAN) to learn both spatial and temporal information to generate egocentric video sequences from the exocentric view.
1 code implementation • 26 Jun 2021 • Ye Zhu, Yu Wu, Yi Yang, Yan Yan
Current vision and language tasks usually take complete visual data (e.g., raw images or videos) as input; however, practical scenarios often involve situations where part of the visual information becomes inaccessible for various reasons, e.g., a restricted view with a fixed camera or an intentional vision block for security concerns.
no code implementations • CVPR 2021 • Ying Shu, Yan Yan, Si Chen, Jing-Hao Xue, Chunhua Shen, Hanzi Wang
First, three auxiliary tasks, consisting of a Patch Rotation Task (PRT), a Patch Segmentation Task (PST), and a Patch Classification Task (PCT), are jointly developed to learn the spatial-semantic relationship from large-scale unlabeled facial data.
Ranked #3 on Facial Attribute Classification on LFWA
1 code implementation • 15 Jun 2021 • Cody Blakeney, Nathaniel Huish, Yan Yan, Ziliang Zong
In recent years the ubiquitous deployment of AI has posed great concerns in regards to algorithmic bias, discrimination, and fairness.
1 code implementation • CVPR 2021 • Zhenyu Zhang, Yanhao Ge, Renwang Chen, Ying Tai, Yan Yan, Jian Yang, Chengjie Wang, Jilin Li, Feiyue Huang
Non-parametric face modeling aims to reconstruct 3D face only from images without shape assumptions.
no code implementations • CVPR 2021 • Delian Ruan, Yan Yan, Shenqi Lai, Zhenhua Chai, Chunhua Shen, Hanzi Wang
In this paper, we propose a novel Feature Decomposition and Reconstruction Learning (FDRL) method for effective facial expression recognition.
Facial Expression Recognition
Facial Expression Recognition (FER)
no code implementations • 11 Feb 2021 • Hugo Latapie, Ozkan Kilic, Gaowen Liu, Yan Yan, Ramana Kompella, Pei Wang, Kristinn R. Thorisson, Adam Lawrence, Yuhong Sun, Jayanth Srinivasa
This paper introduces a new metamodel-based knowledge representation that significantly improves autonomous learning and adaptation.
no code implementations • 5 Feb 2021 • Ye Zhu, Yu Wu, Hugo Latapie, Yi Yang, Yan Yan
People can easily imagine the potential sound while seeing an event.
1 code implementation • IEEE Transactions on Multimedia 2021 • Aihua Zheng, Menglan Hu, Bo Jiang, Yan Huang, Yan Yan, Bin Luo
AML aims to generate a modality-independent representation for each person in each modality via adversarial learning, while simultaneously learning a robust similarity measure for cross-modality matching via metric learning.
no code implementations • 29 Dec 2020 • Shuyuan Lin, Xing Wang, Guobao Xiao, Yan Yan, Hanzi Wang
In this paper, we propose a novel hierarchical representation via message propagation (HRMP) method for robust model fitting, which takes advantage of both consensus analysis and preference analysis to estimate the parameters of multiple model instances from data corrupted by outliers.
4 code implementations • ICCV 2021 • Zhuoning Yuan, Yan Yan, Milan Sonka, Tianbao Yang
Our studies demonstrate that the proposed DAM method improves the performance of optimizing cross-entropy loss by a large margin, and also achieves better performance than optimizing the existing AUC square loss on these medical image classification tasks.
Ranked #2 on Multi-Label Classification on CheXpert
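The excerpt above contrasts an AUC square loss with the proposed DAM objective but does not define either. As a rough illustration only, the sketch below assumes pairwise forms over all positive-negative score pairs: a square surrogate `(margin - gap)^2` and a margin-based squared hinge `max(0, margin - gap)^2`. The function name, the `margin` parameter, and the pairwise formulation are assumptions made here; they are not taken from the listing and are not the authors' implementation.

```python
def pairwise_auc_losses(scores, labels, margin=1.0):
    """Compare two pairwise AUC surrogates on a small batch (illustrative only).

    scores: list of real-valued model outputs
    labels: list of 0/1 labels, same length as scores
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    n_pairs = len(pos) * len(neg)
    square = 0.0   # square surrogate: penalizes any deviation of the gap from the margin
    hinge_sq = 0.0 # squared hinge: penalizes only gaps smaller than the margin
    ranked = 0.0   # counts correctly ordered pairs (ties count as half)
    for p in pos:
        for n in neg:
            gap = p - n
            square += (margin - gap) ** 2
            hinge_sq += max(0.0, margin - gap) ** 2
            ranked += 1.0 if gap > 0 else (0.5 if gap == 0 else 0.0)

    return {
        "auc": ranked / n_pairs,
        "square": square / n_pairs,
        "margin": hinge_sq / n_pairs,
    }
```

For example, with scores `[3.0, 2.0, 0.0]` and labels `[1, 1, 0]` every pair is ranked correctly, yet the square surrogate is still 2.5 because it penalizes gaps larger than the margin, while the squared hinge is 0.0.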
1 code implementation • 5 Dec 2020 • Cody Blakeney, Xiaomin Li, Yan Yan, Ziliang Zong
The experimental results running on an AMD server with four GeForce RTX 2080Ti GPUs show that our algorithm can achieve 3x speedup plus 19% energy savings on VGG distillation, and 3.5x speedup plus 29% energy savings on ResNet distillation, both with negligible accuracy loss.
no code implementations • 9 Nov 2020 • Lijian Lin, Haosheng Chen, Yanjie Liang, Yan Yan, Hanzi Wang
In this paper, we propose a robust tracking method via Statistical Positive sample generation and Gradient Aware learning (SPGA) to address the above two limitations.
no code implementations • 15 Oct 2020 • Keshav Bhandari, Ziliang Zong, Yan Yan
Second, we refine the network by training with augmented data in a supervised manner.
no code implementations • 15 Oct 2020 • Keshav Bhandari, Mario A. DeLaGarza, Ziliang Zong, Hugo Latapie, Yan Yan
To bridge this gap, in this paper we propose a novel Egocentric (first-person) 360° Kinetic human activity video dataset (EgoK360).
no code implementations • 15 Sep 2020 • Jian-Ying Bai, Ali Esamdin, Xing Gao, Yan Yan, Juan-Juan Ren
We conducted photometric and spectroscopic observations for Ross 15 in order to further study the flare properties of this less observed flare star.
Solar and Stellar Astrophysics
High Energy Astrophysical Phenomena
1 code implementation • ECCV 2020 • Ye Zhu, Yu Wu, Yi Yang, Yan Yan
With rising concerns about AI systems that have direct access to abundant sensitive information, researchers seek to develop more reliable AI with implicit information sources.
no code implementations • 18 Aug 2020 • Ye Zhu, Yan Yan, Oleg Komogortsev
In this work, we tackle the problem of ternary eye movement classification, which aims to separate fixations, saccades and smooth pursuits from the raw eye positional data.
no code implementations • 14 Aug 2020 • Bin Duan, Hao Tang, Wei Wang, Ziliang Zong, Guowei Yang, Yan Yan
Recent works have shown that attention mechanisms are beneficial to the fusion process.
1 code implementation • 9 Aug 2020 • Jichao Zhang, Jingjing Chen, Hao Tang, Wei Wang, Yan Yan, Enver Sangineto, Nicu Sebe
In this paper we address the problem of unsupervised gaze correction in the wild, presenting a solution that works without the need for precise annotations of the gaze angle and the head pose.
no code implementations • 15 Jul 2020 • Fu Qiao, Yan Yan
At the beginning of the COVID-19 outbreak, spillover effects in China's stock market from industry indices of sectors meeting investment demand to those meeting consumption demand rose significantly.
no code implementations • 14 Jul 2020 • Luo Xiong, Yanjie Liang, Yan Yan, Hanzi Wang
In this paper, we propose an adaptive proposal selection algorithm which can generate a small number of high-quality proposals to handle the problem of scale variations for visual object tracking.
no code implementations • 17 Jun 2020 • Yan Yan, Xin Man, Tianbao Yang
In this paper, we propose robust stochastic algorithms for solving convex compositional problems of the form $f(\mathbb{E}_\xi g(\cdot; \xi)) + r(\cdot)$ by establishing sub-Gaussian confidence bounds under weak assumptions about the tails of the noise distribution, i.e., heavy-tailed noise with bounded second-order moments.
no code implementations • 12 Jun 2020 • Zhishuai Guo, Yan Yan, Zhuoning Yuan, Tianbao Yang
However, most of the existing algorithms are slow in practice, and their analysis revolves around the convergence to a nearly stationary point. We consider leveraging the Polyak-Lojasiewicz (PL) condition to design faster stochastic algorithms with stronger convergence guarantees.
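For reference, the PL condition in its standard form can be written as follows. The symbols are chosen here and do not come from the excerpt: $F$ is a differentiable objective, $F^*$ its minimum value, and $\mu > 0$ a constant. How the paper applies the condition to its specific problem is not stated in this listing.

$$
\frac{1}{2}\,\|\nabla F(x)\|^2 \;\ge\; \mu\,\bigl(F(x) - F^*\bigr) \quad \text{for all } x.
$$

Under this inequality every stationary point is a global minimizer, which is why it can support guarantees stronger than convergence to a nearly stationary point even when $F$ is non-convex.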
no code implementations • 11 May 2020 • Yan Yan, Yuhong Guo
Partial label (PL) learning tackles the problem where each training instance is associated with a set of candidate labels that include both the true label and irrelevant noise labels.
no code implementations • 17 Apr 2020 • Senlin Shu, Fengmao Lv, Yan Yan, Li Li, Shuo He, Jun He
In this article, we propose to leverage the data augmentation technique to improve the performance of multi-label learning.
no code implementations • 11 Mar 2020 • Genshun Dong, Yan Yan, Chunhua Shen, Hanzi Wang
Meanwhile, a Spatial detail-Preserving Network (SPN) with shallow convolutional layers is designed to generate high-resolution feature maps preserving the detailed spatial information.
no code implementations • 9 Mar 2020 • Zhishuai Guo, Yan Yan, Tianbao Yang
It remains unclear how these averaging schemes affect the convergence of both optimization error and generalization error (two equally important components of testing error) for non-strongly convex objectives, including non-convex problems.
no code implementations • 20 Feb 2020 • Liao Zhang, Yan Yan, Lin Cheng, Hanzi Wang
Finally, we fuse these CAMs together to generate pseudo ground-truths and train a fully-supervised object detector with these ground-truths.
no code implementations • 13 Feb 2020 • Shuyuan Lin, Guobao Xiao, Yan Yan, David Suter, Hanzi Wang
Recently, some hypergraph-based methods have been proposed to deal with the problem of model fitting in computer vision, mainly due to the superior capability of hypergraphs to represent the complex relationships between data points.
no code implementations • NeurIPS 2020 • Yan Yan, Yi Xu, Qihang Lin, Wei Liu, Tianbao Yang
In this paper, we bridge this gap by providing a sharp analysis of epoch-wise stochastic gradient descent ascent method (referred to as Epoch-GDA) for solving strongly convex strongly concave (SCSC) min-max problems, without imposing any additional assumption about smoothness or the function's structure.
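The excerpt names an epoch-wise stochastic gradient descent ascent method but gives no algorithmic detail. The sketch below is one plausible reading under stated assumptions: within each epoch, run simultaneous descent on `x` and ascent on `y` with a fixed step size, restart the next epoch from the averaged iterates, halve the step size, and double the epoch length. The function name, the scalar variables, the halving and doubling schedule, and the omission of any projection step are all assumptions for illustration, not the authors' algorithm.

```python
import random


def epoch_gda(grad_x, grad_y, x0, y0, epochs=6, steps0=10, eta0=0.5,
              noise_std=0.0, seed=0):
    """Epoch-wise stochastic gradient descent ascent on scalar x, y (sketch).

    grad_x, grad_y: callables returning the partial derivatives of f(x, y)
    noise_std: standard deviation of Gaussian noise added to each gradient,
               used here to mimic stochastic gradients
    """
    rng = random.Random(seed)
    x, y = x0, y0
    eta, steps = eta0, steps0
    for _ in range(epochs):
        sum_x = 0.0
        sum_y = 0.0
        for _ in range(steps):
            gx = grad_x(x, y) + rng.gauss(0.0, noise_std)
            gy = grad_y(x, y) + rng.gauss(0.0, noise_std)
            # Simultaneous update: descend in x, ascend in y.
            x, y = x - eta * gx, y + eta * gy
            sum_x += x
            sum_y += y
        # Restart the next epoch from the averaged iterates.
        x, y = sum_x / steps, sum_y / steps
        # Assumed schedule: halve the step size, double the epoch length.
        eta /= 2.0
        steps *= 2
    return x, y
```

A simple check is the strongly convex strongly concave function $f(x, y) = \frac{1}{2}x^2 + xy - \frac{1}{2}y^2$, whose saddle point is $(0, 0)$: calling `epoch_gda(lambda x, y: x + y, lambda x, y: x - y, 1.0, -1.0)` should return a point close to the origin.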
no code implementations • 10 Feb 2020 • Longbiao Mao, Yan Yan, Jing-Hao Xue, Hanzi Wang
Two different network architectures are respectively designed to extract features for two groups of attributes, and a novel dynamic weighting scheme is proposed to automatically assign the loss weight to each facial attribute during training.
no code implementations • 8 Feb 2020 • Gaowen Liu, Hao Tang, Hugo Latapie, Yan Yan
In this paper, we investigate exocentric (third-person) view to egocentric (first-person) view image generation.
no code implementations • 7 Feb 2020 • Wanxiang Yang, Yan Yan, Si Chen
In this paper, we propose a novel person ReID method, which learns the spatial dependencies between the local regions and extracts the discriminative feature representation of the pedestrian image based on Long Short-Term Memory (LSTM), dealing with the problem of occlusions.
no code implementations • 7 Feb 2020 • Yihan Du, Yan Yan, Si Chen, Yang Hua
This strategy efficiently filters out some irrelevant proposals and avoids the redundant computation for feature extraction, which enables our method to operate faster than conventional classification-based tracking methods.
no code implementations • 6 Feb 2020 • Yan Yan, Ying Huang, Si Chen, Chunhua Shen, Hanzi Wang
Firstly, a facial expression synthesis generative adversarial network (FESGAN) is pre-trained to generate facial images with different facial expressions.
no code implementations • ICLR 2020 • Yan Yan, Yuhong Guo
Partial multi-label learning (PML), which tackles the problem of learning multi-label prediction models from instances with overcomplete noisy annotations, has recently started gaining attention from the research community.
2 code implementations • CVPR 2020 • Hao Tang, Dan Xu, Yan Yan, Philip H. S. Torr, Nicu Sebe
To tackle this issue, in this work we consider learning the scene generation in a local context, and correspondingly design a local class-specific generative network with semantic maps as guidance, which separately constructs and learns sub-generators concentrating on the generation of different classes, and is able to provide more scene details.
1 code implementation • ECCV 2020 • Qi Qi, Yan Yan, Xiaoyu Wang, Tianbao Yang
To tackle this issue, we propose a simple and effective framework to sample pairs in a batch of data for updating the model.
no code implementations • 15 Sep 2019 • Yan Yan, Yuhong Guo
Partial multi-label learning (PML), which tackles the problem of learning multi-label prediction models from instances with overcomplete noisy annotations, has recently started gaining attention from the research community.
no code implementations • 12 Sep 2019 • Mei Wang, Weizhi Li, Yan Yan
Session-based Recurrent Neural Networks (RNNs) are gaining increasing popularity for recommendation tasks, due to the high autocorrelation of a user's behavior in the latest session and the effectiveness of RNNs in capturing sequence order information.
no code implementations • ICML 2020 • Yan Yan, Yi Xu, Lijun Zhang, Xiaoyu Wang, Tianbao Yang
In this paper, we study a family of non-convex and possibly non-smooth inf-projection minimization problems, where the target objective function is equal to minimization of a joint function over another variable.
no code implementations • 20 Aug 2019 • Zitao Liu, Zhexuan Xu, Yan Yan
Items in modern recommender systems are often organized in hierarchical structures.
1 code implementation • 2 Aug 2019 • Hao Tang, Dan Xu, Gaowen Liu, Wei Wang, Nicu Sebe, Yan Yan
In this work, we propose a novel Cycle In Cycle Generative Adversarial Network (C$^2$GAN) for the task of keypoint-guided image generation.
1 code implementation • 24 Jul 2019 • Chenglong Li, Wei Xia, Yan Yan, Bin Luo, Jin Tang
These advantages of thermal infrared cameras make it possible to segment semantic objects in day and night.
1 code implementation • 3 Jul 2019 • Bin Duan, Wei Wang, Hao Tang, Hugo Latapie, Yan Yan
However, in machine learning, this cross-modal learning is a nontrivial task because different modalities have no homogeneous properties.
no code implementations • 17 Jun 2019 • Qiangqiang Wu, Zhihui Chen, Lin Cheng, Yan Yan, Bo Li, Hanzi Wang
Incorporating such an ability to hallucinate diverse new samples of the tracked instance can help the trackers alleviate the over-fitting problem in the low-data tracking regime.
no code implementations • CVPR 2019 • Zhen-Yu Zhang, Zhen Cui, Chunyan Xu, Yan Yan, Nicu Sebe, Jian Yang
In this paper, we propose a novel Pattern-Affinitive Propagation (PAP) framework to jointly predict depth, surface normal and semantic segmentation.
Ranked #56 on Monocular Depth Estimation on NYU-Depth V2 (RMSE metric)
no code implementations • arXiv 2019 • Jichao Zhang, Meng Sun, Jingjing Chen, Hao Tang, Yan Yan, Xueying Qin, Nicu Sebe
Gaze correction aims to redirect the person's gaze into the camera by manipulating the eye region, and it can be considered as a specific image resynthesis problem.
no code implementations • 14 May 2019 • Hao Tang, Wei Wang, Songsong Wu, Xinya Chen, Dan Xu, Nicu Sebe, Yan Yan
In this paper, we focus on the facial expression translation task and propose a novel Expression Conditional GAN (ECGAN) which can learn the mapping from one image domain to another one based on an additional expression attribute.
no code implementations • 11 May 2019 • Songsong Wu, Zhiqiang Lu, Hao Tang, Yan Yan, Songhao Zhu, Xiao-Yuan Jing, Zuoyong Li
Multi-view subspace clustering aims to divide a set of multisource data into several groups according to their underlying subspace structure.
no code implementations • 11 May 2019 • Songsong Wu, Yan Yan, Hao Tang, Jianjun Qian, Jian Zhang, Xiao-Yuan Jing
However, the number of labeled source samples is always limited in practice due to expensive annotation costs, which results in sub-optimal performance.
no code implementations • 23 Apr 2019 • Yan Yan, Yi Xu, Qihang Lin, Lijun Zhang, Tianbao Yang
The main contribution of this paper is the design and analysis of new stochastic primal-dual algorithms that use a mixture of stochastic gradient updates and a logarithmic number of deterministic dual updates for solving a family of convex-concave problems with no bilinear structure assumed.
3 code implementations • CVPR 2019 • Hao Tang, Dan Xu, Nicu Sebe, Yanzhi Wang, Jason J. Corso, Yan Yan
In this paper, we propose a novel approach named Multi-Channel Attention SelectionGAN (SelectionGAN) that makes it possible to generate images of natural scenes in arbitrary viewpoints, based on an image of the scene and a novel semantic map.
Bird View Synthesis
Cross-View Image-to-Image Translation
8 code implementations • 28 Mar 2019 • Hao Tang, Dan Xu, Nicu Sebe, Yan Yan
To handle this limitation, in this paper we propose a novel Attention-Guided Generative Adversarial Network (AGGAN), which can detect the most discriminative semantic object and minimize changes to unwanted parts for semantic manipulation problems without using extra data and models.
Ranked #1 on Facial Expression Translation on CelebA
1 code implementation • 28 Jan 2019 • Hao Tang, Xinya Chen, Wei Wang, Dan Xu, Jason J. Corso, Nicu Sebe, Yan Yan
To this end, we propose a novel Attribute-Guided Sketch Generative Adversarial Network (ASGAN) which is an end-to-end framework and contains two pairs of generators and discriminators, one of which is used to generate faces with attributes while the other one is employed for image-to-sketch translation.
1 code implementation • 14 Jan 2019 • Hao Tang, Dan Xu, Wei Wang, Yan Yan, Nicu Sebe
State-of-the-art methods for image-to-image translation with Generative Adversarial Networks (GANs) can learn a mapping from one domain to another domain using unpaired image data.
Generative Adversarial Network
Image-to-Image Translation
no code implementations • NeurIPS 2019 • Zhuoning Yuan, Yan Yan, Rong Jin, Tianbao Yang
For convex loss functions and two classes of "nice-behaviored" non-convex objectives that are close to a convex function, we establish faster convergence of stagewise training than the vanilla SGD under the PL condition on both training error and testing error.
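The excerpt compares stagewise training with vanilla SGD without describing either step-size rule. As a hedged illustration, the sketch below assumes that stagewise training keeps the step size constant within a stage and shrinks it geometrically between stages, and that vanilla SGD uses a polynomially decaying step size. Both function names and all default values are hypothetical and chosen here only for illustration.

```python
def stagewise_step_size(t, eta0=0.1, stage_len=100, decay=0.5):
    """Step size at iteration t: constant within a stage, cut by `decay` per stage."""
    return eta0 * decay ** (t // stage_len)


def polynomial_step_size(t, eta0=0.1):
    """Step size at iteration t under an assumed eta0 / (1 + t) decay."""
    return eta0 / (1 + t)
```

With these defaults the stagewise rule stays at 0.1 for the first 100 iterations and then drops to 0.05, whereas the polynomial rule has already halved after a single iteration.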
no code implementations • 6 Nov 2018 • Qiangqiang Wu, Yan Yan, Yanjie Liang, Yi Liu, Hanzi Wang
In recent years, Discriminative Correlation Filter (DCF) based tracking methods have achieved great success in visual tracking.