Search Results for author: Yong Xu

Found 87 papers, 32 papers with code

Multi-modal Aggregation Network for Fast MR Imaging

no code implementations 15 Oct 2021 Chun-Mei Feng, Huazhu Fu, Tianfei Zhou, Yong Xu, Ling Shao, David Zhang

In this work, we propose a novel Multi-modal Aggregation Network, named MANet, which is capable of discovering complementary representations from a fully sampled auxiliary modality, with which to hierarchically guide the reconstruction of a given target modality.

Image Reconstruction

Graph Meta Network for Multi-Behavior Recommendation

1 code implementation 8 Oct 2021 Lianghao Xia, Yong Xu, Chao Huang, Peng Dai, Liefeng Bo

Modern recommender systems often embed users and items into low-dimensional latent representations, based on their observed interactions.

Meta-Learning Recommendation Systems

Knowledge-aware Coupled Graph Neural Network for Social Recommendation

1 code implementation 8 Oct 2021 Chao Huang, Huance Xu, Yong Xu, Peng Dai, Lianghao Xia, Mengyin Lu, Liefeng Bo, Hao Xing, Xiaoping Lai, Yanfang Ye

While many recent efforts show the effectiveness of neural network-based social recommender systems, several important challenges have not yet been well addressed: (i) the majority of models consider only users' social connections, while ignoring the inter-dependent knowledge across items; (ii) most existing solutions are designed for a single type of user-item interaction, making them unable to capture interaction heterogeneity; (iii) the dynamic nature of user-item interactions has been less explored in many social-aware recommendation techniques.

Collaborative Filtering Recommendation Systems

Graph-Enhanced Multi-Task Learning of Multi-Level Transition Dynamics for Session-based Recommendation

no code implementations 8 Oct 2021 Chao Huang, Jiahui Chen, Lianghao Xia, Yong Xu, Peng Dai, Yanqing Chen, Liefeng Bo, Jiashu Zhao, Jimmy Xiangji Huang

The learning processes of intra- and inter-session transition dynamics are integrated to preserve the underlying low- and high-level item relationships in a common latent space.

Multi-Task Learning Session-Based Recommendations

Knowledge-Enhanced Hierarchical Graph Transformer Network for Multi-Behavior Recommendation

1 code implementation 8 Oct 2021 Lianghao Xia, Chao Huang, Yong Xu, Peng Dai, Xiyue Zhang, Hongsheng Yang, Jian Pei, Liefeng Bo

In particular: i) complex inter-dependencies across different types of user behaviors; ii) the incorporation of knowledge-aware item relations into the multi-behavior recommendation framework; iii) dynamic characteristics of multi-typed user-item interactions.

Graph Attention Recommendation Systems

Multiplex Behavioral Relation Learning for Recommendation via Memory Augmented Transformer Network

1 code implementation 8 Oct 2021 Lianghao Xia, Chao Huang, Yong Xu, Peng Dai, Bo Zhang, Liefeng Bo

Overlooking multiplex behavior relations makes it hard to recognize the multi-modal contextual signals across different types of interactions, which limits the feasibility of current recommendation methods.

Recommendation Systems

Traffic Flow Forecasting with Spatial-Temporal Graph Diffusion Network

no code implementations 8 Oct 2021 Xiyue Zhang, Chao Huang, Yong Xu, Lianghao Xia, Peng Dai, Liefeng Bo, Junbo Zhang, Yu Zheng

Accurate forecasting of citywide traffic flow plays a critical role in a variety of spatial-temporal mining applications, such as intelligent traffic control and public risk assessment.

Traffic Prediction

Social Recommendation with Self-Supervised Metagraph Informax Network

1 code implementation 8 Oct 2021 Xiaoling Long, Chao Huang, Yong Xu, Huance Xu, Peng Dai, Lianghao Xia, Liefeng Bo

To model relation heterogeneity, we design a metapath-guided heterogeneous graph neural network to aggregate feature embeddings from different types of meta-relations across users and items, empowering SMIN to maintain dedicated representations for multi-faceted user- and item-wise dependencies.

Collaborative Filtering Recommendation Systems

Global Context Enhanced Social Recommendation with Hierarchical Graph Neural Networks

1 code implementation 8 Oct 2021 Huance Xu, Chao Huang, Yong Xu, Lianghao Xia, Hao Xing, Dawei Yin

Social recommendation aims to leverage social connections among users to enhance recommendation performance.

Recommendation Systems

Exploring Separable Attention for Multi-Contrast MR Image Super-Resolution

1 code implementation 3 Sep 2021 Chun-Mei Feng, Yunlu Yan, Chengliang Liu, Huazhu Fu, Yong Xu, Ling Shao

Our method can explore the foreground and background areas in the forward and reverse directions with the help of the auxiliary contrast, enabling it to learn clearer anatomical structures and edge information for the SR of a target-contrast MR image.

Image Super-Resolution

Heterogeneous relational message passing networks for molecular dynamics simulations

no code implementations 2 Sep 2021 Zun Wang, Chong Wang, Sibo Zhao, Yong Xu, Shaogang Hao, Chang Yu Hsieh, Bing-Lin Gu, Wenhui Duan

With many frameworks based on message passing neural networks proposed to predict molecular and bulk properties, machine learning methods have tremendously shifted the paradigms of computational sciences underpinning physics, material science, chemistry, and biology.

Fully Non-Homogeneous Atmospheric Scattering Modeling with Convolutional Neural Networks for Single Image Dehazing

no code implementations 25 Aug 2021 Cong Wang, Yan Huang, Yuexian Zou, Yong Xu

However, it is noted that ASM-based SIDM degrades its performance in dehazing real world hazy images due to the limited modelling ability of ASM where the atmospheric light factor (ALF) and the angular scattering coefficient (ASC) are assumed as constants for one image.

Image Dehazing Single Image Dehazing

Accelerated Multi-Modal MR Imaging with Transformers

1 code implementation 27 Jun 2021 Chun-Mei Feng, Yunlu Yan, Geng Chen, Huazhu Fu, Yong Xu, Ling Shao

To this end, we propose a multi-modal transformer (MTrans), which is capable of transferring multi-scale features from the target modality to the auxiliary modality, for accelerated MR imaging.

Dual-Stream Reciprocal Disentanglement Learning for Domain Adaptation Person Re-Identification

1 code implementation 26 Jun 2021 Huafeng Li, Kaixiong Xu, Jinxing Li, Guangming Lu, Yong Xu, Zhengtao Yu, David Zhang

Since human-labeled samples are free for the target set, unsupervised person re-identification (Re-ID) has attracted much attention in recent years, by additionally exploiting the source set.

Domain Adaptation Image Generation +1

Deep Texture Recognition via Exploiting Cross-Layer Statistical Self-Similarity

no code implementations CVPR 2021 Zhile Chen, Feng Li, Yuhui Quan, Yong Xu, Hui Ji

In recent years, convolutional neural networks (CNNs) have become a prominent tool for texture recognition.

Task Transformer Network for Joint MRI Reconstruction and Super-Resolution

1 code implementation 12 Jun 2021 Chun-Mei Feng, Yunlu Yan, Huazhu Fu, Li Chen, Yong Xu

Then, a task transformer module is designed to embed and synthesize the relevance between the two tasks.

MRI Reconstruction Super-Resolution

Multi-Contrast MRI Super-Resolution via a Multi-Stage Integration Network

1 code implementation 19 May 2021 Chun-Mei Feng, Huazhu Fu, Shuhao Yuan, Yong Xu

In this work, we propose a multi-stage integration network (i.e., MINet) for multi-contrast MRI SR, which explicitly models the dependencies between multi-contrast images at different stages to guide image SR.


DONet: Dual-Octave Network for Fast MR Image Reconstruction

no code implementations 12 May 2021 Chun-Mei Feng, Zhanyuan Yang, Huazhu Fu, Yong Xu, Jian Yang, Ling Shao

In this paper, we propose the Dual-Octave Network (DONet), which is capable of learning multi-scale spatial-frequency features from both the real and imaginary components of MR data, for fast parallel MR image reconstruction.

Image Reconstruction

MIMO Self-attentive RNN Beamformer for Multi-speaker Speech Separation

no code implementations 17 Apr 2021 Xiyun Li, Yong Xu, Meng Yu, Shi-Xiong Zhang, Jiaming Xu, Bo Xu, Dong Yu

The spatial self-attention module is designed to attend to the cross-channel correlation in the covariance matrices.

automatic-speech-recognition Speech Quality +2

Dual-Octave Convolution for Accelerated Parallel MR Image Reconstruction

1 code implementation 12 Apr 2021 Chun-Mei Feng, Zhanyuan Yang, Geng Chen, Yong Xu, Ling Shao

We evaluate the performance of the proposed model on the acceleration of multi-coil MR image reconstruction.

Image Reconstruction

MetricNet: Towards Improved Modeling For Non-Intrusive Speech Quality Assessment

no code implementations 2 Apr 2021 Meng Yu, Chunlei Zhang, Yong Xu, ShiXiong Zhang, Dong Yu

Objective speech quality assessment is usually conducted by comparing the received speech signal with its clean reference, while human beings are capable of evaluating speech quality without any reference, such as in mean opinion score (MOS) tests.

Speech Quality

TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation

no code implementations 31 Mar 2021 Helin Wang, Bo Wu, LianWu Chen, Meng Yu, Jianwei Yu, Yong Xu, Shi-Xiong Zhang, Chao Weng, Dan Su, Dong Yu

In this paper, we exploit the effective way to leverage contextual information to improve the speech dereverberation performance in real-world reverberant environments.

Speech Dereverberation

Asymmetric CNN for image super-resolution

1 code implementation 25 Mar 2021 Chunwei Tian, Yong Xu, WangMeng Zuo, Chia-Wen Lin, David Zhang

In this paper, we propose an asymmetric CNN (ACNet) comprising an asymmetric block (AB), a memory enhancement block (MEB) and a high-frequency feature enhancement block (HFFEB) for image super-resolution.

Image Super-Resolution

Distributed Newton Optimization with Maximized Convergence Rate

no code implementations 17 Feb 2021 Damián Marelli, Yong Xu, Minyue Fu, Zenghong Huang

As the second step towards our goal we complement the proposed method with a fully distributed method for estimating the optimal step size that maximizes convergence speed.

Distributed Optimization Optimization and Control

MultiFace: A Generic Training Mechanism for Boosting Face Recognition Performance

1 code implementation 25 Jan 2021 Jing Xu, Tszhang Guo, Yong Xu, Zenglin Xu, Kun Bai

Deep Convolutional Neural Networks (DCNNs) and their variants have been widely used in large scale face recognition (FR) recently.

Face Recognition

Field-free spin-orbit torque-induced switching of perpendicular magnetization in a ferrimagnetic layer with vertical composition gradient

no code implementations 21 Jan 2021 Zhenyi Zheng, Yue Zhang, Victor Lopez-Dominguez, Luis Sánchez-Tejerina, Jiacheng Shi, Xueqiang Feng, Lei Chen, Zilu Wang, Zhizhong Zhang, Kun Zhang, Bin Hong, Yong Xu, Youguang Zhang, Mario Carpentieri, Albert Fert, Giovanni Finocchio, Weisheng Zhao, Pedram Khalili Amiri

Existing methods to do so involve the application of an in-plane bias magnetic field, or incorporation of in-plane structural asymmetry in the device, both of which can be difficult to implement in practical applications.

Mesoscale and Nanoscale Physics

FWB-Net: Front White Balance Network for Color Shift Correction in Single Image Dehazing via Atmospheric Light Estimation

no code implementations 21 Jan 2021 Cong Wang, Yan Huang, Yuexian Zou, Yong Xu

However, for images taken in the real world, the illumination is not uniformly distributed over the whole image, which brings model mismatch and possibly results in color shift in deep models using ASM.

Image Dehazing Single Image Dehazing

Semi-Supervised Single-Stage Controllable GANs for Conditional Fine-Grained Image Generation

no code implementations ICCV 2021 Tianyi Chen, Yi Liu, Yunfei Zhang, Si Wu, Yong Xu, Feng Liangbing, Hau San Wong

To ensure disentanglement among the variables, we maximize mutual information between the class-independent variable and synthesized images, map real images to the latent space of a generator to perform consistency regularization of cross-class attributes, and incorporate class semantic-based regularization into a discriminator's feature space.

Image Generation

Hypergraph Neural Networks for Hypergraph Matching

no code implementations ICCV 2021 Xiaowei Liao, Yong Xu, Haibin Ling

Specifically, given two hypergraphs to be matched, we first construct an association hypergraph over them and convert the hypergraph matching problem into a node classification problem on the association hypergraph.

Hypergraph Matching Node Classification

Detection of magnetic gap in the topological surface states of MnBi2Te4

no code implementations 31 Dec 2020 Haoran Ji, Yanzhao Liu, He Wang, Jiawei Luo, Jiaheng Li, Hao Li, Yang Wu, Yong Xu, Jian Wang

An essential ingredient to realize these quantum states is the magnetic gap in the topological surface states induced by the out-of-plane ferromagnetism on the surface of MnBi2Te4.

Materials Science

Vehicle Re-identification Based on Dual Distance Center Loss

no code implementations 23 Dec 2020 Zhijun Hu, Yong Xu, Jie Wen, Lilei Sun, Raja S P

Moreover, we design a Euclidean distance threshold between all center pairs, which not only strengthens the inter-class separability of center loss, but also makes the center loss (or DDCL) work well without being combined with softmax loss.

Person Re-Identification Vehicle Re-Identification

Structural Disorder Induced Second-order Topological Insulators in Three Dimensions

no code implementations 22 Dec 2020 Jiong-Hao Wang, Yan-Bin Yang, Ning Dai, Yong Xu

Here we predict the existence of a second-order topological insulating phase in an amorphous system without any crystalline symmetry.

Mesoscale and Nanoscale Physics Disordered Systems and Neural Networks

Directional ASR: A New Paradigm for E2E Multi-Speaker Speech Recognition with Source Localization

no code implementations 30 Oct 2020 Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Yong Xu, Shi-Xiong Zhang, Dong Yu

The advantages of D-ASR over existing methods are threefold: (1) it provides explicit speaker locations, (2) it improves the explainability factor, and (3) it achieves better ASR performance as the process is more streamlined.

automatic-speech-recognition Speech Recognition

LaSOT: A High-quality Large-scale Single Object Tracking Benchmark

1 code implementation 8 Sep 2020 Heng Fan, Hexin Bai, Liting Lin, Fan Yang, Peng Chu, Ge Deng, Sijia Yu, Harshit, Mingzhen Huang, Juehuan Liu, Yong Xu, Chunyuan Liao, Lin Yuan, Haibin Ling

The average video length of LaSOT is around 2,500 frames, where each video contains various challenge factors that exist in real-world video footage, such as the targets disappearing and re-appearing.

Object Tracking Visual Tracking

An Overview of Deep-Learning-Based Audio-Visual Speech Enhancement and Separation

1 code implementation 21 Aug 2020 Daniel Michelsanti, Zheng-Hua Tan, Shi-Xiong Zhang, Yong Xu, Meng Yu, Dong Yu, Jesper Jensen

Speech enhancement and speech separation are two related tasks, whose purpose is to extract either one or more target speech signals, respectively, from a mixture of sounds generated by several sources.

Speech Enhancement Speech Separation

ADL-MVDR: All deep learning MVDR beamformer for target speech separation

1 code implementation 16 Aug 2020 Zhuohuang Zhang, Yong Xu, Meng Yu, Shi-Xiong Zhang, LianWu Chen, Dong Yu

Speech separation algorithms are often used to separate the target speech from other interfering sources.

Speech Separation

Recurrent Exposure Generation for Low-Light Face Detection

no code implementations 21 Jul 2020 Jinxiu Liang, Jingwen Wang, Yuhui Quan, Tianyi Chen, Jiaying Liu, Haibin Ling, Yong Xu

REG progressively and efficiently produces intermediate images corresponding to various exposure settings, and such pseudo-exposures are then fused by MED to detect faces across different lighting conditions.

Face Detection Image Enhancement

Lightweight image super-resolution with enhanced CNN

1 code implementation 8 Jul 2020 Chunwei Tian, Ruibin Zhuge, Zhihao Wu, Yong Xu, WangMeng Zuo, Chen Chen, Chia-Wen Lin

Finally, the IRB uses coarse high-frequency features from the RB to learn more accurate SR features and construct an SR image.

Image Super-Resolution

Designing and Training of A Dual CNN for Image Denoising

1 code implementation 8 Jul 2020 Chunwei Tian, Yong Xu, WangMeng Zuo, Bo Du, Chia-Wen Lin, David Zhang

The enhancement block gathers and fuses the global and local features to provide complementary information for the latter network.

Image Denoising

Deep Bilateral Retinex for Low-Light Image Enhancement

no code implementations 4 Jul 2020 Jinxiu Liang, Yong Xu, Yuhui Quan, Jingwen Wang, Haibin Ling, Hui Ji

Low-light images, i.e., the images captured in low-light conditions, suffer from very poor visibility caused by low contrast, color distortion and significant measurement noise.

Low-Light Image Enhancement

Neural Spatio-Temporal Beamformer for Target Speech Separation

1 code implementation 8 May 2020 Yong Xu, Meng Yu, Shi-Xiong Zhang, Lian-Wu Chen, Chao Weng, Jianming Liu, Dong Yu

Purely neural network (NN) based speech separation and enhancement methods, although they can achieve good objective scores, inevitably cause nonlinear speech distortions that are harmful for automatic speech recognition (ASR).

Audio and Speech Processing Sound

Pathwise Unique Solutions and Stochastic Averaging for Mixed Stochastic Partial Differential Equations Driven by Fractional Brownian Motion and Brownian Motion

no code implementations 11 Apr 2020 Bin Pei, Yuzuru Inahama, Yong Xu

This paper is devoted to a system of stochastic partial differential equations (SPDEs) that have a slow component driven by fractional Brownian motion (fBm) with the Hurst parameter $H >1/2$ and a fast component driven by fast-varying diffusion.

Probability Dynamical Systems 60G22, 60H05, 60H15, 34C29

Multi-modal Multi-channel Target Speech Separation

no code implementations 16 Mar 2020 Rongzhi Gu, Shi-Xiong Zhang, Yong Xu, Lian-Wu Chen, Yuexian Zou, Dong Yu

Target speech separation refers to extracting a target speaker's voice from an overlapped audio of simultaneous talkers.

Speech Separation

Enhancing End-to-End Multi-channel Speech Separation via Spatial Feature Learning

no code implementations 9 Mar 2020 Rongzhi Gu, Shi-Xiong Zhang, Lian-Wu Chen, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu

Hand-crafted spatial features (e.g., inter-channel phase difference, IPD) play a fundamental role in recent deep learning based multi-channel speech separation (MCSS) methods.

Speech Separation

Self-supervised learning for audio-visual speaker diarization

no code implementations 13 Feb 2020 Yifan Ding, Yong Xu, Shi-Xiong Zhang, Yahuan Cong, Liqiang Wang

Speaker diarization, which is to find the speech segments of specific speakers, has been widely used in human-centered applications such as video conferences or human-computer interaction systems.

Self-Supervised Learning Speaker Diarization +1

Deep Learning on Image Denoising: An overview

no code implementations 31 Dec 2019 Chunwei Tian, Lunke Fei, Wenxian Zheng, Yong Xu, WangMeng Zuo, Chia-Wen Lin

However, there are substantial differences in the various types of deep learning methods dealing with image denoising.

Image Denoising

A Unified Framework for Speech Separation

no code implementations 17 Dec 2019 Fahimeh Bahmaninezhad, Shi-Xiong Zhang, Yong Xu, Meng Yu, John H. L. Hansen, Dong Yu

The initial solutions introduced for deep learning based speech separation transformed the speech signals into the time-frequency domain with the STFT; the encoded mixed signals were then fed into a deep neural network based separator.

Speech Separation

Adaptive GNN for Image Analysis and Editing

no code implementations NeurIPS 2019 Lingyu Liang, Lianwen Jin, Yong Xu

In practical verification, we design a new regularization structure with guided feature to produce GNN-based filtering and propagation diffusion to tackle the ill-posed inverse problems of quotient image analysis (QIA), which recovers the reflectance ratio as a signature for image analysis or adjustment.

Low-Light Image Enhancement

Audio-Visual Speech Separation and Dereverberation with a Two-Stage Multimodal Network

no code implementations 16 Sep 2019 Ke Tan, Yong Xu, Shi-Xiong Zhang, Meng Yu, Dong Yu

Background noise, interfering speech and room reverberation frequently distort target speech in real listening environments.

Audio and Speech Processing Sound Signal Processing

Dedge-AGMNet: an effective stereo matching network optimized by depth edge auxiliary task

no code implementations 25 Aug 2019 Weida Yang, Xindong Ai, Zuliu Yang, Yong Xu, Yong Zhao

To improve the performance in ill-posed regions, this paper proposes an atrous granular multi-scale network based on depth edge subnetwork (Dedge-AGMNet).

Disparity Estimation Edge Detection +2

Coupled-Projection Residual Network for MRI Super-Resolution

no code implementations 12 Jul 2019 Chun-Mei Feng, Kai Wang, Shijian Lu, Yong Xu, Heng Kong, Ling Shao

The deep sub-network learns from the residuals of the high-frequency image information, where multiple residual blocks are cascaded to magnify the MRI images at the last network layer.


Single-Channel Signal Separation and Deconvolution with Generative Adversarial Networks

1 code implementation 14 Jun 2019 Qiuqiang Kong, Yong Xu, Wenwu Wang, Philip J. B. Jackson, Mark D. Plumbley

Single-channel signal separation and deconvolution aims to separate and deconvolve individual sources from a single-channel mixture and is a challenging problem in which no prior knowledge of the mixing filters is available.

Image Inpainting

Robust Classification with Sparse Representation Fusion on Diverse Data Subsets

no code implementations 10 Jun 2019 Chun-Mei Feng, Yong Xu, Zuoyong Li, Jian Yang

It performs Sparse Representation Fusion based on the Diverse Subset of training samples (SRFDS), which reduces the impact of randomness of the sample set and enhances the robustness of classification results.

Classification General Classification +1

A comprehensive study of speech separation: spectrogram vs waveform separation

no code implementations 17 May 2019 Fahimeh Bahmaninezhad, Jian Wu, Rongzhi Gu, Shi-Xiong Zhang, Yong Xu, Meng Yu, Dong Yu

We study the speech separation problem for far-field data (more similar to naturalistic audio streams) and develop multi-channel solutions for both frequency and time-domain separators, utilizing spectral, spatial and speaker location information.

Speech Recognition Speech Separation

End-to-End Multi-Channel Speech Separation

no code implementations 15 May 2019 Rongzhi Gu, Jian Wu, Shi-Xiong Zhang, Lian-Wu Chen, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu

This paper extended the previous approach and proposed a new end-to-end model for multi-channel speech separation.

Speech Separation

Time Domain Audio Visual Speech Separation

no code implementations 7 Apr 2019 Jian Wu, Yong Xu, Shi-Xiong Zhang, Lian-Wu Chen, Meng Yu, Lei Xie, Dong Yu

Audio-visual multi-modal modeling has been demonstrated to be effective in many speech related tasks, such as speech recognition and speech enhancement.

Audio and Speech Processing Sound

Image Cartoon-Texture Decomposition Using Isotropic Patch Recurrence

no code implementations 10 Nov 2018 Ruotao Xu, Yuhui Quan, Yong Xu

Aiming at separating the cartoon and texture layers from an image, cartoon-texture decomposition approaches resort to image priors to model cartoon and texture respectively.

Enhanced CNN for image denoising

no code implementations 28 Oct 2018 Chunwei Tian, Yong Xu, Lunke Fei, Junqian Wang, Jie Wen, Nan Luo

Owing to flexible architectures of deep convolutional neural networks (CNNs), CNNs are successfully used for image denoising.

Image Denoising

Deep Learning for Image Denoising: A Survey

no code implementations 11 Oct 2018 Chunwei Tian, Yong Xu, Lunke Fei, Ke Yan

Since the proposal of big data analysis and the Graphics Processing Unit (GPU), deep learning technology has received a great deal of attention and has been widely applied in the field of image processing.

Image Denoising

Sound Event Detection and Time-Frequency Segmentation from Weakly Labelled Data

2 code implementations 12 Apr 2018 Qiuqiang Kong, Yong Xu, Iwona Sobieraj, Wenwu Wang, Mark D. Plumbley

Sound event detection (SED) aims to detect when and recognize what sound events happen in an audio clip.

Sound Audio and Speech Processing

Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning

1 code implementation CVPR 2018 Jingwen Wang, Wenhao Jiang, Lin Ma, Wei Liu, Yong Xu

We propose a bidirectional proposal method that effectively exploits both past and future contexts to make proposal predictions.

Dense Video Captioning

A joint separation-classification model for sound event detection of weakly labelled data

2 code implementations 8 Nov 2017 Qiuqiang Kong, Yong Xu, Wenwu Wang, Mark D. Plumbley

First, we propose a separation mapping from the time-frequency (T-F) representation of an audio to the T-F segmentation masks of the audio events.

Sound Audio and Speech Processing

Audio Set classification with attention model: A probabilistic perspective

5 code implementations 2 Nov 2017 Qiuqiang Kong, Yong Xu, Wenwu Wang, Mark D. Plumbley

Then the classification of a bag is the expectation of the classification output of the instances in the bag with respect to the learned probability measure.

Sound Audio and Speech Processing

Large-scale weakly supervised audio classification using gated convolutional neural network

4 code implementations 1 Oct 2017 Yong Xu, Qiuqiang Kong, Wenwu Wang, Mark D. Plumbley

In this paper, we present a gated convolutional neural network and a temporal attention-based localization method for audio classification, which won 1st place in the large-scale weakly supervised sound event detection task of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2017 challenge.

Sound Audio and Speech Processing

Discriminative Block-Diagonal Representation Learning for Image Recognition

no code implementations 12 Jul 2017 Zheng Zhang, Yong Xu, Ling Shao, Jian Yang

In particular, the elaborate BDLRR is formulated as a joint optimization problem of shrinking the unfavorable representation from off-block-diagonal elements and strengthening the compact block-diagonal representation under the semi-supervised framework of low-rank representation.

Representation Learning

Mind the Class Weight Bias: Weighted Maximum Mean Discrepancy for Unsupervised Domain Adaptation

3 code implementations CVPR 2017 Hongliang Yan, Yukang Ding, Peihua Li, Qilong Wang, Yong Xu, WangMeng Zuo

Specifically, we introduce class-specific auxiliary weights into the original MMD for exploiting the class prior probability on source and target domains, whose challenge lies in the fact that the class label in target domain is unavailable.

Unsupervised Domain Adaptation

Learning Inverse Mapping by Autoencoder based Generative Adversarial Nets

no code implementations 29 Mar 2017 Junyu Luo, Yong Xu, Chenwei Tang, Jiancheng Lv

The inverse mapping of the generator in GANs (Generative Adversarial Nets) has great potential value. Hence, some works have been developed to construct the inverse function of the generator by direct learning or adversarial learning. While the results are encouraging, the problem is highly challenging, and the existing ways of training inverse models of GANs have many disadvantages, such as being hard to train or performing poorly. For these reasons, we propose a new approach that uses an inverse generator ($IG$) model as the encoder and a pre-trained generator ($G$) as the decoder of an AutoEncoder network to train the $IG$ model.

Multi-Objective Learning and Mask-Based Post-Processing for Deep Neural Network Based Speech Enhancement

no code implementations 21 Mar 2017 Yong Xu, Jun Du, Zhen Huang, Li-Rong Dai, Chin-Hui Lee

We propose a multi-objective framework to learn both secondary targets not directly related to the intended task of speech enhancement (SE) and the primary target of the clean log-power spectra (LPS) features to be used directly for constructing the enhanced speech signals.


Attention and Localization based on a Deep Convolutional Recurrent Model for Weakly Supervised Audio Tagging

2 code implementations 17 Mar 2017 Yong Xu, Qiuqiang Kong, Qiang Huang, Wenwu Wang, Mark D. Plumbley

Audio tagging aims to perform multi-label classification on audio chunks and it is a newly proposed task in the Detection and Classification of Acoustic Scenes and Events 2016 (DCASE 2016) challenge.


Convolutional Gated Recurrent Neural Network Incorporating Spatial Features for Audio Tagging

2 code implementations 24 Feb 2017 Yong Xu, Qiuqiang Kong, Qiang Huang, Wenwu Wang, Mark D. Plumbley

In this paper, we propose to use a convolutional neural network (CNN) to extract robust features from mel-filter banks (MFBs), spectrograms or even raw waveforms for audio tagging.

Audio Tagging

A Joint Detection-Classification Model for Audio Tagging of Weakly Labelled Data

1 code implementation 6 Oct 2016 Qiuqiang Kong, Yong Xu, Wenwu Wang, Mark Plumbley

The labeling of an audio clip is often based on the audio events in the clip and no event level label is provided to the user.


Unsupervised Feature Learning Based on Deep Models for Environmental Audio Tagging

2 code implementations 13 Jul 2016 Yong Xu, Qiang Huang, Wenwu Wang, Peter Foster, Siddharth Sigtia, Philip J. B. Jackson, Mark D. Plumbley

For the unsupervised feature learning, we propose to use a symmetric or asymmetric deep de-noising auto-encoder (sDAE or aDAE) to generate new data-driven features from the Mel-Filter Banks (MFBs) features.

Audio Tagging General Classification +1

Lecture bilingue augmentée par des alignements multi-niveaux (Augmenting bilingual reading with alignment information)

no code implementations JEPTALNRECITAL 2016 François Yvon, Yong Xu, Marianna Apidianaki, Clément Pillias, Cubaud Pierre

The work that led to this demonstration combines multilingual language processing tools, in particular automatic alignment, with visualization and interaction techniques.

Fully DNN-based Multi-label regression for audio tagging

no code implementations 24 Jun 2016 Yong Xu, Qiang Huang, Wenwu Wang, Philip J. B. Jackson, Mark D. Plumbley

Compared with the conventional Gaussian Mixture Model (GMM) and support vector machine (SVM) methods, the proposed fully DNN-based method can make good use of the long-term temporal information, with the whole chunk as the input.

Audio Tagging Event Detection +2

Natural Scene Character Recognition Using Robust PCA and Sparse Representation

no code implementations 15 Jun 2016 Zheng Zhang, Yong Xu, Cheng-Lin Liu

Natural scene character recognition is challenging due to the cluttered background, which is hard to separate from text.

Sparse Coding for Classification via Discrimination Ensemble

no code implementations CVPR 2016 Yuhui Quan, Yong Xu, Yuping Sun, Yan Huang, Hui Ji

Discriminative sparse coding has emerged as a promising technique in image analysis and recognition, which couples the process of classifier training and the process of dictionary learning for improving the discriminability of sparse codes.

Classification Dictionary Learning +1

A survey of sparse representation: algorithms and applications

no code implementations 23 Feb 2016 Zheng Zhang, Yong Xu, Jian Yang, Xuelong Li, David Zhang

The main purpose of this article is to provide a comprehensive study and an updated review of sparse representation and to provide guidance for researchers.

Removing Rain From a Single Image via Discriminative Sparse Coding

no code implementations ICCV 2015 Yu Luo, Yong Xu, Hui Ji

The paper aims to develop an effective algorithm to remove the visual effects of rain from a single rainy image, i.e., to separate the rain layer and the de-rained image layer from a rainy image.

Dictionary Learning Rain Removal

Lacunarity Analysis on Image Patterns for Texture Classification

no code implementations CVPR 2014 Yuhui Quan, Yong Xu, Yuping Sun, Yu Luo

Based on the concept of lacunarity in fractal geometry, we developed a statistical approach to texture description, which yields highly discriminative features with strong robustness to a wide range of transformations, including photometric changes and geometric changes.

Classification General Classification +1
