Search Results for author: Xin Wang

Found 475 papers, 164 papers with code

On Algorithms for Sparse Multi-factor NMF

no code implementations NeurIPS 2013 Siwei Lyu, Xin Wang

Nonnegative matrix factorization (NMF) is a popular data analysis method, the objective of which is to decompose a matrix with all nonnegative components into the product of two other nonnegative matrices.

Stochastic Averaging for Constrained Optimization with Application to Online Resource Allocation

no code implementations 7 Oct 2016 Tianyi Chen, Aryan Mokhtari, Xin Wang, Alejandro Ribeiro, Georgios B. Giannakis

Existing approaches to resource allocation for today's stochastic networks are challenged to meet fast convergence and tolerable delay requirements.

A multi-task learning model for malware classification with useful file access pattern from API call sequence

no code implementations 19 Oct 2016 Xin Wang, Siu Ming Yiu

Based on API call sequences, semantic-aware and machine learning (ML) based malware classifiers can be built for malware detection or classification.

Classification Document Classification +6

On Multiplicative Multitask Feature Learning

no code implementations NeurIPS 2014 Xin Wang, Jinbo Bi, Shipeng Yu, Jiangwen Sun

We prove that this framework is mathematically equivalent to the widely used multitask feature learning methods that are based on a joint regularization of all model parameters, but with a more general form of regularizers.

Multimodal Transfer: A Hierarchical Deep Convolutional Neural Network for Fast Artistic Style Transfer

2 code implementations CVPR 2017 Xin Wang, Geoffrey Oxholm, Da Zhang, Yuan-Fang Wang

That is, our scheme can generate results that are visually pleasing and more similar to multiple desired artistic styles with color and texture cues at multiple scales.

Style Transfer

Classification of Neurological Gait Disorders Using Multi-task Feature Learning

no code implementations 8 Dec 2016 Ioannis Papavasileiou, Wenlong Zhang, Xin Wang, Jinbo Bi, Li Zhang, Song Han

An advanced machine learning method, multi-task feature learning (MTFL), is used to jointly train classification models of a subject's gait in three classes: post-stroke, PD, and healthy gait.

Classification General Classification

Robust Learning with Kernel Mean p-Power Error Loss

no code implementations 21 Dec 2016 Badong Chen, Lei Xing, Xin Wang, Jing Qin, Nanning Zheng

Correntropy is a second order statistical measure in kernel space, which has been successfully applied in robust learning and signal processing.

Deep Reinforcement Learning for Visual Object Tracking in Videos

no code implementations 31 Jan 2017 Da Zhang, Hamid Maei, Xin Wang, Yuan-Fang Wang

In this paper we introduce a fully end-to-end approach for visual tracking in videos that learns to predict the bounding box locations of a target object at every frame.

Decision Making Object +4

IDK Cascades: Fast Deep Learning by Learning not to Overthink

no code implementations 3 Jun 2017 Xin Wang, Yujia Luo, Daniel Crankshaw, Alexey Tumanov, Fisher Yu, Joseph E. Gonzalez

Advances in deep learning have led to substantial increases in prediction accuracy but have been accompanied by increases in the cost of rendering predictions.

Dialogue Generation

Non-asymptotic entanglement distillation

1 code implementation 19 Jun 2017 Kun Fang, Xin Wang, Marco Tomamichel, Runyao Duan

For isotropic states, it can be further simplified to a linear program.

Quantum Physics

A region-growing approach for automatic outcrop fracture extraction from a three-dimensional point cloud

1 code implementation 27 Jun 2017 Xin Wang, Lejun Zou, Xiaohua Shen, Yupeng Ren, Yi Qin

In tests using outcrop point cloud data, the proposed method identified and extracted the full extent of individual fractures with high accuracy.

Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks

no code implementations NeurIPS 2017 Urs Köster, Tristan J. Webb, Xin Wang, Marcel Nassar, Arjun K. Bansal, William H. Constable, Oğuz H. Elibol, Scott Gray, Stewart Hall, Luke Hornof, Amir Khosrowshahi, Carey Kloss, Ruby J. Pai, Naveen Rao

Here we present the Flexpoint data format, aiming at a complete replacement of 32-bit floating point format training and inference, designed to support modern deep network topologies without modifications.

Generative Adversarial Network

SkipNet: Learning Dynamic Routing in Convolutional Networks

2 code implementations ECCV 2018 Xin Wang, Fisher Yu, Zi-Yi Dou, Trevor Darrell, Joseph E. Gonzalez

While deeper convolutional networks are needed to achieve maximum accuracy in visual perception tasks, for many inputs shallower networks are sufficient.

Decision Making

Group Linguistic Bias Aware Neural Response Generation

no code implementations WS 2017 Jianan Wang, Xin Wang, Fang Li, Zhen Xu, Zhuoran Wang, Baoxun Wang

For practical chatbots, one of the essential factors for improving user experience is the capability of customizing the talking style of the agents, that is, making chatbots provide responses that meet users' preferences on language styles, topics, etc.

Decoder Response Generation

Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama's voice using GAN, WaveNet and low-quality found data

no code implementations 2 Mar 2018 Jaime Lorenzo-Trueba, Fuming Fang, Xin Wang, Isao Echizen, Junichi Yamagishi, Tomi Kinnunen

Thanks to the growing availability of spoofing databases and rapid advances in using them, systems for detecting voice spoofing attacks are becoming more and more capable, and error rates close to zero are being reached for the ASVspoof2015 database.

Generative Adversarial Network Speech Enhancement +2

Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation

1 code implementation ECCV 2018 Xin Wang, Wenhan Xiong, Hongmin Wang, William Yang Wang

In this paper, we take a radical approach to bridge the gap between synthetic studies and real-world practices: we propose a novel, planned-ahead hybrid reinforcement learning model that combines model-free and model-based reinforcement learning to solve a real-world vision-language navigation task.

Model-based Reinforcement Learning reinforcement-learning +4

Speech waveform synthesis from MFCC sequences with generative adversarial networks

1 code implementation 3 Apr 2018 Lauri Juvela, Bajibabu Bollepalli, Xin Wang, Hirokazu Kameoka, Manu Airaksinen, Junichi Yamagishi, Paavo Alku

This paper proposes a method for generating speech from filterbank mel frequency cepstral coefficients (MFCC), which are widely used in speech applications, such as ASR, but are generally considered unusable for speech synthesis.

Generative Adversarial Network Speech Synthesis

A comparison of recent waveform generation and acoustic modeling methods for neural-network-based speech synthesis

no code implementations 7 Apr 2018 Xin Wang, Jaime Lorenzo-Trueba, Shinji Takaki, Lauri Juvela, Junichi Yamagishi

Recent advances in speech synthesis suggest that limitations such as the lossy nature of the amplitude spectrum with minimum phase approximation and the over-smoothing effect in acoustic modeling can be overcome by using advanced machine learning approaches.

Speech Synthesis

Fast Weight Long Short-Term Memory

no code implementations 18 Apr 2018 T. Anderson Keller, Sharath Nittur Sridhar, Xin Wang

Associative memory using fast weights is a short-term memory mechanism that substantially improves the memory capacity and time scale of recurrent neural networks (RNNs).

Retrieval

No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling

2 code implementations ACL 2018 Xin Wang, Wenhu Chen, Yuan-Fang Wang, William Yang Wang

Though impressive results have been achieved in visual captioning, the task of generating abstract stories from photo streams is still a little-tapped problem.

Image Captioning Visual Storytelling

BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning

4 code implementations CVPR 2020 Fisher Yu, Haofeng Chen, Xin Wang, Wenqi Xian, Yingying Chen, Fangchen Liu, Vashisht Madhavan, Trevor Darrell

Datasets drive vision progress, yet existing driving datasets are impoverished in terms of visual content and supported tasks to study multitask learning for autonomous driving.

Autonomous Driving Domain Adaptation +8

Deep Mixture of Experts via Shallow Embedding

no code implementations 5 Jun 2018 Xin Wang, Fisher Yu, Lisa Dunlap, Yi-An Ma, Ruth Wang, Azalia Mirhoseini, Trevor Darrell, Joseph E. Gonzalez

Larger networks generally have greater representational power at the cost of increased computational complexity.

Few-Shot Learning Zero-Shot Learning

Accel: A Corrective Fusion Network for Efficient Semantic Segmentation on Video

1 code implementation CVPR 2019 Samvit Jain, Xin Wang, Joseph Gonzalez

We present Accel, a novel semantic video segmentation system that achieves high accuracy at low inference cost by combining the predictions of two network branches: (1) a reference branch that extracts high-detail features on a reference keyframe, and warps these features forward using frame-to-frame optical flow estimates, and (2) an update branch that computes features of adjustable quality on the current frame, performing a temporal update at each video frame.

Optical Flow Estimation Segmentation +3

S3D: Single Shot multi-Span Detector via Fully 3D Convolutional Networks

1 code implementation 21 Jul 2018 Da Zhang, Xiyang Dai, Xin Wang, Yuan-Fang Wang

In this paper, we present a novel Single Shot multi-Span Detector for temporal activity detection in long, untrimmed videos using a simple end-to-end fully three-dimensional convolutional (Conv3D) network.

Action Detection Activity Detection

Deep Encoder-Decoder Models for Unsupervised Learning of Controllable Speech Synthesis

no code implementations 30 Jul 2018 Gustav Eje Henter, Jaime Lorenzo-Trueba, Xin Wang, Junichi Yamagishi

Generating versatile and appropriate synthetic speech requires control over the output expression separate from the spoken text.

Acoustic Modelling Decoder +2

Investigating accuracy of pitch-accent annotations in neural network-based speech synthesis and denoising effects

no code implementations 2 Aug 2018 Hieu-Thi Luong, Xin Wang, Junichi Yamagishi, Nobuyuki Nishizawa

We investigated the impact of noisy linguistic features on the performance of a neural-network-based Japanese speech synthesis system that uses a WaveNet vocoder.

Denoising Speech Synthesis

XL-NBT: A Cross-lingual Neural Belief Tracking Framework

1 code implementation EMNLP 2018 Wenhu Chen, Jianshu Chen, Yu Su, Xin Wang, Dong Yu, Xifeng Yan, William Yang Wang

Then, we pre-train a state tracker for the source language as a teacher, which is able to exploit easy-to-access parallel data.

Transfer Learning

Hierarchically-Structured Variational Autoencoders for Long Text Generation

no code implementations 27 Sep 2018 Dinghan Shen, Asli Celikyilmaz, Yizhe Zhang, Liqun Chen, Xin Wang, Lawrence Carin

Variational autoencoders (VAEs) have received much attention recently as an end-to-end architecture for text generation.

Decoder Sentence +1

Neural source-filter-based waveform model for statistical parametric speech synthesis

no code implementations 29 Oct 2018 Xin Wang, Shinji Takaki, Junichi Yamagishi

Neural waveform models such as the WaveNet are used in many recent text-to-speech systems, but the original WaveNet is quite slow in waveform generation because of its autoregressive (AR) structure.

Speech Synthesis

Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language

1 code implementation 29 Oct 2018 Yusuke Yasuda, Xin Wang, Shinji Takaki, Junichi Yamagishi

Towards end-to-end Japanese speech synthesis, we extend Tacotron to systems with self-attention to capture long-term dependencies related to pitch accents and compare their audio quality with classical pipeline systems under various conditions to show their pros and cons.

Speech Synthesis Text-To-Speech Synthesis

STFT spectral loss for training a neural speech waveform model

1 code implementation 29 Oct 2018 Shinji Takaki, Toru Nakashika, Xin Wang, Junichi Yamagishi

This paper proposes a new loss using short-time Fourier transform (STFT) spectra for the aim of training a high-performance neural speech waveform model that predicts raw continuous speech waveform samples directly.

Audiovisual speaker conversion: jointly and simultaneously transforming facial expression and acoustic characteristics

no code implementations 29 Oct 2018 Fuming Fang, Xin Wang, Junichi Yamagishi, Isao Echizen

Transforming the facial and acoustic features together makes it possible for the converted voice and facial expressions to be highly correlated and for the generated target speaker to appear and sound natural.

Image Reconstruction

Learning to Compose Topic-Aware Mixture of Experts for Zero-Shot Video Captioning

no code implementations 7 Nov 2018 Xin Wang, Jiawei Wu, Da Zhang, Yu Su, William Yang Wang

Although promising results have been achieved in video captioning, existing models are limited to the fixed inventory of activities in the training corpus, and do not generalize to open vocabulary scenarios.

Video Captioning

Guided Feature Selection for Deep Visual Odometry

no code implementations 25 Nov 2018 Fei Xue, Qiuyuan Wang, Xin Wang, Wei Dong, Junqiu Wang, Hongbin Zha

We present a novel end-to-end visual odometry architecture with guided feature selection based on deep convolutional recurrent neural networks.

feature selection Monocular Visual Odometry +1

MAN: Moment Alignment Network for Natural Language Moment Retrieval via Iterative Graph Adjustment

no code implementations CVPR 2019 Da Zhang, Xiyang Dai, Xin Wang, Yuan-Fang Wang, Larry S. Davis

In this paper, we present Moment Alignment Network (MAN), a novel framework that unifies the candidate moment encoding and temporal structural reasoning in a single-shot feed-forward network.

Moment Retrieval Natural Language Moment Retrieval +1

Few-shot Object Detection via Feature Reweighting

4 code implementations ICCV 2019 Bingyi Kang, Zhuang Liu, Xin Wang, Fisher Yu, Jiashi Feng, Trevor Darrell

The feature learner extracts meta features that are generalizable to detect novel object classes, using training data from base classes with sufficient samples.

Few-Shot Learning Few-Shot Object Detection +3

Explanatory Graphs for CNNs

no code implementations 18 Dec 2018 Quanshi Zhang, Xin Wang, Ruiming Cao, Ying Nian Wu, Feng Shi, Song-Chun Zhu

This paper introduces a graphical model, namely an explanatory graph, which reveals the knowledge hierarchy hidden inside conv-layers of a pre-trained CNN.

Object

Residual Attention based Network for Hand Bone Age Assessment

no code implementations 21 Dec 2018 Eric Wu, Bin Kong, Xin Wang, Junjie Bai, Yi Lu, Feng Gao, Shaoting Zhang, Kunlin Cao, Qi Song, Siwei Lyu, Youbing Yin

The hierarchical attention components of the residual attention subnet force our network to focus on the key components of the X-ray images and generate the final predictions as well as the associated visual supports, which is similar to the assessment procedure of clinicians.

Hand Segmentation

Disparity-preserved Deep Cross-platform Association for Cross-platform Video Recommendation

no code implementations 1 Jan 2019 Shengze Yu, Xin Wang, Wenwu Zhu, Peng Cui, Jingdong Wang

However, there remain two unsolved challenges: i) there exist inconsistencies in cross-platform association due to platform-specific disparity, and ii) data from distinct platforms may have different semantic granularities.

Interpretable CNNs for Object Classification

no code implementations 8 Jan 2019 Quanshi Zhang, Xin Wang, Ying Nian Wu, Huilin Zhou, Song-Chun Zhu

This paper proposes a generic method to learn interpretable convolutional filters in a deep convolutional neural network (CNN) for object classification, where each interpretable filter encodes features of a specific object part.

Classification General Classification +1

Attention-driven Tree-structured Convolutional LSTM for High Dimensional Data Understanding

no code implementations 29 Jan 2019 Bin Kong, Xin Wang, Junjie Bai, Yi Lu, Feng Gao, Kunlin Cao, Qi Song, Shaoting Zhang, Siwei Lyu, Youbing Yin

In order to address these limitations, we present tree-structured ConvLSTM models for tree-structured image analysis tasks which can be trained end-to-end.

Vocal Bursts Intensity Prediction

HAHE: Hierarchical Attentive Heterogeneous Information Network Embedding

2 code implementations 31 Jan 2019 Sheng Zhou, Jiajun Bu, Xin Wang, Jia-Wei Chen, Can Wang

Second, given a meta path, nodes in HIN are connected by path instances while existing works fail to fully explore the differences between path instances that reflect nodes' preferences in the semantic space.

Network Embedding

Towards Generating Long and Coherent Text with Multi-Level Latent Variable Models

no code implementations ACL 2019 Dinghan Shen, Asli Celikyilmaz, Yizhe Zhang, Liqun Chen, Xin Wang, Jianfeng Gao, Lawrence Carin

Variational autoencoders (VAEs) have received much attention recently as an end-to-end architecture for text generation with latent variables.

Decoder Sentence +1

When does reinforcement learning stand out in quantum control? A comparative study on state preparation

2 code implementations 6 Feb 2019 Xiao-Ming Zhang, Zezhu Wei, Raza Asad, Xu-Chen Yang, Xin Wang

In this work, we perform a comparative study on the efficacy of three reinforcement learning algorithms: tabular Q-learning, deep Q-learning, and policy gradient, as well as two non-machine-learning methods: stochastic gradient descent and Krotov algorithms, in the problem of preparing a desired quantum state.

Quantum Physics

Parameter Efficient Training of Deep Convolutional Neural Networks by Dynamic Sparse Reparameterization

no code implementations 15 Feb 2019 Hesham Mostafa, Xin Wang

We evaluate the performance of dynamic reallocation methods in training deep convolutional networks and show that our method outperforms previous static and dynamic reparameterization methods, yielding the best accuracy for a fixed parameter budget, on par with accuracies obtained by iteratively pruning a pre-trained dense model.

DeepCenterline: a Multi-task Fully Convolutional Network for Centerline Extraction

no code implementations 25 Mar 2019 Zhihui Guo, Junjie Bai, Yi Lu, Xin Wang, Kunlin Cao, Qi Song, Milan Sonka, Youbing Yin

The proposed method generates well-positioned centerlines, exhibits a lower number of missing branches, and is more robust in the presence of minor imperfections of the object segmentation mask.

Object Semantic Segmentation

Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet

no code implementations 29 Mar 2019 Mingyang Zhang, Xin Wang, Fuming Fang, Haizhou Li, Junichi Yamagishi

We propose using an extended model architecture of Tacotron, that is, a multi-source sequence-to-sequence model with a dual attention mechanism, as the shared model for both the TTS and VC tasks.

Decoder Speech Synthesis +1

Training Multi-Speaker Neural Text-to-Speech Systems using Speaker-Imbalanced Speech Corpora

no code implementations 1 Apr 2019 Hieu-Thi Luong, Xin Wang, Junichi Yamagishi, Nobuyuki Nishizawa

When the available data of a target speaker is insufficient to train a high quality speaker-dependent neural text-to-speech (TTS) system, we can combine data from multiple speakers and train a multi-speaker TTS model instead.

Extract and Edit: An Alternative to Back-Translation for Unsupervised Neural Machine Translation

no code implementations NAACL 2019 Jiawei Wu, Xin Wang, William Yang Wang

The overreliance on large parallel corpora significantly limits the applicability of machine translation systems to the majority of language pairs.

Sentence Translation +1

VATEX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research

2 code implementations ICCV 2019 Xin Wang, Jiawei Wu, Junkun Chen, Lei Li, Yuan-Fang Wang, William Yang Wang

We also introduce two tasks for video-and-language research based on VATEX: (1) Multilingual Video Captioning, aimed at describing a video in various languages with a compact unified captioning model, and (2) Video-guided Machine Translation, to translate a source language description into the target language using the video information as additional spatiotemporal context.

Machine Translation Translation +3

TAFE-Net: Task-Aware Feature Embeddings for Low Shot Learning

1 code implementation CVPR 2019 Xin Wang, Fisher Yu, Ruth Wang, Trevor Darrell, Joseph E. Gonzalez

We show that TAFE-Net is highly effective in generalizing to new tasks or concepts and evaluate the TAFE-Net on a range of benchmarks in zero-shot and few-shot learning.

Attribute Few-Shot Learning +1

ACE: Adapting to Changing Environments for Semantic Segmentation

no code implementations ICCV 2019 Zuxuan Wu, Xin Wang, Joseph E. Gonzalez, Tom Goldstein, Larry S. Davis

However, neural classifiers are often extremely brittle when confronted with domain shift, that is, changes in the input distribution that occur over time.

Meta-Learning Semantic Segmentation

Maximum Correntropy Criterion with Variable Center

no code implementations 13 Apr 2019 Badong Chen, Xin Wang, Yingsong Li, Jose C. Principe

The kernel function in correntropy is usually restricted to the Gaussian function with center located at zero.

Position

MOSNet: Deep Learning based Objective Assessment for Voice Conversion

6 code implementations 17 Apr 2019 Chen-Chou Lo, Szu-Wei Fu, Wen-Chin Huang, Xin Wang, Junichi Yamagishi, Yu Tsao, Hsin-Min Wang

In this paper, we propose deep learning-based assessment models to predict human ratings of converted speech.

Voice Conversion

REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments

1 code implementation CVPR 2020 Yuankai Qi, Qi Wu, Peter Anderson, Xin Wang, William Yang Wang, Chunhua Shen, Anton Van Den Hengel

One of the long-term challenges of robotics is to enable robots to interact with humans in the visual world via natural language, as humans are visual animals that communicate through language.

Referring Expression Vision and Language Navigation

Generalizable control for quantum parameter estimation through reinforcement learning

1 code implementation 25 Apr 2019 Han Xu, Junning Li, Liqiang Liu, Yu Wang, Haidong Yuan, Xin Wang

Measurement and estimation of parameters are essential for science and engineering, where one of the main quests is to find systematic schemes that can achieve high precision.

Quantum Physics Mesoscale and Nanoscale Physics

Neural source-filter waveform models for statistical parametric speech synthesis

no code implementations 27 Apr 2019 Xin Wang, Shinji Takaki, Junichi Yamagishi

Other models such as Parallel WaveNet and ClariNet bring together the benefits of AR and IAF-based models and train an IAF model by transferring the knowledge from a pre-trained AR teacher to an IAF student without any sequential transformation.

Speech Synthesis

Logician: A Unified End-to-End Neural Approach for Open-Domain Information Extraction

no code implementations 29 Apr 2019 Mingming Sun, Xu Li, Xin Wang, Miao Fan, Yue Feng, Ping Li

In this paper, we consider the problem of open information extraction (OIE) for extracting entity and relation level intermediate structures from open-domain sentences.

Attribute Open Information Extraction +3

Compositional Coding for Collaborative Filtering

1 code implementation 9 May 2019 Chenghao Liu, Tao Lu, Xin Wang, Zhiyong Cheng, Jianling Sun, Steven C. H. Hoi

However, CF with binary codes naturally suffers from low accuracy due to the limited representation capability of each bit, which prevents it from modeling the complex structure of the data.

Collaborative Filtering Recommendation Systems

Multi-Kernel Correntropy for Robust Learning

no code implementations 24 May 2019 Badong Chen, Yuqing Xie, Xin Wang, Zejian Yuan, Pengju Ren, Jing Qin

In a recent work, the concept of mixture correntropy (MC) was proposed to improve the learning performance, where the kernel function is a mixture Gaussian kernel, namely a linear combination of several zero-mean Gaussian kernels with different widths.

One-shot entanglement distillation beyond LOCC

no code implementations 4 Jun 2019 Bartosz Regula, Kun Fang, Xin Wang, Mile Gu

We show in particular that the $\varepsilon$-error one-shot distillable entanglement of any pure state is the same under all sets of operations ranging from one-way LOCC to separability-preserving operations or operations preserving the set of states with positive partial transpose, and can be computed exactly as a quadratically constrained linear program.

Quantum Physics Mathematical Physics

Spatial Heterogeneity Automatic Detection and Estimation

no code implementations 5 Jun 2019 Xin Wang, Zhengyuan Zhu, Hao Helen Zhang

Spatial regression is widely used for modeling the relationship between a dependent variable and explanatory covariates.

Methodology

Self-Supervised Learning for Contextualized Extractive Summarization

2 code implementations ACL 2019 Hong Wang, Xin Wang, Wenhan Xiong, Mo Yu, Xiaoxiao Guo, Shiyu Chang, William Yang Wang

Existing models for extractive summarization are usually trained from scratch with a cross-entropy loss, which does not explicitly capture the global context at the document level.

Extractive Summarization Self-Supervised Learning

Recognizing License Plates in Real-Time

no code implementations 11 Jun 2019 Xuewen Yang, Xin Wang

To enable real-time and accurate license plate recognition, in this work, we propose a set of techniques: 1) a contour reconstruction method along with edge-detection to quickly detect the candidate plates; 2) a simple zero-one-alternation scheme to effectively remove the fake top and bottom borders around plates to facilitate more accurate segmentation of characters on plates; 3) a set of techniques to augment the training data, incorporate SIFT features into the CNN network, and exploit transfer learning to obtain the initial parameters for more effective training; and 4) a two-phase verification procedure to determine the correct plate at low cost, a statistical filtering in the plate detection stage to quickly remove unwanted candidates, and the accurate CR results after the CR process to perform further plate verification without additional processing.

Edge Detection License Plate Detection +2

Task-Aware Feature Generation for Zero-Shot Compositional Learning

1 code implementation 11 Jun 2019 Xin Wang, Fisher Yu, Trevor Darrell, Joseph E. Gonzalez

In this work, we propose a task-aware feature generation (TFG) framework for compositional learning, which generates features of novel visual concepts by transferring knowledge from previously seen concepts.

Novel Concepts Zero-Shot Learning

Self-Supervised Dialogue Learning

no code implementations ACL 2019 Jiawei Wu, Xin Wang, William Yang Wang

The sequential order of utterances is often meaningful in coherent dialogues, and the order changes of utterances could lead to low-quality and incoherent conversations.

Self-Supervised Learning

Modeling the Uncertainty in Electronic Health Records: a Bayesian Deep Learning Approach

no code implementations 14 Jul 2019 Riyi Qiu, Yugang Jia, Mirsad Hadzikadic, Michael Dulin, Xi Niu, Xin Wang

Deep learning models have exhibited superior performance in predictive tasks as the volume of Electronic Health Records (EHR) has grown explosively.

Decision Making

Outfit Compatibility Prediction and Diagnosis with Multi-Layered Comparison Network

1 code implementation 26 Jul 2019 Xin Wang, Bo Wu, Yun Ye, Yueqi Zhong

Existing works about fashion outfit compatibility focus on predicting the overall compatibility of a set of fashion items with their information from different modalities.

Fashion Compatibility Learning Persuasiveness

Privacy-preserving Distributed Machine Learning via Local Randomization and ADMM Perturbation

no code implementations 30 Jul 2019 Xin Wang, Hideaki Ishii, Linkang Du, Peng Cheng, Jiming Chen

With the proliferation of training data, distributed machine learning (DML) is becoming more competent for large-scale learning tasks.

BIG-bench Machine Learning Privacy Preserving

Sequential Adversarial Learning for Self-Supervised Deep Visual Odometry

no code implementations ICCV 2019 Shunkai Li, Fei Xue, Xin Wang, Zike Yan, Hongbin Zha

As single-view depth estimation is an ill-posed problem, and photometric loss is incapable of discriminating distortion artifacts of warped images, the estimated depth is vague and pose is inaccurate.

Depth Estimation Image Generation +2

Latent Part-of-Speech Sequences for Neural Machine Translation

no code implementations IJCNLP 2019 Xuewen Yang, Yingru Liu, Dongliang Xie, Xin Wang, Niranjan Balasubramanian

In this work, we introduce a new latent variable model, LaSyn, that captures the co-dependence between syntax and semantics, while allowing for effective and efficient inference over the latent space.

Decoder Machine Translation +2

Initial investigation of an encoder-decoder end-to-end TTS framework using marginalization of monotonic hard latent alignments

no code implementations 30 Aug 2019 Yusuke Yasuda, Xin Wang, Junichi Yamagishi

The advantages of our approach are that we can simplify many modules for the soft attention and that we can train the end-to-end TTS model using a single likelihood function.

Decoder

Detecting Deep Neural Network Defects with Data Flow Analysis

no code implementations 5 Sep 2019 Jiazhen Gu, Huanlin Xu, Yangfan Zhou, Xin Wang, Hui Xu, Michael Lyu

Deep neural networks (DNNs) are shown to be promising solutions in many challenging artificial intelligence tasks.

Object Recognition

Data Sanity Check for Deep Learning Systems via Learnt Assertions

no code implementations 6 Sep 2019 Haochuan Lu, Huanlin Xu, Nana Liu, Yangfan Zhou, Xin Wang

But the statistical nature of DL makes it quite vulnerable to invalid inputs, i.e., those cases that are not considered in the training phase of a DL model.

Reject Illegal Inputs: Scaling Generative Classifiers with Supervised Deep Infomax

no code implementations 25 Sep 2019 Xin Wang, SiuMing Yiu

The supervised probabilistic constraints are equivalent to a generative classifier on high-level data representations, where class conditional log-likelihoods of samples can be evaluated.

Representation Learning

Exploring the Correlation between Likelihood of Flow-based Generative Models and Image Semantics

no code implementations 25 Sep 2019 Xin Wang, SiuMing Yiu

In this paper, we explore the correlation between flows' likelihood and image semantics.

Generalized Natural Language Grounded Navigation via Environment-agnostic Multitask Learning

no code implementations 25 Sep 2019 Xin Wang, Vihan Jain, Eugene Ie, William Wang, Zornitsa Kozareva, Sujith Ravi

Recent research efforts enable the study of natural language grounded navigation in photo-realistic environments, e.g., following natural language instructions or dialog.

Vision-Language Navigation

Multi-modal Deep Analysis for Multimedia

no code implementations 11 Oct 2019 Wenwu Zhu, Xin Wang, Hongzhi Li

To address the two scientific problems, we investigate them from the following aspects: 1) multi-modal correlational representation: multi-modal fusion of data across different modalities, and 2) multi-modal data and knowledge fusion: multi-modal fusion of data with domain knowledge.

Question Answering Transfer Learning +2

Zero-Shot Multi-Speaker Text-To-Speech with State-of-the-art Neural Speaker Embeddings

3 code implementations 23 Oct 2019 Erica Cooper, Cheng-I Lai, Yusuke Yasuda, Fuming Fang, Xin Wang, Nanxin Chen, Junichi Yamagishi

While speaker adaptation for end-to-end speech synthesis using speaker embeddings can produce good speaker similarity for speakers seen during training, there remains a gap for zero-shot adaptation to unseen speakers.

Audio and Speech Processing

Cross-Channel Intragroup Sparsity Neural Network

no code implementations 26 Oct 2019 Zhilin Yu, Chao Wang, Xin Wang, Qing Wu, Yong Zhao, Xundong Wu

Modern deep neural networks rely on overparameterization to achieve state-of-the-art generalization.

Model Compression Network Pruning

Latent Suicide Risk Detection on Microblog via Suicide-Oriented Word Embeddings and Layered Attention

no code implementations IJCNLP 2019 Lei Cao, Huijun Zhang, Ling Feng, Zihan Wei, Xin Wang, Ningyun Li, Xiaohao He

Although detection of suicidal ideation on social media has made great progress in recent years, people's implicitly and anti-real contrarily expressed posts still remain an obstacle, preventing the detectors from achieving satisfactory performance.

Word Embeddings

Transferring neural speech waveform synthesizers to musical instrument sounds generation

no code implementations 27 Oct 2019 Yi Zhao, Xin Wang, Lauri Juvela, Junichi Yamagishi

Recent neural waveform synthesizers such as WaveNet, WaveGlow, and the neural-source-filter (NSF) model have shown good performance in speech synthesis despite their different methods of waveform generation.

Audio Generation Audio Synthesis +2

Transformation of low-quality device-recorded speech to high-quality speech using improved SEGAN model

1 code implementation 10 Nov 2019 Seyyed Saeed Sarfjoo, Xin Wang, Gustav Eje Henter, Jaime Lorenzo-Trueba, Shinji Takaki, Junichi Yamagishi

Nowadays vast amounts of speech data are recorded from low-quality recorder devices such as smartphones, tablets, laptops, and medium-quality microphones.

Sound Audio and Speech Processing

Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation

no code implementations CVPR 2020 Juncheng Li, Xin Wang, Siliang Tang, Haizhou Shi, Fei Wu, Yueting Zhuang, William Yang Wang

Visual navigation is a task of training an embodied agent by intelligently navigating to a target object (e.g., television) using only visual observations.

Object reinforcement-learning +3

Adaptive Activation Network and Functional Regularization for Efficient and Flexible Deep Multi-Task Learning

no code implementations 19 Nov 2019 Yingru Liu, Xuewen Yang, Dongliang Xie, Xin Wang, Li Shen, Hao-Zhi Huang, Niranjan Balasubramanian

In this paper, we propose a novel deep learning model called Task Adaptive Activation Network (TAAN) that can automatically learn the optimal network architecture for MTL.

Multi-Task Learning

Domain-Aware Dynamic Networks

no code implementations 26 Nov 2019 Tianyuan Zhang, Bichen Wu, Xin Wang, Joseph Gonzalez, Kurt Keutzer

In this work, we propose a method to improve the model capacity without increasing inference-time complexity.

object-detection Object Detection

"How do urban incidents affect traffic speed?" A Deep Graph Convolutional Network for Incident-driven Traffic Speed Prediction

no code implementations 3 Dec 2019 Qinge Xie, Tiancheng Guo, Yang Chen, Yu Xiao, Xin Wang, Ben Y. Zhao

Combining above methods, we propose a Deep Incident-Aware Graph Convolutional Network (DIGC-Net) to effectively incorporate urban traffic incident, spatio-temporal, periodic and context features for traffic speed prediction.

Theme-Matters: Fashion Compatibility Learning via Theme Attention

no code implementations 12 Dec 2019 Jui-Hsin Lai, Bo Wu, Xin Wang, Dan Zeng, Tao Mei, Jingen Liu

This model uses attention to associate themes with pairwise compatibility, and thus computes the outfit-wise compatibility.

Fashion Compatibility Learning

Fully Convolutional Graph Neural Networks using Bipartite Graph Convolutions

no code implementations ICLR 2020 Marcel Nassar, Xin Wang, Evren Tumer

Graph neural networks have been adopted in numerous applications ranging from learning relational representations to modeling data on irregular domains such as point clouds, social graphs, and molecular structures.

Deep Learning for Learning Graph Representations

no code implementations 2 Jan 2020 Wenwu Zhu, Xin Wang, Peng Cui

Mining graph data has become a popular research topic in computer science and has been widely studied in both academia and industry given the increasing amount of network data in the recent years.

Network Embedding

Reject Illegal Inputs with Generative Classifier Derived from Any Discriminative Classifier

no code implementations 2 Jan 2020 Xin Wang

Experiments on illegal inputs, including adversarial examples, samples with common corruptions, and out-of-distribution (OOD) samples, show that, when allowed to reject a portion of test samples, SDIM-logit significantly improves the performance on the remaining test sets.

Automated Pavement Crack Segmentation Using U-Net-based Convolutional Neural Network

no code implementations 7 Jan 2020 Stephen L. H. Lau, Edwin K. P. Chong, Xu Yang, Xin Wang

In this paper, we propose a deep learning technique based on a convolutional neural network to perform segmentation tasks on pavement crack images.

Crack Segmentation Feature Engineering +2

Joint User Identification, Channel Estimation, and Signal Detection for Grant-Free NOMA

no code implementations 12 Jan 2020 Shuchao Jiang, Xiaojun Yuan, Xin Wang, Chongbin Xu, Wei Yu

To address the problem that the exact calculation of the messages exchanged within CSCE and between the two modules is complicated due to phase ambiguity issues, this paper proposes a rotationally invariant Gaussian mixture (RIGM) model, and develops an efficient JUICESD-RIGM algorithm.

Domain Embedded Multi-model Generative Adversarial Networks for Image-based Face Inpainting

no code implementations 5 Feb 2020 Xian Zhang, Xin Wang, Bin Kong, Youbing Yin, Qi Song, Siwei Lyu, Jiancheng Lv, Canghong Shi, Xiaojie Li

We firstly represent only face regions using the latent variable as the domain knowledge and combine it with the non-face parts textures to generate high-quality face images with plausible contents.

Facial Inpainting

Category-wise Attack: Transferable Adversarial Examples for Anchor Free Object Detection

no code implementations 10 Feb 2020 Quanyu Liao, Xin Wang, Bin Kong, Siwei Lyu, Youbing Yin, Qi Song, Xi Wu

Deep neural networks have been demonstrated to be vulnerable to adversarial attacks: subtle perturbations can completely change the classification results.

Object object-detection +1

Artificial Intelligence Distinguishes COVID-19 from Community Acquired Pneumonia on Chest CT

1 code implementation Radiology 2020 Lin Li, Lixin Qin, Zeguo Xu, Youbing Yin, Xin Wang, Bin Kong, Junjie Bai, Yi Lu, Zhenghan Fang, Qi Song, Kunlin Cao, Daliang Liu, Guisheng Wang, Qizhong Xu, Xisheng Fang, Shiqin Zhang, Juan Xia, Jun Xia

Materials and Methods: In this retrospective and multi-center study, a deep learning model, COVID-19 detection neural network (COVNet), was developed to extract visual features from volumetric chest CT exams for the detection of COVID-19.

COVID-19 Image Segmentation Specificity

Improving Sampling Accuracy of Stochastic Gradient MCMC Methods via Non-uniform Subsampling of Gradients

no code implementations 20 Feb 2020 Ruilin Li, Xin Wang, Hongyuan Zha, Molei Tao

In our practical implementation of EWSG, the non-uniform subsampling is performed efficiently via a Metropolis-Hastings chain on the data index, which is coupled to the MCMC algorithm.

Computational Efficiency

Frustratingly Simple Few-Shot Object Detection

4 code implementations ICML 2020 Xin Wang, Thomas E. Huang, Trevor Darrell, Joseph E. Gonzalez, Fisher Yu

Such a simple approach outperforms the meta-learning methods by roughly 2~20 points on current benchmarks and sometimes even doubles the accuracy of the prior methods.

Few-Shot Object Detection Meta-Learning +2

Synergic Adversarial Label Learning for Grading Retinal Diseases via Knowledge Distillation and Multi-task Learning

no code implementations 24 Mar 2020 Lie Ju, Xin Wang, Xin Zhao, Huimin Lu, Dwarikanath Mahapatra, Paul Bonnington, ZongYuan Ge

In addition, we conduct additional experiments to show the effectiveness of SALL from the aspects of reliability and interpretability in the context of medical imaging application.

Classification General Classification +3

An adaptive neuro-fuzzy model for attitude estimation and control of a 3 DOF system

no code implementations 21 Apr 2020 Xin Wang, SeyedMehdi Abtahi, Mahmood Chahari, Tianyu Zhao

To evaluate the performance of the ANFIS controller in closed-loop simulation, an ANFIS observer is used to estimate the attitude and angular velocities of the satellite using magnetometer, sun sensor, and gyro data.

How fine can fine-tuning be? Learning efficient language models

no code implementations 24 Apr 2020 Evani Radiya-Dixit, Xin Wang

Given a language model pre-trained on massive unlabeled text corpora, only very light supervised fine-tuning is needed to learn a task: the number of fine-tuning steps is typically five orders of magnitude lower than the total parameter count.

Language Modelling

Introducing the VoicePrivacy Initiative

3 code implementations 4 May 2020 Natalia Tomashenko, Brij Mohan Lal Srivastava, Xin Wang, Emmanuel Vincent, Andreas Nautsch, Junichi Yamagishi, Nicholas Evans, Jose Patino, Jean-François Bonastre, Paul-Gauthier Noé, Massimiliano Todisco

The VoicePrivacy initiative aims to promote the development of privacy preservation tools for speech technology by gathering a new community to define the tasks of interest and the evaluation methodology, and benchmarking solutions through a series of challenges.

Benchmarking

Self-Supervised Deep Visual Odometry with Online Adaptation

no code implementations CVPR 2020 Shunkai Li, Xin Wang, Yingdian Cao, Fei Xue, Zike Yan, Hongbin Zha

In this paper, we propose an online meta-learning algorithm to enable VO networks to continuously adapt to new environments in a self-supervised manner.

Meta-Learning Visual Odometry

Variational quantum Gibbs state preparation with a truncated Taylor series

1 code implementation 18 May 2020 Youle Wang, Guangxi Li, Xin Wang

By performing numerical experiments, we show that shallow parameterized circuits with only one additional qubit can be trained to prepare the Ising chain and spin chain Gibbs states with a fidelity higher than 95%.

Quantum Machine Learning

Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis

no code implementations 20 May 2020 Yusuke Yasuda, Xin Wang, Junichi Yamagishi

Our experiments suggest that a) a neural sequence-to-sequence TTS system should have a sufficient number of model parameters to produce high quality speech, b) it should also use a powerful encoder when it takes characters as inputs, and c) the encoder still has room for improvement and needs an improved architecture to learn supra-segmental features more appropriately.

Speech Synthesis Text-To-Speech Synthesis

A Comprehensive Survey of Neural Architecture Search: Challenges and Solutions

no code implementations 1 Jun 2020 Pengzhen Ren, Yun Xiao, Xiaojun Chang, Po-Yao Huang, Zhihui Li, Xiaojiang Chen, Xin Wang

Neural Architecture Search (NAS) is just such a revolutionary algorithm, and the related research work is complicated and rich.

Neural Architecture Search

More Practical and Adaptive Algorithms for Online Quantum State Learning

no code implementations 1 Jun 2020 Yifang Chen, Xin Wang

This regret bound depends only on the maximum rank $M$ of measurements rather than the number of qubits, which takes advantage of low-rank measurements.

Variational Quantum Singular Value Decomposition

1 code implementation 3 Jun 2020 Xin Wang, Zhixin Song, Youle Wang

In this work, we propose a variational quantum algorithm for singular value decomposition (VQSVD).

Image Compression Recommendation Systems

Elliptic Blowup Equations for 6d SCFTs. IV: Matters

1 code implementation 4 Jun 2020 Jie Gu, Babak Haghighat, Albrecht Klemm, Kaiwen Sun, Xin Wang

Given the recent geometrical classification of 6d $(1, 0)$ SCFTs, a major question is how to compute for this large class their elliptic genera.

High Energy Physics - Theory Mathematical Physics

Learning Continuous-Time Dynamics by Stochastic Differential Networks

no code implementations 11 Jun 2020 Yingru Liu, Yucheng Xing, Xuewen Yang, Xin Wang, Jing Shi, Di Jin, Zhaoyue Chen

Learning continuous-time stochastic dynamics is a fundamental and essential problem in modeling sporadic time series, whose observations are irregular and sparse in both time and dimension.

Time Series Time Series Analysis

Cooling-Aware Resource Allocation and Load Management for Mobile Edge Computing Systems

no code implementations 19 Jun 2020 Xiaojing Chen, Zhouyu Lu, Wei Ni, Xin Wang, Feng Wang, Shunqing Zhang, Shugong Xu

Driven by explosive computation demands of Internet of Things (IoT), mobile edge computing (MEC) provides a promising technique to enhance the computation capability for mobile users.

Edge-computing Management +1

The curious case of developmental BERTology: On sparsity, transfer learning, generalization and the brain

no code implementations 7 Jul 2020 Xin Wang

In this essay, we explore a point of intersection between deep learning and neuroscience, through the lens of large language models, transfer learning and network compression.

Transfer Learning

Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification: Fundamentals

no code implementations 12 Jul 2020 Tomi Kinnunen, Héctor Delgado, Nicholas Evans, Kong Aik Lee, Ville Vestman, Andreas Nautsch, Massimiliano Todisco, Xin Wang, Md Sahidullah, Junichi Yamagishi, Douglas A. Reynolds

Recent years have seen growing efforts to develop spoofing countermeasures (CMs) to protect automatic speaker verification (ASV) systems from being deceived by manipulated or artificial inputs.

Speaker Verification

Hierarchical Interaction Networks with Rethinking Mechanism for Document-level Sentiment Analysis

1 code implementation 16 Jul 2020 Lingwei Wei, Dou Hu, Wei Zhou, Xuehai Tang, Xiaodan Zhang, Xin Wang, Jizhong Han, Songlin Hu

Furthermore, we design a Sentiment-based Rethinking mechanism (SR) by refining the HIN with sentiment label information to learn a more sentiment-aware document representation.

Sentiment Analysis Sentiment Classification +1

A SLAM Map Restoration Algorithm Based on Submaps and an Undirected Connected Graph

no code implementations 29 Jul 2020 Zongqian Zhan, Wenjie Jian, Yi-Hui Li, Xin Wang, Yang Yue

To solve the missing map problem, which is an issue in many applications, after the tracking is lost, based on monocular visual SLAM, we present a method of reconstructing a complete global map of UAV datasets by sequentially merging the submaps via the corresponding undirected connected graph.

Simultaneous Localization and Mapping

Shortcuts to Adiabaticity for the Quantum Rabi Model: Efficient Generation of Giant Entangled Cat States via Parametric Amplification

no code implementations 10 Aug 2020 Ye-Hong Chen, Wei Qin, Xin Wang, Adam Miranowicz, Franco Nori

We propose a method for the fast generation of nonclassical ground states of the Rabi model in the ultrastrong and deep-strong coupling regimes via the shortcuts-to-adiabatic (STA) dynamics.

Quantum Physics

Deep Learning to Quantify Pulmonary Edema in Chest Radiographs

1 code implementation 13 Aug 2020 Steven Horng, Ruizhi Liao, Xin Wang, Sandeep Dalal, Polina Golland, Seth J. Berkowitz

Results: The area under the receiver operating characteristic curve (AUC) for differentiating alveolar edema from no edema was 0.99 for the semi-supervised model and 0.87 for the pre-trained models.

Learning Tuple Compatibility for Conditional Outfit Recommendation

no code implementations 18 Aug 2020 Xuewen Yang, Dongliang Xie, Xin Wang, Jiangbo Yuan, Wanying Ding, Pengyun Yan

Our contributions include: 1) Designing a Mixed Category Attention Net (MCAN) which integrates both fine-grained and coarse category information into recommendation and learns the compatibility among fashion tuples.

Cultural Vocal Bursts Intensity Prediction Recommendation Systems

Joint Modeling of Chest Radiographs and Radiology Reports for Pulmonary Edema Assessment

1 code implementation 22 Aug 2020 Geeticka Chauhan, Ruizhi Liao, William Wells, Jacob Andreas, Xin Wang, Seth Berkowitz, Steven Horng, Peter Szolovits, Polina Golland

To take advantage of the rich information present in the radiology reports, we develop a neural network model that is trained on both images and free-text to assess pulmonary edema severity from chest radiographs at inference time.

Image Classification Representation Learning

Disentangled Self-Supervision in Sequential Recommenders

1 code implementation 23 Aug 2020 Jianxin Ma, Chang Zhou, Hongxia Yang, Peng Cui, Xin Wang, Wenwu Zhu

There exist two challenges: i) reconstructing a future sequence containing many behaviors is exponentially harder than reconstructing a single next behavior, which can lead to difficulty in convergence, and ii) the sequence of all future behaviors can involve many intentions, not all of which may be predictable from the sequence of earlier behaviors.

Disentanglement

Crossing-Domain Generative Adversarial Networks for Unsupervised Multi-Domain Image-to-Image Translation

no code implementations 27 Aug 2020 Xuewen Yang, Dongliang Xie, Xin Wang

In this work, we propose a general framework for unsupervised image-to-image translation across multiple domains, which can translate images from domain X to any other domain without requiring direct training between the two domains involved in image translation.

Translation Unsupervised Image-To-Image Translation

Learning by Minimizing the Sum of Ranked Range

1 code implementation NeurIPS 2020 Shu Hu, Yiming Ying, Xin Wang, Siwei Lyu

In forming learning objectives, one oftentimes needs to aggregate a set of individual values to a single output.

Binary Classification General Classification +2

A Unified Approach to Interpreting and Boosting Adversarial Transferability

1 code implementation 8 Oct 2020 Xin Wang, Jie Ren, Shuyun Lin, Xiangming Zhu, Yisen Wang, Quanshi Zhang

We discover and prove the negative correlation between the adversarial transferability and the interaction inside adversarial perturbations.

Holistic Combination of Structural and Textual Code Information for Context based API Recommendation

no code implementations 15 Oct 2020 Chi Chen, Xin Peng, Zhenchang Xing, Jun Sun, Xin Wang, Yifan Zhao, Wenyun Zhao

APIRec-CST is a deep learning model that combines the API usage with the text information in the source code based on an API Context Graph Network and a Code Token Network that simultaneously learn structural and textual features for API recommendation.

End-to-End Text-to-Speech using Latent Duration based on VQ-VAE

no code implementations 19 Oct 2020 Yusuke Yasuda, Xin Wang, Junichi Yamagishi

Explicit duration modeling is a key to achieving robust and efficient alignment in text-to-speech synthesis (TTS).

Speech Synthesis Text-To-Speech Synthesis

SWIPENET: Object detection in noisy underwater images

no code implementations 19 Oct 2020 Long Chen, Feixiang Zhou, Shengke Wang, Junyu Dong, Ning Li, Haiping Ma, Xin Wang, Huiyu Zhou

Moreover, inspired by the human education process that drives the learning from easy to hard concepts, we here propose the CMA training paradigm that first trains a clean detector which is free from the influence of noisy data.

Object object-detection +1

A Survey on Curriculum Learning

no code implementations 25 Oct 2020 Xin Wang, Yudong Chen, Wenwu Zhu

We discuss works on curriculum learning within a general CL framework, elaborating on how to design a manually predefined curriculum or an automatic curriculum.

Active Learning BIG-bench Machine Learning +3

Pretraining Strategies, Waveform Model Choice, and Acoustic Configurations for Multi-Speaker End-to-End Speech Synthesis

no code implementations 10 Nov 2020 Erica Cooper, Xin Wang, Yi Zhao, Yusuke Yasuda, Junichi Yamagishi

We explore pretraining strategies including choice of base corpus with the aim of choosing the best strategy for zero-shot multi-speaker end-to-end synthesis.

Speech Synthesis

Attention Aware Cost Volume Pyramid Based Multi-view Stereo Network for 3D Reconstruction

1 code implementation 25 Nov 2020 Anzhu Yu, Wenyue Guo, Bing Liu, Xin Chen, Xin Wang, Xuefeng Cao, Bingchuan Jiang

This strategy estimates the depth map at coarsest level, while the depth maps at finer levels are considered as the upsampled depth map from previous level with pixel-wise depth residual.

3D Reconstruction

Leveraging Regular Fundus Images for Training UWF Fundus Diagnosis Models via Adversarial Learning and Pseudo-Labeling

no code implementations 27 Nov 2020 Lie Ju, Xin Wang, Xin Zhao, Paul Bonnington, Tom Drummond, ZongYuan Ge

We propose the use of a modified cycle generative adversarial network (CycleGAN) model to bridge the gap between regular and UWF fundus and generate additional UWF fundus images for training.

Generative Adversarial Network Lesion Detection

Sentence Matching with Syntax- and Semantics-Aware BERT

no code implementations COLING 2020 Tao Liu, Xin Wang, Chengguo Lv, Ranran Zhen, Guohong Fu

Sentence matching aims to identify the special relationship between two sentences, and plays a key role in many natural language processing tasks.

Sentence

Partially Connected Automated Vehicle Cooperative Control Strategy with a Deep Reinforcement Learning Approach

no code implementations 3 Dec 2020 Haotian Shi, Yang Zhou, Keshu Wu, Xin Wang, Yangxin Lin, Bin Ran

This paper proposes a cooperative strategy of connected and automated vehicles (CAVs) longitudinal control for partially connected and automated traffic environment based on deep reinforcement learning (DRL) algorithm, which enhances the string stability of mixed traffic, car following efficiency, and energy efficiency.

reinforcement-learning Reinforcement Learning (RL)

Variational Quantum Algorithms for Trace Distance and Fidelity Estimation

1 code implementation 10 Dec 2020 Ranyiliu Chen, Zhixin Song, Xuanqiang Zhao, Xin Wang

A novel variational algorithm for trace distance estimation is then derived from this technique, with the assistance of a single ancillary qubit.

Quantum Physics Information Theory Mathematical Physics Optimization and Control

VSQL: Variational Shadow Quantum Learning for Classification

1 code implementation 15 Dec 2020 Guangxi Li, Zhixin Song, Xin Wang

Classification of quantum data is essential for quantum machine learning and near-term quantum technologies.

BIG-bench Machine Learning Classification +3

Noise-Assisted Quantum Autoencoder

1 code implementation 15 Dec 2020 Chenfeng Cao, Xin Wang

Based on this understanding, we present a noise-assisted quantum autoencoder algorithm to go beyond the limitations; our model can achieve high recovering fidelity for general input states.

Quantum Physics

Joint Optimization of Trajectory, Propulsion and Thrust Powers for Covert UAV-on-UAV Video Tracking and Surveillance

no code implementations 22 Dec 2020 Shuyan Hu, Wei Ni, Xin Wang, Abbas Jamalipour, Dean Ta

Autonomous tracking of suspicious unmanned aerial vehicles (UAVs) by legitimate monitoring UAVs (or monitors) can be crucial to public safety and security.

Generalizable control for multiparameter quantum metrology

no code implementations 24 Dec 2020 Han Xu, Lingna Wang, Haidong Yuan, Xin Wang

Here we study the generalizability of optimal control, namely, optimal controls that can be systematically updated across a range of parameters with minimal cost.

Quantum Physics

Detecting and quantifying entanglement on near-term quantum devices

1 code implementation 28 Dec 2020 Kun Wang, Zhixin Song, Xuanqiang Zhao, Zihe Wang, Xin Wang

Firstly, it decomposes a positive map into a combination of quantum operations implementable on near-term quantum devices.

Quantum Physics Strongly Correlated Electrons

Intragroup sparsity for efficient inference

no code implementations 1 Jan 2021 Zilin Yu, Chao Wang, Xin Wang, Yong Zhao, Xundong Wu

This work studies intragroup sparsity, a fine-grained structural constraint on network weight parameters.

TkML-AP: Adversarial Attacks to Top-k Multi-Label Learning

1 code implementation ICCV 2021 Shu Hu, Lipeng Ke, Xin Wang, Siwei Lyu

Top-k multi-label learning, which returns the top-k predicted labels from an input, has many practical applications such as image annotation, document analysis, and web search engine.

Multi-Label Learning

Towards A Unified Understanding and Improving of Adversarial Transferability

no code implementations ICLR 2021 Xin Wang, Jie Ren, Shuyun Lin, Xiangming Zhu, Yisen Wang, Quanshi Zhang

We discover and prove the negative correlation between the adversarial transferability and the interaction inside adversarial perturbations.

A Marching Cube Algorithm Based on Edge Growth

no code implementations 3 Jan 2021 Xin Wang, Su Gao, Monan Wang, Zhenghua Duan

When only the main contour of the 3D model needs to be organized, the algorithm performs well.

3D Reconstruction Graphics

Multimodal Gait Recognition for Neurodegenerative Diseases

1 code implementation 7 Jan 2021 Aite Zhao, Jianbo Li, Junyu Dong, Lin Qi, Qianni Zhang, Ning Li, Xin Wang, Huiyu Zhou

In recent years, single modality based gait recognition has been extensively explored in the analysis of medical images or other sensory data, and it is recognised that each of the established approaches has different strengths and weaknesses.

Gait Recognition

Instance-Aware Predictive Navigation in Multi-Agent Environments

1 code implementation 14 Jan 2021 Jinkun Cao, Xin Wang, Trevor Darrell, Fisher Yu

To decide the action at each step, we seek the action sequence that can lead to safe future states based on the prediction module outputs by repeatedly sampling likely action sequences.

The dynamic energy balance in earthquakes expressed by fault surface morphology

no code implementations 18 Jan 2021 Xin Wang, Juan Liu, Feng Gao, Zhizhen Zhang

The fault surface morphology is the direct result of the microscopic processes near the crack tip or on the frictional interface.

Geophysics

A Closer Look at Temporal Sentence Grounding in Videos: Dataset and Metric

no code implementations 22 Jan 2021 Yitian Yuan, Xiaohan Lan, Xin Wang, Long Chen, Zhi Wang, Wenwu Zhu

All the results demonstrate that the re-organized dataset splits and new metric can better monitor the progress in TSGV.

Benchmarking Sentence +1

Practical distributed quantum information processing with LOCCNet

2 code implementations 28 Jan 2021 Xuanqiang Zhao, Benchi Zhao, Zihe Wang, Zhixin Song, Xin Wang

Here we introduce LOCCNet, a machine learning framework facilitating protocol design and optimization for distributed quantum information processing tasks.

BIG-bench Machine Learning Quantum Machine Learning

Explicit Perturbations to the Stabilizer $\tau = {\rm i}$ of Modular $A^\prime_5$ Symmetry and Leptonic CP Violation

no code implementations 8 Feb 2021 Xin Wang, Shun Zhou

In a class of neutrino mass models with modular flavor symmetries, it has been observed that CP symmetry is preserved at the fixed point (or stabilizer) of the modulus parameter $\tau = {\rm i}$, whereas significant CP violation emerges within the neighbourhood of this stabilizer.

High Energy Physics - Phenomenology

MetaDelta: A Meta-Learning System for Few-shot Image Classification

1 code implementation 22 Feb 2021 Yudong Chen, Chaoyu Guan, Zhikun Wei, Xin Wang, Wenwu Zhu

Meta-learning aims at learning quickly on novel tasks with limited data by transferring generic experience learned from previous tasks.

Classification Decoder +3

Improving Medical Image Classification with Label Noise Using Dual-uncertainty Estimation

no code implementations 28 Feb 2021 Lie Ju, Xin Wang, Lin Wang, Dwarikanath Mahapatra, Xin Zhao, Mehrtash Harandi, Tom Drummond, Tongliang Liu, ZongYuan Ge

In this paper, we systematically discuss and define the two common types of label noise in medical images: disagreement label noise from inconsistent expert opinions and single-target label noise from wrong diagnosis records.

Benchmarking General Classification +3

Automated Machine Learning on Graphs: A Survey

2 code implementations 1 Mar 2021 Ziwei Zhang, Xin Wang, Wenwu Zhu

Machine learning on graphs has been extensively studied in both academia and industry.

BIG-bench Machine Learning Graph Learning +1

A Hybrid Quantum-Classical Hamiltonian Learning Algorithm

no code implementations 1 Mar 2021 Youle Wang, Guangxi Li, Xin Wang

Hamiltonian learning is crucial to the certification of quantum devices and quantum simulators.

A Unified Game-Theoretic Interpretation of Adversarial Robustness

1 code implementation 12 Mar 2021 Jie Ren, Die Zhang, Yisen Wang, Lu Chen, Zhanpeng Zhou, Yiting Chen, Xu Cheng, Xin Wang, Meng Zhou, Jie Shi, Quanshi Zhang

This paper provides a unified view to explain different adversarial attacks and defense methods, i.e., the view of multi-order interactions between input variables of DNNs.

Adversarial Robustness

A Comparative Study on Recent Neural Spoofing Countermeasures for Synthetic Speech Detection

no code implementations 21 Mar 2021 Xin Wang, Junichi Yamagishi

A great deal of recent research effort on speech spoofing countermeasures has been invested into back-end neural networks and training criteria.

Synthetic Speech Detection

Online Learning of a Probabilistic and Adaptive Scene Representation

no code implementations CVPR 2021 Zike Yan, Xin Wang, Hongbin Zha

Constructing and maintaining a consistent scene model on-the-fly is the core task for online spatial perception, interpretation, and action.

Attention Back-end for Automatic Speaker Verification with Multiple Enrollment Utterances

1 code implementation 4 Apr 2021 Chang Zeng, Xin Wang, Erica Cooper, Xiaoxiao Miao, Junichi Yamagishi

Probabilistic linear discriminant analysis (PLDA) or cosine similarity have been widely used in traditional speaker verification systems as back-end techniques to measure pairwise similarities.

Speaker Verification

Multi-Scale Context Aggregation Network with Attention-Guided for Crowd Counting

1 code implementation 6 Apr 2021 Xin Wang, Yang Zhao, Tangwen Yang, Qiuqi Ruan

In this paper, we propose a multi-scale context aggregation network (MSCANet) based on single-column encoder-decoder architecture for crowd counting, which consists of an encoder based on a dense context-aware module (DCAM) and a hierarchical attention-guided decoder.

Crowd Counting Decoder

An Initial Investigation for Detecting Partially Spoofed Audio

no code implementations 6 Apr 2021 Lin Zhang, Xin Wang, Erica Cooper, Junichi Yamagishi, Jose Patino, Nicholas Evans

By definition, partially-spoofed utterances contain a mix of both spoofed and bona fide segments, which will likely degrade the performance of countermeasures trained with entirely spoofed utterances.

Voice Anti-spoofing

Robust Object Detection via Instance-Level Temporal Cycle Confusion

1 code implementation ICCV 2021 Xin Wang, Thomas E. Huang, Benlin Liu, Fisher Yu, Xiaolong Wang, Joseph E. Gonzalez, Trevor Darrell

Building reliable object detectors that are robust to domain shifts, such as various changes in context, viewpoint, and object appearances, is critical for real-world applications.

Object object-detection +2

Relational Subsets Knowledge Distillation for Long-tailed Retinal Diseases Recognition

no code implementations 22 Apr 2021 Lie Ju, Xin Wang, Lin Wang, Tongliang Liu, Xin Zhao, Tom Drummond, Dwarikanath Mahapatra, ZongYuan Ge

For example, there are estimated to be more than 40 different kinds of retinal diseases with variable morbidity, but more than 30 of these conditions are very rare in global patient cohorts, which results in a typical long-tailed learning problem for deep learning-based screening models.

Knowledge Distillation

Adversarial Attack Framework on Graph Embedding Models with Limited Knowledge

no code implementations 26 May 2021 Heng Chang, Yu Rong, Tingyang Xu, Wenbing Huang, Honglei Zhang, Peng Cui, Xin Wang, Wenwu Zhu, Junzhou Huang

We investigate the theoretical connections between graph signal processing and graph embedding models and formulate the graph embedding model as a general graph signal process with a corresponding graph filter.

Adversarial Attack Graph Embedding +1

A Multi-Level Attention Model for Evidence-Based Fact Checking

1 code implementation Findings (ACL) 2021 Canasai Kruengkrai, Junichi Yamagishi, Xin Wang

Evidence-based fact checking aims to verify the truthfulness of a claim against evidence extracted from textual sources.

Fact Checking Sentence

Imperceptible Adversarial Examples for Fake Image Detection

no code implementations 3 Jun 2021 Quanyu Liao, Yuezun Li, Xin Wang, Bin Kong, Bin Zhu, Siwei Lyu, Youbing Yin, Qi Song, Xi Wu

Fooling people with highly realistic fake images generated with Deepfake or GANs causes great social disturbance.

Face Swapping Fake Image Detection

Transferable Adversarial Examples for Anchor Free Object Detection

no code implementations 3 Jun 2021 Quanyu Liao, Xin Wang, Bin Kong, Siwei Lyu, Bin Zhu, Youbing Yin, Qi Song, Xi Wu

Deep neural networks have been demonstrated to be vulnerable to adversarial attacks: a subtle perturbation can completely change the prediction result.

Adversarial Attack Object +2

Visual Question Rewriting for Increasing Response Rate

no code implementations 4 Jun 2021 Jiayi Wei, Xilian Li, Yi Zhang, Xin Wang

Offline experiments and Mechanical Turk-based evaluations show that it is possible to rewrite bland questions in a more detailed and attractive way to increase the response rate, and that images can be helpful.

4k Question Rewriting

Sum of Ranked Range Loss for Supervised Learning

1 code implementation 7 Jun 2021 Shu Hu, Yiming Ying, Xin Wang, Siwei Lyu

A combination loss of AoRR and TKML is proposed as a new learning objective for improving the robustness of multi-label learning in the face of outliers in samples and labels alike.

Multi-class Classification Multi-Label Learning

DETReg: Unsupervised Pretraining with Region Priors for Object Detection

1 code implementation CVPR 2022 Amir Bar, Xin Wang, Vadim Kantorov, Colorado J Reed, Roei Herzig, Gal Chechik, Anna Rohrbach, Trevor Darrell, Amir Globerson

Recent self-supervised pretraining methods for object detection largely focus on pretraining the backbone of the object detector, neglecting key parts of detection architecture.

Few-Shot Learning Few-Shot Object Detection +6

Visualizing Classifier Adjacency Relations: A Case Study in Speaker Verification and Voice Anti-Spoofing

1 code implementation 11 Jun 2021 Tomi Kinnunen, Andreas Nautsch, Md Sahidullah, Nicholas Evans, Xin Wang, Massimiliano Todisco, Héctor Delgado, Junichi Yamagishi, Kong Aik Lee

Whether it be for results summarization, or the analysis of classifier fusion, some means to compare different classifiers can often provide illuminating insight into their behaviour, (dis)similarity or complementarity.

Speaker Verification Voice Anti-spoofing

Medical Matting: A New Perspective on Medical Segmentation with Uncertainty

1 code implementation 18 Jun 2021 Lin Wang, Lie Ju, Xin Wang, Wanji He, Donghao Zhang, Yelin Huang, Zhiwen Yang, Xuan Yao, Xin Zhao, Xiufen Ye, ZongYuan Ge

None of them investigate the influence of the ambiguous nature of the lesion itself. Inspired by image matting, this paper introduces alpha matte as a soft mask to represent uncertain areas in medical scenes and accordingly puts forward a new uncertainty quantification method to fill the gap of uncertainty research for lesion structure.

Image Matting Image Segmentation +3
