Search Results for author: Bo Xu

Found 80 papers, 20 papers with code

Bridging the Gap between Prior and Posterior Knowledge Selection for Knowledge-Grounded Dialogue Generation

no code implementations EMNLP 2020 Xiuyi Chen, Fandong Meng, Peng Li, Feilong Chen, Shuang Xu, Bo Xu, Jie Zhou

Here, we deal with these issues on two aspects: (1) We enhance the prior selection module with the necessary posterior information obtained from the specially designed Posterior Information Prediction Module (PIPM); (2) We propose a Knowledge Distillation Based Training Strategy (KDBTS) to train the decoder with the knowledge selected from the prior distribution, removing the exposure bias of knowledge selection.

Dialogue Generation Knowledge Distillation +1

Virtual Multi-Modality Self-Supervised Foreground Matting for Human-Object Interaction

no code implementations ICCV 2021 Bo Xu, Han Huang, Cheng Lu, Ziwen Li, Yandong Guo

In this paper, we propose a Virtual Multi-modality Foreground Matting (VMFM) method to learn human-object interactive foreground (human and objects interacted with him or her) from a raw RGB image.

Human-Object Interaction Detection

Tao: A Learning Framework for Adaptive Nearest Neighbor Search using Static Features Only

no code implementations 2 Oct 2021 Kaixiang Yang, Hongya Wang, Bo Xu, Wei Wang, Yingyuan Xiao, Ming Du, Junfeng Zhou

In the middle of query execution, AdaptNN collects a number of runtime features and predicts a termination condition for each individual query, by which better end-to-end latency is attained.

Feature Selection Information Retrieval

A Simple Unified Framework for Anomaly Detection in Deep Reinforcement Learning

no code implementations 21 Sep 2021 Hongming Zhang, Ke Sun, Bo Xu, Linglong Kong, Martin Müller

In this paper, we propose a simple yet effective anomaly detection framework for deep RL algorithms that simultaneously considers random, adversarial and out-of-distribution (OOD) state outliers.

Anomaly Detection Atari Games

Population-coding and Dynamic-neurons improved Spiking Actor Network for Reinforcement Learning

no code implementations 15 Jun 2021 Duzhen Zhang, Tielin Zhang, Shuncheng Jia, Xiang Cheng, Bo Xu

Based on a hybrid learning framework, where a spike actor-network infers actions from states and a deep critic network evaluates the actor, we propose a Population-coding and Dynamic-neurons improved Spiking Actor Network (PDSAN) for efficient state representation from two different scales: input coding and neuronal coding.

OpenAI Gym

WASE: Learning When to Attend for Speaker Extraction in Cocktail Party Environments

1 code implementation 13 Jun 2021 Yunzhe Hao, Jiaming Xu, Peng Zhang, Bo Xu

In the speaker extraction problem, it is found that additional information from the target speaker contributes to the tracking and extraction of the target speaker, which includes voiceprint, lip movement, facial expression, and spatial information.

Action Detection Activity Detection

Counterfactual Supporting Facts Extraction for Explainable Medical Record Based Diagnosis with Graph Network

no code implementations NAACL 2021 Haoran Wu, Wei Chen, Shuang Xu, Bo Xu

Specifically, we first structure the sequence of EMR into a hierarchical graph network and then obtain the causal relationship between multi-granularity features and diagnosis results through counterfactual intervention on the graph.

MIMO Self-attentive RNN Beamformer for Multi-speaker Speech Separation

no code implementations 17 Apr 2021 Xiyun Li, Yong Xu, Meng Yu, Shi-Xiong Zhang, Jiaming Xu, Bo Xu, Dong Yu

The spatial self-attention module is designed to attend on the cross-channel correlation in the covariance matrices.

automatic-speech-recognition Speech Quality +2

Network Representation Learning: From Traditional Feature Learning to Deep Learning

no code implementations 7 Mar 2021 Ke Sun, Lei Wang, Bo Xu, Wenhong Zhao, Shyh Wei Teng, Feng Xia

Network representation learning (NRL) is an effective graph analytics technique that helps users deeply understand the hidden characteristics of graph data.

Recommendation Systems Representation Learning

Graph Force Learning

no code implementations 7 Mar 2021 Ke Sun, Jiaying Liu, Shuo Yu, Bo Xu, Feng Xia

Features representation leverages the great power in network analysis tasks.

Graph Learning

Long-Running Speech Recognizer: An End-to-End Multi-Task Learning Framework for Online ASR and VAD

no code implementations 2 Mar 2021 Meng Li, Shiyu Zhou, Bo Xu

Experimental results on segmented speech data show that the proposed MTL framework outperforms the baseline single-task learning (STL) framework in ASR task.

Action Detection Activity Detection +3

MixSpeech: Data Augmentation for Low-resource Automatic Speech Recognition

no code implementations 25 Feb 2021 Linghui Meng, Jin Xu, Xu Tan, Jindong Wang, Tao Qin, Bo Xu

In this paper, we propose MixSpeech, a simple yet effective data augmentation method based on mixup for automatic speech recognition (ASR).

automatic-speech-recognition Data Augmentation +2

1st Place Solution to ECCV-TAO-2020: Detect and Represent Any Object for Tracking

1 code implementation 20 Jan 2021 Fei Du, Bo Xu, Jiasheng Tang, Yuqi Zhang, Fan Wang, Hao Li

We extend the classical tracking-by-detection paradigm to this tracking-any-object task.

Ranked #1 on Multi-Object Tracking on TAO (using extra training data)

Multi-Object Tracking

Efficiently Fusing Pretrained Acoustic and Linguistic Encoders for Low-resource Speech Recognition

no code implementations 17 Jan 2021 Cheng Yi, Shiyu Zhou, Bo Xu

In this work, we fuse a pre-trained acoustic encoder (wav2vec2.0) and a pre-trained linguistic encoder (BERT) into an end-to-end ASR model.

automatic-speech-recognition End-To-End Speech Recognition +2

Research on Fast Text Recognition Method for Financial Ticket Image

no code implementations 5 Jan 2021 Fukang Tian, Haiyu Wu, Bo Xu

At present, a few works have applied deep learning methods to financial ticket recognition.

Region Proposal

Applying Wav2vec2.0 to Speech Recognition in Various Low-resource Languages

no code implementations 22 Dec 2020 Cheng Yi, Jianzhong Wang, Ning Cheng, Shiyu Zhou, Bo Xu

To verify its universality over languages, we apply pre-trained models to solve low-resource speech recognition tasks in various spoken languages.

Speech Recognition

CIF-based Collaborative Decoding for End-to-end Contextual Speech Recognition

no code implementations 17 Dec 2020 Minglun Han, Linhao Dong, Shiyu Zhou, Bo Xu

End-to-end (E2E) models have achieved promising results on multiple speech recognition benchmarks, and shown the potential to become the mainstream.

Speech Recognition

Research on All-content Text Recognition Method for Financial Ticket Image

no code implementations 15 Dec 2020 Fukang Tian, Haiyu Wu, Bo Xu

With the development of the economy, the number of financial tickets increases rapidly.

Knowledge Aware Emotion Recognition in Textual Conversations via Multi-Task Incremental Transformer

no code implementations COLING 2020 Duzhen Zhang, Xiuyi Chen, Shuang Xu, Bo Xu

For one thing, speakers often rely on the context and commonsense knowledge to express emotions; for another, most utterances contain neutral emotion in conversations, as a result, the confusion between a few non-neutral utterances and much more neutral ones restrains the emotion recognition performance.

Emotion Recognition Graph Attention +3

Audio-visual Speech Separation with Adversarially Disentangled Visual Representation

no code implementations 29 Nov 2020 Peng Zhang, Jiaming Xu, Jing Shi, Yunzhe Hao, Bo Xu

In our model, we use the face detector to detect the number of speakers in the scene and use visual information to avoid the permutation problem.

Speech Separation

Financial ticket intelligent recognition system based on deep learning

no code implementations 29 Oct 2020 Fukang Tian, Haiyu Wu, Bo Xu

Facing the rapid growth in the issuance of financial tickets (or bills, invoices etc.

Tuning Convolutional Spiking Neural Network with Biologically-plausible Reward Propagation

no code implementations 9 Oct 2020 Tielin Zhang, Shuncheng Jia, Xiang Cheng, Bo Xu

The performance of the proposed BRP-SNN is further verified on the spatial (including MNIST and Cifar-10) and temporal (including TIDigits and DvsGesture) tasks, where the SNN using BRP has reached a similar accuracy compared to other state-of-the-art BP-based SNNs and saved 50% more computational cost than ANNs.

Finite Meta-Dynamic Neurons in Spiking Neural Networks for Spatio-temporal Learning

no code implementations 7 Oct 2020 Xiang Cheng, Tielin Zhang, Shuncheng Jia, Bo Xu

Spiking Neural Networks (SNNs) have incorporated more biologically-plausible structures and learning principles, hence are playing critical roles in bridging the gap between artificial and natural neural networks.

Consecutive Decoding for Speech-to-text Translation

1 code implementation 21 Sep 2020 Qianqian Dong, Mingxuan Wang, Hao Zhou, Shuang Xu, Bo Xu, Lei Li

The key idea is to generate source transcript and target translation text with a single decoder.

Machine Translation Speech Recognition +2

Shifu2: A Network Representation Learning Based Model for Advisor-advisee Relationship Mining

no code implementations 17 Aug 2020 Jiaying Liu, Feng Xia, Lei Wang, Bo Xu, Xiangjie Kong, Hanghang Tong, Irwin King

The advisor-advisee relationship represents direct knowledge heritage, and such relationship may not be readily available from academic libraries and search engines.

Representation Learning

DINE: A Framework for Deep Incomplete Network Embedding

no code implementations 9 Aug 2020 Ke Hou, Jiaying Liu, Yin Peng, Bo Xu, Ivan Lee, Feng Xia

Empirically, we evaluate DINE over three networks on multi-label classification and link prediction tasks.

General Classification Link Prediction +3

MODEL: Motif-based Deep Feature Learning for Link Prediction

no code implementations 9 Aug 2020 Lei Wang, Jing Ren, Bo Xu, Jian-Xin Li, Wei Luo, Feng Xia

Link prediction plays an important role in network analysis and applications.

Link Prediction

Sequence to Multi-Sequence Learning via Conditional Chain Mapping for Mixture Signals

no code implementations NeurIPS 2020 Jing Shi, Xuankai Chang, Pengcheng Guo, Shinji Watanabe, Yusuke Fujita, Jiaming Xu, Bo Xu, Lei Xie

This model additionally has a simple and efficient stop criterion for the end of the transduction, making it able to infer the variable number of output sequences.

Speech Recognition Speech Separation

Speaker-Conditional Chain Model for Speech Separation and Extraction

no code implementations 25 Jun 2020 Jing Shi, Jiaming Xu, Yusuke Fujita, Shinji Watanabe, Bo Xu

With the predicted speaker information from whole observation, our model is helpful to solve the problem of conventional speech separation and speaker extraction for multi-round long recordings.

Audio and Speech Processing Sound

Deep Learning Guided Building Reconstruction from Satellite Imagery-derived Point Clouds

no code implementations 19 May 2020 Bo Xu, Xu Zhang, Zhixin Li, Matt Leotta, Shih-Fu Chang, Jie Shan

For points that belong to the same roof shape, a multi-cue, hierarchical RANSAC approach is proposed for efficient and reliable segmenting and reconstructing the building point cloud.

3D Reconstruction

Discriminative Multi-modality Speech Recognition

2 code implementations CVPR 2020 Bo Xu, Cheng Lu, Yandong Guo, Jacob Wang

Vision is often used as a complementary modality for audio speech recognition (ASR), especially in the noisy environment where performance of solo audio modality significantly deteriorates.

Audio-Visual Speech Recognition Lipreading +1

DMRM: A Dual-channel Multi-hop Reasoning Model for Visual Dialog

1 code implementation 18 Dec 2019 Feilong Chen, Fandong Meng, Jiaming Xu, Peng Li, Bo Xu, Jie Zhou

Visual Dialog is a vision-language task that requires an AI agent to engage in a conversation with humans grounded in an image.

Visual Dialog

Few-Features Attack to Fool Machine Learning Models through Mask-Based GAN

no code implementations 12 Nov 2019 Feng Chen, Yunkai Shang, Bo Xu, Jincheng Hu

In comparison with the previous non-learning adversarial example attack approaches, the GAN-based adversarial attack example approach can generate the adversarial samples quickly using the GAN architecture every time facing a new sample after training, but meanwhile needs to perturb the attack samples in great quantities, which results in the unpractical application in reality.

Adversarial Attack

Unsupervised pre-training for sequence to sequence speech recognition

no code implementations 28 Oct 2019 Zhiyun Fan, Shiyu Zhou, Bo Xu

The unsupervised pre-training is finished on AISHELL-2 dataset and we apply the pre-trained model to multiple paired data ratios of AISHELL-1 and HKUST.

automatic-speech-recognition Sequence-To-Sequence Speech Recognition +1

Iterative Update and Unified Representation for Multi-Agent Reinforcement Learning

no code implementations 16 Aug 2019 Jiancheng Long, Hongming Zhang, Tianyang Yu, Bo Xu

In this method, iterative update can greatly alleviate the nonstationarity of the environment, unified representation can speed up the interaction with environment and avoid the linear growth of memory usage.

Multi-agent Reinforcement Learning

A Working Memory Model for Task-oriented Dialog Response Generation

no code implementations ACL 2019 Xiuyi Chen, Jiaming Xu, Bo Xu

Our WMM2Seq adopts a working memory to interact with two separated long-term memories, which are the episodic memory for memorizing dialog history and the semantic memory for storing KB tuples.

The World in My Mind: Visual Dialog with Adversarial Multi-modal Feature Encoding

no code implementations NAACL 2019 Yiqun Yao, Jiaming Xu, Bo Xu

Visual Dialog is a multi-modal task that requires a model to participate in a multi-turn human dialog grounded on an image, and generate correct, human-like responses.

Visual Dialog

CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition

no code implementations 27 May 2019 Linhao Dong, Bo Xu

In this paper, we propose a novel soft and monotonic alignment mechanism used for sequence transduction.

End-To-End Speech Recognition Language Modelling +2

Reference-Based Sequence Classification

no code implementations 17 May 2019 Zengyou He, Guangyao Xu, Chaohua Sheng, Bo Xu, Quan Zou

By utilizing this framework as a tool, we propose new sequence classification algorithms that are quite different from existing solutions.

Classification General Classification

Material Segmentation of Multi-View Satellite Imagery

no code implementations 17 Apr 2019 Matthew Purri, Jia Xue, Kristin Dana, Matthew Leotta, Dan Lipsa, Zhixin Li, Bo Xu, Jie Shan

The residuals are computed by differencing the sparse-sampled reflectance function with a dictionary of pre-defined dense-sampled reflectance functions.

Material Recognition Semantic Segmentation

Self-Attention Aligner: A Latency-Control End-to-End Model for ASR Using Self-Attention Network and Chunk-Hopping

no code implementations 18 Feb 2019 Linhao Dong, Feng Wang, Bo Xu

Experiments on two Mandarin ASR datasets show the replacement of RNNs by the self-attention networks yields an 8.4%-10.2% relative character error rate (CER) reduction.

automatic-speech-recognition Language Modelling +1

A Biologically Plausible Supervised Learning Method for Spiking Neural Networks Using the Symmetric STDP Rule

1 code implementation 17 Dec 2018 Yunzhe Hao, Xuhui Huang, Meng Dong, Bo Xu

By combining the sym-STDP rule with bio-plausible synaptic scaling and intrinsic plasticity of the dynamic threshold, our SNN model implemented SL well and achieved good performance in the benchmark recognition task (MNIST dataset).

Concept Learning through Deep Reinforcement Learning with Memory-Augmented Neural Networks

no code implementations 15 Nov 2018 Jing Shi, Jiaming Xu, Yiqun Yao, Bo Xu

In this paper, we present a memory-augmented neural network which is motivated by the process of human concept learning.

One-Shot Learning Outlier Detection

WECA: A WordNet-Encoded Collocation-Attention Network for Homographic Pun Recognition

no code implementations EMNLP 2018 Yufeng Diao, Hongfei Lin, Di Wu, Liang Yang, Kan Xu, Zhihao Yang, Jian Wang, Shaowu Zhang, Bo Xu, Dongyu Zhang

In this work, we first use WordNet to understand and expand word embedding for settling the polysemy of homographic puns, and then propose a WordNet-Encoded Collocation-Attention network model (WECA) which combined with the context weights for recognizing the puns.

Semi-Supervised Disfluency Detection

no code implementations COLING 2018 Feng Wang, Wei Chen, Zhen Yang, Qianqian Dong, Shuang Xu, Bo Xu

While the disfluency detection has achieved notable success in the past years, it still severely suffers from the data scarcity.

Machine Translation Question Answering

Construction of a Chinese Corpus for the Analysis of the Emotionality of Metaphorical Expressions

no code implementations ACL 2018 Dongyu Zhang, Hongfei Lin, Liang Yang, Shaowu Zhang, Bo Xu

However, there is little research on the construction of metaphor corpora annotated with emotion for the analysis of emotionality of metaphorical expressions.

Emotion Recognition

Single-channel Speech Dereverberation via Generative Adversarial Training

no code implementations 25 Jun 2018 Chenxing Li, Tieqiang Wang, Shuang Xu, Bo Xu

In this paper, we propose a single-channel speech dereverberation system (DeReGAT) based on convolutional, bidirectional long short-term memory and deep feed-forward neural network (CBLDNN) with generative adversarial training (GAT).

Speech Dereverberation Speech Quality

Multilingual End-to-End Speech Recognition with A Single Transformer on Low-Resource Languages

no code implementations 12 Jun 2018 Shiyu Zhou, Shuang Xu, Bo Xu

Experiments on CALLHOME datasets demonstrate that the multilingual ASR Transformer with the language symbol at the end performs better and can obtain a relative 10.5% average word error rate (WER) reduction compared to SHL-MLSTM with residual learning.

automatic-speech-recognition End-To-End Speech Recognition +2

NRTR: A No-Recurrence Sequence-to-Sequence Model For Scene Text Recognition

2 code implementations 4 Jun 2018 Fenfen Sheng, Zhineng Chen, Bo Xu

Considering scene image has large variation in text and background, we further design a modality-transform block to effectively transform 2D input images to 1D sequences, combined with the encoder to extract more discriminative features.

Scene Text Scene Text Recognition

A Comparison of Modeling Units in Sequence-to-Sequence Speech Recognition with the Transformer on Mandarin Chinese

no code implementations 16 May 2018 Shiyu Zhou, Linhao Dong, Shuang Xu, Bo Xu

Experiments on HKUST datasets demonstrate that the lexicon free modeling units can outperform lexicon related modeling units in terms of character error rate (CER).

automatic-speech-recognition Language Modelling +1

Syllable-Based Sequence-to-Sequence Speech Recognition with the Transformer in Mandarin Chinese

1 code implementation 28 Apr 2018 Shiyu Zhou, Linhao Dong, Shuang Xu, Bo Xu

Furthermore, we investigate a comparison between syllable based model and context-independent phoneme (CI-phoneme) based model with the Transformer in Mandarin Chinese.

automatic-speech-recognition Language Modelling +3

Unsupervised Neural Machine Translation with Weight Sharing

1 code implementation ACL 2018 Zhen Yang, Wei Chen, Feng Wang, Bo Xu

Unsupervised neural machine translation (NMT) is a recently proposed approach for machine translation which aims to train the model without using any labeled data.

Machine Translation Translation

Convolutional Neural Network with Word Embeddings for Chinese Word Segmentation

1 code implementation IJCNLP 2017 Chunqi Wang, Bo Xu

The first is that they heavily rely on manually designed bigram feature, i.e., they are not good at capturing n-gram features automatically.

Chinese Word Segmentation Feature Engineering +1

Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets

2 code implementations NAACL 2018 Zhen Yang, Wei Chen, Feng Wang, Bo Xu

During training, both the dynamic discriminator and the static BLEU objective are employed to evaluate the generated sentences and feedback the evaluations to guide the learning of the generator.

Machine Translation Translation

A Character-Aware Encoder for Neural Machine Translation

no code implementations COLING 2016 Zhen Yang, Wei Chen, Feng Wang, Bo Xu

This article proposes a novel character-aware neural machine translation (NMT) model that views the input sequences as sequences of characters rather than words.

Machine Translation Translation

Combining Lexical and Semantic-based Features for Answer Sentence Selection

no code implementations WS 2016 Jing Shi, Jiaming Xu, Yiqun Yao, Suncong Zheng, Bo Xu

As the result of the evaluation shows, our solution provides a valuable and brief model which could be used in modelling question answering or sentence semantic relevance.

Feature Engineering Open-Domain Question Answering

Text Classification Improved by Integrating Bidirectional LSTM with Two-dimensional Max Pooling

3 code implementations COLING 2016 Peng Zhou, Zhenyu Qi, Suncong Zheng, Jiaming Xu, Hongyun Bao, Bo Xu

To integrate the features on both dimensions of the matrix, this paper explores applying 2D max pooling operation to obtain a fixed-length representation of the text.

Classification General Classification +2

Hierarchical Memory Networks for Answer Selection on Unknown Words

1 code implementation COLING 2016 Jiaming Xu, Jing Shi, Yiqun Yao, Suncong Zheng, Bo Xu

Recently, end-to-end memory networks have shown promising results on Question Answering task, which encode the past facts into an explicit memory and perform reasoning ability by making multiple computational steps on the memory.

Answer Selection
