Search Results for author: Bo Xu

Found 109 papers, 36 papers with code

结合标签转移关系的多任务笑点识别方法(Multi-task punchlines recognition method combined with label transfer relationship)

no code implementations CCL 2021 Tongyue Zhang, Shaowu Zhang, Bo Xu, Liang Yang, Hongfei Lin

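“Humor plays an important role in human communication and is abundant in sitcoms. The punchline is one of the forms through which sitcoms achieve a humorous effect. In the sitcom punchline recognition task, each sentence's label indicates whether that sentence is a punchline, but previous work usually identifies punchlines only by modeling contextual semantic relations and does not make full use of the labels. To fully exploit the information in the label sequence, this paper proposes a new recognition method: a word-level and sentence-level multi-task learning model combined with a conditional random field (CRF). The model makes two improvements. First, it treats the transition relationship between two adjacent labels in the label sequence as a manifestation of incongruity in humor theory, and uses a CRF to learn this transition relationship. Second, because learning the transition relationships between adjacent labels and learning contextual semantic relations can both capture the incongruity between setup and punchline, the two are correlated; to exploit this correlation and improve punchline recognition, the model uses multi-task learning to simultaneously learn the meaning of each sentence, the meanings of all the characters that make up each sentence, the word-level label transition relationships, and the sentence-level label transition relationships. Experiments on the English dataset of the CCL2020 “小牛杯” humor computation evaluation task on sitcom punchline recognition show that the proposed method improves on the current best method by 3.2% and achieves the best results on the task, and ablation experiments demonstrate the effectiveness of both improvements.”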

RealMedDial: A Real Telemedical Dialogue Dataset Collected from Online Chinese Short-Video Clips

no code implementations COLING 2022 Bo Xu, Hongtong Zhang, Jian Wang, Xiaokun Zhang, Dezhi Hao, Linlin Zong, Hongfei Lin, Fenglong Ma

We collected and annotated a wide range of meta-data with respect to medical dialogue including doctor profiles, hospital departments, diseases and symptoms for fine-grained analysis on language usage pattern and clinical diagnosis.

Response Generation

GUTS at SemEval-2022 Task 4: Adversarial Training and Balancing Methods for Patronizing and Condescending Language Detection

no code implementations SemEval (NAACL) 2022 Junyu Lu, Hao Zhang, Tongyue Zhang, Hongbo Wang, Haohao Zhu, Bo Xu, Hongfei Lin

For Subtask B, framed as a multi-label classification problem, we utilize various improved multi-label cross-entropy loss functions and analyze the performance of our method.

Multi-Label Classification

Bridging the Gap between Prior and Posterior Knowledge Selection for Knowledge-Grounded Dialogue Generation

no code implementations EMNLP 2020 Xiuyi Chen, Fandong Meng, Peng Li, Feilong Chen, Shuang Xu, Bo Xu, Jie Zhou

Here, we deal with these issues on two aspects: (1) We enhance the prior selection module with the necessary posterior information obtained from the specially designed Posterior Information Prediction Module (PIPM); (2) We propose a Knowledge Distillation Based Training Strategy (KDBTS) to train the decoder with the knowledge selected from the prior distribution, removing the exposure bias of knowledge selection.

Dialogue Generation Knowledge Distillation

软件标识符的自然语言规范性研究(Research on the Natural Language Normalness of Software Identifiers)

no code implementations CCL 2021 Dongzhen Wen, Fan Zhang, Xiao Zhang, Liang Yang, Yuan Lin, Bo Xu, Hongfei Lin

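“Understanding software source code is the core of collaborative software development and maintenance, and understanding identifiers, which make up more than half of source code, plays an important role in software comprehension. Traditional software engineering mainly studies how to constrain the naming process through naming conventions so as to construct identifiers that are easier to understand and communicate. Building on a review and analysis of the naming conventions of common programming languages, this paper proposes a new evaluation criterion for identifier comprehensibility. Specifically, the paper first summarizes the naming conventions of common mainstream programming languages and, by analogy with the concept of morphemes in natural language, proposes a software-morpheme-based process of identifier construction; that is, the construction of an identifier can be viewed as the generation, arrangement, and concatenation of software morphemes. On this basis, the paper proposes a method for evaluating the normalness of software identifiers that incorporates natural-language corpora, used to measure whether software identifiers are easy to understand. Finally, the paper conducts validation experiments on the normalness metric using a source code comprehension dataset and open-source projects on the GitHub platform; the results show that the proposed normalness score measures the comprehensibility of software projects well.”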

Complex Dynamic Neurons Improved Spiking Transformer Network for Efficient Automatic Speech Recognition

no code implementations 2 Feb 2023 Minglun Han, Qingyu Wang, Tielin Zhang, Yi Wang, Duzhen Zhang, Bo Xu

The spiking neural network (SNN) using leaky integrate-and-fire (LIF) neurons has been commonly used in automatic speech recognition (ASR) tasks.

Automatic Speech Recognition speech-recognition
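
As a rough illustration of the leaky integrate-and-fire dynamics named in this entry, the sketch below simulates a single discrete-time LIF neuron in plain Python. It is a generic textbook-style model and not the paper's implementation; the function name `simulate_lif` and the parameters `tau`, `v_threshold`, `v_reset`, and `dt` are assumptions chosen for illustration.

```python
def simulate_lif(input_current, tau=10.0, v_threshold=1.0, v_reset=0.0, dt=1.0):
    """Simulate one leaky integrate-and-fire neuron over a list of input currents.

    Returns (spikes, potentials): a 0/1 spike per time step and the membrane
    potential recorded after each step (after any reset).
    All parameter names and defaults are illustrative assumptions.
    """
    v = v_reset
    spikes, potentials = [], []
    for current in input_current:
        # Leak toward the reset value while integrating the input current.
        v += (dt / tau) * (-(v - v_reset) + current)
        if v >= v_threshold:
            # Threshold crossed: emit a spike and reset the membrane potential.
            spikes.append(1)
            v = v_reset
        else:
            spikes.append(0)
        potentials.append(v)
    return spikes, potentials


if __name__ == "__main__":
    spikes, _ = simulate_lif([2.0] * 14)
    print(spikes)
```

With a constant input above the threshold the neuron fires periodically, while an input that settles below the threshold never produces a spike; this is the leak-versus-integration trade-off the LIF model captures.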

Knowledge Transfer from Pre-trained Language Models to Cif-based Speech Recognizers via Hierarchical Distillation

1 code implementation 30 Jan 2023 Minglun Han, Feilong Chen, Jing Shi, Shuang Xu, Bo Xu

Large-scale pre-trained language models (PLMs) with powerful language modeling capabilities have been widely used in natural language processing.

Automatic Speech Recognition Knowledge Distillation +3

Tuning Synaptic Connections instead of Weights by Genetic Algorithm in Spiking Policy Network

1 code implementation 29 Dec 2022 Duzhen Zhang, Tielin Zhang, Shuncheng Jia, Qingyu Wang, Bo Xu

Learning from the interaction is the primary way biological agents know about the environment and themselves.

Privileged Prior Information Distillation for Image Matting

no code implementations 25 Nov 2022 Cheng Lyu, Jiake Xie, Bo Xu, Cheng Lu, Han Huang, Xin Huang, Ming Wu, Chuang Zhang, Yong Tang

Performance of trimap-free image matting methods is limited when trying to decouple the deterministic and undetermined regions, especially in the scenes where foregrounds are semantically ambiguous, chromaless, or high transmittance.

Image Matting

Sequentially Sampled Chunk Conformer for Streaming End-to-End ASR

no code implementations 21 Nov 2022 Fangyuan Wang, Xiyuan Wang, Bo Xu

This paper presents an in-depth study on a Sequentially Sampled Chunk Conformer, SSC-Conformer, for streaming End-to-End (E2E) ASR.

Motif-topology improved Spiking Neural Network for the Cocktail Party Effect and McGurk Effect

1 code implementation 12 Nov 2022 Shuncheng Jia, Tielin Zhang, Ruichen Zuo, Bo Xu

Here, we propose a Motif-topology improved SNN (M-SNN) for the efficient multi-sensory integration and cognitive phenomenon simulations.

HT-Net: Hierarchical Transformer based Operator Learning Model for Multiscale PDEs

no code implementations 19 Oct 2022 Xinliang Liu, Bo Xu, Lei Zhang

Complex nonlinear interplays of multiple scales give rise to many interesting physical phenomena and pose major difficulties for the computer simulation of multiscale PDE models in areas such as reservoir simulation, high frequency scattering and turbulence modeling.

Operator learning

Attention Spiking Neural Networks

no code implementations 28 Sep 2022 Man Yao, Guangshe Zhao, Hengyu Zhang, Yifan Hu, Lei Deng, Yonghong Tian, Bo Xu, Guoqi Li

On ImageNet-1K, we achieve top-1 accuracy of 75.92% and 77.08% on single/4-step Res-SNN-104, which are state-of-the-art results in SNNs.

Action Recognition Image Classification

Situational Perception Guided Image Matting

no code implementations 20 Apr 2022 Bo Xu, Jiake Xie, Han Huang, Ziwen Li, Cheng Lu, Yong Tang, Yandong Guo

In this paper, we propose a Situational Perception Guided Image Matting (SPG-IM) method that mitigates subjective bias of matting annotations and captures sufficient situational perception information for better global saliency distilled from the visual-to-textual task.

Association Image Matting

Improving Cross-Modal Understanding in Visual Dialog via Contrastive Learning

no code implementations 15 Apr 2022 Feilong Chen, Xiuyi Chen, Shuang Xu, Bo Xu

Visual Dialog is a challenging vision-language task since the visual dialog agent needs to answer a series of questions after reasoning over both the image content and dialog history.

Contrastive Learning Question Answering +2

Shifted Chunk Encoder for Transformer Based Streaming End-to-End ASR

1 code implementation 29 Mar 2022 Fangyuan Wang, Bo Xu

We integrate this scheme with the chunk-wise Transformer and Conformer, and identify them as SChunk-Transformer and SChunk-Conformer, respectively.

Automatic Speech Recognition speech-recognition

Semantic Distillation Guided Salient Object Detection

no code implementations 8 Mar 2022 Bo Xu, Guanze Liu, Han Huang, Cheng Lu, Yandong Guo

Most existing CNN-based salient object detection methods can identify local segmentation details like hair and animal fur, but often misinterpret the real saliency due to the lack of global contextual information caused by the subjectiveness of the SOD task and the locality of convolution layers.

Association Image Captioning +3

Motif-topology and Reward-learning improved Spiking Neural Network for Efficient Multi-sensory Integration

1 code implementation 11 Feb 2022 Shuncheng Jia, Ruichen Zuo, Tielin Zhang, Hongxing Liu, Bo Xu

Network architectures and learning principles are key in forming complex functions in artificial neural networks (ANNs) and spiking neural networks (SNNs).

Improving End-to-End Contextual Speech Recognition with Fine-Grained Contextual Knowledge Selection

1 code implementation 30 Jan 2022 Minglun Han, Linhao Dong, Zhenlin Liang, Meng Cai, Shiyu Zhou, Zejun Ma, Bo Xu

Nowadays, most methods in end-to-end contextual speech recognition bias the recognition process towards contextual knowledge.

speech-recognition Speech Recognition

Semi-Supervised Adversarial Recognition of Refined Window Structures for Inverse Procedural Façade Modeling

no code implementations 22 Jan 2022 Han Hu, Xinrong Liang, Yulin Ding, Qisen Shang, Bo Xu, Xuming Ge, Min Chen, Ruofei Zhong, Qing Zhu

Unfortunately, the large amount of interactive sample labeling efforts has dramatically hindered the application of deep learning methods, especially for 3D modeling tasks, which require heterogeneous samples.

Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem

no code implementations 17 Dec 2021 Jing Shi, Xuankai Chang, Tomoki Hayashi, Yen-Ju Lu, Shinji Watanabe, Bo Xu

Specifically, we propose a novel speech separation/enhancement model based on the recognition of discrete symbols, and convert the paradigm of the speech separation/enhancement related tasks from regression to classification.

regression Speech Separation

Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks

1 code implementation 6 Dec 2021 Linghui Meng, Muning Wen, Yaodong Yang, Chenyang Le, Xiyun Li, Weinan Zhang, Ying Wen, Haifeng Zhang, Jun Wang, Bo Xu

In this paper, we facilitate the research by providing large-scale datasets, and use them to examine the usage of the Decision Transformer in the context of MARL.

Offline RL reinforcement-learning +4

LiMuSE: Lightweight Multi-modal Speaker Extraction

1 code implementation 7 Nov 2021 Qinghua Liu, Yating Huang, Yunzhe Hao, Jiaming Xu, Bo Xu

Multi-modal cues, including spatial information, facial expression and voiceprint, are introduced to the speech separation and speaker extraction tasks to serve as complementary information to achieve better performance.

Model Compression Quantization +1

Deep Two-Stream Video Inference for Human Body Pose and Shape Estimation

no code implementations 22 Oct 2021 Ziwen Li, Bo Xu, Han Huang, Cheng Lu, Yandong Guo

In this paper, we propose a new framework Deep Two-Stream Video Inference for Human Body Pose and Shape Estimation (DTS-VIBE), to generate 3D human pose and mesh from RGB videos.

Ranked #3 on 3D Human Pose Estimation on MPI-INF-3DHP (PA-MPJPE metric)

3D Human Pose Estimation Optical Flow Estimation

Virtual Multi-Modality Self-Supervised Foreground Matting for Human-Object Interaction

1 code implementation ICCV 2021 Bo Xu, Han Huang, Cheng Lu, Ziwen Li, Yandong Guo

In this paper, we propose a Virtual Multi-modality Foreground Matting (VMFM) method to learn human-object interactive foreground (human and objects interacted with him or her) from a raw RGB image.

Human-Object Interaction Detection Image Matting

Tao: A Learning Framework for Adaptive Nearest Neighbor Search using Static Features Only

no code implementations 2 Oct 2021 Kaixiang Yang, Hongya Wang, Bo Xu, Wei Wang, Yingyuan Xiao, Ming Du, Junfeng Zhou

In the middle of query execution, AdaptNN collects a number of runtime features and predicts termination condition for each individual query, by which better end-to-end latency is attained.

Information Retrieval Management +1

A Simple Unified Framework for Anomaly Detection in Deep Reinforcement Learning

no code implementations 21 Sep 2021 Hongming Zhang, Ke Sun, Bo Xu, Linglong Kong, Martin Müller

In this paper, we propose a simple yet effective anomaly detection framework for deep RL algorithms that simultaneously considers random, adversarial and out-of-distribution (OOD) state outliers.

Anomaly Detection Atari Games +3

Population-coding and Dynamic-neurons improved Spiking Actor Network for Reinforcement Learning

no code implementations 15 Jun 2021 Duzhen Zhang, Tielin Zhang, Shuncheng Jia, Xiang Cheng, Bo Xu

Based on a hybrid learning framework, where a spike actor-network infers actions from states and a deep critic network evaluates the actor, we propose a Population-coding and Dynamic-neurons improved Spiking Actor Network (PDSAN) for efficient state representation from two different scales: input coding and neuronal coding.

OpenAI Gym reinforcement-learning +1

WASE: Learning When to Attend for Speaker Extraction in Cocktail Party Environments

1 code implementation 13 Jun 2021 Yunzhe Hao, Jiaming Xu, Peng Zhang, Bo Xu

In the speaker extraction problem, it is found that additional information from the target speaker contributes to the tracking and extraction of the target speaker, which includes voiceprint, lip movement, facial expression, and spatial information.

Action Detection Activity Detection

Counterfactual Supporting Facts Extraction for Explainable Medical Record Based Diagnosis with Graph Network

1 code implementation NAACL 2021 Haoran Wu, Wei Chen, Shuang Xu, Bo Xu

Specifically, we first structure the sequence of EMR into a hierarchical graph network and then obtain the causal relationship between multi-granularity features and diagnosis results through counterfactual intervention on the graph.

MIMO Self-attentive RNN Beamformer for Multi-speaker Speech Separation

no code implementations 17 Apr 2021 Xiyun Li, Yong Xu, Meng Yu, Shi-Xiong Zhang, Jiaming Xu, Bo Xu, Dong Yu

The spatial self-attention module is designed to attend on the cross-channel correlation in the covariance matrices.

Automatic Speech Recognition speech-recognition +1

Network Representation Learning: From Traditional Feature Learning to Deep Learning

no code implementations 7 Mar 2021 Ke Sun, Lei Wang, Bo Xu, Wenhong Zhao, Shyh Wei Teng, Feng Xia

Network representation learning (NRL) is an effective graph analytics technique that helps users deeply understand the hidden characteristics of graph data.

Recommendation Systems Representation Learning

Graph Force Learning

no code implementations 7 Mar 2021 Ke Sun, Jiaying Liu, Shuo Yu, Bo Xu, Feng Xia

Feature representation plays a powerful role in network analysis tasks.

Graph Learning

MixSpeech: Data Augmentation for Low-resource Automatic Speech Recognition

no code implementations 25 Feb 2021 Linghui Meng, Jin Xu, Xu Tan, Jindong Wang, Tao Qin, Bo Xu

In this paper, we propose MixSpeech, a simple yet effective data augmentation method based on mixup for automatic speech recognition (ASR).

Automatic Speech Recognition Data Augmentation +1
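
To make the mixup idea behind this entry concrete, here is a minimal sketch of generic mixup applied to speech features: two feature sequences are blended with a weight `lam`, and the training loss is the same weighted combination of the losses computed against each transcript. This is an illustration of plain mixup under stated assumptions, and the paper's exact recipe may differ; the helper names `sample_lambda`, `mix_features`, and `mixed_loss` and the `alpha` default are hypothetical.

```python
import random


def sample_lambda(alpha=0.5, rng=random):
    """Draw a mixing weight in [0, 1] from Beta(alpha, alpha), as in standard mixup."""
    return rng.betavariate(alpha, alpha)


def mix_features(feats_a, feats_b, lam):
    """Blend two equal-length sequences of feature frames, frame by frame."""
    if len(feats_a) != len(feats_b):
        raise ValueError("feature sequences must have the same number of frames")
    return [
        [lam * a + (1.0 - lam) * b for a, b in zip(frame_a, frame_b)]
        for frame_a, frame_b in zip(feats_a, feats_b)
    ]


def mixed_loss(loss_a, loss_b, lam):
    """Combine the losses against each transcript with the same weight as the inputs."""
    return lam * loss_a + (1.0 - lam) * loss_b


if __name__ == "__main__":
    lam = sample_lambda()
    mixed = mix_features([[1.0, 2.0]], [[3.0, 4.0]], lam)
    print(lam, mixed, mixed_loss(2.0, 4.0, lam))
```

In practice the two utterances would first be padded or truncated to a common length, and `loss_a` and `loss_b` would come from running the recognizer on the mixed input against each of the two label sequences.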

1st Place Solution to ECCV-TAO-2020: Detect and Represent Any Object for Tracking

1 code implementation 20 Jan 2021 Fei Du, Bo Xu, Jiasheng Tang, Yuqi Zhang, Fan Wang, Hao Li

We extend the classical tracking-by-detection paradigm to this tracking-any-object task.

Ranked #1 on Multi-Object Tracking on TAO (using extra training data)

Association Multi-Object Tracking

Efficiently Fusing Pretrained Acoustic and Linguistic Encoders for Low-resource Speech Recognition

no code implementations 17 Jan 2021 Cheng Yi, Shiyu Zhou, Bo Xu

In this work, we fuse a pre-trained acoustic encoder (wav2vec2.0) and a pre-trained linguistic encoder (BERT) into an end-to-end ASR model.

Automatic Speech Recognition Language Modelling +1

Research on Fast Text Recognition Method for Financial Ticket Image

no code implementations 5 Jan 2021 Fukang Tian, Haiyu Wu, Bo Xu

At present, a few works have applied deep learning methods to financial ticket recognition.

Region Proposal

Applying Wav2vec2.0 to Speech Recognition in Various Low-resource Languages

no code implementations 22 Dec 2020 Cheng Yi, Jianzhong Wang, Ning Cheng, Shiyu Zhou, Bo Xu

To verify its universality over languages, we apply pre-trained models to solve low-resource speech recognition tasks in various spoken languages.

speech-recognition Speech Recognition

CIF-based Collaborative Decoding for End-to-end Contextual Speech Recognition

no code implementations 17 Dec 2020 Minglun Han, Linhao Dong, Shiyu Zhou, Bo Xu

End-to-end (E2E) models have achieved promising results on multiple speech recognition benchmarks, and shown the potential to become the mainstream.

speech-recognition Speech Recognition

Research on All-content Text Recognition Method for Financial Ticket Image

no code implementations 15 Dec 2020 Fukang Tian, Haiyu Wu, Bo Xu

With the development of the economy, the number of financial tickets increases rapidly.

Knowledge Aware Emotion Recognition in Textual Conversations via Multi-Task Incremental Transformer

1 code implementation COLING 2020 Duzhen Zhang, Xiuyi Chen, Shuang Xu, Bo Xu

For one thing, speakers often rely on the context and commonsense knowledge to express emotions; for another, most utterances contain neutral emotion in conversations, as a result, the confusion between a few non-neutral utterances and much more neutral ones restrains the emotion recognition performance.

Emotion Recognition Graph Attention +3

Audio-visual Speech Separation with Adversarially Disentangled Visual Representation

no code implementations 29 Nov 2020 Peng Zhang, Jiaming Xu, Jing Shi, Yunzhe Hao, Bo Xu

In our model, we use the face detector to detect the number of speakers in the scene and use visual information to avoid the permutation problem.

Speech Separation

Financial ticket intelligent recognition system based on deep learning

no code implementations 29 Oct 2020 Fukang Tian, Haiyu Wu, Bo Xu

Facing the rapid growth in the issuance of financial tickets (or bills, invoices etc.

Self-Learning

Tuning Convolutional Spiking Neural Network with Biologically-plausible Reward Propagation

1 code implementation 9 Oct 2020 Tielin Zhang, Shuncheng Jia, Xiang Cheng, Bo Xu

The performance of the proposed BRP-SNN is further verified on the spatial (including MNIST and Cifar-10) and temporal (including TIDigits and DvsGesture) tasks, where the SNN using BRP has reached a similar accuracy compared to other state-of-the-art BP-based SNNs and saved 50% more computational cost than ANNs.

Finite Meta-Dynamic Neurons in Spiking Neural Networks for Spatio-temporal Learning

no code implementations 7 Oct 2020 Xiang Cheng, Tielin Zhang, Shuncheng Jia, Bo Xu

Spiking Neural Networks (SNNs) have incorporated more biologically-plausible structures and learning principles, hence are playing critical roles in bridging the gap between artificial and natural neural networks.

Consecutive Decoding for Speech-to-text Translation

1 code implementation 21 Sep 2020 Qianqian Dong, Mingxuan Wang, Hao Zhou, Shuang Xu, Bo Xu, Lei Li

The key idea is to generate source transcript and target translation text with a single decoder.

Machine Translation speech-recognition +3

Shifu2: A Network Representation Learning Based Model for Advisor-advisee Relationship Mining

no code implementations 17 Aug 2020 Jiaying Liu, Feng Xia, Lei Wang, Bo Xu, Xiangjie Kong, Hanghang Tong, Irwin King

The advisor-advisee relationship represents direct knowledge heritage, and such relationship may not be readily available from academic libraries and search engines.

Representation Learning

DINE: A Framework for Deep Incomplete Network Embedding

no code implementations 9 Aug 2020 Ke Hou, Jiaying Liu, Yin Peng, Bo Xu, Ivan Lee, Feng Xia

Empirically, we evaluate DINE over three networks on multi-label classification and link prediction tasks.

General Classification Link Prediction +3

MODEL: Motif-based Deep Feature Learning for Link Prediction

no code implementations 9 Aug 2020 Lei Wang, Jing Ren, Bo Xu, Jian-Xin Li, Wei Luo, Feng Xia

Link prediction plays an important role in network analysis and applications.

Link Prediction

Speaker-Conditional Chain Model for Speech Separation and Extraction

no code implementations 25 Jun 2020 Jing Shi, Jiaming Xu, Yusuke Fujita, Shinji Watanabe, Bo Xu

With the predicted speaker information from whole observation, our model is helpful to solve the problem of conventional speech separation and speaker extraction for multi-round long recordings.

Audio and Speech Processing Sound

Deep Learning Guided Building Reconstruction from Satellite Imagery-derived Point Clouds

no code implementations 19 May 2020 Bo Xu, Xu Zhang, Zhixin Li, Matt Leotta, Shih-Fu Chang, Jie Shan

For points that belong to the same roof shape, a multi-cue, hierarchical RANSAC approach is proposed for efficient and reliable segmenting and reconstructing the building point cloud.

3D Reconstruction

Discriminative Multi-modality Speech Recognition

2 code implementations CVPR 2020 Bo Xu, Cheng Lu, Yandong Guo, Jacob Wang

Vision is often used as a complementary modality for audio speech recognition (ASR), especially in the noisy environment where performance of solo audio modality significantly deteriorates.

Audio-Visual Speech Recognition Lipreading +2

DMRM: A Dual-channel Multi-hop Reasoning Model for Visual Dialog

1 code implementation 18 Dec 2019 Feilong Chen, Fandong Meng, Jiaming Xu, Peng Li, Bo Xu, Jie Zhou

Visual Dialog is a vision-language task that requires an AI agent to engage in a conversation with humans grounded in an image.

Visual Dialog

Few-Features Attack to Fool Machine Learning Models through Mask-Based GAN

no code implementations 12 Nov 2019 Feng Chen, Yunkai Shang, Bo Xu, Jincheng Hu

Compared with previous non-learning adversarial example attack approaches, the GAN-based approach can, once trained, quickly generate an adversarial sample for each new input, but it needs to apply large amounts of perturbation to the attack samples, which makes it impractical in real applications.

Adversarial Attack BIG-bench Machine Learning

Unsupervised pre-training for sequence to sequence speech recognition

no code implementations 28 Oct 2019 Zhiyun Fan, Shiyu Zhou, Bo Xu

The unsupervised pre-training is finished on AISHELL-2 dataset and we apply the pre-trained model to multiple paired data ratios of AISHELL-1 and HKUST.

Automatic Speech Recognition Sequence-To-Sequence Speech Recognition +2

Iterative Update and Unified Representation for Multi-Agent Reinforcement Learning

no code implementations 16 Aug 2019 Jiancheng Long, Hongming Zhang, Tianyang Yu, Bo Xu

In this method, iterative update can greatly alleviate the nonstationarity of the environment, unified representation can speed up the interaction with environment and avoid the linear growth of memory usage.

Multi-agent Reinforcement Learning reinforcement-learning +1

A Working Memory Model for Task-oriented Dialog Response Generation

no code implementations ACL 2019 Xiuyi Chen, Jiaming Xu, Bo Xu

Our WMM2Seq adopts a working memory to interact with two separated long-term memories, which are the episodic memory for memorizing dialog history and the semantic memory for storing KB tuples.

Response Generation

The World in My Mind: Visual Dialog with Adversarial Multi-modal Feature Encoding

no code implementations NAACL 2019 Yiqun Yao, Jiaming Xu, Bo Xu

Visual Dialog is a multi-modal task that requires a model to participate in a multi-turn human dialog grounded on an image, and generate correct, human-like responses.

General Knowledge Visual Dialog

CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition

2 code implementations 27 May 2019 Linhao Dong, Bo Xu

In this paper, we propose a novel soft and monotonic alignment mechanism used for sequence transduction.

Language Modelling Multi-Task Learning +2

Reference-Based Sequence Classification

no code implementations 17 May 2019 Zengyou He, Guangyao Xu, Chaohua Sheng, Bo Xu, Quan Zou

By utilizing this framework as a tool, we propose new sequence classification algorithms that are quite different from existing solutions.

Classification General Classification

Material Segmentation of Multi-View Satellite Imagery

no code implementations 17 Apr 2019 Matthew Purri, Jia Xue, Kristin Dana, Matthew Leotta, Dan Lipsa, Zhixin Li, Bo Xu, Jie Shan

The residuals are computed by differencing the sparse-sampled reflectance function with a dictionary of pre-defined dense-sampled reflectance functions.

Material Recognition Semantic Segmentation

Self-Attention Aligner: A Latency-Control End-to-End Model for ASR Using Self-Attention Network and Chunk-Hopping

no code implementations 18 Feb 2019 Linhao Dong, Feng Wang, Bo Xu

Experiments on two Mandarin ASR datasets show that replacing RNNs with self-attention networks yields an 8.4%-10.2% relative character error rate (CER) reduction.

Automatic Speech Recognition Language Modelling +1

A Biologically Plausible Supervised Learning Method for Spiking Neural Networks Using the Symmetric STDP Rule

1 code implementation 17 Dec 2018 Yunzhe Hao, Xuhui Huang, Meng Dong, Bo Xu

By combining the sym-STDP rule with bio-plausible synaptic scaling and intrinsic plasticity of the dynamic threshold, our SNN model implemented SL well and achieved good performance in the benchmark recognition task (MNIST dataset).

Concept Learning through Deep Reinforcement Learning with Memory-Augmented Neural Networks

no code implementations 15 Nov 2018 Jing Shi, Jiaming Xu, Yiqun Yao, Bo Xu

In this paper, we present a memory-augmented neural network which is motivated by the process of human concept learning.

One-Shot Learning Outlier Detection +2

WECA: A WordNet-Encoded Collocation-Attention Network for Homographic Pun Recognition

no code implementations EMNLP 2018 Yufeng Diao, Hongfei Lin, Di Wu, Liang Yang, Kan Xu, Zhihao Yang, Jian Wang, Shaowu Zhang, Bo Xu, Dongyu Zhang

In this work, we first use WordNet to understand and expand word embeddings to resolve the polysemy of homographic puns, and then propose a WordNet-Encoded Collocation-Attention network model (WECA), which is combined with context weights to recognize puns.

Semi-Supervised Disfluency Detection

no code implementations COLING 2018 Feng Wang, Wei Chen, Zhen Yang, Qianqian Dong, Shuang Xu, Bo Xu

While the disfluency detection has achieved notable success in the past years, it still severely suffers from the data scarcity.

Machine Translation Question Answering

Construction of a Chinese Corpus for the Analysis of the Emotionality of Metaphorical Expressions

no code implementations ACL 2018 Dongyu Zhang, Hongfei Lin, Liang Yang, Shaowu Zhang, Bo Xu

However, there is little research on the construction of metaphor corpora annotated with emotion for the analysis of emotionality of metaphorical expressions.

Emotion Recognition

Single-channel Speech Dereverberation via Generative Adversarial Training

no code implementations 25 Jun 2018 Chenxing Li, Tieqiang Wang, Shuang Xu, Bo Xu

In this paper, we propose a single-channel speech dereverberation system (DeReGAT) based on convolutional, bidirectional long short-term memory and deep feed-forward neural network (CBLDNN) with generative adversarial training (GAT).

Speech Dereverberation

Multilingual End-to-End Speech Recognition with A Single Transformer on Low-Resource Languages

no code implementations 12 Jun 2018 Shiyu Zhou, Shuang Xu, Bo Xu

Experiments on CALLHOME datasets demonstrate that the multilingual ASR Transformer with the language symbol at the end performs better and obtains a 10.5% relative average word error rate (WER) reduction compared to SHL-MLSTM with residual learning.

Automatic Speech Recognition Language Modelling +1

NRTR: A No-Recurrence Sequence-to-Sequence Model For Scene Text Recognition

3 code implementations 4 Jun 2018 Fenfen Sheng, Zhineng Chen, Bo Xu

Considering scene image has large variation in text and background, we further design a modality-transform block to effectively transform 2D input images to 1D sequences, combined with the encoder to extract more discriminative features.

Optical Character Recognition Scene Text Recognition

A Comparison of Modeling Units in Sequence-to-Sequence Speech Recognition with the Transformer on Mandarin Chinese

no code implementations 16 May 2018 Shiyu Zhou, Linhao Dong, Shuang Xu, Bo Xu

Experiments on HKUST datasets demonstrate that the lexicon free modeling units can outperform lexicon related modeling units in terms of character error rate (CER).

Automatic Speech Recognition Language Modelling +2

Syllable-Based Sequence-to-Sequence Speech Recognition with the Transformer in Mandarin Chinese

1 code implementation 28 Apr 2018 Shiyu Zhou, Linhao Dong, Shuang Xu, Bo Xu

Furthermore, we investigate a comparison between syllable based model and context-independent phoneme (CI-phoneme) based model with the Transformer in Mandarin Chinese.

Automatic Speech Recognition Language Modelling +5

Unsupervised Neural Machine Translation with Weight Sharing

1 code implementation ACL 2018 Zhen Yang, Wei Chen, Feng Wang, Bo Xu

Unsupervised neural machine translation (NMT) is a recently proposed approach for machine translation which aims to train the model without using any labeled data.

Machine Translation NMT +1

Convolutional Neural Network with Word Embeddings for Chinese Word Segmentation

1 code implementation IJCNLP 2017 Chunqi Wang, Bo Xu

The first is that they rely heavily on manually designed bigram features, i.e., they are not good at capturing n-gram features automatically.

Chinese Word Segmentation Feature Engineering +1

Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets

3 code implementations NAACL 2018 Zhen Yang, Wei Chen, Feng Wang, Bo Xu

During training, both the dynamic discriminator and the static BLEU objective are employed to evaluate the generated sentences and feed the evaluations back to guide the learning of the generator.

Machine Translation NMT +1

A Character-Aware Encoder for Neural Machine Translation

no code implementations COLING 2016 Zhen Yang, Wei Chen, Feng Wang, Bo Xu

This article proposes a novel character-aware neural machine translation (NMT) model that views the input sequences as sequences of characters rather than words.

Machine Translation NMT +1

Combining Lexical and Semantic-based Features for Answer Sentence Selection

no code implementations WS 2016 Jing Shi, Jiaming Xu, Yiqun Yao, Suncong Zheng, Bo Xu

As the result of the evaluation shows, our solution provides a valuable and brief model which could be used in modelling question answering or sentence semantic relevance.

Feature Engineering Open-Domain Question Answering

Text Classification Improved by Integrating Bidirectional LSTM with Two-dimensional Max Pooling

3 code implementations COLING 2016 Peng Zhou, Zhenyu Qi, Suncong Zheng, Jiaming Xu, Hongyun Bao, Bo Xu

To integrate the features on both dimensions of the matrix, this paper explores applying 2D max pooling operation to obtain a fixed-length representation of the text.

Classification General Classification +2

Hierarchical Memory Networks for Answer Selection on Unknown Words

1 code implementation COLING 2016 Jiaming Xu, Jing Shi, Yiqun Yao, Suncong Zheng, Bo Xu

Recently, end-to-end memory networks have shown promising results on Question Answering task, which encode the past facts into an explicit memory and perform reasoning ability by making multiple computational steps on the memory.

Answer Selection

Convolutional Neural Networks for Text Hashing

no code implementations IJCAI 2015 Jiaming Xu, Peng Wang, Guanhua Tian, Bo Xu, Jun Zhao, Fangyuan Wang, HongWei Hao

Meanwhile word features and position features are together fed into a convolutional network to learn the implicit features which are further incorporated with the explicit features to fit the pretrained binary code.
