no code implementations • CCL 2021 • Tongyue Zhang, Shaowu Zhang, Bo Xu, Liang Yang, Hongfei Lin
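“Humor plays an important role in human communication and is abundant in sitcoms. The punchline is one of the forms through which sitcoms achieve a humorous effect. In the sitcom punchline recognition task, the label of each sentence indicates whether that sentence is a punchline, but previous work has usually identified punchlines only by modeling contextual semantic relations and has not made full use of the labels. To fully exploit the information in the label sequence, this paper proposes a new recognition method: a word-level and sentence-level multi-task learning model combined with a conditional random field (CRF). The model makes two improvements. First, it treats the transition between two adjacent labels in the label sequence as a manifestation of incongruity in humor theory and uses a CRF to learn this transition relation. Second, because learning the transitions between adjacent labels and learning contextual semantic relations can both capture the incongruity between setup and punchline, the two are correlated; to exploit this correlation and improve punchline recognition, the model uses multi-task learning to simultaneously learn the meaning of each sentence, the word-level meanings of the tokens that make up each sentence, the word-level label transitions, and the sentence-level label transitions. Experiments on the English dataset of the CCL2020 "小牛杯" (Xiaoniu Cup) Humor Computation: Sitcom Punchline Recognition evaluation task show that the proposed method improves on the current best method by 3.2% and achieves the best results on the sitcom punchline recognition task; ablation experiments confirm the effectiveness of both improvements.”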
1 code implementation • COLING 2022 • Bo Xu, Shizhou Huang, Ming Du, Hongya Wang, Hui Song, Chaofeng Sha, Yanghua Xiao
In this paper, we argue that different social media posts should consider different modalities for multimodal information extraction.
no code implementations • CCL 2021 • Dongzhen Wen, Fan Zhang, Xiao Zhang, Liang Yang, Yuan Lin, Bo Xu, Hongfei Lin
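“Understanding software source code is central to collaborative software development and maintenance, and understanding identifiers, which make up more than half of source code, plays an important role in software comprehension. Traditional software engineering mainly studies how naming conventions can constrain identifier naming so as to produce identifiers that are easier to understand and communicate. Building on a survey and analysis of the naming conventions of common programming languages, this paper proposes a new evaluation standard for identifier comprehensibility. Specifically, the paper first summarizes the naming conventions of mainstream programming languages and, by analogy with the concept of morphemes in natural language, proposes a software-morpheme-based account of identifier formation: forming an identifier can be viewed as a process of generating, arranging, and connecting software morphemes. On this basis, the paper proposes a method, drawing on natural-language corpora, for evaluating the normativity of software identifiers, which measures whether an identifier is easy to understand. Finally, validation experiments on a source-code comprehension dataset and on open-source projects from the GitHub platform show that the proposed normativity score measures the comprehensibility of software projects well.”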
no code implementations • EMNLP 2020 • Xiuyi Chen, Fandong Meng, Peng Li, Feilong Chen, Shuang Xu, Bo Xu, Jie Zhou
Here, we deal with these issues on two aspects: (1) We enhance the prior selection module with the necessary posterior information obtained from the specially designed Posterior Information Prediction Module (PIPM); (2) We propose a Knowledge Distillation Based Training Strategy (KDBTS) to train the decoder with the knowledge selected from the prior distribution, removing the exposure bias of knowledge selection.
no code implementations • Findings (EMNLP) 2021 • Changrong Min, Yonghe Chu, Liang Yang, Bo Xu, Hongfei Lin
Thus, cosine similarity cannot approximate distances on the manifold.
no code implementations • SemEval (NAACL) 2022 • Junyu Lu, Hao Zhang, Tongyue Zhang, Hongbo Wang, Haohao Zhu, Bo Xu, Hongfei Lin
For Subtask B, framed as a multi-label classification problem, we utilize various improved multi-label cross-entropy loss functions and analyze the performance of our method.
no code implementations • COLING 2022 • Bo Xu, Hongtong Zhang, Jian Wang, Xiaokun Zhang, Dezhi Hao, Linlin Zong, Hongfei Lin, Fenglong Ma
We collected and annotated a wide range of meta-data with respect to medical dialogue including doctor profiles, hospital departments, diseases and symptoms for fine-grained analysis on language usage pattern and clinical diagnosis.
no code implementations • 27 Mar 2024 • Qingyu Wang, Duzhen Zhang, Tielin Zhang, Bo Xu
The energy-efficient Spikformer has been proposed by integrating the biologically plausible spiking neural network (SNN) and the artificial Transformer, whereby Spiking Self-Attention (SSA) is used to achieve both higher accuracy and lower computational cost.
no code implementations • 18 Mar 2024 • Zewen Xu, Yijia He, Hao Wei, Bo Xu, BinJian Xie, Yihong Wu
First, a high-precision rotation estimation method is proposed, based on normal vector coplanarity constraints that consider the uncertainty of observations; it can be solved efficiently by the Levenberg-Marquardt (LM) algorithm.
no code implementations • 15 Mar 2024 • Bo Xu, Ziao Liu, Mengqi Guo, Jiancheng Li, Gim Hee Lee
We propose a novel rolling shutter bundle adjustment method for neural radiance fields (NeRF), which utilizes the unordered rolling shutter (RS) images to obtain the implicit 3D representation.
no code implementations • 27 Feb 2024 • Xiaokun Zhang, Bo Xu, Chenliang Li, Yao Zhou, Liangyue Li, Hongfei Lin
Emerging efforts incorporate various kinds of side information into their methods for enhancing task performance.
no code implementations • 18 Dec 2023 • Jingqing Ruan, Kaishen Wang, Qingyang Zhang, Dengpeng Xing, Bo Xu
Many complicated real-world tasks can be broken down into smaller, more manageable parts, and planning with prior knowledge extracted from these simplified pieces is crucial for humans to make accurate decisions.
no code implementations • 4 Dec 2023 • Chao Shen, Wenkang Zhan, Jian Tang, Zhaofeng Wu, Bo Xu, Chao Zhao, Zhanguo Wang
It standardizes deoxidation temperatures across various equipment and substrate materials, advancing the standardization research process in semiconductor preparation, a significant milestone in thin film growth technology.
no code implementations • 27 Nov 2023 • Can Sun, Hao Zheng, Zhigang Hu, Liu Yang, Meiguang Zheng, Bo Xu
Single domain generalization (SDG) based on meta-learning has emerged as an effective technique for solving the domain-shift problem.
no code implementations • 26 Nov 2023 • Bo Xu, Hao Zheng, Zhigang Hu, Liu Yang, Meiguang Zheng
In current synthetic aperture radar (SAR) object classification, one of the major challenges is the severe overfitting issue due to the limited dataset (few-shot) and noisy data.
no code implementations • 21 Nov 2023 • Xuanle Zhao, Yue Sun, Tielin Zhang, Bo Xu
One of the most notable methods is the Fourier Neural Operator (FNO), which is inspired by the Green's function method and approximates the operator kernel directly in the frequency domain.
1 code implementation • 2 Nov 2023 • Xiaokun Zhang, Bo Xu, Fenglong Ma, Chenliang Li, Yuan Lin, Hongfei Lin
Secondly, price preference and interest preference are interdependent and collectively determine user choice, necessitating that we jointly consider both price and interest preference for intent modeling.
1 code implementation • 31 Oct 2023 • Hui Ma, Jian Wang, Hongfei Lin, Bo Zhang, Yijia Zhang, Bo Xu
Emotion recognition in conversations (ERC), the task of recognizing the emotion of each utterance in a conversation, is crucial for building empathetic machines.
1 code implementation • 29 Sep 2023 • Xiaokun Zhang, Bo Xu, Fenglong Ma, Chenliang Li, Liang Yang, Hongfei Lin
(2) How to fuse this heterogeneous descriptive information to comprehensively infer user interests?
no code implementations • 27 Sep 2023 • Zongyuan Tan, Hongya Wang, Bo Xu, Minjie Luo, Ming Du
Locality-sensitive hashing (LSH) is an effective randomized technique widely used in many machine learning tasks.
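One well-known LSH family, sign random projections for cosine similarity, can be sketched as follows. This is a generic illustration only and not the scheme studied in the paper; the function names `make_hyperplanes` and `lsh_signature`, the seed, and the toy vectors are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def make_hyperplanes(num_bits: int, dim: int, seed: int = 0) -> List[List[float]]:
    """Draw one random Gaussian hyperplane normal per signature bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_signature(vector: Sequence[float], hyperplanes: List[List[float]]) -> Tuple[int, ...]:
    """Each bit records which side of a hyperplane the vector falls on."""
    return tuple(
        1 if sum(h * v for h, v in zip(plane, vector)) >= 0.0 else 0
        for plane in hyperplanes
    )


planes = make_hyperplanes(num_bits=8, dim=3)
sig_a = lsh_signature([1.0, 2.0, 3.0], planes)
sig_b = lsh_signature([2.0, 4.0, 6.0], planes)  # same direction as the first vector
```

Vectors pointing in the same direction always receive the same signature, and vectors separated by a small angle agree on most bits, so signatures can be used as bucket keys for approximate nearest-neighbor search.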
no code implementations • 8 Sep 2023 • Ramanathan V. Guha, Prashanth Radhakrishnan, Bo Xu, Wei Sun, Carolyn Au, Ajai Tirumali, Muhammad J. Amjad, Samantha Piekos, Natalie Diaz, Jennifer Chen, Julia Wu, Prem Ramaswami, James Manyika
The aggregate of these Data Commons can be viewed as a single Knowledge Graph.
no code implementations • 18 Aug 2023 • Hongqiu Wang, Lei Zhu, Guang Yang, Yike Guo, Shichen Zhang, Bo Xu, Yueming Jin
Our method is verified on these datasets, and experimental results exhibit that the VIS-Net can significantly outperform existing state-of-the-art referring segmentation methods.
no code implementations • 18 Aug 2023 • Yunzhi Qiu, Xiaokun Zhang, Weiwei Wang, Tongxuan Zhang, Bo Xu, Hongfei Lin
Secondly, social media datasets suffer from a scarcity of annotated data.
1 code implementation • ICCV 2023 • Man Yao, Jiakui Hu, Guangshe Zhao, Yaoyuan Wang, Ziyang Zhang, Bo Xu, Guoqi Li
In this work, we pose and focus on three key questions regarding the inherent redundancy in SNNs.
no code implementations • 5 Aug 2023 • Fangyuan Wang, Ming Hao, Yuhai Shi, Bo Xu
The conventional recipe for Automatic Speech Recognition (ASR) models is to 1) train multiple checkpoints on a training set while relying on a validation set to prevent overfitting using early stopping and 2) average several last checkpoints or that of the lowest validation losses to obtain the final model.
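Step 2 of this recipe, averaging several checkpoints, can be sketched as follows. This is a minimal illustration, assuming each checkpoint is a plain dictionary that maps parameter names to lists of floats; the helper name `average_checkpoints` and the toy values are hypothetical and not taken from the paper.

```python
from typing import Dict, List

Checkpoint = Dict[str, List[float]]


def average_checkpoints(checkpoints: List[Checkpoint]) -> Checkpoint:
    """Element-wise mean of parameters across checkpoints (hypothetical helper)."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    averaged: Checkpoint = {}
    for name in checkpoints[0].keys():
        # Gather the same parameter from every checkpoint and average per element.
        columns = zip(*(ckpt[name] for ckpt in checkpoints))
        averaged[name] = [sum(values) / len(checkpoints) for values in columns]
    return averaged


# Toy example: three "checkpoints" of a model with a weight vector and a bias.
last_checkpoints = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [3.0]},
    {"w": [5.0, 6.0], "b": [6.0]},
]
final_model = average_checkpoints(last_checkpoints)
```

Selecting which checkpoints to pass in (the last few, or those with the lowest validation loss) is left to the caller, mirroring the two options named in the recipe.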
no code implementations • 2 Aug 2023 • Qingyu Wang, Duzhen Zhang, Tielin Zhang, Bo Xu
The results indicate that compared to the SOTA Spikformer with SSA, Spikformer with LT achieves higher Top-1 accuracy on neuromorphic datasets (i.e., CIFAR10-DVS and DVS128 Gesture) and comparable Top-1 accuracy on static datasets (i.e., CIFAR-10 and CIFAR-100).
1 code implementation • 1 Aug 2023 • Bo Zhang, Jian Wang, Hui Ma, Bo Xu, Hongfei Lin
To overcome this challenge, we propose an innovative multimodal framework, called ZRIGF, which assimilates image-grounded information for dialogue generation in zero-resource situations.
1 code implementation • 30 Jul 2023 • Zefa Hu, Ziyi Ni, Jing Shi, Shuang Xu, Bo Xu
However, these generative methods output a whole sequence consisting of term-status pairs in one stage and ignore integrating prior knowledge, which demands a deeper understanding to model the relationship between terms and infer the status of each term.
1 code implementation • 22 Jul 2023 • Qingyang Zhang, Yiming Yang, Jingqing Ruan, Xuantang Xiong, Dengpeng Xing, Bo Xu
However, existing works often overlook the temporal coherence in GCHRL when learning latent subgoal representations and lack an efficient subgoal selection strategy that balances exploration and exploitation.
no code implementations • 10 Jul 2023 • Junyu Lu, Hongfei Lin, Xiaokun Zhang, Zhaoqing Li, Tongyue Zhang, Linlin Zong, Fenglong Ma, Bo Xu
Our framework jointly optimizes the self-supervised and the supervised contrastive learning loss for capturing span-level information beyond the token-level emotional semantics used in existing models, particularly detecting speech containing abusive and insulting words.
1 code implementation • NeurIPS 2023 • Man Yao, Jiakui Hu, Zhaokun Zhou, Li Yuan, Yonghong Tian, Bo Xu, Guoqi Li
In this paper, we incorporate the spike-driven paradigm into Transformer by the proposed Spike-driven Transformer with four unique properties: 1) Event-driven, no calculation is triggered when the input of Transformer is zero; 2) Binary spike communication, all matrix multiplications associated with the spike matrix can be transformed into sparse additions; 3) Self-attention with linear complexity at both token and channel dimensions; 4) The operations between spike-form Query, Key, and Value are mask and addition.
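Property 2 can be illustrated with a small sketch: when the input contains only 0/1 spikes, a matrix-vector product reduces to adding up the weights at the positions that fired. The function names and toy weights below are assumptions for illustration, not the paper's implementation.

```python
from typing import List


def dense_matvec(weights: List[List[float]], x: List[float]) -> List[float]:
    """Reference: ordinary multiply-accumulate matrix-vector product."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def spike_matvec(weights: List[List[float]], spikes: List[int]) -> List[float]:
    """Same product for a binary spike vector, using additions only."""
    active = [j for j, s in enumerate(spikes) if s == 1]  # indices that fired
    # For each output row, add the weights of the active inputs; zeros are skipped.
    return [sum(row[j] for j in active) for row in weights]


weights = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.5]]
spikes = [1, 0, 1]
result = spike_matvec(weights, spikes)
```

The sparser the spike vector, the fewer additions are needed, which is the intuition behind replacing multiplications with sparse additions.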
no code implementations • 22 Jun 2023 • Chao Shen, Wenkang Zhan, Kaiyao Xin, Manyang Li, Zhenyu Sun, Hui Cong, Chi Xu, Jian Tang, Zhaofeng Wu, Bo Xu, Zhongming Wei, Chunlai Xue, Chao Zhao, Zhanguo Wang
Self-assembled InAs/GaAs quantum dots (QDs) have properties highly valuable for developing various optoelectronic devices such as QD lasers and single photon sources.
no code implementations • 7 Jun 2023 • Libin Wang, Han Hu, Qisen Shang, Bo Xu, Qing Zhu
The lack of façade structures in photogrammetric mesh models renders them inadequate for meeting the demands of intricate applications.
no code implementations • 31 May 2023 • Ziyi Ni, Minglun Han, Feilong Chen, Linghui Meng, Jing Shi, Pin Lv, Bo Xu
In this paper, we first propose ViLaS (Vision and Language into Automatic Speech Recognition), a novel multimodal ASR model based on the continuous integrate-and-fire (CIF) mechanism, which can integrate visual and textual context simultaneously or separately, to facilitate speech recognition.
no code implementations • 24 May 2023 • Xiyuan Wang, Fangyuan Wang, Bo Xu, Liang Xu, Jing Xiao
Typically, the Time-Delay Neural Network (TDNN) and Transformer can serve as a backbone for Speaker Verification (SV).
no code implementations • 20 May 2023 • Man Yao, Yuhong Chou, Guangshe Zhao, Xiawu Zheng, Yonghong Tian, Bo Xu, Guoqi Li
LTH opens up a new path for network pruning.
no code implementations • 10 May 2023 • Xiyun Li, Ziyi Ni, Jingqing Ruan, Linghui Meng, Jing Shi, Tielin Zhang, Bo Xu
Inspired by this two-step psychology theory, we propose a biologically plausible mixture of personality (MoP) improved spiking actor network (SAN), whereby a determinantal point process is used to simulate the complex formation and integration of different types of personality in MoP, and dynamic and spiking neurons are incorporated into the SAN for efficient reinforcement learning.
1 code implementation • 8 May 2023 • Junyu Lu, Bo Xu, Xiaokun Zhang, Changrong Min, Liang Yang, Hongfei Lin
In addition, it is crucial to introduce lexical knowledge to detect the toxicity of posts, which has been a challenge for researchers.
2 code implementations • 7 May 2023 • Feilong Chen, Minglun Han, Haozhi Zhao, Qingyang Zhang, Jing Shi, Shuang Xu, Bo Xu
(3) Integrating multiple modalities: all single-modal encoders are aligned with the LLM through X2L interfaces to integrate multimodal capabilities into the LLM.
1 code implementation • ICLR 2023 • Hongming Zhang, Chenjun Xiao, Han Wang, Jun Jin, Bo Xu, Martin Müller
In this work, we further exploit the information in the replay memory by treating it as an empirical Replay Memory MDP (RM-MDP).
no code implementations • 12 Apr 2023 • Haojia Yu, Han Hu, Bo Xu, Qisen Shang, Zhendong Wang, Qing Zhu
Most urban applications necessitate building footprints in the form of concise vector graphics with sharp boundaries rather than pixel-wise raster images.
no code implementations • 30 Mar 2023 • Qisen Shang, Han Hu, Haojia Yu, Bo Xu, Libin Wang, Qing Zhu
Experimental results on publicly available façade image and 3D model datasets demonstrate that our method yields superior results and effectively addresses issues associated with flawed textures.
1 code implementation • 2 Mar 2023 • Zefa Hu, Xiuyi Chen, Haoran Wu, Minglun Han, Ziyi Ni, Jing Shi, Shuang Xu, Bo Xu
Medical Slot Filling (MSF) task aims to convert medical queries into structured information, playing an essential role in diagnosis dialogue systems.
1 code implementation • 2 Feb 2023 • Minglun Han, Qingyu Wang, Tielin Zhang, Yi Wang, Duzhen Zhang, Bo Xu
The spiking neural network (SNN) using leaky-integrated-and-fire (LIF) neurons has been commonly used in automatic speech recognition (ASR) tasks.
2 code implementations • 30 Jan 2023 • Minglun Han, Feilong Chen, Jing Shi, Shuang Xu, Bo Xu
Large-scale pre-trained language models (PLMs) have shown great potential in natural language processing tasks.
1 code implementation • CVPR 2023 • Yijia He, Bo Xu, Zhanpeng Ouyang, Hongdong Li
We propose a novel visual-inertial odometry (VIO) initialization method, which decouples rotation and translation estimation, and achieves higher efficiency and better robustness.
1 code implementation • 29 Dec 2022 • Duzhen Zhang, Tielin Zhang, Shuncheng Jia, Qingyu Wang, Bo Xu
Learning from the interaction is the primary way biological agents know about the environment and themselves.
no code implementations • 25 Nov 2022 • Cheng Lyu, Jiake Xie, Bo Xu, Cheng Lu, Han Huang, Xin Huang, Ming Wu, Chuang Zhang, Yong Tang
The performance of trimap-free image matting methods is limited when trying to decouple the deterministic and undetermined regions, especially in scenes where foregrounds are semantically ambiguous, chromaless, or of high transmittance.
no code implementations • 21 Nov 2022 • Fangyuan Wang, Bo Xu
Currently, chunk-wise schemes are often used to make Automatic Speech Recognition (ASR) models support streaming deployment.
1 code implementation • 12 Nov 2022 • Shuncheng Jia, Tielin Zhang, Ruichen Zuo, Bo Xu
Here, we propose a Motif-topology improved SNN (M-SNN) for efficient multi-sensory integration and cognitive phenomenon simulations.
no code implementations • 19 Oct 2022 • Xinliang Liu, Bo Xu, Lei Zhang
Neural operators have emerged as a powerful tool for learning the mapping between infinite-dimensional parameter and solution spaces of partial differential equations (PDEs).
no code implementations • 28 Sep 2022 • Man Yao, Guangshe Zhao, Hengyu Zhang, Yifan Hu, Lei Deng, Yonghong Tian, Bo Xu, Guoqi Li
On ImageNet-1K, we achieve top-1 accuracy of 75.92% and 77.08% on single/4-step Res-SNN-104, which are state-of-the-art results in SNNs.
1 code implementation • 9 May 2022 • Xiaokun Zhang, Bo Xu, Liang Yang, Chenliang Li, Fenglong Ma, Haifeng Liu, Hongfei Lin
Finally, we predict user actions based on item features and users' price and interest preferences.
no code implementations • 20 Apr 2022 • Bo Xu, Jiake Xie, Han Huang, Ziwen Li, Cheng Lu, Yong Tang, Yandong Guo
In this paper, we propose a Situational Perception Guided Image Matting (SPG-IM) method that mitigates subjective bias of matting annotations and captures sufficient situational perception information for better global saliency distilled from the visual-to-textual task.
no code implementations • 15 Apr 2022 • Feilong Chen, Xiuyi Chen, Shuang Xu, Bo Xu
Visual Dialog is a challenging vision-language task since the visual dialog agent needs to answer a series of questions after reasoning over both the image content and dialog history.
1 code implementation • 29 Mar 2022 • Fangyuan Wang, Bo Xu
We integrate this scheme with the chunk-wise Transformer and Conformer, and identify them as SChunk-Transformer and SChunk-Conformer, respectively.
1 code implementation • IEEE/ACM Transactions on Audio, Speech, and Language Processing 2022 • Bo Zhang, Jian Wang, Hongfei Lin, Hui Ma, Bo Xu
Correlation integration is designed to fully exploit the pairwise mutual information among dialogue context, knowledge, and responses, while overall integration adopts an integration gate to capture global information.
no code implementations • 8 Mar 2022 • Bo Xu, Guanze Liu, Han Huang, Cheng Lu, Yandong Guo
Most existing CNN-based salient object detection methods can identify local segmentation details like hair and animal fur, but often misinterpret the real saliency due to the lack of global contextual information caused by the subjectiveness of the SOD task and the locality of convolution layers.
1 code implementation • 18 Feb 2022 • Feilong Chen, Duzhen Zhang, Minglun Han, Xiuyi Chen, Jing Shi, Shuang Xu, Bo Xu
Finally, we discuss the new frontiers in VLP.
1 code implementation • 11 Feb 2022 • Shuncheng Jia, Ruichen Zuo, Tielin Zhang, Hongxing Liu, Bo Xu
Network architectures and learning principles are key in forming complex functions in artificial neural networks (ANNs) and spiking neural networks (SNNs).
1 code implementation • 30 Jan 2022 • Minglun Han, Linhao Dong, Zhenlin Liang, Meng Cai, Shiyu Zhou, Zejun Ma, Bo Xu
Nowadays, most methods in end-to-end contextual speech recognition bias the recognition process towards contextual knowledge.
no code implementations • 22 Jan 2022 • Han Hu, Xinrong Liang, Yulin Ding, Qisen Shang, Bo Xu, Xuming Ge, Min Chen, Ruofei Zhong, Qing Zhu
Unfortunately, the large amount of interactive sample labeling efforts has dramatically hindered the application of deep learning methods, especially for 3D modeling tasks, which require heterogeneous samples.
no code implementations • 17 Dec 2021 • Jing Shi, Xuankai Chang, Tomoki Hayashi, Yen-Ju Lu, Shinji Watanabe, Bo Xu
Specifically, we propose a novel speech separation/enhancement model based on the recognition of discrete symbols, and convert the paradigm of the speech separation/enhancement related tasks from regression to classification.
1 code implementation • 6 Dec 2021 • Linghui Meng, Muning Wen, Yaodong Yang, Chenyang Le, Xiyun Li, Weinan Zhang, Ying Wen, Haifeng Zhang, Jun Wang, Bo Xu
In this paper, we facilitate the research by providing large-scale datasets, and use them to examine the usage of the Decision Transformer in the context of MARL.
1 code implementation • 7 Nov 2021 • Qinghua Liu, Yating Huang, Yunzhe Hao, Jiaming Xu, Bo Xu
Multi-modal cues, including spatial information, facial expression and voiceprint, are introduced to the speech separation and speaker extraction tasks to serve as complementary information to achieve better performance.
no code implementations • 22 Oct 2021 • Ziwen Li, Bo Xu, Han Huang, Cheng Lu, Yandong Guo
In this paper, we propose a new framework Deep Two-Stream Video Inference for Human Body Pose and Shape Estimation (DTS-VIBE), to generate 3D human pose and mesh from RGB videos.
Ranked #44 on 3D Human Pose Estimation on 3DPW
1 code implementation • ICCV 2021 • Bo Xu, Han Huang, Cheng Lu, Ziwen Li, Yandong Guo
In this paper, we propose a Virtual Multi-modality Foreground Matting (VMFM) method to learn the human-object interactive foreground (the human and the objects he or she interacts with) from a raw RGB image.
no code implementations • 2 Oct 2021 • Kaixiang Yang, Hongya Wang, Bo Xu, Wei Wang, Yingyuan Xiao, Ming Du, Junfeng Zhou
In the middle of query execution, AdaptNN collects a number of runtime features and predicts termination condition for each individual query, by which better end-to-end latency is attained.
no code implementations • 29 Sep 2021 • Linghui Meng, Muning Wen, Yaodong Yang, Chenyang Le, Xiyun Li, Haifeng Zhang, Ying Wen, Weinan Zhang, Jun Wang, Bo Xu
Offline reinforcement learning leverages static datasets to learn optimal policies with no necessity to access the environment.
no code implementations • 21 Sep 2021 • Hongming Zhang, Ke Sun, Bo Xu, Linglong Kong, Martin Müller
In this paper, we propose a simple yet effective anomaly detection framework for deep RL algorithms that simultaneously considers random, adversarial and out-of-distribution (OOD) state outliers.
no code implementations • 7 Jul 2021 • Fangyuan Wang, Zhigang Song, Hongchen Jiang, Bo Xu
Most of the recent state-of-the-art results for speaker verification are achieved by X-vector and its subsequent variants.
no code implementations • 15 Jun 2021 • Duzhen Zhang, Tielin Zhang, Shuncheng Jia, Xiang Cheng, Bo Xu
Based on a hybrid learning framework, where a spike actor-network infers actions from states and a deep critic network evaluates the actor, we propose a Population-coding and Dynamic-neurons improved Spiking Actor Network (PDSAN) for efficient state representation from two different scales: input coding and neuronal coding.
1 code implementation • 13 Jun 2021 • Yunzhe Hao, Jiaming Xu, Peng Zhang, Bo Xu
In the speaker extraction problem, it is found that additional information from the target speaker contributes to the tracking and extraction of the target speaker, which includes voiceprint, lip movement, facial expression, and spatial information.
1 code implementation • NAACL 2021 • Haoran Wu, Wei Chen, Shuang Xu, Bo Xu
Specifically, we first structure the sequence of EMR into a hierarchical graph network and then obtain the causal relationship between multi-granularity features and diagnosis results through counterfactual intervention on the graph.
no code implementations • 17 Apr 2021 • Xiyun Li, Yong Xu, Meng Yu, Shi-Xiong Zhang, Jiaming Xu, Bo Xu, Dong Yu
The spatial self-attention module is designed to attend on the cross-channel correlation in the covariance matrices.
no code implementations • 7 Mar 2021 • Ke Sun, Jiaying Liu, Shuo Yu, Bo Xu, Feng Xia
Feature representation is a powerful tool in network analysis tasks.
no code implementations • 7 Mar 2021 • Ke Sun, Lei Wang, Bo Xu, Wenhong Zhao, Shyh Wei Teng, Feng Xia
Network representation learning (NRL) is an effective graph analytics technique that helps users deeply understand the hidden characteristics of graph data.
no code implementations • 25 Feb 2021 • Linghui Meng, Jin Xu, Xu Tan, Jindong Wang, Tao Qin, Bo Xu
In this paper, we propose MixSpeech, a simple yet effective data augmentation method based on mixup for automatic speech recognition (ASR).
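The mixup idea underlying this kind of augmentation can be sketched as below: two feature sequences are interpolated with a weight `lam`, and the training loss combines the losses against both transcripts with the same weight. This is a hedged sketch of generic mixup applied to ASR-style inputs, assuming equal-length feature frames and externally computed per-transcript losses; `mix_features`, `mixed_loss`, and the toy numbers are hypothetical and not the authors' code.

```python
from typing import List

Frames = List[List[float]]


def mix_features(x_a: Frames, x_b: Frames, lam: float) -> Frames:
    """Interpolate two equal-shaped feature sequences frame by frame."""
    if len(x_a) != len(x_b):
        raise ValueError("this sketch assumes equal-length inputs")
    return [
        [lam * a + (1.0 - lam) * b for a, b in zip(frame_a, frame_b)]
        for frame_a, frame_b in zip(x_a, x_b)
    ]


def mixed_loss(loss_a: float, loss_b: float, lam: float) -> float:
    """Weight the two per-transcript losses with the same mixing weight."""
    return lam * loss_a + (1.0 - lam) * loss_b


x_a = [[1.0, 1.0], [2.0, 2.0]]  # toy features of utterance A
x_b = [[3.0, 3.0], [4.0, 4.0]]  # toy features of utterance B
lam = 0.5
mixed = mix_features(x_a, x_b, lam)
loss = mixed_loss(loss_a=2.0, loss_b=4.0, lam=lam)
```

In practice `lam` is usually sampled per batch rather than fixed; the constant 0.5 here only keeps the example easy to check by hand.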
1 code implementation • 20 Jan 2021 • Fei Du, Bo Xu, Jiasheng Tang, Yuqi Zhang, Fan Wang, Hao Li
We extend the classical tracking-by-detection paradigm to this tracking-any-object task.
Ranked #7 on Multi-Object Tracking on TAO (using extra training data)
no code implementations • 17 Jan 2021 • Cheng Yi, Shiyu Zhou, Bo Xu
In this work, we fuse a pre-trained acoustic encoder (wav2vec2.0) and a pre-trained linguistic encoder (BERT) into an end-to-end ASR model.
no code implementations • 5 Jan 2021 • Fukang Tian, Haiyu Wu, Bo Xu
At present, a few works have applied deep learning methods to financial ticket recognition.
no code implementations • 22 Dec 2020 • Cheng Yi, Jianzhong Wang, Ning Cheng, Shiyu Zhou, Bo Xu
To verify its universality over languages, we apply pre-trained models to solve low-resource speech recognition tasks in various spoken languages.
no code implementations • 17 Dec 2020 • Minglun Han, Linhao Dong, Shiyu Zhou, Bo Xu
End-to-end (E2E) models have achieved promising results on multiple speech recognition benchmarks, and shown the potential to become the mainstream.
no code implementations • 15 Dec 2020 • Fukang Tian, Haiyu Wu, Bo Xu
With the development of the economy, the number of financial tickets increases rapidly.
no code implementations • 11 Dec 2020 • Zhiyun Fan, Meng Li, Shiyu Zhou, Bo Xu
Then we demonstrate the effectiveness of wav2vec 2.0 on the two tasks respectively.
no code implementations • COLING 2020 • Duzhen Zhang, Xiuyi Chen, Shuang Xu, Bo Xu
For one thing, speakers often rely on context and commonsense knowledge to express emotions; for another, most utterances in conversations carry neutral emotion, and as a result the confusion between a few non-neutral utterances and many more neutral ones restrains emotion recognition performance.
no code implementations • 29 Nov 2020 • Peng Zhang, Jiaming Xu, Jing Shi, Yunzhe Hao, Bo Xu
In our model, we use the face detector to detect the number of speakers in the scene and use visual information to avoid the permutation problem.
no code implementations • 29 Oct 2020 • Fukang Tian, Haiyu Wu, Bo Xu
Facing the rapid growth in the issuance of financial tickets (or bills, invoices etc.
1 code implementation • 9 Oct 2020 • Tielin Zhang, Shuncheng Jia, Xiang Cheng, Bo Xu
The performance of the proposed BRP-SNN is further verified on the spatial (including MNIST and Cifar-10) and temporal (including TIDigits and DvsGesture) tasks, where the SNN using BRP has reached a similar accuracy compared to other state-of-the-art BP-based SNNs and saved 50% more computational cost than ANNs.
no code implementations • 7 Oct 2020 • Xiang Cheng, Tielin Zhang, Shuncheng Jia, Bo Xu
Spiking Neural Networks (SNNs) have incorporated more biologically-plausible structures and learning principles, hence are playing critical roles in bridging the gap between artificial and natural neural networks.
1 code implementation • 21 Sep 2020 • Qianqian Dong, Mingxuan Wang, Hao Zhou, Shuang Xu, Bo Xu, Lei Li
The key idea is to generate source transcript and target translation text with a single decoder.
1 code implementation • 21 Sep 2020 • Qianqian Dong, Rong Ye, Mingxuan Wang, Hao Zhou, Shuang Xu, Bo Xu, Lei Li
Can we build a system to fully utilize signals in a parallel ST corpus?
no code implementations • 17 Aug 2020 • Jiaying Liu, Feng Xia, Lei Wang, Bo Xu, Xiangjie Kong, Hanghang Tong, Irwin King
The advisor-advisee relationship represents direct knowledge heritage, and such relationship may not be readily available from academic libraries and search engines.
no code implementations • 9 Aug 2020 • Lei Wang, Jing Ren, Bo Xu, Jian-Xin Li, Wei Luo, Feng Xia
Link prediction plays an important role in network analysis and applications.
no code implementations • 9 Aug 2020 • Ke Hou, Jiaying Liu, Yin Peng, Bo Xu, Ivan Lee, Feng Xia
Empirically, we evaluate DINE over three networks on multi-label classification and link prediction tasks.
no code implementations • 25 Jun 2020 • Jing Shi, Jiaming Xu, Yusuke Fujita, Shinji Watanabe, Bo Xu
With the predicted speaker information from whole observation, our model is helpful to solve the problem of conventional speech separation and speaker extraction for multi-round long recordings.
Audio and Speech Processing Sound
no code implementations • NeurIPS 2020 • Jing Shi, Xuankai Chang, Pengcheng Guo, Shinji Watanabe, Yusuke Fujita, Jiaming Xu, Bo Xu, Lei Xie
This model additionally has a simple and efficient stop criterion for the end of the transduction, making it able to infer the variable number of output sequences.
Ranked #3 on Speech Separation on WSJ0-4mix
no code implementations • 20 May 2020 • Linhao Dong, Cheng Yi, Jianzong Wang, Shiyu Zhou, Shuang Xu, Xueli Jia, Bo Xu
End-to-end models are gaining wider attention in the field of automatic speech recognition (ASR).
no code implementations • 19 May 2020 • Bo Xu, Xu Zhang, Zhixin Li, Matt Leotta, Shih-Fu Chang, Jie Shan
For points that belong to the same roof shape, a multi-cue, hierarchical RANSAC approach is proposed for efficiently and reliably segmenting and reconstructing the building point cloud.
2 code implementations • CVPR 2020 • Bo Xu, Cheng Lu, Yandong Guo, Jacob Wang
Vision is often used as a complementary modality for audio speech recognition (ASR), especially in the noisy environment where performance of solo audio modality significantly deteriorates.
Ranked #6 on Audio-Visual Speech Recognition on LRS3-TED (using extra training data)
no code implementations • 2 Jan 2020 • Zhiyun Fan, Jie Li, Shiyu Zhou, Bo Xu
We investigate different factors of SAM.
1 code implementation • 18 Dec 2019 • Feilong Chen, Fandong Meng, Jiaming Xu, Peng Li, Bo Xu, Jie Zhou
Visual Dialog is a vision-language task that requires an AI agent to engage in a conversation with humans grounded in an image.
no code implementations • 12 Nov 2019 • Feng Chen, Yunkai Shang, Bo Xu, Jincheng Hu
Compared with previous non-learning adversarial attack approaches, GAN-based approaches can, once trained, quickly generate an adversarial example for each new sample, but they need to perturb the attacked samples heavily, which makes them impractical in real applications.
no code implementations • 28 Oct 2019 • Zhiyun Fan, Shiyu Zhou, Bo Xu
The unsupervised pre-training is finished on AISHELL-2 dataset and we apply the pre-trained model to multiple paired data ratios of AISHELL-1 and HKUST.
no code implementations • 16 Aug 2019 • Jiancheng Long, Hongming Zhang, Tianyang Yu, Bo Xu
In this method, iterative update can greatly alleviate the nonstationarity of the environment, and unified representation can speed up the interaction with the environment and avoid the linear growth of memory usage.
no code implementations • ACL 2019 • Xiuyi Chen, Jiaming Xu, Bo Xu
Our WMM2Seq adopts a working memory to interact with two separated long-term memories, which are the episodic memory for memorizing dialog history and the semantic memory for storing KB tuples.
no code implementations • NAACL 2019 • Yiqun Yao, Jiaming Xu, Bo Xu
Visual Dialog is a multi-modal task that requires a model to participate in a multi-turn human dialog grounded on an image, and generate correct, human-like responses.
2 code implementations • 27 May 2019 • Linhao Dong, Bo Xu
In this paper, we propose a novel soft and monotonic alignment mechanism used for sequence transduction.
no code implementations • 17 May 2019 • Zengyou He, Guangyao Xu, Chaohua Sheng, Bo Xu, Quan Zou
By utilizing this framework as a tool, we propose new sequence classification algorithms that are quite different from existing solutions.
no code implementations • 17 Apr 2019 • Matthew Purri, Jia Xue, Kristin Dana, Matthew Leotta, Dan Lipsa, Zhixin Li, Bo Xu, Jie Shan
The residuals are computed by differencing the sparse-sampled reflectance function with a dictionary of pre-defined dense-sampled reflectance functions.
no code implementations • 18 Feb 2019 • Linhao Dong, Feng Wang, Bo Xu
Experiments on two Mandarin ASR datasets show that replacing RNNs with self-attention networks yields an 8.4%-10.2% relative character error rate (CER) reduction.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
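For readers unfamiliar with the term, a relative error rate reduction is the drop in error rate divided by the baseline error rate. The sketch below illustrates that arithmetic only; the function name and the input values are hypothetical and are not taken from the paper.

```python
def relative_reduction(baseline_error: float, new_error: float) -> float:
    """Return the relative error rate reduction, in percent.

    Both arguments are error rates in the same unit (e.g. CER in percent).
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error * 100.0


# Hypothetical numbers for illustration only (not results from the paper):
# a baseline CER of 25.0% and a new CER of 22.5%.
reduction = relative_reduction(25.0, 22.5)
print(f"relative CER reduction: {reduction:.1f}%")
```

Note that a relative reduction differs from an absolute one: in the hypothetical case above the absolute drop is 2.5 percentage points.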
1 code implementation • 17 Dec 2018 • Yunzhe Hao, Xuhui Huang, Meng Dong, Bo Xu
By combining the sym-STDP rule with bio-plausible synaptic scaling and intrinsic plasticity of the dynamic threshold, our SNN model implemented SL well and achieved good performance in the benchmark recognition task (MNIST dataset).
no code implementations • 15 Nov 2018 • Jing Shi, Jiaming Xu, Yiqun Yao, Bo Xu
In this paper, we present a memory-augmented neural network which is motivated by the process of human concept learning.
no code implementations • EMNLP 2018 • Yufeng Diao, Hongfei Lin, Di Wu, Liang Yang, Kan Xu, Zhihao Yang, Jian Wang, Shaowu Zhang, Bo Xu, Dongyu Zhang
In this work, we first use WordNet to understand and expand word embeddings in order to resolve the polysemy of homographic puns, and then propose a WordNet-Encoded Collocation-Attention network model (WECA), which is combined with context weights to recognize the puns.
1 code implementation • EMNLP 2018 • Yiqun Yao, Jiaming Xu, Feng Wang, Bo Xu
Our code is available at https://github.com/FlamingHorizon/CMM-VR.
no code implementations • COLING 2018 • Feng Wang, Wei Chen, Zhen Yang, Qianqian Dong, Shuang Xu, Bo Xu
While disfluency detection has achieved notable success in recent years, it still suffers severely from data scarcity.
no code implementations • ACL 2018 • Dongyu Zhang, Hongfei Lin, Liang Yang, Shaowu Zhang, Bo Xu
However, there is little research on the construction of metaphor corpora annotated with emotion for the analysis of emotionality of metaphorical expressions.
no code implementations • 25 Jun 2018 • Chenxing Li, Tieqiang Wang, Shuang Xu, Bo Xu
In this paper, we propose a single-channel speech dereverberation system (DeReGAT) based on convolutional, bidirectional long short-term memory and deep feed-forward neural network (CBLDNN) with generative adversarial training (GAT).
no code implementations • 17 Jun 2018 • Linhao Dong, Shiyu Zhou, Wei Chen, Bo Xu
End-to-end models have been showing superiority in Automatic Speech Recognition (ASR).
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
no code implementations • 12 Jun 2018 • Shiyu Zhou, Shuang Xu, Bo Xu
Experiments on CALLHOME datasets demonstrate that the multilingual ASR Transformer with the language symbol at the end performs better and can obtain a relative 10.5% average word error rate (WER) reduction compared to SHL-MLSTM with residual learning.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3
3 code implementations • 4 Jun 2018 • Fenfen Sheng, Zhineng Chen, Bo Xu
Considering that scene images vary widely in text and background, we further design a modality-transform block to effectively transform 2D input images into 1D sequences, combined with the encoder to extract more discriminative features.
no code implementations • 16 May 2018 • Shiyu Zhou, Linhao Dong, Shuang Xu, Bo Xu
Experiments on HKUST datasets demonstrate that lexicon-free modeling units can outperform lexicon-related modeling units in terms of character error rate (CER).
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3
1 code implementation • 28 Apr 2018 • Shiyu Zhou, Linhao Dong, Shuang Xu, Bo Xu
Furthermore, we compare a syllable-based model and a context-independent phoneme (CI-phoneme) based model with the Transformer in Mandarin Chinese.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +6
1 code implementation • ACL 2018 • Zhen Yang, Wei Chen, Feng Wang, Bo Xu
Unsupervised neural machine translation (NMT) is a recently proposed approach for machine translation which aims to train the model without using any labeled data.
Ranked #6 on Machine Translation on WMT2016 German-English
1 code implementation • IJCNLP 2017 • Chunqi Wang, Bo Xu
The first is that they rely heavily on manually designed bigram features, i.e., they are not good at capturing n-gram features automatically.
no code implementations • EMNLP 2017 • Xiaowei Zhang, Wei Chen, Feng Wang, Shuang Xu, Bo Xu
Neural Machine Translation (NMT) imposes a heavy burden on computation and memory.
2 code implementations • ACL 2017 • Suncong Zheng, Feng Wang, Hongyun Bao, Yuexing Hao, Peng Zhou, Bo Xu
Joint extraction of entities and relations is an important task in information extraction.
Ranked #3 on Relation Extraction on NYT-single
3 code implementations • NAACL 2018 • Zhen Yang, Wei Chen, Feng Wang, Bo Xu
During training, both the dynamic discriminator and the static BLEU objective are employed to evaluate the generated sentences and feed the evaluations back to guide the learning of the generator.
1 code implementation • 1 Jan 2017 • Jiaming Xu, Peng Wang, Suncong Zheng, Guanhua Tian, Jun Zhao, Bo Xu
Short text clustering is a challenging problem due to the sparseness of its text representation.
Ranked #2 on Short Text Clustering on Stackoverflow
no code implementations • WS 2016 • Jing Shi, Jiaming Xu, Yiqun Yao, Suncong Zheng, Bo Xu
As the evaluation results show, our solution provides a valuable and concise model that could be used for modelling question answering or sentence semantic relevance.
no code implementations • COLING 2016 • Zhen Yang, Wei Chen, Feng Wang, Bo Xu
This article proposes a novel character-aware neural machine translation (NMT) model that views the input sequences as sequences of characters rather than words.
3 code implementations • COLING 2016 • Peng Zhou, Zhenyu Qi, Suncong Zheng, Jiaming Xu, Hongyun Bao, Bo Xu
To integrate the features on both dimensions of the matrix, this paper explores applying a 2D max pooling operation to obtain a fixed-length representation of the text.
Ranked #6 on Text Classification on TREC-6
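As a rough illustration of the 2D max pooling operation mentioned in this entry, the sketch below pools a small feature matrix over non-overlapping windows along both dimensions and flattens the result into a vector. It is a minimal pure-Python sketch, not the paper's model: the function names, window size, and toy matrix are assumptions, and the output length is fixed here only because the input matrix size is fixed.

```python
from typing import List

Matrix = List[List[float]]


def max_pool_2d(matrix: Matrix, pool_h: int, pool_w: int) -> Matrix:
    """Non-overlapping 2D max pooling.

    Trailing rows or columns that do not fill a whole window are dropped.
    """
    rows, cols = len(matrix), len(matrix[0])
    pooled: Matrix = []
    for r in range(0, rows - pool_h + 1, pool_h):
        pooled_row = []
        for c in range(0, cols - pool_w + 1, pool_w):
            # Collect every value inside the current pool_h x pool_w window.
            window = [
                matrix[r + i][c + j]
                for i in range(pool_h)
                for j in range(pool_w)
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


def flatten(matrix: Matrix) -> List[float]:
    """Concatenate the rows of a matrix into a single vector."""
    return [value for row in matrix for value in row]


# Toy 4x4 feature matrix (hypothetical values): rows could stand for time
# steps and columns for feature dimensions.
features = [
    [0.1, 0.5, 0.2, 0.0],
    [0.3, 0.4, 0.9, 0.1],
    [0.7, 0.2, 0.6, 0.8],
    [0.0, 0.1, 0.3, 0.2],
]
vector = flatten(max_pool_2d(features, 2, 2))
print(vector)  # [0.5, 0.9, 0.7, 0.8]
```

Pooling over both axes, rather than only over the time-step axis, is what distinguishes this from 1D max pooling: each output value summarizes a local block of time steps and feature dimensions together.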
1 code implementation • COLING 2016 • Jiaming Xu, Jing Shi, Yiqun Yao, Suncong Zheng, Bo Xu
Recently, end-to-end memory networks have shown promising results on the Question Answering task; they encode past facts into an explicit memory and perform reasoning by taking multiple computational steps over the memory.
no code implementations • Journal of Biomedical Informatics 2016 • Wei Zheng, Hongfei Lin, Zhehuan Zhao, Bo Xu, Yijia Zhang, Zhihao Yang, Jian Wang
Especially for the Medline-2013 dataset, our system outperforms the top-ranking DDI systems by F-scores of 10.7 and 12.2 in detection and classification, respectively.
no code implementations • IJCAI 2015 • Jiaming Xu, Peng Wang, Guanhua Tian, Bo Xu, Jun Zhao, Fangyuan Wang, HongWei Hao
Meanwhile, word features and position features are fed together into a convolutional network to learn the implicit features, which are further combined with the explicit features to fit the pretrained binary code.
1 code implementation • 10 Mar 2015 • Jiaming Xu, Bo Xu, Guanhua Tian, Jun Zhao, Fangyuan Wang, Hong-Wei Hao
However, topics of certain granularity are not adequate to represent the intrinsic semantic information.