Search Results for author: Dong Wang

Found 270 papers, 100 papers with code

Wild Face Anti-Spoofing Challenge 2023: Benchmark and Results

1 code implementation 12 Apr 2023 Dong Wang, Jia Guo, Qiqi Shao, Haochi He, Zhian Chen, Chuanbao Xiao, Ajian Liu, Sergio Escalera, Hugo Jair Escalante, Zhen Lei, Jun Wan, Jiankang Deng

Leveraging the WFAS dataset and Protocol 1 (Known-Type), we host the Wild Face Anti-Spoofing Challenge at the CVPR2023 workshop.

Face Anti-Spoofing Face Recognition

Learning to Navigate for Fine-grained Classification

12 code implementations ECCV 2018 Ze Yang, Tiange Luo, Dong Wang, Zhiqiang Hu, Jun Gao, Li-Wei Wang

In consideration of the intrinsic consistency between the informativeness of the regions and their probability of being the ground-truth class, we design a novel training paradigm, which enables the Navigator to detect the most informative regions under the guidance of the Teacher.

Fine-Grained Image Classification General Classification +1

Universal Instance Perception as Object Discovery and Retrieval

1 code implementation CVPR 2023 Bin Yan, Yi Jiang, Jiannan Wu, Dong Wang, Ping Luo, Zehuan Yuan, Huchuan Lu

All instance perception tasks aim at finding certain objects specified by some queries such as category names, language expressions, and target annotations, but this complete field has been split into multiple independent subtasks.

Ranked #1 on Referring Expression Segmentation on RefCoCo val (using extra training data)

Described Object Detection Generalized Referring Expression Comprehension +15

Towards Grand Unification of Object Tracking

1 code implementation 14 Jul 2022 Bin Yan, Yi Jiang, Peize Sun, Dong Wang, Zehuan Yuan, Ping Luo, Huchuan Lu

We present a unified method, termed Unicorn, that can simultaneously solve four tracking problems (SOT, MOT, VOS, MOTS) with a single network using the same model parameters.

Multi-Object Tracking Multi-Object Tracking and Segmentation +3

Off-Policy Primal-Dual Safe Reinforcement Learning

2 code implementations 26 Jan 2024 Zifan Wu, Bo Tang, Qian Lin, Chao Yu, Shangqin Mao, Qianlong Xie, Xingxing Wang, Dong Wang

Results on benchmark tasks show that our method not only achieves an asymptotic performance comparable to state-of-the-art on-policy methods while using much fewer samples, but also significantly reduces constraint violation during training.

reinforcement-learning Safe Reinforcement Learning

Tracking Anything in High Quality

1 code implementation 26 Jul 2023 Jiawen Zhu, Zhenyu Chen, Zeqi Hao, Shijie Chang, Lu Zhang, Dong Wang, Huchuan Lu, Bin Luo, Jun-Yan He, Jin-Peng Lan, Hanyuan Chen, Chenyang Li

To further improve the quality of tracking masks, a pretrained MR model is employed to refine the tracking results.

Object Semantic Segmentation +3

C^3 Framework: An Open-source PyTorch Code for Crowd Counting

3 code implementations 5 Jul 2019 Junyu Gao, Wei Lin, Bin Zhao, Dong Wang, Chenyu Gao, Jun Wen

This technical report attempts to provide efficient and solid kits addressed on the field of crowd counting, which is denoted as Crowd Counting Code Framework (C$^3$F).

Crowd Counting

Transformer Tracking

1 code implementation CVPR 2021 Xin Chen, Bin Yan, Jiawen Zhu, Dong Wang, Xiaoyun Yang, Huchuan Lu

The correlation operation is a simple fusion manner to consider the similarity between the template and the search region.

Visual Object Tracking Visual Tracking

High-Performance Long-Term Tracking with Meta-Updater

2 code implementations CVPR 2020 Kenan Dai, Yunhua Zhang, Dong Wang, Jianhua Li, Huchuan Lu, Xiaoyun Yang

Most top-ranked long-term trackers adopt offline-trained Siamese architectures and thus cannot benefit from the great progress of short-term trackers with online update.

Visual Object Tracking Visual Tracking +1

Visual Prompt Multi-Modal Tracking

1 code implementation CVPR 2023 Jiawen Zhu, Simiao Lai, Xin Chen, Dong Wang, Huchuan Lu

To inherit the powerful representations of the foundation model, a natural modus operandi for multi-modal tracking is full fine-tuning on the RGB-based parameters.

Object Tracking Rgb-T Tracking

Learning regression and verification networks for long-term visual tracking

3 code implementations 12 Sep 2018 Yunhua Zhang, Dong Wang, Lijun Wang, Jinqing Qi, Huchuan Lu

Compared with short-term tracking, the long-term tracking task requires determining whether the tracked object is present or absent, and then estimating the accurate bounding box if present or conducting image-wide re-detection if absent.

General Classification Object +3

Balanced Multimodal Learning via On-the-fly Gradient Modulation

1 code implementation CVPR 2022 Xiaokang Peng, Yake Wei, Andong Deng, Dong Wang, Di Hu

Multimodal learning helps to comprehensively understand the world, by integrating different senses.

Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training

3 code implementations 28 May 2022 Renrui Zhang, Ziyu Guo, Rongyao Fang, Bin Zhao, Dong Wang, Yu Qiao, Hongsheng Li, Peng Gao

By fine-tuning on downstream tasks, Point-M2AE achieves 86.43% accuracy on ScanObjectNN, +3.36% to the second-best, and largely benefits the few-shot classification, part segmentation and 3D object detection with the hierarchical pre-training scheme.

Ranked #4 on 3D Point Cloud Linear Classification on ModelNet40 (using extra training data)

3D Object Detection 3D Point Cloud Linear Classification +5

Alpha-Refine: Boosting Tracking Performance by Precise Bounding Box Estimation

1 code implementation 4 Jul 2020 Bin Yan, Dong Wang, Huchuan Lu, Xiaoyun Yang

In recent years, the multiple-stage strategy has become a popular trend for visual tracking.

Visual Tracking

Mining Negative Temporal Contexts For False Positive Suppression In Real-Time Ultrasound Lesion Detection

1 code implementation 29 May 2023 Haojun Yu, Youcheng Li, Quanlin Wu, Ziwei Zhao, Dengbo Chen, Dong Wang, LiWei Wang

To address this issue, we propose to extract contexts from previous frames, including NTC, with the guidance of inverse optical flow.

Lesion Detection object-detection +2

'Skimming-Perusal' Tracking: A Framework for Real-Time and Robust Long-term Tracking

1 code implementation ICCV 2019 Bin Yan, Haojie Zhao, Dong Wang, Huchuan Lu, Xiaoyun Yang

In this work, we present a novel robust and real-time long-term tracking framework based on the proposed skimming and perusal modules.

LlamaRec: Two-Stage Recommendation using Large Language Models for Ranking

1 code implementation 25 Oct 2023 Zhenrui Yue, Sara Rabhi, Gabriel de Souza Pereira Moreira, Dong Wang, Even Oldridge

Recently, large language models (LLMs) have exhibited significant progress in language understanding and generation.

Movie Recommendation

GradNet: Gradient-Guided Network for Visual Object Tracking

2 code implementations ICCV 2019 Peixia Li, Bo-Yu Chen, Wanli Ouyang, Dong Wang, Xiaoyun Yang, Huchuan Lu

In this work, we propose a novel gradient-guided network to exploit the discriminative information in gradients and update the template in the siamese network through feed-forward and backward operations.

Ranked #3 on Visual Object Tracking on OTB-2015 (Precision metric)

Object Template Matching +2

Multi-modal Visual Tracking: Review and Experimental Comparison

2 code implementations 8 Dec 2020 Pengyu Zhang, Dong Wang, Huchuan Lu

Visual object tracking, as a fundamental task in computer vision, has drawn much attention in recent years.

Rgb-T Tracking Visual Object Tracking

TFN: An Interpretable Neural Network with Time-Frequency Transform Embedded for Intelligent Fault Diagnosis

1 code implementation 5 Sep 2022 Qian Chen, Xingjian Dong, Guowei Tu, Dong Wang, Baoxuan Zhao, Zhike Peng

However, the CNN is a typical black-box model, and the mechanism of its decision-making is not clear, which limits its application in fault diagnosis scenarios that require high reliability.

Decision Making

Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters

1 code implementation 18 Mar 2024 Jiazuo Yu, Yunzhi Zhuge, Lu Zhang, Dong Wang, Huchuan Lu, You He

Continual learning can empower vision-language models to continuously acquire new knowledge, without the need for access to the entire historical dataset.

Continual Learning Incremental Learning +1

Cooling-Shrinking Attack: Blinding the Tracker with Imperceptible Noises

1 code implementation CVPR 2020 Bin Yan, Dong Wang, Huchuan Lu, Xiaoyun Yang

An effective and efficient perturbation generator is trained with a carefully designed adversarial loss, which can simultaneously cool hot regions where the target exists on the heatmaps and force the predicted bounding box to shrink, making the tracked target invisible to trackers.

Adversarial Attack

Vision-based Anti-UAV Detection and Tracking

1 code implementation 22 May 2022 Jie Zhao, Jingshu Zhang, Dongdong Li, Dong Wang

It contains a detection dataset with a total of 10,000 images and a tracking dataset with 20 videos that include short-term and long-term sequences.

ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding with GPT and Prototype Guidance

7 code implementations 29 Mar 2023 Zoey Guo, Yiwen Tang, Ray Zhang, Dong Wang, Zhigang Wang, Bin Zhao, Xuelong Li

In this paper, we propose ViewRefer, a multi-view framework for 3D visual grounding exploring how to grasp the view knowledge from both text and 3D modalities.

Visual Grounding

Amulet: Aggregating Multi-level Convolutional Features for Salient Object Detection

1 code implementation ICCV 2017 Pingping Zhang, Dong Wang, Huchuan Lu, Hongyu Wang, Xiang Ruan

In addition, to achieve accurate boundary inference and semantic enhancement, edge-aware feature maps in low-level layers and the predicted results of low resolution features are recursively embedded into the learning framework.

Ranked #20 on RGB Salient Object Detection on DUTS-TE (max F-measure metric)

Object object-detection +2

CLINE: Contrastive Learning with Semantic Negative Examples for Natural Language Understanding

1 code implementation ACL 2021 Dong Wang, Ning Ding, Piji Li, Hai-Tao Zheng

Recent works aimed to improve the robustness of pre-trained models mainly focus on adversarial training from perturbed examples with similar semantics, neglecting the utilization of different or even opposite semantics.

Contrastive Learning Natural Language Understanding +3

Point-PEFT: Parameter-Efficient Fine-Tuning for 3D Pre-trained Models

5 code implementations 4 Oct 2023 Yiwen Tang, Ray Zhang, Zoey Guo, Dong Wang, Zhigang Wang, Bin Zhao, Xuelong Li

To this end, we introduce Point-PEFT, a novel framework for adapting point cloud pre-trained models with minimal learnable parameters.

Any2Point: Empowering Any-modality Large Models for Efficient 3D Understanding

5 code implementations 11 Apr 2024 Yiwen Tang, Jiaming Liu, Dong Wang, Zhigang Wang, Shanghang Zhang, Bin Zhao, Xuelong Li

The adapter incorporates prior spatial knowledge from the source modality to guide the local feature aggregation of 3D tokens, compelling the semantic adaption of any-modality transformers.

Efficient Visual Tracking via Hierarchical Cross-Attention Transformer

1 code implementation 25 Mar 2022 Xin Chen, Ben Kang, Dong Wang, Dongdong Li, Huchuan Lu

Most state-of-the-art trackers are satisfied with the real-time speed on powerful GPUs.

Visual Tracking

Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning

1 code implementation NeurIPS 2023 Haoran He, Chenjia Bai, Kang Xu, Zhuoran Yang, Weinan Zhang, Dong Wang, Bin Zhao, Xuelong Li

Specifically, we propose Multi-Task Diffusion Model (MTDiff), a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis in multi-task offline settings.

Reinforcement Learning (RL)

THCHS-30 : A Free Chinese Speech Corpus

1 code implementation 7 Dec 2015 Dong Wang, Xuewei Zhang

Speech data is crucially important for speech recognition research.

speech-recognition Speech Recognition

Visible-Thermal UAV Tracking: A Large-Scale Benchmark and New Baseline

1 code implementation CVPR 2022 Pengyu Zhang, Jie Zhao, Dong Wang, Huchuan Lu, Xiang Ruan

With the popularity of multi-modal sensors, visible-thermal (RGB-T) object tracking aims to achieve robust performance and wider application scenarios with the guidance of objects' temperature information.

Attribute Object Tracking +1

PointScatter: Point Set Representation for Tubular Structure Extraction

1 code implementation 13 Sep 2022 Dong Wang, Zhao Zhang, Ziwei Zhao, Yuhang Liu, Yihong Chen, LiWei Wang

Inspired by this, we propose PointScatter, an alternative to the segmentation models for the tubular structure extraction task.

Segmentation

Fully Self-Supervised Depth Estimation from Defocus Clue

1 code implementation CVPR 2023 Haozhe Si, Bin Zhao, Dong Wang, Yunpeng Gao, Mulin Chen, Zhigang Wang, Xuelong Li

We show that our framework circumvents the need for depth and AIF image ground truth and yields superior predictions, thus closing the gap between the theoretical success of DFD works and their applications in the real world.

Depth Estimation

Adversarial and Contrastive Variational Autoencoder for Sequential Recommendation

1 code implementation 19 Mar 2021 Zhe Xie, Chengxuan Liu, Yichi Zhang, Hongtao Lu, Dong Wang, Yue Ding

To solve the above problem, in this work, we propose a novel method called Adversarial and Contrastive Variational Autoencoder (ACVAE) for sequential recommendation.

Collaborative Filtering Sequential Recommendation

Exploiting Persona Information for Diverse Generation of Conversational Responses

1 code implementation 29 May 2019 Haoyu Song, Wei-Nan Zhang, Yiming Cui, Dong Wang, Ting Liu

Given conversational context with persona information, how a chatbot can exploit that information to generate diverse and sustainable conversations is still a non-trivial task.

Chatbot

Linear Recurrent Units for Sequential Recommendation

1 code implementation 3 Oct 2023 Zhenrui Yue, Yueqi Wang, Zhankui He, Huimin Zeng, Julian McAuley, Dong Wang

State-of-the-art sequential recommendation relies heavily on self-attention-based recommender models.

Language Modelling Sequential Recommendation

WANA: Symbolic Execution of Wasm Bytecode for Cross-Platform Smart Contract Vulnerability Detection

1 code implementation 30 Jul 2020 Dong Wang, Bo Jiang, W. K. Chan

Furthermore, WANA proposes a set of test oracles to detect the vulnerabilities in EOSIO and Ethereum smart contracts based on WebAssembly bytecode analysis.

Software Engineering D.2.5

Correlation Tracking via Joint Discrimination and Reliability Learning

1 code implementation CVPR 2018 Chong Sun, Dong Wang, Huchuan Lu, Ming-Hsuan Yang

To address this issue, we propose a novel CF-based optimization problem to jointly model the discrimination and reliability information.

Visual Tracking

High-Performance Transformer Tracking

1 code implementation 25 Mar 2022 Xin Chen, Bin Yan, Jiawen Zhu, Huchuan Lu, Xiang Ruan, Dong Wang

First, we present a transformer tracking (named TransT) method based on the Siamese-like feature extraction backbone, the designed attention-based fusion mechanism, and the classification and regression head.

Vocal Bursts Intensity Prediction

MetaAdapt: Domain Adaptive Few-Shot Misinformation Detection via Meta Learning

1 code implementation 22 May 2023 Zhenrui Yue, Huimin Zeng, Yang Zhang, Lanyu Shang, Dong Wang

As such, MetaAdapt can learn how to adapt the misinformation detection model and exploit the source data for improved performance in the target domain.

Meta-Learning Misinformation +1

D-DPCC: Deep Dynamic Point Cloud Compression via 3D Motion Prediction

1 code implementation 2 May 2022 Tingyu Fan, Linyao Gao, Yiling Xu, Zhu Li, Dong Wang

This paper proposes a novel 3D sparse convolution-based Deep Dynamic Point Cloud Compression (D-DPCC) network to compensate and compress the DPC geometry with 3D motion estimation and motion compensation in the feature space.

Motion Compensation Motion Estimation +2

Temporal Relational Modeling with Self-Supervision for Action Segmentation

1 code implementation 14 Dec 2020 Dong Wang, Di Hu, Xingjian Li, Dejing Dou

The main reason is that the large number of nodes (i.e., video frames) makes it hard for GCNs to capture and model temporal relations in videos.

Action Recognition Action Segmentation +1

Real Additive Margin Softmax for Speaker Verification

1 code implementation 18 Oct 2021 Lantian Li, Ruiqian Nai, Dong Wang

The additive margin softmax (AM-Softmax) loss has delivered remarkable performance in speaker verification.

Speaker Verification

Learning Spatial-Aware Regressions for Visual Tracking

1 code implementation CVPR 2018 Chong Sun, Dong Wang, Huchuan Lu, Ming-Hsuan Yang

Second, we propose a fully convolutional neural network with spatially regularized kernels, through which the filter kernel corresponding to each output channel is forced to focus on a specific region of the target.

regression Visual Object Tracking +1

Topology-Preserving Automatic Labeling of Coronary Arteries via Anatomy-aware Connection Classifier

1 code implementation 22 Jul 2023 Zhixing Zhang, Ziwei Zhao, Dong Wang, Shishuang Zhao, Yuhang Liu, Jia Liu, LiWei Wang

Automatic labeling of coronary arteries is an essential task in the practical diagnosis process of cardiovascular diseases.

Anatomy

Zero- and Few-Shot Event Detection via Prompt-Based Meta Learning

1 code implementation 27 May 2023 Zhenrui Yue, Huimin Zeng, Mengfei Lan, Heng Ji, Dong Wang

With emerging online topics as a source for numerous new events, detecting unseen / rare event types presents an elusive challenge for existing event detection methods, where only limited data access is provided for training.

Event Detection Meta-Learning

Tight Mutual Information Estimation With Contrastive Fenchel-Legendre Optimization

1 code implementation 2 Jul 2021 Qing Guo, Junya Chen, Dong Wang, Yuewei Yang, Xinwei Deng, Lawrence Carin, Fan Li, Jing Huang, Chenyang Tao

Successful applications of InfoNCE and its variants have popularized the use of contrastive variational mutual information (MI) estimators in machine learning.

Mutual Information Estimation

Show, Deconfound and Tell: Image Captioning With Causal Inference

1 code implementation CVPR 2022 Bing Liu, Dong Wang, Xu Yang, Yong Zhou, Rui Yao, Zhiwen Shao, Jiaqi Zhao

In the encoding stage, the IOD is able to disentangle the region-based visual features by deconfounding the visual confounder.

Causal Inference Image Captioning

CN-CELEB: a challenging Chinese speaker recognition dataset

2 code implementations 31 Oct 2019 Yue Fan, Jiawen Kang, Lantian Li, Kaicheng Li, Haolin Chen, Sitong Cheng, Pengyuan Zhang, Ziya Zhou, Yunqi Cai, Dong Wang

These datasets tend to deliver over-optimistic performance and do not meet the needs of research on speaker recognition in unconstrained conditions.

Speaker Recognition

Dual Memory Aggregation Network for Event-Based Object Detection with Learnable Representation

1 code implementation 17 Mar 2023 Dongsheng Wang, Xu Jia, Yang Zhang, Xinyu Zhang, Yaoyuan Wang, Ziyang Zhang, Dong Wang, Huchuan Lu

To fully exploit information with event streams to detect objects, a dual-memory aggregation network (DMANet) is proposed to leverage both long and short memory along event streams to aggregate effective information for object detection.

Object object-detection +1

Safe Offline Reinforcement Learning with Real-Time Budget Constraints

1 code implementation 1 Jun 2023 Qian Lin, Bo Tang, Zifan Wu, Chao Yu, Shangqin Mao, Qianlong Xie, Xingxing Wang, Dong Wang

Aiming at promoting the safe real-world deployment of Reinforcement Learning (RL), research on safe RL has made significant progress in recent years.

reinforcement-learning Reinforcement Learning (RL)

ROI Pooled Correlation Filters for Visual Tracking

1 code implementation CVPR 2019 Yuxuan Sun, Chong Sun, Dong Wang, You He, Huchuan Lu

The ROI (region-of-interest) based pooling method performs pooling operations on the cropped ROI regions for various samples and has shown great success in the object detection methods.

object-detection Object Detection +1

Video Annotation for Visual Tracking via Selection and Refinement

1 code implementation ICCV 2021 Kenan Dai, Jie Zhao, Lijun Wang, Dong Wang, Jianhua Li, Huchuan Lu, Xuesheng Qian, Xiaoyun Yang

Deep learning based visual trackers entail offline pre-training on large volumes of video datasets with accurate bounding box annotations that are labor-expensive to achieve.

Visual Tracking

Cross DQN: Cross Deep Q Network for Ads Allocation in Feed

1 code implementation 9 Sep 2021 Guogang Liao, Ze Wang, Xiaoxu Wu, Xiaowen Shi, Chuheng Zhang, Yongkang Wang, Xingxing Wang, Dong Wang

Our model results in higher revenue and better user experience than state-of-the-art baselines in offline experiments.

PIER: Permutation-Level Interest-Based End-to-End Re-ranking Framework in E-commerce

1 code implementation 6 Feb 2023 Xiaowen Shi, Fan Yang, Ze Wang, Xiaoxu Wu, Muzhi Guan, Guogang Liao, Yongkang Wang, Xingxing Wang, Dong Wang

Then we design a novel omnidirectional attention mechanism in OCPM to capture the context information in the permutation.

Re-Ranking

Deep Speaker Vector Normalization with Maximum Gaussianality Training

1 code implementation 30 Oct 2020 Yunqi Cai, Lantian Li, Dong Wang, Andrew Abel

In this paper, we argue that this problem is largely attributed to the maximum-likelihood (ML) training criterion of the DNF model, which aims to maximize the likelihood of the observations but not necessarily improve the Gaussianality of the latent codes.

Speaker Recognition

Voting for the right answer: Adversarial defense for speaker verification

1 code implementation 15 Jun 2021 Haibin Wu, Yang Zhang, Zhiyong Wu, Dong Wang, Hung-Yi Lee

Automatic speaker verification (ASV) is a well-developed technology for biometric identification and has been ubiquitously implemented in security-critical applications, such as banking and access control.

Adversarial Defense Speaker Verification

Contrastive Domain Adaptation for Early Misinformation Detection: A Case Study on COVID-19

2 code implementations 20 Aug 2022 Zhenrui Yue, Huimin Zeng, Ziyi Kou, Lanyu Shang, Dong Wang

However, early misinformation often demonstrates both conditional and label shifts against existing misinformation data (e.g., class imbalance in COVID-19 datasets), rendering such methods less effective for detecting early misinformation.

Domain Adaptation Misinformation

QA Domain Adaptation using Hidden Space Augmentation and Self-Supervised Contrastive Adaptation

1 code implementation 19 Oct 2022 Zhenrui Yue, Huimin Zeng, Bernhard Kratzwald, Stefan Feuerriegel, Dong Wang

Unlike existing approaches, we generate pseudo labels and propose to train the model via a novel attention-based contrastive adaptation method.

Contrastive Learning Data Augmentation +2

AP18-OLR Challenge: Three Tasks and Their Baselines

1 code implementation 2 Jun 2018 Zhiyuan Tang, Dong Wang, Qing Chen

The third oriental language recognition (OLR) challenge AP18-OLR is introduced in this paper, including the data profile, the tasks and the evaluation principles.

Open Set Learning

AP17-OLR Challenge: Data, Plan, and Baseline

1 code implementation 28 Jun 2017 Zhiyuan Tang, Dong Wang, Yixiang Chen, Qing Chen

We present the data profile and the evaluation plan of the second oriental language recognition (OLR) challenge AP17-OLR.

Neural Image Re-Exposure

1 code implementation 23 May 2023 Xinyu Zhang, Hefei Huang, Xu Jia, Dong Wang, Huchuan Lu

In this work, we aim to re-expose the captured photo in post-processing to provide a more flexible way of addressing those issues within a unified framework.

Ranked #4 on Deblurring on GoPro (using extra training data)

Deblurring Joint Deblur and Frame Interpolation +5

signADAM: Learning Confidences for Deep Neural Networks

1 code implementation 21 Jul 2019 Dong Wang, Yicheng Liu, Wenwo Tang, Fanhua Shang, Hongying Liu, Qigong Sun, Licheng Jiao

In this paper, we propose a new first-order gradient-based algorithm to train deep neural networks.

Domain-Invariant Speaker Vector Projection by Model-Agnostic Meta-Learning

1 code implementation 25 May 2020 Jiawen Kang, Ruiqi Liu, Lantian Li, Yunqi Cai, Dong Wang, Thomas Fang Zheng

Domain generalization remains a critical problem for speaker recognition, even with the state-of-the-art architectures based on deep neural nets.

Audio and Speech Processing

Rényi State Entropy for Exploration Acceleration in Reinforcement Learning

1 code implementation 8 Mar 2022 Mingqi Yuan, Man-on Pun, Dong Wang

One of the most critical challenges in deep reinforcement learning is to maintain the long-term exploration capability of the agent.

reinforcement-learning Reinforcement Learning (RL)

Deep generative LDA

1 code implementation 30 Oct 2020 Yunqi Cai, Dong Wang

Limited by its linear form and the underlying Gaussian assumption, however, LDA is not applicable in situations where the data distribution is complex.

Dimensionality Reduction Speaker Recognition

Federated Recommendation via Hybrid Retrieval Augmented Generation

1 code implementation 7 Mar 2024 Huimin Zeng, Zhenrui Yue, Qian Jiang, Dong Wang

To this end, we propose GPT-FedRec, a federated recommendation framework leveraging ChatGPT and a novel hybrid Retrieval Augmented Generation (RAG) mechanism.

Hallucination Privacy Preserving +2

Deep Normalization for Speaker Vectors

1 code implementation 7 Apr 2020 Yunqi Cai, Lantian Li, Dong Wang, Andrew Abel

Deep speaker embedding has demonstrated state-of-the-art performance in speaker recognition tasks.

Speaker Recognition

Towards Nonlinear-Motion-Aware and Occlusion-Robust Rolling Shutter Correction

1 code implementation ICCV 2023 Delin Qu, Yizhen Lao, Zhigang Wang, Dong Wang, Bin Zhao, Xuelong Li

This paper addresses the problem of rolling shutter correction in complex nonlinear and dynamic scenes with extreme occlusion.

Rolling Shutter Correction

Subspace-Configurable Networks

1 code implementation 22 May 2023 Olga Saukh, Dong Wang, Xiaoxi He, Lothar Thiele

The obtained subspace is low-dimensional and has a surprisingly simple structure even for complex, non-invertible transformations of the input, leading to an exceptionally high efficiency of subspace-configurable networks (SCNs) when limited storage and computing resources are at stake.

Audio Signal Processing Data Augmentation

Gradient Importance Learning for Incomplete Observations

1 code implementation ICLR 2022 Qitong Gao, Dong Wang, Joshua D. Amason, Siyang Yuan, Chenyang Tao, Ricardo Henao, Majda Hadziahmetovic, Lawrence Carin, Miroslav Pajic

Though recent works have developed methods that can generate estimates (or imputations) of the missing entries in a dataset to facilitate downstream analysis, most depend on assumptions that may not align with real-world applications and could suffer from poor performance in subsequent tasks such as classification.

Imputation Reinforcement Learning (RL) +2

Deployable Reinforcement Learning with Variable Control Rate

1 code implementation 17 Jan 2024 Dong Wang, Giovanni Beltrame

Unfortunately, the system should be controlled at the highest, worst-case frequency to ensure stability, which can demand significant computational and energy resources and hinder the deployability of the controller on onboard hardware.

reinforcement-learning Reinforcement Learning (RL)

Exploring Linear Relationship in Feature Map Subspace for ConvNets Compression

no code implementations 15 Mar 2018 Dong Wang, Lei Zhou, Xueni Zhang, Xiao Bai, Jun Zhou

In this way, most of the representative information in the network can be retained in each cluster.

Clustering

Deep factorization for speech signal

no code implementations 27 Feb 2018 Lantian Li, Dong Wang, Yixiang Chen, Ying Shi, Zhiyuan Tang, Thomas Fang Zheng

Various informative factors are mixed in speech signals, leading to great difficulty when decoding any of the factors.

Emotion Recognition Speaker Recognition

Full-info Training for Deep Speaker Feature Learning

no code implementations 31 Oct 2017 Lantian Li, Zhiyuan Tang, Dong Wang, Thomas Fang Zheng

Recent studies have shown that speaker patterns can be learned from very short speech segments (e.g., 0.3 seconds) by a carefully designed convolutional & time-delay deep neural network (CT-DNN) model.

Speaker Verification

Non-rigid Object Tracking via Deep Multi-scale Spatial-temporal Discriminative Saliency Maps

no code implementations 22 Feb 2018 Pingping Zhang, Wei Liu, Dong Wang, Yinjie Lei, Hongyu Wang, Chunhua Shen, Huchuan Lu

Extensive experiments demonstrate that the proposed algorithm achieves competitive performance in both saliency detection and visual tracking, especially outperforming other related trackers on the non-rigid object tracking datasets.

Object Object Tracking +2

Medical Diagnosis From Laboratory Tests by Combining Generative and Discriminative Learning

no code implementations 12 Nov 2017 Shiyue Zhang, Pengtao Xie, Dong Wang, Eric P. Xing

In hospital, physicians rely on massive clinical data to make diagnosis decisions, among which laboratory tests are one of the most important resources.

Decision Making Imputation +1

Phonetic Temporal Neural Model for Language Identification

no code implementations 9 May 2017 Zhiyuan Tang, Dong Wang, Yixiang Chen, Lantian Li, Andrew Abel

Deep neural models, particularly the LSTM-RNN model, have shown great potential for language identification (LID).

Language Identification

Memory-augmented Neural Machine Translation

no code implementations EMNLP 2017 Yang Feng, Shiyue Zhang, Andi Zhang, Dong Wang, Andrew Abel

Neural machine translation (NMT) has achieved notable success in recent times; however, it is also widely recognized that this approach has limitations in handling infrequent words and word pairs.

Machine Translation NMT +1

Deep Factorization for Speech Signal

no code implementations 5 Jun 2017 Dong Wang, Lantian Li, Ying Shi, Yixiang Chen, Zhiyuan Tang

In this paper, we demonstrated that the speaker factor is also a short-time spectral pattern and can be largely identified with just a few frames using a simple deep neural network (DNN).

Emotion Recognition

Deep Speaker Verification: Do We Need End to End?

no code implementations 22 Jun 2017 Dong Wang, Lantian Li, Zhiyuan Tang, Thomas Fang Zheng

This principle has recently been applied in several prototype studies on speaker verification (SV), where the feature learning and classifier are learned together with an objective function that is consistent with the evaluation metric.

Speaker Verification

Cross-lingual Speaker Verification with Deep Feature Learning

no code implementations 22 Jun 2017 Lantian Li, Dong Wang, Askar Rozi, Thomas Fang Zheng

The experiments demonstrated that the feature-based system outperformed the i-vector system with a large margin, particularly with language mismatch between enrollment and test.

Speaker Verification

Speaker Recognition with Cough, Laugh and "Wei"

no code implementations 22 Jun 2017 Miao Zhang, Yixiang Chen, Lantian Li, Dong Wang

This paper proposes a speaker recognition (SRE) task with trivial speech events, such as cough and laugh.

Speaker Recognition

Weakly Supervised PLDA Training

no code implementations 27 Sep 2016 Lantian Li, Yixiang Chen, Dong Wang, Chenghui Zhao

PLDA is a popular normalization approach for the i-vector model, and it has delivered state-of-the-art performance in speaker verification.

Speaker Verification

Collaborative Learning for Language and Speaker Recognition

no code implementations 27 Sep 2016 Lantian Li, Zhiyuan Tang, Dong Wang, Andrew Abel, Yang Feng, Shiyue Zhang

This paper presents a unified model to perform language and speaker recognition simultaneously and altogether.

Speaker Recognition

Phone-aware Neural Language Identification

no code implementations 9 May 2017 Zhiyuan Tang, Dong Wang, Yixiang Chen, Ying Shi, Lantian Li

Pure acoustic neural models, particularly the LSTM-RNN model, have shown great potential in language identification (LID).

Language Identification

Flexible and Creative Chinese Poetry Generation Using Neural Memory

no code implementations ACL 2017 Jiyuan Zhang, Yang Feng, Dong Wang, Yang Wang, Andrew Abel, Shiyue Zhang, Andi Zhang

It has been shown that Chinese poems can be successfully generated by sequence-to-sequence neural models, particularly with the attention mechanism.

Memory Visualization for Gated Recurrent Neural Networks in Speech Recognition

no code implementations 28 Sep 2016 Zhiyuan Tang, Ying Shi, Dong Wang, Yang Feng, Shiyue Zhang

Recurrent neural networks (RNNs) have shown clear superiority in sequence modeling, particularly the ones with gated units, such as long short-term memory (LSTM) and gated recurrent unit (GRU).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

AP16-OL7: A Multilingual Database for Oriental Languages and A Language Recognition Baseline

no code implementations 27 Sep 2016 Dong Wang, Lantian Li, Difei Tang, Qing Chen

We present the AP16-OL7 database, which was released as the training and test data for the oriental language recognition (OLR) challenge at APSIPA 2016.

System Combination for Short Utterance Speaker Recognition

no code implementations 31 Mar 2016 Lantian Li, Dong Wang, Xiaodong Zhang, Thomas Fang Zheng, Panshi Jin

This paper presents a combination approach to the SUSR tasks with two phonetic-aware systems: one is the DNN-based i-vector system and the other is our recently proposed subregion-based GMM-UBM system.

Speaker Recognition

Local Training for PLDA in Speaker Verification

no code implementations 27 Sep 2016 Chenghui Zhao, Lantian Li, Dong Wang, April Pu

PLDA is a popular normalization approach for the i-vector model, and it has delivered state-of-the-art performance in speaker verification.

Speaker Verification

OC16-CE80: A Chinese-English Mixlingual Database and A Speech Recognition Baseline

no code implementations 27 Sep 2016 Dong Wang, Zhiyuan Tang, Difei Tang, Qing Chen

We present the OC16-CE80 Chinese-English mixlingual speech database, which was released as a main resource for training, development and test for the Chinese-English mixlingual speech recognition (MixASR-CHEN) challenge at O-COCOSDA 2016.

speech-recognition Speech Recognition

Multi-task Recurrent Model for Speech and Speaker Recognition

no code implementations 31 Mar 2016 Zhiyuan Tang, Lantian Li, Dong Wang

Although highly correlated, speech and speaker recognition have been regarded as two independent tasks and studied by two communities.

Speaker Recognition

Can Machine Generate Traditional Chinese Poetry? A Feigenbaum Test

no code implementations 19 Jun 2016 Qixin Wang, Tianyi Luo, Dong Wang

Recent progress in neural learning demonstrated that machines can do well in regularized tasks, e.g., the game of Go.

Game of Go

Recurrent Neural Network Training with Dark Knowledge Transfer

no code implementations 18 May 2015 Zhiyuan Tang, Dong Wang, Zhiyong Zhang

Recent research found that a well-trained model can be used as a teacher to train other child models, by using the predictions generated by the teacher model as supervision.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Bayesian Neighbourhood Component Analysis

no code implementations 8 Apr 2016 Dong Wang, Xiaoyang Tan

Learning a good distance metric in feature space potentially improves the performance of the KNN classifier and is useful in many real-world applications.

Bayesian Optimization Metric Learning

Binary Speaker Embedding

no code implementations 20 Oct 2015 Lantian Li, Dong Wang, Chao Xing, Kaimin Yu, Thomas Fang Zheng

The popular i-vector model represents speakers as low-dimensional continuous vectors (i-vectors), and hence it is a way of continuous speaker embedding.

Binarization Speaker Verification

Max-margin Metric Learning for Speaker Recognition

no code implementations 20 Oct 2015 Lantian Li, Dong Wang, Chao Xing, Thomas Fang Zheng

Probabilistic linear discriminant analysis (PLDA) is a popular normalization approach for the i-vector model, and has delivered state-of-the-art performance in speaker recognition.

Metric Learning Speaker Recognition

A Universal Update-pacing Framework For Visual Tracking

1 code implementation 1 Mar 2016 Zexi Hu, Yuefang Gao, Dong Wang, Xuhong Tian

Given a base tracker, an ensemble of trackers is generated in which each tracker's update behavior is paced; each tracker then traces the target object forward and backward to generate a pair of trajectories in an interval.

Visual Tracking

Relation Classification via Recurrent Neural Network

1 code implementation 5 Aug 2015 Dongxu Zhang, Dong Wang

Deep learning has gained much success in sentence-level relation classification.

Classification Feature Engineering +4

Transfer Learning for Speech and Language Processing

no code implementations 19 Nov 2015 Dong Wang, Thomas Fang Zheng

Transfer learning is a vital technique that generalizes models trained for one setting or task to other settings or tasks.

Multi-Task Learning speech-recognition +1

Deep Representation of Facial Geometric and Photometric Attributes for Automatic 3D Facial Expression Recognition

no code implementations 10 Nov 2015 Huibin Li, Jian Sun, Dong Wang, Zongben Xu, Liming Chen

In this paper, we present a novel approach to automatic 3D Facial Expression Recognition (FER) based on deep representation of facial 3D geometric and 2D photometric attributes.

3D Facial Expression Recognition Facial Expression Recognition

Stochastic Top-k ListNet

no code implementations EMNLP 2015 Tianyi Luo, Dong Wang, Rong Liu, Yiqiao Pan

ListNet is a well-known listwise learning to rank model and has gained much attention in recent years.

Learning-To-Rank

Learning from LDA using Deep Neural Networks

no code implementations 5 Aug 2015 Dongxu Zhang, Tianyi Luo, Dong Wang, Rong Liu

Latent Dirichlet Allocation (LDA) is a three-level hierarchical Bayesian model for topic inference.

Document Classification General Classification +1

VMF-SNE: Embedding for Spherical Data

no code implementations 30 Jul 2015 Mian Wang, Dong Wang

This assumption does not hold for a wide range of data types in practical applications, for instance spherical data for which the local proximity is better modelled by the von Mises-Fisher (vMF) distribution instead of the Gaussian.

Data Visualization

Improved Deep Speaker Feature Learning for Text-Dependent Speaker Recognition

no code implementations 28 Jun 2015 Lantian Li, Yiye Lin, Zhiyong Zhang, Dong Wang

A deep learning approach has been proposed recently to derive speaker identities (d-vectors) with a deep neural network (DNN).

Dynamic Time Warping Speaker Recognition

Recognize Foreign Low-Frequency Words with Similar Pairs

no code implementations 16 Jun 2015 Xi Ma, Xiaoxi Wang, Dong Wang, Zhiyong Zhang

We also employ this approach to deal with out-of-language words in the task of multi-lingual speech recognition.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Knowledge Transfer Pre-training

no code implementations 7 Jun 2015 Zhiyuan Tang, Dong Wang, Yiqiao Pan, Zhiyong Zhang

Compared to the conventional layer-wise methods, this new method does not depend on the model structure, so it can be used to pre-train very complex models.

speech-recognition Speech Recognition +1

Learning Speech Rate in Speech Recognition

no code implementations 2 Jun 2015 Xiangyu Zeng, Shi Yin, Dong Wang

A significant performance reduction is often observed in speech recognition when the rate of speech (ROS) is too low or too high.

speech-recognition Speech Recognition

Unsupervised Feature Learning with C-SVDDNet

no code implementations 23 Dec 2014 Dong Wang, Xiaoyang Tan

To address this issue, we propose an SVDD-based feature learning algorithm that describes the density and distribution of each cluster from K-means with an SVDD ball for more robust feature representation.

Image Classification Object Recognition

Deep Speaker Vectors for Semi Text-independent Speaker Verification

no code implementations 24 May 2015 Lantian Li, Dong Wang, Zhiyong Zhang, Thomas Fang Zheng

Recent research shows that deep neural networks (DNNs) can be used to extract deep speaker vectors (d-vectors) that preserve speaker characteristics and can be used in speaker verification.

Speaker Recognition Text-Dependent Speaker Verification +2

Chinese Poetry Generation with Flexible Styles

no code implementations 17 Jul 2018 Jiyuan Zhang, Dong Wang

Research has shown that sequence-to-sequence neural models, particularly those with the attention mechanism, can successfully generate classical Chinese poems.

Gaussian-Constrained training for speaker verification

no code implementations 8 Nov 2018 Lantian Li, Zhiyuan Tang, Ying Shi, Dong Wang

This paper proposes a Gaussian-constrained training approach that (1) discards the parametric classifier, and (2) enforces the distribution of the derived speaker vectors to be Gaussian.

Speaker Verification

Phonetic-attention scoring for deep speaker features in speaker verification

no code implementations 8 Nov 2018 Lantian Li, Zhiyuan Tang, Ying Shi, Dong Wang

This score reflects the similarity of the two frames in phonetic content, and is used to weigh the contribution of this frame pair in the utterance-based scoring.

Machine Translation Speaker Verification +1

Discourse Mode Identification in Essays

no code implementations ACL 2017 Wei Song, Dong Wang, Ruiji Fu, Lizhen Liu, Ting Liu, Guoping Hu

Evaluation results show that discourse modes can be identified automatically with an average F1-score of 0.7.

Defocus Blur Detection via Multi-Stream Bottom-Top-Bottom Fully Convolutional Network

no code implementations CVPR 2018 Wenda Zhao, Fan Zhao, Dong Wang, Huchuan Lu

To address these issues, we propose a multi-stream bottom-top-bottom fully convolutional network (BTBNet), which is the first attempt to develop an end-to-end deep network for DBD.

Defocus Blur Detection Defocus Estimation

Structured Siamese Network for Real-Time Visual Tracking

no code implementations ECCV 2018 Yunhua Zhang, Lijun Wang, Jinqing Qi, Dong Wang, Mengyang Feng, Huchuan Lu

In this paper, we circumvent this issue by proposing a local structure learning method, which simultaneously considers the local patterns of the target and their structural relationships for more accurate target tracking.

Real-Time Visual Tracking

Real-time 'Actor-Critic' Tracking

no code implementations ECCV 2018 Boyu Chen, Dong Wang, Peixia Li, Shuang Wang, Huchuan Lu

In this work, we propose a novel tracking algorithm with real-time performance based on the ‘Actor-Critic’ framework.

Visual Tracking

Globally Soft Filter Pruning For Efficient Convolutional Neural Networks

no code implementations ICLR 2019 Ke Xu, Xiao-Yun Wang, Qun Jia, Jianjing An, Dong Wang

Therefore, accumulating the saliency of the filter over the entire data set can provide more accurate guidance for pruning.

CONet: A Cognitive Ocean Network

no code implementations 9 Jan 2019 Huimin Lu, Dong Wang, Yujie Li, Jianru Li, Xin Li, Hyoungseop Kim, Seiichi Serikawa, Iztok Humar

The Cognitive Ocean Network (CONet) will become the mainstream of future ocean science and engineering developments.

Least Soft-Threshold Squares Tracking

no code implementations CVPR 2013 Dong Wang, Huchuan Lu, Ming-Hsuan Yang

In this paper, we propose a generative tracking method based on a novel robust linear regression algorithm.

Visual Tracking via Probability Continuous Outlier Model

no code implementations CVPR 2014 Dong Wang, Huchuan Lu

In this paper, we present a novel online visual tracking method based on linear representation.

Visual Tracking

Stepwise Metric Promotion for Unsupervised Video Person Re-Identification

no code implementations ICCV 2017 Zimo Liu, Dong Wang, Huchuan Lu

The intensive annotation cost and the rich but unlabeled data contained in videos motivate us to propose an unsupervised video-based person re-identification (re-ID) method.

Retrieval Video-Based Person Re-Identification

VAE-based regularization for deep speaker embedding

no code implementations 7 Apr 2019 Yang Zhang, Lantian Li, Dong Wang

Deep speaker embedding has achieved state-of-the-art performance in speaker recognition.

Speaker Recognition

Listen to the Image

no code implementations CVPR 2019 Di Hu, Dong Wang, Xuelong Li, Feiping Nie, Qi Wang

different encoding schemes indicate that using a machine model to accelerate optimization evaluation and reduce experimental cost is feasible to some extent, which could dramatically promote the upgrading of encoding schemes and thus help the blind improve their visual perception ability.

Translation

The iterative convolution-thresholding method (ICTM) for image segmentation

no code implementations 24 Apr 2019 Dong Wang, Xiao-Ping Wang

In this paper, we propose a novel iterative convolution-thresholding method (ICTM) that is applicable to a range of variational models for image segmentation.

Image Segmentation Segmentation +1

Early Action Prediction with Generative Adversarial Networks

no code implementations 30 Apr 2019 Dong Wang, Yuan Yuan, Qi Wang

Action prediction aims to determine what action is occurring in a video as early as possible, which is crucial to many online applications, such as predicting a traffic accident before it happens and detecting malicious actions in a monitoring system.

Early Action Prediction Generative Adversarial Network

Memory-Augmented Temporal Dynamic Learning for Action Recognition

no code implementations 30 Apr 2019 Yuan Yuan, Dong Wang, Qi Wang

Human actions captured in video sequences contain two crucial factors for action recognition, i.e., visual appearance and motion dynamics.

Action Recognition Temporal Action Localization

Anomaly Detection in Traffic Scenes via Spatial-aware Motion Reconstruction

no code implementations 30 Apr 2019 Yuan Yuan, Dong Wang, Qi Wang

3) Results of motion orientation and magnitude are adaptively weighted and fused by a Bayesian model, which makes the proposed method more robust and able to handle more kinds of abnormal events.

Anomaly Detection Autonomous Vehicles

Cross-Modal Message Passing for Two-stream Fusion

no code implementations 30 Apr 2019 Dong Wang, Yuan Yuan, Qi Wang

The classification objective ensures that each modal network predicts the true action category, while the competing objective encourages each modal network to outperform the other one.

Action Recognition General Classification +3

Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for Regression Problems

no code implementations 28 May 2019 Tianle Cai, Ruiqi Gao, Jikai Hou, Siyu Chen, Dong Wang, Di He, Zhihua Zhang, Li-Wei Wang

First-order methods such as stochastic gradient descent (SGD) are currently the standard algorithm for training deep neural networks.

regression Second-order methods

A One-step Pruning-recovery Framework for Acceleration of Convolutional Neural Networks

no code implementations 18 Jun 2019 Dong Wang, Lei Zhou, Xiao Bai, Jun Zhou

Our method accelerates the network in a one-step pruning-recovery manner with a novel optimization objective function, which achieves higher accuracy at much lower cost compared with existing pruning methods.

LMVP: Video Predictor with Leaked Motion Information

no code implementations 24 Jun 2019 Dong Wang, Yitong Li, Wei Cao, Liqun Chen, Qi Wei, Lawrence Carin

We propose a Leaked Motion Video Predictor (LMVP) to predict future frames by capturing the spatial and temporal dependencies from given inputs.

A Preliminary Study on Data Augmentation of Deep Learning for Image Classification

no code implementations 9 Jun 2019 Benlin Hu, Cheng Lei, Dong Wang, Shu Zhang, Zhenyu Chen

Deep learning models have a large number of free parameters that need to be calculated by effective training of the models on a great deal of training data to improve their generalization performance.

Data Augmentation General Classification +1

Towards Reliable Online Clickbait Video Detection: A Content-Agnostic Approach

no code implementations 17 Jul 2019 Lanyu Shang, Daniel Zhang, Michael Wang, Shuyue Lai, Dong Wang

Current clickbait detection solutions that mainly focus on analyzing the text of the title, the image of the thumbnail, or the content of the video are shown to be suboptimal in detecting online clickbait videos.

Clickbait Detection

AP19-OLR Challenge: Three Tasks and Their Baselines

no code implementations 16 Jul 2019 Zhiyuan Tang, Dong Wang, Li-Ming Song

The participants can refer to these online-published recipes to deploy LID systems for convenience.

VAE-based Domain Adaptation for Speaker Verification

no code implementations 27 Aug 2019 Xueyi Wang, Lantian Li, Dong Wang

By forcing the neural model to discriminate the speakers in the training set, deep speaker embeddings (called "x-vectors") can be derived from the hidden layers.

Domain Adaptation Speaker Verification

An Online Reinforcement Learning Approach to Quality-Cost-Aware Task Allocation for Multi-Attribute Social Sensing

no code implementations 11 Sep 2019 Yang Zhang, Daniel Zhang, Nathan Vance, Dong Wang

Social sensing has emerged as a new sensing paradigm where humans (or devices on their behalf) collectively report measurements about the physical world.

Attribute

On Investigation of Unsupervised Speech Factorization Based on Normalization Flow

no code implementations 29 Oct 2019 Haoran Sun, Yunqi Cai, Lantian Li, Dong Wang

Speech signals are complex composites of various information, including phonetic content, speaker traits, channel effect, etc.

Curriculum Audiovisual Learning

no code implementations 26 Jan 2020 Di Hu, Zheng Wang, Haoyi Xiong, Dong Wang, Feiping Nie, Dejing Dou

Associating a sound with its producer in a complex audiovisual scene is a challenging task, especially when annotated training data is lacking.

Clustering

Deep Variational Luenberger-type Observer for Stochastic Video Prediction

no code implementations 12 Feb 2020 Dong Wang, Feng Zhou, Zheng Yan, Guang Yao, Zongxuan Liu, Wennan Ma, Cewu Lu

Our model builds upon a variational encoder, which transforms the input video into a latent feature space, and a Luenberger-type observer, which captures the dynamic evolution of the latent features.

Representation Learning Video Prediction +1

Graph Representation Learning for Merchant Incentive Optimization in Mobile Payment Marketing

no code implementations 27 Feb 2020 Ziqi Liu, Dong Wang, Qianyu Yu, Zhiqiang Zhang, Yue Shen, Jian Ma, Wenliang Zhong, Jinjie Gu, Jun Zhou, Shuang Yang, Yuan Qi

In this paper, we present a graph representation learning method on top of transaction networks for merchant incentive optimization in mobile payment marketing.

Graph Representation Learning Marketing

CovidSens: A Vision on Reliable Social Sensing for COVID-19

no code implementations 9 Apr 2020 Md Tahmid Rashid, Dong Wang

In this vision paper, we discuss the roles of CovidSens and identify potential challenges in developing reliable social sensing based risk alert systems.

Misinformation

Stabilizing Training of Generative Adversarial Nets via Langevin Stein Variational Gradient Descent

no code implementations 22 Apr 2020 Dong Wang, Xiaoqian Qin, Fengyi Song, Li Cheng

Generative adversarial networks (GANs), famous for the capability of learning complex underlying data distribution, are however known to be tricky in the training process, which would probably result in mode collapse or performance deterioration.

Variational Inference

An efficient iterative method for reconstructing surface from point clouds

no code implementations 25 May 2020 Dong Wang

In this paper, we develop an efficient iterative method on a variational model for the surface reconstruction from point clouds.

Surface Reconstruction

Improve bone age assessment by learning from anatomical local regions

no code implementations 27 May 2020 Dong Wang, Kexin Zhang, Jia Ding, Li-Wei Wang

In clinical practice, the Tanner and Whitehouse (TW2) method is widely used by radiologists to perform bone age assessment (BAA).

DASC: Towards A Road Damage-Aware Social-Media-Driven Car Sensing Framework for Disaster Response Applications

no code implementations 4 Jun 2020 Md Tahmid Rashid, Daniel Zhang, Dong Wang

iii) How to efficiently guide the cars to the event locations with little prior knowledge of the road damage caused by the disaster, while also handling the dynamics of the physical world and social media?

Disaster Response

AP20-OLR Challenge: Three Tasks and Their Baselines

no code implementations 4 Jun 2020 Zheng Li, Miao Zhao, Qingyang Hong, Lin Li, Zhiyuan Tang, Dong Wang, Li-Ming Song, Cheng Yang

Based on Kaldi and PyTorch, recipes for i-vector and x-vector systems are also provided as baselines for the three tasks.

Dialect Identification

A Characteristic Function-based Algorithm for Geodesic Active Contours

no code implementations 1 Jul 2020 Jun Ma, Dong Wang, Xiao-Ping Wang, Xiaoping Yang

Active contour models have been widely used in image segmentation, and the level set method (LSM) is the most popular approach for solving the models, via implicitly representing the contour by a level set function.

Image Segmentation Lesion Segmentation +2

Jointly Modeling Motion and Appearance Cues for Robust RGB-T Tracking

no code implementations 4 Jul 2020 Pengyu Zhang, Jie Zhao, Dong Wang, Huchuan Lu, Xiaoyun Yang

In this study, we propose a novel RGB-T tracking framework by jointly modeling both appearance and motion cues.

Rgb-T Tracking

Towards Robust and Efficient Contrastive Textual Representation Learning

no code implementations 1 Jan 2021 Liqun Chen, Yizhe Zhang, Dianqi Li, Chenyang Tao, Dong Wang, Lawrence Carin

There has been growing interest in representation learning for text data, based on theoretical arguments and empirical evidence.

Contrastive Learning Representation Learning

Remarks on Optimal Scores for Speaker Recognition

no code implementations 10 Oct 2020 Dong Wang

In this article, we first establish the theory of optimal scores for speaker recognition.

Speaker Identification Speaker Recognition +1

Consistency of archetypal analysis

no code implementations 16 Oct 2020 Braxton Osting, Dong Wang, Yiming Xu, Dominique Zosso

Archetypal analysis is an unsupervised learning method that uses a convex polytope to summarize multivariate data.

Squeezing value of cross-domain labels: a decoupled scoring approach for speaker verification

no code implementations 27 Oct 2020 Lantian Li, Yang Zhang, Jiawen Kang, Thomas Fang Zheng, Dong Wang

Domain mismatch often occurs in real applications and causes serious performance reduction on speaker verification systems.

Speaker Verification

Deep generative factorization for speech signal

no code implementations 27 Oct 2020 Haoran Sun, Lantian Li, Yunqi Cai, Yang Zhang, Thomas Fang Zheng, Dong Wang

Various information factors are blended in speech signals, which forms the primary difficulty for most speech information processing tasks.

Can We Trust Deep Speech Prior?

no code implementations 4 Nov 2020 Ying Shi, Haolin Chen, Zhiyuan Tang, Lantian Li, Dong Wang, Jiqing Han

Recently, speech enhancement (SE) based on deep speech prior has attracted much attention, such as the variational auto-encoder with non-negative matrix factorization (VAE-NMF) architecture.

Speech Enhancement

Wasserstein Contrastive Representation Distillation

no code implementations CVPR 2021 Liqun Chen, Dong Wang, Zhe Gan, Jingjing Liu, Ricardo Henao, Lawrence Carin

The primary goal of knowledge distillation (KD) is to encapsulate the information of a model learned from a teacher network into a student network, with the latter being more compact than the former.

Contrastive Learning Knowledge Distillation +2

Summarize before Aggregate: A Global-to-local Heterogeneous Graph Inference Network for Conversational Emotion Recognition

no code implementations COLING 2020 Dongming Sheng, Dong Wang, Ying Shen, Haitao Zheng, Haozhuang Liu

Local dependencies, which capture short-term emotional effects between neighbouring utterances, are further injected via an Aggregation Graph to distinguish the subtle differences between utterances containing emotional phrases.

Emotion Recognition in Conversation

A Streaming End-to-End Framework For Spoken Language Understanding

no code implementations 20 May 2021 Nihal Potdar, Anderson R. Avila, Chao Xing, Dong Wang, Yiran Cao, Xiao Chen

In this paper, we propose a streaming end-to-end framework that can process multiple intentions in an online and incremental way.

Intent Detection Keyword Spotting +3
