Search Results for author: Tao Wang

Found 186 papers, 64 papers with code

Task-oriented Domain-specific Meta-Embedding for Text Classification

no code implementations EMNLP 2020 Xin Wu, Yi Cai, Yang Kai, Tao Wang, Qing Li

Meta-embedding learning, which combines complementary information in different word embeddings, has shown superior performance across different Natural Language Processing tasks.

General Classification text-classification +2

Estimation and Testing of Forecast Rationality with Many Moments

no code implementations 18 Sep 2023 Tae-Hwy Lee, Tao Wang

In this paper, we use the P-GMM (Cheng and Liao, 2015) moment selection procedure to select valid and relevant moments for estimating and testing forecast rationality under the flexible loss proposed by Elliott et al. (2005).

Base Station Beamforming Design for Near-field XL-IRS Beam Training

no code implementations 12 Sep 2023 Tao Wang, Changsheng You, Changchuan Yin

However, this approach may cause degraded beam training performance in practice due to the near-field channel model of the BS-IRS link.

Deep Video Restoration for Under-Display Camera

no code implementations 9 Sep 2023 Xuanxi Chen, Tao Wang, Ziqian Shao, Kaihao Zhang, Wenhan Luo, Tong Lu, Zikun Liu, Tae-Kyun Kim, Hongdong Li

With the pipeline, we build the first large-scale UDC video restoration dataset called PexelsUDC, which includes two subsets named PexelsUDC-T and PexelsUDC-P corresponding to different displays for UDC.

Video Restoration

CPSP: Learning Speech Concepts From Phoneme Supervision

no code implementations 1 Sep 2023 Chunyu Qiang, Hao Li, Yixin Tian, Ruibo Fu, Tao Wang, Longbiao Wang, Jianwu Dang

For fine-grained generation and recognition tasks such as minimally-supervised text-to-speech (TTS), voice conversion (VC), and automatic speech recognition (ASR), the intermediate representations extracted from speech should serve as a "bridge" between text and acoustic information, containing information from both modalities.

Audio Classification Automatic Speech Recognition +5

Semantic-aware Consistency Network for Cloth-changing Person Re-Identification

1 code implementation 27 Aug 2023 Peini Guo, Hong Liu, Jianbing Wu, Guoquan Wang, Tao Wang

Despite recent progress in CC-ReID, existing approaches are still hindered by the interference of clothing variations since they lack effective constraints to keep the model consistently focused on clothing-irrelevant regions.

Person Re-Identification

FashionLOGO: Prompting Multimodal Large Language Models for Fashion Logo Embeddings

1 code implementation 17 Aug 2023 Yulin Su, Min Yang, Minghui Qiu, Jing Wang, Tao Wang

Logo embedding plays a crucial role in various e-commerce applications by facilitating image retrieval or recognition, such as intellectual property protection and product search.

Image Retrieval Optical Character Recognition (OCR)

Development of a Knowledge Graph Embeddings Model for Pain

no code implementations 17 Aug 2023 Jaya Chaturvedi, Tao Wang, Sumithra Velupillai, Robert Stewart, Angus Roberts

This paper describes the construction of such knowledge graph embedding models of pain concepts, extracted from the unstructured text of mental health electronic health records, combined with external knowledge created from relations described in SNOMED CT, and their evaluation on a subject-object link prediction task.

Knowledge Graph Embedding Knowledge Graph Embeddings +2

HGDNet: A Height-Hierarchy Guided Dual-Decoder Network for Single View Building Extraction and Height Estimation

no code implementations 10 Aug 2023 Chaoran Lu, Ningning Cao, Pan Zhang, Ting Liu, Baochai Peng, Guozhang Liu, Mengke Yuan, Sen Zhang, Simin Huang, Tao Wang

Unifying the correlated tasks of single-view satellite image building extraction and height estimation is a promising way to share representations and obtain a generalist model for large-scale urban 3D reconstruction.

3D Reconstruction

Fine-grained building roof instance segmentation based on domain adapted pretraining and composite dual-backbone

no code implementations 10 Aug 2023 Guozhang Liu, Baochai Peng, Ting Liu, Pan Zhang, Mengke Yuan, Chaoran Lu, Ningning Cao, Sen Zhang, Simin Huang, Tao Wang

The diversity of building architecture styles of global cities situated on various landforms, the degraded optical imagery affected by clouds and shadows, and the significant inter-class imbalance of roof types pose challenges for designing a robust and accurate building roof instance segmentor.

Data Augmentation Instance Segmentation +1

Deep Semantic Graph Matching for Large-scale Outdoor Point Clouds Registration

no code implementations 10 Aug 2023 Shaocong Liu, Tao Wang, Yan Zhang, Ruqin Zhou, Li Li, Chenguang Dai, Yongsheng Zhang, Hanyun Wang

First, the semantic category labels of 3D point clouds are obtained using a large-scale point cloud semantic segmentation network.

Graph Matching Point Cloud Registration +1

Few-shot Class-Incremental Semantic Segmentation via Pseudo-Labeling and Knowledge Distillation

1 code implementation 5 Aug 2023 Chengjia Jiang, Tao Wang, Sien Li, Jinyang Wang, Shirui Wang, Antonios Antoniou

Given only one or a few images labeled with the novel classes and a much larger set of unlabeled images, we transfer the knowledge from labeled images to unlabeled images with a coarse-to-fine pseudo-labeling approach in two steps.

Class-Incremental Semantic Segmentation Knowledge Distillation

Class-Specific Distribution Alignment for Semi-Supervised Medical Image Classification

no code implementations 29 Jul 2023 Zhongzheng Huang, Jiawei Wu, Tao Wang, Zuoyong Li, Anastasia Ioannou

Despite the success of deep neural networks in medical image classification, the problem remains challenging as data annotation is time-consuming, and the class distribution is imbalanced due to the relative scarcity of diseases.

Image Classification Semi-supervised Medical Image Classification

Minimally-Supervised Speech Synthesis with Conditional Diffusion Model and Language Model: A Comparative Study of Semantic Coding

no code implementations 28 Jul 2023 Chunyu Qiang, Hao Li, Hao Ni, He Qu, Ruibo Fu, Tao Wang, Longbiao Wang, Jianwu Dang

However, existing methods suffer from three problems: the high dimensionality and waveform distortion of discrete speech representations, the prosodic averaging problem caused by the duration prediction model in non-autoregressive frameworks, and the information redundancy and dimension explosion problems of existing semantic encoding methods.

Language Modelling Speech Synthesis

LLDiffusion: Learning Degradation Representations in Diffusion Models for Low-Light Image Enhancement

1 code implementation 27 Jul 2023 Tao Wang, Kaihao Zhang, Ziqian Shao, Wenhan Luo, Bjorn Stenger, Tae-Kyun Kim, Wei Liu, Hongdong Li

In this paper, we address this limitation by proposing a degradation-aware learning scheme for LLIE using diffusion models, which effectively integrates degradation and image priors into the diffusion process, resulting in improved image enhancement.

Image Generation Low-Light Image Enhancement

An Intelligent Remote Sensing Image Quality Inspection System

no code implementations 22 Jul 2023 Yijiong Yu, Tao Wang, Kang Ran, Chang Li, Hao Wu

Due to the inevitable presence of quality problems, remote sensing image quality inspection is indeed an indispensable step between the acquisition and the application of remote sensing images.

Image Classification Semantic Segmentation

Bridging the Gap: Multi-Level Cross-Modality Joint Alignment for Visible-Infrared Person Re-Identification

no code implementations 17 Jul 2023 Tengfei Liang, Yi Jin, Wu Liu, Tao Wang, Songhe Feng, Yidong Li

Visible-Infrared person Re-IDentification (VI-ReID) is a challenging cross-modality image retrieval task that aims to match pedestrians' images across visible and infrared cameras.

Image Classification Image Retrieval +3

Joint Adversarial and Collaborative Learning for Self-Supervised Action Recognition

1 code implementation 15 Jul 2023 Tianyu Guo, Mengyuan Liu, Hong Liu, Wenhao Li, Jingwen Guo, Tao Wang, Yidi Li

Considering the instance-level discriminative ability, contrastive learning methods, including MoCo and SimCLR, have been adapted from the original image representation learning task to solve the self-supervised skeleton-based action recognition task.

Contrastive Learning Ensemble Learning +4

BLEURT Has Universal Translations: An Analysis of Automatic Metrics by Minimum Risk Training

no code implementations 6 Jul 2023 Yiming Yan, Tao Wang, Chengqi Zhao, ShuJian Huang, Jiajun Chen, Mingxuan Wang

In this study, we systematically analyze and compare various mainstream and cutting-edge automatic metrics from the perspective of their guidance for training machine translation systems.

Machine Translation Translation

Seeing is not Believing: An Identity Hider for Human Vision Privacy Protection

no code implementations 2 Jul 2023 Tao Wang, Yushu Zhang, Zixuan Yang, Hua Zhang, Zhongyun Hua

Secondly, the visual content of the virtual face is transferred into the original face and then the background is replaced with the original one.

PCDAL: A Perturbation Consistency-Driven Active Learning Approach for Medical Image Segmentation and Classification

1 code implementation 29 Jun 2023 Tao Wang, Xinlin Zhang, Yuanbo Zhou, Junlin Lan, Tao Tan, Min Du, Qinquan Gao, Tong Tong

To address this limitation, we propose an AL-based method that can be simultaneously applied to 2D medical image classification, segmentation, and 3D medical image segmentation tasks.

Active Learning Image Classification +4

Valley: Video Assistant with Large Language model Enhanced abilitY

1 code implementation 12 Jun 2023 Ruipu Luo, Ziwang Zhao, Min Yang, Junwei DOng, Minghui Qiu, Pengcheng Lu, Tao Wang, Zhongyu Wei

Specifically, our proposed Valley model is designed with a simple projection module that bridges video, image, and language modalities, and is further unified with a multi-lingual LLM.

Action Recognition Instruction Following +4

Boosting Fast and High-Quality Speech Synthesis with Linear Diffusion

no code implementations 9 Jun 2023 Haogeng Liu, Tao Wang, Jie Cao, Ran He, JianHua Tao

When decreasing the number of sampling steps (i.e., the number of line segments used to fit the path), the ease of fitting straight lines compared to curves allows us to generate higher quality samples from a random noise with fewer iterations.

Denoising Speech Synthesis

Improving speech translation by fusing speech and text

no code implementations 23 May 2023 Wenbiao Yin, Zhicheng Liu, Chengqi Zhao, Tao Wang, Jian Tong, Rong Ye

To tackle these gaps, we propose Fuse-Speech-Text (FST), a cross-modal model which supports three distinct input modalities for translation: speech, text, and fused speech-text.

Machine Translation Translation

Graph Propagation Transformer for Graph Representation Learning

1 code implementation 19 May 2023 Zhe Chen, Hao Tan, Tao Wang, Tianrun Shen, Tong Lu, Qiuying Peng, Cheng Cheng, Yue Qi

The core insight of our method is to fully consider the information propagation among nodes and edges in a graph when building the attention module in the transformer blocks.

Ranked #2 on Graph Regression on PCQM4M-LSC (Validation MAE metric)

Graph Learning Graph Property Prediction +3

PaLM 2 Technical Report

no code implementations 17 May 2023 Rohan Anil, Andrew M. Dai, Orhan Firat, Melvin Johnson, Dmitry Lepikhin, Alexandre Passos, Siamak Shakeri, Emanuel Taropa, Paige Bailey, Zhifeng Chen, Eric Chu, Jonathan H. Clark, Laurent El Shafey, Yanping Huang, Kathy Meier-Hellstern, Gaurav Mishra, Erica Moreira, Mark Omernick, Kevin Robinson, Sebastian Ruder, Yi Tay, Kefan Xiao, Yuanzhong Xu, Yujing Zhang, Gustavo Hernandez Abrego, Junwhan Ahn, Jacob Austin, Paul Barham, Jan Botha, James Bradbury, Siddhartha Brahma, Kevin Brooks, Michele Catasta, Yong Cheng, Colin Cherry, Christopher A. Choquette-Choo, Aakanksha Chowdhery, Clément Crepy, Shachi Dave, Mostafa Dehghani, Sunipa Dev, Jacob Devlin, Mark Díaz, Nan Du, Ethan Dyer, Vlad Feinberg, Fangxiaoyu Feng, Vlad Fienber, Markus Freitag, Xavier Garcia, Sebastian Gehrmann, Lucas Gonzalez, Guy Gur-Ari, Steven Hand, Hadi Hashemi, Le Hou, Joshua Howland, Andrea Hu, Jeffrey Hui, Jeremy Hurwitz, Michael Isard, Abe Ittycheriah, Matthew Jagielski, Wenhao Jia, Kathleen Kenealy, Maxim Krikun, Sneha Kudugunta, Chang Lan, Katherine Lee, Benjamin Lee, Eric Li, Music Li, Wei Li, Yaguang Li, Jian Li, Hyeontaek Lim, Hanzhao Lin, Zhongtao Liu, Frederick Liu, Marcello Maggioni, Aroma Mahendru, Joshua Maynez, Vedant Misra, Maysam Moussalem, Zachary Nado, John Nham, Eric Ni, Andrew Nystrom, Alicia Parrish, Marie Pellat, Martin Polacek, Alex Polozov, Reiner Pope, Siyuan Qiao, Emily Reif, Bryan Richter, Parker Riley, Alex Castro Ros, Aurko Roy, Brennan Saeta, Rajkumar Samuel, Renee Shelby, Ambrose Slone, Daniel Smilkov, David R. So, Daniel Sohn, Simon Tokumine, Dasha Valter, Vijay Vasudevan, Kiran Vodrahalli, Xuezhi Wang, Pidong Wang, ZiRui Wang, Tao Wang, John Wieting, Yuhuai Wu, Kelvin Xu, Yunhan Xu, Linting Xue, Pengcheng Yin, Jiahui Yu, Qiao Zhang, Steven Zheng, Ce Zheng, Weikang Zhou, Denny Zhou, Slav Petrov, Yonghui Wu

Through extensive evaluations on English and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on downstream tasks across different model sizes, while simultaneously exhibiting faster and more efficient inference compared to PaLM.

Ranked #1 on Question Answering on TriviaQA (using extra training data)

Language Modelling Question Answering

Student Classroom Behavior Detection based on YOLOv7-BRA and Multi-Model Fusion

1 code implementation 13 May 2023 Fan Yang, Tao Wang, Xiaofei Wang

We constructed a dataset, which contained 11,248 labels and 4,001 images, with an emphasis on the common behavior of raising hands in a classroom setting (Student Classroom Behavior dataset, SCB-Dataset).

RFR-WWANet: Weighted Window Attention-Based Recovery Feature Resolution Network for Unsupervised Image Registration

1 code implementation 7 May 2023 Mingrui Ma, Tao Wang, Lei Song, Weijie Wang, Guixia Liu

Furthermore, shifted window partitioning operations are inflexible, indicating that they cannot perceive the semantic information over uncertain distances and automatically bridge the global connections between windows.

Image Registration Long-range modeling

A Soft Coordination Method of Heterogeneous Devices in Distribution System Voltage Control

no code implementations 4 May 2023 Licheng Wang, Tao Wang, Gang Huang, Ruifeng Yan, Kai Wang, Youbing Zhang, Shijie Cheng

The proposed method achieves the soft coordination by establishing a modified actor-critic algorithm to train a proxy model of inverters.

Decision Making

GRIG: Few-Shot Generative Residual Image Inpainting

no code implementations 24 Apr 2023 Wanglong Lu, Xianta Jiang, Xiaogang Jin, Yong-Liang Yang, Minglun Gong, Tao Wang, Kaijie Shi, Hanli Zhao

Image inpainting is the task of filling in missing or masked regions of an image with semantically meaningful content.

Image Inpainting

The Cascaded Forward Algorithm for Neural Network Training

1 code implementation 17 Mar 2023 Gongpei Zhao, Tao Wang, Yidong Li, Yi Jin, Congyan Lang, Haibin Ling

The backpropagation algorithm has been widely used as a mainstream learning procedure for neural networks in the past decade, and has played a significant role in the development of deep learning.

Image Classification

Subjective and Objective Quality Assessment for in-the-Wild Computer Graphics Images

no code implementations 14 Mar 2023 ZiCheng Zhang, Wei Sun, Tao Wang, Wei Lu, Quan Zhou, Jun He, Qiyuan Wang, Xiongkuo Min, Guangtao Zhai

Computer graphics images (CGIs) are artificially generated by means of computer programs and are widely perceived under various scenarios, such as games, streaming media, etc.

Image Quality Assessment

Feature Completion Transformer for Occluded Person Re-identification

no code implementations 3 Mar 2023 Tao Wang, Hong Liu, Wenhao Li, Miaoju Ban, Tuanyu Guo, Yidi Li

In this paper, different from most previous works that discard the occluded region, we propose a Feature Completion Transformer (FCFormer) to implicitly complement the semantic information of occluded parts in the feature space.

Person Re-Identification

Spatio-Temporal Point Process for Multiple Object Tracking

no code implementations 5 Feb 2023 Tao Wang, Kean Chen, Weiyao Lin, John See, Zenghui Zhang, Qian Xu, Xia Jia

As such, we propose a novel framework that can effectively predict and mask-out the noisy and confusing detection results before associating the objects into trajectories.

Multiple Object Tracking

Spatio-Temporal Context Modeling for Road Obstacle Detection

no code implementations 19 Jan 2023 Xiuen Wu, Tao Wang, Lingyu Liang, Zuoyong Li, Fum Yew Ching

The results indicate that our method with spatio-temporal context modeling is superior to existing methods for road obstacle detection.

object-detection Object Detection +1

A Multi-Scale Framework for Out-of-Distribution Detection in Dermoscopic Images

no code implementations 18 Jan 2023 Zhongzheng Huang, Tao Wang, Yuanzheng Cai, Lingyu Liang

The automatic detection of skin diseases via dermoscopic images can improve the efficiency in diagnosis and help doctors make more accurate judgments.

Out-of-Distribution Detection

Robust Remote Sensing Scene Classification with Multi-View Voting and Entropy Ranking

no code implementations 14 Jan 2023 Jinyang Wang, Tao Wang, Min Gan, George Hadjichristofi

Deep convolutional neural networks have been widely used in scene classification of remotely sensed images.

Scene Classification

UnifySpeech: A Unified Framework for Zero-shot Text-to-Speech and Voice Conversion

no code implementations 10 Jan 2023 Haogeng Liu, Tao Wang, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, JianHua Tao

Text-to-speech (TTS) and voice conversion (VC) are two different tasks that both aim to generate high-quality speech from different input modalities.

Quantization Voice Conversion

Learning to Detect and Segment for Open Vocabulary Object Detection

no code implementations CVPR 2023 Tao Wang, Nan Li

With such a conditional design, the detection model is bridged by the semantic embedding to offer strongly generalizable class-wise box and mask prediction.

object-detection Open Vocabulary Object Detection

Ultra-High-Definition Low-Light Image Enhancement: A Benchmark and Transformer-Based Method

1 code implementation 22 Dec 2022 Tao Wang, Kaihao Zhang, Tianrun Shen, Wenhan Luo, Bjorn Stenger, Tong Lu

In this paper, we consider the task of low-light image enhancement (LLIE) and introduce a large-scale database consisting of images at 4K and 8K resolution.

Benchmarking Face Detection +1

Emotion Selectable End-to-End Text-based Speech Editing

no code implementations 20 Dec 2022 Tao Wang, Jiangyan Yi, Ruibo Fu, JianHua Tao, Zhengqi Wen, Chu Yuan Zhang

To achieve this task, we propose Emo-CampNet (emotion CampNet), which can provide the option of emotional attributes for the generated speech in text-based speech editing and has the one-shot ability to edit unseen speakers' speech.

Data Augmentation

Towards Real World HDRTV Reconstruction: A Data Synthesis-based Approach

no code implementations 6 Nov 2022 Zhen Cheng, Tao Wang, Yong Li, Fenglong Song, Chang Chen, Zhiwei Xiong

To solve this problem, we propose a learning-based data synthesis approach to learn the properties of real-world SDRTVs by integrating several tone mapping priors into both network structures and loss functions.

Tone Mapping

Rethinking Image Restoration for Object Detection

1 code implementation NIPS 2022 Shangquan Sun, Wenqi Ren, Tao Wang, Xiaochun Cao

To address the issue, we propose a targeted adversarial attack in the restoration procedure to boost object detection performance after restoration.

Adversarial Attack Domain Adaptation +4

DLUNet: Semi-supervised Learning based Dual-Light UNet for Multi-organ Segmentation

1 code implementation 22 Sep 2022 Haoran Lai, Tao Wang, Shuoling Zhou

In the training phase, it consists of two light UNets, which make full use of labeled and unlabeled data simultaneously by using consistency-based learning.

Organ Segmentation

Multiple Instance Neural Networks Based on Sparse Attention for Cancer Detection using T-cell Receptor Sequences

no code implementations 9 Aug 2022 Younghoon Kim, Tao Wang, Danyi Xiong, Xinlei Wang, Seongoh Park

Among different types of data used to answer this biological question, studies based on T cell receptors (TCRs) are under recent spotlight due to the growing appreciation of the roles of the host immunity system in tumor biology.

Multiple Instance Learning

On Mitigating Hard Clusters for Face Clustering

1 code implementation 25 Jul 2022 Yingjie Chen, Huasong Zhong, Chong Chen, Chen Shen, Jianqiang Huang, Tao Wang, Yun Liang, Qianru Sun

Face clustering is a promising way to scale up face recognition systems using large-scale unlabeled face images.

Clustering Face Clustering +1

SJ-HD^2R: Selective Joint High Dynamic Range and Denoising Imaging for Dynamic Scenes

no code implementations 20 Jun 2022 Wei Li, Shuai Xiao, Tianhong Dai, Shanxin Yuan, Tao Wang, Cheng Li, Fenglong Song

To further leverage these two paradigms, we propose a selective and joint HDR and denoising (SJ-HD^2R) imaging framework, utilizing scenario-specific priors to conduct the path selection with an accuracy of more than 93.3%.

Denoising

Multi-View Imputation and Cross-Attention Network Based on Incomplete Longitudinal and Multimodal Data for Conversion Prediction of Mild Cognitive Impairment

1 code implementation 16 Jun 2022 Tao Wang, Xiumei Chen, Xiaoling Zhang, Shuoling Zhou, Qianjin Feng, Meiyan Huang

To address these challenges, a multi-view imputation and cross-attention network (MCNet) was proposed to integrate data imputation and MCI conversion prediction in a unified framework.

Disease Prediction Imputation +1

Subjective Quality Assessment for Images Generated by Computer Graphics

no code implementations 10 Jun 2022 Tao Wang, ZiCheng Zhang, Wei Sun, Xiongkuo Min, Wei Lu, Guangtao Zhai

However, limited work has been put forward to tackle the problem of computer graphics generated images' quality assessment (CG-IQA).

No-Reference Image Quality Assessment

A No-reference Quality Assessment Metric for Point Cloud Based on Captured Video Sequences

no code implementations 9 Jun 2022 Yu Fan, ZiCheng Zhang, Wei Sun, Xiongkuo Min, Wei Lu, Tao Wang, Ning Liu, Guangtao Zhai

Point cloud is one of the most widely used digital formats of 3D models, the visual quality of which is quite sensitive to distortions such as downsampling, noise, and compression.

Point Cloud Quality Assessment

Deep Neural Network for Blind Visual Quality Assessment of 4K Content

no code implementations 9 Jun 2022 Wei Lu, Wei Sun, Xiongkuo Min, Wenhan Zhu, Quan Zhou, Jun He, Qiyuan Wang, ZiCheng Zhang, Tao Wang, Guangtao Zhai

In this paper, we propose a deep learning-based BIQA model for 4K content, which on one hand can recognize true and pseudo 4K content and on the other hand can evaluate their perceptual visual quality.

Blind Image Quality Assessment Multi-Task Learning

A No-Reference Deep Learning Quality Assessment Method for Super-resolution Images Based on Frequency Maps

no code implementations 9 Jun 2022 ZiCheng Zhang, Wei Sun, Xiongkuo Min, Wenhan Zhu, Tao Wang, Wei Lu, Guangtao Zhai

Therefore, in this paper, we propose a no-reference deep-learning image quality assessment method based on frequency maps because the artifacts caused by SISR algorithms are quite sensitive to frequency information.

Image Quality Assessment Image Super-Resolution

Blind Surveillance Image Quality Assessment via Deep Neural Network Combined with the Visual Saliency

no code implementations 9 Jun 2022 Wei Lu, Wei Sun, Wenhan Zhu, Xiongkuo Min, ZiCheng Zhang, Tao Wang, Guangtao Zhai

In this paper, we first conduct an example experiment (i.e., the face detection task) to demonstrate that the quality of the SIs has a crucial impact on the performance of the IVSS, and then propose a saliency-based deep neural network for the blind quality assessment of the SIs, which helps IVSS to filter the low-quality SIs and improve the detection and recognition performance.

Face Detection Image Quality Assessment

Uncertainty-based Network for Few-shot Image Classification

no code implementations 17 May 2022 Minglei Yuan, Qian Xu, Chunhao Cai, Yin-Dong Zheng, Tao Wang, Tong Lu

Specifically, we first augment and classify the query instance and calculate the mutual information of these classification scores.

Classification Few-Shot Image Classification +1

The Value of Information in Stopping Problems

no code implementations 13 May 2022 Ehud Lehrer, Tao Wang

We consider stopping problems in which a decision maker (DM) faces an unknown state of nature and decides sequentially whether to stop and take an irreversible action; pay a fee and obtain additional information; or wait without acquiring information.

FRIH: Fine-grained Region-aware Image Harmonization

no code implementations 13 May 2022 Jinlong Peng, Zekun Luo, Liang Liu, Boshen Zhang, Tao Wang, Yabiao Wang, Ying Tai, Chengjie Wang, Weiyao Lin

Image harmonization aims to generate a more realistic appearance of foreground and background for a composite image.

Image Harmonization

Causal Intervention for Subject-Deconfounded Facial Action Unit Recognition

no code implementations 17 Apr 2022 Yingjie Chen, Diqi Chen, Tao Wang, Yizhou Wang, Yun Liang

Subject-invariant facial action unit (AU) recognition remains challenging for the reason that the data distribution varies among subjects.

Causal Inference Facial Action Unit Detection

GigaST: A 10,000-hour Pseudo Speech Translation Corpus

1 code implementation 8 Apr 2022 Rong Ye, Chengqi Zhao, Tom Ko, Chutong Meng, Tao Wang, Mingxuan Wang, Jun Cao

The training set is translated by a strong machine translation system and the test set is translated by humans.

Machine Translation Translation

PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision

1 code implementation CVPR 2022 Kehong Gong, Bingbing Li, Jianfeng Zhang, Tao Wang, Jing Huang, Michael Bi Mi, Jiashi Feng, Xinchao Wang

Existing self-supervised 3D human pose estimation schemes have largely relied on weak supervisions like consistency loss to guide the learning, which, inevitably, leads to inferior results in real-world scenarios with unseen poses.

3D Human Pose Estimation

A Long Short-term Memory Based Recurrent Neural Network for Interventional MRI Reconstruction

no code implementations 28 Mar 2022 Ruiyang Zhao, Zhao He, Tao Wang, Suhao Qiu, Pawel Herman, Yanle Hu, Chencheng Zhang, Dinggang Shen, Bomin Sun, Guang-Zhong Yang, Yuan Feng

Here we proposed a convolutional long short-term memory (Conv-LSTM) based recurrent neural network (RNN), or ConvLR, to reconstruct interventional images with golden-angle radial sampling.

MRI Reconstruction

Geometric Synthesis: A Free lunch for Large-scale Palmprint Recognition Model Pretraining

no code implementations 11 Mar 2022 Kai Zhao, Lei Shen, Yingyi Zhang, Chuhan Zhou, Tao Wang, Ruixin Zhang, Shouhong Ding, Wei Jia, Wei Shen

In this paper, by observing that palmar creases are the key information to deep-learning-based palmprint recognition, we propose to synthesize training data by manipulating palmar creases.

End-to-end video instance segmentation via spatial-temporal graph neural networks

1 code implementation ICCV 2021 Tao Wang, Ning Xu, Kean Chen, Weiyao Lin

Specifically, graph nodes representing instance features are used for detection and segmentation while graph edges representing instance relations are used for tracking.

Instance Segmentation Semantic Segmentation +1

An STDP-Based Supervised Learning Algorithm for Spiking Neural Networks

no code implementations 7 Mar 2022 Zhanhao Hu, Tao Wang, Xiaolin Hu

Compared with rate-based artificial neural networks, Spiking Neural Networks (SNN) provide a more biologically plausible model for the brain.

NeuralDPS: Neural Deterministic Plus Stochastic Model with Multiband Excitation for Noise-Controllable Waveform Generation

no code implementations 5 Mar 2022 Tao Wang, Ruibo Fu, Jiangyan Yi, JianHua Tao, Zhengqi Wen

We have also verified through experiments that this method can effectively control the noise components in the predicted speech and adjust the SNR of speech.

CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing

1 code implementation 21 Feb 2022 Tao Wang, Jiangyan Yi, Ruibo Fu, JianHua Tao, Zhengqi Wen

It can solve unnatural prosody in the edited region and synthesize the speech corresponding to the unseen words in the transcript.

Few-Shot Learning

Single UHD Image Dehazing via Interpretable Pyramid Network

1 code implementation 17 Feb 2022 Boxue Xiao, Zhuoran Zheng, Xiang Chen, Chen Lv, Yunliang Zhuang, Tao Wang

Currently, most single image dehazing models cannot run an ultra-high-resolution (UHD) image with a single GPU shader in real-time.

Image Dehazing Single Image Dehazing

SODAR: Segmenting Objects by Dynamically Aggregating Neighboring Mask Representations

1 code implementation 15 Feb 2022 Tao Wang, Jun Hao Liew, Yu Li, Yunpeng Chen, Jiashi Feng

Unlike the original per grid cell object masks, SODAR is implicitly supervised to learn mask representations that encode geometric structure of nearby objects and complement adjacent representations with context.

Instance Segmentation Semantic Segmentation

LighTN: Light-weight Transformer Network for Performance-overhead Tradeoff in Point Cloud Downsampling

no code implementations 13 Feb 2022 Xu Wang, Yi Jin, Yigang Cen, Tao Wang, Bowen Tang, Yidong Li

Compared with traditional task-irrelevant downsampling methods, task-oriented neural networks have shown improved performance in point cloud downsampling range.

COVID-19 Hospitalizations Forecasts Using Internet Search Data

no code implementations 3 Feb 2022 Tao Wang, Simin Ma, Soobin Baek, Shihao Yang

As the COVID-19 spread over the globe and new variants of COVID-19 keep occurring, reliable real-time forecasts of COVID-19 hospitalizations are critical for public health decision on medical resources allocations such as ICU beds, ventilators, and personnel to prepare for the surge of COVID-19 pandemics.

Decision Making Time Series +1

IoTGAN: GAN Powered Camouflage Against Machine Learning Based IoT Device Identification

no code implementations 10 Jan 2022 Tao Hou, Tao Wang, Zhuo Lu, Yao Liu, Yalin Sagduyu

In this research, we propose a novel attack strategy named IoTGAN to manipulate an IoT device's traffic such that it can evade machine learning based IoT device identification.

BIG-bench Machine Learning

Deep Probabilistic Graph Matching

no code implementations 5 Jan 2022 He Liu, Tao Wang, Yidong Li, Congyan Lang, Songhe Feng, Haibin Ling

Most previous learning-based graph matching algorithms solve the quadratic assignment problem (QAP) by dropping one or more of the matching constraints and adopting a relaxed assignment solver to obtain sub-optimal correspondences.

Graph Matching

GLAN: A Graph-based Linear Assignment Network

no code implementations 5 Jan 2022 He Liu, Tao Wang, Congyan Lang, Songhe Feng, Yi Jin, Yidong Li

The experimental results on a synthetic dataset reveal that our method outperforms state-of-the-art baselines and achieves consistently high accuracy with the increment of the problem size.

Multi-Object Tracking

Powerful Graph Convolutioal Networks with Adaptive Propagation Mechanism for Homophily and Heterophily

no code implementations 27 Dec 2021 Tao Wang, Rui Wang, Di Jin, Dongxiao He, Yuxiao Huang

To address this problem, in this paper we design a novel propagation mechanism, which can automatically change the propagation and aggregation process according to homophily or heterophily between node pairs.

ITA: Image-Text Alignments for Multi-Modal Named Entity Recognition

1 code implementation NAACL 2022 Xinyu Wang, Min Gui, Yong Jiang, Zixia Jia, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu

As text representations play the most important role in MNER, in this paper, we propose Image-text Alignments (ITA) to align image features into the textual space, so that the attention mechanism in transformer-based pretrained textual embeddings can be better utilized.

Multi-modal Named Entity Recognition named-entity-recognition +1

Contrastive Learning from Extremely Augmented Skeleton Sequences for Self-supervised Action Recognition

1 code implementation 7 Dec 2021 Tianyu Guo, Hong Liu, Zhan Chen, Mengyuan Liu, Tao Wang, Runwei Ding

In this paper, to make better use of the movement patterns introduced by extreme augmentations, a Contrastive Learning framework utilizing Abundant Information Mining for self-supervised action Representation (AimCLR) is proposed.

Contrastive Learning Representation Learning +2

Pose-guided Feature Disentangling for Occluded Person Re-identification Based on Transformer

1 code implementation 5 Dec 2021 Tao Wang, Hong Liu, Pinhao Song, Tianyu Guo, Wei Shi

Therefore, we propose a transformer-based Pose-guided Feature Disentangling (PFD) method by utilizing pose information to clearly disentangle semantic components (e.g., human body or joint parts) and selectively match non-occluded parts correspondingly.

Person Re-Identification

MC-Blur: A Comprehensive Benchmark for Image Deblurring

2 code implementations 1 Dec 2021 Kaihao Zhang, Tao Wang, Wenhan Luo, Boheng Chen, Wenqi Ren, Bjorn Stenger, Wei Liu, Hongdong Li, Ming-Hsuan Yang

Blur artifacts can seriously degrade the visual quality of images, and numerous deblurring methods have been proposed for specific scenarios.

Benchmarking Deblurring +1

Topological and Algebraic Structures of Atanassov's Intuitionistic Fuzzy-Values Space

no code implementations 17 Nov 2021 Xinxing Wu, Tao Wang, Qian Liu, Peide Liu, Guanrong Chen, Xu Zhang

By introducing a new operator for IFVs via the linear order based on a score function and an accuracy function, we show that such an operator is a strong negation on IFVs.

Direct Multi-view Multi-person 3D Pose Estimation

2 code implementations NeurIPS 2021 Tao Wang, Jianfeng Zhang, Yujun Cai, Shuicheng Yan, Jiashi Feng

Instead of estimating 3D joint locations from costly volumetric representation or reconstructing the per-person 3D pose from multiple detected 2D poses as in previous methods, MvP directly regresses the multi-person 3D poses in a clean and efficient way, without relying on intermediate tasks.

Ranked #3 on 3D Multi-Person Pose Estimation on Panoptic (using extra training data)

3D Multi-Person Pose Estimation 3D Pose Estimation

CMTR: Cross-modality Transformer for Visible-infrared Person Re-identification

no code implementations 18 Oct 2021 Tengfei Liang, Yi Jin, Yajun Gao, Wu Liu, Songhe Feng, Tao Wang, Yidong Li

The existing convolutional neural network-based methods mainly suffer from insufficient perception of modality information and cannot learn good discriminative modality-invariant embeddings for identities, which limits their performance.

Cross-Modality Person Re-identification Person Re-Identification

Distract Your Attention: Multi-head Cross Attention Network for Facial Expression Recognition

2 code implementations 15 Sep 2021 Zhengyao Wen, Wenzhong Lin, Tao Wang, Ge Xu

To address these issues, we propose our DAN with three key components: Feature Clustering Network (FCN), Multi-head cross Attention Network (MAN), and Attention Fusion Network (AFN).

Facial Expression Recognition Facial Expression Recognition (FER)

PnP-DETR: Towards Efficient Visual Analysis with Transformers

1 code implementation ICCV 2021 Tao Wang, Li Yuan, Yunpeng Chen, Jiashi Feng, Shuicheng Yan

Recently, DETR pioneered the solution of vision tasks with transformers; it directly translates the image feature map into the object detection result.

object-detection Object Detection +1

MuVER: Improving First-Stage Entity Retrieval with Multi-View Entity Representations

1 code implementation EMNLP 2021 Xinyin Ma, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Weiming Lu

Entity retrieval, which aims at disambiguating mentions to canonical entities from massive KBs, is essential for many tasks in natural language processing.

Entity Linking Entity Retrieval +1

Joint Graph Learning and Matching for Semantic Feature Correspondence

2 code implementations 1 Sep 2021 He Liu, Tao Wang, Yidong Li, Congyan Lang, Yi Jin, Haibin Ling

In this paper, we propose a joint graph learning and matching network, named GLAM, to explore reliable graph structures for boosting graph matching.

Graph Learning Graph Matching

Secoco: Self-Correcting Encoding for Neural Machine Translation

no code implementations Findings (EMNLP) 2021 Tao Wang, Chengqi Zhao, Mingxuan Wang, Lei LI, Hang Li, Deyi Xiong

This paper presents Self-correcting Encoding (Secoco), a framework that effectively deals with input noise for robust neural machine translation by introducing self-correcting predictors.

Machine Translation NMT +1

Learning Class-level Prototypes for Few-shot Learning

no code implementations 25 Aug 2021 Minglei Yuan, Wenhai Wang, Tao Wang, Chunhao Cai, Qian Xu, Tong Lu

Few-shot learning aims to recognize new categories using very few labeled samples.

Few-Shot Learning

Real-time Image Enhancer via Learnable Spatial-aware 3D Lookup Tables

no code implementations ICCV 2021 Tao Wang, Yong Li, Jingyang Peng, Yipeng Ma, Xian Wang, Fenglong Song, Youliang Yan

One is a 1D weight vector used for image-level scenario adaptation; the other is a 3D weight map aimed at pixel-wise category fusion.

Image Enhancement

Risk Minimization for Zero-shot Sequence Labeling

no code implementations ACL 2021 Zechuan Hu, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu

In this paper, we propose a novel unified framework for zero-shot sequence labeling with minimum risk training and design a new decomposable risk function that models the relations between the predicted labels from the source models and the true labels.

Multi-View Cross-Lingual Structured Prediction with Minimum Supervision

no code implementations ACL 2021 Zechuan Hu, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu

In structured prediction problems, cross-lingual transfer learning is an efficient way to train quality models for low-resource languages, and further improvement can be obtained by learning from multiple source languages.

Cross-Lingual Transfer Structured Prediction +1

No-Reference Quality Assessment for 3D Colored Point Cloud and Mesh Models

2 code implementations 5 Jul 2021 ZiCheng Zhang, Wei Sun, Xiongkuo Min, Tao Wang, Wei Lu, Guangtao Zhai

Therefore, many related studies such as point cloud quality assessment (PCQA) and mesh quality assessment (MQA) have been carried out to measure the visual quality degradations of 3D models.

Point Cloud Quality Assessment

Ultra-High-Definition Image Dehazing via Multi-Guided Bilateral Learning

1 code implementation CVPR 2021 Zhuoran Zheng, Wenqi Ren, Xiaochun Cao, Xiaobin Hu, Tao Wang, Fenglong Song, Xiuyi Jia

To address the problem, we propose a novel network capable of real-time dehazing of 4K images on a single GPU, which consists of three deep CNNs.

Image Dehazing Single Image Dehazing +1

Deep Learning based Full-reference and No-reference Quality Assessment Models for Compressed UGC Videos

1 code implementation 2 Jun 2021 Wei Sun, Tao Wang, Xiongkuo Min, Fuwang Yi, Guangtao Zhai

The proposed VQA framework consists of three modules, the feature extraction module, the quality regression module, and the quality pooling module.

regression Video Quality Assessment

Adaptive Feature Alignment for Adversarial Training

no code implementations 31 May 2021 Tao Wang, Ruixin Zhang, Xingyu Chen, Kai Zhao, Xiaolin Huang, Yuge Huang, Shaoxin Li, Jilin Li, Feiyue Huang

Based on this observation, we propose the adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.

Adversarial Defense

The Volctrans Neural Speech Translation System for IWSLT 2021

1 code implementation ACL (IWSLT) 2021 Chengqi Zhao, Zhicheng Liu, Jian Tong, Tao Wang, Mingxuan Wang, Rong Ye, Qianqian Dong, Jun Cao, Lei LI

For offline speech translation, our best end-to-end model achieves an improvement of 8.1 BLEU over the benchmark on the MuST-C test set and is even approaching the results of a strong cascade solution.

Translation

Improving Named Entity Recognition by External Context Retrieving and Cooperative Learning

3 code implementations ACL 2021 Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu

We find empirically that the contextual representations computed on the retrieval-based input view, constructed through the concatenation of a sentence and its external contexts, can achieve significantly improved performance compared to the original input view based only on the sentence.

Chinese Named Entity Recognition Chunking +2

FedCom: A Byzantine-Robust Local Model Aggregation Rule Using Data Commitment for Federated Learning

no code implementations 16 Apr 2021 Bo Zhao, Peng Sun, Liming Fang, Tao Wang, Keyu Jiang

The results demonstrate its effectiveness and superior performance compared to the state-of-the-art Byzantine-robust schemes in defending against typical data poisoning and model poisoning attacks under practical Non-IID data distributions.

Data Poisoning Federated Learning +2

Half-Truth: A Partially Fake Audio Detection Dataset

no code implementations 8 Apr 2021 Jiangyan Yi, Ye Bai, JianHua Tao, Zhengkun Tian, Chenglong Wang, Tao Wang, Ruibo Fu

Diverse promising datasets have been designed to advance the development of fake audio detection, such as the ASVspoof databases.

Speech Synthesis

IDOL-Net: An Interactive Dual-Domain Parallel Network for CT Metal Artifact Reduction

no code implementations 3 Apr 2021 Tao Wang, Wenjun Xia, Zexin Lu, Huaiqiang Sun, Yan Liu, Hu Chen, Jiliu Zhou, Yi Zhang

Since the dual-domain MAR methods can leverage the hybrid information from both sinogram and image domains, they have significantly improved the performance compared to single-domain methods.

Computed Tomography (CT) Disentanglement +1

A Universal Model for Cross Modality Mapping by Relational Reasoning

no code implementations 26 Feb 2021 Zun Li, Congyan Lang, Liqian Liang, Tao Wang, Songhe Feng, Jun Wu, Yidong Li

With the aim of matching a pair of instances from two different modalities, cross modality mapping has attracted growing attention in the computer vision community.

Image Classification Relational Reasoning

Regional and Sectoral Structures and Their Dynamics of Chinese Economy: A Network Perspective from Multi-Regional Input-Output Tables

no code implementations 24 Feb 2021 Tao Wang, Shiying Xiao, Jun Yan, Panpan Zhang

Quantified metrics assessing the relative importance of the province-sectors in the national economy echo the national and regional economic development policies to a certain extent.

Community Detection Physics and Society General Economics Economics Applications

Attention Models for Point Clouds in Deep Learning: A Survey

no code implementations 22 Feb 2021 Xu Wang, Yi Jin, Yigang Cen, Tao Wang, Yidong Li

Recently, the advancement of 3D point clouds in deep learning has attracted intensive research in different application domains such as computer vision and robotic tasks.

3D Pose Estimation 3D Semantic Segmentation

DAN-Net: Dual-Domain Adaptive-Scaling Non-local Network for CT Metal Artifact Reduction

1 code implementation 16 Feb 2021 Tao Wang, Wenjun Xia, Yongqiang Huang, Huaiqiang Sun, Yan Liu, Hu Chen, Jiliu Zhou, Yi Zhang

With the rapid development of deep learning in the field of medical imaging, several network models have been proposed for metal artifact reduction (MAR) in CT.

Computed Tomography (CT) Metal Artifact Reduction

Isolation mechanisms for high-speed packet-processing pipelines

no code implementations 29 Jan 2021 Tao Wang, Xiangrui Yang, Gianni Antichi, Anirudh Sivaraman, Aurojit Panda

We have open sourced the code for Menshen's hardware and software at https://isolation.quest/.

Networking and Internet Architecture Hardware Architecture

Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

12 code implementations ICCV 2021 Li Yuan, Yunpeng Chen, Tao Wang, Weihao Yu, Yujun Shi, Zihang Jiang, Francis EH Tay, Jiashi Feng, Shuicheng Yan

To overcome such limitations, we propose a new Tokens-To-Token Vision Transformer (T2T-ViT), which incorporates 1) a layer-wise Tokens-to-Token (T2T) transformation to progressively structurize the image to tokens by recursively aggregating neighboring Tokens into one Token (Tokens-to-Token), such that local structure represented by surrounding tokens can be modeled and token length can be reduced; 2) an efficient backbone with a deep-narrow structure for vision transformer motivated by CNN architecture design after empirical study.

Image Classification Language Modelling

Search for axion-like dark matter using solid-state nuclear magnetic resonance

no code implementations 4 Jan 2021 Deniz Aybas, Janos Adam, Emmy Blumenthal, Alexander V. Gramolin, Dorian Johnson, Annalies Kleyheeg, Samer Afach, John W. Blanchard, Gary P. Centers, Antoine Garcon, Martin Engler, Nataniel L. Figueroa, Marina Gil Sendra, Arne Wickenbrock, Matthew Lawson, Tao Wang, Teng Wu, Haosu Luo, Hamdi Mani, Philip Mauskopf, Peter W. Graham, Surjeet Rajendran, Derek F. Jackson Kimball, Dmitry Budker, Alexander O. Sushkov

We calibrated the detector and characterized the excitation spectrum and relaxation parameters of the nuclear spin ensemble with pulsed magnetic resonance measurements in a 4.4 T magnetic field.

High Energy Physics - Experiment Other Condensed Matter Instrumentation and Detectors

AggMask: Exploring locally aggregated learning of mask representations for instance segmentation

1 code implementation 1 Jan 2021 Tao Wang, Jun Hao Liew, Yu Li, Yunpeng Chen, Jiashi Feng

Recently proposed one-stage instance segmentation models (e.g., SOLO) learn to directly predict location-specific object masks with fully-convolutional networks.

Instance Segmentation Semantic Segmentation

Multi-Scale Separable Network for Ultra-High-Definition Video Deblurring

1 code implementation ICCV 2021 Senyou Deng, Wenqi Ren, Yanyang Yan, Tao Wang, Fenglong Song, Xiaochun Cao

Although recent research has witnessed significant progress on the video deblurring task, these methods struggle to reconcile inference efficiency and visual quality simultaneously, especially on ultra-high-definition (UHD) videos (e.g., 4K resolution).

Deblurring Vocal Bursts Intensity Prediction

FASG: Feature Aggregation Self-training GCN for Semi-supervised Node Classification

no code implementations 1 Jan 2021 Gongpei Zhao, Tao Wang, Yidong Li, Yi Jin

Recently, Graph Convolutional Networks (GCNs) have achieved significant success in many graph-based learning tasks, especially for node classification, due to their excellent ability in representation learning.

Classification General Classification +2

Exploring the limits of Concurrency in ML Training on Google TPUs

no code implementations 7 Nov 2020 Sameer Kumar, James Bradbury, Cliff Young, Yu Emma Wang, Anselm Levskaya, Blake Hechtman, Dehao Chen, HyoukJoong Lee, Mehmet Deveci, Naveen Kumar, Pankaj Kanwar, Shibo Wang, Skye Wanderman-Milne, Steve Lacy, Tao Wang, Tayo Oguntebi, Yazhou Zu, Yuanzhong Xu, Andy Swing

Recent results in language understanding using neural networks have required training hardware of unprecedented scale, with thousands of chips cooperating on a single training run.

The Box is in the Pen: Evaluating Commonsense Reasoning in Neural Machine Translation

1 code implementation Findings of the Association for Computational Linguistics 2020 Jie He, Tao Wang, Deyi Xiong, Qun Liu

Our experiments and analyses demonstrate that neural machine translation performs poorly on commonsense reasoning of the three ambiguity types in terms of both reasoning accuracy (≤ 60.1%) and reasoning consistency (≤ 31%).

Common Sense Reasoning Machine Translation +1

Toward Accurate Person-level Action Recognition in Videos of Crowded Scenes

no code implementations 16 Oct 2020 Li Yuan, Yichen Zhou, Shuning Chang, Ziyuan Huang, Yunpeng Chen, Xuecheng Nie, Tao Wang, Jiashi Feng, Shuicheng Yan

Prior works fail to deal with this problem in two respects: (1) they do not utilize information about the scenes; (2) they lack training data for crowded and complex scenes.

Action Recognition In Videos Semantic Segmentation

Finding Action Tubes with a Sparse-to-Dense Framework

no code implementations 30 Aug 2020 Yuxi Li, Weiyao Lin, Tao Wang, John See, Rui Qian, Ning Xu, Li-Min Wang, Shugong Xu

The task of spatial-temporal action detection has attracted increasing attention among researchers.

Ranked #3 on Action Detection on UCF Sports (Video-mAP 0.2 metric)

Action Detection

High Accurate Time-of-Arrival Estimation with Fine-Grained Feature Generation for Internet-of-Things Applications

no code implementations 18 Aug 2020 Guangjin Pan, Tao Wang, Shunqing Zhang, Shugong Xu

Conventional schemes often require extra reference signals or more complicated algorithms to improve the time-of-arrival (TOA) estimation accuracy.

Object-aware Multimodal Named Entity Recognition in Social Media Posts with Adversarial Learning

1 code implementation 3 Aug 2020 Changmeng Zheng, Zhiwei Wu, Tao Wang, Cai Yi, Qing Li

To better exploit visual and textual information in NER, we propose an adversarial gated bilinear attention neural network (AGBAN).

named-entity-recognition Named Entity Recognition +1

The Devil is in Classification: A Simple Framework for Long-tail Object Detection and Instance Segmentation

1 code implementation ECCV 2020 Tao Wang, Yu Li, Bingyi Kang, Junnan Li, Junhao Liew, Sheng Tang, Steven Hoi, Jiashi Feng

Specifically, we systematically investigate performance drop of the state-of-the-art two-stage instance segmentation model Mask R-CNN on the recent long-tail LVIS dataset, and unveil that a major cause is the inaccurate classification of object proposals.

General Classification Instance Segmentation +3

A Multi-Level Approach to Waste Object Segmentation

1 code implementation 8 Jul 2020 Tao Wang, Yuanzheng Cai, Lingyu Liang, Dongyi Ye

We address the problem of localizing waste objects from a color image and an optional depth image, which is a key perception component for robotic interaction with such objects.

Semantic Segmentation

Overcoming Classifier Imbalance for Long-tail Object Detection with Balanced Group Softmax

2 code implementations CVPR 2020 Yu Li, Tao Wang, Bingyi Kang, Sheng Tang, Chunfeng Wang, Jintao Li, Jiashi Feng

Solving long-tail large vocabulary object detection with deep learning based models is a challenging and demanding task that remains under-explored. In this work, we provide the first systematic analysis of the underperformance of state-of-the-art models when faced with a long-tail distribution.

Image Classification Instance Segmentation +4

Learning Combinatorial Solver for Graph Matching

no code implementations CVPR 2020 Tao Wang, He Liu, Yidong Li, Yi Jin, Xiaohui Hou, Haibin Ling

Learning-based approaches to graph matching have been developed and explored for more than a decade, and have grown rapidly in scope and popularity in recent years.

Combinatorial Optimization Graph Matching +1

Human in Events: A Large-Scale Benchmark for Human-centric Video Analysis in Complex Events

no code implementations 9 May 2020 Weiyao Lin, Huabin Liu, Shizhan Liu, Yuxi Li, Rui Qian, Tao Wang, Ning Xu, Hongkai Xiong, Guo-Jun Qi, Nicu Sebe

To this end, we present a new large-scale dataset with comprehensive annotations, named Human-in-Events or HiEve (Human-centric video analysis in complex Events), for the understanding of human motions, poses, and actions in a variety of realistic events, especially in crowded and complex events.

Action Recognition Pose Estimation

Automatic low-bit hybrid quantization of neural networks through meta learning

no code implementations 24 Apr 2020 Tao Wang, Junsong Wang, Chang Xu, Chao Xue

With the best searched quantization policy, we subsequently retrain or finetune to further improve the performance of the quantized target network.

Meta-Learning Quantization +1