Search Results for author: Tao Wang

Found 211 papers, 79 papers with code

Learning Structured Hough Voting for Joint Object Detection and Occlusion Reasoning

no code implementations • CVPR 2013 • Tao Wang, Xuming He, Nick Barnes

We propose a structured Hough voting method for detecting objects with heavy occlusion in indoor environments.

Object object-detection +1

Paper
Add Code

Dual Training and Dual Prediction for Polarity Classification

no code implementations • ACL 2013 • Rui Xia, Tao Wang, Xuelei Hu, Shoushan Li, Cheng-qing Zong

Feature Engineering General Classification +2

Paper
Add Code

Time-dependent Hierarchical Dirichlet Model for Timeline Generation

no code implementations • 8 Dec 2013 • Rumeng Li, Tao Wang, Xun Wang

Existing approaches neglect such hierarchical topic structure involved in the news corpus in timeline generation.

Novelty Detection Sentence

Paper
Add Code

Modeling Attractiveness and Multiple Clicks in Sponsored Search Results

no code implementations • 1 Jan 2014 • Dinesh Govindaraj, Tao Wang, S. V. N. Vishwanathan

Our model seamlessly incorporates the effect of externalities (quality of other search results displayed in response to a user query), user fatigue, as well as pre and post-click relevance of a sponsored search result.

Paper
Add Code

A Multi-parent Memetic Algorithm for the Linear Ordering Problem

no code implementations • 18 May 2014 • Tao Ye, Tao Wang, Zhipeng Lu, Jin-Kao Hao

In this paper, we present a multi-parent memetic algorithm (denoted by MPM) for solving the classic Linear Ordering Problem (LOP).

Paper
Add Code

Voting for Deceptive Opinion Spam Detection

no code implementations • 16 Sep 2014 • Tao Wang, Hua Zhu

Consumers' purchase decisions are increasingly influenced by user-generated online reviews.

Deception Detection General Classification +1

Paper
Add Code

Convolutional Neural Networks over Tree Structures for Programming Language Processing

8 code implementations • 18 Sep 2014 • Lili Mou, Ge Li, Lu Zhang, Tao Wang, Zhi Jin

Programming language processing (similar to natural language processing) is a hot research topic in the field of software engineering; it has also aroused growing interest in the artificial intelligence community.

Sentence

201

Paper
Code

An Empirical Evaluation of Deep Learning on Highway Driving

no code implementations • 7 Apr 2015 • Brody Huval, Tao Wang, Sameep Tandon, Jeff Kiske, Will Song, Joel Pazhayampallil, Mykhaylo Andriluka, Pranav Rajpurkar, Toki Migimatsu, Royce Cheng-Yue, Fernando Mujica, Adam Coates, Andrew Y. Ng

We collect a large data set of highway data and apply deep learning and computer vision algorithms to problems such as car and lane detection.

Ranked #2 on Lane Detection on Caltech Lanes Cordova

Autonomous Driving Lane Detection

Paper
Add Code

Driverseat: Crowdstrapping Learning Tasks for Autonomous Driving

no code implementations • 7 Dec 2015 • Pranav Rajpurkar, Toki Migimatsu, Jeff Kiske, Royce Cheng-Yue, Sameep Tandon, Tao Wang, Andrew Ng

While emerging deep-learning systems have outclassed knowledge-based approaches in many tasks, their application to detection tasks for autonomous technologies remains an open field for scientific exploration.

Autonomous Driving Lane Detection

Paper
Add Code

Onset of lasing in small devices: the identification of the first threshold

no code implementations • 1 Mar 2018 • Tao Wang, Gian Piero Puccioni, Gian Luca Lippi

We present a simple and flexible technique for identifying the onset of coherent emission in lasers, from the mesoscale to the nanoscale, which makes use of photon counting and a small amplitude modulation added to the pump.

Optics

Paper
Add Code

Image Classification at Supercomputer Scale

no code implementations • 16 Nov 2018 • Chris Ying, Sameer Kumar, Dehao Chen, Tao Wang, Youlong Cheng

Deep learning is extremely computationally intensive, and hardware vendors have responded by building faster accelerators in large clusters.

Classification General Classification +1

Paper
Add Code

GM-PLL: Graph Matching based Partial Label Learning

no code implementations • 10 Jan 2019 • Gengyu Lyu, Songhe Feng, Tao Wang, Congyan Lang, Yidong Li

Partial Label Learning (PLL) aims to learn from the data where each training example is associated with a set of candidate labels, among which only one is correct.

Graph Matching Partial Label Learning

Paper
Add Code

Flash: Efficient Dynamic Routing for Offchain Networks

2 code implementations • 14 Feb 2019 • Peng Wang, Hong Xu, Xin Jin, Tao Wang

Mice payments are directly sent by looking up a routing table with a few precomputed paths to reduce probing overhead.

Networking and Internet Architecture

Paper
Code

A Unified Analytical Framework for Trustable Machine Learning and Automation Running with Blockchain

no code implementations • 21 Mar 2019 • Tao Wang

Second, it proposes a unified analytical framework for trustable machine learning by using blockchain technology.

BIG-bench Machine Learning

Paper
Add Code

Few-shot Adaptive Faster R-CNN

no code implementations • CVPR 2019 • Tao Wang, Xiaopeng Zhang, Li Yuan, Jiashi Feng

To address these challenges, we first introduce a pairing mechanism over source and target features to alleviate the issue of insufficient target domain samples.

object-detection Object Detection +1

Paper
Add Code

High-Performance Support Vector Machines and Its Applications

no code implementations • 1 May 2019 • Taiping He, Tao Wang, Ralph Abbey, Joshua Griffin

The support vector machines (SVM) algorithm is a popular classification technique in data mining and machine learning.

Classification General Classification +1

Paper
Add Code

Fully Automatic Brain Tumor Segmentation using a Normalized Gaussian Bayesian Classifier and 3D Fluid Vector Flow

no code implementations • 1 May 2019 • Tao Wang, Irene Cheng, Anup Basu

This paper presents an automatic brain tumor segmentation method based on a Normalized Gaussian Bayesian classification and a new 3D Fluid Vector Flow (FVF) algorithm.

Brain Tumor Segmentation Segmentation +1

Paper
Add Code

A note on 'A fully parallel 3D thinning algorithm and its applications'

no code implementations • 1 May 2019 • Tao Wang, Anup Basu

A 3D thinning algorithm erodes a 3D binary image layer by layer to extract the skeletons.

Paper
Add Code

Variational Prototype Replays for Continual Learning

1 code implementation • 23 May 2019 • Mengmi Zhang, Tao Wang, Joo Hwee Lim, Gabriel Kreiman, Jiashi Feng

In each classification task, our method learns a set of variational prototypes with their means and variances, where embedding of the samples from the same class can be represented in a prototypical distribution and class-representative prototypes are separated apart.

Continual Learning General Classification +2

Paper
Code

Distilling Object Detectors with Fine-grained Feature Imitation

3 code implementations • CVPR 2019 • Tao Wang, Li Yuan, Xiaopeng Zhang, Jiashi Feng

To address the challenge of distilling knowledge in detection model, we propose a fine-grained feature imitation method exploiting the cross-location discrepancy of feature response.

Knowledge Distillation Object +2

412

Paper
Code

Central Similarity Quantization for Efficient Image and Video Retrieval

1 code implementation • CVPR 2020 • Li Yuan, Tao Wang, Xiaopeng Zhang, Francis EH Tay, Zequn Jie, Wei Liu, Jiashi Feng

In this work, we propose a new \emph{global} similarity metric, termed as \emph{central similarity}, with which the hash codes of similar data pairs are encouraged to approach a common center and those for dissimilar pairs to converge to different centers, to improve hash learning efficiency and retrieval accuracy.

Quantization Retrieval +1

230

Paper
Code

Trustable and Automated Machine Learning Running with Blockchain and Its Applications

no code implementations • 14 Aug 2019 • Tao Wang, Xinmin Wu, Taiping He

Experiments show that this synthetic data generation is very effective in applications such as fraud detection in financial data.

BIG-bench Machine Learning Fraud Detection +1

Paper
Add Code

Scale MLPerf-0.6 models on Google TPU-v3 Pods

no code implementations • 21 Sep 2019 • Sameer Kumar, Victor Bitorff, Dehao Chen, Chiachen Chou, Blake Hechtman, HyoukJoong Lee, Naveen Kumar, Peter Mattson, Shibo Wang, Tao Wang, Yuanzhong Xu, Zongwei Zhou

The recent submission of Google TPU-v3 Pods to the industry wide MLPerf v0. 6 training benchmark demonstrates the scalability of a suite of industry relevant ML models.

Benchmarking

Paper
Add Code

Revisiting Knowledge Distillation via Label Smoothing Regularization

2 code implementations • CVPR 2020 • Li Yuan, Francis E. H. Tay, Guilin Li, Tao Wang, Jiashi Feng

Without any extra computation cost, Tf-KD achieves up to 0. 65\% improvement on ImageNet over well-established baseline models, which is superior to label smoothing regularization.

Self-Knowledge Distillation

1,279

Paper
Code

Prototype Recalls for Continual Learning

no code implementations • 25 Sep 2019 • Mengmi Zhang, Tao Wang, Joo Hwee Lim, Jiashi Feng

Without tampering with the performance on initial tasks, our method learns novel concepts given a few training examples of each class in new tasks.

Continual Learning Metric Learning +1

Paper
Add Code

Deformable Surface Tracking by Graph Matching

no code implementations • ICCV 2019 • Tao Wang, Haibin Ling, Congyan Lang, Songhe Feng, Xiaohui Hou

This paper addresses the problem of deformable surface tracking from monocular images.

Graph Matching

Paper
Add Code

Classification Calibration for Long-tail Instance Segmentation

1 code implementation • 29 Oct 2019 • Tao Wang, Yu Li, Bingyi Kang, Junnan Li, Jun Hao Liew, Sheng Tang, Steven Hoi, Jiashi Feng

In this report, we investigate the performance drop phenomenon of state-of-the-art two-stage instance segmentation models when processing extreme long-tail training data based on the LVIS [5] dataset, and find a major cause is the inaccurate classification of object proposals.

Classification General Classification +3

100

Paper
Code

Merging External Bilingual Pairs into Neural Machine Translation

no code implementations • 2 Dec 2019 • Tao Wang, Shaohui Kuang, Deyi Xiong, António Branco

As neural machine translation (NMT) is not easily amenable to explicit correction of errors, incorporating pre-specified translations into NMT is widely regarded as a non-trivial challenge.

Machine Translation NMT +1

Paper
Add Code

Learning a Layout Transfer Network for Context Aware Object Detection

no code implementations • 9 Dec 2019 • Tao Wang, Xuming He, Yuanzheng Cai, Guobao Xiao

We present a context aware object detection method based on a retrieve-and-transform scene layout model.

Autonomous Driving Object +2

Paper
Add Code

iqiyi Submission to ActivityNet Challenge 2019 Kinetics-700 challenge: Hierarchical Group-wise Attention

no code implementations • 7 Feb 2020 • Qian Liu, Dongyang Cai, Jie Liu, Nan Ding, Tao Wang

The standard non-local (NL) module is effective in aggregating frame-level features on the task of video classification but presents low parameters efficiency and high computational cost.

General Classification Video Classification

Paper
Add Code

CTM: Collaborative Temporal Modeling for Action Recognition

no code implementations • 8 Feb 2020 • Qian Liu, Tao Wang, Jie Liu, Yang Guan, Qi Bu, Longfei Yang

In order to learn powerful feature of videos, we propose a Collaborative Temporal Modeling (CTM) block (Figure 1) to learn temporal information for action recognition.

Action Recognition Video Understanding

Paper
Add Code

Structure-Level Knowledge Distillation For Multilingual Sequence Labeling

1 code implementation • ACL 2020 • Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Fei Huang, Kewei Tu

Multilingual sequence labeling is a task of predicting label sequences using a single unified model for multiple languages.

Aspect Extraction Knowledge Distillation

Paper
Code

Deep Learning based OTDOA Positioning for NB-IoT Communication Systems

no code implementations • 10 Apr 2020 • Guangjin Pan, Tao Wang, Xiufeng Jiang, Shunqing Zhang

Positioning is becoming a key component in many Internet of Things (IoT) applications.

Paper
Add Code

Automatic low-bit hybrid quantization of neural networks through meta learning

no code implementations • 24 Apr 2020 • Tao Wang, Junsong Wang, Chang Xu, Chao Xue

With the best searched quantization policy, we subsequently retrain or finetune to further improve the performance of the quantized target network.

Meta-Learning Quantization +1

Paper
Add Code

Human in Events: A Large-Scale Benchmark for Human-centric Video Analysis in Complex Events

no code implementations • 9 May 2020 • Weiyao Lin, Huabin Liu, Shizhan Liu, Yuxi Li, Rui Qian, Tao Wang, Ning Xu, Hongkai Xiong, Guo-Jun Qi, Nicu Sebe

To this end, we present a new large-scale dataset with comprehensive annotations, named Human-in-Events or HiEve (Human-centric video analysis in complex Events), for the understanding of human motions, poses, and actions in a variety of realistic events, especially in crowd & complex events.

Action Recognition Pose Estimation

Paper
Add Code

Making Robots Draw A Vivid Portrait In Two Minutes

no code implementations • 12 May 2020 • Fei Gao, Jingjie Zhu, Zeyuan Yu, Peng Li, Tao Wang

The whole portrait drawing robotic system is named AiSketcher.

Style Transfer Vocal Bursts Valence Prediction

Paper
Add Code

Learning Combinatorial Solver for Graph Matching

no code implementations • CVPR 2020 • Tao Wang, He Liu, Yidong Li, Yi Jin, Xiaohui Hou, Haibin Ling

Learning-based approaches to graph matching have been developed and explored for more than a decade, have grown rapidly in scope and popularity in recent years.

Combinatorial Optimization Graph Matching +1

Paper
Add Code

Overcoming Classifier Imbalance for Long-tail Object Detection with Balanced Group Softmax

2 code implementations • CVPR 2020 • Yu Li, Tao Wang, Bingyi Kang, Sheng Tang, Chunfeng Wang, Jintao Li, Jiashi Feng

Solving long-tail large vocabulary object detection with deep learning based models is a challenging and demanding task, which is however under-explored. In this work, we provide the first systematic analysis on the underperformance of state-of-the-art models in front of long-tail distribution.

Image Classification Instance Segmentation +5

350

Paper
Code

A Multi-Level Approach to Waste Object Segmentation

1 code implementation • 8 Jul 2020 • Tao Wang, Yuanzheng Cai, Lingyu Liang, Dongyi Ye

We address the problem of localizing waste objects from a color image and an optional depth image, which is a key perception component for robotic interaction with such objects.

Object Segmentation +1

Paper
Code

The Devil is in Classification: A Simple Framework for Long-tail Object Detection and Instance Segmentation

1 code implementation • ECCV 2020 • Tao Wang, Yu Li, Bingyi Kang, Junnan Li, Junhao Liew, Sheng Tang, Steven Hoi, Jiashi Feng

Specifically, we systematically investigate performance drop of the state-of-the-art two-stage instance segmentation model Mask R-CNN on the recent long-tail LVIS dataset, and unveil that a major cause is the inaccurate classification of object proposals.

General Classification Instance Segmentation +4

100

Paper
Code

Object-aware Multimodal Named Entity Recognition in Social Media Posts with Adversarial Learning

1 code implementation • 3 Aug 2020 • Changmeng Zheng, Zhiwei Wu, Tao Wang, Cai Yi, Qing Li

To better exploit visual and textual information in NER, we propose an adversarial gated bilinear attention neural network (AGBAN).

named-entity-recognition Named Entity Recognition +1

Paper
Code

High Accurate Time-of-Arrival Estimation with Fine-Grained Feature Generation for Internet-of-Things Applications

no code implementations • 18 Aug 2020 • Guangjin Pan, Tao Wang, Shunqing Zhang, Shugong Xu

Conventional schemes often require extra reference signals or more complicated algorithms to improve the time-of-arrival (TOA) estimation accuracy.

Paper
Add Code

Finding Action Tubes with a Sparse-to-Dense Framework

no code implementations • 30 Aug 2020 • Yuxi Li, Weiyao Lin, Tao Wang, John See, Rui Qian, Ning Xu, Li-Min Wang, Shugong Xu

The task of spatial-temporal action detection has attracted increasing attention among researchers.

Ranked #3 on Action Detection on UCF Sports (Video-mAP 0.2 metric)

Action Detection

Paper
Add Code

More Embeddings, Better Sequence Labelers?

no code implementations • Findings of the Association for Computational Linguistics 2020 • Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu

Recent work proposes a family of contextual embeddings that significantly improves the accuracy of sequence labelers over non-contextual embeddings.

Ranked #2 on Chunking on CoNLL 2003 (German)

Chunking Word Embeddings

Paper
Add Code

AIN: Fast and Accurate Sequence Labeling with Approximate Inference Network

1 code implementation • EMNLP 2020 • Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu

The linear-chain Conditional Random Field (CRF) model is one of the most widely-used neural sequence labeling approaches.

Ranked #3 on Chunking on CoNLL 2003 (German)

Chunking Variational Inference

Paper
Code

Structural Knowledge Distillation: Tractably Distilling Information for Structured Predictor

1 code implementation • ACL 2021 • Xinyu Wang, Yong Jiang, Zhaohui Yan, Zixia Jia, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu

The objective function of knowledge distillation is typically the cross-entropy between the teacher and the student's output distributions.

Dependency Parsing Knowledge Distillation +1

Paper
Code

Automated Concatenation of Embeddings for Structured Prediction

2 code implementations • ACL 2021 • Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu

Pretrained contextualized embeddings are powerful word representations for structured prediction tasks.

Ranked #1 on Chunking on Penn Treebank

Aspect Extraction Chunking +6

292

Paper
Code

Rapid Robust Principal Component Analysis: CUR Accelerated Inexact Low Rank Estimation

1 code implementation • 14 Oct 2020 • HanQin Cai, Keaton Hamm, Longxiu Huang, Jiaqi Li, Tao Wang

Robust principal component analysis (RPCA) is a widely used tool for dimension reduction.

Computational Efficiency Dimensionality Reduction

Paper
Code

Toward Accurate Person-level Action Recognition in Videos of Crowded Scenes

no code implementations • 16 Oct 2020 • Li Yuan, Yichen Zhou, Shuning Chang, Ziyuan Huang, Yunpeng Chen, Xuecheng Nie, Tao Wang, Jiashi Feng, Shuicheng Yan

Prior works always fail to deal with this problem in two aspects: (1) lacking utilizing information of the scenes; (2) lacking training data in the crowd and complex scenes.

Action Recognition In Videos Semantic Segmentation

Paper
Add Code

EDCNN: Edge enhancement-based Densely Connected Network with Compound Loss for Low-Dose CT Denoising

2 code implementations • 30 Oct 2020 • Tengfei Liang, Yi Jin, Yidong Li, Tao Wang, Songhe Feng, Congyan Lang

In this paper, we propose the Edge enhancement based Densely connected Convolutional Neural Network (EDCNN).

Ranked #1 on Denoising on AAPM

Computed Tomography (CT) Image Denoising

135

Paper
Code

The Box is in the Pen: Evaluating Commonsense Reasoning in Neural Machine Translation

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Jie He, Tao Wang, Deyi Xiong, Qun Liu

Our experiments and analyses demonstrate that neural machine translation performs poorly on commonsense reasoning of the three ambiguity types in terms of both reasoning accuracy ( 6 60. 1{\%}) and reasoning consistency (6 31{\%}).

Common Sense Reasoning Machine Translation +2

Paper
Code

Exploring the limits of Concurrency in ML Training on Google TPUs

no code implementations • 7 Nov 2020 • Sameer Kumar, James Bradbury, Cliff Young, Yu Emma Wang, Anselm Levskaya, Blake Hechtman, Dehao Chen, HyoukJoong Lee, Mehmet Deveci, Naveen Kumar, Pankaj Kanwar, Shibo Wang, Skye Wanderman-Milne, Steve Lacy, Tao Wang, Tayo Oguntebi, Yazhou Zu, Yuanzhong Xu, Andy Swing

Recent results in language understanding using neural networks have required training hardware of unprecedentedscale, with thousands of chips cooperating on a single training run.

Paper
Add Code

An Investigation of Potential Function Designs for Neural CRF

no code implementations • Findings of the Association for Computational Linguistics 2020 • Zechuan Hu, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu

The neural linear-chain CRF model is one of the most widely-used approach to sequence labeling.

Paper
Add Code

Multi-Scale Separable Network for Ultra-High-Definition Video Deblurring

1 code implementation • ICCV 2021 • Senyou Deng, Wenqi Ren, Yanyang Yan, Tao Wang, Fenglong Song, Xiaochun Cao

Although recent research has witnessed a significant progress on the video deblurring task, these methods struggle to reconcile inference efficiency and visual quality simultaneously, especially on ultra-high-definition (UHD) videos (e. g., 4K resolution).

4k Deblurring +1

Paper
Code

AggMask: Exploring locally aggregated learning of mask representations for instance segmentation

1 code implementation • 1 Jan 2021 • Tao Wang, Jun Hao Liew, Yu Li, Yunpeng Chen, Jiashi Feng

Recently proposed one-stage instance segmentation models (\emph{e. g.}, SOLO) learn to directly predict location-specific object mask with fully-convolutional networks.

Instance Segmentation Segmentation +1

Paper
Code

Ultra-High-Definition Image HDR Reconstruction via Collaborative Bilateral Learning

no code implementations • ICCV 2021 • Zhuoran Zheng, Wenqi Ren, Xiaochun Cao, Tao Wang, Xiuyi Jia

First, we propose a dual-path network to extract content and chromatic features at a reduced resolution of the low dynamic range (LDR) input.

HDR Reconstruction Vocal Bursts Intensity Prediction

Paper
Add Code

FASG: Feature Aggregation Self-training GCN for Semi-supervised Node Classification

no code implementations • 1 Jan 2021 • Gongpei Zhao, Tao Wang, Yidong Li, Yi Jin

Recently, Graph Convolutioal Networks (GCNs) have achieved significant success in many graph-based learning tasks, especially for node classification, due to its excellent ability in representation learning.

Classification General Classification +2

Paper
Add Code

Search for axion-like dark matter using solid-state nuclear magnetic resonance

no code implementations • 4 Jan 2021 • Deniz Aybas, Janos Adam, Emmy Blumenthal, Alexander V. Gramolin, Dorian Johnson, Annalies Kleyheeg, Samer Afach, John W. Blanchard, Gary P. Centers, Antoine Garcon, Martin Engler, Nataniel L. Figueroa, Marina Gil Sendra, Arne Wickenbrock, Matthew Lawson, Tao Wang, Teng Wu, Haosu Luo, Hamdi Mani, Philip Mauskopf, Peter W. Graham, Surjeet Rajendran, Derek F. Jackson Kimball, Dmitry Budker, Alexander O. Sushkov

We calibrated the detector and characterized the excitation spectrum and relaxation parameters of the nuclear spin ensemble with pulsed magnetic resonance measurements in a 4. 4 T magnetic field.

High Energy Physics - Experiment Other Condensed Matter Instrumentation and Detectors

Paper
Add Code

Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

13 code implementations • ICCV 2021 • Li Yuan, Yunpeng Chen, Tao Wang, Weihao Yu, Yujun Shi, Zihang Jiang, Francis EH Tay, Jiashi Feng, Shuicheng Yan

To overcome such limitations, we propose a new Tokens-To-Token Vision Transformer (T2T-ViT), which incorporates 1) a layer-wise Tokens-to-Token (T2T) transformation to progressively structurize the image to tokens by recursively aggregating neighboring Tokens into one Token (Tokens-to-Token), such that local structure represented by surrounding tokens can be modeled and tokens length can be reduced; 2) an efficient backbone with a deep-narrow structure for vision transformer motivated by CNN architecture design after empirical study.

Ranked #403 on Image Classification on ImageNet

Image Classification Language Modelling

3,171

Paper
Code

Isolation mechanisms for high-speed packet-processing pipelines

no code implementations • 29 Jan 2021 • Tao Wang, Xiangrui Yang, Gianni Antichi, Anirudh Sivaraman, Aurojit Panda

We have open sourced the code for Menshen's hardware and software at https://isolation. quest/.

Networking and Internet Architecture Hardware Architecture

Paper
Add Code

DAN-Net: Dual-Domain Adaptive-Scaling Non-local Network for CT Metal Artifact Reduction

1 code implementation • 16 Feb 2021 • Tao Wang, Wenjun Xia, Yongqiang Huang, Huaiqiang Sun, Yan Liu, Hu Chen, Jiliu Zhou, Yi Zhang

With the rapid development of deep learning in the field of medical imaging, several network models have been proposed for metal artifact reduction (MAR) in CT.

Computed Tomography (CT) Metal Artifact Reduction

Paper
Code

Attention Models for Point Clouds in Deep Learning: A Survey

no code implementations • 22 Feb 2021 • Xu Wang, Yi Jin, Yigang Cen, Tao Wang, Yidong Li

Recently, the advancement of 3D point clouds in deep learning has attracted intensive research in different application domains such as computer vision and robotic tasks.

3D Pose Estimation 3D Semantic Segmentation

Paper
Add Code

Regional and Sectoral Structures and Their Dynamics of Chinese Economy: A Network Perspective from Multi-Regional Input-Output Tables

no code implementations • 24 Feb 2021 • Tao Wang, Shiying Xiao, Jun Yan, Panpan Zhang

Quantified metrics assessing the relative importance of the province-sectors in the national economy echo the national and regional economic development policies to a certain extent.

Community Detection Physics and Society General Economics Economics Applications

Paper
Add Code

A Universal Model for Cross Modality Mapping by Relational Reasoning

no code implementations • 26 Feb 2021 • Zun Li, Congyan Lang, Liqian Liang, Tao Wang, Songhe Feng, Jun Wu, Yidong Li

With the aim of matching a pair of instances from two different modalities, cross modality mapping has attracted growing attention in the computer vision community.

Image Classification Relational Reasoning

Paper
Add Code

Autocorrect in the Process of Translation -- Multi-task Learning Improves Dialogue Machine Translation

1 code implementation • 30 Mar 2021 • Tao Wang, Chengqi Zhao, Mingxuan Wang, Lei LI, Deyi Xiong

Automatic translation of dialogue texts is a much needed demand in many real life scenarios.

Machine Translation Multi-Task Learning +1

Paper
Code

IDOL-Net: An Interactive Dual-Domain Parallel Network for CT Metal Artifact Reduction

no code implementations • 3 Apr 2021 • Tao Wang, Wenjun Xia, Zexin Lu, Huaiqiang Sun, Yan Liu, Hu Chen, Jiliu Zhou, Yi Zhang

Since the dual-domain MAR methods can leverage the hybrid information from both sinogram and image domains, they have significantly improved the performance compared to single-domain methods.

Computed Tomography (CT) Disentanglement +1

Paper
Add Code

Half-Truth: A Partially Fake Audio Detection Dataset

1 code implementation • 8 Apr 2021 • Jiangyan Yi, Ye Bai, JianHua Tao, Haoxin Ma, Zhengkun Tian, Chenglong Wang, Tao Wang, Ruibo Fu

Therefore, this paper develops such a dataset for half-truth audio detection (HAD).

Speech Synthesis

Paper
Code

FedCom: A Byzantine-Robust Local Model Aggregation Rule Using Data Commitment for Federated Learning

no code implementations • 16 Apr 2021 • Bo Zhao, Peng Sun, Liming Fang, Tao Wang, Keyu Jiang

The results demonstrate its effectiveness and superior performance compared to the state-of-the-art Byzantine-robust schemes in defending against typical data poisoning and model poisoning attacks under practical Non-IID data distributions.

Data Poisoning Federated Learning +2

Paper
Add Code

Improving Named Entity Recognition by External Context Retrieving and Cooperative Learning

3 code implementations • ACL 2021 • Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu

We find empirically that the contextual representations computed on the retrieval-based input view, constructed through the concatenation of a sentence and its external contexts, can achieve significantly improved performance compared to the original input view based only on the sentence.

Ranked #1 on Named Entity Recognition (NER) on CMeEE

Chinese Named Entity Recognition Chunking +3

364

Paper
Code

GSPMD: General and Scalable Parallelization for ML Computation Graphs

2 code implementations • 10 May 2021 • Yuanzhong Xu, HyoukJoong Lee, Dehao Chen, Blake Hechtman, Yanping Huang, Rahul Joshi, Maxim Krikun, Dmitry Lepikhin, Andy Ly, Marcello Maggioni, Ruoming Pang, Noam Shazeer, Shibo Wang, Tao Wang, Yonghui Wu, Zhifeng Chen

We present GSPMD, an automatic, compiler-based parallelization system for common machine learning computations.

Playing the Game of 2048

907

Paper
Code

The Volctrans Neural Speech Translation System for IWSLT 2021

1 code implementation • ACL (IWSLT) 2021 • Chengqi Zhao, Zhicheng Liu, Jian Tong, Tao Wang, Mingxuan Wang, Rong Ye, Qianqian Dong, Jun Cao, Lei LI

For offline speech translation, our best end-to-end model achieves 8. 1 BLEU improvements over the benchmark on the MuST-C test set and is even approaching the results of a strong cascade solution.

Translation

293

Paper
Code

Learned Smartphone ISP on Mobile NPUs with Deep Learning, Mobile AI 2021 Challenge: Report

2 code implementations • 17 May 2021 • Andrey Ignatov, Cheng-Ming Chiang, Hsien-Kai Kuo, Anastasia Sycheva, Radu Timofte, Min-Hung Chen, Man-Yu Lee, Yu-Syuan Xu, Yu Tseng, Shusong Xu, Jin Guo, Chao-Hung Chen, Ming-Chun Hsyu, Wen-Chia Tsai, Chao-Wei Chen, Grigory Malivenko, Minsu Kwon, Myungje Lee, Jaeyoon Yoo, Changbeom Kang, Shinjo Wang, Zheng Shaolong, Hao Dejun, Xie Fen, Feng Zhuang, Yipeng Ma, Jingyang Peng, Tao Wang, Fenglong Song, Chih-Chung Hsu, Kwan-Lin Chen, Mei-Hsuang Wu, Vishal Chudasama, Kalpesh Prajapati, Heena Patel, Anjali Sarvaiya, Kishor Upla, Kiran Raja, Raghavendra Ramachandra, Christoph Busch, Etienne de Stoutz

As the quality of mobile cameras starts to play a crucial role in modern smartphones, more and more attention is now being paid to ISP algorithms used to improve various perceptual aspects of mobile photos.

291

Paper
Code

Adaptive Feature Alignment for Adversarial Training

no code implementations • 31 May 2021 • Tao Wang, Ruixin Zhang, Xingyu Chen, Kai Zhao, Xiaolin Huang, Yuge Huang, Shaoxin Li, Jilin Li, Feiyue Huang

Based on this observation, we propose the adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.

Adversarial Defense

Paper
Add Code

Autocorrect in the Process of Translation --- Multi-task Learning Improves Dialogue Machine Translation

no code implementations • NAACL 2021 • Tao Wang, Chengqi Zhao, Mingxuan Wang, Lei LI, Deyi Xiong

Automatic translation of dialogue texts is a much needed demand in many real life scenarios.

Machine Translation Multi-Task Learning +1

Paper
Add Code

Deep Learning based Full-reference and No-reference Quality Assessment Models for Compressed UGC Videos

1 code implementation • 2 Jun 2021 • Wei Sun, Tao Wang, Xiongkuo Min, Fuwang Yi, Guangtao Zhai

The proposed VQA framework consists of three modules, the feature extraction module, the quality regression module, and the quality pooling module.

Ranked #12 on Video Quality Assessment on MSU NR VQA Database

regression Video Quality Assessment

Paper
Code

Ultra-High-Definition Image Dehazing via Multi-Guided Bilateral Learning

1 code implementation • CVPR 2021 • Zhuoran Zheng, Wenqi Ren, Xiaochun Cao, Xiaobin Hu, Tao Wang, Fenglong Song, Xiuyi Jia

To address the problem, we propose a novel network capable of real-time dehazing of 4K images on a single GPU, which consists of three deep CNNs.

4k Image Dehazing +2

Paper
Code

No-Reference Quality Assessment for 3D Colored Point Cloud and Mesh Models

2 code implementations • 5 Jul 2021 • ZiCheng Zhang, Wei Sun, Xiongkuo Min, Tao Wang, Wei Lu, Guangtao Zhai

Therefore, many related studies such as point cloud quality assessment (PCQA) and mesh quality assessment (MQA) have been carried out to measure the visual quality degradations of 3D models.

Ranked #3 on Point Cloud Quality Assessment on WPC

Point Cloud Quality Assessment

Paper
Code

Multi-View Cross-Lingual Structured Prediction with Minimum Supervision

no code implementations • ACL 2021 • Zechuan Hu, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu

In structured prediction problems, cross-lingual transfer learning is an efficient way to train quality models for low-resource languages, and further improvement can be obtained by learning from multiple source languages.

Cross-Lingual Transfer Sentence +2

Paper
Add Code

Risk Minimization for Zero-shot Sequence Labeling

no code implementations • ACL 2021 • Zechuan Hu, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu

In this paper, we propose a novel unified framework for zero-shot sequence labeling with minimum risk training and design a new decomposable risk function that models the relations between the predicted labels from the source models and the true labels.

Paper
Add Code

Real-time Image Enhancer via Learnable Spatial-aware 3D Lookup Tables

no code implementations • ICCV 2021 • Tao Wang, Yong Li, Jingyang Peng, Yipeng Ma, Xian Wang, Fenglong Song, Youliang Yan

One is a 1D weight vector used for image-level scenario adaptation, the other is a 3D weight map aimed for pixel-wise category fusion.

4k Image Enhancement

Paper
Add Code

Learning Class-level Prototypes for Few-shot Learning

no code implementations • 25 Aug 2021 • Minglei Yuan, Wenhai Wang, Tao Wang, Chunhao Cai, Qian Xu, Tong Lu

Few-shot learning aims to recognize new categories using very few labeled samples.

Few-Shot Learning

Paper
Add Code

Secoco: Self-Correcting Encoding for Neural Machine Translation

no code implementations • Findings (EMNLP) 2021 • Tao Wang, Chengqi Zhao, Mingxuan Wang, Lei LI, Hang Li, Deyi Xiong

This paper presents Self-correcting Encoding (Secoco), a framework that effectively deals with input noise for robust neural machine translation by introducing self-correcting predictors.

Machine Translation NMT +1

Paper
Add Code

Joint Graph Learning and Matching for Semantic Feature Correspondence

2 code implementations • 1 Sep 2021 • He Liu, Tao Wang, Yidong Li, Congyan Lang, Yi Jin, Haibin Ling

In this paper, we propose a joint \emph{graph learning and matching} network, named GLAM, to explore reliable graph structures for boosting graph matching.

Graph Learning Graph Matching

Paper
Code

MuVER: Improving First-Stage Entity Retrieval with Multi-View Entity Representations

1 code implementation • EMNLP 2021 • Xinyin Ma, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Weiming Lu

Entity retrieval, which aims at disambiguating mentions to canonical entities from massive KBs, is essential for many tasks in natural language processing.

Ranked #1 on Entity Retrieval on ZESHEL

Entity Linking Entity Retrieval +1

Paper
Code

Distract Your Attention: Multi-head Cross Attention Network for Facial Expression Recognition

2 code implementations • 15 Sep 2021 • Zhengyao Wen, Wenzhong Lin, Tao Wang, Ge Xu

To address these issues, we propose our DAN with three key components: Feature Clustering Network (FCN), Multi-head cross Attention Network (MAN), and Attention Fusion Network (AFN).

Ranked #9 on Facial Expression Recognition (FER) on AffectNet

Facial Expression Recognition Facial Expression Recognition (FER)

156

Paper
Code

PnP-DETR: Towards Efficient Visual Analysis with Transformers

1 code implementation • ICCV 2021 • Tao Wang, Li Yuan, Yunpeng Chen, Jiashi Feng, Shuicheng Yan

Recently, DETR pioneered the solution of vision tasks with transformers, it directly translates the image feature map into the object detection result.

object-detection Object Detection +1

129

Paper
Code

CMTR: Cross-modality Transformer for Visible-infrared Person Re-identification

no code implementations • 18 Oct 2021 • Tengfei Liang, Yi Jin, Yajun Gao, Wu Liu, Songhe Feng, Tao Wang, Yidong Li

The existing convolutional neural network-based methods mainly face the problem of insufficient perception of modalities' information, and can not learn good discriminative modality-invariant embeddings for identities, which limits their performance.

Cross-Modality Person Re-identification Person Re-Identification

Paper
Add Code

Direct Multi-view Multi-person 3D Pose Estimation

2 code implementations • NeurIPS 2021 • Tao Wang, Jianfeng Zhang, Yujun Cai, Shuicheng Yan, Jiashi Feng

Instead of estimating 3D joint locations from costly volumetric representation or reconstructing the per-person 3D pose from multiple detected 2D poses as in previous methods, MvP directly regresses the multi-person 3D poses in a clean and efficient way, without relying on intermediate tasks.

Ranked #3 on 3D Multi-Person Pose Estimation on Panoptic (using extra training data)

3D Multi-Person Pose Estimation 3D Pose Estimation

311

Paper
Code

Topological and Algebraic Structures of Atanassov's Intuitionistic Fuzzy-Values Space

no code implementations • 17 Nov 2021 • Xinxing Wu, Tao Wang, Qian Liu, Peide Liu, Guanrong Chen, Xu Zhang

By introducing a new operator for IFVs via the linear order based on a score function and an accuracy function, we show that such an operator is a strong negation on IFVs.

Negation

Paper
Add Code

MC-Blur: A Comprehensive Benchmark for Image Deblurring

2 code implementations • 1 Dec 2021 • Kaihao Zhang, Tao Wang, Wenhan Luo, Boheng Chen, Wenqi Ren, Bjorn Stenger, Wei Liu, Hongdong Li, Ming-Hsuan Yang

Blur artifacts can seriously degrade the visual quality of images, and numerous deblurring methods have been proposed for specific scenarios.

Benchmarking Deblurring +1

143

Paper
Code

Pose-guided Feature Disentangling for Occluded Person Re-identification Based on Transformer

1 code implementation • 5 Dec 2021 • Tao Wang, Hong Liu, Pinhao Song, Tianyu Guo, Wei Shi

Therefore, we propose a transformer-based Pose-guided Feature Disentangling (PFD) method by utilizing pose information to clearly disentangle semantic components (e. g. human body or joint parts) and selectively match non-occluded parts correspondingly.

Person Re-Identification

106

Paper
Code

Contrastive Learning from Extremely Augmented Skeleton Sequences for Self-supervised Action Recognition

1 code implementation • 7 Dec 2021 • Tianyu Guo, Hong Liu, Zhan Chen, Mengyuan Liu, Tao Wang, Runwei Ding

In this paper, to make better use of the movement patterns introduced by extreme augmentations, a Contrastive Learning framework utilizing Abundant Information Mining for self-supervised action Representation (AimCLR) is proposed.

Contrastive Learning Representation Learning +2

Paper
Code

GLaM: Efficient Scaling of Language Models with Mixture-of-Experts

no code implementations • 13 Dec 2021 • Nan Du, Yanping Huang, Andrew M. Dai, Simon Tong, Dmitry Lepikhin, Yuanzhong Xu, Maxim Krikun, Yanqi Zhou, Adams Wei Yu, Orhan Firat, Barret Zoph, Liam Fedus, Maarten Bosma, Zongwei Zhou, Tao Wang, Yu Emma Wang, Kellie Webster, Marie Pellat, Kevin Robinson, Kathleen Meier-Hellstern, Toju Duke, Lucas Dixon, Kun Zhang, Quoc V Le, Yonghui Wu, Zhifeng Chen, Claire Cui

Scaling language models with more data, compute and parameters has driven significant progress in natural language processing.

Ranked #10 on Language Modelling on LAMBADA

Common Sense Reasoning In-Context Learning +2

Paper
Add Code

ITA: Image-Text Alignments for Multi-Modal Named Entity Recognition

1 code implementation • NAACL 2022 • Xinyu Wang, Min Gui, Yong Jiang, Zixia Jia, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu

As text representations take the most important role in MNER, in this paper, we propose {\bf I}mage-{\bf t}ext {\bf A}lignments (ITA) to align image features into the textual space, so that the attention mechanism in transformer-based pretrained textual embeddings can be better utilized.

Ranked #1 on Multi-modal Named Entity Recognition on Twitter-17

Multi-modal Named Entity Recognition named-entity-recognition +1

171

Paper
Code

Camera-aware Style Separation and Contrastive Learning for Unsupervised Person Re-identification

no code implementations • 19 Dec 2021 • Xue Li, Tengfei Liang, Yi Jin, Tao Wang, Yidong Li

Unsupervised person re-identification (ReID) is a challenging task without data annotation to guide discriminative learning.

Contrastive Learning Unsupervised Person Re-Identification

Paper
Add Code

Powerful Graph Convolutioal Networks with Adaptive Propagation Mechanism for Homophily and Heterophily

no code implementations • 27 Dec 2021 • Tao Wang, Rui Wang, Di Jin, Dongxiao He, Yuxiao Huang

To address this problem, in this paper we design a novel propagation mechanism, which can automatically change the propagation and aggregation process according to homophily or heterophily between node pairs.

Attribute

Paper
Add Code

Deep Probabilistic Graph Matching

no code implementations • 5 Jan 2022 • He Liu, Tao Wang, Yidong Li, Congyan Lang, Songhe Feng, Haibin Ling

Most previous learning-based graph matching algorithms solve the \textit{quadratic assignment problem} (QAP) by dropping one or more of the matching constraints and adopting a relaxed assignment solver to obtain sub-optimal correspondences.

Graph Matching

Paper
Add Code

GLAN: A Graph-based Linear Assignment Network

no code implementations • 5 Jan 2022 • He Liu, Tao Wang, Congyan Lang, Songhe Feng, Yi Jin, Yidong Li

The experimental results on a synthetic dataset reveal that our method outperforms state-of-the-art baselines and achieves consistently high accuracy with the increment of the problem size.

Multi-Object Tracking

Paper
Add Code

IoTGAN: GAN Powered Camouflage Against Machine Learning Based IoT Device Identification

no code implementations • 10 Jan 2022 • Tao Hou, Tao Wang, Zhuo Lu, Yao Liu, Yalin Sagduyu

In this research, we propose a novel attack strategy named IoTGAN to manipulate an IoT device's traffic such that it can evade machine learning based IoT device identification.

BIG-bench Machine Learning

Paper
Add Code

COVID-19 Hospitalizations Forecasts Using Internet Search Data

no code implementations • 3 Feb 2022 • Tao Wang, Simin Ma, Soobin Baek, Shihao Yang

As the COVID-19 spread over the globe and new variants of COVID-19 keep occurring, reliable real-time forecasts of COVID-19 hospitalizations are critical for public health decision on medical resources allocations such as ICU beds, ventilators, and personnel to prepare for the surge of COVID-19 pandemics.

Decision Making Time Series +1

Paper
Add Code

LighTN: Light-weight Transformer Network for Performance-overhead Tradeoff in Point Cloud Downsampling

no code implementations • 13 Feb 2022 • Xu Wang, Yi Jin, Yigang Cen, Tao Wang, Bowen Tang, Yidong Li

Compared with traditional task-irrelevant downsampling methods, task-oriented neural networks have shown improved performance in point cloud downsampling range.

Paper
Add Code

SODAR: Segmenting Objects by DynamicallyAggregating Neighboring Mask Representations

1 code implementation • 15 Feb 2022 • Tao Wang, Jun Hao Liew, Yu Li, Yunpeng Chen, Jiashi Feng

Unlike the original per grid cell object masks, SODAR is implicitly supervised to learn mask representations that encode geometric structure of nearby objects and complement adjacent representations with context.

Instance Segmentation Object +1

Paper
Code

Singing-Tacotron: Global duration control attention and dynamic filter for End-to-end singing voice synthesis

no code implementations • 16 Feb 2022 • Tao Wang, Ruibo Fu, Jiangyan Yi, JianHua Tao, Zhengqi Wen

Firstly, we propose a global duration control attention mechanism for the SVS model.

Singing Voice Synthesis

Paper
Add Code

Single UHD Image Dehazing via Interpretable Pyramid Network

1 code implementation • 17 Feb 2022 • Boxue Xiao, Zhuoran Zheng, Xiang Chen, Chen Lv, Yunliang Zhuang, Tao Wang

Currently, most single image dehazing models cannot run an ultra-high-resolution (UHD) image with a single GPU shader in real-time.

4k Image Dehazing +1

Paper
Code

ADD 2022: the First Audio Deep Synthesis Detection Challenge

no code implementations • 17 Feb 2022 • Jiangyan Yi, Ruibo Fu, JianHua Tao, Shuai Nie, Haoxin Ma, Chenglong Wang, Tao Wang, Zhengkun Tian, Ye Bai, Cunhang Fan, Shan Liang, Shiming Wang, Shuai Zhang, Xinrui Yan, Le Xu, Zhengqi Wen, Haizhou Li, Zheng Lian, Bin Liu

Audio deepfake detection is an emerging topic, which was included in the ASVspoof 2021.

Audio Generation DeepFake Detection +1

Paper
Add Code

CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing

1 code implementation • 21 Feb 2022 • Tao Wang, Jiangyan Yi, Ruibo Fu, JianHua Tao, Zhengqi Wen

It can solve unnatural prosody in the edited region and synthesize the speech corresponding to the unseen words in the transcript.

Few-Shot Learning Sentence

164

Paper
Code

DAMO-NLP at SemEval-2022 Task 11: A Knowledge-based System for Multilingual Named Entity Recognition

1 code implementation • SemEval (NAACL) 2022 • Xinyu Wang, Yongliang Shen, Jiong Cai, Tao Wang, Xiaobin Wang, Pengjun Xie, Fei Huang, Weiming Lu, Yueting Zhuang, Kewei Tu, Wei Lu, Yong Jiang

Our system wins 10 out of 13 tracks in the MultiCoNER shared task.

Multilingual Named Entity Recognition Named Entity Recognition +1

171

Paper
Code

NeuralDPS: Neural Deterministic Plus Stochastic Model with Multiband Excitation for Noise-Controllable Waveform Generation

no code implementations • 5 Mar 2022 • Tao Wang, Ruibo Fu, Jiangyan Yi, JianHua Tao, Zhengqi Wen

We have also verified through experiments that this method can effectively control the noise components in the predicted speech and adjust the SNR of speech.

Paper
Add Code

An STDP-Based Supervised Learning Algorithm for Spiking Neural Networks

no code implementations • 7 Mar 2022 • Zhanhao Hu, Tao Wang, Xiaolin Hu

Compared with rate-based artificial neural networks, Spiking Neural Networks (SNN) provide a more biological plausible model for the brain.

Paper
Add Code

End-to-end video instance segmentation via spatial-temporal graph neural networks

1 code implementation • ICCV 2021 • Tao Wang, Ning Xu, Kean Chen, Weiyao Lin

Specifically, graph nodes representing instance features are used for detection and segmentation while graph edges representing instance relations are used for tracking.

Instance Segmentation Segmentation +2

Paper
Code

Geometric Synthesis: A Free lunch for Large-scale Palmprint Recognition Model Pretraining

no code implementations • 11 Mar 2022 • Kai Zhao, Lei Shen, Yingyi Zhang, Chuhan Zhou, Tao Wang, Ruixin Zhang, Shouhong Ding, Wei Jia, Wei Shen

In this paper, by observing that palmar creases are the key information to deep-learning-based palmprint recognition, we propose to synthesize training data by manipulating palmar creases.

TAR

Paper
Add Code

EventFormer: AU Event Transformer for Facial Action Unit Event Detection

no code implementations • 12 Mar 2022 • Yingjie Chen, Jiarui Zhang, Tao Wang, Yun Liang

Facial action units (AUs) play an indispensable role in human emotion analysis.

Emotion Recognition Event Detection

Paper
Add Code

A Long Short-term Memory Based Recurrent Neural Network for Interventional MRI Reconstruction

no code implementations • 28 Mar 2022 • Ruiyang Zhao, Zhao He, Tao Wang, Suhao Qiu, Pawel Herman, Yanle Hu, Chencheng Zhang, Dinggang Shen, Bomin Sun, Guang-Zhong Yang, Yuan Feng

Here we proposed a convolutional long short-term memory (Conv-LSTM) based recurrent neural network (RNN), or ConvLR, to reconstruct interventional images with golden-angle radial sampling.

MRI Reconstruction

Paper
Add Code

PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision

1 code implementation • CVPR 2022 • Kehong Gong, Bingbing Li, Jianfeng Zhang, Tao Wang, Jing Huang, Michael Bi Mi, Jiashi Feng, Xinchao Wang

Existing self-supervised 3D human pose estimation schemes have largely relied on weak supervisions like consistency loss to guide the learning, which, inevitably, leads to inferior results in real-world scenarios with unseen poses.

Ranked #37 on 3D Human Pose Estimation on MPI-INF-3DHP

3D Human Pose Estimation Hallucination

306

Paper
Code

GigaST: A 10,000-hour Pseudo Speech Translation Corpus

1 code implementation • 8 Apr 2022 • Rong Ye, Chengqi Zhao, Tom Ko, Chutong Meng, Tao Wang, Mingxuan Wang, Jun Cao

The training set is translated by a strong machine translation system and the test set is translated by human.

Machine Translation Translation

Paper
Code

Causal Intervention for Subject-Deconfounded Facial Action Unit Recognition

no code implementations • 17 Apr 2022 • Yingjie Chen, Diqi Chen, Tao Wang, Yizhou Wang, Yun Liang

Subject-invariant facial action unit (AU) recognition remains challenging for the reason that the data distribution varies among subjects.

Causal Inference Facial Action Unit Detection

Paper
Add Code

Understanding CNNs from excitations

no code implementations • 2 May 2022 • Zijian Ying, Qianmu Li, Zhichao Lian, Jun Hou, Tong Lin, Tao Wang

To organize these excitations into final saliency maps, we introduce a double-chain backpropagation procedure.

Binary Classification

Paper
Add Code

FRIH: Fine-grained Region-aware Image Harmonization

no code implementations • 13 May 2022 • Jinlong Peng, Zekun Luo, Liang Liu, Boshen Zhang, Tao Wang, Yabiao Wang, Ying Tai, Chengjie Wang, Weiyao Lin

Image harmonization aims to generate a more realistic appearance of foreground and background for a composite image.

Image Harmonization

Paper
Add Code

The Value of Information in Stopping Problems

no code implementations • 13 May 2022 • Ehud Lehrer, Tao Wang

We consider stopping problems in which a decision maker (DM) faces an unknown state of nature and decides sequentially whether to stop and take an irreversible action; pay a fee and obtain additional information; or wait without acquiring information.

Paper
Add Code

Uncertainty-based Network for Few-shot Image Classification

no code implementations • 17 May 2022 • Minglei Yuan, Qian Xu, Chunhao Cai, Yin-Dong Zheng, Tao Wang, Tong Lu

Specifically, we first data augment and classify the query instance and calculate the mutual information of these classification scores.

Classification Few-Shot Image Classification +1

Paper
Add Code

Blind Surveillance Image Quality Assessment via Deep Neural Network Combined with the Visual Saliency

no code implementations • 9 Jun 2022 • Wei Lu, Wei Sun, Wenhan Zhu, Xiongkuo Min, ZiCheng Zhang, Tao Wang, Guangtao Zhai

In this paper, we first conduct an example experiment (i. e. the face detection task) to demonstrate that the quality of the SIs has a crucial impact on the performance of the IVSS, and then propose a saliency-based deep neural network for the blind quality assessment of the SIs, which helps IVSS to filter the low-quality SIs and improve the detection and recognition performance.

Face Detection Image Quality Assessment

Paper
Add Code

Deep Neural Network for Blind Visual Quality Assessment of 4K Content

no code implementations • 9 Jun 2022 • Wei Lu, Wei Sun, Xiongkuo Min, Wenhan Zhu, Quan Zhou, Jun He, Qiyuan Wang, ZiCheng Zhang, Tao Wang, Guangtao Zhai

In this paper, we propose a deep learning-based BIQA model for 4K content, which on one hand can recognize true and pseudo 4K content and on the other hand can evaluate their perceptual visual quality.

4k Blind Image Quality Assessment +1

Paper
Add Code

A No-Reference Deep Learning Quality Assessment Method for Super-resolution Images Based on Frequency Maps

no code implementations • 9 Jun 2022 • ZiCheng Zhang, Wei Sun, Xiongkuo Min, Wenhan Zhu, Tao Wang, Wei Lu, Guangtao Zhai

Therefore, in this paper, we propose a no-reference deep-learning image quality assessment method based on frequency maps because the artifacts caused by SISR algorithms are quite sensitive to frequency information.

Image Quality Assessment Image Super-Resolution

Paper
Add Code

A No-reference Quality Assessment Metric for Point Cloud Based on Captured Video Sequences

no code implementations • 9 Jun 2022 • Yu Fan, ZiCheng Zhang, Wei Sun, Xiongkuo Min, Wei Lu, Tao Wang, Ning Liu, Guangtao Zhai

Point cloud is one of the most widely used digital formats of 3D models, the visual quality of which is quite sensitive to distortions such as downsampling, noise, and compression.

Point Cloud Quality Assessment

Paper
Add Code

Subjective Quality Assessment for Images Generated by Computer Graphics

no code implementations • 10 Jun 2022 • Tao Wang, ZiCheng Zhang, Wei Sun, Xiongkuo Min, Wei Lu, Guangtao Zhai

However, limited work has been put forward to tackle the problem of computer graphics generated images' quality assessment (CG-IQA).

No-Reference Image Quality Assessment NR-IQA

Paper
Add Code

Multi-View Imputation and Cross-Attention Network Based on Incomplete Longitudinal and Multimodal Data for Conversion Prediction of Mild Cognitive Impairment

1 code implementation • 16 Jun 2022 • Tao Wang, Xiumei Chen, Xiaoling Zhang, Shuoling Zhou, Qianjin Feng, Meiyan Huang

To address these challenges, a multi-view imputation and cross-attention network (MCNet) was proposed to integrate data imputation and MCI conversion prediction in a unified framework.

Disease Prediction Imputation +1

Paper
Code

SJ-HD^2R: Selective Joint High Dynamic Range and Denoising Imaging for Dynamic Scenes

no code implementations • 20 Jun 2022 • Wei Li, Shuai Xiao, Tianhong Dai, Shanxin Yuan, Tao Wang, Cheng Li, Fenglong Song

To further leverage these two paradigms, we propose a selective and joint HDR and denoising (SJ-HD$^2$R) imaging framework, utilizing scenario-specific priors to conduct the path selection with an accuracy of more than 93. 3$\%$.

Denoising

Paper
Add Code

Boosting R-CNN: Reweighting R-CNN Samples by RPN's Error for Underwater Object Detection

2 code implementations • 28 Jun 2022 • Pinhao Song, Pengteng Li, Linhui Dai, Tao Wang, Zhan Chen

This work aims to solve the problem from two perspectives: uncertainty modeling and hard example mining.

Ranked #70 on Object Detection on COCO test-dev

Object object-detection +2

Paper
Code

On Mitigating Hard Clusters for Face Clustering

1 code implementation • 25 Jul 2022 • Yingjie Chen, Huasong Zhong, Chong Chen, Chen Shen, Jianqiang Huang, Tao Wang, Yun Liang, Qianru Sun

Face clustering is a promising way to scale up face recognition systems using large-scale unlabeled face images.

Clustering Face Clustering +1

Paper
Code

P2ANet: A Dataset and Benchmark for Dense Action Detection from Table Tennis Match Broadcasting Videos

no code implementations • 26 Jul 2022 • Jiang Bian, Xuhong LI, Tao Wang, Qingzhong Wang, Jun Huang, Chen Liu, Jun Zhao, Feixiang Lu, Dejing Dou, Haoyi Xiong

While deep learning has been widely used for video analytics, such as video classification and action detection, dense action detection with fast-moving subjects from sports videos is still challenging.

Action Detection Action Localization +2

Paper
Add Code

Multiple Instance Neural Networks Based on Sparse Attention for Cancer Detection using T-cell Receptor Sequences

no code implementations • 9 Aug 2022 • Younghoon Kim, Tao Wang, Danyi Xiong, Xinlei Wang, Seongoh Park

Among different types of data used to answer this biological question, studies based on T cell receptors (TCRs) are under recent spotlight due to the growing appreciation of the roles of the host immunity system in tumor biology.

Multiple Instance Learning

Paper
Add Code

An Initial Investigation for Detecting Vocoder Fingerprints of Fake Audio

no code implementations • 20 Aug 2022 • Xinrui Yan, Jiangyan Yi, JianHua Tao, Chenglong Wang, Haoxin Ma, Tao Wang, Shiming Wang, Ruibo Fu

Many effective attempts have been made for fake audio detection.

Paper
Add Code

DLUNet: Semi-supervised Learning based Dual-Light UNet for Multi-organ Segmentation

1 code implementation • 22 Sep 2022 • Haoran Lai, Tao Wang, Shuoling Zhou

In the training phase, it consists of two light UNets, which make full use of label and unlabeled data simultaneously by using consistent-based learning.

Organ Segmentation

Paper
Code

Rethinking Image Restoration for Object Detection

1 code implementation • NIPS 2022 • Shangquan Sun, Wenqi Ren, Tao Wang, Xiaochun Cao

To address the issue, we propose a targeted adversarial attack in the restoration procedure to boost object detection performance after restoration.

Adversarial Attack Domain Adaptation +5

Paper
Code

A Survey of Deep Face Restoration: Denoise, Super-Resolution, Deblur, Artifact Removal

1 code implementation • 5 Nov 2022 • Tao Wang, Kaihao Zhang, Xuanxi Chen, Wenhan Luo, Jiankang Deng, Tong Lu, Xiaochun Cao, Wei Liu, Hongdong Li, Stefanos Zafeiriou

Second, we discuss the challenges of face restoration.

Image Restoration Super-Resolution

367

Paper
Code

Towards Real World HDRTV Reconstruction: A Data Synthesis-based Approach

no code implementations • 6 Nov 2022 • Zhen Cheng, Tao Wang, Yong Li, Fenglong Song, Chang Chen, Zhiwei Xiong

To solve this problem, we propose a learning-based data synthesis approach to learn the properties of real-world SDRTVs by integrating several tone mapping priors into both network structures and loss functions.

Tone Mapping

Paper
Add Code

Emotion Selectable End-to-End Text-based Speech Editing

no code implementations • 20 Dec 2022 • Tao Wang, Jiangyan Yi, Ruibo Fu, JianHua Tao, Zhengqi Wen, Chu Yuan Zhang

To achieve this task, we propose Emo-CampNet (emotion CampNet), which can provide the option of emotional attributes for the generated speech in text-based speech editing and has the one-shot ability to edit unseen speakers' speech.

Data Augmentation

Paper
Add Code

Ultra-High-Definition Low-Light Image Enhancement: A Benchmark and Transformer-Based Method

1 code implementation • 22 Dec 2022 • Tao Wang, Kaihao Zhang, Tianrun Shen, Wenhan Luo, Bjorn Stenger, Tong Lu

In this paper, we consider the task of low-light image enhancement (LLIE) and introduce a large-scale database consisting of images at 4K and 8K resolution.

4k 8k +3

143

Paper
Code

Restoring Vision in Hazy Weather with Hierarchical Contrastive Learning

no code implementations • 22 Dec 2022 • Tao Wang, Guangpin Tao, Wanglong Lu, Kaihao Zhang, Wenhan Luo, Xiaoqin Zhang, Tong Lu

HCD consists of a hierarchical dehazing network (HDN) and a novel hierarchical contrastive loss (HCL).

Contrastive Learning Image Dehazing +3

Paper
Add Code

Learning to Detect and Segment for Open Vocabulary Object Detection

no code implementations • CVPR 2023 • Tao Wang, Nan Li

With such a conditional design, the detection model is bridged by the semantic embedding to offer strongly generalizable class-wise box and mask prediction.

Object object-detection +1

Paper
Add Code

UnifySpeech: A Unified Framework for Zero-shot Text-to-Speech and Voice Conversion

no code implementations • 10 Jan 2023 • Haogeng Liu, Tao Wang, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, JianHua Tao

Text-to-speech (TTS) and voice conversion (VC) are two different tasks both aiming at generating high quality speaking voice according to different input modality.

Quantization Voice Conversion

Paper
Add Code

Robust Remote Sensing Scene Classification with Multi-View Voting and Entropy Ranking

no code implementations • 14 Jan 2023 • Jinyang Wang, Tao Wang, Min Gan, George Hadjichristofi

Deep convolutional neural networks have been widely used in scene classification of remotely sensed images.

Scene Classification

Paper
Add Code

Mortality Prediction with Adaptive Feature Importance Recalibration for Peritoneal Dialysis Patients: a deep-learning-based study on a real-world longitudinal follow-up dataset

1 code implementation • 17 Jan 2023 • Liantao Ma, Chaohe Zhang, Junyi Gao, Xianfeng Jiao, Zhihao Yu, Xinyu Ma, Yasha Wang, Wen Tang, Xinju Zhao, Wenjie Ruan, Tao Wang

Here, our objective is to develop a deep learning model for a real-time, individualized, and interpretable mortality prediction model - AICare.

Feature Importance Mortality Prediction

Paper
Code

A Multi-Scale Framework for Out-of-Distribution Detection in Dermoscopic Images

no code implementations • 18 Jan 2023 • Zhongzheng Huang, Tao Wang, Yuanzheng Cai, Lingyu Liang

The automatic detection of skin diseases via dermoscopic images can improve the efficiency in diagnosis and help doctors make more accurate judgments.

Out-of-Distribution Detection

Paper
Add Code

Spatio-Temporal Context Modeling for Road Obstacle Detection

no code implementations • 19 Jan 2023 • Xiuen Wu, Tao Wang, Lingyu Liang, Zuoyong Li, Fum Yew Ching

The results indicate that our method with spatio-temporal context modeling is superior to existing methods for road obstacle detection.

object-detection Object Detection +1

Paper
Add Code

Spatio-Temporal Point Process for Multiple Object Tracking

no code implementations • 5 Feb 2023 • Tao Wang, Kean Chen, Weiyao Lin, John See, Zenghui Zhang, Qian Xu, Xia Jia

As such, we propose a novel framework that can effectively predict and mask-out the noisy and confusing detection results before associating the objects into trajectories.

Multiple Object Tracking Object

Paper
Add Code

Noise2Music: Text-conditioned Music Generation with Diffusion Models

no code implementations • 8 Feb 2023 • Qingqing Huang, Daniel S. Park, Tao Wang, Timo I. Denk, Andy Ly, Nanxin Chen, Zhengdong Zhang, Zhishuai Zhang, Jiahui Yu, Christian Frank, Jesse Engel, Quoc V. Le, William Chan, Zhifeng Chen, Wei Han

We introduce Noise2Music, where a series of diffusion models is trained to generate high-quality 30-second music clips from text prompts.

Ranked #2 on Text-to-Music Generation on MusicCaps

Music Generation Text-to-Music Generation

Paper
Add Code

Feature Completion Transformer for Occluded Person Re-identification

no code implementations • 3 Mar 2023 • Tao Wang, Mengyuan Liu, Hong Liu, Wenhao Li, Miaoju Ban, Tuanyu Guo, Yidi Li

In this paper, different from most previous works that discard the occluded region, we propose a Feature Completion Transformer (FCFormer) to implicitly complement the semantic information of occluded parts in the feature space.

Person Re-Identification

Paper
Add Code

The Cascaded Forward Algorithm for Neural Network Training

1 code implementation • 17 Mar 2023 • Gongpei Zhao, Tao Wang, Yidong Li, Yi Jin, Congyan Lang, Haibin Ling

Backpropagation algorithm has been widely used as a mainstream learning procedure for neural networks in the past decade, and has played a significant role in the development of deep learning.

Image Classification

Paper
Code

GRIG: Few-Shot Generative Residual Image Inpainting

no code implementations • 24 Apr 2023 • Wanglong Lu, Xianta Jiang, Xiaogang Jin, Yong-Liang Yang, Minglun Gong, Tao Wang, Kaijie Shi, Hanli Zhao

Image inpainting is the task of filling in missing or masked region of an image with semantically meaningful contents.

Image Inpainting

Paper
Add Code

A Soft Coordination Method of Heterogeneous Devices in Distribution System Voltage Control

no code implementations • 4 May 2023 • Licheng Wang, Tao Wang, Gang Huang, Ruifeng Yan, Kai Wang, Youbing Zhang, Shijie Cheng

The proposed method achieves the soft coordination by establishing a modified actor-critic algorithm to train a proxy model of inverters.

Decision Making

Paper
Add Code

RFR-WWANet: Weighted Window Attention-Based Recovery Feature Resolution Network for Unsupervised Image Registration

1 code implementation • 7 May 2023 • Mingrui Ma, Tao Wang, Lei Song, Weijie Wang, Guixia Liu

Furthermore, shifted window partitioning operations are inflexible, indicating that they cannot perceive the semantic information over uncertain distances and automatically bridge the global connections between windows.

Computational Efficiency Long-range modeling +1

Paper
Code

Student Classroom Behavior Detection based on YOLOv7-BRA and Multi-Model Fusion

1 code implementation • 13 May 2023 • Fan Yang, Tao Wang, Xiaofei Wang

We constructed a dataset, which contained 11, 248 labels and 4, 001 images, with an emphasis on the common behavior of raising hands in a classroom setting (Student Classroom Behavior dataset, SCB-Dataset).

Paper
Code

PaLM 2 Technical Report

1 code implementation • 17 May 2023 • Rohan Anil, Andrew M. Dai, Orhan Firat, Melvin Johnson, Dmitry Lepikhin, Alexandre Passos, Siamak Shakeri, Emanuel Taropa, Paige Bailey, Zhifeng Chen, Eric Chu, Jonathan H. Clark, Laurent El Shafey, Yanping Huang, Kathy Meier-Hellstern, Gaurav Mishra, Erica Moreira, Mark Omernick, Kevin Robinson, Sebastian Ruder, Yi Tay, Kefan Xiao, Yuanzhong Xu, Yujing Zhang, Gustavo Hernandez Abrego, Junwhan Ahn, Jacob Austin, Paul Barham, Jan Botha, James Bradbury, Siddhartha Brahma, Kevin Brooks, Michele Catasta, Yong Cheng, Colin Cherry, Christopher A. Choquette-Choo, Aakanksha Chowdhery, Clément Crepy, Shachi Dave, Mostafa Dehghani, Sunipa Dev, Jacob Devlin, Mark Díaz, Nan Du, Ethan Dyer, Vlad Feinberg, Fangxiaoyu Feng, Vlad Fienber, Markus Freitag, Xavier Garcia, Sebastian Gehrmann, Lucas Gonzalez, Guy Gur-Ari, Steven Hand, Hadi Hashemi, Le Hou, Joshua Howland, Andrea Hu, Jeffrey Hui, Jeremy Hurwitz, Michael Isard, Abe Ittycheriah, Matthew Jagielski, Wenhao Jia, Kathleen Kenealy, Maxim Krikun, Sneha Kudugunta, Chang Lan, Katherine Lee, Benjamin Lee, Eric Li, Music Li, Wei Li, Yaguang Li, Jian Li, Hyeontaek Lim, Hanzhao Lin, Zhongtao Liu, Frederick Liu, Marcello Maggioni, Aroma Mahendru, Joshua Maynez, Vedant Misra, Maysam Moussalem, Zachary Nado, John Nham, Eric Ni, Andrew Nystrom, Alicia Parrish, Marie Pellat, Martin Polacek, Alex Polozov, Reiner Pope, Siyuan Qiao, Emily Reif, Bryan Richter, Parker Riley, Alex Castro Ros, Aurko Roy, Brennan Saeta, Rajkumar Samuel, Renee Shelby, Ambrose Slone, Daniel Smilkov, David R. So, Daniel Sohn, Simon Tokumine, Dasha Valter, Vijay Vasudevan, Kiran Vodrahalli, Xuezhi Wang, Pidong Wang, ZiRui Wang, Tao Wang, John Wieting, Yuhuai Wu, Kelvin Xu, Yunhan Xu, Linting Xue, Pengcheng Yin, Jiahui Yu, Qiao Zhang, Steven Zheng, Ce Zheng, Weikang Zhou, Denny Zhou, Slav Petrov, Yonghui Wu

Through extensive evaluations on English and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on downstream tasks across different model sizes, while simultaneously exhibiting faster and more efficient inference compared to PaLM.

Ranked #1 on Question Answering on StrategyQA

Code Generation Common Sense Reasoning +6

Paper
Code

Graph Propagation Transformer for Graph Representation Learning

1 code implementation • 19 May 2023 • Zhe Chen, Hao Tan, Tao Wang, Tianrun Shen, Tong Lu, Qiuying Peng, Cheng Cheng, Yue Qi

The core insight of our method is to fully consider the information propagation among nodes and edges in a graph when building the attention module in the transformer blocks.

Ranked #2 on Graph Regression on PCQM4M-LSC (Validation MAE metric)

Graph Learning Graph Property Prediction +3

Paper
Code

Improving speech translation by fusing speech and text

no code implementations • 23 May 2023 • Wenbiao Yin, Zhicheng Liu, Chengqi Zhao, Tao Wang, Jian Tong, Rong Ye

To tackle these gaps, we propose \textbf{F}use-\textbf{S}peech-\textbf{T}ext (\textbf{FST}), a cross-modal model which supports three distinct input modalities for translation: speech, text, and fused speech-text.

Machine Translation Translation

Paper
Add Code

GridFormer: Residual Dense Transformer with Grid Structure for Image Restoration in Adverse Weather Conditions

no code implementations • 29 May 2023 • Tao Wang, Kaihao Zhang, Ziqian Shao, Wenhan Luo, Bjorn Stenger, Tong Lu, Tae-Kyun Kim, Wei Liu, Hongdong Li

Second, we introduce a residual dense transformer block (RDTB) as the final GridFormer layer.

Image Restoration Rain Removal

Paper
Add Code

Boosting Fast and High-Quality Speech Synthesis with Linear Diffusion

no code implementations • 9 Jun 2023 • Haogeng Liu, Tao Wang, Jie Cao, Ran He, JianHua Tao

When decreasing the number of sampling steps (i. e., the number of line segments used to fit the path), the ease of fitting straight lines compared to curves allows us to generate higher quality samples from a random noise with fewer iterations.

Denoising Speech Synthesis

Paper
Add Code

Valley: Video Assistant with Large Language model Enhanced abilitY

1 code implementation • 12 Jun 2023 • Ruipu Luo, Ziwang Zhao, Min Yang, Junwei DOng, Da Li, Pengcheng Lu, Tao Wang, Linmei Hu, Minghui Qiu, Zhongyu Wei

Large language models (LLMs), with their remarkable conversational capabilities, have demonstrated impressive performance across various applications and have emerged as formidable AI assistants.

Action Recognition Instruction Following +4

163

Paper
Code

PCDAL: A Perturbation Consistency-Driven Active Learning Approach for Medical Image Segmentation and Classification

1 code implementation • 29 Jun 2023 • Tao Wang, Xinlin Zhang, Yuanbo Zhou, Junlin Lan, Tao Tan, Min Du, Qinquan Gao, Tong Tong

To address this limitation, we propose an AL-based method that can be simultaneously applied to 2D medical image classification, segmentation, and 3D medical image segmentation tasks.

Active Learning Image Classification +5

Paper
Code

Improving Text Matching in E-Commerce Search with A Rationalizable, Intervenable and Fast Entity-Based Relevance Model

no code implementations • 1 Jul 2023 • Jiong Cai, Yong Jiang, Yue Zhang, Chengyue Jiang, Ke Yu, Jianhui Ji, Rong Xiao, Haihong Tang, Tao Wang, Zhongqiang Huang, Pengjun Xie, Fei Huang, Kewei Tu

We also show that pretraining the QE module with auto-generated QE data from user logs can further improve the overall performance.

Text Matching

Paper
Add Code

Seeing is not Believing: An Identity Hider for Human Vision Privacy Protection

no code implementations • 2 Jul 2023 • Tao Wang, Yushu Zhang, Zixuan Yang, Hua Zhang, Zhongyun Hua

Massive captured face images are stored in the database for the identification of individuals.

Attribute Disentanglement +1

Paper
Add Code

BLEURT Has Universal Translations: An Analysis of Automatic Metrics by Minimum Risk Training

no code implementations • 6 Jul 2023 • Yiming Yan, Tao Wang, Chengqi Zhao, ShuJian Huang, Jiajun Chen, Mingxuan Wang

In this study, we systematically analyze and compare various mainstream and cutting-edge automatic metrics from the perspective of their guidance for training machine translation systems.

Machine Translation Sentence +1

Paper
Add Code

Joint Adversarial and Collaborative Learning for Self-Supervised Action Recognition

1 code implementation • 15 Jul 2023 • Tianyu Guo, Mengyuan Liu, Hong Liu, Wenhao Li, Jingwen Guo, Tao Wang, Yidi Li

Considering the instance-level discriminative ability, contrastive learning methods, including MoCo and SimCLR, have been adapted from the original image representation learning task to solve the self-supervised skeleton-based action recognition task.

Contrastive Learning Ensemble Learning +4

Paper
Code

Bridging the Gap: Multi-Level Cross-Modality Joint Alignment for Visible-Infrared Person Re-Identification

1 code implementation • 17 Jul 2023 • Tengfei Liang, Yi Jin, Wu Liu, Tao Wang, Songhe Feng, Yidong Li

Visible-Infrared person Re-IDentification (VI-ReID) is a challenging cross-modality image retrieval task that aims to match pedestrians' images across visible and infrared cameras.

Cross-Modality Person Re-identification Image Classification +4

Paper
Code

An Intelligent Remote Sensing Image Quality Inspection System

1 code implementation • 22 Jul 2023 • Yijiong Yu, Tao Wang, Kang Ran, Chang Li, Hao Wu

Due to the inevitable presence of quality problems, quality inspection of remote sensing images is indeed an indispensable step between the acquisition and the application of them.

Image Classification Semantic Segmentation

Paper
Code

Semi-Supervised Medical Image Segmentation with Co-Distribution Alignment

no code implementations • 24 Jul 2023 • Tao Wang, Zhongzheng Huang, Jiawei Wu, Yuanzheng Cai, Zuoyong Li

Medical image segmentation has made significant progress when a large amount of labeled data are available.

Image Segmentation Segmentation +2

Paper
Add Code

LLDiffusion: Learning Degradation Representations in Diffusion Models for Low-Light Image Enhancement

1 code implementation • 27 Jul 2023 • Tao Wang, Kaihao Zhang, Ziqian Shao, Wenhan Luo, Bjorn Stenger, Tae-Kyun Kim, Wei Liu, Hongdong Li

In this paper, we address this limitation by proposing a degradation-aware learning scheme for LLIE using diffusion models, which effectively integrates degradation and image priors into the diffusion process, resulting in improved image enhancement.

Image Generation Low-Light Image Enhancement

Paper
Code

Minimally-Supervised Speech Synthesis with Conditional Diffusion Model and Language Model: A Comparative Study of Semantic Coding

no code implementations • 28 Jul 2023 • Chunyu Qiang, Hao Li, Hao Ni, He Qu, Ruibo Fu, Tao Wang, Longbiao Wang, Jianwu Dang

However, existing methods suffer from three problems: the high dimensionality and waveform distortion of discrete speech representations, the prosodic averaging problem caused by the duration prediction model in non-autoregressive frameworks, and the information redundancy and dimension explosion problems of existing semantic encoding methods.

Language Modelling Speech Synthesis

Paper
Add Code

Class-Specific Distribution Alignment for Semi-Supervised Medical Image Classification

no code implementations • 29 Jul 2023 • Zhongzheng Huang, Jiawei Wu, Tao Wang, Zuoyong Li, Anastasia Ioannou

Despite the success of deep neural networks in medical image classification, the problem remains challenging as data annotation is time-consuming, and the class distribution is imbalanced due to the relative scarcity of diseases.

Image Classification Semi-supervised Medical Image Classification

Paper
Add Code

Few-shot Class-Incremental Semantic Segmentation via Pseudo-Labeling and Knowledge Distillation

1 code implementation • 5 Aug 2023 • Chengjia Jiang, Tao Wang, Sien Li, Jinyang Wang, Shirui Wang, Antonios Antoniou

Given only one or a few images labeled with the novel classes and a much larger set of unlabeled images, we transfer the knowledge from labeled images to unlabeled images with a coarse-to-fine pseudo-labeling approach in two steps.

Class-Incremental Semantic Segmentation Knowledge Distillation

Paper
Code

HGDNet: A Height-Hierarchy Guided Dual-Decoder Network for Single View Building Extraction and Height Estimation

no code implementations • 10 Aug 2023 • Chaoran Lu, Ningning Cao, Pan Zhang, Ting Liu, Baochai Peng, Guozhang Liu, Mengke Yuan, Sen Zhang, Simin Huang, Tao Wang

Unifying the correlative single-view satellite image building extraction and height estimation tasks indicates a promising way to share representations and acquire generalist model for large-scale urban 3D reconstruction.

3D Reconstruction

Paper
Add Code

Deep Semantic Graph Matching for Large-scale Outdoor Point Clouds Registration

no code implementations • 10 Aug 2023 • Shaocong Liu, Tao Wang, Yan Zhang, Ruqin Zhou, Li Li, Chenguang Dai, Yongsheng Zhang, Longguang Wang, Hanyun Wang

The adjacent points with the same category labels are then clustered together using the Euclidean clustering algorithm to obtain the semantic instances, which are represented by three kinds of attributes including spatial location information, semantic categorical information, and global geometric shape information.

Graph Matching Point Cloud Registration +1

Paper
Add Code

Fine-grained building roof instance segmentation based on domain adapted pretraining and composite dual-backbone

no code implementations • 10 Aug 2023 • Guozhang Liu, Baochai Peng, Ting Liu, Pan Zhang, Mengke Yuan, Chaoran Lu, Ningning Cao, Sen Zhang, Simin Huang, Tao Wang

The diversity of building architecture styles of global cities situated on various landforms, the degraded optical imagery affected by clouds and shadows, and the significant inter-class imbalance of roof types pose challenges for designing a robust and accurate building roof instance segmentor.

Data Augmentation Instance Segmentation +1

Paper
Add Code

Development of a Knowledge Graph Embeddings Model for Pain

1 code implementation • 17 Aug 2023 • Jaya Chaturvedi, Tao Wang, Sumithra Velupillai, Robert Stewart, Angus Roberts

This paper describes the construction of such knowledge graph embedding models of pain concepts, extracted from the unstructured text of mental health electronic health records, combined with external knowledge created from relations described in SNOMED CT, and their evaluation on a subject-object link prediction task.

Knowledge Graph Embedding Knowledge Graph Embeddings +2

Paper
Code

FashionLOGO: Prompting Multimodal Large Language Models for Fashion Logo Embeddings

1 code implementation • 17 Aug 2023 • Yulin Su, Min Yang, Minghui Qiu, Jing Wang, Tao Wang

Logo embedding plays a crucial role in various e-commerce applications by facilitating image retrieval or recognition, such as intellectual property protection and product search.

Image Retrieval Optical Character Recognition (OCR)

Paper
Code

Blind Face Restoration for Under-Display Camera via Dictionary Guided Transformer

no code implementations • 20 Aug 2023 • Jingfan Tan, Xiaoxu Chen, Tao Wang, Kaihao Zhang, Wenhan Luo, Xiaocun Cao

However, due to the characteristics of the display, images taken by UDC suffer from significant quality degradation.

Blind Face Restoration Image Restoration

Paper
Add Code

Semantic-aware Consistency Network for Cloth-changing Person Re-Identification

1 code implementation • 27 Aug 2023 • Peini Guo, Hong Liu, Jianbing Wu, Guoquan Wang, Tao Wang

Despite recent progress in CC-ReID, existing approaches are still hindered by the interference of clothing variations since they lack effective constraints to keep the model consistently focused on clothing-irrelevant regions.

Cloth-Changing Person Re-Identification

Paper
Code

Dual-Decoder Consistency via Pseudo-Labels Guided Data Augmentation for Semi-Supervised Medical Image Segmentation

1 code implementation • 31 Aug 2023 • Yuanbin Chen, Tao Wang, Hui Tang, Longxuan Zhao, Ruige Zong, Shun Chen, Tao Tan, Xinlin Zhang, Tong Tong

In this paper, we present a novel semi-supervised learning method, Dual-Decoder Consistency via Pseudo-Labels Guided Data Augmentation (DCPA), for medical image segmentation.

Data Augmentation Image Segmentation +3

Paper
Code

Learning Speech Representation From Contrastive Token-Acoustic Pretraining

no code implementations • 1 Sep 2023 • Chunyu Qiang, Hao Li, Yixin Tian, Ruibo Fu, Tao Wang, Longbiao Wang, Jianwu Dang

However, existing contrastive learning methods in the audio field focus on extracting global descriptive information for downstream audio classification tasks, making them unsuitable for TTS, VC, and ASR tasks.

Audio Classification Automatic Speech Recognition +5

Paper
Add Code

Deep Video Restoration for Under-Display Camera

no code implementations • 9 Sep 2023 • Xuanxi Chen, Tao Wang, Ziqian Shao, Kaihao Zhang, Wenhan Luo, Tong Lu, Zikun Liu, Tae-Kyun Kim, Hongdong Li

With the pipeline, we build the first large-scale UDC video restoration dataset called PexelsUDC, which includes two subsets named PexelsUDC-T and PexelsUDC-P corresponding to different displays for UDC.

Video Restoration

Paper
Add Code

Base Station Beamforming Design for Near-field XL-IRS Beam Training

no code implementations • 12 Sep 2023 • Tao Wang, Changsheng You, Changchuan Yin

However, this approach may cause degraded beam training performance in practice due to the near-field channel model of the BS-IRS link.

Paper
Add Code

Estimation and Testing of Forecast Rationality with Many Moments

no code implementations • 18 Sep 2023 • Tae-Hwy Lee, Tao Wang

We in this paper utilize P-GMM (Cheng and Liao, 2015) moment selection procedure to select valid and relevant moments for estimating and testing forecast rationality under the flexible loss proposed by Elliott et al. (2005).

valid

Paper
Add Code

SCB-Dataset3: A Benchmark for Detecting Student Classroom Behavior

1 code implementation • 4 Oct 2023 • Fan Yang, Tao Wang

The use of deep learning methods to automatically detect students' classroom behavior is a promising approach for analyzing their class performance and improving teaching effectiveness.

Paper
Code

VCL Challenges 2023 at ICCV 2023 Technical Report: Bi-level Adaptation Method for Test-time Adaptive Object Detection

no code implementations • 13 Oct 2023 • Chenyu Lin, Yusheng He, Zhengqing Zang, Chenwei Tang, Tao Wang, Jiancheng Lv

This report outlines our team's participation in VCL Challenges B Continual Test_time Adaptation, focusing on the technical details of our approach.

object-detection Object Detection

Paper
Add Code

Controlled Decoding from Language Models

no code implementations • 25 Oct 2023 • Sidharth Mudgal, Jong Lee, Harish Ganapathy, Yaguang Li, Tao Wang, Yanping Huang, Zhifeng Chen, Heng-Tze Cheng, Michael Collins, Trevor Strohman, Jilin Chen, Alex Beutel, Ahmad Beirami

KL-regularized reinforcement learning (RL) is a popular alignment framework to control the language model responses towards high reward outcomes.

Language Modelling Multi-Objective Reinforcement Learning +1

Paper
Add Code

Pseudo Label-Guided Data Fusion and Output Consistency for Semi-Supervised Medical Image Segmentation

1 code implementation • 17 Nov 2023 • Tao Wang, Yuanbin Chen, Xinlin Zhang, Yuanbo Zhou, Junlin Lan, Bizhe Bai, Tao Tan, Min Du, Qinquan Gao, Tong Tong

Inspired by semi-supervised algorithms that use both labeled and unlabeled data for training, we propose the PLGDF framework, which builds upon the mean teacher network for segmenting medical images with less annotation.

Image Segmentation Pseudo Label +3

Paper
Code

Complementary Advantages of ChatGPTs and Human Readers in Reasoning: Evidence from English Text Reading Comprehension

no code implementations • 17 Nov 2023 • Tongquan Zhou, Yao Zhang, Siyi Cao, Yulu Li, Tao Wang

Our study reveals that human readers and ChatGPTs have their respective advantages and disadvantages in drawing inferences from text reading comprehension, unlocking a complementary relationship in text-based reasoning.

Causal Inference Reading Comprehension

Paper
Add Code

Boost Adversarial Transferability by Uniform Scale and Mix Mask Method

no code implementations • 18 Nov 2023 • Tao Wang, Zijian Ying, Qianmu Li, Zhichao Lian

To address these challenges, we propose a framework called Uniform Scale and Mix Mask Method (US-MM) for adversarial example generation.

Paper
Add Code

Learning to Skip for Language Modeling

no code implementations • 26 Nov 2023 • Dewen Zeng, Nan Du, Tao Wang, Yuanzhong Xu, Tao Lei, Zhifeng Chen, Claire Cui

Overparameterized large-scale language models have impressive generalization performance of in-context few-shot learning.

Few-Shot Learning Language Modelling

Paper
Add Code

VSFormer: Visual-Spatial Fusion Transformer for Correspondence Pruning

1 code implementation • 14 Dec 2023 • Tangfei Liao, Xiaoqin Zhang, Li Zhao, Tao Wang, Guobao Xiao

Then, we model these visual cues and correspondences by a joint visual-spatial fusion module, simultaneously embedding visual cues into correspondences for pruning.

Paper
Code

Research on Multilingual Natural Scene Text Detection Algorithm

no code implementations • 18 Dec 2023 • Tao Wang

Natural scene text detection is a significant challenge in computer vision, with tremendous potential applications in multilingual, diverse, and complex text scenarios.

Scene Text Detection Semantic Segmentation +1

Paper
Add Code

Gemini: A Family of Highly Capable Multimodal Models

no code implementations • The Keyword 2023 • Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, Fabio Viola, Malcolm Reynolds, Yuanzhong Xu, Ryan Doherty, Eli Collins, Clemens Meyer, Eliza Rutherford, Erica Moreira, Kareem Ayoub, Megha Goel, Jack Krawczyk, Ed Chi, Heng-Tze Cheng, Eric Ni, Purvi Shah, Patrick Kane, Betty Chan, Manaal Faruqui, Aliaksei Severyn, Hanzhao Lin, Yaguang Li, Yong Cheng, Mahdis Mahdieh, Mia Chen, Pei Sun, Dustin Tran, Sumit Bagri, Balaji Lakshminarayanan, Jeremiah Liu, Andras Orban, Fabian Güra, Hao Zhou, Xinying Song, Aurelien Boffy, Harish Ganapathy, Steven Zheng, HyunJeong Choe, Ágoston Weisz, Tao Zhu, Yifeng Lu, Siddharth Gopal, Jarrod Kahn, Maciej Kula, Jeff Pitman, Rushin Shah, Emanuel Taropa, Majd Al Merey, Martin Baeuml, Zhifeng Chen, Laurent El Shafey, Yujing Zhang, Olcan Sercinoglu, George Tucker, Enrique Piqueras, Maxim Krikun, Iain Barr, Nikolay Savinov, Ivo Danihelka, Becca Roelofs, Anaïs White, Anders Andreassen, Tamara von Glehn, Lakshman Yagati, Mehran Kazemi, Lucas Gonzalez, Misha Khalman, Jakub Sygnowski, Alexandre Frechette, Charlotte Smith, Laura Culp, Lev Proleev, Yi Luan, Xi Chen, James Lottes, Nathan Schucher, Federico Lebron, Alban Rrustemi, Natalie Clay, Phil Crone, Tomas Kocisky, Jeffrey Zhao, Bartek Perz, Dian Yu, Heidi Howard, Adam Bloniarz, Jack W. Rae, Han Lu, Laurent SIfre, Marcello Maggioni, Fred Alcober, Dan Garrette, Megan Barnes, Shantanu Thakoor, Jacob Austin, Gabriel Barth-Maron, William Wong, Rishabh Joshi, Rahma Chaabouni, Deeni Fatiha, Arun Ahuja, Gaurav Singh Tomar, Evan Senter, Martin Chadwick, Ilya Kornakov, Nithya Attaluri, Iñaki Iturrate, Ruibo Liu, Yunxuan Li, Sarah Cogan, Jeremy Chen, Chao Jia, Chenjie Gu, Qiao Zhang, Jordan Grimstad, Ale Jakse Hartman, Xavier Garcia, Thanumalayan Sankaranarayana Pillai, Jacob Devlin, Michael Laskin, Diego de Las Casas, Dasha Valter, Connie Tao, Lorenzo Blanco, Adrià Puigdomènech Badia, David Reitter, Mianna Chen, Jenny Brennan, Clara Rivera, Sergey Brin, Shariq Iqbal, Gabriela Surita, Jane Labanowski, Abhi Rao, Stephanie Winkler, Emilio Parisotto, Yiming Gu, Kate Olszewska, Ravi Addanki, Antoine Miech, Annie Louis, Denis Teplyashin, Geoff Brown, Elliot Catt, Jan Balaguer, Jackie Xiang, Pidong Wang, Zoe Ashwood, Anton Briukhov, Albert Webson, Sanjay Ganapathy, Smit Sanghavi, Ajay Kannan, Ming-Wei Chang, Axel Stjerngren, Josip Djolonga, Yuting Sun, Ankur Bapna, Matthew Aitchison, Pedram Pejman, Henryk Michalewski, Tianhe Yu, Cindy Wang, Juliette Love, Junwhan Ahn, Dawn Bloxwich, Kehang Han, Peter Humphreys, Thibault Sellam, James Bradbury, Varun Godbole, Sina Samangooei, Bogdan Damoc, Alex Kaskasoli, Sébastien M. R. Arnold, Vijay Vasudevan, Shubham Agrawal, Jason Riesa, Dmitry Lepikhin, Richard Tanburn, Srivatsan Srinivasan, Hyeontaek Lim, Sarah Hodkinson, Pranav Shyam, Johan Ferret, Steven Hand, Ankush Garg, Tom Le Paine, Jian Li, Yujia Li, Minh Giang, Alexander Neitz, Zaheer Abbas, Sarah York, Machel Reid, Elizabeth Cole, Aakanksha Chowdhery, Dipanjan Das, Dominika Rogozińska, Vitaliy Nikolaev, Pablo Sprechmann, Zachary Nado, Lukas Zilka, Flavien Prost, Luheng He, Marianne Monteiro, Gaurav Mishra, Chris Welty, Josh Newlan, Dawei Jia, Miltiadis Allamanis, Clara Huiyi Hu, Raoul de Liedekerke, Justin Gilmer, Carl Saroufim, Shruti Rijhwani, Shaobo Hou, Disha Shrivastava, Anirudh Baddepudi, Alex Goldin, Adnan Ozturel, Albin Cassirer, Yunhan Xu, Daniel Sohn, Devendra Sachan, Reinald Kim Amplayo, Craig Swanson, Dessie Petrova, Shashi Narayan, Arthur Guez, Siddhartha Brahma, Jessica Landon, Miteyan Patel, Ruizhe Zhao, Kevin Villela, Luyu Wang, Wenhao Jia, Matthew Rahtz, Mai Giménez, Legg Yeung, James Keeling, Petko Georgiev, Diana Mincu, Boxi Wu, Salem Haykal, Rachel Saputro, Kiran Vodrahalli, James Qin, Zeynep Cankara, Abhanshu Sharma, Nick Fernando, Will Hawkins, Behnam Neyshabur, Solomon Kim, Adrian Hutter, Priyanka Agrawal, Alex Castro-Ros, George van den Driessche, Tao Wang, Shuo-Yiin Chang, Paul Komarek, Ross Mcilroy, Mario Lučić, Guodong Zhang, Wael Farhan, Michael Sharman, Paul Natsev, Paul Michel, Yamini Bansal, Siyuan Qiao, Kris Cao, Siamak Shakeri, Christina Butterfield, Justin Chung, Paul Kishan Rubenstein, Shivani Agrawal, Arthur Mensch, Kedar Soparkar, Karel Lenc, Timothy Chung, Aedan Pope, Loren Maggiore, Jackie Kay, Priya Jhakra, Shibo Wang, Joshua Maynez, Mary Phuong, Taylor Tobin, Andrea Tacchetti, Maja Trebacz, Kevin Robinson, Yash Katariya, Sebastian Riedel, Paige Bailey, Kefan Xiao, Nimesh Ghelani, Lora Aroyo, Ambrose Slone, Neil Houlsby, Xuehan Xiong, Zhen Yang, Elena Gribovskaya, Jonas Adler, Mateo Wirth, Lisa Lee, Music Li, Thais Kagohara, Jay Pavagadhi, Sophie Bridgers, Anna Bortsova, Sanjay Ghemawat, Zafarali Ahmed, Tianqi Liu, Richard Powell, Vijay Bolina, Mariko Iinuma, Polina Zablotskaia, James Besley, Da-Woon Chung, Timothy Dozat, Ramona Comanescu, Xiance Si, Jeremy Greer, Guolong Su, Martin Polacek, Raphaël Lopez Kaufman, Simon Tokumine, Hexiang Hu, Elena Buchatskaya, Yingjie Miao, Mohamed Elhawaty, Aditya Siddhant, Nenad Tomasev, Jinwei Xing, Christina Greer, Helen Miller, Shereen Ashraf, Aurko Roy, Zizhao Zhang, Ada Ma, Angelos Filos, Milos Besta, Rory Blevins, Ted Klimenko, Chih-Kuan Yeh, Soravit Changpinyo, Jiaqi Mu, Oscar Chang, Mantas Pajarskas, Carrie Muir, Vered Cohen, Charline Le Lan, Krishna Haridasan, Amit Marathe, Steven Hansen, Sholto Douglas, Rajkumar Samuel, Mingqiu Wang, Sophia Austin, Chang Lan, Jiepu Jiang, Justin Chiu, Jaime Alonso Lorenzo, Lars Lowe Sjösund, Sébastien Cevey, Zach Gleicher, Thi Avrahami, Anudhyan Boral, Hansa Srinivasan, Vittorio Selo, Rhys May, Konstantinos Aisopos, Léonard Hussenot, Livio Baldini Soares, Kate Baumli, Michael B. Chang, Adrià Recasens, Ben Caine, Alexander Pritzel, Filip Pavetic, Fabio Pardo, Anita Gergely, Justin Frye, Vinay Ramasesh, Dan Horgan, Kartikeya Badola, Nora Kassner, Subhrajit Roy, Ethan Dyer, Víctor Campos Campos, Alex Tomala, Yunhao Tang, Dalia El Badawy, Elspeth White, Basil Mustafa, Oran Lang, Abhishek Jindal, Sharad Vikram, Zhitao Gong, Sergi Caelles, Ross Hemsley, Gregory Thornton, Fangxiaoyu Feng, Wojciech Stokowiec, Ce Zheng, Phoebe Thacker, Çağlar Ünlü, Zhishuai Zhang, Mohammad Saleh, James Svensson, Max Bileschi, Piyush Patil, Ankesh Anand, Roman Ring, Katerina Tsihlas, Arpi Vezer, Marco Selvi, Toby Shevlane, Mikel Rodriguez, Tom Kwiatkowski, Samira Daruki, Keran Rong, Allan Dafoe, Nicholas FitzGerald, Keren Gu-Lemberg, Mina Khan, Lisa Anne Hendricks, Marie Pellat, Vladimir Feinberg, James Cobon-Kerr, Tara Sainath, Maribeth Rauh, Sayed Hadi Hashemi, Richard Ives, Yana Hasson, Eric Noland, Yuan Cao, Nathan Byrd, Le Hou, Qingze Wang, Thibault Sottiaux, Michela Paganini, Jean-Baptiste Lespiau, Alexandre Moufarek, Samer Hassan, Kaushik Shivakumar, Joost van Amersfoort, Amol Mandhane, Pratik Joshi, Anirudh Goyal, Matthew Tung, Andrew Brock, Hannah Sheahan, Vedant Misra, Cheng Li, Nemanja Rakićević, Mostafa Dehghani, Fangyu Liu, Sid Mittal, Junhyuk Oh, Seb Noury, Eren Sezener, Fantine Huot, Matthew Lamm, Nicola De Cao, Charlie Chen, Sidharth Mudgal, Romina Stella, Kevin Brooks, Gautam Vasudevan, Chenxi Liu, Mainak Chain, Nivedita Melinkeri, Aaron Cohen, Venus Wang, Kristie Seymore, Sergey Zubkov, Rahul Goel, Summer Yue, Sai Krishnakumaran, Brian Albert, Nate Hurley, Motoki Sano, Anhad Mohananey, Jonah Joughin, Egor Filonov, Tomasz Kępa, Yomna Eldawy, Jiawern Lim, Rahul Rishi, Shirin Badiezadegan, Taylor Bos, Jerry Chang, Sanil Jain, Sri Gayatri Sundara Padmanabhan, Subha Puttagunta, Kalpesh Krishna, Leslie Baker, Norbert Kalb, Vamsi Bedapudi, Shuntong Lei, Anthony Yu, Oren Litvin, Xiang Zhou, Zhichun Wu, Sam Sobell, Andrea Siciliano, Alan Papir, Robby Neale, Jonas Bragagnolo, Tej Toor, Tina Chen, Valentin Anklin, Feiran Wang, Richie Feng, Milad Gholami, Kevin Ling, Lijuan Liu, Jules Walter, Hamid Moghaddam, Arun Kishore, Jakub Adamek, Tyler Mercado, Jonathan Mallinson, Siddhinita Wandekar, Stephen Cagle, Eran Ofek, Guillermo Garrido, Clemens Lombriser, Maksim Mukha, Botu Sun, Hafeezul Rahman Mohammad, Josip Matak, Yadi Qian, Vikas Peswani, Pawel Janus, Quan Yuan, Leif Schelin, Oana David, Ankur Garg, Yifan He, Oleksii Duzhyi, Anton Älgmyr, Timothée Lottaz, Qi Li, Vikas Yadav, Luyao Xu, Alex Chinien, Rakesh Shivanna, Aleksandr Chuklin, Josie Li, Carrie Spadine, Travis Wolfe, Kareem Mohamed, Subhabrata Das, Zihang Dai, Kyle He, Daniel von Dincklage, Shyam Upadhyay, Akanksha Maurya, Luyan Chi, Sebastian Krause, Khalid Salama, Pam G Rabinovitch, Pavan Kumar Reddy M, Aarush Selvan, Mikhail Dektiarev, Golnaz Ghiasi, Erdem Guven, Himanshu Gupta, Boyi Liu, Deepak Sharma, Idan Heimlich Shtacher, Shachi Paul, Oscar Akerlund, François-Xavier Aubet, Terry Huang, Chen Zhu, Eric Zhu, Elico Teixeira, Matthew Fritze, Francesco Bertolini, Liana-Eleonora Marinescu, Martin Bölle, Dominik Paulus, Khyatti Gupta, Tejasi Latkar, Max Chang, Jason Sanders, Roopa Wilson, Xuewei Wu, Yi-Xuan Tan, Lam Nguyen Thiet, Tulsee Doshi, Sid Lall, Swaroop Mishra, Wanming Chen, Thang Luong, Seth Benjamin, Jasmine Lee, Ewa Andrejczuk, Dominik Rabiej, Vipul Ranjan, Krzysztof Styrc, Pengcheng Yin, Jon Simon, Malcolm Rose Harriott, Mudit Bansal, Alexei Robsky, Geoff Bacon, David Greene, Daniil Mirylenka, Chen Zhou, Obaid Sarvana, Abhimanyu Goyal, Samuel Andermatt, Patrick Siegler, Ben Horn, Assaf Israel, Francesco Pongetti, Chih-Wei "Louis" Chen, Marco Selvatici, Pedro Silva, Kathie Wang, Jackson Tolins, Kelvin Guu, Roey Yogev, Xiaochen Cai, Alessandro Agostini, Maulik Shah, Hung Nguyen, Noah Ó Donnaile, Sébastien Pereira, Linda Friso, Adam Stambler, Adam Kurzrok, Chenkai Kuang, Yan Romanikhin, Mark Geller, ZJ Yan, Kane Jang, Cheng-Chun Lee, Wojciech Fica, Eric Malmi, Qijun Tan, Dan Banica, Daniel Balle, Ryan Pham, Yanping Huang, Diana Avram, Hongzhi Shi, Jasjot Singh, Chris Hidey, Niharika Ahuja, Pranab Saxena, Dan Dooley, Srividya Pranavi Potharaju, Eileen O'Neill, Anand Gokulchandran, Ryan Foley, Kai Zhao, Mike Dusenberry, YuAn Liu, Pulkit Mehta, Ragha Kotikalapudi, Chalence Safranek-Shrader, Andrew Goodman, Joshua Kessinger, Eran Globen, Prateek Kolhar, Chris Gorgolewski, Ali Ibrahim, Yang song, Ali Eichenbaum, Thomas Brovelli, Sahitya Potluri, Preethi Lahoti, Cip Baetu, Ali Ghorbani, Charles Chen, Andy Crawford, Shalini Pal, Mukund Sridhar, Petru Gurita, Asier Mujika, Igor Petrovski, Pierre-Louis Cedoz, Chenmei Li, Shiyuan Chen, Niccolò Dal Santo, Siddharth Goyal, Jitesh Punjabi, Karthik Kappaganthu, Chester Kwak, Pallavi LV, Sarmishta Velury, Himadri Choudhury, Jamie Hall, Premal Shah, Ricardo Figueira, Matt Thomas, Minjie Lu, Ting Zhou, Chintu Kumar, Thomas Jurdi, Sharat Chikkerur, Yenai Ma, Adams Yu, Soo Kwak, Victor Ähdel, Sujeevan Rajayogam, Travis Choma, Fei Liu, Aditya Barua, Colin Ji, Ji Ho Park, Vincent Hellendoorn, Alex Bailey, Taylan Bilal, Huanjie Zhou, Mehrdad Khatir, Charles Sutton, Wojciech Rzadkowski, Fiona Macintosh, Konstantin Shagin, Paul Medina, Jinjing Zhou, Pararth Shah, Yingying Bi, Attila Dankovics, Shipra Banga, Sabine Lehmann, Marissa Bredesen, Zifan Lin, John Eric Hoffmann, Jonathan Lai, Raynald Chung, Kai Yang, Nihal Balani, Arthur Bražinskas, Andrei Sozanschi, Matthew Hayes, Héctor Fernández Alcalde, Peter Makarov, Will Chen, Antonio Stella, Liselotte Snijders, Michael Mandl, Ante Kärrman, Paweł Nowak, Xinyi Wu, Alex Dyck, Krishnan Vaidyanathan, Raghavender R, Jessica Mallet, Mitch Rudominer, Eric Johnston, Sushil Mittal, Akhil Udathu, Janara Christensen, Vishal Verma, Zach Irving, Andreas Santucci, Gamaleldin Elsayed, Elnaz Davoodi, Marin Georgiev, Ian Tenney, Geoffrey Cideron, Edouard Leurent, Mahmoud Alnahlawi, Ionut Georgescu, Nan Wei, Ivy Zheng, Dylan Scandinaro, Heinrich Jiang, Jasper Snoek, Mukund Sundararajan, Xuezhi Wang, Zack Ontiveros, Itay Karo, Jeremy Cole, Vinu Rajashekhar, Lara Tumeh, Eyal Ben-David, Rishub Jain, Jonathan Uesato, Romina Datta, Oskar Bunyan, Shimu Wu, John Zhang, Piotr Stanczyk, Ye Zhang, David Steiner, Subhajit Naskar, Michael Azzam, Matthew Johnson, Adam Paszke, Chung-Cheng Chiu, Jaume Sanchez Elias, Afroz Mohiuddin, Faizan Muhammad, Jin Miao, Andrew Lee, Nino Vieillard, Jane Park, Jiageng Zhang, Jeff Stanway, Drew Garmon, Abhijit Karmarkar, Zhe Dong, Jong Lee, Aviral Kumar, Luowei Zhou, Jonathan Evens, William Isaac, Geoffrey Irving, Edward Loper, Michael Fink, Isha Arkatkar, Nanxin Chen, Izhak Shafran, Ivan Petrychenko, Zhe Chen, Johnson Jia, Anselm Levskaya, Zhenkai Zhu, Peter Grabowski, Yu Mao, Alberto Magni, Kaisheng Yao, Javier Snaider, Norman Casagrande, Evan Palmer, Paul Suganthan, Alfonso Castaño, Irene Giannoumis, Wooyeol Kim, Mikołaj Rybiński, Ashwin Sreevatsa, Jennifer Prendki, David Soergel, Adrian Goedeckemeyer, Willi Gierke, Mohsen Jafari, Meenu Gaba, Jeremy Wiesner, Diana Gage Wright, Yawen Wei, Harsha Vashisht, Yana Kulizhskaya, Jay Hoover, Maigo Le, Lu Li, Chimezie Iwuanyanwu, Lu Liu, Kevin Ramirez, Andrey Khorlin, Albert Cui, Tian Lin, Marcus Wu, Ricardo Aguilar, Keith Pallo, Abhishek Chakladar, Ginger Perng, Elena Allica Abellan, Mingyang Zhang, Ishita Dasgupta, Nate Kushman, Ivo Penchev, Alena Repina, Xihui Wu, Tom van der Weide, Priya Ponnapalli, Caroline Kaplan, Jiri Simsa, Shuangfeng Li, Olivier Dousse, Jeff Piper, Nathan Ie, Rama Pasumarthi, Nathan Lintz, Anitha Vijayakumar, Daniel Andor, Pedro Valenzuela, Minnie Lui, Cosmin Paduraru, Daiyi Peng, Katherine Lee, Shuyuan Zhang, Somer Greene, Duc Dung Nguyen, Paula Kurylowicz, Cassidy Hardin, Lucas Dixon, Lili Janzer, Kiam Choo, Ziqiang Feng, Biao Zhang, Achintya Singhal, Dayou Du, Dan McKinnon, Natasha Antropova, Tolga Bolukbasi, Orgad Keller, David Reid, Daniel Finchelstein, Maria Abi Raad, Remi Crocker, Peter Hawkins, Robert Dadashi, Colin Gaffney, Ken Franko, Anna Bulanova, Rémi Leblond, Shirley Chung, Harry Askham, Luis C. Cobo, Kelvin Xu, Felix Fischer, Jun Xu, Christina Sorokin, Chris Alberti, Chu-Cheng Lin, Colin Evans, Alek Dimitriev, Hannah Forbes, Dylan Banarse, Zora Tung, Mark Omernick, Colton Bishop, Rachel Sterneck, Rohan Jain, Jiawei Xia, Ehsan Amid, Francesco Piccinno, Xingyu Wang, Praseem Banzal, Daniel J. Mankowitz, Alex Polozov, Victoria Krakovna, Sasha Brown, Mohammadhossein Bateni, Dennis Duan, Vlad Firoiu, Meghana Thotakuri, Tom Natan, Matthieu Geist, Ser tan Girgin, Hui Li, Jiayu Ye, Ofir Roval, Reiko Tojo, Michael Kwong, James Lee-Thorp, Christopher Yew, Danila Sinopalnikov, Sabela Ramos, John Mellor, Abhishek Sharma, Kathy Wu, David Miller, Nicolas Sonnerat, Denis Vnukov, Rory Greig, Jennifer Beattie, Emily Caveness, Libin Bai, Julian Eisenschlos, Alex Korchemniy, Tomy Tsai, Mimi Jasarevic, Weize Kong, Phuong Dao, Zeyu Zheng, Frederick Liu, Fan Yang, Rui Zhu, Tian Huey Teh, Jason Sanmiya, Evgeny Gladchenko, Nejc Trdin, Daniel Toyama, Evan Rosen, Sasan Tavakkol, Linting Xue, Chen Elkind, Oliver Woodman, John Carpenter, George Papamakarios, Rupert Kemp, Sushant Kafle, Tanya Grunina, Rishika Sinha, Alice Talbert, Diane Wu, Denese Owusu-Afriyie, Cosmo Du, Chloe Thornton, Jordi Pont-Tuset, Pradyumna Narayana, Jing Li, Saaber Fatehi, John Wieting, Omar Ajmeri, Benigno Uria, Yeongil Ko, Laura Knight, Amélie Héliou, Ning Niu, Shane Gu, Chenxi Pang, Yeqing Li, Nir Levine, Ariel Stolovich, Rebeca Santamaria-Fernandez, Sonam Goenka, Wenny Yustalim, Robin Strudel, Ali Elqursh, Charlie Deck, Hyo Lee, Zonglin Li, Kyle Levin, Raphael Hoffmann, Dan Holtmann-Rice, Olivier Bachem, Sho Arora, Christy Koh, Soheil Hassas Yeganeh, Siim Põder, Mukarram Tariq, Yanhua Sun, Lucian Ionita, Mojtaba Seyedhosseini, Pouya Tafti, Zhiyu Liu, Anmol Gulati, Jasmine Liu, Xinyu Ye, Bart Chrzaszcz, Lily Wang, Nikhil Sethi, Tianrun Li, Ben Brown, Shreya Singh, Wei Fan, Aaron Parisi, Joe Stanton, Vinod Koverkathu, Christopher A. Choquette-Choo, Yunjie Li, TJ Lu, Abe Ittycheriah, Prakash Shroff, Mani Varadarajan, Sanaz Bahargam, Rob Willoughby, David Gaddy, Guillaume Desjardins, Marco Cornero, Brona Robenek, Bhavishya Mittal, Ben Albrecht, Ashish Shenoy, Fedor Moiseev, Henrik Jacobsson, Alireza Ghaffarkhah, Morgane Rivière, Alanna Walton, Clément Crepy, Alicia Parrish, Zongwei Zhou, Clement Farabet, Carey Radebaugh, Praveen Srinivasan, Claudia van der Salm, Andreas Fidjeland, Salvatore Scellato, Eri Latorre-Chimoto, Hanna Klimczak-Plucińska, David Bridson, Dario de Cesare, Tom Hudson, Piermaria Mendolicchio, Lexi Walker, Alex Morris, Matthew Mauger, Alexey Guseynov, Alison Reid, Seth Odoom, Lucia Loher, Victor Cotruta, Madhavi Yenugula, Dominik Grewe, Anastasia Petrushkina, Tom Duerig, Antonio Sanchez, Steve Yadlowsky, Amy Shen, Amir Globerson, Lynette Webb, Sahil Dua, Dong Li, Surya Bhupatiraju, Dan Hurt, Haroon Qureshi, Ananth Agarwal, Tomer Shani, Matan Eyal, Anuj Khare, Shreyas Rammohan Belle, Lei Wang, Chetan Tekur, Mihir Sanjay Kale, Jinliang Wei, Ruoxin Sang, Brennan Saeta, Tyler Liechty, Yao Zhao, Stephan Lee, Pandu Nayak, Doug Fritz, Manish Reddy Vuyyuru, John Aslanides, Nidhi Vyas, Martin Wicke, Xiao Ma, Evgenii Eltyshev, Nina Martin, Hardie Cate, James Manyika, Keyvan Amiri, Yelin Kim, Xi Xiong, Kai Kang, Florian Luisier, Nilesh Tripuraneni, David Madras, Mandy Guo, Austin Waters, Oliver Wang, Joshua Ainslie, Jason Baldridge, Han Zhang, Garima Pruthi, Jakob Bauer, Feng Yang, Riham Mansour, Jason Gelman, Yang Xu, George Polovets, Ji Liu, Honglong Cai, Warren Chen, XiangHai Sheng, Emily Xue, Sherjil Ozair, Christof Angermueller, Xiaowei Li, Anoop Sinha, Weiren Wang, Julia Wiesinger, Emmanouil Koukoumidis, Yuan Tian, Anand Iyer, Madhu Gurumurthy, Mark Goldenson, Parashar Shah, MK Blake, Hongkun Yu, Anthony Urbanowicz, Jennimaria Palomaki, Chrisantha Fernando, Ken Durden, Harsh Mehta, Nikola Momchev, Elahe Rahimtoroghi, Maria Georgaki, Amit Raul, Sebastian Ruder, Morgan Redshaw, Jinhyuk Lee, Denny Zhou, Komal Jalan, Dinghua Li, Blake Hechtman, Parker Schuh, Milad Nasr, Kieran Milan, Vladimir Mikulik, Juliana Franco, Tim Green, Nam Nguyen, Joe Kelley, Aroma Mahendru, Andrea Hu, Joshua Howland, Ben Vargas, Jeffrey Hui, Kshitij Bansal, Vikram Rao, Rakesh Ghiya, Emma Wang, Ke Ye, Jean Michel Sarr, Melanie Moranski Preston, Madeleine Elish, Steve Li, Aakash Kaku, Jigar Gupta, Ice Pasupat, Da-Cheng Juan, Milan Someswar, Tejvi M., Xinyun Chen, Aida Amini, Alex Fabrikant, Eric Chu, Xuanyi Dong, Amruta Muthal, Senaka Buthpitiya, Sarthak Jauhari, Nan Hua, Urvashi Khandelwal, Ayal Hitron, Jie Ren, Larissa Rinaldi, Shahar Drath, Avigail Dabush, Nan-Jiang Jiang, Harshal Godhia, Uli Sachs, Anthony Chen, Yicheng Fan, Hagai Taitelbaum, Hila Noga, Zhuyun Dai, James Wang, Chen Liang, Jenny Hamer, Chun-Sung Ferng, Chenel Elkind, Aviel Atias, Paulina Lee, Vít Listík, Mathias Carlen, Jan van de Kerkhof, Marcin Pikus, Krunoslav Zaher, Paul Müller, Sasha Zykova, Richard Stefanec, Vitaly Gatsko, Christoph Hirnschall, Ashwin Sethi, Xingyu Federico Xu, Chetan Ahuja, Beth Tsai, Anca Stefanoiu, Bo Feng, Keshav Dhandhania, Manish Katyal, Akshay Gupta, Atharva Parulekar, Divya Pitta, Jing Zhao, Vivaan Bhatia, Yashodha Bhavnani, Omar Alhadlaq, Xiaolin Li, Peter Danenberg, Dennis Tu, Alex Pine, Vera Filippova, Abhipso Ghosh, Ben Limonchik, Bhargava Urala, Chaitanya Krishna Lanka, Derik Clive, Yi Sun, Edward Li, Hao Wu, Kevin Hongtongsak, Ianna Li, Kalind Thakkar, Kuanysh Omarov, Kushal Majmundar, Michael Alverson, Michael Kucharski, Mohak Patel, Mudit Jain, Maksim Zabelin, Paolo Pelagatti, Rohan Kohli, Saurabh Kumar, Joseph Kim, Swetha Sankar, Vineet Shah, Lakshmi Ramachandruni, Xiangkai Zeng, Ben Bariach, Laura Weidinger, Amar Subramanya, Sissie Hsiao, Demis Hassabis, Koray Kavukcuoglu, Adam Sadovsky, Quoc Le, Trevor Strohman, Yonghui Wu, Slav Petrov, Jeffrey Dean, Oriol Vinyals

This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding.

Ranked #1 on Multi-task Language Understanding on MMLU

Arithmetic Reasoning Code Generation +3

Paper
Add Code

Towards Real-World Blind Face Restoration with Generative Diffusion Prior

1 code implementation • 25 Dec 2023 • Xiaoxu Chen, Jingfan Tan, Tao Wang, Kaihao Zhang, Wenhan Luo, Xiaochun Cao

We propose BFRffusion which is thoughtfully designed to effectively extract features from low-quality face images and could restore realistic and faithful facial details with the generative prior of the pretrained Stable Diffusion.

Blind Face Restoration Privacy Preserving

Paper
Code

Knowledge Enhanced Conditional Imputation for Healthcare Time-series

1 code implementation • 27 Dec 2023 • Linglong Qian, Zina Ibrahim, Hugh Logan Ellis, Ao Zhang, Yuezhou Zhang, Tao Wang, Richard Dobson

This study presents a novel approach to addressing the challenge of missing data in multivariate time series, with a particular focus on the complexities of healthcare data.

Imputation Time Series

Paper
Code

Dual Teacher Knowledge Distillation with Domain Alignment for Face Anti-spoofing

no code implementations • 2 Jan 2024 • Zhe Kong, Wentian Zhang, Tao Wang, Kaihao Zhang, Yuexiang Li, Xiaoying Tang, Wenhan Luo

In this paper, we propose a domain adversarial attack (DAA) method to mitigate the training instability problem by adding perturbations to the input images, which makes them indistinguishable across domains and enables domain alignment.

Adversarial Attack Face Anti-Spoofing +2

Paper
Add Code

GroundingGPT:Language Enhanced Multi-modal Grounding Model

2 code implementations • 11 Jan 2024 • Zhaowei Li, Qi Xu, Dong Zhang, Hang Song, Yiqing Cai, Qi Qi, Ran Zhou, Junting Pan, Zefeng Li, Van Tu Vu, Zhida Huang, Tao Wang

Beyond capturing global information like other multi-modal models, our proposed model excels at tasks demanding a detailed understanding of local information within the input.

Language Modelling Large Language Model

206

Paper
Code

Anything in Any Scene: Photorealistic Video Object Insertion

no code implementations • 30 Jan 2024 • Chen Bai, Zeman Shao, Guoxiang Zhang, Di Liang, Jie Yang, Zhuorui Zhang, Yujian Guo, Chengzhang Zhong, Yiqiao Qiu, Zhendong Wang, Yichen Guan, Xiaoyin Zheng, Tao Wang, Cheng Lu

Our proposed general framework encompasses three key processes: 1) integrating a realistic object into a given scene video with proper placement to ensure geometric realism; 2) estimating the sky and environmental lighting distribution and simulating realistic shadows to enhance the light realism; 3) employing a style transfer network that refines the final video output to maximize photorealism.

Data Augmentation Object +2

Paper
Add Code

PromptRR: Diffusion Models as Prompt Generators for Single Image Reflection Removal

1 code implementation • 4 Feb 2024 • Tao Wang, Wanglong Lu, Kaihao Zhang, Wenhan Luo, Tae-Kyun Kim, Tong Lu, Hongdong Li, Ming-Hsuan Yang

For the prompt generation, we first propose a prompt pre-training strategy to train a frequency prompt encoder that encodes the ground-truth image into LF and HF prompts.

Reflection Removal

Paper
Code

PANORAMIA: Privacy Auditing of Machine Learning Models without Retraining

no code implementations • 12 Feb 2024 • Mishaal Kazmi, Hadrien Lautraite, Alireza Akbari, Mauricio Soroco, Qiaoyue Tang, Tao Wang, Sébastien Gambs, Mathias Lécuyer

We introduce a privacy auditing scheme for ML models that relies on membership inference attacks using generated data as "non-members".

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.