Search Results for author: Hao Zhang

Found 311 papers, 104 papers with code

Friendly Topic Assistant for Transformer Based Abstractive Summarization

no code implementations EMNLP 2020 Zhengjue Wang, Zhibin Duan, Hao Zhang, Chaojie Wang, Long Tian, Bo Chen, Mingyuan Zhou

Abstractive document summarization is a comprehensive task including document understanding and summary generation, in which area Transformer-based models have achieved the state-of-the-art performance.

Abstractive Text Summarization Document Summarization +1

WordNet Troponymy and Extraction of “Manner-Result” Relations

no code implementations GWC 2018 Aliaksandr Huminski, Hao Zhang

The procedure of extraction includes three steps and the results are based on the analysis of the whole set of verbs in WordNet.

Incorporating Instructional Prompts into a Unified Generative Framework for Joint Multiple Intent Detection and Slot Filling

1 code implementation COLING 2022 Yangjun Wu, Han Wang, Dongxiang Zhang, Gang Chen, Hao Zhang

Specifically, we design 5-type templates as instructional prompts, and each template includes a question that acts as the driver to teach UGEN to grasp the paradigm, options that list the candidate intents or slots to reduce the answer search space, and the context denotes original utterance.

Intent Detection Question Answering +3

BIRNAT: Bidirectional Recurrent Neural Networks with Adversarial Training for Video Snapshot Compressive Imaging

1 code implementation ECCV 2020 Ziheng Cheng, Ruiying Lu, Zhengjue Wang, Hao Zhang, Bo Chen, Ziyi Meng, Xin Yuan

This measurement and the modulation masks are fed into our Recurrent Neural Network (RNN) to reconstruct the desired high-speed frames.

FDNeRF: Semantics-Driven Face Reconstruction, Prompt Editing and Relighting with Diffusion Models

1 code implementation 1 Jun 2023 Hao Zhang, Yanbo Xu, Tianyuan Dai, Yu-Wing Tai, Chi-Keung Tang

The ability to create high-quality 3D faces from a single image has become increasingly important with wide applications in video conferencing, AR/VR, and advanced video editing in movie industries.

3D Face Reconstruction Video Editing +1

MS-DETR: Natural Language Video Localization with Sampling Moment-Moment Interaction

1 code implementation 30 May 2023 Jing Wang, Aixin Sun, Hao Zhang, XiaoLi Li

Given a query, the task of Natural Language Video Localization (NLVL) is to localize a temporal moment in an untrimmed video that semantically matches the query.

BRIGHT: Bi-level Feature Representation of Image Collections using Groups of Hash Tables

no code implementations 29 May 2023 Dingdong Yang, Yizhi Wang, Ali Mahdavi-Amiri, Hao Zhang

We present BRIGHT, a bi-level feature representation for an image collection, consisting of a per-image latent space on top of a multi-scale feature grid space.

Image Generation Quantization

Semantic Segmentation with Bidirectional Language Models Improves Long-form ASR

no code implementations 28 May 2023 W. Ronny Huang, Hao Zhang, Shankar Kumar, Shuo-Yiin Chang, Tara N. Sainath

We address this limitation by distilling punctuation knowledge from a bidirectional teacher language model (LM) trained on written, punctuated text.

Language Modelling Semantic Segmentation

Mobile Safety Application for Pedestrians

no code implementations 27 May 2023 Sukru Yaren Gelbal, Mustafa Ridvan Cantas, Bilin Aksun Guvenc, Levent Guvenc, Gopichandra Surnilla, Hao Zhang

The work we discuss in this paper is related to a mobile application that utilizes the mobile phone sensors and Bluetooth communication to implement Personal Safety Message (PSM) broadcast using the SAE J2735 standard to create a Pedestrian to Vehicle (P2V) based safety warning structure.

Efficient Detection of LLM-generated Texts with a Bayesian Surrogate Model

no code implementations 26 May 2023 Zhijie Deng, Hongcheng Gao, Yibo Miao, Hao Zhang

The detection of machine-generated text, especially from large language models (LLMs), is crucial in preventing serious social problems resulting from their misuse.

NoisywikiHow: A Benchmark for Learning with Real-world Noisy Labels in Natural Language Processing

1 code implementation 18 May 2023 Tingting Wu, Xiao Ding, Minji Tang, Hao Zhang, Bing Qin, Ting Liu

To mitigate the effects of label noise, learning with noisy labels (LNL) methods are designed to achieve better generalization performance.

Learning with noisy labels

Hybrid AHS: A Hybrid of Kalman Filter and Deep Learning for Acoustic Howling Suppression

no code implementations 4 May 2023 Hao Zhang, Meng Yu, Yuzhong Wu, Tao Yu, Dong Yu

During offline training, a pre-processed signal obtained from the Kalman filter and an ideal microphone signal generated via teacher-forced training strategy are used to train the deep neural network (DNN).

Deep Learning for Joint Acoustic Echo and Acoustic Howling Suppression in Hybrid Meetings

no code implementations 2 May 2023 Hao Zhang, Meng Yu, Dong Yu

In particular, the interplay between acoustic echo and acoustic howling in a hybrid meeting makes the joint suppression of them difficult.

Speech Separation

A Strong and Reproducible Object Detector with Only Public Datasets

2 code implementations 25 Apr 2023 Tianhe Ren, Jianwei Yang, Shilong Liu, Ailing Zeng, Feng Li, Hao Zhang, Hongyang Li, Zhaoyang Zeng, Lei Zhang

This work presents Focal-Stable-DINO, a strong and reproducible object detection model which achieves 64.6 AP on COCO val2017 and 64.8 AP on COCO test-dev using only 700M parameters without any test time augmentation.

object-detection Object Detection

Decouple Non-parametric Knowledge Distillation For End-to-end Speech Translation

no code implementations 20 Apr 2023 Hao Zhang, Nianwen Si, Yaqi Chen, Wenlin Zhang, Xukui Yang, Dan Qu, Zhen Li

Existing techniques often attempt to make knowledge transfer from a powerful machine translation (MT) to speech translation (ST) model with some elaborate techniques, which often requires transcription as extra input during training.

Knowledge Distillation Machine Translation +3

Improving Speech Translation by Cross-Modal Multi-Grained Contrastive Learning

no code implementations 20 Apr 2023 Hao Zhang, Nianwen Si, Yaqi Chen, Wenlin Zhang, Xukui Yang, Dan Qu, Wei-Qiang Zhang

However, the final model often performs worse on the MT task than the MT model trained alone, which means that the knowledge transfer ability of this method is also limited.

Contrastive Learning Machine Translation +2

DropDim: A Regularization Method for Transformer Networks

no code implementations 20 Apr 2023 Hao Zhang, Dan Qu, Keji Shao, Xukui Yang

In contrast to the general dropout method, which randomly drops neurons, DropDim drops part of the embedding dimensions.

MS-LSTM: Exploring Spatiotemporal Multiscale Representations in Video Prediction Domain

no code implementations 16 Apr 2023 Zhifeng Ma, Hao Zhang, Jie Liu

The drastic variation of motion in spatial and temporal dimensions makes the video prediction task extremely challenging.

Video Prediction

Segment Everything Everywhere All at Once

2 code implementations 13 Apr 2023 Xueyan Zou, Jianwei Yang, Hao Zhang, Feng Li, Linjie Li, Jianfeng Gao, Yong Jae Lee

Inspired by the development of prompt-based universal interfaces for LLMs, this paper presents SEEM, a promptable, interactive model for Segmenting Everything Everywhere all at once in an image.

Personalized Segmentation Semantic Segmentation

RoSI: Recovering 3D Shape Interiors from Few Articulation Images

no code implementations 13 Apr 2023 Akshay Gadi Patil, Yiming Qian, Shan Yang, Brian Jackson, Eric Bennett, Hao Zhang

The dominant majority of 3D models that appear in gaming, VR/AR, and those we use to train geometric deep learning algorithms are incomplete, since they are modeled as surface meshes and missing their interior structures.

Detection Transformer with Stable Matching

1 code implementation 10 Apr 2023 Shilong Liu, Tianhe Ren, Jiayu Chen, Zhaoyang Zeng, Hao Zhang, Feng Li, Hongyang Li, Jun Huang, Hang Su, Jun Zhu, Lei Zhang

We point out that the unstable matching in DETR is caused by a multi-optimization path problem, which is highlighted by the one-to-one matching design in DETR.

SpanRE: Entities and Overlapping Relations Extraction Based on Spans and Entity Attention

no code implementations 6 Apr 2023 Hao Zhang

Then we present a labeled span mechanism to extract the objects and relations simultaneously: it generates labeled spans whose start and end positions indicate the objects, and whose labels correspond to the relations between the subject and objects.

UKP-SQuARE v3: A Platform for Multi-Agent QA Research

no code implementations 31 Mar 2023 Haritz Puerto, Tim Baumgärtner, Rachneet Sachdeva, Haishuo Fang, Hao Zhang, Sewin Tariverdian, Kexin Wang, Iryna Gurevych

To ease research in multi-agent models, we extend UKP-SQuARE, an online platform for QA research, to support three families of multi-agent systems: i) agent selection, ii) early-fusion of agents, and iii) late-fusion of agents.

Question Answering

Coarse-to-Fine Active Segmentation of Interactable Parts in Real Scene Images

no code implementations 21 Mar 2023 Ruiqi Wang, Akshay Gadi Patil, Fenggen Yu, Hao Zhang

We introduce the first active learning (AL) framework for high-accuracy instance segmentation of dynamic, interactable parts from RGB images of real indoor scenes.

Active Learning Instance Segmentation +1

Fine-Grained Regional Prompt Tuning for Visual Abductive Reasoning

no code implementations 18 Mar 2023 Hao Zhang, Basura Fernando

To tackle this, we propose a simple yet effective Regional Prompt Tuning, which encodes "regional visual hints" and "global contexts" separately at fine and coarse-grained levels.

Visual Abductive Reasoning

DS-Fusion: Artistic Typography via Discriminated and Stylized Diffusion

no code implementations 16 Mar 2023 Maham Tanveer, Yizhi Wang, Ali Mahdavi-Amiri, Hao Zhang

We introduce a novel method to automatically generate an artistic typography by stylizing one or more letter fonts to visually convey the semantics of an input word, while ensuring that the output remains readable.


A Simple Framework for Open-Vocabulary Segmentation and Detection

2 code implementations 14 Mar 2023 Hao Zhang, Feng Li, Xueyan Zou, Shilong Liu, Chunyuan Li, Jianfeng Gao, Jianwei Yang, Lei Zhang

We present OpenSeeD, a simple Open-vocabulary Segmentation and Detection framework that jointly learns from different segmentation and detection datasets.

 Ranked #1 on Instance Segmentation on ADE20K val (using extra training data)

Instance Segmentation Panoptic Segmentation

Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection

2 code implementations 9 Mar 2023 Shilong Liu, Zhaoyang Zeng, Tianhe Ren, Feng Li, Hao Zhang, Jie Yang, Chunyuan Li, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang

To effectively fuse language and vision modalities, we conceptually divide a closed-set detector into three phases and propose a tight fusion solution, which includes a feature enhancer, a language-guided query selection, and a cross-modality decoder for cross-modality fusion.

object-detection Referring Expression +2

ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing

1 code implementation CVPR 2023 Zequn Zeng, Hao Zhang, Zhengjue Wang, Ruiying Lu, Dongsheng Wang, Bo Chen

Zero-shot capability has been considered as a new revolution of deep learning, letting machines work on tasks without curated training data.

Image Captioning Language Modelling

TimeMAE: Self-Supervised Representations of Time Series with Decoupled Masked Autoencoders

1 code implementation 1 Mar 2023 Mingyue Cheng, Qi Liu, Zhiding Liu, Hao Zhang, Rujiao Zhang, Enhong Chen

In this work, we propose TimeMAE, a novel self-supervised paradigm for learning transferrable time series representations based on transformer networks.

Time Series Analysis Time Series Classification

Concept-Level Explanation for the Generalization of a DNN

no code implementations 25 Feb 2023 Huilin Zhou, Hao Zhang, Huiqi Deng, Dongrui Liu, Wen Shen, Shih-Han Chan, Quanshi Zhang

Therefore, in this paper, we investigate the generalization power of each interactive concept, and we use the generalization power of different interactive concepts to explain the generalization power of the entire DNN.

Introducing Depth into Transformer-based 3D Object Detection

no code implementations 25 Feb 2023 Hao Zhang, Hongyang Li, Ailing Zeng, Feng Li, Shilong Liu, Xingyu Liao, Lei Zhang

To address the second issue, we introduce an auxiliary learning task called Depth-aware Negative Suppression loss.

3D Object Detection Auxiliary Learning +2

AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving

1 code implementation 22 Feb 2023 Zhuohan Li, Lianmin Zheng, Yinmin Zhong, Vincent Liu, Ying Sheng, Xin Jin, Yanping Huang, Zhifeng Chen, Hao Zhang, Joseph E. Gonzalez, Ion Stoica

Model parallelism is conventionally viewed as a method to scale a single large deep learning model beyond the memory limits of a single device.

Deep AHS: A Deep Learning Approach to Acoustic Howling Suppression

no code implementations 18 Feb 2023 Hao Zhang, Meng Yu, Dong Yu

In this paper, we formulate acoustic howling suppression (AHS) as a supervised learning problem and propose a deep learning approach, called Deep AHS, to address it.

Speech Separation

NeuralKalman: A Learnable Kalman Filter for Acoustic Echo Cancellation

no code implementations 29 Jan 2023 Yixuan Zhang, Meng Yu, Hao Zhang, Dong Yu, DeLiang Wang

The Kalman filter is widely used for addressing acoustic echo cancellation (AEC) problems due to its robustness to double-talk and fast convergence.

Acoustic echo cancellation

D$^2$CSG: Unsupervised Learning of Compact CSG Trees with Dual Complements and Dropouts

no code implementations 27 Jan 2023 Fenggen Yu, Qimin Chen, Maham Tanveer, Ali Mahdavi Amiri, Hao Zhang

We present D$^2$CSG, a neural model composed of two dual and complementary network branches, with dropouts, for unsupervised learning of compact constructive solid geometry (CSG) representations of 3D CAD shapes.

HAL3D: Hierarchical Active Learning for Fine-Grained 3D Part Labeling

no code implementations 25 Jan 2023 Fenggen Yu, Yiming Qian, Francisca Gil-Ureta, Brian Jackson, Eric Bennett, Hao Zhang

We present the first active learning tool for fine-grained 3D part labeling, a problem which challenges even the most advanced deep learning (DL) methods due to the significant structural variations among the small and intricate parts.

Active Learning

A Method For Eliminating Contour Errors In Self-Encoder Reconstructed Images

no code implementations 25 Jan 2023 Yonggang Li, Hao Zhang

In this paper, we propose a self-supervised twin network approach based on this a priori.

CA$^2$T-Net: Category-Agnostic 3D Articulation Transfer from Single Image

no code implementations 5 Jan 2023 Jasmine Collins, Anqi Liang, Jitendra Malik, Hao Zhang, Frédéric Devernay

We present a neural network approach to transfer the motion from a single image of an articulated object to a rest-state (i.e., unarticulated) 3D model.

CC-FedAvg: Computationally Customized Federated Averaging

no code implementations 28 Dec 2022 Tingting Wu, Hao Zhang, Siyao Cheng, Jie Liu

Federated learning (FL) is an emerging paradigm for training models with distributed data from numerous Internet of Things (IoT) devices.

Federated Learning

Improved Long-Form Spoken Language Translation with Large Language Models

no code implementations 19 Dec 2022 Arya D. McCarthy, Hao Zhang, Shankar Kumar, Felix Stahlberg, Axel H. Ng

A challenge in spoken language translation is that plenty of spoken content is long-form, but short units are necessary for obtaining high-quality translations.

Language Modelling Translation

ARO-Net: Learning Implicit Fields from Anchored Radial Observations

1 code implementation CVPR 2023 Yizhi Wang, Zeyu Huang, Ariel Shamir, Hui Huang, Hao Zhang, Ruizhen Hu

We introduce anchored radial observations (ARO), a novel shape encoding for learning implicit field representation of 3D shapes that is category-agnostic and generalizable amid significant shape variations.

Surface Reconstruction

Coordinating Cross-modal Distillation for Molecular Property Prediction

no code implementations 30 Nov 2022 Hao Zhang, Nan Zhang, Ruixin Zhang, Lei Shen, Yingyi Zhang, Meng Liu

The existing graph methods have demonstrated that 3D geometric information is significant for better performance in MPP.

Graph Regression Graph Representation Learning +3

DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding

1 code implementation 28 Nov 2022 Shilong Liu, Yaoyuan Liang, Feng Li, Shijia Huang, Hao Zhang, Hang Su, Jun Zhu, Lei Zhang

As phrase extraction can be regarded as a $1$D text segmentation problem, we formulate PEG as a dual detection problem and propose a novel DQ-DETR model, which introduces dual queries to probe different features from image and text for object prediction and phrase mask prediction.

object-detection Object Detection +4

FLNeRF: 3D Facial Landmarks Estimation in Neural Radiance Fields

1 code implementation 21 Nov 2022 Hao Zhang, Tianyuan Dai, Yu-Wing Tai, Chi-Keung Tang

This paper presents the first significant work on directly predicting 3D face landmarks on neural radiance fields (NeRFs), without using intermediate representations such as 2D images, depth maps, or point clouds.

QueryForm: A Simple Zero-shot Form Entity Query Framework

no code implementations 14 Nov 2022 Zifeng Wang, Zizhao Zhang, Jacob Devlin, Chen-Yu Lee, Guolong Su, Hao Zhang, Jennifer Dy, Vincent Perot, Tomas Pfister

Zero-shot transfer learning for document understanding is a crucial yet under-investigated scenario to help reduce the high cost involved in annotating document entities.

Transfer Learning

On Optimizing the Communication of Model Parallelism

no code implementations 10 Nov 2022 Yonghao Zhuang, Hexu Zhao, Lianmin Zheng, Zhuohan Li, Eric P. Xing, Qirong Ho, Joseph E. Gonzalez, Ion Stoica, Hao Zhang

This pattern emerges when the two paradigms of model parallelism - intra-operator and inter-operator parallelism - are combined to support large models on large clusters.

FF2: A Feature Fusion Two-Stream Framework for Punctuation Restoration

no code implementations 9 Nov 2022 Yangjun Wu, Kebin Fang, Yao Zhao, Hao Zhang, Lifeng Shi, Mengqi Zhang

To accomplish punctuation restoration, most existing methods focus on introducing extra information (e.g., part-of-speech) or addressing the class imbalance problem.

Language Modelling Punctuation Restoration +1

MPCFormer: fast, performant and private Transformer inference with MPC

1 code implementation 2 Nov 2022 Dacheng Li, Rulin Shao, Hongyi Wang, Han Guo, Eric P. Xing, Hao Zhang

Through extensive evaluations, we show that MPCFormer significantly speeds up Transformer inference in MPC settings while achieving similar ML performance to the input model.

Knowledge Distillation

Neural Eigenfunctions Are Structured Representation Learners

1 code implementation 23 Oct 2022 Zhijie Deng, Jiaxin Shi, Hao Zhang, Peng Cui, Cewu Lu, Jun Zhu

In this paper, we introduce a scalable method for learning structured, adaptive-length deep representations.

Contrastive Learning Feature Importance +6

NIFT: Neural Interaction Field and Template for Object Manipulation

no code implementations 20 Oct 2022 Zeyu Huang, Juzhan Xu, Sisi Dai, Kai Xu, Hao Zhang, Hui Huang, Ruizhen Hu

Given a few object manipulation demos, NIFT guides the generation of the interaction imitation for a new object instance by matching the Neural Interaction Template (NIT) extracted from the demos in the target Neural Interaction Field (NIF) defined for the new object.

Imitation Learning

Language Model Decomposition: Quantifying the Dependency and Correlation of Language Models

1 code implementation 19 Oct 2022 Hao Zhang

A goodness-of-fit metric for LMD similar to the coefficient of determination is defined and used to measure the linear dependency of a set of LMs.

Language Modelling

Regularized Data Programming with Bayesian Priors

no code implementations 17 Oct 2022 Jacqueline R. M. A. Maasch, Hao Zhang, Qian Yang, Fei Wang, Volodymyr Kuleshov

The cost of manual data labeling can be a significant obstacle in supervised learning.

AMP: Automatically Finding Model Parallel Strategies with Heterogeneity Awareness

1 code implementation 13 Oct 2022 Dacheng Li, Hongyi Wang, Eric Xing, Hao Zhang

Scaling up model sizes can lead to fundamentally new capabilities in many machine learning (ML) tasks.

Application of Deep Learning on Single-Cell RNA-sequencing Data Analysis: A Review

no code implementations 11 Oct 2022 Matthew Brendel, Chang Su, Zilong Bai, Hao Zhang, Olivier Elemento, Fei Wang

Single-cell RNA-sequencing (scRNA-seq) has become a routinely used technique to quantify the gene expression profile of thousands of single cells simultaneously.

Physical Interaction: Reconstructing Hand-object Interactions with Physics

1 code implementation 22 Sep 2022 Haoyu Hu, Xinyu Yi, Hao Zhang, Jun-Hai Yong, Feng Xu

Single view-based reconstruction of hand-object interaction is challenging due to the severe observation missing caused by occlusions.

Learning Reconstructability for Drone Aerial Path Planning

no code implementations 21 Sep 2022 Yilin Liu, Liqiang Lin, Yue Hu, Ke Xie, Chi-Wing Fu, Hao Zhang, Hui Huang

To reconstruct a new urban scene, we first build the 3D scene proxy, then rely on the predicted reconstruction quality and uncertainty measures by our network, based off of the proxy geometry, to guide the drone path planning.

3D Scene Reconstruction

DiscrimLoss: A Universal Loss for Hard Samples and Incorrect Samples Discrimination

no code implementations 21 Aug 2022 Tingting Wu, Xiao Ding, Hao Zhang, Jinglong Gao, Li Du, Bing Qin, Ting Liu

To relieve this issue, curriculum learning is proposed to improve model performance and generalization by ordering training samples in a meaningful (e.g., easy to hard) sequence.

Image Classification regression

UKP-SQuARE v2: Explainability and Adversarial Attacks for Trustworthy QA

1 code implementation 19 Aug 2022 Rachneet Sachdeva, Haritz Puerto, Tim Baumgärtner, Sewin Tariverdian, Hao Zhang, Kexin Wang, Hossain Shaikh Saadi, Leonardo F. R. Ribeiro, Iryna Gurevych

In this paper, we introduce SQuARE v2, the new version of SQuARE, to provide an explainability infrastructure for comparing models based on methods such as saliency maps and graph-based explanations.

Adversarial Attack Explainable Models +2

PhyGNNet: Solving spatiotemporal PDEs with Physics-informed Graph Neural Network

no code implementations 7 Aug 2022 Longxiang Jiang, Liyuan Wang, Xinkun Chu, Yonghao Xiao, Hao Zhang

Solving partial differential equations (PDEs) is an important research means in the fields of physics, biology, and chemistry.

Parameterization of Cross-Token Relations with Relative Positional Encoding for Vision MLP

1 code implementation 15 Jul 2022 Zhicai Wang, Yanbin Hao, Xingyu Gao, Hao Zhang, Shuo Wang, Tingting Mu, Xiangnan He

They use token-mixing layers to capture cross-token interactions, as opposed to the multi-head self-attention mechanism used by Transformers.

Long-term Leap Attention, Short-term Periodic Shift for Video Classification

1 code implementation 12 Jul 2022 Hao Zhang, Lechao Cheng, Yanbin Hao, Chong-Wah Ngo

By replacing a vanilla 2D attention with the LAPS, we could adapt a static transformer into a video one, with zero extra parameters and negligible computation overhead ($\sim$2.6\%).

Video Classification

Data-and-Knowledge Dual-Driven Automatic Modulation Recognition for Wireless Communication Networks

no code implementations 30 Jun 2022 Rui Ding, Hao Zhang, Fuhui Zhou, Qihui Wu, Zhu Han

In order to tackle these problems, a novel data-and-knowledge dual-driven automatic modulation classification scheme based on radio frequency machine learning is proposed by exploiting the attribute features of different modulations.


Wavelet Regularization Benefits Adversarial Training

1 code implementation 8 Jun 2022 Jun Yan, Huilin Yin, Xiaoyang Deng, Ziming Zhao, Wancheng Ge, Hao Zhang, Gerhard Rigoll

Since adversarial vulnerability can be regarded as a high-frequency phenomenon, it is essential to regulate the adversarially-trained neural network models in the frequency domain.

Adversarial Robustness

MS-RNN: A Flexible Multi-Scale Framework for Spatiotemporal Predictive Learning

1 code implementation 7 Jun 2022 Zhifeng Ma, Hao Zhang, Jie Liu

Spatiotemporal predictive learning, which predicts future frames through historical prior knowledge with the aid of deep learning, is widely used in many fields.

Image Classification Semantic Segmentation

DETR++: Taming Your Multi-Scale Detection Transformer

no code implementations 7 Jun 2022 Chi Zhang, Lijuan Liu, Xiaoxue Zang, Frederick Liu, Hao Zhang, Xinying Song, Jindong Chen

Convolutional Neural Networks (CNN) have dominated the field of detection ever since the success of AlexNet in ImageNet classification [12].

object-detection Small Object Detection

Why Adversarial Training of ReLU Networks Is Difficult?

no code implementations 30 May 2022 Xu Cheng, Hao Zhang, Yue Xin, Wen Shen, Jie Ren, Quanshi Zhang

We also prove that adversarial training tends to strengthen the influence of unconfident input samples with large gradient norms in an exponential manner.

GALOIS: Boosting Deep Reinforcement Learning via Generalizable Logic Synthesis

no code implementations 27 May 2022 Yushi Cao, Zhiming Li, Tianpei Yang, Hao Zhang, Yan Zheng, Yi Li, Jianye Hao, Yang Liu

In this paper, we combine the above two paradigms together and propose a novel Generalizable Logic Synthesis (GALOIS) framework to synthesize hierarchical and strict cause-effect logic programs.

Decision Making Program Synthesis +2

Active Domain Adaptation with Multi-level Contrastive Units for Semantic Segmentation

no code implementations 23 May 2022 Hao Zhang, Ruimao Zhang, Zhanglin Peng, Junle Wang, Yanqing Jing

A simple pixel selection strategy followed with the construction of multi-level contrastive units is introduced to optimize the model for both domain adaptation and active supervised learning.

Active Learning Domain Adaptation +2

Downstream Transformer Generation of Question-Answer Pairs with Preprocessing and Postprocessing Pipelines

1 code implementation 15 May 2022 Cheng Zhang, Hao Zhang, Jie Wang

We present a system called TP3 to perform a downstream task of transformers on generating question-answer pairs (QAPs) from a given article.

New-Onset Diabetes Assessment Using Artificial Intelligence-Enhanced Electrocardiography

no code implementations 5 May 2022 Neil Jethani, Aahlad Puli, Hao Zhang, Leonid Garber, Lior Jankelson, Yindalon Aphinyanaphongs, Rajesh Ranganath

We found ECG-based assessment outperforms the ADA Risk test, achieving a higher area under the curve (0.80 vs. 0.68) and positive predictive value (13% vs. 9%) -- 2.6 times the prevalence of diabetes in the cohort.

Adaptive Split-Fusion Transformer

1 code implementation 26 Apr 2022 Zixuan Su, Hao Zhang, Jingjing Chen, Lei Pang, Chong-Wah Ngo, Yu-Gang Jiang

Neural networks for visual content understanding have recently evolved from convolutional ones (CNNs) to transformers.

 Ranked #1 on Image Classification on CIFAR-100 (Accuracy metric)

Image Classification

FedCos: A Scene-adaptive Federated Optimization Enhancement for Performance Improvement

1 code implementation 7 Apr 2022 Hao Zhang, Tingting Wu, Siyao Cheng, Jie Liu

On the other hand, it enlarges the distances between local models, resulting in an aggregated global model with poor performance.

Federated Learning

Heterogeneous Autoencoder Empowered by Quadratic Neurons

1 code implementation 2 Apr 2022 Jing-Xiao Liao, Bo-Jian Hou, Hang-Cheng Dong, Hao Zhang, Jianwei Ma, Jinwei Sun, Shiping Zhang, Feng-Lei Fan

Inspired by the complexity and diversity of biological neurons, a quadratic neuron is proposed to replace the inner product in the current neuron with a simplified quadratic function.

Anomaly Detection

Randomized Sharpness-Aware Training for Boosting Computational Efficiency in Deep Learning

no code implementations 18 Mar 2022 Yang Zhao, Hao Zhang, Xiuyuan Hu

Optimizers in RST would perform a Bernoulli trial at each iteration to choose randomly from base algorithms (SGD) and sharpness-aware algorithms (SAM) with a probability arranged by a predefined scheduling function.


Group Contextualization for Video Recognition

1 code implementation CVPR 2022 Yanbin Hao, Hao Zhang, Chong-Wah Ngo, Xiangnan He

By utilizing calibrators to embed feature with four different kinds of contexts in parallel, the learnt representation is expected to be more resilient to diverse types of activities.

Action Recognition Egocentric Activity Recognition +1

Contextual Networks and Unsupervised Ranking of Sentences

no code implementations 9 Mar 2022 Hao Zhang, You Zhou, Jie Wang

We construct a contextual network to represent a document with syntactic and semantic relations between word-sentence pairs, based on which we devise an unsupervised algorithm called CNATAR (Contextual Network And Text Analysis Rank) to score sentences, and rank them through a bi-objective 0-1 knapsack maximization problem over topic analysis and sentence scores.

Boilerplate Detection via Semantic Classification of TextBlocks

no code implementations 9 Mar 2022 Hao Zhang, Jie Wang

We present a hierarchical neural network model called SemText to detect HTML boilerplate based on a novel semantic representation of HTML tags, class names, and text blocks.


DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection

11 code implementations 7 Mar 2022 Hao Zhang, Feng Li, Shilong Liu, Lei Zhang, Hang Su, Jun Zhu, Lionel M. Ni, Heung-Yeung Shum

Compared to other models on the leaderboard, DINO significantly reduces its model size and pre-training data size while achieving better results.

 Ranked #1 on Object Detection on COCO 2017 val (box AP metric)

Real-Time Object Detection

Vision-Language Intelligence: Tasks, Representation Learning, and Large Models

no code implementations 3 Mar 2022 Feng Li, Hao Zhang, Yi-Fan Zhang, Shilong Liu, Jian Guo, Lionel M. Ni, Pengchuan Zhang, Lei Zhang

This survey is inspired by the remarkable progress in both computer vision and natural language processing, and recent trends shifting from single modality processing to multiple modality comprehension.

Few-Shot Learning Representation Learning

DN-DETR: Accelerate DETR Training by Introducing Query DeNoising

11 code implementations CVPR 2022 Feng Li, Hao Zhang, Shilong Liu, Jian Guo, Lionel M. Ni, Lei Zhang

Our method is universal and can be easily plugged into any DETR-like methods by adding dozens of lines of code to achieve a remarkable improvement.

Object Detection

Hierarchical Point Cloud Encoding and Decoding with Lightweight Self-Attention based Model

no code implementations 13 Feb 2022 En Yen Puang, Hao Zhang, Hongyuan Zhu, Wei Jing

In this paper we present SA-CNN, a hierarchical and lightweight self-attention based encoding and decoding architecture for representation learning of point cloud data.

Representation Learning Retrieval

Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning

1 code implementation 8 Feb 2022 Yang Zhao, Hao Zhang, Xiuyuan Hu

In this paper, we propose an effective method to improve the model generalization by additionally penalizing the gradient norm of loss function during optimization.

A Variational Edge Partition Model for Supervised Graph Representation Learning

1 code implementation 7 Feb 2022 Yilin He, Chaojie Wang, Hao Zhang, Bo Chen, Mingyuan Zhou

This paper introduces a graph generative process to model how the observed edges are generated by aggregating the node interactions over a set of overlapping node communities, each of which contributes to the edges via a logical OR mechanism.

Classification Graph Representation Learning +1

Neural Dual Contouring

2 code implementations 4 Feb 2022 Zhiqin Chen, Andrea Tagliasacchi, Thomas Funkhouser, Hao Zhang

We introduce neural dual contouring (NDC), a new data-driven approach to mesh reconstruction based on dual contouring (DC).

Surface Reconstruction

RIM-Net: Recursive Implicit Fields for Unsupervised Learning of Hierarchical Shape Structures

no code implementations CVPR 2022 Chengjie Niu, Manyi Li, Kai Xu, Hao Zhang

Each level of the tree corresponds to an assembly of shape parts, represented as implicit functions, to reconstruct the input shape.

Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning

1 code implementation 28 Jan 2022 Lianmin Zheng, Zhuohan Li, Hao Zhang, Yonghao Zhuang, Zhifeng Chen, Yanping Huang, Yida Wang, Yuanzhong Xu, Danyang Zhuo, Eric P. Xing, Joseph E. Gonzalez, Ion Stoica

Existing model-parallel training systems either require users to manually create a parallelization plan or automatically generate one from a limited space of model parallelism configurations.

DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR

5 code implementations ICLR 2022 Shilong Liu, Feng Li, Hao Zhang, Xiao Yang, Xianbiao Qi, Hang Su, Jun Zhu, Lei Zhang

We present in this paper a novel query formulation using dynamic anchor boxes for DETR (DEtection TRansformer) and offer a deeper understanding of the role of queries in DETR.

Object Detection

Temporal Sentence Grounding in Videos: A Survey and Future Directions

no code implementations 20 Jan 2022 Hao Zhang, Aixin Sun, Wei Jing, Joey Tianyi Zhou

Temporal sentence grounding in videos (TSGV), a.k.a. natural language video localization (NLVL) or video moment retrieval (VMR), aims to retrieve a temporal moment that semantically corresponds to a language query from an untrimmed video.

Moment Retrieval Retrieval

A Privacy-Preserving Unsupervised Domain Adaptation Framework for Clinical Text Analysis

no code implementations 18 Jan 2022 Qiyuan An, Ruijiang Li, Lin Gu, Hao Zhang, Qingyu Chen, Zhiyong Lu, Fei Wang, Yingying Zhu

To evaluate our proposed method's utility and privacy loss, we apply our model on a medical report disease label classification task using two noisy challenging clinical text datasets.

Inference Attack Membership Inference Attack +4

Neighborhood Region Smoothing Regularization for Finding Flat Minima In Deep Neural Networks

no code implementations 16 Jan 2022 Yang Zhao, Hao Zhang

NRS leverages the finding that models would benefit from converging to flat minima, and tries to regularize the neighborhood region in weight space to yield approximate outputs.

Image Classification

Manifoldron: Direct Space Partition via Manifold Discovery

2 code implementations 14 Jan 2022 Dayang Wang, Feng-Lei Fan, Bo-Jian Hou, Hao Zhang, Zhen Jia, Boce Zhou, Rongjie Lai, Hengyong Yu, Fei Wang

A neural network with the widely-used ReLU activation has been shown to partition the sample space into many convex polytopes for prediction.

BIG-bench Machine Learning

Face Deblurring Based on Separable Normalization and Adaptive Denormalization

no code implementations 18 Dec 2021 Xian Zhang, Hao Zhang, Jiancheng Lv, Xiaojie Li

Face deblurring aims to restore a clear face image from a blurred input image with more explicit structure and facial details.

Deblurring Face Parsing +1

SAC-GAN: Structure-Aware Image Composition

1 code implementation 13 Dec 2021 Hang Zhou, Rui Ma, Ling-Xiao Zhang, Lin Gao, Ali Mahdavi-Amiri, Hao Zhang

Specifically, our network takes the semantic layout features from the input scene image, features encoded from the edges and silhouette in the input object patch, as well as a latent code as inputs, and generates a 2D spatial affine transform defining the translation and scaling of the object patch.

Image Augmentation

Hybrid Neural Networks for On-device Directional Hearing

1 code implementation AAAI 2022 Anran Wang, Maruchi Kim, Hao Zhang, Shyamnath Gollakota

On-device directional hearing requires audio source separation from a given direction while achieving stringent human-imperceptible latency requirements.

Causal Inference Real-time Directional Hearing

UNIST: Unpaired Neural Implicit Shape Translation Network

no code implementations CVPR 2022 Qimin Chen, Johannes Merz, Aditya Sanghi, Hooman Shayani, Ali Mahdavi-Amiri, Hao Zhang

We introduce UNIST, the first deep neural implicit model for general-purpose, unpaired shape-to-shape translation, in both 2D and 3D domains.

Style Transfer Translation

Self-Reflective Terrain-Aware Robot Adaptation for Consistent Off-Road Ground Navigation

no code implementations 12 Nov 2021 Sriram Siva, Maggie Wigness, John G. Rogers, Long Quang, Hao Zhang

Ground robots require the crucial capability of traversing unstructured and unprepared terrains and avoiding obstacles to complete tasks in real-world robotics applications such as disaster response.

Disaster Response Navigate

Discovering and Explaining the Representation Bottleneck of DNNs

1 code implementation ICLR 2022 Huiqi Deng, Qihan Ren, Hao Zhang, Quanshi Zhang

This paper explores the bottleneck of feature representations of deep neural networks (DNNs), from the perspective of the complexity of interactions between input variables encoded in DNNs.

Towards Debiasing Temporal Sentence Grounding in Video

no code implementations 8 Nov 2021 Hao Zhang, Aixin Sun, Wei Jing, Joey Tianyi Zhou

In this paper, we propose two debiasing strategies, data debiasing and model debiasing, to "force" a TSGV model to capture cross-modal interactions.

Asynchronous Collaborative Localization by Integrating Spatiotemporal Graph Learning with Model-Based Estimation

no code implementations 5 Nov 2021 Peng Gao, Brian Reily, Rui Guo, HongSheng Lu, Qingzhao Zhu, Hao Zhang

In this paper, we introduce a novel approach that integrates uncertainty-aware spatiotemporal graph learning and model-based state estimation for a team of robots to collaboratively localize objects.

Graph Learning Object Localization

Clinical Evidence Engine: Proof-of-Concept For A Clinical-Domain-Agnostic Decision Support Infrastructure

no code implementations 31 Oct 2021 BoJian Hou, Hao Zhang, Gur Ladizhinsky, Stephen Yang, Volodymyr Kuleshov, Fei Wang, Qian Yang

As a result, clinicians cannot easily or rapidly scrutinize the CDSS recommendation when facing a difficult diagnosis or treatment decision in practice.

A Prototype-Oriented Framework for Unsupervised Domain Adaptation

1 code implementation NeurIPS 2021 Korawat Tanwisuth, Xinjie Fan, Huangjie Zheng, Shujian Zhang, Hao Zhang, Bo Chen, Mingyuan Zhou

Existing methods for unsupervised domain adaptation often rely on minimizing some statistical distance between the source and target samples in the latent space.

Unsupervised Domain Adaptation

Edge Partition Modulated Graph Convolutional Networks

no code implementations 29 Sep 2021 Yilin He, Chaojie Wang, Hao Zhang, Bo Chen, Mingyuan Zhou

In this paper, we introduce a relational graph generative process to model how the observed edges are generated by aggregating the node interactions over multiple overlapping node communities, each of which represents a particular type of relation that contributes to the edges via a logical OR mechanism.

Variational Inference

Position-Invariant Truecasing with a Word-and-Character Hierarchical Recurrent Neural Network

no code implementations 26 Aug 2021 Hao Zhang, You-Chi Cheng, Shankar Kumar, Mingqing Chen, Rajiv Mathews

Truecasing is the task of restoring the correct case (uppercase or lowercase) of noisy text generated either by an automatic system for speech recognition or machine translation or by humans.

Language Modelling Machine Translation +6

Detecting Small Objects in Thermal Images Using Single-Shot Detector

no code implementations 25 Aug 2021 Hao Zhang, Xianggong Hong, Li Zhu

In this paper, we propose DDSSD (Dilation and Deconvolution Single Shot Multibox Detector), an enhanced SSD with a novel feature fusion module that improves on SSD's performance for small object detection.

object-detection Small Object Detection

Small Object Detection Based on Modified FSSD and Model Compression

no code implementations 24 Aug 2021 Qingcai Wang, Hao Zhang, Xianggong Hong, Qinqin Zhou

Small objects have relatively low resolution and inconspicuous visual features that are difficult to extract, so existing object detection methods cannot detect them effectively, and their detection speed and stability are poor.

Model Compression object-detection +1

Automatic Modulation Classification Using Involution Enabled Residual Networks

no code implementations 23 Aug 2021 Hao Zhang, Lu Yuan, Guangyu Wu, Fuhui Zhou, Qihui Wu

Automatic modulation classification (AMC) is of crucial importance for realizing wireless intelligence communications.


Interpreting Attributions and Interactions of Adversarial Attacks

no code implementations ICCV 2021 Xin Wang, Shuyun Lin, Hao Zhang, Yufei Zhu, Quanshi Zhang

This paper aims to explain adversarial attacks in terms of how adversarial perturbations contribute to the attacking task.

Token Shift Transformer for Video Classification

3 code implementations 5 Aug 2021 Hao Zhang, Yanbin Hao, Chong-Wah Ngo

It is worth noticing that our TokShift transformer is a pure convolutional-free video transformer pilot with computational efficiency for video understanding.

Classification Video Classification +1

COSY: COunterfactual SYntax for Cross-Lingual Understanding

1 code implementation ACL 2021 Sicheng Yu, Hao Zhang, Yulei Niu, Qianru Sun, Jing Jiang

Pre-trained multilingual language models, e.g., multilingual-BERT, are widely used in cross-lingual tasks, yielding state-of-the-art performance.

Natural Language Inference POS +2

EnsLM: Ensemble Language Model for Data Diversity by Semantic Clustering

1 code implementation ACL 2021 Zhibin Duan, Hao Zhang, Chaojie Wang, Zhengjue Wang, Bo Chen, Mingyuan Zhou

As a result, the backbone learns the shared knowledge among all clusters while modulated weights extract the cluster-specific features.

Language Modelling

Proceedings of ICML 2021 Workshop on Theoretic Foundation, Criticism, and Application Trend of Explainable AI

no code implementations 16 Jul 2021 Quanshi Zhang, Tian Han, Lixin Fan, Zhanxing Zhu, Hang Su, Ying Nian Wu, Jie Ren, Hao Zhang

This workshop takes a special interest in theoretic foundations, limitations, and new application trends in the scope of XAI.

Turbulence-immune computational ghost imaging based on a multi-scale generative adversarial network

no code implementations 14 Jul 2021 Hao Zhang, Deyang Duan

There is a consensus that turbulence-free images cannot be obtained by conventional computational ghost imaging (CGI) because the CGI is only a classic simulation, which does not satisfy the conditions of turbulence-free imaging.

MINERVAS: Massive INterior EnviRonments VirtuAl Synthesis

no code implementations 13 Jul 2021 Haocheng Ren, Hao Zhang, Jia Zheng, Jiaxiang Zheng, Rui Tang, Yuchi Huo, Hujun Bao, Rui Wang

With the rapid development of data-driven techniques, data has played an essential role in various computer vision tasks.

2D Semantic Segmentation Depth Estimation +1

PhotoChat: A Human-Human Dialogue Dataset with Photo Sharing Behavior for Joint Image-Text Modeling

no code implementations ACL 2021 Xiaoxue Zang, Lijuan Liu, Maria Wang, Yang Song, Hao Zhang, Jindong Chen

Based on this dataset, we propose two tasks to facilitate research on image-text modeling: a photo-sharing intent prediction task that predicts whether one intends to share a photo in the next conversation turn, and a photo retrieval task that retrieves the most relevant photo according to the dialogue context.

Image Retrieval Retrieval

Learning Mesh Representations via Binary Space Partitioning Tree Networks

1 code implementation 27 Jun 2021 Zhiqin Chen, Andrea Tagliasacchi, Hao Zhang

The network is trained to reconstruct a shape using a set of convexes obtained from a BSP-tree built over a set of planes, where the planes and convexes are both defined by learned network weights.

Interventional Video Grounding with Dual Contrastive Learning

1 code implementation CVPR 2021 Guoshun Nan, Rui Qiao, Yao Xiao, Jun Liu, Sicong Leng, Hao Zhang, Wei Lu

2) Meanwhile, we introduce a dual contrastive learning approach (DCL) to better align the text and video by maximizing the mutual information (MI) between query and video clips, and the MI between start/end frames of a target moment and the others within a video to learn more informative visual representations.

Causal Inference Contrastive Learning +2

Neural Marching Cubes

1 code implementation 21 Jun 2021 Zhiqin Chen, Hao Zhang

To tackle these challenges, we re-cast MC from a deep learning perspective, by designing tessellation templates more apt at preserving geometric features, and learning the vertex positions and mesh topologies from training meshes, to account for contextual information from nearby cubes.

D2IM-Net: Learning Detail Disentangled Implicit Fields From Single Images

no code implementations CVPR 2021 Manyi Li, Hao Zhang

We present the first single-view 3D reconstruction network aimed at recovering geometric details from an input image which encompass both topological shape structures and surface features.

3D Reconstruction Single-View 3D Reconstruction

A Novel Automatic Modulation Classification Scheme Based on Multi-Scale Networks

no code implementations 31 May 2021 Hao Zhang, Fuhui Zhou, Qihui Wu, Wei Wu, Rose Qingyang Hu

Moreover, a novel loss function that combines the center loss and the cross entropy loss is exploited to learn both discriminative and separable features in order to further improve the classification performance.

Classification Face Recognition

SDNet: mutil-branch for single image deraining using swin

no code implementations 31 May 2021 Fuxiang Tan, YuTing Kong, Yingying Fan, Feng Liu, Daxin Zhou, Hao Zhang, Long Chen, Liang Gao, Yurong Qian

The former implements the basic rain pattern feature extraction, while the latter fuses different features to further extract and process the image features.

Autonomous Driving Single Image Deraining

Parallel Attention Network with Sequence Matching for Video Grounding

no code implementations Findings (ACL) 2021 Hao Zhang, Aixin Sun, Wei Jing, Liangli Zhen, Joey Tianyi Zhou, Rick Siow Mong Goh

In this work, we propose a Parallel Attention Network with Sequence matching (SeqPAN) to address the challenges in this task: multi-modal representation learning, and target moment boundary prediction.

Representation Learning Video Grounding

Video Corpus Moment Retrieval with Contrastive Learning

1 code implementation 13 May 2021 Hao Zhang, Aixin Sun, Wei Jing, Guoshun Nan, Liangli Zhen, Joey Tianyi Zhou, Rick Siow Mong Goh

We adopt the first approach and introduce two contrastive learning objectives to refine video encoder and text encoder to learn video and text representations separately but with better alignment for VCMR.

Contrastive Learning Moment Retrieval +2

Understanding Deep MIMO Detection

no code implementations 11 May 2021 Qiang Hu, Feifei Gao, Hao Zhang, Geoffrey Y. Li, Zongben Xu

We demonstrate that a data-driven DL detector asymptotically approaches the maximum a posteriori (MAP) detector in various scenarios but requires enough training samples to converge in time-varying channels.

Contrastive Attraction and Contrastive Repulsion for Representation Learning

no code implementations 8 May 2021 Huangjie Zheng, Xu Chen, Jiangchao Yao, Hongxia Yang, Chunyuan Li, Ya Zhang, Hao Zhang, Ivor Tsang, Jingren Zhou, Mingyuan Zhou

We realize this strategy with contrastive attraction and contrastive repulsion (CACR), which makes the query not only exert a greater force to attract more distant positive samples but also do so to repel closer negative samples.

Contrastive Learning Representation Learning

CAPRI-Net: Learning Compact CAD Shapes with Adaptive Primitive Assembly

no code implementations CVPR 2022 Fenggen Yu, Zhiqin Chen, Manyi Li, Aditya Sanghi, Hooman Shayani, Ali Mahdavi-Amiri, Hao Zhang

We introduce CAPRI-Net, a neural network for learning compact and interpretable implicit representations of 3D computer-aided design (CAD) models, in the form of adaptive primitive assemblies.

Estimating the Generalization in Deep Neural Networks via Sparsity

no code implementations 2 Apr 2021 Yang Zhao, Hao Zhang

By training DNNs with a wide range of generalization gap on popular datasets, we show that our key quantities and linear model could be efficient tools for estimating the generalization gap of DNNs.

Image Classification

Learning Multiscale Correlations for Human Motion Prediction

no code implementations 19 Mar 2021 Honghong Zhou, Caili Guo, Hao Zhang, Yanjun Wang

We evaluate our approach on two standard benchmark datasets for human motion prediction: Human3.6M and the CMU motion capture dataset.

Human motion prediction motion prediction

Quantitative Performance Assessment of CNN Units via Topological Entropy Calculation

no code implementations ICLR 2022 Yang Zhao, Hao Zhang

We show that by investigating the feature entropy of units on only training data, it could give discrimination between networks with different generalization ability from the view of the effectiveness of feature representations.

General Classification Image Classification

Memory-Efficient Network for Large-scale Video Compressive Sensing

2 code implementations CVPR 2021 Ziheng Cheng, Bo Chen, Guanliang Liu, Hao Zhang, Ruiying Lu, Zhengjue Wang, Xin Yuan

With the knowledge of masks, optimization algorithms or deep learning methods are employed to reconstruct the desired high-speed video frames from this snapshot measurement.

Compressive Sensing Demosaicking +1

Multi-Channel and Multi-Microphone Acoustic Echo Cancellation Using A Deep Learning Based Approach

no code implementations 3 Mar 2021 Hao Zhang, DeLiang Wang

Building on the deep learning based acoustic echo cancellation (AEC) in the single-loudspeaker (single-channel) and single-microphone setup, this paper investigates multi-channel AEC (MCAEC) and multi-microphone AEC (MMAEC).

Acoustic echo cancellation

MetaSCI: Scalable and Adaptive Reconstruction for Video Compressive Sensing

2 code implementations CVPR 2021 Zhengjue Wang, Hao Zhang, Ziheng Cheng, Bo Chen, Xin Yuan

To capture high-speed videos using a two-dimensional detector, video snapshot compressive imaging (SCI) is a promising system, where the video frames are coded by different masks and then compressed to a snapshot measurement.

Compressive Sensing Video Compressive Sensing

TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models

1 code implementation 16 Feb 2021 Zhuohan Li, Siyuan Zhuang, Shiyuan Guo, Danyang Zhuo, Hao Zhang, Dawn Song, Ion Stoica

With this key idea, we design TeraPipe, a high-performance token-level pipeline parallel algorithm for synchronous model-parallel training of Transformer-based language models.

Fast Dynamics in a Model Metallic Glass-forming Material

no code implementations 28 Jan 2021 Hao Zhang, Xinyi Wang, Hai-Bin Yu, Jack F. Douglas

We investigate the fast $\beta$- and Johari-Goldstein (JG) $\beta$-relaxation processes, along with the elastic scattering response of glass-forming (GF) liquids and the Boson peak, in a simulated Al-Sm GF material exhibiting a fragile-strong (FS) transition.

Materials Science

Dynamic Heterogeneity, Cooperative Motion, and Johari-Goldstein $\beta$-Relaxation in a Metallic Glass-Forming Material Exhibiting a Fragile to Strong Transition

no code implementations 27 Jan 2021 Hao Zhang, Xinyi Wang, Hai-Bin Yu, Jack F. Douglas

We investigate the Johari-Goldstein (JG) $\beta$-relaxation process in a model metallic glass-forming (GF) material (Al90Sm10), previously studied extensively by both frequency-dependent mechanical measurements and simulation studies devoted to equilibrium properties, by molecular dynamics simulations based on validated and optimized interatomic potentials with the primary aim of better understanding the nature of this universal relaxation process from a dynamic heterogeneity (DH) perspective.

Materials Science

Data Association Between Perception and V2V Communication Sensors

no code implementations 20 Jan 2021 Mustafa Ridvan Cantas, Arpita Chand, Hao Zhang, Gopi Chandra Surnilla, Levent Guvenc

The connectivity between vehicles, infrastructure, and other traffic participants brings a new dimension to automotive safety applications.

Decision Making Robotics Systems and Control

Reinforcement Learning for Flexibility Design Problems

no code implementations 2 Jan 2021 Yehua Wei, Lei Zhang, Ruiyi Zhang, Shijing Si, Hao Zhang, Lawrence Carin

Flexibility design problems are a class of problems that appear in strategic decision-making across industries, where the objective is to design a (e.g., manufacturing) network that affords flexibility and adaptivity.

Decision Making reinforcement-learning +1

Towards Understanding and Improving Dropout in Game Theory

no code implementations ICLR 2021 Hao Zhang, Sen Li, Yinchao Ma, Mingjie Li, Yichen Xie, Quanshi Zhang

Experimental results on various DNNs and datasets have shown that the interaction loss can effectively improve the utility of dropout and boost the performance of DNNs.

An Embarrassingly Simple Model for Dialogue Relation Extraction

1 code implementation 27 Dec 2020 Fuzhao Xue, Aixin Sun, Hao Zhang, Jinjie Ni, Eng Siong Chng

Dialogue relation extraction (RE) is to predict the relation type of two entities mentioned in a dialogue.

Dialog Relation Extraction

Simultaneous View and Feature Selection for Collaborative Multi-Robot Perception

no code implementations 17 Dec 2020 Brian Reily, Hao Zhang

In this paper, we propose a novel approach to collaborative multi-robot perception that simultaneously integrates view selection, feature selection, and object recognition into a unified regularized optimization formulation, which uses sparsity-inducing norms to identify the robots with the most representative views and the modalities with the most discriminative features.

feature selection Object Recognition

Roof-GAN: Learning to Generate Roof Geometry and Relations for Residential Houses

1 code implementation CVPR 2021 Yiming Qian, Hao Zhang, Yasutaka Furukawa

This paper presents Roof-GAN, a novel generative adversarial network that generates structured geometry of residential roof structures as a set of roof primitives and their relationships.

Unsupervised Image Segmentation using Mutual Mean-Teaching

no code implementations 16 Dec 2020 Zhichao Wu, Lei Guo, Hao Zhang, Dan Xu

Unsupervised image segmentation aims to assign pixels with similar features to the same cluster without annotation, which is an important task in computer vision.

Image Segmentation Semantic Segmentation +1

GDPNet: Refining Latent Multi-View Graph for Relation Extraction

1 code implementation 12 Dec 2020 Fuzhao Xue, Aixin Sun, Hao Zhang, Eng Siong Chng

Recent advances on the RE task come from BERT-based sequence modeling and graph-based modeling of relationships among the tokens in the sequence.

Ranked #4 on Dialog Relation Extraction on DialogRE (F1c (v1) metric)

Dialog Relation Extraction Dynamic Time Warping

D$^2$IM-Net: Learning Detail Disentangled Implicit Fields from Single Images

no code implementations 11 Dec 2020 Manyi Li, Hao Zhang

We present the first single-view 3D reconstruction network aimed at recovering geometric details from an input image which encompass both topological shape structures and surface features.

3D Reconstruction Single-View 3D Reconstruction

On Learning the Right Attention Point for Feature Enhancement

no code implementations 11 Dec 2020 Liqiang Lin, Pengdi Huang, Chi-Wing Fu, Kai Xu, Hao Zhang, Hui Huang

We present a novel attention-based mechanism to learn enhanced point features for point cloud processing tasks, e.g., classification and segmentation.

Classification Point Cloud Classification

LayoutGMN: Neural Graph Matching for Structural Layout Similarity

1 code implementation CVPR 2021 Akshay Gadi Patil, Manyi Li, Matthew Fisher, Manolis Savva, Hao Zhang

In particular, retrieval results by our network better match human judgement of structural layout similarity compared to both IoUs and other baselines including a state-of-the-art method based on graph neural networks and image convolution.

Graph Matching Metric Learning +1

Bidirectional Convolutional Poisson Gamma Dynamical Systems

1 code implementation NeurIPS 2020 Wenchao Chen, Chaojie Wang, Bo Chen, Yicheng Liu, Hao Zhang, Mingyuan Zhou

Incorporating the natural document-sentence-word structure into hierarchical Bayesian modeling, we propose convolutional Poisson gamma dynamical systems (PGDS) that introduce not only word-level probabilistic convolutions, but also sentence-level stochastic temporal transitions.

Bayesian Inference Variational Inference