End-to-end Mongolian-Chinese Speech Translation Based on Multi-level Pre-training Strategies and Multi-task Learning

no code implementations CCL 2021 Ningning Wang, Long Fei, HUI ZHANG

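End-to-end speech translation translates source-language speech directly into target-language text. It requires paired "source-language speech, target-language text" training data, but such data is extremely scarce. This paper proposes a training method that combines a multi-level pre-training strategy with multi-task learning. First, the individual modules of the speech recognition and machine translation models are pre-trained at multiple levels; next, the speech recognition and machine translation models are connected to form a speech translation model; then transfer learning is used to fine-tune the pre-trained model in multiple steps. During this process, multi-task learning is also applied, with speech recognition serving as an auxiliary task for speech translation, so that existing data in various forms is fully exploited to train the end-to-end model. The work applies end-to-end techniques to resource-constrained Mongolian-Chinese speech translation for the first time and builds the first practically usable end-to-end Mongolian-Chinese speech translation system with relatively high translation quality.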

Multi-Task Learning

Focused and Collaborative Feedback Integration for Interactive Image Segmentation

1 code implementation 21 Mar 2023 Qiaoqiao Wei, HUI ZHANG, Jun-Hai Yong

Interactive image segmentation aims at obtaining a segmentation mask for an image using simple user annotations.

Image Segmentation Semantic Segmentation

Resolving quantitative MRI model degeneracy with machine learning via training data distribution design

no code implementations 9 Mar 2023 Michele Guerreri, Sean Epstein, Hojjat Azadbakht, HUI ZHANG

Our results illustrate the importance of training set design, which has the potential to allow accurate estimation of tissue properties with ML.

Collaborative Perception in Autonomous Driving: Methods, Datasets and Challenges

no code implementations 16 Jan 2023 Yushan Han, HUI ZHANG, Huifang Li, Yi Jin, Congyan Lang, Yidong Li

Although some works have reviewed and analyzed the basic architecture and key components in this field, there is still a lack of reviews on systematic collaboration modules in perception networks and large-scale collaborative perception datasets.

Autonomous Driving

Prototypical Residual Networks for Anomaly Detection and Localization

1 code implementation 5 Dec 2022 HUI ZHANG, Zuxuan Wu, Zheng Wang, Zhineng Chen, Yu-Gang Jiang

Anomaly detection and localization are widely used in industrial manufacturing for their efficiency and effectiveness.

Ranked #1 on supervised anomaly detection on MVTec AD (using extra training data)

supervised anomaly detection

Self-Regularized Prototypical Network for Few-Shot Semantic Segmentation

no code implementations 30 Oct 2022 Henghui Ding, HUI ZHANG, Xudong Jiang

A direct yet effective prototype regularization on the support set is proposed in SRPNet, in which the generated prototypes are evaluated and regularized on the support set itself.

Few-Shot Semantic Segmentation Semantic Segmentation

Searching a High-Performance Feature Extractor for Text Recognition Network

no code implementations 27 Sep 2022 HUI ZHANG, Quanming Yao, James T. Kwok, Xiang Bai

We design a domain-specific search space by exploring principles for having good feature extractors.

Neural Architecture Search

How can spherical CNNs benefit ML-based diffusion MRI parameter estimation?

no code implementations 1 Jul 2022 Tobias Goodwin-Allcock, Jason McEwen, Robert Gray, Parashkev Nachev, HUI ZHANG

A possible consequence of the lack of rotational equivariance is that the training dataset must contain a diverse range of microstructure orientations.

Enforcing continuous symmetries in physics-informed neural network for solving forward and inverse problems of partial differential equations

no code implementations 19 Jun 2022 Zhi-Yong Zhang, HUI ZHANG, Li-Sheng Zhang, Lei-Lei Guo

As a typical application of deep learning, the physics-informed neural network (PINN) has been successfully used to find numerical solutions of partial differential equations (PDEs), but improving its limited accuracy remains a great challenge.

A Unified Understanding of Deep NLP Models for Text Classification

no code implementations 19 Jun 2022 Zhen Li, Xiting Wang, Weikai Yang, Jing Wu, Zhengyan Zhang, Zhiyuan Liu, Maosong Sun, HUI ZHANG, Shixia Liu

The rapid development of deep natural language processing (NLP) models for text classification has led to an urgent need for a unified understanding of these models proposed individually.

text-classification Text Classification

0/1 Deep Neural Networks via Block Coordinate Descent

no code implementations 19 Jun 2022 HUI ZHANG, Shenglong Zhou, Geoffrey Ye Li, Naihua Xiu

The step function is one of the simplest and most natural activation functions for deep neural networks (DNNs).

Rotated MNIST

Neural Network Decoders for Permutation Codes Correcting Different Errors

no code implementations 7 Jun 2022 Yeow Meng Chee, HUI ZHANG

Permutation codes have been extensively studied for correcting different types of errors in applications such as power line communication and rank modulation for flash memory.

3D Segmentation Guided Style-based Generative Adversarial Networks for PET Synthesis

no code implementations 18 May 2022 Yang Zhou, Zhiwen Yang, HUI ZHANG, Eric I-Chao Chang, Yubo Fan, Yan Xu

We adopt a task-driven strategy that couples a segmentation task with a generative adversarial network (GAN) framework to improve the translation performance.


Choice of training label matters: how to best use deep learning for quantitative MRI parameter estimation

no code implementations 11 May 2022 Sean C. Epstein, Timothy J. P. Bray, Margaret Hall-Craggs, HUI ZHANG

Self-supervised approaches, sometimes referred to as unsupervised, have been loosely based on auto-encoders, whereas supervised methods have, to date, been trained on ground-truth labels.

Self-Supervised Learning

Thin-Plate Spline Motion Model for Image Animation

1 code implementation CVPR 2022 Jian Zhao, HUI ZHANG

Firstly, we propose thin-plate spline motion estimation to produce a more flexible optical flow, which warps the feature maps of the source image to the feature domain of the driving image.

Image Animation Motion Estimation +1

Projected Sliced Wasserstein Autoencoder-based Hyperspectral Images Anomaly Detection

no code implementations 20 Dec 2021 Yurong Chen, HUI ZHANG, Yaonan Wang, Q. M. Jonathan Wu, Yimin Yang

In this case, the Wasserstein distance can be calculated in closed form, even if the prior distribution is not Gaussian.

Anomaly Detection

D-HAN: Dynamic News Recommendation with Hierarchical Attention Network

1 code implementation 19 Dec 2021 Qinghua Zhao, Xu Chen, HUI ZHANG, Shuai Ma

However, in real-world scenarios, news can be quite complex and diverse, and blindly squeezing all of its content into a single embedding vector can be less effective at extracting information compatible with users' personalized preferences.

News Recommendation

Auto-Encoding Score Distribution Regression for Action Quality Assessment

2 code implementations 22 Nov 2021 Boyu Zhang, Jiayuan Chen, Yinfei Xu, HUI ZHANG, Xu Yang, Xin Geng

Traditionally, AQA is treated as a regression problem to learn the underlying mappings between videos and action scores.

Action Quality Assessment regression

MAGORINO: Magnitude-only fat fraction and R2* estimation with Rician noise modelling

no code implementations 11 Oct 2021 Timothy JP Bray, Alan Bainbridge, Margaret A Hall-Craggs, HUI ZHANG

Purpose: Magnitude-based fitting of chemical shift-encoded data enables proton density fat fraction (PDFF) and R2* estimation where complex-based methods fail or when phase data is inaccessible or unreliable, such as in multi-centre studies.

Unveiling personnel movement in a larger indoor area with a non-overlapping multi-camera system

no code implementations 10 Apr 2021 Ping Zhang, Zhenxiang Tao, Wenjie Yang, Minze Chen, Shan Ding, Xiaodong Liu, Rui Yang, HUI ZHANG

Surveillance cameras are widely applied for indoor occupancy measurement and human movement perception, which benefit building energy management and social security.

energy management Management +1

Free-form Description Guided 3D Visual Graph Network for Object Grounding in Point Cloud

1 code implementation ICCV 2021 Mingtao Feng, Zhen Li, Qi Li, Liang Zhang, Xiangdong Zhang, Guangming Zhu, HUI ZHANG, Yaonan Wang, Ajmal Mian

There are three main challenges in 3D object grounding: to find the main focus in the complex and diverse description; to understand the point cloud scene; and to locate the target object.

Learning Frequency-aware Dynamic Network for Efficient Super-Resolution

no code implementations ICCV 2021 Wenbin Xie, Dehua Song, Chang Xu, Chunjing Xu, HUI ZHANG, Yunhe Wang

Extensive experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed with various SISR neural architectures to obtain a better tradeoff between visual quality and computational complexity.

Image Super-Resolution

EventAnchor: Reducing Human Interactions in Event Annotation of Racket Sports Videos

no code implementations 13 Jan 2021 Dazhen Deng, Jiang Wu, Jiachen Wang, Yihong Wu, Xiao Xie, Zheng Zhou, HUI ZHANG, Xiaolong Zhang, Yingcai Wu

The popularity of racket sports (e.g., tennis and table tennis) leads to high demands for data analysis, such as notational analysis, on player performance.

TESS Delivers Five New Hot Giant Planets Orbiting Bright Stars from the Full Frame Images

no code implementations 5 Jan 2021 Joseph E. Rodriguez, Samuel N. Quinn, George Zhou, Andrew Vanderburg, Louise D. Nielsen, Robert A. Wittenmyer, Rafael Brahm, Phillip A. Reed, Chelsea X. Huang, Sydney Vach, David R. Ciardi, Ryan J. Oelkers, Keivan G. Stassun, Coel Hellier, B. Scott Gaudi, Jason D. Eastman, Karen A. Collins, Allyson Bieryla, Sam Christian, David W. Latham, Ilaria Carleo, Duncan J. Wright, Elisabeth Matthews, Erica J. Gonzales, Carl Ziegler, Courtney D. Dressing, Steve B. Howell, Thiam-Guan Tan, Justin Wittrock, Peter Plavchan, Kim K. McLeod, David Baker, Gavin Wang, Don Radford, Richard P. Schwarz, Massimiliano Esposito, George R. Ricker, Roland K. Vanderspek, Sara Seager, Joshua N. Winn, Jon M. Jenkins, Brett Addison, D. R. Anderson, Thomas Barclay, Thomas G. Beatty, Perry Berlind, Francois Bouchy, Michael Bowen, Brendan P. Bowler, C. E. Brasseur, César Briceño, Douglas A. Caldwell, Michael L. Calkins, Priyanka Chaturvedi, Guillaume Chaverot, Sudhish Chimaladinne, Jessie L. Christiansen, Kevin Collins, Ian J. M. Crossfield, Kevin Eastridge, Néstor Espinoza, Gilbert A. Esquerdo, Dax Feliz, Tyler Fenske, William Fong, Tianjun Gan, Steven Giacalone, Holden Gill, Lindsey Gordon, Alex Granados, Nolan Grieves, Eike W. Guenther, Natalia Guerrero, Thomas Henning, Christopher E. Henze, Katharine Hesse, Melissa J. Hobson, Jonathan Horner, David J. James, Eric L. N. Jensen, Mary Jimenez, Andrés Jordán, Stephen R. Kane, John Kielkopf, Kingsley Kim, Rudolf B. Kuhn, Natasha Latouf, Nicholas M. Law, Alan M. Levine, Michael B. Lund, Andrew W. Mann, Shude Mao, Rachel A. Matson, Scott McDermott, Matthew W. Mengel, Jessica Mink, Patrick Newman, Tanner O'Dwyer, Jack Okumura, Enric Palle, Joshua Pepper, Elisa V. Quintana, Paula Sarkis, Arjun Savel, Joshua E. Schlieder, Chloe Schnaible, Avi Shporer, Ramotholo Sefako, Julia Seidel, Robert J. Siverd, Brett Skinner, Manu Stalport, Daniel J. Stevens, Caitlin Stibbards, C. G. Tinney, R. G. West, Daniel A. Yahalomi, HUI ZHANG

TOI-640 b is one of only three known hot Jupiters to have a highly inflated radius (R$_{\rm P}$ > 1.7R$_{\rm J}$, possibly a result of its host star's evolution) and resides on an orbit with a period longer than 5 days.

Earth and Planetary Astrophysics Solar and Stellar Astrophysics

Multi-Model Least Squares-Based Recomputation Framework for Large Data Analysis

no code implementations 4 Jan 2021 Wandong Zhang, QM Jonathan Wu, Yimin Yang, WG Will Zhao, Tianlei Wang, HUI ZHANG

Most multilayer least squares (LS)-based neural networks are structured with two separate stages: unsupervised feature encoding and supervised pattern classification.

Representation Learning

Interaction via Bi-Directional Graph of Semantic Region Affinity for Scene Parsing

no code implementations ICCV 2021 Henghui Ding, HUI ZHANG, Jun Liu, Jiaxin Li, Zijian Feng, Xudong Jiang

In this work, we treat each respective region in an image as a whole, and capture the structure topology as well as the affinity among different regions.

Scene Parsing

Prototypical Matching and Open Set Rejection for Zero-Shot Semantic Segmentation

no code implementations ICCV 2021 HUI ZHANG, Henghui Ding

In this work, we present zero-shot semantic segmentation, which aims to identify not only the seen classes contained in training but also the novel classes that have never been seen.

Semantic Segmentation

[Re] Reimplementation of FixMatch and Investigation on Noisy (Pseudo) Labels and Confirmation Errors of FixMatch

1 code implementation RC 2020 Ci Li, Ruibo Tu, HUI ZHANG

FixMatch is a semi-supervised learning method that achieves results comparable to fully supervised learning by leveraging a limited amount of labeled data (pseudo-labelling technique) and making good use of the unlabeled data (consistency regularization).

Semi-Supervised Image Classification

Decoupling Representation and Classifier for Noisy Label Learning

no code implementations 16 Nov 2020 HUI ZHANG, Quanming Yao

Since convolutional neural networks (ConvNets) can easily memorize noisy labels, which are ubiquitous in visual classification tasks, it has been a great challenge to train ConvNets against them robustly.

Self-Supervised Learning

Robust Speaker Extraction Network Based on Iterative Refined Adaptation

no code implementations 4 Nov 2020 Chengyun Deng, Shiqian Ma, Yi Zhang, Yongtao Sha, HUI ZHANG, Hui Song, Xiangang Li

dataset confirm the superior performance of the proposed method over the network without IRA in terms of SI-SDR and PESQ improvement.

UNetGAN: A Robust Speech Enhancement Approach in Time Domain for Extremely Low Signal-to-noise Ratio Condition

no code implementations 29 Oct 2020 Xiang Hao, Xiangdong Su, Zhiyu Wang, HUI ZHANG, Batushiren

This approach consists of a generator network and a discriminator network, which operate directly in the time domain.

Speech Enhancement

TOI-954 b and EPIC 246193072 b: Short-Period Saturn-Mass Planets that Test Whether Irradiation Leads to Inflation

no code implementations 27 Oct 2020 Lizhou Sha, Chelsea X. Huang, Avi Shporer, Joseph E. Rodriguez, Andrew Vanderburg, Rafael Brahm, Janis Hagelberg, Elisabeth C. Matthews, Carl Ziegler, John H. Livingston, Keivan G. Stassun, Duncan J. Wright, Jeffrey D. Crane, Néstor Espinoza, François Bouchy, Gáspár Á. Bakos, Karen A. Collins, George Zhou, Allyson Bieryla, Joel D. Hartman, Robert A. Wittenmyer, Louise D. Nielsen, Peter Plavchan, Daniel Bayliss, Paula Sarkis, Thiam-Guan Tan, Ryan Cloutier, Luigi Mancini, Andrés Jordán, Sharon Wang, Thomas Henning, Norio Narita, Kaloyan Penev, Johanna K. Teske, Stephen R. Kane, Andrew W. Mann, Brett C. Addison, Motohide Tamura, Jonathan Horner, Mauro Barbieri, Jennifer A. Burt, Matías R. Díaz, Ian J. M. Crossfield, Diana Dragomir, Holger Drass, Adina D. Feinstein, HUI ZHANG, Rhodes Hart, John F. Kielkopf, Eric L. N. Jensen, Benjamin T. Montet, Gaël Ottoni, Richard P. Schwarz, Felipe Rojas, David Lopez Fdez Nespral, Pascal Torres, Matthew W. Mengel, Stéphane Udry, Abner Zapata, Erin Snoddy, Jack Okumura, George R. Ricker, Roland K. Vanderspek, David W. Latham, Joshua N. Winn, Sara Seager, Jon M. Jenkins, Knicole D. Colón, Christopher E. Henze, Akshata Krishnamurthy, Eric B. Ting, Michael Vezie, Steven Villanueva

We report the discovery of two short-period Saturn-mass planets, one transiting the G subgiant TOI-954 (TIC 44792534, $V = 10.343$, $T = 9.78$) observed in TESS Sectors 4 & 5, and one transiting the G dwarf EPIC 246193072 ($V = 12.70$, $K = 10.67$) observed in K2 Campaigns 12 & 19.

Earth and Planetary Astrophysics

A marine radioisotope gamma-ray spectrum analysis method based on Monte Carlo simulation and MLP neural network

no code implementations 24 Oct 2020 Wenhan Dai, Zhi Zeng, Daowei Dou, Hao Ma, Jianping Chen, Junli Li, HUI ZHANG

We apply a multilayer perceptron (MLP) to analyze the 662 keV full energy peak of Cs-137 in the seawater spectrum.

Deep Monocular Visual Odometry for Ground Vehicle

no code implementations 21 Sep 2020 Xiangwei Wang, HUI ZHANG

To push the limit, we analyze the motion pattern of a ground vehicle and focus on learning two-degrees-of-freedom motions through the proposed motion focusing and decoupling.

Camera Calibration Dimensionality Reduction +1

Explanation of Reinforcement Learning Model in Dynamic Multi-Agent System

no code implementations 4 Aug 2020 Xinzhi Wang, Huao Li, HUI ZHANG, Michael Lewis, Katia Sycara

The results show that verbal explanations generated by both models improve users' subjective satisfaction with the interpretability of DRL systems.

reinforcement-learning Reinforcement Learning (RL)

Emotion Correlation Mining Through Deep Learning Models on Natural Language Text

no code implementations 28 Jul 2020 Xinzhi Wang, Luyao Kou, Vijayan Sugumaran, Xiangfeng Luo, HUI ZHANG

That is, journalists may try to attract attention using fear and joy words but arouse the emotion love instead; after a news release, netizens generate emotional comments to express their intense emotions, i.e., anger, sadness, and love.

Emotion Recognition

AutoSTR: Efficient Backbone Search for Scene Text Recognition

1 code implementation ECCV 2020 Hui Zhang, Quanming Yao, Mingkun Yang, Yongchao Xu, Xiang Bai

In this work, inspired by the success of neural architecture search (NAS), which can identify better architectures than human-designed ones, we propose automated STR (AutoSTR) to search data-dependent backbones to boost text recognition performance.

Deblurring Neural Architecture Search +1

Single Channel Speech Enhancement Using Temporal Convolutional Recurrent Neural Networks

no code implementations 2 Feb 2020 Jingdong Li, HUI ZHANG, Xueliang Zhang, Changliang Li

We show that our model improves performance compared with existing convolutional recurrent networks.

Speech Enhancement

All You Need Is Boundary: Toward Arbitrary-Shaped Text Spotting

no code implementations 21 Nov 2019 Hao Wang, Pu Lu, HUI ZHANG, Mingkun Yang, Xiang Bai, Yongchao Xu, Mengchao He, Yongpan Wang, Wenyu Liu

Recently, end-to-end text spotting, which aims to detect and recognize text in cluttered images simultaneously, has received growing interest in computer vision.

Instance Segmentation Scene Text Detection +2

A Generalized and Robust Method Towards Practical Gaze Estimation on Smart Phone

no code implementations 16 Oct 2019 Tianchu Guo, Yongchao Liu, HUI ZHANG, Xiabing Liu, Youngjun Kwak, Byung In Yoo, Jae-Joon Han, Changkyu Choi

For the second issue, we define a new metric to measure the robustness of a gaze estimator, and propose an adversarial-training-based Disturbance with Ordinal loss (DwO) method to improve it.

Gaze Estimation Knowledge Distillation

Learning Alignment for Multimodal Emotion Recognition from Speech

1 code implementation 6 Sep 2019 Haiyang Xu, HUI ZHANG, Kun Han, Yun Wang, Yiping Peng, Xiangang Li

Further, although emotion recognition benefits from audio-textual multimodal information, it is not trivial to build a system that learns from multiple modalities.

Multimodal Emotion Recognition Speech Emotion Recognition +2

User independent Emotion Recognition with Residual Signal-Image Network

no code implementations 10 Aug 2019 Guanghao Yin, Shou-qian Sun, HUI ZHANG, Dian Yu, Chao Li, Ke-jun Zhang, Ning Zou

To the best of the authors' knowledge, our method is the first attempt to classify large-scale subject-independent emotion with 7962 pieces of EDA signals from 457 subjects.

Emotion Recognition

DeepDA: LSTM-based Deep Data Association Network for Multi-Targets Tracking in Clutter

no code implementations 16 Jul 2019 Huajun Liu, HUI ZHANG, Christoph Mertz

This paper proposes DeepDA, a Long Short-Term Memory (LSTM) neural-network-based data association algorithm for multi-target tracking in clutter, to deal with the NP-hard combinatorial optimization problem.

Association Combinatorial Optimization

Training GANs with Centripetal Acceleration

no code implementations 24 Feb 2019 Wei Peng, Yu-Hong Dai, HUI ZHANG, Li-Zhi Cheng

Training generative adversarial networks (GANs) often suffers from cyclic behaviors of iterates.

Estimation of Inter-Sentiment Correlations Employing Deep Neural Network Models

no code implementations 24 Nov 2018 Xinzhi Wang, Shengcheng Yuan, HUI ZHANG, Yi Liu

By contrast, in objective news bodies and titles, it is easy to regard the text as causing love (gd).

A LSTM Approach with Sub-Word Embeddings for Mongolian Phrase Break Prediction

no code implementations COLING 2018 Rui Liu, Feilong Bao, Guanglai Gao, HUI ZHANG, Yonghe Wang

In this paper, we first apply word embeddings that focus on sub-word units to the Mongolian Phrase Break (PB) prediction task, using a Long Short-Term Memory (LSTM) model.

Dictionary Learning Machine Translation +2

Active Deep Learning for Classification of Hyperspectral Images

no code implementations 30 Nov 2016 Peng Liu, HUI ZHANG, Kie B. Eom

It is shown that the proposed algorithm is efficient and effective in classifying hyperspectral images.

Active Learning Classification +3

The Common Self-Polar Triangle of Concentric Circles and Its Application to Camera Calibration

no code implementations CVPR 2015 Haifei Huang, HUI ZHANG, Yiu-ming Cheung

In this paper, we explore the properties of the common self-polar triangle, when the two conics happen to be concentric circles.

Camera Calibration

Scale Selection of Adaptive Kernel Regression by Joint Saliency Map for Nonrigid Image Registration

no code implementations 3 Mar 2013 Zhuangming Shen, Jiuai Sun, HUI ZHANG, Binjie Qin

JSM guides the local structure matching in nonrigid registration by emphasizing these JSSs' sparse deformation vectors in adaptive kernel regression of hierarchical sparse deformation vectors for iterative dense deformation reconstruction.

Image Registration regression

Local Structure Matching Driven by Joint-Saliency-Structure Adaptive Kernel Regression

no code implementations 3 Feb 2013 Binjie Qin, Zhuangming Shen, Zien Zhou, Jiawei Zhou, Jiuai Sun, HUI ZHANG, Mingxing Hu, Yisong Lv

For nonrigid image registration, matching the particular structures (or the outliers) that have missing correspondence and/or large local deformations can be more difficult than matching the common structures with small deformations in the two images.

Image Registration regression

Inverse-Category-Frequency based supervised term weighting scheme for text categorization

2 code implementations 13 Dec 2010 Deqing Wang, HUI ZHANG

Term weighting schemes often dominate the performance of many classifiers, such as kNN, centroid-based classifiers, and SVMs.

Cross-corpus Information Retrieval +3

