Search Results for author: HUI ZHANG

Found 103 papers, 26 papers with code

End-to-end Mongolian-Chinese Speech Translation Based on Multi-level Pre-training Strategies and Multi-task Learning

no code implementations CCL 2021 Ningning Wang, Long Fei, HUI ZHANG

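End-to-end speech translation translates source-language speech directly into target-language text. It requires paired "source-language speech, target-language text" training data, but such data are extremely scarce. This paper proposes a training method that combines a multi-level pre-training strategy with multi-task learning. First, the individual modules of the speech recognition and machine translation models are pre-trained at multiple levels. Next, the speech recognition and machine translation models are connected to form a speech translation model. Transfer learning is then used to fine-tune the pre-trained model in multiple steps, during which multi-task learning is applied with speech recognition as an auxiliary task for speech translation. This makes full use of existing data in various forms to train the end-to-end model. The work applies end-to-end techniques to Mongolian-Chinese speech translation under resource-constrained conditions for the first time, and builds the first practically usable end-to-end Mongolian-Chinese speech translation system with relatively high translation quality.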

Multi-Task Learning

Teaching with Uncertainty: Unleashing the Potential of Knowledge Distillation in Object Detection

no code implementations 11 Jun 2024 Junfei Yi, Jianxu Mao, Tengfei Liu, Mingjie Li, Hanyu Gu, HUI ZHANG, Xiaojun Chang, Yaonan Wang

In this paper, we propose a novel feature-based distillation paradigm with knowledge uncertainty for object detection, termed "Uncertainty Estimation-Discriminative Knowledge Extraction-Knowledge Transfer (UET)", which can seamlessly integrate with existing distillation methods.

Knowledge Distillation object-detection +2

Nacala-Roof-Material: Drone Imagery for Roof Detection, Classification, and Segmentation to Support Mosquito-borne Disease Risk Assessment

no code implementations 7 Jun 2024 Venkanna Babu Guthula, Stefan Oehmcke, Remigio Chilaule, HUI ZHANG, Nico Lang, Ankit Kariryaa, Johan Mottelson, Christian Igel

We show that our DOW variant is a generic approach that improves the performance of both U-Net and DINOv2 backbones, leading to a better trade-off between semantic segmentation and instance segmentation.

Decoder Instance Segmentation +5

All-In-One Medical Image Restoration via Task-Adaptive Routing

1 code implementation 30 May 2024 Zhiwen Yang, Haowei Chen, Ziniu Qian, Yang Yi, HUI ZHANG, Dan Zhao, Bingzheng Wei, Yan Xu

In this paper, we focus on the task of all-in-one medical image restoration, aiming to address multiple distinct MedIR tasks with a single universal model.

Denoising Image Restoration +1

Dynamic Identity-Guided Attention Network for Visible-Infrared Person Re-identification

no code implementations 21 May 2024 Peng Gao, Yujian Lee, HUI ZHANG, Xubo Liu, Yiyang Hu, Guquan Jing

Effectively minimizing these cross-modal discrepancies relies on obtaining representations that are guided by identity and consistent across modalities, while also filtering out representations that are irrelevant to identity.

Person Re-Identification

Eddeep: Fast eddy-current distortion correction for diffusion MRI with deep learning

no code implementations 17 May 2024 Antoine Legouhy, Ross Callaghan, Whitney Stee, Philippe Peigneux, Hojjat Azadbakht, HUI ZHANG

However, this is non-trivial because correspondence between volumes can be severely disrupted due to volume-specific signal attenuations induced by varying directions and strengths of the applied gradients.

Image Registration

Deep Lead Optimization: Leveraging Generative AI for Structural Modification

no code implementations 30 Apr 2024 Odin Zhang, Haitao Lin, HUI ZHANG, Huifeng Zhao, Yufei Huang, Yuansheng Huang, Dejun Jiang, Chang-Yu Hsieh, Peichen Pan, Tingjun Hou

Through this lens, de novo design can incorporate strategies from lead optimization to address the challenge of generating hard-to-synthesize molecules; inversely, lead optimization can benefit from the innovations in de novo design by approaching it as a task of generating molecules conditioned on certain substructures.

The X-LANCE Technical Report for Interspeech 2024 Speech Processing Using Discrete Speech Unit Challenge

no code implementations 9 Apr 2024 Yiwei Guo, Chenrun Wang, Yifan Yang, Hankun Wang, Ziyang Ma, Chenpeng Du, Shuai Wang, Hanzheng Li, Shuai Fan, HUI ZHANG, Xie Chen, Kai Yu

Discrete speech tokens have become increasingly popular in multiple speech processing fields, including automatic speech recognition (ASR), text-to-speech (TTS) and singing voice synthesis (SVS).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

GraspXL: Generating Grasping Motions for Diverse Objects at Scale

no code implementations 28 Mar 2024 HUI ZHANG, Sammy Christen, Zicong Fan, Otmar Hilliges, Jie Song

Moreover, we show that our framework can be deployed to different dexterous hands and work with reconstructed or generated objects.

Object

HIMap: HybrId Representation Learning for End-to-end Vectorized HD Map Construction

no code implementations CVPR 2024 Yi Zhou, HUI ZHANG, Jiaqian Yu, Yifan Yang, Sangil Jung, Seung-In Park, ByungIn Yoo

Concretely, we introduce a hybrid representation called HIQuery to represent all map elements, and propose a point-element interactor to interactively extract and encode the hybrid information of elements, e.g., point position and element shape, into the HIQuery.

Representation Learning

VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech

no code implementations 25 Jan 2024 Chenpeng Du, Yiwei Guo, Hankun Wang, Yifan Yang, Zhikang Niu, Shuai Wang, HUI ZHANG, Xie Chen, Kai Yu

Recent TTS models with decoder-only Transformer architecture, such as SPEAR-TTS and VALL-E, achieve impressive naturalness and demonstrate the ability for zero-shot adaptation given a speech prompt.

Decoder Hallucination

Validating Privacy-Preserving Face Recognition under a Minimum Assumption

1 code implementation CVPR 2024 HUI ZHANG, Xingbo Dong, YenLung Lai, Ying Zhou, Xiaoyan Zhang, Xingguo Lv, Zhe Jin, Xuejun Li

The widespread use of cloud-based face recognition technology raises privacy concerns as unauthorized access to face images can expose personal information or be exploited for fraudulent purposes.

Face Recognition Privacy Preserving

Double-Flow GAN model for the reconstruction of perceived faces from brain activities

no code implementations 12 Dec 2023 ZiHao Wang, Jing Zhao, HUI ZHANG

Faces play an important role in human visual perception, and reconstructing perceived faces from brain activities is challenging because of the difficulty of extracting high-level features and maintaining consistency of multiple face attributes, such as expression, identity, gender, etc.

AdaDiff: Adaptive Step Selection for Fast Diffusion

no code implementations 24 Nov 2023 HUI ZHANG, Zuxuan Wu, Zhen Xing, Jie Shao, Yu-Gang Jiang

Diffusion models, as a type of generative model, have achieved impressive results in generating images and videos conditioned on text.

Denoising Image Generation +1

Joint Design of ISAC Waveform under PAPR Constraints

no code implementations 20 Nov 2023 Yating Chen, Cai Wen, Yan Huang, Le Liang, Jie Li, HUI ZHANG, Wei Hong

In this paper, we formulate the precoding problem of integrated sensing and communication (ISAC) waveform as a non-convex quadratically constrained quadratic program (QCQP), in which the weighted sum of communication multi-user interference (MUI) and the gap between dual-use waveform and ideal radar waveform is minimized with peak-to-average power ratio (PAPR) constraints.

Predicting urban tree cover from incomplete point labels and limited background information

no code implementations 20 Nov 2023 HUI ZHANG, Ankit Kariryaa, Venkanna Babu Guthula, Christian Igel, Stefan Oehmcke

This paper studies how to combine accurate point labels of urban trees along streets with crowd-sourced annotations from an open geographic database to delineate city trees in remote sensing images, a task which is challenging even for humans.

Semantic Segmentation

AdapterShadow: Adapting Segment Anything Model for Shadow Detection

1 code implementation 15 Nov 2023 Leiping Jie, HUI ZHANG

To adapt SAM for shadow images, trainable adapters are inserted into the frozen image encoder of SAM, since the training of the full SAM model is both time and memory consuming.

Shadow Detection

Resilient and constrained consensus against adversarial attacks: A distributed MPC framework

no code implementations 10 Nov 2023 Henglai Wei, Kunwu Zhang, HUI ZHANG, Yang Shi

In this work, we propose a distributed resilient consensus framework, consisting of a pre-designed consensus protocol and distributed model predictive control (DMPC) optimization, which can help significantly reduce the requirement on the network robustness and effectively handle the general linear constrained MAS under adversarial attacks.

Adversarial Attack Detection Model Predictive Control

Addressing preferred orientation in single-particle cryo-EM through AI-generated auxiliary particles

no code implementations 26 Sep 2023 HUI ZHANG, Dihan Zheng, Qiurong Wu, Nieng Yan, Zuoqiang Shi, Mingxu Hu, Chenglong Bao

The single-particle cryo-EM field faces the persistent challenge of preferred orientation, lacking general computational solutions.

Single Particle Analysis

ArtiGrasp: Physically Plausible Synthesis of Bi-Manual Dexterous Grasping and Articulation

no code implementations 7 Sep 2023 HUI ZHANG, Sammy Christen, Zicong Fan, Luocheng Zheng, Jemin Hwangbo, Jie Song, Otmar Hilliges

ArtiGrasp leverages reinforcement learning and physics simulations to train a policy that controls the global and local hand pose.

hand-object pose Object

S$^3$-MonoDETR: Supervised Shape&Scale-perceptive Deformable Transformer for Monocular 3D Object Detection

no code implementations 2 Sep 2023 Xuan He, Kailun Yang, Junwei Zheng, Jin Yuan, Luis M. Bergasa, HUI ZHANG, Zhiyong Li

These methods typically use visual and depth representations to generate query points on objects, whose quality plays a decisive role in the detection accuracy.

Monocular 3D Object Detection object-detection

Efficient option pricing with unary-based photonic computing chip and generative adversarial learning

no code implementations 8 Aug 2023 HUI ZHANG, Lingxiao Wan, Sergi Ramos-Calderer, Yuancheng Zhan, Wai-Keong Mok, Hong Cai, Feng Gao, Xianshu Luo, Guo-Qiang Lo, Leong Chuan Kwek, José Ignacio Latorre, Ai Qun Liu

In the modern financial industry system, the structure of products has become more and more complex, and the bottleneck constraint of classical computing power has already restricted the development of the financial industry.

Generative Adversarial Network

Rician likelihood loss for quantitative MRI using self-supervised deep learning

no code implementations 13 Jul 2023 Christopher S. Parker, Anna Schroder, Sean C. Epstein, James Cole, Daniel C. Alexander, HUI ZHANG

Results: Networks trained with NLR loss show higher estimation accuracy than MSE for the ADC and IVIM diffusion coefficients as SNR decreases, with minimal loss of precision or total error.

DRMC: A Generalist Model with Dynamic Routing for Multi-Center PET Image Synthesis

1 code implementation 11 Jul 2023 Zhiwen Yang, Yang Zhou, HUI ZHANG, Bingzheng Wei, Yubo Fan, Yan Xu

To address this, we develop a generalist model that shares architecture and parameters across centers to utilize the shared knowledge.

Image Generation

Physically Realizable Natural-Looking Clothing Textures Evade Person Detectors via 3D Modeling

1 code implementation CVPR 2023 Zhanhao Hu, Wenda Chu, Xiaopei Zhu, HUI ZHANG, Bo Zhang, Xiaolin Hu

In order to craft natural-looking adversarial clothes that can evade person detectors at multiple viewing angles, we propose adversarial camouflage textures (AdvCaT) that resemble one kind of the typical textures of daily clothes, camouflage textures.

Patch-CNN: Training data-efficient deep learning for high-fidelity diffusion tensor estimation from minimal diffusion protocols

no code implementations 3 Jul 2023 Tobias Goodwin-Allcock, Ting Gong, Robert Gray, Parashkev Nachev, HUI ZHANG

To overcome these limitations, we propose Patch-CNN, a neural network with a minimal (non-voxel-wise) convolutional kernel (3$\times$3$\times$3).

SSC3OD: Sparsely Supervised Collaborative 3D Object Detection from LiDAR Point Clouds

no code implementations 3 Jul 2023 Yushan Han, HUI ZHANG, Honglei Zhang, Yidong Li

Extensive experiments on three large-scale datasets reveal that our proposed SSC3OD can effectively improve the performance of sparsely supervised collaborative 3D object detectors.

3D Object Detection Autonomous Driving +2

When SAM Meets Shadow Detection

1 code implementation 19 May 2023 Leiping Jie, HUI ZHANG

As a promptable generic object segmentation model, segment anything model (SAM) has recently attracted significant attention, and also demonstrates its powerful performance.

Image Segmentation Medical Image Segmentation +6

Quadratic Graph Attention Network (Q-GAT) for Robust Construction of Gene Regulatory Networks

1 code implementation 24 Mar 2023 HUI ZHANG, Xuexin An, Qiang He, YuDong Yao, Yudong Zhang, Feng-Lei Fan, Yueyang Teng

The former informs that nonlinear aggregation of quadratic neurons can amplify useful signals and suppress unwanted noise, thereby facilitating robustness, while the latter reveals that Q-GAT can leverage more features in prediction thanks to the dual attention mechanism, which endows Q-GAT with the ability to confront adversarial perturbation.

Graph Attention

DiffusionAD: Norm-guided One-step Denoising Diffusion for Anomaly Detection

1 code implementation 15 Mar 2023 HUI ZHANG, Zheng Wang, Zuxuan Wu, Yu-Gang Jiang

Anomaly detection has garnered extensive applications in real industrial manufacturing due to its remarkable effectiveness and efficiency.

Denoising Unsupervised Anomaly Detection

Resolving quantitative MRI model degeneracy with machine learning via training data distribution design

no code implementations 9 Mar 2023 Michele Guerreri, Sean Epstein, Hojjat Azadbakht, HUI ZHANG

Our results illustrate the importance of training set design which has the potential to allow accurate estimation of tissue properties with ML.

Collaborative Perception in Autonomous Driving: Methods, Datasets and Challenges

1 code implementation 16 Jan 2023 Yushan Han, HUI ZHANG, Huifang Li, Yi Jin, Congyan Lang, Yidong Li

The former focuses on collaboration modules and efficiency, and the latter is devoted to addressing the problems in actual application.

Autonomous Driving

Linking Garment With Person via Semantically Associated Landmarks for Virtual Try-On

no code implementations CVPR 2023 Keyu Yan, Tingwei Gao, HUI ZHANG, Chengjun Xie

In this paper, a novel virtual try-on algorithm, dubbed SAL-VTON, is proposed, which links the garment with the person via semantically associated landmarks to alleviate misalignment.

Virtual Try-on

Prototypical Residual Networks for Anomaly Detection and Localization

no code implementations CVPR 2023 HUI ZHANG, Zuxuan Wu, Zheng Wang, Zhineng Chen, Yu-Gang Jiang

Anomaly detection and localization are widely used in industrial manufacturing for their efficiency and effectiveness.

Ranked #2 on Supervised Anomaly Detection on MVTec AD (using extra training data)

Supervised Anomaly Detection

Self-Regularized Prototypical Network for Few-Shot Semantic Segmentation

no code implementations 30 Oct 2022 Henghui Ding, HUI ZHANG, Xudong Jiang

A direct yet effective prototype regularization on support set is proposed in SRPNet, in which the generated prototypes are evaluated and regularized on the support set itself.

Few-Shot Semantic Segmentation Segmentation +1

How can spherical CNNs benefit ML-based diffusion MRI parameter estimation?

no code implementations 1 Jul 2022 Tobias Goodwin-Allcock, Jason McEwen, Robert Gray, Parashkev Nachev, HUI ZHANG

A possible consequence of the lack of rotational equivariance is that the training dataset must contain a diverse range of microstructure orientations.

A Unified Understanding of Deep NLP Models for Text Classification

no code implementations 19 Jun 2022 Zhen Li, Xiting Wang, Weikai Yang, Jing Wu, Zhengyan Zhang, Zhiyuan Liu, Maosong Sun, HUI ZHANG, Shixia Liu

The rapid development of deep natural language processing (NLP) models for text classification has led to an urgent need for a unified understanding of these models proposed individually.

text-classification Text Classification

Enforcing continuous symmetries in physics-informed neural network for solving forward and inverse problems of partial differential equations

no code implementations 19 Jun 2022 Zhi-Yong Zhang, HUI ZHANG, Li-Sheng Zhang, Lei-Lei Guo

As a typical application of deep learning, physics-informed neural network (PINN) has been successfully used to find numerical solutions of partial differential equations (PDEs), but how to improve the limited accuracy is still a great challenge for PINN.

Neural Network Decoders for Permutation Codes Correcting Different Errors

no code implementations 7 Jun 2022 Yeow Meng Chee, HUI ZHANG

Permutation codes were extensively studied in order to correct different types of errors for the applications on power line communication and rank modulation for flash memory.

3D Segmentation Guided Style-based Generative Adversarial Networks for PET Synthesis

no code implementations 18 May 2022 Yang Zhou, Zhiwen Yang, HUI ZHANG, Eric I-Chao Chang, Yubo Fan, Yan Xu

(2) We adopt a task-driven strategy that couples a segmentation task with a generative adversarial network (GAN) framework to improve the translation performance.

Generative Adversarial Network Translation

Choice of training label matters: how to best use deep learning for quantitative MRI parameter estimation

1 code implementation 11 May 2022 Sean C. Epstein, Timothy J. P. Bray, Margaret Hall-Craggs, HUI ZHANG

Self-supervised approaches, sometimes referred to as unsupervised, have been loosely based on auto-encoders, whereas supervised methods have, to date, been trained on groundtruth labels.

Self-Supervised Learning

Thin-Plate Spline Motion Model for Image Animation

1 code implementation CVPR 2022 Jian Zhao, HUI ZHANG

Firstly, we propose thin-plate spline motion estimation to produce a more flexible optical flow, which warps the feature maps of the source image to the feature domain of the driving image.

Face Reenactment Image Animation +2

Projected Sliced Wasserstein Autoencoder-based Hyperspectral Images Anomaly Detection

no code implementations 20 Dec 2021 Yurong Chen, HUI ZHANG, Yaonan Wang, Q. M. Jonathan Wu, Yimin Yang

In this case, the Wasserstein distance can be calculated in closed form, even if the prior distribution is not Gaussian.

Anomaly Detection

Slot-VPS: Object-centric Representation Learning for Video Panoptic Segmentation

no code implementations CVPR 2022 Yi Zhou, HUI ZHANG, Hana Lee, Shuyang Sun, Pingjun Li, Yangguang Zhu, ByungIn Yoo, Xiaojuan Qi, Jae-Joon Han

We encode all panoptic entities in a video, including both foreground instances and background semantics, with a unified representation called panoptic slots.

Object Representation Learning +1

Auto-Encoding Score Distribution Regression for Action Quality Assessment

2 code implementations 22 Nov 2021 Boyu Zhang, Jiayuan Chen, Yinfei Xu, HUI ZHANG, Xu Yang, Xin Geng

Traditionally, AQA is treated as a regression problem to learn the underlying mappings between videos and action scores.

Action Quality Assessment regression

MAGORINO: Magnitude-only fat fraction and R2* estimation with Rician noise modelling

no code implementations 11 Oct 2021 Timothy JP Bray, Alan Bainbridge, Margaret A Hall-Craggs, HUI ZHANG

Purpose: Magnitude-based fitting of chemical shift-encoded data enables proton density fat fraction (PDFF) and R2* estimation where complex-based methods fail or when phase data is inaccessible or unreliable, such as in multi-centre studies.

Unveiling personnel movement in a larger indoor area with a non-overlapping multi-camera system

no code implementations 10 Apr 2021 Ping Zhang, Zhenxiang Tao, Wenjie Yang, Minze Chen, Shan Ding, Xiaodong Liu, Rui Yang, HUI ZHANG

Surveillance cameras are widely applied for indoor occupancy measurement and human movement perception, which benefit building energy management and social security.

energy management Management +1

Free-form Description Guided 3D Visual Graph Network for Object Grounding in Point Cloud

1 code implementation ICCV 2021 Mingtao Feng, Zhen Li, Qi Li, Liang Zhang, Xiangdong Zhang, Guangming Zhu, HUI ZHANG, Yaonan Wang, Ajmal Mian

There are three main challenges in 3D object grounding: to find the main focus in the complex and diverse description; to understand the point cloud scene; and to locate the target object.

Object

Learning Frequency-aware Dynamic Network for Efficient Super-Resolution

no code implementations ICCV 2021 Wenbin Xie, Dehua Song, Chang Xu, Chunjing Xu, HUI ZHANG, Yunhe Wang

Extensive experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures to obtain a better tradeoff between visual quality and computational complexity.

Image Super-Resolution

EventAnchor: Reducing Human Interactions in Event Annotation of Racket Sports Videos

no code implementations 13 Jan 2021 Dazhen Deng, Jiang Wu, Jiachen Wang, Yihong Wu, Xiao Xie, Zheng Zhou, HUI ZHANG, Xiaolong Zhang, Yingcai Wu

The popularity of racket sports (e.g., tennis and table tennis) leads to high demands for data analysis, such as notational analysis, on player performance.

TESS Delivers Five New Hot Giant Planets Orbiting Bright Stars from the Full Frame Images

no code implementations 5 Jan 2021 Joseph E. Rodriguez, Samuel N. Quinn, George Zhou, Andrew Vanderburg, Louise D. Nielsen, Robert A. Wittenmyer, Rafael Brahm, Phillip A. Reed, Chelsea X. Huang, Sydney Vach, David R. Ciardi, Ryan J. Oelkers, Keivan G. Stassun, Coel Hellier, B. Scott Gaudi, Jason D. Eastman, Karen A. Collins, Allyson Bieryla, Sam Christian, David W. Latham, Ilaria Carleo, Duncan J. Wright, Elisabeth Matthews, Erica J. Gonzales, Carl Ziegler, Courtney D. Dressing, Steve B. Howell, Thiam-Guan Tan, Justin Wittrock, Peter Plavchan, Kim K. McLeod, David Baker, Gavin Wang, Don Radford, Richard P. Schwarz, Massimiliano Esposito, George R. Ricker, Roland K. Vanderspek, Sara Seager, Joshua N. Winn, Jon M. Jenkins, Brett Addison, D. R. Anderson, Thomas Barclay, Thomas G. Beatty, Perry Berlind, Francois Bouchy, Michael Bowen, Brendan P. Bowler, C. E. Brasseur, César Briceño, Douglas A. Caldwell, Michael L. Calkins, Priyanka Chaturvedi, Guillaume Chaverot, Sudhish Chimaladinne, Jessie L. Christiansen, Kevin Collins, Ian J. M. Crossfield, Kevin Eastridge, Néstor Espinoza, Gilbert A. Esquerdo, Dax Feliz, Tyler Fenske, William Fong, Tianjun Gan, Steven Giacalone, Holden Gill, Lindsey Gordon, Alex Granados, Nolan Grieves, Eike W. Guenther, Natalia Guerrero, Thomas Henning, Christopher E. Henze, Katharine Hesse, Melissa J. Hobson, Jonathan Horner, David J. James, Eric L. N. Jensen, Mary Jimenez, Andrés Jordán, Stephen R. Kane, John Kielkopf, Kingsley Kim, Rudolf B. Kuhn, Natasha Latouf, Nicholas M. Law, Alan M. Levine, Michael B. Lund, Andrew W. Mann, Shude Mao, Rachel A. Matson, Scott McDermott, Matthew W. Mengel, Jessica Mink, Patrick Newman, Tanner O'Dwyer, Jack Okumura, Enric Palle, Joshua Pepper, Elisa V. Quintana, Paula Sarkis, Arjun Savel, Joshua E. Schlieder, Chloe Schnaible, Avi Shporer, Ramotholo Sefako, Julia Seidel, Robert J. Siverd, Brett Skinner, Manu Stalport, Daniel J. Stevens, Caitlin Stibbards, C. G. Tinney, R. G. West, Daniel A. Yahalomi, HUI ZHANG

TOI-640 b is one of only three known hot Jupiters to have a highly inflated radius (R$_{\rm P}$ > 1.7R$_{\rm J}$, possibly a result of its host star's evolution) and resides on an orbit with a period longer than 5 days.

Earth and Planetary Astrophysics Solar and Stellar Astrophysics

Multi-Model Least Squares-Based Recomputation Framework for Large Data Analysis

no code implementations 4 Jan 2021 Wandong Zhang, QM Jonathan Wu, Yimin Yang, WG Will Zhao, Tianlei Wang, HUI ZHANG

Most multilayer least squares (LS)-based neural networks are structured with two separate stages: unsupervised feature encoding and supervised pattern classification.

Representation Learning

Prototypical Matching and Open Set Rejection for Zero-Shot Semantic Segmentation

no code implementations ICCV 2021 HUI ZHANG, Henghui Ding

In this work, we present zero-shot semantic segmentation, which aims to identify not only the seen classes contained in training but also the novel classes that have never been seen.

Segmentation Semantic Segmentation +1

Interaction via Bi-Directional Graph of Semantic Region Affinity for Scene Parsing

no code implementations ICCV 2021 Henghui Ding, HUI ZHANG, Jun Liu, Jiaxin Li, Zijian Feng, Xudong Jiang

In this work, we treat each respective region in an image as a whole, and capture the structure topology as well as the affinity among different regions.

Scene Parsing

[Re] Reimplementation of FixMatch and Investigation on Noisy (Pseudo) Labels and Confirmation Errors of FixMatch

1 code implementation RC 2020 Ci Li, Ruibo Tu, HUI ZHANG

FixMatch is a semi-supervised learning method, which achieves results comparable to fully supervised learning by leveraging a limited amount of labeled data (pseudo-labelling technique) and making good use of the unlabeled data (consistency regularization).

Semi-Supervised Image Classification

Combining Self-Supervised and Supervised Learning with Noisy Labels

no code implementations 16 Nov 2020 Yongqi Zhang, HUI ZHANG, Quanming Yao, Jun Wan

Thus, inspired by the observation that classifier is more robust to noisy labels while representation is much more fragile, and by the recent advances of self-supervised representation learning (SSRL) technologies, we design a new method, i.e., CS$^3$NL, to obtain representation by SSRL without labels and train the classifier directly with noisy labels.

Learning with noisy labels Representation Learning +1

Robust Speaker Extraction Network Based on Iterative Refined Adaptation

no code implementations 4 Nov 2020 Chengyun Deng, Shiqian Ma, Yi Zhang, Yongtao Sha, HUI ZHANG, Hui Song, Xiangang Li

dataset confirm the superior performance of the proposed method over the network without IRA in terms of SI-SDR and PESQ improvement.

UNetGAN: A Robust Speech Enhancement Approach in Time Domain for Extremely Low Signal-to-noise Ratio Condition

no code implementations 29 Oct 2020 Xiang Hao, Xiangdong Su, Zhiyu Wang, HUI ZHANG, Batushiren

This approach consists of a generator network and a discriminator network, which operate directly in the time domain.

Speech Enhancement

TOI-954 b and EPIC 246193072 b: Short-Period Saturn-Mass Planets that Test Whether Irradiation Leads to Inflation

no code implementations 27 Oct 2020 Lizhou Sha, Chelsea X. Huang, Avi Shporer, Joseph E. Rodriguez, Andrew Vanderburg, Rafael Brahm, Janis Hagelberg, Elisabeth C. Matthews, Carl Ziegler, John H. Livingston, Keivan G. Stassun, Duncan J. Wright, Jeffrey D. Crane, Néstor Espinoza, François Bouchy, Gáspár Á. Bakos, Karen A. Collins, George Zhou, Allyson Bieryla, Joel D. Hartman, Robert A. Wittenmyer, Louise D. Nielsen, Peter Plavchan, Daniel Bayliss, Paula Sarkis, Thiam-Guan Tan, Ryan Cloutier, Luigi Mancini, Andrés Jordán, Sharon Wang, Thomas Henning, Norio Narita, Kaloyan Penev, Johanna K. Teske, Stephen R. Kane, Andrew W. Mann, Brett C. Addison, Motohide Tamura, Jonathan Horner, Mauro Barbieri, Jennifer A. Burt, Matías R. Díaz, Ian J. M. Crossfield, Diana Dragomir, Holger Drass, Adina D. Feinstein, HUI ZHANG, Rhodes Hart, John F. Kielkopf, Eric L. N. Jensen, Benjamin T. Montet, Gaël Ottoni, Richard P. Schwarz, Felipe Rojas, David Lopez Fdez Nespral, Pascal Torres, Matthew W. Mengel, Stéphane Udry, Abner Zapata, Erin Snoddy, Jack Okumura, George R. Ricker, Roland K. Vanderspek, David W. Latham, Joshua N. Winn, Sara Seager, Jon M. Jenkins, Knicole D. Colón, Christopher E. Henze, Akshata Krishnamurthy, Eric B. Ting, Michael Vezie, Steven Villanueva

We report the discovery of two short-period Saturn-mass planets, one transiting the G subgiant TOI-954 (TIC 44792534, $V = 10.343$, $T = 9.78$) observed in TESS Sectors 4 & 5, and one transiting the G dwarf EPIC 246193072 ($V = 12.70$, $K = 10.67$) observed in K2 Campaigns 12 & 19.

Earth and Planetary Astrophysics

A marine radioisotope gamma-ray spectrum analysis method based on Monte Carlo simulation and MLP neural network

no code implementations 24 Oct 2020 Wenhan Dai, Zhi Zeng, Daowei Dou, Hao Ma, Jianping Chen, Junli Li, HUI ZHANG

We apply multilayer perceptron (MLP) to analyze the 662 keV full energy peak of Cs-137 in the seawater spectrum.

Deep Monocular Visual Odometry for Ground Vehicle

no code implementations 21 Sep 2020 Xiangwei Wang, HUI ZHANG

To push the limit, we analyze the motion pattern of a ground vehicle and focus on learning two-degrees-of-freedom motions by proposed motion focusing and decoupling.

Camera Calibration Dimensionality Reduction +1

Explanation of Reinforcement Learning Model in Dynamic Multi-Agent System

no code implementations 4 Aug 2020 Xinzhi Wang, Huao Li, HUI ZHANG, Michael Lewis, Katia Sycara

The results show that verbal explanations generated by both models improve users' subjective satisfaction with the interpretability of DRL systems.

reinforcement-learning Reinforcement Learning (RL)

Emotion Correlation Mining Through Deep Learning Models on Natural Language Text

no code implementations 28 Jul 2020 Xinzhi Wang, Luyao Kou, Vijayan Sugumaran, Xiangfeng Luo, HUI ZHANG

That means journalists may try to attract attention using fear and joy words but arouse the emotion love instead; after news release, netizens generate emotional comments to express their intense emotions, i.e., anger, sadness, and love.

Emotion Recognition

AutoSTR: Efficient Backbone Search for Scene Text Recognition

1 code implementation ECCV 2020 Hui Zhang, Quanming Yao, Mingkun Yang, Yongchao Xu, Xiang Bai

In this work, inspired by the success of neural architecture search (NAS), which can identify better architectures than human-designed ones, we propose automated STR (AutoSTR) to search data-dependent backbones to boost text recognition performance.

Deblurring Neural Architecture Search +1

Single Channel Speech Enhancement Using Temporal Convolutional Recurrent Neural Networks

no code implementations 2 Feb 2020 Jingdong Li, HUI ZHANG, Xueliang Zhang, Changliang Li

We show that our model improves performance compared with existing convolutional recurrent networks.

Speech Enhancement

All You Need Is Boundary: Toward Arbitrary-Shaped Text Spotting

no code implementations 21 Nov 2019 Hao Wang, Pu Lu, HUI ZHANG, Mingkun Yang, Xiang Bai, Yongchao Xu, Mengchao He, Yongpan Wang, Wenyu Liu

Recently, end-to-end text spotting that aims to detect and recognize text from cluttered images simultaneously has received particularly growing interest in computer vision.

Instance Segmentation Scene Text Detection +3

A Generalized and Robust Method Towards Practical Gaze Estimation on Smart Phone

no code implementations 16 Oct 2019 Tianchu Guo, Yongchao Liu, HUI ZHANG, Xiabing Liu, Youngjun Kwak, Byung In Yoo, Jae-Joon Han, Changkyu Choi

For the second issue, we define a new metric to measure the robustness of gaze estimator, and propose an adversarial training based Disturbance with Ordinal loss (DwO) method to improve it.

Gaze Estimation Knowledge Distillation

Learning Alignment for Multimodal Emotion Recognition from Speech

1 code implementation 6 Sep 2019 Haiyang Xu, HUI ZHANG, Kun Han, Yun Wang, Yiping Peng, Xiangang Li

Further, although emotion recognition benefits from using audio-textual multimodal information, it is not trivial to build a system that learns from multiple modalities.

Multimodal Emotion Recognition Speech Emotion Recognition +2

User independent Emotion Recognition with Residual Signal-Image Network

no code implementations 10 Aug 2019 Guanghao Yin, Shou-qian Sun, HUI ZHANG, Dian Yu, Chao Li, Ke-jun Zhang, Ning Zou

To the best of the authors' knowledge, our method is the first attempt to classify large-scale subject-independent emotion with 7962 pieces of EDA signals from 457 subjects.

Binary Classification Emotion Recognition

DeepDA: LSTM-based Deep Data Association Network for Multi-Targets Tracking in Clutter

no code implementations 16 Jul 2019 Huajun Liu, HUI ZHANG, Christoph Mertz

This paper proposes DeepDA, a Long Short-Term Memory (LSTM) neural network based data association algorithm for multi-target tracking in clutter, to deal with the NP-hard combinatorial optimization problem.

Combinatorial Optimization

Training GANs with Centripetal Acceleration

no code implementations 24 Feb 2019 Wei Peng, Yu-Hong Dai, HUI ZHANG, Li-Zhi Cheng

Training generative adversarial networks (GANs) often suffers from cyclic behaviors of iterates.

Estimation of Inter-Sentiment Correlations Employing Deep Neural Network Models

no code implementations 24 Nov 2018 Xinzhi Wang, Shengcheng Yuan, HUI ZHANG, Yi Liu

By contrast, in objective news bodies and titles, it is easy to regard text as caused love (gd).

A LSTM Approach with Sub-Word Embeddings for Mongolian Phrase Break Prediction

no code implementations COLING 2018 Rui Liu, Feilong Bao, Guanglai Gao, HUI ZHANG, Yonghe Wang

In this paper, we first apply word embeddings that focus on sub-word units to the Mongolian Phrase Break (PB) prediction task, using a Long Short-Term Memory (LSTM) model.

Dictionary Learning Machine Translation +2

Active Deep Learning for Classification of Hyperspectral Images

no code implementations 30 Nov 2016 Peng Liu, HUI ZHANG, Kie B. Eom

It is shown that the proposed algorithm is efficient and effective in classifying hyperspectral images.

Active Learning Classification +3

The Common Self-Polar Triangle of Concentric Circles and Its Application to Camera Calibration

no code implementations CVPR 2015 Haifei Huang, HUI ZHANG, Yiu-ming Cheung

In this paper, we explore the properties of the common self-polar triangle, when the two conics happen to be concentric circles.

Camera Calibration

Scale Selection of Adaptive Kernel Regression by Joint Saliency Map for Nonrigid Image Registration

no code implementations 3 Mar 2013 Zhuangming Shen, Jiuai Sun, HUI ZHANG, Binjie Qin

JSM guides the local structure matching in nonrigid registration by emphasizing these JSSs' sparse deformation vectors in adaptive kernel regression of hierarchical sparse deformation vectors for iterative dense deformation reconstruction.

Image Registration regression

Local Structure Matching Driven by Joint-Saliency-Structure Adaptive Kernel Regression

no code implementations 3 Feb 2013 Binjie Qin, Zhuangming Shen, Zien Zhou, Jiawei Zhou, Jiuai Sun, HUI ZHANG, Mingxing Hu, Yisong Lv

For nonrigid image registration, matching the particular structures (or the outliers) that have missing correspondence and/or large local deformations can be more difficult than matching the common structures with small deformations in the two images.

Image Registration regression

Inverse-Category-Frequency based supervised term weighting scheme for text categorization

2 code implementations 13 Dec 2010 Deqing Wang, HUI ZHANG

Term weighting schemes often dominate the performance of many classifiers, such as kNN, centroid-based classifier and SVMs.

Cross-corpus Information Retrieval +3
