Divide and Denoise: Learning from Noisy Labels in Fine-Grained Entity Typing with Cluster-Wise Loss Correction

no code implementations ACL 2022 Kunyuan Pang, Haoyu Zhang, Jie Zhou, Ting Wang

In this work, we propose a clustering-based loss correction framework, Feature Cluster Loss Correction (FCLC), to address these two problems.

Entity Typing

360ORB-SLAM: A Visual SLAM System for Panoramic Images with Depth Completion Network

no code implementations 19 Jan 2024 Yichen Chen, Yiqi Pan, Ruyu Liu, Haoyu Zhang, Guodao Zhang, Bo Sun, Jianhua Zhang

To enhance the performance and effect of AR/VR applications and visual assistance and inspection systems, visual simultaneous localization and mapping (vSLAM) is a fundamental task in computer vision and robotics.

Depth Completion Simultaneous Localization and Mapping

Flexible uniform-sampling foveated Fourier single-pixel imaging

no code implementations 5 Nov 2023 Huan Cui, Jie Cao, Qun Hao, Haoyu Zhang, Chang Zhou

Experimentally, at a sampling ratio of 0.0084 relative to HR FSI with 1024*768 pixels, UFFSI with 255*341 cells reduces data redundancy by 89%, and the ROI has significantly better imaging quality, meeting imaging needs.

Uncovering Hidden Connections: Iterative Tracking and Reasoning for Video-grounded Dialog

1 code implementation 11 Oct 2023 Haoyu Zhang, Meng Liu, YaoWei Wang, Da Cao, Weili Guan, Liqiang Nie

In response to this gap, we present an iterative tracking and reasoning strategy that amalgamates a textual encoder, a visual encoder, and a generator.

Question Answering Response Generation +1

Learning Language-guided Adaptive Hyper-modality Representation for Multimodal Sentiment Analysis

1 code implementation 9 Oct 2023 Haoyu Zhang, Yu Wang, Guanghao Yin, Kejun Liu, Yuanyuan Liu, Tianshu Yu

Though Multimodal Sentiment Analysis (MSA) proves effective by utilizing rich information from multiple sources (e.g., language, video, and audio), the potential sentiment-irrelevant and conflicting information across modalities may hinder the performance from being further improved.

Multimodal Sentiment Analysis

SAMN: A Sample Attention Memory Network Combining SVM and NN in One Architecture

no code implementations 25 Sep 2023 Qiaoling Yang, Linkai Luo, Haoyu Zhang, Hong Peng, Ziyang Chen

To address this, we propose a sample attention memory network (SAMN) that effectively combines SVM and NN by incorporating a sample attention module, class prototypes, and a memory block into the NN.

A Hierarchical Destroy and Repair Approach for Solving Very Large-Scale Travelling Salesman Problem

no code implementations 9 Aug 2023 Zhang-Hua Fu, Sipeng Sun, Jintong Ren, Tianshu Yu, Haoyu Zhang, Yuanyuan Liu, Lingxiao Huang, Xiang Yan, Pinyan Lu

Fair comparisons based on nineteen famous large-scale instances (with 10,000 to 10,000,000 cities) show that HDR is highly competitive against existing state-of-the-art TSP algorithms, in terms of both efficiency and solution quality.

Computational Efficiency

DiffuseGAE: Controllable and High-fidelity Image Manipulation from Disentangled Representation

no code implementations 12 Jul 2023 Yipeng Leng, Qiangjuan Huang, Zhiyuan Wang, Yangyang Liu, Haoyu Zhang

To further explore the latent space of Diff-AE and achieve a generic editing pipeline, we propose a module called Group-supervised AutoEncoder (dubbed GAE) for Diff-AE to achieve better disentanglement of the latent code.

Attribute Disentanglement +2

GenerTTS: Pronunciation Disentanglement for Timbre and Style Generalization in Cross-Lingual Text-to-Speech

no code implementations 27 Jun 2023 Yahuan Cong, Haoyu Zhang, Haopeng Lin, Shichao Liu, Chunfeng Wang, Yi Ren, Xiang Yin, Zejun Ma

Cross-lingual timbre and style generalizable text-to-speech (TTS) aims to synthesize speech with a specific reference timbre or style that is never trained in the target language.

Disentanglement Style Generalization

Learning Variable Impedance Skills from Demonstrations with Passivity Guarantee

no code implementations 20 Jun 2023 Yu Zhang, Long Cheng, Xiuze Xia, Haoyu Zhang

The proposed approach involves the estimation of full stiffness matrices from human demonstrations, which are then combined with sensed forces and motion information to create a model using the non-parametric method.

Noise-Resistant Multimodal Transformer for Emotion Recognition

no code implementations 4 May 2023 Yuanyuan Liu, Haoyu Zhang, Yibing Zhan, Zijing Chen, Guanghao Yin, Lin Wei, Zhe Chen

To this end, we present a novel paradigm that attempts to extract noise-resistant features in its pipeline and introduces a noise-aware learning scheme to effectively improve the robustness of multimodal emotion understanding.

Multimodal Emotion Recognition

LiteG2P: A fast, light and high accuracy model for grapheme-to-phoneme conversion

no code implementations 2 Mar 2023 Chunfeng Wang, Peisong Huang, Yuxiang Zou, Haoyu Zhang, Shichao Liu, Xiang Yin, Zejun Ma

As a key component of automated speech recognition (ASR) and the front-end in text-to-speech (TTS), grapheme-to-phoneme (G2P) plays the role of converting letters to their corresponding pronunciations.

Speech Recognition

A Survey on Computationally Efficient Neural Architecture Search

no code implementations 3 Jun 2022 Shiqing Liu, Haoyu Zhang, Yaochu Jin

Neural architecture search (NAS) has become increasingly popular in the deep learning community recently, mainly because it can provide an opportunity to allow interested users without rich expertise to benefit from the success of deep neural networks (DNNs).

Computational Efficiency Neural Architecture Search

Temporally and Spatially variant-resolution illumination patterns in computational ghost imaging

no code implementations 5 May 2022 Dong Zhou, Jie Cao, Huan Cui, Li-Xing Lin, Haoyu Zhang, Yingqiang Zhang, Qun Hao

For the same number of measurements, the method using temporally variable-resolution illumination patterns has better imaging quality than CGI, but it is less robust to noise.

End-to-End Modeling via Information Tree for One-Shot Natural Language Spatial Video Grounding

no code implementations ACL 2022 Mengze Li, Tianbao Wang, Haoyu Zhang, Shengyu Zhang, Zhou Zhao, Jiaxu Miao, Wenqiao Zhang, Wenming Tan, Jin Wang, Peng Wang, ShiLiang Pu, Fei Wu

To achieve effective grounding under a limited annotation budget, we investigate one-shot video grounding, and learn to ground natural language in all video frames with solely one frame labeled, in an end-to-end manner.

Descriptive Representation Learning +1

Generation of Non-Deterministic Synthetic Face Datasets Guided by Identity Priors

no code implementations 7 Dec 2021 Marcel Grimmer, Haoyu Zhang, Raghavendra Ramachandra, Kiran Raja, Christoph Busch

Mated samples are generated by manipulating latent vectors, and more precisely, we exploit Principal Component Analysis (PCA) to define semantically meaningful directions in the latent space and control the similarity between the original and the mated samples using a pre-trained face recognition system.

Face Image Quality Face Recognition

Learning Non-Stationary Time-Series with Dynamic Pattern Extractions

no code implementations 20 Nov 2021 Xipei Wang, Haoyu Zhang, Yuanbo Zhang, Meng Wang, Jiarui Song, Tin Lai, Matloob Khushi

Our results show that our model can predict 4-hour future trends with high accuracy in the Forex dataset, which is crucial in realistic scenarios to assist foreign exchange trading decision making.

Decision Making Dynamic Time Warping +2

Expression Snippet Transformer for Robust Video-based Facial Expression Recognition

no code implementations 17 Sep 2021 Yuanyuan Liu, Wenbin Wang, Chuanxu Feng, Haoyu Zhang, Zhe Chen, Yibing Zhan

To this end, we propose to decompose each video into a series of expression snippets, each of which contains a small number of facial movements, and attempt to augment the Transformer's ability for modeling intra-snippet and inter-snippet visual relations, respectively, obtaining the Expression snippet Transformer (EST).

Dynamic Facial Expression Recognition Facial Expression Recognition +1

On the Applicability of Synthetic Data for Face Recognition

no code implementations 6 Apr 2021 Haoyu Zhang, Marcel Grimmer, Raghavendra Ramachandra, Kiran Raja, Christoph Busch

Face verification has come into increasing focus in various applications including the European Entry/Exit System, which integrates face recognition mechanisms.

Face Image Quality Face Image Quality Assessment +2

Keyphrase Extraction with Dynamic Graph Convolutional Networks and Diversified Inference

no code implementations 24 Oct 2020 Haoyu Zhang, Dingkun Long, Guangwei Xu, Pengjun Xie, Fei Huang, Ji Wang

Keyphrase extraction (KE) aims to summarize a set of phrases that accurately express a concept or a topic covered in a given document.

Keyphrase Extraction Representation Learning

VirtualFlow: Decoupling Deep Learning Models from the Underlying Hardware

no code implementations 20 Sep 2020 Andrew Or, Haoyu Zhang, Michael J. Freedman

In our evaluation, our implementation of VirtualFlow for TensorFlow achieved strong convergence guarantees across different hardware with out-of-the-box hyperparameters, up to 48% lower job completion times with resource elasticity, and up to 42% higher throughput with heterogeneous training.

From Federated Learning to Federated Neural Architecture Search: A Survey

no code implementations 12 Sep 2020 Hangyu Zhu, Haoyu Zhang, Yaochu Jin

Federated learning is a recently proposed distributed machine learning paradigm for privacy preservation, which has found a wide range of applications where data privacy is of primary concern.

Distributed, Parallel, and Cluster Computing

MIPGAN -- Generating Strong and High Quality Morphing Attacks Using Identity Prior Driven GAN

no code implementations 3 Sep 2020 Haoyu Zhang, Sushma Venkatesh, Raghavendra Ramachandra, Kiran Raja, Naser Damer, Christoph Busch

Extensive experiments are carried out to assess the FRS's vulnerability to the proposed morphed face generation technique on three types of data from the newly generated MIPGAN Face Morph Dataset: digital images, re-digitized (printed and scanned) images, and compressed images after re-digitization.

Face Generation Face Recognition +2

Sampled Training and Node Inheritance for Fast Evolutionary Neural Architecture Search

no code implementations 7 Mar 2020 Haoyu Zhang, Yaochu Jin, Ran Cheng, Kuangrong Hao

Recently, evolutionary neural architecture search (ENAS) has received increasing attention due to the attractive global optimization capability of evolutionary algorithms.

Evolutionary Algorithms Neural Architecture Search

Recovering compressed images for automatic crack segmentation using generative models

no code implementations 6 Mar 2020 Yong Huang, Haoyu Zhang, Hui Li, Stephen Wu

We develop a recovery framework for automatic crack segmentation of compressed crack images based on this new CS method and demonstrate its remarkable performance, which takes advantage of the strong capability of generative models to capture the features required for the crack segmentation task even when the backgrounds of the generated images are not well reconstructed.

Compressive Sensing Crack Segmentation +1

Complex Question Decomposition for Semantic Parsing

1 code implementation ACL 2019 Haoyu Zhang, Jingjing Cai, Jianjun Xu, Ji Wang

We conduct experiments on COMPLEXWEBQUESTIONS, a large-scale complex question semantic parsing dataset; the results show that our model achieves significant improvement compared to state-of-the-art methods.

Semantic Parsing

Pretraining-Based Natural Language Generation for Text Summarization

4 code implementations CONLL 2019 Haoyu Zhang, Jianjun Xu, Ji Wang

The decoder in our model has two stages. In the first stage, we use a Transformer-based decoder to generate a draft output sequence.

Abstractive Text Summarization Text Generation

EmbedJoin: Efficient Edit Similarity Joins via Embeddings

1 code implementation 1 Feb 2017 Haoyu Zhang, Qin Zhang

Edit similarity join is a fundamental problem in data cleaning/integration, bioinformatics, collaborative filtering and natural language processing, and has been identified as a primitive operator for database systems.


