Search Results for author: Yi He

Found 32 papers, 7 papers with code

LAECIPS: Large Vision Model Assisted Adaptive Edge-Cloud Collaboration for IoT-based Perception System

no code implementations 16 Apr 2024 Shijing Hu, Ruijun Deng, Xin Du, Zhihui Lu, Qiang Duan, Yi He, Shih-Chia Huang, Jie Wu

We propose to update the edge model and its collaboration strategy with the cloud under the supervision of the large vision model, so as to adapt to the dynamic IoT data streams.

Autonomous Driving Semantic Segmentation

Document AI: A Comparative Study of Transformer-Based, Graph-Based Models, and Convolutional Neural Networks For Document Layout Analysis

1 code implementation 29 Aug 2023 Sotirios Kastanas, Shaomu Tan, Yi He

In this study, we aim to fill these gaps by conducting a comparative evaluation of state-of-the-art models in document layout analysis and investigating the potential of cross-lingual layout analysis by utilizing machine translation techniques.

Document AI Document Layout Analysis +2

Improving Frame-level Classifier for Word Timings with Non-peaky CTC in End-to-End Automatic Speech Recognition

no code implementations 9 Jun 2023 Xianzhao Chen, Yist Y. Lin, Kang Wang, Yi He, Zejun Ma

In this paper, we improve the frame-level classifier for word timings in an E2E system by introducing label priors in the connectionist temporal classification (CTC) loss, which is adopted from prior works, and combining low-level Mel-scale filter banks with high-level ASR encoder output as the input feature.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Sketch2Cloth: Sketch-based 3D Garment Generation with Unsigned Distance Fields

no code implementations 1 Mar 2023 Yi He, Haoran Xie, Kazunori Miyata

In this study, we propose Sketch2Cloth, a sketch-based 3D garment generation system using the unsigned distance fields from the user's sketch input.

Model Editing

Towards Fair Machine Learning Software: Understanding and Addressing Model Bias Through Counterfactual Thinking

no code implementations 16 Feb 2023 Zichong Wang, Yang Zhou, Meikang Qiu, Israat Haque, Laura Brown, Yi He, Jianwu Wang, David Lo, Wenbin Zhang

The increasing use of Machine Learning (ML) software can lead to unfair and unethical decisions, thus fairness bugs in software are becoming a growing concern.

Benchmarking counterfactual +1

Multi-Metric AutoRec for High Dimensional and Sparse User Behavior Data Prediction

no code implementations 20 Dec 2022 Cheng Liang, Teng Huang, Yi He, Song Deng, Di Wu, Xin Luo

The idea of the proposed MMA is mainly two-fold: 1) apply different $L_p$-norms to the loss function and regularization to form different variant models in different metric spaces, and 2) aggregate these variant models.

Recommendation Systems
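
The two-fold idea quoted in the entry above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`lp_loss`, `lp_penalty`, `aggregate`), the use of the p-th power of the norm, and the weighted-average aggregation are all assumptions made for illustration.

```python
def lp_loss(predictions, targets, p):
    """Sum of |error|^p, i.e. the p-th power of the L_p norm of the error."""
    return sum(abs(a - b) ** p for a, b in zip(predictions, targets))


def lp_penalty(params, p):
    """Regularization term: the p-th power of the L_p norm of the parameters."""
    return sum(abs(x) ** p for x in params)


def aggregate(variant_predictions, weights=None):
    """Combine per-variant predictions with a weighted average.

    Equal weights are used by default (an assumption; the paper may
    aggregate its variant models differently).
    """
    n = len(variant_predictions)
    if weights is None:
        weights = [1.0 / n] * n
    return [
        sum(w * preds[i] for w, preds in zip(weights, variant_predictions))
        for i in range(len(variant_predictions[0]))
    ]


# Two hypothetical variants (say, one trained with p=1 and one with p=2)
# predicting the same two matrix entries.
combined = aggregate([[1.0, 3.0], [3.0, 5.0]])
```

Each variant would be trained with its own choice of `p` for the loss and for the penalty; only the final combination step is shown end to end here.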

Random Utterance Concatenation Based Data Augmentation for Improving Short-video Speech Recognition

no code implementations 28 Oct 2022 Yist Y. Lin, Tao Han, HaiHua Xu, Van Tung Pham, Yerbolat Khassanov, Tze Yuang Chong, Yi He, Lu Lu, Zejun Ma

One limitation of the end-to-end automatic speech recognition (ASR) framework is that its performance is compromised when training and test utterance lengths are mismatched.

Action Detection Activity Detection +4

Reducing Language confusion for Code-switching Speech Recognition with Token-level Language Diarization

1 code implementation 26 Oct 2022 Hexin Liu, HaiHua Xu, Leibny Paola Garcia, Andy W. H. Khong, Yi He, Sanjeev Khudanpur

The comparison of the proposed methods indicates that incorporating language information is more effective than disentangling for reducing language confusion in CS speech.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

An Online Sparse Streaming Feature Selection Algorithm

no code implementations 2 Aug 2022 Feilong Chen, Di Wu, Jie Yang, Yi He

In many real applications, such as intelligent healthcare platforms, streaming features always have some missing data, which raises a crucial challenge in conducting OSFS, i.e., how to establish the uncertain relationship between sparse streaming features and labels.

feature selection

Internal Language Model Estimation based Language Model Fusion for Cross-Domain Code-Switching Speech Recognition

no code implementations 9 Jul 2022 Yizhou Peng, Yufei Liu, Jicheng Zhang, HaiHua Xu, Yi He, Hao Huang, Eng Siong Chng

More importantly, we train an end-to-end (E2E) speech recognition model by means of merging two monolingual data sets and observe the efficacy of the proposed ILME-based LM fusion for CSSR.

Language Modelling speech-recognition +1

Intermediate-layer output Regularization for Attention-based Speech Recognition with Shared Decoder

no code implementations 9 Jul 2022 Jicheng Zhang, Yizhou Peng, HaiHua Xu, Yi He, Eng Siong Chng, Hao Huang

Intermediate layer output (ILO) regularization by means of multitask training on encoder side has been shown to be an effective approach to yielding improved results on a wide range of end-to-end ASR frameworks.

speech-recognition Speech Recognition

Efficient Human-in-the-loop System for Guiding DNNs Attention

1 code implementation 13 Jun 2022 Yi He, Xi Yang, Chia-Ming Chang, Haoran Xie, Takeo Igarashi

Attention guidance is an approach to addressing dataset bias in deep learning, where the model relies on incorrect features to make decisions.

Active Learning Image Classification

Online Deep Learning from Doubly-Streaming Data

1 code implementation 25 Apr 2022 Heng Lian, John Scovil Atwood, BoJian Hou, Jian Wu, Yi He

This paper investigates a new online learning problem with doubly-streaming data, where the data streams are described by feature spaces that constantly evolve, with new features emerging and old features fading away.

A Multi-Metric Latent Factor Model for Analyzing High-Dimensional and Sparse data

no code implementations 16 Apr 2022 Di Wu, Peng Zhang, Yi He, Xin Luo

High-dimensional and sparse (HiDS) matrices are omnipresent in a variety of big data-related applications.

Representation Learning

Graph-incorporated Latent Factor Analysis for High-dimensional and Sparse Matrices

no code implementations 16 Apr 2022 Di Wu, Yi He, Xin Luo

A high-dimensional and sparse (HiDS) matrix is frequently encountered in big data-related applications such as e-commerce systems or social network service systems.

Representation Learning Vocal Bursts Intensity Prediction

Asymmetric 3D Context Fusion for Universal Lesion Detection

1 code implementation 17 Sep 2021 Jiancheng Yang, Yi He, Kaiming Kuang, Zudi Lin, Hanspeter Pfister, Bingbing Ni

The proposed A3D consistently outperforms symmetric context fusion operators by considerable margins, and establishes a new state of the art on DeepLesion.

Computed Tomography (CT) Lesion Detection +1

Sketch-based Normal Map Generation with Geometric Sampling

no code implementations 23 Apr 2021 Yi He, Haoran Xie, Chao Zhang, Xi Yang, Kazunori Miyata

This paper proposes a deep generative model for generating normal maps from a user's sketch with geometric sampling.

Generative Adversarial Network

Improving RNN transducer with normalized jointer network

no code implementations 3 Nov 2020 Mingkun Huang, Jun Zhang, Meng Cai, Yang Zhang, Jiali Yao, Yongbin You, Yi He, Zejun Ma

In this work, we analyze the cause of the huge gradient variance in RNN-T training and propose a new normalized jointer network to overcome it.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Dynamic latency speech recognition with asynchronous revision

no code implementations 3 Nov 2020 Mingkun Huang, Meng Cai, Jun Zhang, Yang Zhang, Yongbin You, Yi He, Zejun Ma

In this work we propose an inference technique, asynchronous revision, to unify streaming and non-streaming speech recognition models.

speech-recognition Speech Recognition

Generating Fundus Fluorescence Angiography Images from Structure Fundus Images Using Generative Adversarial Networks

no code implementations MIDL 2019 Wanyue Li, Wen Kong, YiWei Chen, Jing Wang, Yi He, Guohua Shi, Guohua Deng

Fluorescein angiography can provide a map of retinal vascular structure and function and is commonly used in ophthalmology diagnosis; however, this imaging modality may pose risks of harm to patients.

Generative Adversarial Network Translation

Domain adaptation model for retinopathy detection from cross-domain OCT images

no code implementations MIDL 2019 Jing Wang, YiWei Chen, Wanyue Li, Wen Kong, Yi He, Chunhui Jiang, Guohua Shi

A deep neural network (DNN) can assist in retinopathy screening by automatically classifying patients into normal and abnormal categories according to optical coherence tomography (OCT) images.

Domain Adaptation Generative Adversarial Network

Reinventing 2D Convolutions for 3D Images

2 code implementations 24 Nov 2019 Jiancheng Yang, Xiaoyang Huang, Yi He, Jingwei Xu, Canqian Yang, Guozheng Xu, Bingbing Ni

Theoretically, ANY 2D CNN (ResNet, DenseNet, or DeepLab) can be converted into a 3D ACS CNN, with pretrained weights of the same parameter size.

Representation Learning

Semi-supervised Skin Detection by Network with Mutual Guidance

no code implementations ICCV 2019 Yi He, Jiayuan Shi, Chuan Wang, Haibin Huang, Jiaming Liu, Guanbin Li, Risheng Liu, Jue Wang

In this paper we present a new data-driven method for robust skin detection from a single human portrait image.

On the Convergence of Learning-based Iterative Methods for Nonconvex Inverse Problems

no code implementations 16 Aug 2018 Risheng Liu, Shichao Cheng, Yi He, Xin Fan, Zhouchen Lin, Zhongxuan Luo

Moreover, there is a lack of rigorous analysis about the convergence behaviors of these reimplemented iterations, and thus the significance of such methods is a little bit vague.

Scheduling

Toward Designing Convergent Deep Operator Splitting Methods for Task-specific Nonconvex Optimization

no code implementations 28 Apr 2018 Risheng Liu, Shichao Cheng, Yi He, Xin Fan, Zhongxuan Luo

Operator splitting methods have been successfully used in computational sciences, statistics, learning and vision areas to reduce complex problems into a series of simpler subproblems.

Deblurring

Frame Stacking and Retaining for Recurrent Neural Network Acoustic Model

no code implementations 17 May 2017 Xu Tian, Jun Zhang, Zejun Ma, Yi He, Juan Wei

The system that combines frame retaining with frame stacking can reduce the time consumption of both training and decoding.

General Classification

Deep LSTM for Large Vocabulary Continuous Speech Recognition

no code implementations 21 Mar 2017 Xu Tian, Jun Zhang, Zejun Ma, Yi He, Juan Wei, Peihao Wu, Wenchang Situ, Shuai Li, Yang Zhang

It is a competitive framework: LSTM models of more than 7 layers are successfully trained on Shenma voice search data in Mandarin, and they outperform the deep LSTM models trained by the conventional approach.

speech-recognition Speech Recognition +1

Exponential Moving Average Model in Parallel Speech Recognition Training

no code implementations 3 Mar 2017 Xu Tian, Jun Zhang, Zejun Ma, Yi He, Juan Wei

As training data grows rapidly, large-scale parallel training on multi-GPU clusters is now widely applied to neural network model learning. We present a new approach that applies the exponential moving average method to large-scale parallel training of neural network models.

speech-recognition Speech Recognition
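
As a rough illustration of the exponential moving average idea named in the entry above, the sketch below keeps a smoothed copy of the model weights that is updated after each synchronization of the parallel workers. It is a generic EMA under stated assumptions, not the authors' algorithm: the helper names (`average_workers`, `ema_update`), the plain element-wise averaging of worker weights, and the `decay` value are all hypothetical.

```python
def average_workers(worker_weights):
    """Element-wise mean of the weight vectors held by each worker."""
    n = len(worker_weights)
    return [sum(ws) / n for ws in zip(*worker_weights)]


def ema_update(shadow, weights, decay=0.99):
    """Blend the current weights into the running (shadow) average."""
    return [decay * s + (1.0 - decay) * w for s, w in zip(shadow, weights)]


# Toy run: two workers, two synchronization steps, two parameters.
shadow = [0.0, 0.0]
sync_steps = (
    [[1.0, 2.0], [3.0, 4.0]],  # weights on worker 0 and worker 1 at step 1
    [[2.0, 2.0], [2.0, 2.0]],  # weights on worker 0 and worker 1 at step 2
)
for worker_weights in sync_steps:
    merged = average_workers(worker_weights)
    shadow = ema_update(shadow, merged, decay=0.5)
```

A `decay` closer to 1 makes the smoothed weights change more slowly; the value 0.5 is used here only to keep the toy arithmetic easy to follow.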
