Search Results for author: Yi Yu

Found 90 papers, 36 papers with code

Anchor-aware Deep Metric Learning for Audio-visual Retrieval

no code implementations • 21 Apr 2024 • Donghuo Zeng, Yanan Wang, Kazushi Ikeda, Yi Yu

However, the model training fails to fully explore the space due to the scarcity of training data points, resulting in an incomplete representation of the overall positive and negative distributions.

Cross-Modal Retrieval Metric Learning +1

MoE-FFD: Mixture of Experts for Generalized and Parameter-Efficient Face Forgery Detection

no code implementations • 12 Apr 2024 • Chenqi Kong, Anwei Luo, Song Xia, Yi Yu, Haoliang Li, Alex C. Kot

Moreover, MoE-FFD leverages the expressivity of transformers and local priors of CNNs to simultaneously extract global and local forgery clues.

Empirical Upscaling of Point-scale Soil Moisture Measurements for Spatial Evaluation of Model Simulations and Satellite Retrievals

no code implementations • 8 Apr 2024 • Yi Yu, Brendan P. Malone, Luigi J. Renzullo

The cross-cluster validation underscored the capability of the upscaling approach to map the spatial variability of SM within areas that were not covered by in-situ sites, with correlation performance ranging between 0.6 and 0.8.

Safeguarding Medical Image Segmentation Datasets against Unauthorized Training via Contour- and Texture-Aware Perturbations

no code implementations • 21 Mar 2024 • Xun Lin, Yi Yu, Song Xia, Jue Jiang, Haoran Wang, Zitong Yu, Yizhong Liu, Ying Fu, Shuai Wang, Wenzhong Tang, Alex Kot

This is particularly true for medical image segmentation (MIS) datasets, where the processes of collection and fine-grained annotation are time-intensive and laborious.

Image Classification Image Generation +4

Federated Transfer Learning with Differential Privacy

no code implementations • 17 Mar 2024 • Mengchu Li, Ye Tian, Yang Feng, Yi Yu

By investigating the minimax rates and identifying the costs of privacy for these problems, we show that federated differential privacy is an intermediate privacy model between the well-established local and central models of differential privacy.

Federated Learning regression +1

Progressive Divide-and-Conquer via Subsampling Decomposition for Accelerated MRI

1 code implementation • 15 Mar 2024 • Chong Wang, Lanqing Guo, YuFei Wang, Hao Cheng, Yi Yu, Bihan Wen

Starting from decomposing the original maximum-a-posteriori problem of accelerated MRI, we present a rigorous derivation of the proposed PDAC framework, which could be further unfolded into an end-to-end trainable network.

MRI Reconstruction

LM2D: Lyrics- and Music-Driven Dance Synthesis

no code implementations • 14 Mar 2024 • Wenjie Yin, Xuejiao Zhao, Yi Yu, Hang Yin, Danica Kragic, Mårten Björkman

First, we propose LM2D, a novel probabilistic architecture that incorporates a multimodal diffusion model with consistency distillation, designed to create dance conditioned on both music and lyrics in one diffusion generation step.

Pose Estimation

PnPNet: Pull-and-Push Networks for Volumetric Segmentation with Boundary Confusion

1 code implementation • 13 Dec 2023 • Xin You, Ming Ding, Minghui Zhang, Hanxiao Zhang, Yi Yu, Jie Yang, Yun Gu

Precise boundary segmentation of volumetric images is a critical task for image-guided diagnosis and computer-assisted intervention, especially for boundary confusion in clinical practice.

Scalable Motion Style Transfer with Constrained Diffusion Generation

no code implementations • 12 Dec 2023 • Wenjie Yin, Yi Yu, Hang Yin, Danica Kragic, Mårten Björkman

Current training of motion style transfer systems relies on consistency losses across style domains to preserve contents, hindering its scalable application to a large number of domains and private data.

Motion Style Transfer Style Transfer

Syllable-level lyrics generation from melody exploiting character-level language model

no code implementations • 2 Oct 2023 • Zhe Zhang, Karol Lasocki, Yi Yu, Atsuhiro Takasu

The generation of lyrics tightly connected to accompanying melodies involves establishing a mapping between musical notes and syllables of lyrics.

Language Modelling Sentence

Semantic Difference Guidance for the Uncertain Boundary Segmentation of CT Left Atrial Appendage

1 code implementation • MICCAI 2023 • Xin You, Ming Ding, Minghui Zhang, Yangqian Wu, Yi Yu, Yun Gu, Jie Yang

In this paper, we have modeled relative relations between the LA and LAA via deep segmentation networks for the first time, and introduce a new LA & LAA CT dataset.

Segmentation

LiveChat: Video Comment Generation from Audio-Visual Multimodal Contexts

no code implementations • 1 Oct 2023 • Julien Lalanne, Raphael Bournet, Yi Yu

Live commenting on video, a popular feature of live streaming platforms, enables viewers to engage with the content and share their comments, reactions, opinions, or questions with the streamer or other viewers while watching the video or live stream.

Comment Generation multimodal generation

Music- and Lyrics-driven Dance Synthesis

1 code implementation • 30 Sep 2023 • Wenjie Yin, Qingyuan Yao, Yi Yu, Hang Yin, Danica Kragic, Mårten Björkman

To complement it, we introduce JustLMD, a new multimodal dataset of 3D dance motion with music and lyrics.

Modify Training Directions in Function Space to Reduce Generalization Error

no code implementations • 25 Jul 2023 • Yi Yu, Wenlian Lu, BoYu Chen

We propose theoretical analyses of a modified natural gradient descent method in the neural network function space based on the eigendecompositions of neural tangent kernel and Fisher information matrix.

Towards Integrated Traffic Control with Operating Decentralized Autonomous Organization

no code implementations • 25 Jul 2023 • Shengyue Yao, Jingru Yu, Yi Yu, Jia Xu, Xingyuan Dai, Honghai Li, Fei-Yue Wang, Yilun Lin

Furthermore, an operation algorithm is proposed regarding the issue of structural rigidity in DAO.

ExposureDiffusion: Learning to Expose for Low-light Image Enhancement

1 code implementation • ICCV 2023 • YuFei Wang, Yi Yu, Wenhan Yang, Lanqing Guo, Lap-Pui Chau, Alex C. Kot, Bihan Wen

Different from a vanilla diffusion model that has to perform Gaussian denoising, with the injected physics-based exposure model, our restoration process can directly start from a noisy image instead of pure noise.

Image Denoising Low-Light Image Enhancement

Beyond Learned Metadata-based Raw Image Reconstruction

1 code implementation • 21 Jun 2023 • YuFei Wang, Yi Yu, Wenhan Yang, Lanqing Guo, Lap-Pui Chau, Alex C. Kot, Bihan Wen

Besides, we propose a novel design of the context model, which can better predict the order masks of encoding/decoding based on both the sRGB image and the masks of already processed features.

Image Compression Image Reconstruction +1

Controllable Lyrics-to-Melody Generation

no code implementations • 5 Jun 2023 • Zhe Zhang, Yi Yu, Atsuhiro Takasu

Lyrics-to-melody generation is an interesting and challenging topic in the AI music research field.

Music Generation

Robust Andrew's sine estimate adaptive filtering

no code implementations • 29 Mar 2023 • Lu Lu, Yi Yu, Zongsheng Zheng, Guangya Zhu, Xiaomin Yang

Two Andrew's sine estimator (ASE)-based robust adaptive filtering algorithms are proposed in this brief.

Denoising

Emotionally Enhanced Talking Face Generation

1 code implementation • 21 Mar 2023 • Sahil Goyal, Shagun Uppal, Sarthak Bhagat, Yi Yu, Yifang Yin, Rajiv Ratn Shah

To mitigate this, we build a talking face generation framework conditioned on a categorical emotion to generate videos with appropriate expressions, making them more realistic and convincing.

Talking Face Generation Talking Head Generation

Augmented smartphone bilirubinometer enabled by a mobile app that turns smartphone into multispectral imager

no code implementations • 4 Mar 2023 • Qinghua He, Wanyu Li, Yaping Shi, Yi Yu, Yi Zhang, Wenqian Geng, Zhiyuan Sun, Ruikang K Wang

This study highlights the potential of SpeCamX to improve the prediction of bio-chromophores, and its ability to transform an ordinary smartphone into a powerful medical tool without the need for additional investments or expertise.

Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger

no code implementations • CVPR 2023 • Yi Yu, YuFei Wang, Wenhan Yang, Shijian Lu, Yap-Peng Tan, Alex C. Kot

Extensive experiments show that with our trained trigger injection models and simple modification of encoder parameters (of the compression model), the proposed attack can successfully inject several backdoors with corresponding triggers in a single image compression model.

Backdoor Attack Face Recognition +2

Raw Image Reconstruction with Learned Compact Metadata

1 code implementation • CVPR 2023 • YuFei Wang, Yi Yu, Wenhan Yang, Lanqing Guo, Lap-Pui Chau, Alex Kot, Bihan Wen

While raw images exhibit advantages over sRGB images (e.g., linearity and fine-grained quantization level), they are not widely used by common users due to the large storage requirements.

Image Compression Image Reconstruction +1

Deep Attention-Based Alignment Network for Melody Generation from Incomplete Lyrics

no code implementations • 23 Jan 2023 • Gurunath Reddy M, Zhe Zhang, Yi Yu, Florian Harscoet, Simon Canales, Suhua Tang

We propose a deep attention-based alignment network, which aims to automatically predict lyrics and melody with given incomplete lyrics as input in a way similar to the music creation of humans.

Deep Attention

Emotional Talking Faces: Making Videos More Expressive and Realistic

no code implementations • ACM Multimedia Asia 2022 • Sahil Goyal, Shagun Uppal, Sarthak Bhagat, Dhroov Goel, Sakshat Mali, Yi Yu, Yifang Yin, Rajiv Ratn Shah

Lip synchronization and talking face generation have gained a specific interest from the research community with the advent and need of digital communication in different fields.

Talking Face Generation

Phase-Shifting Coder: Predicting Accurate Orientation in Oriented Object Detection

1 code implementation • CVPR 2023 • Yi Yu, Feipeng Da

With the vigorous development of computer vision, oriented object detection has gradually been featured.

Object object-detection +2

SoccerNet 2022 Challenges Results

7 code implementations • 5 Oct 2022 • Silvio Giancola, Anthony Cioppa, Adrien Deliège, Floriane Magera, Vladimir Somers, Le Kang, Xin Zhou, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdulrahman Darwish, Adrien Maglo, Albert Clapés, Andreas Luyts, Andrei Boiarov, Artur Xarles, Astrid Orcesi, Avijit Shah, Baoyu Fan, Bharath Comandur, Chen Chen, Chen Zhang, Chen Zhao, Chengzhi Lin, Cheuk-Yiu Chan, Chun Chuen Hui, Dengjie Li, Fan Yang, Fan Liang, Fang Da, Feng Yan, Fufu Yu, Guanshuo Wang, H. Anthony Chan, He Zhu, Hongwei Kan, Jiaming Chu, Jianming Hu, Jianyang Gu, Jin Chen, João V. B. Soares, Jonas Theiner, Jorge De Corte, José Henrique Brito, Jun Zhang, Junjie Li, Junwei Liang, Leqi Shen, Lin Ma, Lingchi Chen, Miguel Santos Marques, Mike Azatov, Nikita Kasatkin, Ning Wang, Qiong Jia, Quoc Cuong Pham, Ralph Ewerth, Ran Song, RenGang Li, Rikke Gade, Ruben Debien, Runze Zhang, Sangrok Lee, Sergio Escalera, Shan Jiang, Shigeyuki Odashima, Shimin Chen, Shoichi Masui, Shouhong Ding, Sin-wai Chan, Siyu Chen, Tallal El-Shabrawy, Tao He, Thomas B. Moeslund, Wan-Chi Siu, Wei Zhang, Wei Li, Xiangwei Wang, Xiao Tan, Xiaochuan Li, Xiaolin Wei, Xiaoqing Ye, Xing Liu, Xinying Wang, Yandong Guo, YaQian Zhao, Yi Yu, YingYing Li, Yue He, Yujie Zhong, Zhenhua Guo, Zhiheng Li

The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team.

Action Spotting Camera Calibration +3

Playing Technique Detection by Fusing Note Onset Information in Guzheng Performance

no code implementations • 19 Sep 2022 • Dichucheng Li, Yulun Wu, Qinyu Li, Jiahao Zhao, Yi Yu, Fan Xia, Wei Li

Because each Guzheng playing technique is applied to a note, a dedicated onset detector is trained to divide an audio into several notes and its predictions are fused with frame-wise IPT predictions.

Widely Used and Fast De Novo Drug Design by a Protein Sequence-Based Reinforcement Learning Model

no code implementations • 14 Aug 2022 • YaQin Li, Lingli Li, Yongjin Xu, Yi Yu

In the generative model, one of the reward components, a binding affinity predictor, is based on 1D protein sequence and molecular SMILES.

Drug Discovery Molecular Docking +1

Study of General Robust Subband Adaptive Filtering

no code implementations • 4 Aug 2022 • Yi Yu, Hongsen He, Rodrigo C. de Lamare, Badong Chen

In this paper, we propose a general robust subband adaptive filtering (GR-SAF) scheme against impulsive noise by minimizing the mean square deviation under the random-walk model with individual weight uncertainty.

Interpretable Melody Generation from Lyrics with Discrete-Valued Adversarial Training

no code implementations • 30 Jun 2022 • Wei Duan, Zhe Zhang, Yi Yu, Keizo Oyama

Generating melody from lyrics is an interesting yet challenging task in the area of artificial intelligence and music.

Design and Analysis of Robust Resilient Diffusion over Multi-Task Networks Against Byzantine Attacks

no code implementations • 25 Jun 2022 • Tao Yu, Rodrigo C. de Lamare, Yi Yu

This paper studies distributed diffusion adaptation over clustered multi-task networks in the presence of impulsive interferences and Byzantine attacks.

Chemical transformer compression for accelerating both training and inference of molecular modeling

1 code implementation • 16 May 2022 • Yi Yu, Karl Borjesson

Transformer models have been developed in molecular science with excellent performance in applications including quantitative structure-activity relationship (QSAR) and virtual screening (VS).

Knowledge Distillation Model Compression

Sparsity-Aware Robust Normalized Subband Adaptive Filtering algorithms based on Alternating Optimization

no code implementations • 15 May 2022 • Yi Yu, Zongxin Huang, Hongsen He, Yuriy Zakharov, Rodrigo C. de Lamare

This paper proposes a unified sparsity-aware robust normalized subband adaptive filtering (SA-RNSAF) algorithm for identification of sparse systems under impulsive noise.

HarmoF0: Logarithmic Scale Dilated Convolution For Pitch Estimation

1 code implementation • 2 May 2022 • Weixing Wei, Peilin Li, Yi Yu, Wei Li

Sounds, especially music, contain various harmonic components scattered in the frequency dimension.

Towards Robust Rain Removal Against Adversarial Attacks: A Comprehensive Benchmark Analysis and Beyond

1 code implementation • CVPR 2022 • Yi Yu, Wenhan Yang, Yap-Peng Tan, Alex C. Kot

Finally, we examine various types of adversarial attacks that are specific to deraining problems and their effects on both human and machine vision tasks, including 1) rain region attacks, adding perturbations only in the rain regions to make the perturbations in the attacked rain images less visible; 2) object-sensitive attacks, adding perturbations only in regions near the given objects.

Rain Removal

Conjugate Gradient Adaptive Learning with Tukey's Biweight M-Estimate

no code implementations • 19 Mar 2022 • Lu Lu, Yi Yu, Rodrigo C. de Lamare, Xiaomin Yang

We propose a novel M-estimate conjugate gradient (CG) algorithm, termed Tukey's biweight M-estimate CG (TbMCG), for system identification in impulsive noise environments.

Recent Advances and Challenges in Deep Audio-Visual Correlation Learning

no code implementations • 28 Feb 2022 • Luís Vilaça, Yi Yu, Paula Viana

Audio-visual correlation learning aims to capture essential correspondences and understand natural phenomena between audio and video.

DEEPCHORUS: A Hybrid Model of Multi-scale Convolution and Self-attention for Chorus Detection

1 code implementation • 13 Feb 2022 • Qiqi He, Xiaoheng Sun, Yi Yu, Wei Li

Chorus detection is a challenging problem in musical signal processing as the chorus often repeats more than once in popular songs, usually with rich instruments and complex rhythm forms.

Feature Distillation Interaction Weighting Network for Lightweight Image Super-Resolution

1 code implementation • 16 Dec 2021 • Guangwei Gao, Wenjie Li, Juncheng Li, Fei Wu, Huimin Lu, Yi Yu

Convolutional neural networks based single-image super-resolution (SISR) has made great progress in recent years.

Image Super-Resolution

Variational Autoencoder with CCA for Audio-Visual Cross-Modal Retrieval

no code implementations • 5 Dec 2021 • Jiwei Zhang, Yi Yu, Suhua Tang, Jianming Wu, Wei Li

On the one hand, audio encoder and visual encoder separately encode audio data and visual data into two different latent spaces.

Cross-Modal Retrieval Information Retrieval +1

Active noise control techniques for nonlinear systems

no code implementations • 19 Oct 2021 • Lu Lu, Kai-Li Yin, Rodrigo C. de Lamare, Zongsheng Zheng, Yi Yu, Xiaomin Yang, Badong Chen

Most of the literature focuses on the development of the linear active noise control (ANC) techniques.

A survey on active noise control techniques -- Part I: Linear systems

no code implementations • 1 Oct 2021 • Lu Lu, Kai-Li Yin, Rodrigo C. de Lamare, Zongsheng Zheng, Yi Yu, Xiaomin Yang, Badong Chen

Active noise control (ANC) is an effective way for reducing the noise level in electroacoustic or electromechanical systems.

CRNNTL: convolutional recurrent neural network and transfer learning for QSAR modelling

no code implementations • 7 Sep 2021 • YaQin Li, Yongjin Xu, Yi Yu

Our strategy takes advantages of both convolutional and recurrent neural networks for feature extraction, as well as the data augmentation method.

Data Augmentation Transfer Learning

FBSNet: A Fast Bilateral Symmetrical Network for Real-Time Semantic Segmentation

1 code implementation • 2 Sep 2021 • Guangwei Gao, Guoan Xu, Juncheng Li, Yi Yu, Huimin Lu, Jian Yang

Specifically, FBSNet employs a symmetrical encoder-decoder structure with two branches, semantic information branch and spatial detail branch.

Autonomous Driving Drone navigation +1

Study of Proximal Normalized Subband Adaptive Algorithm for Acoustic Echo Cancellation

no code implementations • 14 Aug 2021 • Gang Guo, Yi Yu, Rodrigo C. de Lamare, Zongsheng Zheng, Lu Lu, Qiangming Cai

In addition, an adaptive approach for the choice of the thresholding parameter in the proximal step is also proposed based on the minimization of the mean square deviation.

Acoustic echo cancellation

Interpretable Visual Understanding with Cognitive Attention Network

1 code implementation • 6 Aug 2021 • Xuejiao Tang, Wenbin Zhang, Yi Yu, Kea Turner, Tyler Derr, Mengyu Wang, Eirini Ntoutsi

While image understanding at the recognition level has achieved remarkable advancements, reliable visual scene understanding requires comprehensive image understanding not only at the recognition level but also at the cognition level, which calls for exploiting multi-source information as well as learning different levels of understanding and extensive commonsense knowledge.

Scene Understanding Visual Commonsense Reasoning

Multi-TimeLine Summarization (MTLS): Improving Timeline Summarization by Generating Multiple Summaries

no code implementations • ACL 2021 • Yi Yu, Adam Jatowt, Antoine Doucet, Kazunari Sugiyama, Masatoshi Yoshikawa

In this paper, we address a novel task, Multiple TimeLine Summarization (MTLS), which extends the flexibility and versatility of Time-Line Summarization (TLS).

Timeline Summarization

Adversarial Learning with Mask Reconstruction for Text-Guided Image Inpainting

1 code implementation • Conference 2021 • Xingcai Wu, Yucheng Xie, Jiaqi Zeng, Zhenguo Yang, Yi Yu, Qing Li, Wenyin Liu

In this paper, we propose an adversarial learning framework with mask reconstruction (ALMR) for image inpainting with textual guidance, which consists of a two-stage generator and dual discriminators.

Image Inpainting Sentence

Lattice partition recovery with dyadic CART

1 code implementation • NeurIPS 2021 • Oscar Hernan Madrid Padilla, Yi Yu, Alessandro Rinaldo

We study piece-wise constant signals corrupted by additive Gaussian noise over a $d$-dimensional lattice.

regression

Few-Data Guided Learning Upon End-to-End Point Cloud Network for 3D Face Recognition

no code implementations • 31 Mar 2021 • Yi Yu, Feipeng Da, Ziyu Zhang

Without fine-tuning on the test set, the Rank-1 Recognition Rate (RR1) is achieved as follows: 98.85% on FRGC v2.0 dataset and 99.33% on Bosphorus dataset, which proves the effectiveness and the potentiality of our method.

Face Recognition Point Cloud Classification

Leaning Compact and Representative Features for Cross-Modality Person Re-Identification

1 code implementation • 26 Mar 2021 • Guangwei Gao, Hao Shao, Fei Wu, Meng Yang, Yi Yu

This paper pays close attention to the cross-modality visible-infrared person re-identification (VI Re-ID) task, which aims to match pedestrian samples between visible and infrared modes.

Cross-Modality Person Re-identification Knowledge Distillation +1

Hierarchical Deep CNN Feature Set-Based Representation Learning for Robust Cross-Resolution Face Recognition

no code implementations • 25 Mar 2021 • Guangwei Gao, Yi Yu, Jian Yang, Guo-Jun Qi, Meng Yang

(i) To learn more robust and discriminative features, we desire to adaptively fuse the contextual features from different layers.

Face Recognition Representation Learning

MSCFNet: A Lightweight Network With Multi-Scale Context Fusion for Real-Time Semantic Segmentation

no code implementations • 24 Mar 2021 • Guangwei Gao, Guoan Xu, Yi Yu, Jin Xie, Jian Yang, Dong Yue

In recent years, how to strike a good trade-off between accuracy and inference speed has become the core issue for real-time semantic segmentation applications, which plays a vital role in real-world scenarios such as autonomous driving systems and drones.

Autonomous Driving Real-Time Semantic Segmentation +1

Lightweight Image Super-Resolution with Multi-scale Feature Interaction Network

no code implementations • 24 Mar 2021 • Zhengxue Wang, Guangwei Gao, Juncheng Li, Yi Yu, Huimin Lu

Recently, the single image super-resolution (SISR) approaches with deep and complex convolutional neural network structures have achieved promising performance.

Image Super-Resolution

Generalized non-stationary bandits

no code implementations • 1 Feb 2021 • Anne Gael Manegueu, Alexandra Carpentier, Yi Yu

On top of the switching bandit problem (Case a), we are interested in three concrete examples: (b) the means of the arms are local polynomials, (c) the means of the arms are locally smooth, and (d) the gaps of the arms have a bounded number of inflexion points and where the highest arm mean cannot vary too much in a short range.

Optimal network online change point localisation

no code implementations • 14 Jan 2021 • Yi Yu, Oscar Hernan Madrid Padilla, Daren Wang, Alessandro Rinaldo

The goal is to detect the change point as quickly as possible, if it exists, subject to a constraint on the number or probability of false alarms.

Change Point Detection

MusicTM-Dataset for Joint Representation Learning among Sheet Music, Lyrics, and Musical Audio

no code implementations • 1 Dec 2020 • Donghuo Zeng, Yi Yu, Keizo Oyama

This work presents a music dataset named MusicTM-Dataset, which is used to improve the representation learning ability of different types of cross-modal retrieval (CMR).

Cross-Modal Retrieval Information Retrieval +3

Functional Linear Regression with Mixed Predictors

1 code implementation • 1 Dec 2020 • Daren Wang, Zifeng Zhao, Yi Yu, Rebecca Willett

We derive finite sample theoretical guarantees and show that the excess prediction risk of our estimator is minimax optimal.

Statistics Theory Methodology Statistics Theory

Automatic Neural Lyrics and Melody Composition

no code implementations • 12 Nov 2020 • Gurunath Reddy Madhumani, Yi Yu, Florian Harscoët, Simon Canales, Suhua Tang

In this paper, we propose a technique to address the most challenging aspect of algorithmic songwriting process, which enables the human community to discover original lyrics, and melodies suitable for the generated lyrics.

Sentence

Conditional Hybrid GAN for Sequence Generation

no code implementations • 18 Sep 2020 • Yi Yu, Abhishek Srivastava, Rajiv Ratn Shah

Conditional sequence generation aims to instruct the generation procedure by conditioning the model with additional context information, which is a self-supervised learning issue (a form of unsupervised learning with supervision information from data itself).

Attribute Relational Reasoning +1

Sparsity-Aware SSAF Algorithm with Individual Weighting Factors for Acoustic Echo Cancellation

no code implementations • 18 Sep 2020 • Yi Yu, Tao Yang, Hongyang Chen, Rodrigo C. de Lamare, Yingsong Li

In this paper, we propose and analyze the sparsity-aware sign subband adaptive filtering with individual weighting factors (S-IWF-SSAF) algorithm, and consider its application in acoustic echo cancellation (AEC).

Acoustic echo cancellation

Unsupervised Generative Adversarial Alignment Representation for Sheet music, Audio and Lyrics

no code implementations • 29 Jul 2020 • Donghuo Zeng, Yi Yu, Keizo Oyama

In this paper, we propose an unsupervised generative adversarial alignment representation (UGAAR) model to learn deep discriminative representations shared across three major musical modalities: sheet music, lyrics, and audio, where a deep neural network based architecture on three branches is jointly trained.

Representation Learning

End-to-end Named Entity Recognition from English Speech

1 code implementation • 22 May 2020 • Hemant Yadav, Sreyan Ghosh, Yi Yu, Rajiv Ratn Shah

Named entity recognition (NER) from text has been a widely studied problem and usually extracts semantic information from text.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

C3VQG: Category Consistent Cyclic Visual Question Generation

1 code implementation • 15 May 2020 • Shagun Uppal, Anish Madan, Sarthak Bhagat, Yi Yu, Rajiv Ratn Shah

In this paper, we try to exploit the different visual cues and concepts in an image to generate questions using a variational autoencoder (VAE) without ground-truth answers.

Natural Questions Question Generation +1

Investigation of Singing Voice Separation for Singing Voice Detection in Polyphonic Music

no code implementations • 8 Apr 2020 • Yifu Sun, Xulong Zhang, Yi Yu, Xi Chen, Wei Li

Singing voice detection (SVD), to recognize vocal parts in the song, is an essential task in music information retrieval (MIR).

Information Retrieval Melody Extraction +2

Text2FaceGAN: Face Generation from Fine Grained Textual Descriptions

1 code implementation • 26 Nov 2019 • Osaid Rehman Nasir, Shailesh Kumar Jha, Manraj Singh Grover, Yi Yu, Ajit Kumar, Rajiv Ratn Shah

We then model the highly multi-modal problem of text to face generation as learning the conditional distribution of faces (conditioned on text) in the same latent space.

Face Generation Face Reconstruction +1

Conditional LSTM-GAN for Melody Generation from Lyrics

2 code implementations • 15 Aug 2019 • Yi Yu, Abhishek Srivastava, Simon Canales

Melody generation from lyrics has been a challenging research issue in the field of artificial intelligence and music, one that makes it possible to learn and discover the latent relationship between interesting lyrics and the accompanying melody.

Generative Adversarial Network

Audio-Visual Embedding for Cross-Modal MusicVideo Retrieval through Supervised Deep CCA

no code implementations • 10 Aug 2019 • Donghuo Zeng, Yi Yu, Keizo Oyama

ii) We propose an end-to-end deep model for cross-modal audio-visual learning where S-DCCA is trained to learn the semantic correlation between audio and visual modalities.

audio-visual learning Retrieval +1

Deep Triplet Neural Networks with Cluster-CCA for Audio-Visual Cross-modal Retrieval

2 code implementations • 10 Aug 2019 • Donghuo Zeng, Yi Yu, Keizo Oyama

In particular, two significant contributions are made: i) a better representation by constructing deep triplet neural network with triplet loss for optimal projections can be generated to maximize correlation in the shared subspace.

Cross-Modal Retrieval Information Retrieval +1

Personalized Music Recommendation with Triplet Network

no code implementations • 10 Aug 2019 • Haoting Liang, Donghuo Zeng, Yi Yu, Keizo Oyama

Many online music services have emerged in recent years, making effective music recommendation systems desirable.

Music Recommendation Recommendation Systems

Social Influence-based Attentive Mavens Mining and Aggregative Representation Learning for Group Recommendation

no code implementations • 10 Aug 2019 • Peipei Wang, Lin Li, Yi Yu, Guandong Xu

To tackle the issue of preference aggregation for group recommendation, we propose a novel attentive aggregation representation learning method based on sociological theory for group recommendation, namely SIAGR (short for "Social Influence-based Attentive Group Recommendation"), which takes attention mechanisms and the popular method (BERT) as the aggregation representation for group profile modeling.

Collaborative Filtering Decision Making +2

Ensemble Super-Resolution with A Reference Dataset

1 code implementation • 12 May 2019 • Junjun Jiang, Yi Yu, Zheng Wang, Suhua Tang, Ruimin Hu, Jiayi Ma

In this paper, we present a simple but effective single image SR method based on ensemble learning, which can produce better performance than could be obtained from any of the SR methods being ensembled (also called component super-resolvers).

Ensemble Learning Image Super-Resolution

Deep Knowledge Tracing and Dynamic Student Classification for Knowledge Tracing

1 code implementation • 24 Sep 2018 • Sein Minn, Yi Yu, Michel C. Desmarais, Feida Zhu, Jill Jenn Vie

In Intelligent Tutoring System (ITS), tracing the student's knowledge state during learning has been studied for several decades in order to provide more supportive learning instructions.

Classification General Classification +1

Context-Patch Face Hallucination Based on Thresholding Locality-constrained Representation and Reproducing Learning

2 code implementations • 3 Sep 2018 • Junjun Jiang, Yi Yu, Suhua Tang, Jiayi Ma, Akiko Aizawa, Kiyoharu Aizawa

To this end, this study incorporates the contextual information of image patch and proposes a powerful and efficient context-patch based face hallucination approach, namely Thresholding Locality-constrained Representation and Reproducing learning (TLcR-RL).

Face Hallucination Hallucination +1

Deep CNN Denoiser and Multi-layer Neighbor Component Embedding for Face Hallucination

1 code implementation • 28 Jun 2018 • Junjun Jiang, Yi Yu, Jinhui Hu, Suhua Tang, Jiayi Ma

Most of the current face hallucination methods, whether they are shallow learning-based or deep learning-based, all try to learn a relationship model between Low-Resolution (LR) and High-Resolution (HR) spaces with the help of a training set.

Face Hallucination Hallucination +1

Category-Based Deep CCA for Fine-Grained Venue Discovery from Multimodal Data

no code implementations • 8 May 2018 • Yi Yu, Suhua Tang, Kiyoharu Aizawa, Akiko Aizawa

Given a photo as input, this model performs (i) exact venue search (find the venue where the photo was taken), and (ii) group venue search (find relevant venues with the same category as that of the photo), by the cross-modal correlation between the input photo and textual description of venues.

Cross-Modal Retrieval Retrieval

Towards Deep Modeling of Music Semantics using EEG Regularizers

no code implementations • 14 Dec 2017 • Francisco Raposo, David Martins de Matos, Ricardo Ribeiro, Suhua Tang, Yi Yu

Modeling of music audio semantics has been previously tackled through learning of mappings from audio data to high-level tags or latent unsupervised spaces.

Cross-Modal Retrieval EEG +2

How Many Communities Are There?

no code implementations • 4 Dec 2014 • Diego Franco Saldana, Yi Yu, Yang Feng

Stochastic blockmodels and variants thereof are among the most widely used approaches to community detection for social networks and relational data.

Clustering Community Detection +1

APPLE: Approximate Path for Penalized Likelihood Estimators

no code implementations • 2 Nov 2012 • Yi Yu, Yang Feng

In high-dimensional data analysis, penalized likelihood estimators are shown to provide superior results in both variable selection and parameter estimation.

Variable Selection
