Search Results for author: Rui Liu

Found 144 papers, 62 papers with code

Target Really Matters: Target-aware Contrastive Learning and Consistency Regularization for Few-shot Stance Detection

1 code implementation COLING 2022 Rui Liu, Zheng Lin, Huishan Ji, Jiangnan Li, Peng Fu, Weiping Wang

Despite the significant progress on this task, it is extremely time-consuming and budget-unfriendly to collect sufficient high-quality labeled data for every new target under fully-supervised learning, whereas unlabeled data can be collected more easily.

Contrastive Learning Few-Shot Stance Detection

Aspect Is Not You Need: No-aspect Differential Sentiment Framework for Aspect-based Sentiment Analysis

no code implementations NAACL 2022 Jiahao Cao, Rui Liu, Huailiang Peng, Lei Jiang, Xu Bai

Then we propose a differential sentiment loss instead of the cross-entropy loss to better classify the sentiments by distinguishing the different distances between sentiments.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +3

Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction

1 code implementation 5 May 2025 Inclusion AI, Biao Gong, Cheng Zou, Dandan Zheng, Hu Yu, Jingdong Chen, Jianxin Sun, Junbo Zhao, Jun Zhou, Kaixiang Ji, Lixiang Ru, Libin Wang, Qingpei Guo, Rui Liu, Weilong Chai, Xinyu Xiao, Ziyuan Huang

We introduce Ming-Lite-Uni, an open-source multimodal framework featuring a newly designed unified visual generator and a native multimodal autoregressive model tailored for unifying vision and language.

multimodal interaction Text-to-Image Generation

A comprehensive review of remote sensing in wetland classification and mapping

no code implementations 15 Apr 2025 Shuai Yuan, Xiangan Liang, Tianwu Lin, Shuang Chen, Rui Liu, Jie Wang, Hongsheng Zhang, Peng Gong

Although some review articles have summarized the development of this field, there is a lack of a thorough and in-depth understanding of wetland classification and mapping: (1) the scientific importance of wetlands, (2) major data and methods used in wetland classification and mapping, (3) driving factors of wetland changes, (4) current research paradigm and limitations, (5) challenges and opportunities in wetland classification and mapping under the context of technological innovation and global environmental change.

Classification

EEG2GAIT: A Hierarchical Graph Convolutional Network for EEG-based Gait Decoding

no code implementations 2 Apr 2025 Xi Fu, Rui Liu, Aung Aung Phyo Wai, Hannah Pulferer, Neethu Robinson, Gernot R Müller-Putz, Cuntai Guan

Ablation studies validate the contributions of the hierarchical GCN modules and HTSR Loss, while saliency maps reveal the significance of motor-related brain regions in decoding tasks.

Brain Computer Interface EEG

A Theoretical Analysis of Analogy-Based Evolutionary Transfer Optimization

no code implementations 27 Mar 2025 Xiaoming Xue, Liang Feng, Yinglan Feng, Rui Liu, Kai Zhang, Kay Chen Tan

Evolutionary transfer optimization (ETO) has been gaining popularity in research over the years due to its outstanding knowledge transfer ability to address various challenges in optimization.

Transfer Learning

Robust Safety Critical Control Under Multiple State and Input Constraints: Volume Control Barrier Function Method

no code implementations 18 Mar 2025 Jinyang Dong, Shizhen Wu, Rui Liu, Xiao Liang, Biao Lu, Yongchun Fang

To further address the challenges arising from multiple CBF and input constraints, a novel Volume CBF (VCBF) is proposed by analyzing the feasible space of the quadratic programming (QP) problem.

MSCMHMST: A traffic flow prediction model based on Transformer

no code implementations 16 Mar 2025 Weiyang Geng, Yiming Pan, Zhecong Xing, Dongyu Liu, Rui Liu, Yuan Zhu

This study proposes a hybrid model based on Transformers, named MSCMHMST, aimed at addressing key challenges in traffic flow prediction.

Prediction Traffic Prediction

MegaTTS 3: Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis

no code implementations 26 Feb 2025 Ziyue Jiang, Yi Ren, RuiQi Li, Shengpeng Ji, Boyang Zhang, Zhenhui Ye, Chen Zhang, Bai Jionghao, Xiaoda Yang, Jialong Zuo, Yu Zhang, Rui Liu, Xiang Yin, Zhou Zhao

While recent zero-shot text-to-speech (TTS) models have significantly improved speech quality and expressiveness, mainstream systems still suffer from issues related to speech-text alignment modeling: 1) models without explicit speech-text alignment modeling exhibit less robustness, especially for hard sentences in practical applications; 2) predefined alignment-based models suffer from naturalness constraints of forced alignments.

Speech Synthesis text-to-speech +1

CAML: Collaborative Auxiliary Modality Learning for Multi-Agent Systems

no code implementations 25 Feb 2025 Rui Liu, Yu Shen, Peng Gao, Pratap Tokekar, Ming Lin

Multi-modality learning has become a crucial technique for improving the performance of machine learning applications across domains such as autonomous driving, robotics, and perception systems.

Autonomous Driving Decision Making +1

Enhancing Image Matting in Real-World Scenes with Mask-Guided Iterative Refinement

no code implementations 24 Feb 2025 Rui Liu

Real-world image matting is essential for applications in content creation and augmented reality.

Benchmarking feature selection +1

AUKT: Adaptive Uncertainty-Guided Knowledge Transfer with Conformal Prediction

no code implementations 23 Feb 2025 Rui Liu, Peng Gao, Yu Shen, Ming Lin, Pratap Tokekar

Knowledge transfer between teacher and student models has proven effective across various machine learning applications.

Autonomous Driving Conformal Prediction +3

Return of the Encoder: Maximizing Parameter Efficiency for SLMs

1 code implementation 27 Jan 2025 Mohamed Elfeki, Rui Liu, Chad Voegele

The dominance of large decoder-only language models has overshadowed encoder-decoder architectures, despite their fundamental efficiency advantages in sequence processing.

Computational Efficiency Decoder +1

Ultralow-dimensionality reduction for identifying critical transitions by spatial-temporal PCA

no code implementations 22 Jan 2025 Pei Chen, Yaofang Suo, Rui Liu, Luonan Chen

Discovering dominant patterns and exploring dynamic behaviors, especially critical state transitions and tipping points, in high-dimensional time-series data are challenging tasks in the study of real-world complex systems, which demand interpretable data representations to facilitate comprehension of both spatial and temporal information within the original data space.

Dimensionality Reduction Time Series

Retrieval-Augmented Dialogue Knowledge Aggregation for Expressive Conversational Speech Synthesis

1 code implementation 11 Jan 2025 Rui Liu, Zhenqi Jia, Feilong Bao, Haizhou Li

Then, we design a multi-attribute retrieval scheme to match the dialogue semantic and style vectors of the CD with the stored dialogue semantic and style vectors in the SDSSD, retrieving the most similar dialogues.

Attribute Benchmarking +2

Listening and Seeing Again: Generative Error Correction for Audio-Visual Speech Recognition

1 code implementation 3 Jan 2025 Rui Liu, Hongyu Yuan, Haizhou Li

Unlike traditional Automatic Speech Recognition (ASR), Audio-Visual Speech Recognition (AVSR) takes audio and visual signals simultaneously to infer the transcription.

Audio-Visual Speech Recognition Automatic Speech Recognition +4

Towards Expressive Video Dubbing with Multiscale Multimodal Context Interaction

no code implementations 25 Dec 2024 Yuan Zhao, Rui Liu, Gaoxiang Cong

Recent research focuses on modeling multimodal context to enhance prosody expressiveness but overlooks two key issues: 1) Multiscale prosody expression attributes in the context influence the current sentence's prosody.

Graph Attention Sentence

Intra- and Inter-modal Context Interaction Modeling for Conversational Speech Synthesis

no code implementations 25 Dec 2024 Zhenqi Jia, Rui Liu

In the inference phase, we take MDH and adopt trained interaction modules to fully infer the speech prosody of the target utterance's text content.

Contrastive Learning Speech Synthesis

Multi-modal and Multi-scale Spatial Environment Understanding for Immersive Visual Text-to-Speech

1 code implementation 16 Dec 2024 Rui Liu, Shuwei He, Yifan Hu, Haizhou Li

The multi-modal part aims to take both the RGB and Depth spaces of the spatial image to learn more comprehensive spatial information, and the multi-scale part seeks to model the local and global spatial knowledge simultaneously.

text-to-speech Text to Speech

Analyst Reports and Stock Performance: Evidence from the Chinese Market

no code implementations 13 Nov 2024 Rui Liu, Jiayou Liang, Haolong Chen, Yujia Hu

This article applies natural language processing (NLP) to extract and quantify textual information to predict stock performance.

Sentiment Analysis

Voltage Support Capability Analysis of Grid-Forming Inverters with Current-Limiting Control Under Asymmetrical Grid Faults

no code implementations 7 Nov 2024 Han Zhang, Rui Liu, Yunwei Li

It is discovered that matching the phase angle of the virtual impedance, emulated by the CLC, with that of the composed impedance from the capacitor to the fault location can maximize the voltage support capability of GFM inverters under asymmetrical grid faults.

Flexible Coded Distributed Convolution Computing for Enhanced Fault Tolerance and Numerical Stability in Distributed CNNs

no code implementations 3 Nov 2024 Shuo Tan, Rui Liu, Xianlei Long, Kai Wan, Linqi Song, Yong Li

Deploying Convolutional Neural Networks (CNNs) on resource-constrained devices necessitates efficient management of computational resources, often via distributed systems susceptible to latency from straggler nodes.

Computational Efficiency Distributed Computing +1

FairDgcl: Fairness-aware Recommendation with Dynamic Graph Contrastive Learning

1 code implementation 23 Oct 2024 Wei Chen, Meng Yuan, Zhao Zhang, Ruobing Xie, Fuzhen Zhuang, Deqing Wang, Rui Liu

Specifically, we propose FairDgcl, a dynamic graph adversarial contrastive learning framework that aims to improve fairness in recommender systems.

Contrastive Learning Data Augmentation +2

Vision-Language Navigation with Energy-Based Policy

no code implementations 18 Oct 2024 Rui Liu, Wenguan Wang, Yi Yang

Consequently, ENP learns to globally align with the expert policy by maximizing the likelihood of the actions and modeling the dynamics of the navigation states in a collaborative manner.

Behavioural cloning Vision-Language Navigation

Multi-Source Spatial Knowledge Understanding for Immersive Visual Text-to-Speech

1 code implementation 18 Oct 2024 Shuwei He, Rui Liu

Previous works focus on the RGB modality for global environmental modeling, overlooking the potential of multi-source spatial knowledge like depth, speaker position, and environmental semantics.

object-detection Object Detection +3

Emphasis Rendering for Conversational Text-to-Speech with Multi-modal Multi-scale Context Modeling

no code implementations 12 Oct 2024 Rui Liu, Zhenqi Jia, Jie Yang, Yifan Hu, Haizhou Li

In this paper, we propose a novel Emphasis Rendering scheme for the CTTS model, termed ER-CTTS, that includes two main components: 1) we simultaneously take into account textual and acoustic contexts, with both global and local semantic modeling to understand the conversation context comprehensively; 2) we deeply integrate multi-modal and multi-scale context to learn the influence of context on the emphasis expression of the current utterance.

text-to-speech Text to Speech

FluentEditor2: Text-based Speech Editing by Modeling Multi-Scale Acoustic and Prosody Consistency

1 code implementation 28 Sep 2024 Rui Liu, Jiatian Xi, Ziyue Jiang, Haizhou Li

To maintain speech fluency, we propose a new fluency speech editing scheme based on our previous FluentEditor model, termed FluentEditor2, by modeling the multi-scale acoustic and prosody consistency training criterion in TSE training.

Text to Speech

Building Real-time Awareness of Out-of-distribution in Trajectory Prediction for Autonomous Vehicles

no code implementations 25 Sep 2024 Tongfei Guo, Taposh Banerjee, Rui Liu, Lili Su

Trajectory prediction describes the motions of surrounding moving obstacles for an autonomous vehicle; it plays a crucial role in enabling timely decision-making, such as collision avoidance and trajectory replanning.

Autonomous Vehicles Change Point Detection +6

Medical Report Generation Is A Multi-label Classification Problem

no code implementations 30 Aug 2024 Yijian Fan, Zhenbang Yang, Rui Liu, Mingjie Li, Xiaojun Chang

However, in this paper, we propose a novel perspective: rethinking medical report generation as a multi-label classification problem.

Medical Report Generation Multi-Label Classification +1

MCDubber: Multimodal Context-Aware Expressive Video Dubbing

no code implementations 21 Aug 2024 Yuan Zhao, Zhenqi Jia, Rui Liu, De Hu, Feilong Bao, Guanglai Gao

Automatic Video Dubbing (AVD) aims to take the given script and generate speech that aligns with lip motion and prosody expressiveness.

Sentence

Generative Expressive Conversational Speech Synthesis

1 code implementation 31 Jul 2024 Rui Liu, Yifan Hu, Yi Ren, Xiang Yin, Haizhou Li

After that, the expressive conversational speech is synthesized by the conversation-enriched VITS to deliver feedback to the user. Furthermore, we propose a large-scale Natural CSS Dataset called NCSSD, that includes both naturally recorded conversational speech in improvised styles and dialogues extracted from TV shows.

Speech Synthesis

Navigation Instruction Generation with BEV Perception and Large Language Models

1 code implementation 21 Jul 2024 Sheng Fan, Rui Liu, Wenguan Wang, Yi Yang

To address these challenges, we propose BEVInstructor, which incorporates Bird's Eye View (BEV) features into Multi-Modal Large Language Models (MLLMs) for instruction generation.

Exploiting Scale-Variant Attention for Segmenting Small Medical Objects

1 code implementation 10 Jul 2024 Wei Dai, Rui Liu, Zixuan Wu, Tianyi Wu, Min Wang, Junxian Zhou, Yixuan Yuan, Jun Liu

Early detection and accurate diagnosis can predict the risk of malignant disease transformation, thereby increasing the probability of effective treatment.

Cell Segmentation MRI segmentation +2

Federated Knowledge Transfer Fine-tuning Large Server Model with Resource-Constrained IoT Clients

no code implementations 7 Jul 2024 Shaoyuan Chen, Linlin You, Rui Liu, Shuo Yu, Ahmed M. Abdelmoniem

Compared to the solutions based on centralized data centers, updating large models in the Internet of Things (IoT) faces challenges in coordinating knowledge from distributed clients by using their private and heterogeneous data.

Federated Learning Knowledge Distillation +2

Emotion and Intent Joint Understanding in Multimodal Conversation: A Benchmarking Dataset

1 code implementation 3 Jul 2024 Rui Liu, Haolin Zuo, Zheng Lian, Xiaofen Xing, Björn W. Schuller, Haizhou Li

Together with the release of the dataset, we also develop an Emotion and Intent Interaction (EI$^2$) network as a reference system by modeling the deep correlation between emotion and intent in the multimodal conversation.

Benchmarking Diversity

Reliable Object Tracking by Multimodal Hybrid Feature Extraction and Transformer-Based Fusion

1 code implementation 28 May 2024 Hongze Sun, Rui Liu, Wuque Cai, Jun Wang, Yue Wang, Huajin Tang, Yan Cui, Dezhong Yao, Daqing Guo

In this study, we propose a novel multimodal hybrid tracker (MMHT) that utilizes frame-event-based data for reliable single object tracking.

Object Visual Object Tracking

EEG-Deformer: A Dense Convolutional Transformer for Brain-computer Interfaces

1 code implementation 25 Apr 2024 Yi Ding, Yong Li, Hao Sun, Rui Liu, Chengxuan Tong, Chenyu Liu, Xinliang Zhou, Cuntai Guan

Effectively learning the temporal dynamics in electroencephalogram (EEG) signals is challenging yet essential for decoding brain activities using brain-computer interfaces (BCIs).

EEG

Infrared Small Target Detection with Scale and Location Sensitivity

1 code implementation CVPR 2024 Qiankun Liu, Rui Liu, Bolun Zheng, Hongkui Wang, Ying Fu

In this paper, we focus on boosting detection performance with a more effective loss but a simpler model structure.

Volumetric Environment Representation for Vision-Language Navigation

1 code implementation CVPR 2024 Rui Liu, Wenguan Wang, Yi Yang

To achieve a comprehensive 3D representation with fine-grained details, we introduce a Volumetric Environment Representation (VER), which voxelizes the physical world into structured 3D cells.

3D geometry Multi-Task Learning +3

Adaptive Visual Imitation Learning for Robotic Assisted Feeding Across Varied Bowl Configurations and Food Types

no code implementations 19 Mar 2024 Rui Liu, Amisha Bhaskar, Pratap Tokekar

Notably, our model, trained solely on data from a transparent glass bowl containing granular cereals, showcases generalization ability when tested zero-shot on other bowl configurations with different types of food.

Imitation Learning

Towards Efficient Risk-Sensitive Policy Gradient: An Iteration Complexity Analysis

no code implementations 13 Mar 2024 Rui Liu, Anish Gupta, Erfaun Noorani, Pratap Tokekar

To validate our analysis, we empirically evaluate the learning performance and convergence efficiency of the risk-neutral and risk-sensitive REINFORCE algorithms in multiple environments: CartPole, MiniGrid, and Robot Navigation.

Policy Gradient Methods Reinforcement Learning (RL) +1

Large Model based Sequential Keyframe Extraction for Video Summarization

no code implementations 10 Jan 2024 Kailong Tan, Yuxiang Zhou, Qianchen Xia, Rui Liu, Yong Chen

Keyframe extraction aims to sum up a video's semantics with the minimum number of its frames.

Video Summarization

Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling

1 code implementation 19 Dec 2023 Rui Liu, Yifan Hu, Yi Ren, Xiang Yin, Haizhou Li

Conversational Speech Synthesis (CSS) aims to accurately express an utterance with the appropriate prosody and emotional inflection within a conversational setting.

Contrastive Learning Speech Synthesis

Prompt Based Tri-Channel Graph Convolution Neural Network for Aspect Sentiment Triplet Extraction

1 code implementation 18 Dec 2023 Kun Peng, Lei Jiang, Hao Peng, Rui Liu, Zhengtao Yu, Jiaqian Ren, Zhifeng Hao, Philip S. Yu

Aspect Sentiment Triplet Extraction (ASTE) is an emerging task to extract a given sentence's triplets, which consist of aspects, opinions, and sentiments.

Aspect Sentiment Triplet Extraction Triplet

Learning Noise-Robust Joint Representation for Multimodal Emotion Recognition under Incomplete Data Scenarios

1 code implementation 21 Sep 2023 Qi Fan, Haolin Zuo, Rui Liu, Zheng Lian, Guanglai Gao

This approach includes two pivotal components: firstly, a noise scheduler that adjusts the type and level of noise in the data to emulate various realistic incomplete situations.

Multimodal Emotion Recognition

Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech

1 code implementation 21 Sep 2023 Rui Liu, Bin Liu, Haizhou Li

Prosodic phrasing is crucial to the naturalness and intelligibility of end-to-end Text-to-Speech (TTS).

text-to-speech Text to Speech

FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency

1 code implementation 21 Sep 2023 Rui Liu, Jiatian Xi, Ziyue Jiang, Haizhou Li

Text-based speech editing (TSE) techniques are designed to enable users to edit the output audio by modifying the input text transcript instead of the audio itself.

Explainable AI for tool wear prediction in turning

no code implementations 17 Aug 2023 Saleh Valizadeh Sotubadi, Rui Liu, Vinh Neguyen

After the training process, the Shapley criterion was used to explain the predictions of the trained ML classifier.

Binary Classification Decision Making +3

Aggregating Intrinsic Information to Enhance BCI Performance through Federated Learning

1 code implementation 14 Aug 2023 Rui Liu, YuanYuan Chen, Anran Li, Yi Ding, Han Yu, Cuntai Guan

Though numerous research groups and institutes collect a multitude of EEG datasets for the same BCI task, sharing EEG data from multiple sites is still challenging due to the heterogeneity of devices.

EEG Eeg Decoding +2

Bird's-Eye-View Scene Graph for Vision-Language Navigation

1 code implementation ICCV 2023 Rui Liu, Xiaohan Wang, Wenguan Wang, Yi Yang

Vision-language navigation (VLN), which requires an agent to navigate 3D environments following human instructions, has shown great advances.

Navigate Vision-Language Navigation

Understanding the Application of Utility Theory in Robotics and Artificial Intelligence: A Survey

no code implementations 15 Jun 2023 Qin Yang, Rui Liu

As a unifying concept in economics, game theory, and operations research, even in the Robotics and AI field, the utility is used to evaluate the level of individual needs, preferences, and interests.

Decision Making

Modeling Dynamic Heterogeneous Graph and Node Importance for Future Citation Prediction

no code implementations 27 May 2023 Hao Geng, Deqing Wang, Fuzhen Zhuang, Xuehua Ming, Chenguang Du, Ting Jiang, Haolong Guo, Rui Liu

To cope with this problem, we propose a Dynamic heterogeneous Graph and Node Importance network (DGNI) learning framework, which fully leverages the dynamic heterogeneous graph and node importance information to predict future citation trends of newly published papers.

Citation Prediction Network Embedding

Betray Oneself: A Novel Audio DeepFake Detection Model via Mono-to-Stereo Conversion

1 code implementation 25 May 2023 Rui Liu, Jinhua Zhang, Guanglai Gao, Haizhou Li

In this paper, we propose a novel ADD model, termed as M2S-ADD, that attempts to discover audio authenticity cues during the mono-to-stereo conversion process.

Audio Deepfake Detection Face Swapping +3

Distributed TD(0) with Almost No Communication

no code implementations 25 May 2023 Rui Liu, Alex Olshevsky

We provide a new non-asymptotic analysis of distributed temporal difference learning with linear function approximation.

PerCoNet: News Recommendation with Explicit Persona and Contrastive Learning

no code implementations 17 Apr 2023 Rui Liu, Bin Yin, Ziyi Cao, Qianchen Xia, Yong Chen, Dell Zhang

Personalized news recommender systems help users quickly find content of their interests from the sea of information.

Contrastive Learning News Recommendation +1

Neural Partial Differential Equations with Functional Convolution

no code implementations 10 Mar 2023 Ziqian Wu, Xingzhe He, Yijun Li, Cheng Yang, Rui Liu, Shiying Xiong, Bo Zhu

We present a lightweight neural PDE representation to discover the hidden structure and predict the solution of different nonlinear PDEs.

Towards Interpretable Federated Learning

no code implementations 27 Feb 2023 Anran Li, Rui Liu, Ming Hu, Luu Anh Tuan, Han Yu

Federated learning (FL) enables multiple data owners to build machine learning models collaboratively without exposing their private local data.

Federated Learning

MnTTS2: An Open-Source Multi-Speaker Mongolian Text-to-Speech Synthesis Dataset

1 code implementation 11 Dec 2022 Kailin Liang, Bin Liu, Yifan Hu, Rui Liu, Feilong Bao, Guanglai Gao

Text-to-Speech (TTS) synthesis for low-resource languages is an attractive research issue in academia and industry nowadays.

Speech Synthesis text-to-speech +2

Quantifying syntax similarity with a polynomial representation of dependency trees

1 code implementation 13 Nov 2022 Pengyu Liu, Tinghao Feng, Rui Liu

We introduce a graph polynomial that distinguishes tree structures to represent dependency grammar and a measure based on the polynomial representation to quantify syntax similarity.

Diversity Sentence

Coverage-centric Coreset Selection for High Pruning Rates

1 code implementation 28 Oct 2022 Haizhong Zheng, Rui Liu, Fan Lai, Atul Prakash

We then propose a novel one-shot coreset selection method, Coverage-centric Coreset Selection (CCS), that jointly considers overall data coverage upon a distribution as well as the importance of each example.

Vocal Bursts Intensity Prediction

FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis

1 code implementation 27 Oct 2022 Yifan Hu, Rui Liu, Guanglai Gao, Haizhou Li

Therefore, we propose a novel expressive conversational TTS model, termed FCTalker, that learns the fine- and coarse-grained context dependency at the same time during speech generation.

Speech Synthesis text-to-speech +1

Explicit Intensity Control for Accented Text-to-speech

no code implementations 27 Oct 2022 Rui Liu, Haolin Zuo, De Hu, Guanglai Gao, Haizhou Li

Accented text-to-speech (TTS) synthesis seeks to generate speech with an accent (L2) as a variant of the standard version (L1).

speech-recognition Speech Recognition +2

A Deep Investigation of RNN and Self-attention for the Cyrillic-Traditional Mongolian Bidirectional Conversion

no code implementations 24 Sep 2022 Muhan Na, Rui Liu, Feilong, Guanglai Gao

To answer this question, this paper investigates the utility of these two powerful techniques for the CTMBC task, combined with the agglutinative characteristics of the Mongolian language.

Decoder Machine Translation

A Spatial-channel-temporal-fused Attention for Spiking Neural Networks

no code implementations 22 Sep 2022 Wuque Cai, Hongze Sun, Rui Liu, Yan Cui, Jun Wang, Yang Xia, Dezhong Yao, Daqing Guo

Spiking neural networks (SNNs) mimic brain computational strategies, and exhibit substantial capabilities in spatiotemporal information processing.

MnTTS: An Open-Source Mongolian Text-to-Speech Synthesis Dataset and Accompanied Baseline

1 code implementation 22 Sep 2022 Yifan Hu, Pengkai Yin, Rui Liu, Feilong Bao, Guanglai Gao

This paper introduces a high-quality open-source text-to-speech (TTS) synthesis dataset for Mongolian, a low-resource language spoken by over 10 million people worldwide.

Speech Synthesis text-to-speech +2

Controllable Accented Text-to-Speech Synthesis

no code implementations 22 Sep 2022 Rui Liu, Berrak Sisman, Guanglai Gao, Haizhou Li

Accented TTS synthesis is challenging as L2 differs from L1 in terms of both phonetic rendering and prosody pattern.

Speech Synthesis text-to-speech +2

Communication-efficient Distributed Learning for Large Batch Optimization

1 code implementation Proceedings of the 39th International Conference on Machine Learning 2022 Rui Liu, Barzan Mozafari

In this paper, we propose new gradient compression methods for large batch optimization, JointSpar and its variant JointSpar-LARS with layerwise adaptive learning rates, that jointly reduce both the computation and the communication cost.

Mitigating Data Redundancy to Revitalize Transformer-based Long-Term Time Series Forecasting System

2 code implementations 16 Jul 2022 Mingjie Li, Rui Liu, Guangsi Shi, Mingfei Han, Changling Li, Lina Yao, Xiaojun Chang, Ling Chen

This curriculum-driven noise introduction aids the memory-driven decoder by supplying more diverse and representative training data, enhancing the decoder's ability to model seasonal tendencies and dependencies in the time-series data.

Data Augmentation Decoder +2

Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning

1 code implementation 15 Jun 2022 Rui Liu, Berrak Sisman, Björn Schuller, Guanglai Gao, Haizhou Li

In this paper, we propose a data-driven deep learning model, i.e., StrengthNet, to improve the generalization of emotion strength assessment for seen and unseen speech.

Attribute Emotion Classification +4

Gating Dropout: Communication-efficient Regularization for Sparsely Activated Transformers

no code implementations 28 May 2022 Rui Liu, Young Jin Kim, Alexandre Muzio, Hany Hassan Awadalla

Sparsely activated transformers, such as Mixture of Experts (MoE), have received great interest due to their outrageous scaling capability, which enables dramatic increases in model size without significant increases in computational cost.

Machine Translation Mixture-of-Experts

NTIRE 2022 Challenge on High Dynamic Range Imaging: Methods and Results

no code implementations 25 May 2022 Eduardo Pérez-Pellitero, Sibi Catley-Chandar, Richard Shaw, Aleš Leonardis, Radu Timofte, Zexin Zhang, Cen Liu, Yunbo Peng, Yue Lin, Gaocheng Yu, Jin Zhang, Zhe Ma, Hongbin Wang, Xiangyu Chen, Xintao Wang, Haiwei Wu, Lin Liu, Chao Dong, Jiantao Zhou, Qingsen Yan, Song Zhang, Weiye Chen, Yuhang Liu, Zhen Zhang, Yanning Zhang, Javen Qinfeng Shi, Dong Gong, Dan Zhu, Mengdi Sun, Guannan Chen, Yang Hu, Haowei Li, Baozhu Zou, Zhen Liu, Wenjie Lin, Ting Jiang, Chengzhi Jiang, Xinpeng Li, Mingyan Han, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Juan Marín-Vega, Michael Sloth, Peter Schneider-Kamp, Richard Röttger, Chunyang Li, Long Bao, Gang He, Ziyao Xu, Li Xu, Gen Zhan, Ming Sun, Xing Wen, Junlin Li, Shuang Feng, Fei Lei, Rui Liu, Junxiang Ruan, Tianhong Dai, Wei Li, Zhan Lu, Hengyan Liu, Peian Huang, Guangyu Ren, Yonglin Luo, Chang Liu, Qiang Tu, Fangya Li, Ruipeng Gang, Chenghua Li, Jinjing Li, Sai Ma, Chenming Liu, Yizhen Cao, Steven Tel, Barthelemy Heyrman, Dominique Ginhac, Chul Lee, Gahyeon Kim, Seonghyun Park, An Gia Vien, Truong Thanh Nhat Mai, Howoon Yoon, Tu Vo, Alexander Holston, Sheir Zaheer, Chan Y. Park

The challenge is composed of two tracks with an emphasis on fidelity and complexity constraints: In Track 1, participants are asked to optimize objective fidelity scores while imposing a low-complexity constraint (i.e., solutions cannot exceed a given number of operations).

Image Restoration Vocal Bursts Intensity Prediction

Transformer with Memory Replay

no code implementations 19 May 2022 Rui Liu, Barzan Mozafari

Transformers achieve state-of-the-art performance for natural language processing tasks by pre-training on large-scale text corpora.

Deeply Supervised Skin Lesions Diagnosis with Stage and Branch Attention

2 code implementations 9 May 2022 Wei Dai, Rui Liu, Tianyi Wu, Min Wang, Jianqin Yin, Jun Liu

Visual features of skin lesions vary significantly because the images are collected from patients with different lesion colours and morphologies by using dissimilar imaging equipment.

Classification

Peng Cheng Object Detection Benchmark for Smart City

no code implementations 11 Mar 2022 YaoWei Wang, Zhouxin Yang, Rui Liu, Deng Li, Yuandu Lai, Leyuan Fang, Yahong Han

Considering the diversity and complexity of scenes in intelligent city governance, we build a large-scale object detection benchmark for the smart city.

Diversity Object +2

CTformer: Convolution-free Token2Token Dilated Vision Transformer for Low-dose CT Denoising

2 code implementations 28 Feb 2022 Dayang Wang, Fenglei Fan, Zhan Wu, Rui Liu, Fei Wang, Hengyong Yu

Furthermore, an overlapped inference mechanism is introduced to effectively eliminate the boundary artifacts that are common for encoder-decoder-based denoising models.

Decoder Denoising

Federated Graph Neural Networks: Overview, Techniques and Challenges

no code implementations 15 Feb 2022 Rui Liu, Pengwei Xing, Zichao Deng, Anran Li, Cuntai Guan, Han Yu

This has led to the rapid development of the emerging research field of federated graph neural networks (FedGNNs).

Federated Learning Survey

Frequency Spectrum Augmentation Consistency for Domain Adaptive Object Detection

no code implementations 16 Dec 2021 Rui Liu, Yahong Han, YaoWei Wang, Qi Tian

In the second stage, augmented source and target data with pseudo labels are adopted to perform the self-training for prediction consistency.

Object object-detection +1

PP-MSVSR: Multi-Stage Video Super-Resolution

1 code implementation 6 Dec 2021 Lielin Jiang, Na Wang, Qingqing Dang, Rui Liu, Baohua Lai

Different from the Single Image Super-Resolution (SISR) task, the key to the Video Super-Resolution (VSR) task is to make full use of complementary information across frames to reconstruct the high-resolution sequence.

Image Super-Resolution Video Super-Resolution

SSMF: Shifting Seasonal Matrix Factorization

1 code implementation NeurIPS 2021 Koki Kawabata, Siddharth Bhatia, Rui Liu, Mohit Wadhwa, Bryan Hooi

In general, given a data stream of events with seasonal patterns that innovate over time, how can we effectively and efficiently forecast future events?

Data Compression

StrengthNet: Deep Learning-based Emotion Strength Assessment for Emotional Speech Synthesis

1 code implementation 7 Oct 2021 Rui Liu, Berrak Sisman, Haizhou Li

The emotion strength of synthesized speech can be controlled flexibly using a strength descriptor, which is obtained by an emotion attribute ranking function.

Attribute Data Augmentation +2

VisualTTS: TTS with Accurate Lip-Speech Synchronization for Automatic Voice Over

no code implementations 7 Oct 2021 Junchen Lu, Berrak Sisman, Rui Liu, Mingyang Zhang, Haizhou Li

The proposed VisualTTS adopts two novel mechanisms, 1) textual-visual attention and 2) a visual fusion strategy during acoustic decoding, which both contribute to forming an accurate alignment between the input text content and the lip motion in the input lip sequence.

Speech Synthesis text-to-speech +1

Vector-Decomposed Disentanglement for Domain-Invariant Object Detection

1 code implementation ICCV 2021 Aming Wu, Rui Liu, Yahong Han, Linchao Zhu, Yi Yang

Secondly, domain-specific representations are introduced as the differences between the input and domain-invariant representations.

Disentanglement Object +2

Using Query Expansion in Manifold Ranking for Query-Oriented Multi-Document Summarization

1 code implementation CCL 2021 Quanye Jia, Rui Liu, Jianying Lin

It makes use of not only the relationships among the sentences, but also the relationships between the given query and the sentences.

Document Summarization Multi-Document Summarization

Spatiotemporal information conversion machine for time-series prediction

1 code implementation 3 Jul 2021 Hao Peng, Pei Chen, Rui Liu, Luonan Chen

Making robust predictions based only on the observed data of a nonlinear system is a difficult task.

Causal Inference Prediction +2

Emotional Voice Conversion: Theory, Databases and ESD

1 code implementation 31 May 2021 Kun Zhou, Berrak Sisman, Rui Liu, Haizhou Li

In this paper, we first provide a review of the state-of-the-art emotional voice conversion research, and the existing emotional speech databases.

Voice Conversion

Distributed TD(0) with Almost No Communication

no code implementations 16 Apr 2021 Rui Liu, Alex Olshevsky

In the global state model, we show that the convergence rate of our distributed one-shot averaging method matches the known convergence rate of TD(0).

Decoupled Spatial-Temporal Transformer for Video Inpainting

1 code implementation 14 Apr 2021 Rui Liu, Hanming Deng, Yangyi Huang, Xiaoyu Shi, Lewei Lu, Wenxiu Sun, Xiaogang Wang, Jifeng Dai, Hongsheng Li

The seamless combination of these two novel designs forms a better spatial-temporal attention scheme, and our proposed model achieves better performance than state-of-the-art video inpainting approaches with significantly boosted efficiency.

Video Inpainting

Isconna: Streaming Anomaly Detection with Frequency and Patterns

2 code implementations 4 Apr 2021 Rui Liu, Siddharth Bhatia, Bryan Hooi

Isconna does not actively explore or maintain pattern snippets; it instead measures the consecutive presence and absence of edge records.

Anomaly Detection

DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network

1 code implementation CVPR 2021 Rui Liu, Yixiao Ge, Ching Lam Choi, Xiaogang Wang, Hongsheng Li

Conditional generative adversarial networks (cGANs) aim to synthesize diverse images given the input conditions and latent codes, but unfortunately, they usually suffer from the issue of mode collapse.

Contrastive Learning Diversity +2

How flux feeding causes eruptions of solar magnetic flux ropes with the hyperbolic flux tube configuration?

no code implementations 29 Jan 2021 Quanhao Zhang, Rui Liu, Yuming Wang, Zhenjun Zhou, Bin Zhuang, Xiaolei Li

But it is unclear how flux feeding influences coronal flux ropes that are wrapped by hyperbolic flux tubes (HFT), and whether it is able to cause the flux-rope eruption.

Solar and Stellar Astrophysics

A Physics-Informed Machine Learning Model for Porosity Analysis in Laser Powder Bed Fusion Additive Manufacturing

no code implementations 13 Jan 2021 Rui Liu, Sen Liu, Xiaoli Zhang

To address the first problem, a physics-informed, data-driven model (PIM) is proposed that, instead of directly using machine setting parameters to predict porosity levels of printed parts, first interprets machine settings into physical effects, such as laser energy density and laser radiation pressure.

BIG-bench Machine Learning Physics-informed machine learning

Non-equilibrium Flux Rope Formation by Confined Flares Preceding a Solar Coronal Mass Ejection

no code implementations 6 Jan 2021 Bernhard Kliem, Jeongwoo Lee, Rui Liu, Stephen M. White, Chang Liu, Satoshi Masuda

We present evidence that a magnetic flux rope was formed before a coronal mass ejection (CME) and its associated long-duration flare during a pair of preceding confined eruptions and associated impulsive flares in a compound event in NOAA Active Region 12371.

Solar and Stellar Astrophysics

Seen and Unseen emotional style transfer for voice conversion with a new emotional speech dataset

2 code implementations 28 Oct 2020 Kun Zhou, Berrak Sisman, Rui Liu, Haizhou Li

Emotional voice conversion aims to transform emotional prosody in speech while preserving the linguistic content and speaker identity.

Decoder Generative Adversarial Network +3

Temporal Difference Learning as Gradient Splitting

no code implementations 27 Oct 2020 Rui Liu, Alex Olshevsky

Temporal difference learning with linear function approximation is a popular method to obtain a low-dimensional approximation of the value function of a policy in a Markov Decision Process.

Adam with Bandit Sampling for Deep Learning

no code implementations NeurIPS 2020 Rui Liu, Tianyi Wu, Barzan Mozafari

In this paper, we propose a generalization of Adam, called Adambs, that allows us to also adapt to different training examples based on their importance in the model's convergence.

Deep Learning

GraphSpeech: Syntax-Aware Graph Attention Network For Neural Speech Synthesis

no code implementations 23 Oct 2020 Rui Liu, Berrak Sisman, Haizhou Li

Attention-based end-to-end text-to-speech synthesis (TTS) is superior to conventional statistical methods in many ways.

Graph Attention Graph Neural Network +5

Real-Time Anomaly Detection in Edge Streams

3 code implementations 17 Sep 2020 Siddharth Bhatia, Rui Liu, Bryan Hooi, Minji Yoon, Kijung Shin, Christos Faloutsos

Given a stream of graph edges from a dynamic graph, how can we assign anomaly scores to edges in an online manner, for the purpose of detecting unusual behavior, using constant time and memory?

Anomaly Detection Anomaly Detection in Edge Streams

Modeling Prosodic Phrasing with Multi-Task Learning in Tacotron-based TTS

no code implementations 11 Aug 2020 Rui Liu, Berrak Sisman, Feilong Bao, Guanglai Gao, Haizhou Li

We propose a multi-task learning scheme for Tacotron training that optimizes the system to predict both the Mel spectrum and phrase breaks.

Multi-Task Learning Speech Synthesis

Asymptotic Convergence Rate of Alternating Minimization for Rank One Matrix Completion

no code implementations 11 Aug 2020 Rui Liu, Alex Olshevsky

We study alternating minimization for matrix completion in the simplest possible setting: completing a rank-one matrix from a revealed subset of the entries.

Matrix Completion

Keyphrase Prediction With Pre-trained Language Model

no code implementations 22 Apr 2020 Rui Liu, Zheng Lin, Weiping Wang

Considering the different characteristics of extractive and generative methods, we propose to divide the keyphrase prediction into two subtasks, i.e., present keyphrase extraction (PKE) and absent keyphrase generation (AKG), to fully exploit their respective advantages.

Keyphrase Extraction Keyphrase Generation +4

An Attention Transfer Model for Human-Assisted Failure Avoidance in Robot Manipulations

no code implementations 11 Feb 2020 Boyi Song, Yuntao Peng, Ruijiao Luo, Rui Liu

With the attention transfer, a robot understands what and where human concerns are to identify and correct abnormal manipulations.

Robot Manipulation

Proficiency Constrained Multi-Agent Reinforcement Learning for Environment-Adaptive Multi UAV-UGV Teaming

no code implementations 10 Feb 2020 Qifei Yu, Zhexin Shen, Yijiang Pang, Rui Liu

Due to heterogeneous robots inside a team and the resilient capabilities of robots, it is challenging to perform a task with an optimal balance between reasonable task allocations and maximum utilization of robot capability.

Deep Reinforcement Learning Multi-agent Reinforcement Learning +2

Understanding and Optimizing Packed Neural Network Training for Hyper-Parameter Tuning

no code implementations 7 Feb 2020 Rui Liu, Sanjay Krishnan, Aaron J. Elmore, Michael J. Franklin

As neural networks are increasingly employed in machine learning practice, how to efficiently share limited training resources among a diverse set of model training tasks becomes a crucial issue.

WaveTTS: Tacotron-based TTS with Joint Time-Frequency Domain Loss

no code implementations 2 Feb 2020 Rui Liu, Berrak Sisman, Feilong Bao, Guanglai Gao, Haizhou Li

To address this problem, we propose a new training scheme for Tacotron-based TTS, referred to as WaveTTS, that has 2 loss functions: 1) time-domain loss, denoted as the waveform loss, that measures the distortion between the natural and generated waveform; and 2) frequency-domain loss, that measures the Mel-scale acoustic feature loss between the natural and generated acoustic features.

text-to-speech Text to Speech

Teacher-Student Training for Robust Tacotron-based TTS

no code implementations 7 Nov 2019 Rui Liu, Berrak Sisman, Jingdong Li, Feilong Bao, Guanglai Gao, Haizhou Li

We first train a Tacotron2-based TTS model by always providing natural speech frames to the decoder; this model serves as a teacher model.

Decoder Knowledge Distillation +2

Regularized Non-negative Spectral Embedding for Clustering

no code implementations 1 Nov 2019 Yifei Wang, Rui Liu, Yong Chen, Hui Zhangs, Zhiwen Ye

Spectral Clustering is a popular technique to split data points into groups, especially for complex datasets.

Clustering

Joint Lifelong Topic Model and Manifold Ranking for Document Summarization

no code implementations 7 Jul 2019 Jianying Lin, Rui Liu, Quanye Jia

The JTMMR model can improve the effectiveness of the manifold ranking method by using better semantic features.

Document Summarization Multi-Document Summarization

Conditional Adversarial Generative Flow for Controllable Image Synthesis

no code implementations CVPR 2019 Rui Liu, Yu Liu, Xinyu Gong, Xiaogang Wang, Hongsheng Li

Flow-based generative models show great potential in image synthesis due to their reversible pipeline and exact log-likelihood target, yet they suffer from a weak ability for conditional image synthesis, especially for multi-label or unaware conditions.

Image Generation

Benchmarking Time Series Databases with IoTDB-Benchmark for IoT Scenarios

1 code implementation 24 Jan 2019 Rui Liu, Jun Yuan

With the goal of establishing a standard for evaluating TSDB systems, we present the IoTDB-Benchmark framework, specifically designed for TSDB and IoT application scenarios.

Databases

A Bandit Approach to Maximum Inner Product Search

no code implementations 15 Dec 2018 Rui Liu, Tianyi Wu, Barzan Mozafari

There has been substantial research on sub-linear time approximate algorithms for Maximum Inner Product Search (MIPS).

A LSTM Approach with Sub-Word Embeddings for Mongolian Phrase Break Prediction

no code implementations COLING 2018 Rui Liu, Feilong Bao, Guanglai Gao, HUI ZHANG, Yonghe Wang

In this paper, we first apply word embeddings that focus on sub-word units to the Mongolian Phrase Break (PB) prediction task, using a Long Short-Term Memory (LSTM) model.

Dictionary Learning Machine Translation +2

Discrete Factorization Machines for Fast Feature-based Recommendation

1 code implementation 6 May 2018 Han Liu, Xiangnan He, Fuli Feng, Liqiang Nie, Rui Liu, Hanwang Zhang

In this paper, we develop a generic feature-based recommendation model, called Discrete Factorization Machine (DFM), for fast and accurate recommendation.

Binarization Quantization

Phase Conductor on Multi-layered Attentions for Machine Comprehension

no code implementations ICLR 2018 Rui Liu, Wei Wei, Weiguang Mao, Maria Chikina

Attention models have been intensively studied to improve NLP tasks such as machine comprehension via both question-aware passage attention models and self-matching attention models.

Question Answering Reading Comprehension

Untangling Blockchain: A Data Processing View of Blockchain Systems

1 code implementation 17 Aug 2017 Tien Tuan Anh Dinh, Rui Liu, Meihui Zhang, Gang Chen, Beng Chin Ooi, Ji Wang

Blockchain technologies have been gaining massive momentum in the last few years.

Databases Cryptography and Security

BLOCKBENCH: A Framework for Analyzing Private Blockchains

2 code implementations 12 Mar 2017 Tien Tuan Anh Dinh, Ji Wang, Gang Chen, Rui Liu, Beng Chin Ooi, Kian-Lee Tan

However, there is a clear lack of a systematic framework with which different systems can be analyzed and compared against each other.

Databases Cryptography and Security Distributed, Parallel, and Cluster Computing

Structural Embedding of Syntactic Trees for Machine Comprehension

no code implementations EMNLP 2017 Rui Liu, Junjie Hu, Wei Wei, Zi Yang, Eric Nyberg

Deep neural networks for machine comprehension typically utilize only word or character embeddings without explicitly taking advantage of structured linguistic information such as constituency trees and dependency trees.

Question Answering Reading Comprehension

A Review of Methodologies for Natural-Language-Facilitated Human-Robot Cooperation

no code implementations 30 Jan 2017 Rui Liu, Xiaoli Zhang

However, a thorough review that reveals the latest methodologies for using NL to facilitate human-robot cooperation is missing.

Autonomous Navigation

Systems of natural-language-facilitated human-robot cooperation: A review

no code implementations 28 Jan 2017 Rui Liu, Xiaoli Zhang

Natural-language-facilitated human-robot cooperation (NLC), in which natural language (NL) is used to share knowledge between a human and a robot for conducting intuitive human-robot cooperation (HRC), has developed continuously over the past decade.

Generating machine-executable plans from end-user's natural-language instructions

no code implementations 20 Nov 2016 Rui Liu, Xiaoli Zhang

To address this NL-based human-machine communication problem and enable the machines to appropriately execute tasks by following the end-user's NL instructions, we developed a Machine-Executable-Plan-Generation (exePlan) method.

Simultaneous Low-rank Component and Graph Estimation for High-dimensional Graph Signals: Application to Brain Imaging

no code implementations 26 Sep 2016 Rui Liu, Hossein Nejati, Seyed Hamid Safavi, Ngai-Man Cheung

We propose an algorithm to uncover the intrinsic low-rank component of a high-dimensional, graph-smooth and grossly-corrupted dataset, in the setting where the underlying graph is unknown.

General Classification
