Search Results for author: Yuke Li

In order to predict a pedestrian's trajectory in a crowd accurately, one has to take into account her/his underlying socio-temporal interactions with other pedestrians consistently.

Trajectory Prediction

Enhancing Traffic Object Detection in Variable Illumination with RGB-Event Fusion

no code implementations • 1 Nov 2023 • Zhanwen Liu, Nan Yang, Yang Wang, Yuke Li, Xiangmo Zhao, Fei-Yue Wang

To address this issue, we introduce bio-inspired event cameras and propose a novel Structure-aware Fusion Network (SFNet) that extracts sharp and complete object structures from the event stream to compensate for the lost information in images through cross-modality fusion, enabling the network to obtain illumination-robust representations for traffic object detection.

Object • object-detection • +2

Boosting Multi-Speaker Expressive Speech Synthesis with Semi-supervised Contrastive Learning

no code implementations • 26 Oct 2023 • Xinfa Zhu, Yuke Li, Yi Lei, Ning Jiang, Guoqing Zhao, Lei Xie

This paper aims to build a multi-speaker expressive TTS system, synthesizing a target speaker's speech with multiple styles and emotions.

Contrastive Learning • Expressive Speech Synthesis

HPL-ViT: A Unified Perception Framework for Heterogeneous Parallel LiDARs in V2V

no code implementations • 27 Sep 2023 • Yuhang Liu, Boyi Sun, Yuke Li, Yuzheng Hu, Fei-Yue Wang

It uses a graph-attention Transformer to extract domain-specific features for each agent, coupled with a cross-attention mechanism for the final fusion.

Autonomous Driving • Graph Attention

Differentiable Resolution Compression and Alignment for Efficient Video Classification and Retrieval

1 code implementation • 15 Sep 2023 • Rui Deng, Qian Wu, Yuke Li, Haoran Fu

To address these issues, we propose an efficient video representation network with Differentiable Resolution Compression and Alignment mechanism, which compresses non-essential information in the early stage of the network to reduce computational costs while maintaining consistent temporal correlations.

Retrieval • Video Classification • +1

Part-Aware Transformer for Generalizable Person Re-identification

1 code implementation • ICCV 2023 • Hao Ni, Yuke Li, Lianli Gao, Heng Tao Shen, Jingkuan Song

Based on the local similarity obtained in CSL, a Part-guided Self-Distillation (PSD) is proposed to further improve the generalization of global features.

Domain Generalization • Generalizable Person Re-identification

Enhancing the Unified Streaming and Non-streaming Model with Contrastive Learning

no code implementations • 1 Jun 2023 • Yuting Yang, Yuke Li, Binbin Du

Specifically, the top-layer hidden representations at the same frame in the streaming and non-streaming modes are regarded as a positive pair, encouraging the streaming-mode representation to stay close to its non-streaming counterpart.

Contrastive Learning • speech-recognition • +1

3D-CSL: self-supervised 3D context similarity learning for Near-Duplicate Video Retrieval

1 code implementation • 10 Nov 2022 • Rui Deng, Qian Wu, Yuke Li

In this paper, we introduce 3D-CSL, a compact pipeline for Near-Duplicate Video Retrieval (NDVR), and explore a novel self-supervised learning strategy for video similarity learning.

Retrieval • Self-Supervised Learning • +3

Improving CTC-based ASR Models with Gated Interlayer Collaboration

no code implementations • 25 May 2022 • Yuting Yang, Yuke Li, Binbin Du

CTC-based automatic speech recognition (ASR) models without an external language model usually lack the capacity to model conditional dependencies and textual interactions.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +3

Multi-Level Modeling Units for End-to-End Mandarin Speech Recognition

no code implementations • 24 May 2022 • Yuting Yang, Binbin Du, Yuke Li

Thus only considering the writing of Chinese characters as modeling units is insufficient to capture speech features.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +2

BBS-KWS: The Mandarin Keyword Spotting System Won the Video Keyword Wakeup Challenge

no code implementations • 3 Dec 2021 • Yuting Yang, Binbin Du, Yingxin Zhang, Wenxuan Wang, Yuke Li

We propose a Mandarin keyword spotting system (KWS) with several novel and effective improvements, including a big backbone (B) model, a keyword biasing (B) mechanism and the introduction of syllable modeling units (S).

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +2

Transformer-based Network for RGB-D Saliency Detection

no code implementations • 1 Dec 2021 • Yue Wang, Xu Jia, Lu Zhang, Yuke Li, James Elder, Huchuan Lu

TFFM conducts a sufficient feature fusion by integrating features from multiple scales and two modalities over all positions simultaneously.

Saliency Detection

Intervention-based Recurrent Casual Model for Non-stationary Video Causal Discovery

no code implementations • 29 Sep 2021 • Yuke Li, Kenneth Li, Pin Wang, Donglai Wei, Hanspeter Pfister, Ching-Yao Chan

Non-stationary causal structures are prevalent in real-world physical systems.

Causal Discovery • counterfactual • +1

Synergistic saliency and depth prediction for RGB-D saliency detection

no code implementations • 3 Jul 2020 • Yue Wang, Yuke Li, James H. Elder, Huchuan Lu, Runmin Wu, Lu Zhang

Evaluation on seven RGB-D datasets demonstrates that, even without saliency ground truth for RGB-D datasets and using only the RGB data of RGB-D datasets at inference, our semi-supervised system performs favorably on the two largest testing datasets against state-of-the-art fully-supervised RGB-D saliency detection methods, which use saliency ground truth for RGB-D datasets at training and depth data at inference.

Depth Estimation • Depth Prediction • +1

Class-Conditional Domain Adaptation on Semantic Segmentation

no code implementations • 27 Nov 2019 • Yue Wang, Yuke Li, James H. Elder, Runmin Wu, Huchuan Lu

We address this problem by introducing a Class-Conditional Domain Adaptation method (CCDA).

Semantic Segmentation • Unsupervised Domain Adaptation

Which Way Are You Going? Imitative Decision Learning for Path Forecasting in Dynamic Scenes

no code implementations • CVPR 2019 • Yuke Li

A policy is then generated by taking the sampled latent decision into account to predict the future.
