Search Results for author: Shugong Xu

Found 42 papers, 3 papers with code

SANDWICH: Towards an Offline, Differentiable, Fully-Trainable Wireless Neural Ray-Tracing Surrogate

no code implementations 13 Nov 2024 Yifei Jin, Ali Maatouk, Sarunas Girdzijauskas, Shugong Xu, Leandros Tassiulas, Rex Ying

Wireless ray-tracing (RT) is emerging as a key tool for three-dimensional (3D) wireless channel modeling, driven by advances in graphical rendering.

Decision Making Sequential Decision Making

LinFormer: A Linear-based Lightweight Transformer Architecture For Time-Aware MIMO Channel Prediction

no code implementations 28 Oct 2024 Yanliang Jin, Yifan Wu, Yuan Gao, Shunqing Zhang, Shugong Xu, Cheng-Xiang Wang

The emergence of 6th generation (6G) mobile networks brings new challenges in supporting high-mobility communications, particularly in addressing the issue of channel aging.

Data Augmentation

StyleFusion TTS: Multimodal Style-control and Enhanced Feature Fusion for Zero-shot Text-to-speech Synthesis

no code implementations 24 Sep 2024 Zhiyong Chen, Xinnuo Li, Zhiqi Ai, Shugong Xu

We introduce StyleFusion-TTS, a prompt and/or audio referenced, style and speaker-controllable, zero-shot text-to-speech (TTS) synthesis system designed to enhance the editability and naturalness of current research literature.

Speech Synthesis Text to Speech +1

Enhancing Open-Set Speaker Identification through Rapid Tuning with Speaker Reciprocal Points and Negative Sample

no code implementations 24 Sep 2024 Zhiyong Chen, Zhiqi Ai, Xinnuo Li, Shugong Xu

This paper introduces a novel framework for open-set speaker identification in household environments, playing a crucial role in facilitating seamless human-computer interactions.

Speaker Identification Speaker Recognition

A Learnable Color Correction Matrix for RAW Reconstruction

no code implementations 4 Sep 2024 Anqi Liu, Shiyi Mu, Shugong Xu

Autonomous driving algorithms usually employ sRGB images as model input due to their compatibility with the human visual system.

Autonomous Driving Raw reconstruction

TLD: A Vehicle Tail Light signal Dataset and Benchmark

no code implementations 4 Sep 2024 Jinhao Chai, Shiyi Mu, Shugong Xu

To our knowledge, TLD is the first dataset to separately annotate brake lights and turn signals in real driving scenarios.

Autonomous Driving

MM-KWS: Multi-modal Prompts for Multilingual User-defined Keyword Spotting

1 code implementation 11 Jun 2024 Zhiqi Ai, Zhiyong Chen, Shugong Xu

In this paper, we propose MM-KWS, a novel approach to user-defined keyword spotting leveraging multi-modal enrollments of text and speech templates.

Data Augmentation Keyword Spotting

High-Performance Transmission Mechanism Design of Multi-Stream Carrier Aggregation for 5G Non-Standalone Network

no code implementations 21 Aug 2022 Jun Yu, Shunqing Zhang, Jiayun Sun, Shugong Xu, Shan Cao

Multi-stream carrier aggregation is a key technology to expand bandwidth and improve the throughput of the fifth-generation wireless communication systems.

Prediction-based Hybrid Slicing Framework for Service Level Agreement Guarantee in Mobility Scenarios: A Deep Learning Approach

no code implementations 6 Aug 2022 Heng Zhang, Guangjin Pan, Shugong Xu, Shunqing Zhang, Zhiyuan Jiang

In the proposal, LSTM networks are employed to predict the traffic demand and the location of each user at the slicing-window level.

Joint Optimization of DNN Inference Delay and Energy under Accuracy Constraints for AR Applications

no code implementations 3 Aug 2022 Guangjin Pan, Heng Zhang, Shugong Xu, Shunqing Zhang, Xiaojing Chen

The high computational complexity and high energy consumption of artificial intelligence (AI) algorithms hinder their application in augmented reality (AR) systems.

Edge-computing Scheduling

TRIG: Transformer-Based Text Recognizer with Initial Embedding Guidance

no code implementations 16 Nov 2021 Yue Tao, Zhiwei Jia, Runze Ma, Shugong Xu

We propose a 1-D split to address the challenges of complexity and replace the CNN with the transformer encoder to reduce the need for a context modeling module.

Decoder Inductive Bias +1

Multi-Task and Multi-Modal Learning for RGB Dynamic Gesture Recognition

no code implementations 29 Oct 2021 Dinghao Fan, Hengjie Lu, Shugong Xu, Shan Cao

Our framework is trained to learn a representation for multi-task learning: gesture segmentation and gesture recognition.

Decoder Gesture Recognition +2

HENet: Forcing a Network to Think More for Font Recognition

1 code implementation 21 Oct 2021 Jingchao Chen, Shiyi Mu, Shugong Xu, Youdong Ding

Although much progress has been made in Text Recognition/OCR in recent years, the task of font recognition remains challenging.

Ranked #1 on Font Recognition on Explor_all (Top 1 Accuracy metric)

Font Recognition Optical Character Recognition (OCR)

Deep Image Matting with Flexible Guidance Input

1 code implementation 21 Oct 2021 Hang Cheng, Shugong Xu, Xiufeng Jiang, Rongrong Wang

In this paper, we propose a matting method that uses Flexible Guidance Input as a user hint, which means our method can use a trimap, scribblemap or clickmap as guidance information, or even work without any guidance input.

Image Matting

IFR: Iterative Fusion Based Recognizer For Low Quality Scene Text Recognition

no code implementations 13 Aug 2021 Zhiwei Jia, Shugong Xu, Shiyi Mu, Yue Tao, Shan Cao, Zhiyong Chen

In this paper, we propose an Iterative Fusion based Recognizer (IFR) for low quality scene text recognition, taking advantage of refined text images input and robust feature representation.

Image Restoration Scene Text Recognition

RW-Resnet: A Novel Speech Anti-Spoofing Model Using Raw Waveform

no code implementations 12 Aug 2021 Youxuan Ma, Zongze Ren, Shugong Xu

In recent years, synthetic speech generated by advanced text-to-speech (TTS) and voice conversion (VC) systems has caused great harm to automatic speaker verification (ASV) systems, urging us to design a synthetic speech detection system to protect ASV systems.

Speaker Verification Synthetic Speech Detection +2

SGTBN: Generating Dense Depth Maps from Single-Line LiDAR

no code implementations 24 Jun 2021 Hengjie Lu, Shugong Xu, Shan Cao

Therefore, we propose a method to tackle the problem of single-line depth completion, in which we aim to generate a dense depth map from the single-line LiDAR info and the aligned RGB image.

3D geometry Depth Completion +1

A Novel GCN based Indoor Localization System with Multiple Access Points

no code implementations 21 Apr 2021 Yanzan Sun, Qinggang Xie, Guangjin Pan, Shunqing Zhang, Shugong Xu

With the rapid development of indoor location-based services (LBSs), the demand for accurate localization keeps growing as well.

Indoor Localization

Arbitrary-Shaped Text Detection with Adaptive Text Region Representation

no code implementations 1 Apr 2021 Xiufeng Jiang, Shugong Xu, Shunqing Zhang, Shan Cao

In this paper, we propose a novel text region representation method, with a robust pipeline, which can precisely detect dense adjacent text instances with arbitrary shapes.

Text Detection

Tracking Based Semi-Automatic Annotation for Scene Text Videos

no code implementations 29 Mar 2021 Jiajun Zhu, Xiufeng Jiang, Zhiwei Jia, Shugong Xu, Shan Cao

Moreover, a paired low-quality scene text video dataset named Text-RBL is proposed, consisting of raw videos, blurry videos, and low-resolution videos, labeled by the proposed convenient semi-automatic labeling strategy.

Scene Text Detection text annotation +1

A Dataset and Benchmark Towards Multi-Modal Face Anti-Spoofing Under Surveillance Scenarios

no code implementations 29 Mar 2021 Xudong Chen, Shugong Xu, Qiaobin Ji, Shan Cao

In addition, we propose an Attention-based Face Anti-spoofing network with Feature Augment (AFA) to address FAS on low-quality face images.

Face Anti-Spoofing

Self-Calibrating Indoor Localization with Crowdsourcing Fingerprints and Transfer Learning

no code implementations 26 Jan 2021 Chenlu Xiang, Shunqing Zhang, Shugong Xu, George C. Alexandropoulos

Precise indoor localization is one of the key requirements for fifth Generation (5G) and beyond, concerning various wireless communication systems, whose applications span different vertical sectors.

Indoor Localization Transfer Learning

SIRI: Spatial Relation Induced Network For Spatial Description Resolution

no code implementations NeurIPS 2020 Peiyao Wang, Weixin Luo, Yanyu Xu, Haojie Li, Shugong Xu, Jianyu Yang, Shenghua Gao

Spatial Description Resolution, as a language-guided localization task, is proposed for target location in a panoramic street view, given corresponding language descriptions.

Relation

Energy-Efficient NOMA Multicasting System for 5G Cellular V2X Communications with Imperfect CSI

no code implementations 8 Sep 2020 Asim Ihsan, Wen Chen, Shunqing Zhang, Shugong Xu

The proposed system multicasts the information through low-complexity optimal power allocation algorithms, subject to the channel outage probability constraint of vehicles with imperfect CSI, the QoS constraints of vehicles, and the transmit power limits of RSUs.

Finding Action Tubes with a Sparse-to-Dense Framework

no code implementations 30 Aug 2020 Yuxi Li, Weiyao Lin, Tao Wang, John See, Rui Qian, Ning Xu, Li-Min Wang, Shugong Xu

The task of spatial-temporal action detection has attracted increasing attention among researchers.

Ranked #3 on Action Detection on UCF Sports (Video-mAP 0.2 metric)

Action Detection

CFAD: Coarse-to-Fine Action Detector for Spatiotemporal Action Localization

no code implementations ECCV 2020 Yuxi Li, Weiyao Lin, John See, Ning Xu, Shugong Xu, Ke Yan, Cong Yang

Most current pipelines for spatio-temporal action localization connect frame-wise or clip-wise detection results to generate action proposals, where only local information is exploited and the efficiency is hindered by dense per-frame localization.

Action Detection Spatio-Temporal Action Localization +1

High Accurate Time-of-Arrival Estimation with Fine-Grained Feature Generation for Internet-of-Things Applications

no code implementations 18 Aug 2020 Guangjin Pan, Tao Wang, Shunqing Zhang, Shugong Xu

Conventional schemes often require extra reference signals or more complicated algorithms to improve the time-of-arrival (TOA) estimation accuracy.

Cooling-Aware Resource Allocation and Load Management for Mobile Edge Computing Systems

no code implementations 19 Jun 2020 Xiaojing Chen, Zhouyu Lu, Wei Ni, Xin Wang, Feng Wang, Shunqing Zhang, Shugong Xu

Driven by explosive computation demands of Internet of Things (IoT), mobile edge computing (MEC) provides a promising technique to enhance the computation capability for mobile users.

Edge-computing Management

Age of Information Optimized MAC in V2X Sidelink via Piggyback-Based Collaboration

no code implementations 24 Feb 2020 Fei Peng, Zhiyuan Jiang, Shunqing Zhang, Shugong Xu

Real-time status update in future vehicular networks is vital to enable control-level cooperative autonomous driving.

Information Theory Networking and Internet Architecture

Deep Reinforcement Learning-Based Beam Tracking for Low-Latency Services in Vehicular Networks

no code implementations 13 Feb 2020 Yan Liu, Zhiyuan Jiang, Shunqing Zhang, Shugong Xu

Ultra-Reliable and Low-Latency Communications (URLLC) services in vehicular networks on millimeter-wave bands present a significant challenge, considering the necessity of constantly adjusting the beam directions.

Deep Reinforcement Learning reinforcement-learning +1

A Study on Angular Based Embedding Learning for Text-independent Speaker Verification

no code implementations 12 Aug 2019 Zhiyong Chen, Zongze Ren, Shugong Xu

Learning a good speaker embedding is important for many automatic speaker recognition tasks, including verification, identification and diarization.

Speaker Recognition Text-Independent Speaker Verification

Two-stage Training for Chinese Dialect Recognition

no code implementations 6 Aug 2019 Zongze Ren, Guofu Yang, Shugong Xu

In this paper, we present a two-stage language identification (LID) system based on a shallow ResNet14 followed by a simple 2-layer recurrent neural network (RNN) architecture, which was used for Xunfei (iFlyTek) Chinese Dialect Recognition Challenge and won the first place among 110 teams.

Language Identification Vocal Bursts Valence Prediction

Triplet Based Embedding Distance and Similarity Learning for Text-independent Speaker Verification

no code implementations 6 Aug 2019 Zongze Ren, Zhiyong Chen, Shugong Xu

The improvements are both based on triplets, because the training stage and the evaluation stage of the baseline x-vector system focus on different aims.

Speaker Recognition Text-Independent Speaker Verification +1

Passive TCP Identification for Wired and Wireless Networks: A Long-Short Term Memory Approach

no code implementations 9 Apr 2019 Xiaoyu Chen, Shugong Xu, Xudong Chen, Shan Cao, Shunqing Zhang, Yanzan Sun

TCP congestion control algorithm identification (TCP identification) can be used to significantly improve network efficiency.

BIG-bench Machine Learning

How many labeled license plates are needed?

no code implementations 25 Aug 2018 Changhao Wu, Shugong Xu, Guocong Song, Shunqing Zhang

As a large amount of labeled data is typically difficult to collect and even more difficult to annotate, data augmentation and data generation are widely used in the process of training deep neural networks.

Data Augmentation License Plate Recognition

Monocular Depth Estimation with Augmented Ordinal Depth Relationships

no code implementations 2 Jun 2018 Yuanzhouhan Cao, Tianqi Zhao, Ke Xian, Chunhua Shen, Zhiguo Cao, Shugong Xu

In this paper, we propose to improve the performance of metric depth estimation with relative depths collected from stereo movie videos using an existing stereo matching algorithm.

Depth Prediction Monocular Depth Estimation +2
