Search Results for author: Yuxiang Zhang

Found 54 papers, 15 papers with code

Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives

1 code implementation 15 Mar 2024 Ronghui Li, Yuxiang Zhang, Yachao Zhang, Hongwen Zhang, Jie Guo, Yan Zhang, Yebin Liu, Xiu Li

In contrast, the second stage is the local diffusion, which generates detailed motion sequences in parallel under the guidance of the dance primitives and choreographic rules.

Motion Synthesis

Revisiting Edge Perturbation for Graph Neural Network in Graph Data Augmentation and Attack

no code implementations 10 Mar 2024 Xin Liu, Yuxiang Zhang, Meng Wu, Mingyu Yan, Kun He, Wei Yan, Shirui Pan, Xiaochun Ye, Dongrui Fan

It can be categorized into two veins based on their effects on the performance of graph neural networks (GNNs), i.e., graph data augmentation and attack.

Data Augmentation

OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems

1 code implementation 21 Feb 2024 Chaoqun He, Renjie Luo, Yuzhuo Bai, Shengding Hu, Zhen Leng Thai, Junhao Shen, Jinyi Hu, Xu Han, Yujie Huang, Yuxiang Zhang, Jie Liu, Lei Qi, Zhiyuan Liu, Maosong Sun

Notably, the best-performing model, GPT-4V, attains an average score of 17.23% on OlympiadBench, with a mere 11.28% in physics, highlighting the benchmark rigor and the intricacy of physical reasoning.

Logical Fallacies

A Saliency Enhanced Feature Fusion based multiscale RGB-D Salient Object Detection Network

no code implementations 22 Jan 2024 Rui Huang, Qingyi Zhao, Yan Xing, Sihua Gao, Weifeng Xu, Yuxiang Zhang, Wei Fan

SEFF utilizes saliency maps of the neighboring scales to enhance the necessary features for fusing, resulting in more representative fused features.

object-detection RGB-D Salient Object Detection +2

TACO: Benchmarking Generalizable Bimanual Tool-ACtion-Object Understanding

no code implementations 16 Jan 2024 Yun Liu, Haolin Yang, Xu Si, Ling Liu, Zipeng Li, Yuxiang Zhang, Yebin Liu, Li Yi

Humans commonly work with multiple objects in daily life and can intuitively transfer manipulation skills to novel objects by understanding object functional regularities.

Action Recognition Benchmarking +2

Towards 6G Digital Twin Channel Using Radio Environment Knowledge Pool

no code implementations 16 Dec 2023 Jialin Wang, Jianhua Zhang, Yuxiang Zhang, Yutong Sun, Gaofeng Nie, Lianzheng Shi, Ping Zhang, Guangyi Liu

The digital twin channel (DTC) is crucial for 6G wireless autonomous networks as it replicates the wireless channel fading states in 6G air interface transmissions.

Ins-HOI: Instance Aware Human-Object Interactions Recovery

1 code implementation 15 Dec 2023 Jiajun Zhang, Yuxiang Zhang, Hongwen Zhang, Xiao Zhou, Boyao Zhou, Ruizhi Shao, Zonghai Hu, Yebin Liu

To address this, we further propose a complementary training strategy that leverages synthetic data to introduce instance-level shape priors, enabling the disentanglement of occupancy fields for different instances.

Descriptive Disentanglement +3

Adapting Vision Transformer for Efficient Change Detection

no code implementations 8 Dec 2023 Yang Zhao, Yuxiang Zhang, Yanni Dong, Bo Du

Most change detection models based on vision transformers currently follow a "pretraining then fine-tuning" strategy.

Change Detection

C-NERF: Representing Scene Changes as Directional Consistency Difference-based NeRF

1 code implementation 5 Dec 2023 Rui Huang, Binbin Jiang, Qingyi Zhao, William Wang, Yuxiang Zhang, Qing Guo

Our approach surpasses state-of-the-art 2D change detection and NeRF-based methods by a significant margin.

Change Detection

GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians

1 code implementation 4 Dec 2023 Liangxiao Hu, Hongwen Zhang, Yuxiang Zhang, Boyao Zhou, Boning Liu, Shengping Zhang, Liqiang Nie

We present GaussianAvatar, an efficient approach to creating realistic human avatars with dynamic 3D appearances from a single video.

Motion Estimation

SpeechAct: Towards Generating Whole-body Motion from Speech

no code implementations 29 Nov 2023 Jinsong Zhang, Minjie Zhu, Yuxiang Zhang, Yebin Liu, Kun Li

Then, we regress the motion representation from the audio signal by a translation model employing our contrastive motion learning method.

How to Extend 3D GBSM to Integrated Sensing and Communication Channel with Sharing Feature?

no code implementations 25 Oct 2023 Yameng Liu, Jianhua Zhang, Yuxiang Zhang, Huiwen Gong, Tao Jiang, Guangyi Liu

The proposed approach can be summarized as follows: Firstly, an ISAC channel model is proposed, where shared and non-shared components are superimposed for both communication and sensing.

EALM: Introducing Multidimensional Ethical Alignment in Conversational Information Retrieval

1 code implementation 2 Oct 2023 Yiyao Yu, Junjie Wang, Yuxiang Zhang, Lin Zhang, Yujiu Yang, Tetsuya Sakai

Artificial intelligence (AI) technologies should adhere to human norms to better serve our society and avoid disseminating harmful or misleading information, particularly in Conversational Information Retrieval (CIR).

Ethics Information Retrieval +1

Synthetic Speech Detection Based on Temporal Consistency and Distribution of Speaker Features

no code implementations 29 Sep 2023 Yuxiang Zhang, Zhuo Li, Jingze Lu, Wenchao Wang, Pengyuan Zhang

Based on these analyses, an SSD method based on temporal consistency and distribution of speaker features is proposed.

Synthetic Speech Detection

The Impact of Silence on Speech Anti-Spoofing

no code implementations 21 Sep 2023 Yuxiang Zhang, Zhuo Li, Jingze Lu, Hua Hua, Wenchao Wang, Pengyuan Zhang

First, the reasons for the impact are explored, including the proportion of silence duration and the content of silence.

Action Detection Activity Detection +1

Improving Short Utterance Anti-Spoofing with AASIST2

no code implementations 15 Sep 2023 Yuxiang Zhang, Jingze Lu, Zengqiang Shang, Wenchao Wang, Pengyuan Zhang

The modified Res2Net blocks can extract multi-scale features and improve the detection performance for speech of different durations, thus improving the short utterance evaluation performance.

Graph Attention Speaker Verification

You talk what you read: Understanding News Comment Behavior by Dispositional and Situational Attribution

no code implementations 4 Aug 2023 Yuhang Wang, Yuxiang Zhang, Dongyuan Lu, Jitao Sang

Many news comment mining studies are based on the assumption that comment is explicitly linked to the corresponding news.

News Summarization Position

ProxyCap: Real-time Monocular Full-body Capture in World Space via Human-Centric Proxy-to-Motion Learning

no code implementations 3 Jul 2023 Yuxiang Zhang, Hongwen Zhang, Liangxiao Hu, Jiajun Zhang, Hongwei Yi, Shengping Zhang, Yebin Liu

For more accurate and physically plausible predictions in world space, our network is designed to learn human motions from a human-centric perspective, which enables the understanding of the same motion captured with different camera trajectories.

3D Human Pose Estimation

Channel Measurement, Modeling, and Simulation for 6G: A Survey and Tutorial

no code implementations 26 May 2023 Jianhua Zhang, Jiaxin Lin, Pan Tang, Yuxiang Zhang, Huixin Xu, Tianyang Gao, Haiyang Miao, Zeyong Chai, Zhengfu Zhou, Yi Li, Huiwen Gong, Yameng Liu, Zhiqiang Yuan, Lei Tian, Shaoshi Yang, Liang Xia, Guangyi Liu, Ping Zhang

Then, a survey of the progress of the 6G channel research regarding the above five promising technologies is presented in terms of the latest measurement campaigns, new characteristics, modeling methods, and research prospects.

UniEX: An Effective and Efficient Framework for Unified Information Extraction via a Span-extractive Perspective

no code implementations 17 May 2023 Ping Yang, Junyu Lu, Ruyi Gan, Junjie Wang, Yuxiang Zhang, Jiaxing Zhang, Pingjian Zhang

We propose a new paradigm for universal information extraction (IE) that is compatible with any schema format and applicable to a list of IE tasks, such as named entity recognition, relation extraction, event extraction and sentiment analysis.

Event Extraction named-entity-recognition +3

NER-to-MRC: Named-Entity Recognition Completely Solving as Machine Reading Comprehension

no code implementations 6 May 2023 Yuxiang Zhang, Junjie Wang, Xinyu Zhu, Tetsuya Sakai, Hayato Yamana

Named-entity recognition (NER) detects texts with predefined semantic labels and is an essential building block for natural language processing (NLP).

Machine Reading Comprehension named-entity-recognition +2

StyleAvatar: Real-time Photo-realistic Portrait Avatar from a Single Video

1 code implementation 1 May 2023 Lizhen Wang, Xiaochen Zhao, Jingxiang Sun, Yuxiang Zhang, Hongwen Zhang, Tao Yu, Yebin Liu

Results and experiments demonstrate the superiority of our method in terms of image quality, full portrait video generation, and real-time re-animation compared to existing facial reenactment methods.

Face Reenactment Translation +1

CloSET: Modeling Clothed Humans on Continuous Surface with Explicit Template Decomposition

no code implementations CVPR 2023 Hongwen Zhang, Siyou Lin, Ruizhi Shao, Yuxiang Zhang, Zerong Zheng, Han Huang, Yandong Guo, Yebin Liu

In this way, the clothing deformations are disentangled such that the pose-dependent wrinkles can be better learned and applied to unseen poses.

OASIS: Automated Assessment of Urban Pedestrian Paths at Scale

no code implementations 4 Mar 2023 Yuxiang Zhang, Suresh Devalapalli, Sachin Mehta, Anat Caspi

The inspection of the Public Right of Way (PROW) for accessibility barriers is necessary for monitoring and maintaining the built environment for communities' walkability, rollability, safety, active transportation, and sustainability.

APE: An Open and Shared Annotated Dataset for Learning Urban Pedestrian Path Networks

no code implementations 4 Mar 2023 Yuxiang Zhang, Nicholas Bolten, Sachin Mehta, Anat Caspi

The process features the use of a multi-input segmentation network trained on our dataset to predict important classes in the pedestrian environment and then generate a connected pedestrian path network.

Autonomous Driving

How to Extend 3D GBSM to RIS Cascade Channel with Non-ideal Phase Modulation?

no code implementations 15 Feb 2023 Huiwen Gong, Jianhua Zhang, Yuxiang Zhang, Zhengfu Zhou, Guangyi Liu

In the modeling process, we consider the non-ideal phase modulation of the RIS element, so as to accurately characterize the dependence of its phase modulation on the incoming wave angle.

Capacity Analysis of Holographic MIMO Channels with Practical Constraints

no code implementations 29 Dec 2022 Yuan Zhang, Jianhua Zhang, Yuxiang Zhang, Yuan YAO, Guangyi Liu

However, the channel might not satisfy isotropic scattering because of generalized angle distributions, and the antenna gain is limited by the array aperture in reality.

SrTR: Self-reasoning Transformer with Visual-linguistic Knowledge for Scene Graph Generation

no code implementations 19 Dec 2022 Yuxiang Zhang, Zhenbo Liu, Shuai Wang

The execution efficiency of one-stage scene graph generation approaches is quite high; they infer the effective relations between entity pairs using sparse proposal sets and a few queries.

Graph Generation Object +3

Background-Mixed Augmentation for Weakly Supervised Change Detection

1 code implementation 21 Nov 2022 Rui Huang, Ruofei Wang, Qing Guo, Jieda Wei, Yuxiang Zhang, Wei Fan, Yang Liu

Change detection (CD) is to decouple object changes (i.e., object missing or appearing) from background changes (i.e., environment variations) like light and season variations in two images captured in the same scene over a long time span, presenting critical applications in disaster management, urban development, etc.

Change Detection Data Augmentation +1

A Shared Cluster-based Stochastic Channel Model for Joint Communication and Sensing Systems

no code implementations 12 Nov 2022 Yameng Liu, Jianhua Zhang, Yuxiang Zhang, Zhiqiang Yuan, Guangyi Liu

Then, a stochastic JCAS channel model is proposed to capture the sharing feature, where shared and non-shared clusters by the two channels are defined and superimposed.

Solving Math Word Problems via Cooperative Reasoning induced Language Models

1 code implementation 28 Oct 2022 Xinyu Zhu, Junjie Wang, Lin Zhang, Yuxiang Zhang, Ruyi Gan, Jiaxing Zhang, Yujiu Yang

This inspires us to develop a cooperative reasoning-induced PLM for solving MWPs, called Cooperative Reasoning (CoRe), resulting in a human-like reasoning architecture with system 1 as the generator and system 2 as the verifier.

Arithmetic Reasoning Math

Deepfake Detection System for the ADD Challenge Track 3.2 Based on Score Fusion

no code implementations 13 Oct 2022 Yuxiang Zhang, Jingze Lu, Xingming Wang, Zhuo Li, Runqiu Xiao, Wenchao Wang, Ming Li, Pengyuan Zhang

The overfitting of the model to the training set leads to extreme values of the scores and low correlation of the score distributions, which makes score fusion difficult.

Data Augmentation DeepFake Detection +1

Language-aware Domain Generalization Network for Cross-Scene Hyperspectral Image Classification

no code implementations 6 Sep 2022 Yuxiang Zhang, Mengmeng Zhang, Wei Li, Shuai Wang, Ran Tao

Text information including extensive prior knowledge about land cover classes has been ignored in hyperspectral image classification (HSI) tasks.

Contrastive Learning Domain Generalization +1

Towards No.1 in CLUE Semantic Matching Challenge: Pre-trained Language Model Erlangshen with Propensity-Corrected Loss

1 code implementation 5 Aug 2022 Junjie Wang, Yuxiang Zhang, Ping Yang, Ruyi Gan

This report describes a pre-trained language model Erlangshen with propensity-corrected loss, the No. 1 in CLUE Semantic Matching Challenge.

Language Modelling Masked Language Modeling

IDET: Iterative Difference-Enhanced Transformers for High-Quality Change Detection

1 code implementation 15 Jul 2022 Qing Guo, Ruofei Wang, Rui Huang, Shuifa Sun, Yuxiang Zhang

Change detection (CD) aims to detect change regions within an image pair captured at different times, playing a significant role in diverse real-world applications.

Change Detection Vocal Bursts Intensity Prediction

PyMAF-X: Towards Well-aligned Full-body Model Regression from Monocular Images

1 code implementation 13 Jul 2022 Hongwen Zhang, Yating Tian, Yuxiang Zhang, Mengcheng Li, Liang An, Zhenan Sun, Yebin Liu

To address these issues, we propose a Pyramidal Mesh Alignment Feedback (PyMAF) loop in our regression network for well-aligned human mesh recovery and extend it as PyMAF-X for the recovery of expressive full-body models.

Ranked #6 on 3D Human Pose Estimation on AGORA (using extra training data)

3D human pose and shape estimation Human Mesh Recovery +2

SASV Based on Pre-trained ASV System and Integrated Scoring Module

no code implementations 1 Jul 2022 Yuxiang Zhang, Zhuo Li, Wenchao Wang, Pengyuan Zhang

Based on the assumption that there is a correlation between anti-spoofing and speaker verification, a Total-Divide-Total integrated Spoofing-Aware Speaker Verification (SASV) system based on pre-trained automatic speaker verification (ASV) system and integrated scoring module is proposed and submitted to the SASV 2022 Challenge.

Speaker Verification

Adversarial Training-Aided Time-Varying Channel Prediction for TDD/FDD Systems

no code implementations 25 Apr 2022 Zhen Zhang, Yuxiang Zhang, Jianhua Zhang, Feifei Gao

In this paper, a time-varying channel prediction method based on conditional generative adversarial network (CPcGAN) is proposed for time division duplexing/frequency division duplexing (TDD/FDD) systems.

Generative Adversarial Network

Lightweight Multi-person Total Motion Capture Using Sparse Multi-view Cameras

no code implementations ICCV 2021 Yuxiang Zhang, Zhe Li, Liang An, Mengcheng Li, Tao Yu, Yebin Liu

Overall, we propose the first lightweight total capture system and achieve fast, robust and accurate multi-person total motion capture performance.

3D Multi-Person Pose Estimation

Rethinking Semantic Segmentation Evaluation for Explainability and Model Selection

no code implementations 21 Jan 2021 Yuxiang Zhang, Sachin Mehta, Anat Caspi

Semantic segmentation is a prerequisite for this task since it maps contiguous regions of the same class as single entities.

Autonomous Navigation Model Selection +3

Incorporating Linguistic Constraints into Keyphrase Generation

no code implementations ACL 2019 Jing Zhao, Yuxiang Zhang

Keyphrases, which concisely describe the high-level topics discussed in a document, are very useful for a wide range of natural language processing tasks.

Keyphrase Generation Multi-Task Learning

Training Bit Fully Convolutional Network for Fast Semantic Segmentation

no code implementations 1 Dec 2016 He Wen, Shuchang Zhou, Zhe Liang, Yuxiang Zhang, Dieqiao Feng, Xinyu Zhou, Cong Yao

Fully convolutional neural networks give accurate, per-pixel prediction for input images and have applications like semantic segmentation.

Segmentation Semantic Segmentation

An Open Source Testing Tool for Evaluating Handwriting Input Methods

no code implementations 30 May 2015 Liquan Qiu, Lianwen Jin, Ruifen Dai, Yuxiang Zhang, Lei LI

This paper presents an open source tool for testing the recognition accuracy of Chinese handwriting input methods.

Handwriting Recognition
