Search Results for author: Jinsong Zhang

Found 21 papers, 8 papers with code

Prosodic Effects of Speech Unit’s Information Based on GPT-2 and Mutual Information

no code implementations • CCL 2022 • Yun Hao, Yanlu Xie, Binghuai Lin, Jinsong Zhang

SpeechAct: Towards Generating Whole-body Motion from Speech

no code implementations • 29 Nov 2023 • Jinsong Zhang, Minjie Zhu, Yuxiang Zhang, Yebin Liu, Kun Li

Then, we regress the motion representation from the audio signal by a translation model employing our contrastive motion learning method.

High-Quality Animatable Dynamic Garment Reconstruction from Monocular Videos

no code implementations • 2 Nov 2023 • Xiongzheng Li, Jinsong Zhang, Yu-Kun Lai, Jingyu Yang, Kun Li

To alleviate the ambiguity of estimating 3D garments from monocular videos, we design a multi-hypothesis deformation module that learns spatial representations of multiple plausible deformations.

Garment Reconstruction

Towards Grouping in Large Scenes with Occlusion-aware Spatio-temporal Transformers

no code implementations • 30 Oct 2023 • Jinsong Zhang, Lingfeng Gu, Yu-Kun Lai, Xueyang Wang, Kun Li

To explore the potential spatio-temporal relationship, we propose spatio-temporal transformers to simultaneously extract trajectory information and fuse inter-person features in a hierarchical manner.

Narrator: Towards Natural Control of Human-Scene Interaction Generation via Relationship Reasoning

no code implementations • ICCV 2023 • Haibiao Xuan, Xiongzheng Li, Jinsong Zhang, Hongwen Zhang, Yebin Liu, Kun Li

Also, we model global and local spatial relationships in a 3D scene and a textual description respectively based on the scene graph, and introduce a part-level action mechanism to represent interactions as atomic body part states.

Out-of-Distribution Detection based on In-Distribution Data Patterns Memorization with Modern Hopfield Energy

1 code implementation • ICLR 2023 • Jinsong Zhang, Qiang Fu, Xu Chen, Lun Du, Zelin Li, Gang Wang, Xiaoguang Liu, Shi Han, Dongmei Zhang

In more detail, penultimate layer outputs on the training set are considered as the representations of in-distribution (ID) data.

Ranked #11 on Out-of-Distribution Detection on ImageNet-1k vs Places

Computational Efficiency, Memorization, +2

Learning Semantic-Aware Disentangled Representation for Flexible 3D Human Body Editing

no code implementations • CVPR 2023 • Xiaokun Sun, Qiao Feng, Xiongzheng Li, Jinsong Zhang, Yu-Kun Lai, Jingyu Yang, Kun Li

3D human body representation learning has received increasing attention in recent years.

Representation Learning, Style Transfer

DiaASQ : A Benchmark of Conversational Aspect-based Sentiment Quadruple Analysis

1 code implementation • 10 Nov 2022 • Bobo Li, Hao Fei, Fei Li, Yuhan Wu, Jinsong Zhang, Shengqiong Wu, Jingye Li, Yijiang Liu, Lizi Liao, Tat-Seng Chua, Donghong Ji

The rapid development of aspect-based sentiment analysis (ABSA) within recent decades shows great potential for real-world society.

Ranked #1 on Conversational Sentiment Quadruple Extraction on DiaASQ (ZH)

Aspect-Based Sentiment Analysis, Aspect-Based Sentiment Analysis (ABSA), +2

Text-Aware End-to-end Mispronunciation Detection and Diagnosis

1 code implementation • 15 Jun 2022 • Linkai Peng, Yingming Gao, Binghuai Lin, Dengfeng Ke, Yanlu Xie, Jinsong Zhang

In the field of assessing the pronunciation quality of constrained speech, the given transcriptions can play the role of a teacher.

Contrastive Learning

High-Fidelity Human Avatars From a Single RGB Camera

no code implementations • CVPR 2022 • Hao Zhao, Jinsong Zhang, Yu-Kun Lai, Zerong Zheng, Yingdi Xie, Yebin Liu, Kun Li

To cope with the complexity of textures and generate photo-realistic results, we propose a reference-based neural rendering network and exploit a bottom-up sharpening-guided fine-tuning strategy to obtain detailed textures.

Neural Rendering, Vocal Bursts Intensity Prediction

Advancing CTC-CRF Based End-to-End Speech Recognition with Wordpieces and Conformers

1 code implementation • 7 Jul 2021 • Huahuan Zheng, Wenjie Peng, Zhijian Ou, Jinsong Zhang

Automatic speech recognition systems have been largely improved in the past few decades and current systems are mainly hybrid-based and end-to-end-based.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +1

307

Speech Enhancement using Separable Polling Attention and Global Layer Normalization followed with PReLU

no code implementations • 6 May 2021 • Dengfeng Ke, Jinsong Zhang, Yanlu Xie, Yanyan Xu, Binghuai Lin

With all these modifications, the size of the PHASEN model is shrunk from 33M parameters to 5M parameters, while the performance on VoiceBank+DEMAND is improved to a CSIG score of 4.30, a PESQ score of 3.07 and a COVL score of 3.73.

Speech Enhancement

A Full Text-Dependent End to End Mispronunciation Detection and Diagnosis with Easy Data Augmentation Techniques

1 code implementation • 17 Apr 2021 • Kaiqi Fu, Jones Lin, Dengfeng Ke, Yanlu Xie, Jinsong Zhang, Binghuai Lin

Recently, end-to-end mispronunciation detection and diagnosis (MD&D) systems have become a popular alternative that greatly simplifies the model-building process of conventional hybrid DNN-HMM systems by representing complicated modules with a single deep network architecture.

Data Augmentation

PISE: Person Image Synthesis and Editing with Decoupled GAN

1 code implementation • CVPR 2021 • Jinsong Zhang, Kun Li, Yu-Kun Lai, Jingyu Yang

The results of qualitative and quantitative experiments demonstrate the superiority of our model on human pose transfer.

Human Parsing, Pose Transfer

121

Human Pose Transfer by Adaptive Hierarchical Deformation

1 code implementation • 13 Dec 2020 • Jinsong Zhang, Xingzi Liu, Kun Li

Existing methods cannot effectively utilize the input information and often fail to preserve the style and shape of hair and clothes.

Pose Transfer, Semantic Parsing, +1

PoNA: Pose-guided Non-local Attention for Human Pose Transfer

1 code implementation • 13 Dec 2020 • Kun Li, Jinsong Zhang, Yebin Liu, Yu-Kun Lai, Qionghai Dai

In each block, we propose a pose-guided non-local attention (PoNA) mechanism with a long-range dependency scheme to select more important regions of image features to transfer.

Generative Adversarial Network, Person Re-Identification, +1

Adaptive 3D Face Reconstruction from a Single Image

no code implementations • 8 Jul 2020 • Kun Li, Jing Yang, Nianhong Jiao, Jinsong Zhang, Yu-Kun Lai

3D face reconstruction from a single image is a challenging problem, especially under partial occlusions and extreme poses.

3D Face Reconstruction, Pose Estimation

All-Weather Deep Outdoor Lighting Estimation

no code implementations • CVPR 2019 • Jinsong Zhang, Kalyan Sunkavalli, Yannick Hold-Geoffroy, Sunil Hadap, Jonathan Eisenmann, Jean-François Lalonde

We use this network to label a large-scale dataset of LDR panoramas with lighting parameters and use them to train our single image outdoor lighting estimation network.

Lighting Estimation

Deep Photovoltaic Nowcasting

no code implementations • 15 Oct 2018 • Jinsong Zhang, Rodrigo Verschae, Shohei Nobuhara, Jean-François Lalonde

Our experiments reveal that the MLP network, already used similarly in previous work, achieves an RMSE skill score of 7% over the commonly-used persistence baseline on the 1-minute future photovoltaic power prediction task.

Management
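
The "RMSE skill score over the persistence baseline" quoted above can be illustrated with a minimal sketch. It assumes the common forecast-skill definition, 1 - RMSE(model) / RMSE(baseline), and a persistence baseline that predicts each next value as the current one; the listing does not confirm that the paper uses exactly this formula, and the function names below are hypothetical.

```python
import math


def rmse(pred, actual):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))


def persistence_forecast(series):
    """Persistence baseline: the forecast for step t+1 is the value at step t."""
    return series[:-1]


def rmse_skill_score(model_pred, series):
    """Assumed definition: 1 - RMSE(model) / RMSE(persistence).

    model_pred holds one-step-ahead forecasts for series[1:].
    A positive score means the model beats the persistence baseline.
    """
    actual = series[1:]
    baseline = persistence_forecast(series)
    return 1.0 - rmse(model_pred, actual) / rmse(baseline, actual)


# Toy power readings (illustrative values only, not data from the paper).
series = [1.0, 2.0, 3.0, 4.0]
model_pred = [1.5, 2.5, 3.5]
score = rmse_skill_score(model_pred, series)
```

Under this definition, a score of 0.07 would mean the model's RMSE is 7% lower than that of the persistence baseline.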

Learning High Dynamic Range from Outdoor Panoramas

no code implementations • ICCV 2017 • Jinsong Zhang, Jean-François Lalonde

Outdoor lighting has extremely high dynamic range.

Vocal Bursts Intensity Prediction

Phoneme Set Design Using English Speech Database by Japanese for Dialogue-Based English CALL Systems

no code implementations • LREC 2014 • Xiaoyun Wang, Jinsong Zhang, Masafumi Nishida, Seiichi Yamamoto

This paper describes a method of generating a reduced phoneme set for dialogue-based computer-assisted language learning (CALL) systems.

Language Modelling, Speech Recognition, +1
