Search Results for author: Kun Zhou

Found 119 papers, 58 papers with code

A Survey of Large Language Models

4 code implementations 31 Mar 2023 Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Xiaolei Wang, Yupeng Hou, Yingqian Min, Beichen Zhang, Junjie Zhang, Zican Dong, Yifan Du, Chen Yang, Yushuo Chen, Zhipeng Chen, Jinhao Jiang, Ruiyang Ren, YiFan Li, Xinyu Tang, Zikang Liu, Peiyu Liu, Jian-Yun Nie, Ji-Rong Wen

To discriminate the difference in parameter scale, the research community has coined the term large language models (LLM) for the PLMs of significant size.

Language Modelling

Towards High-Fidelity 3D Face Reconstruction from In-the-Wild Images Using Graph Convolutional Networks

3 code implementations CVPR 2020 Jiangke Lin, Yi Yuan, Tianjia Shao, Kun Zhou

In this paper, we introduce a method to reconstruct 3D facial shapes with high-fidelity textures from single-view images in-the-wild, without the need to capture a large-scale face texture database.

3D Face Reconstruction

StructGPT: A General Framework for Large Language Model to Reason over Structured Data

1 code implementation 16 May 2023 Jinhao Jiang, Kun Zhou, Zican Dong, Keming Ye, Wayne Xin Zhao, Ji-Rong Wen

Specifically, we propose an invoking-linearization-generation procedure to support LLMs in reasoning on the structured data with the help of the external interfaces.

Language Modelling Large Language Model +1

Seen and Unseen emotional style transfer for voice conversion with a new emotional speech dataset

2 code implementations 28 Oct 2020 Kun Zhou, Berrak Sisman, Rui Liu, Haizhou Li

Emotional voice conversion aims to transform emotional prosody in speech while preserving the linguistic content and speaker identity.

Generative Adversarial Network Speech Emotion Recognition +2

Emotional Voice Conversion: Theory, Databases and ESD

1 code implementation 31 May 2021 Kun Zhou, Berrak Sisman, Rui Liu, Haizhou Li

In this paper, we first provide a review of the state-of-the-art emotional voice conversion research, and the existing emotional speech databases.

Voice Conversion

S^3-Rec: Self-Supervised Learning for Sequential Recommendation with Mutual Information Maximization

2 code implementations 18 Aug 2020 Kun Zhou, Hui Wang, Wayne Xin Zhao, Yutao Zhu, Sirui Wang, Fuzheng Zhang, Zhongyuan Wang, Ji-Rong Wen

To tackle this problem, we propose the model S^3-Rec, which stands for Self-Supervised learning for Sequential Recommendation, based on the self-attentive neural architecture.

Attribute Self-Supervised Learning +1

Best-Buddy GANs for Highly Detailed Image Super-Resolution

2 code implementations 29 Mar 2021 Wenbo Li, Kun Zhou, Lu Qi, Liying Lu, Nianjuan Jiang, Jiangbo Lu, Jiaya Jia

We consider the single image super-resolution (SISR) problem, where a high-resolution (HR) image is generated based on a low-resolution (LR) input.

Image Super-Resolution

LAPAR: Linearly-Assembled Pixel-Adaptive Regression Network for Single Image Super-Resolution and Beyond

2 code implementations NeurIPS 2020 Wenbo Li, Kun Zhou, Lu Qi, Nianjuan Jiang, Jiangbo Lu, Jiaya Jia

Single image super-resolution (SISR) deals with a fundamental problem of upsampling a low-resolution (LR) image to its high-resolution (HR) version.

Image Deblocking Image Denoising +2

Evaluating Object Hallucination in Large Vision-Language Models

2 code implementations 17 May 2023 YiFan Li, Yifan Du, Kun Zhou, Jinpeng Wang, Wayne Xin Zhao, Ji-Rong Wen

Despite the promising progress on LVLMs, we find that LVLMs suffer from the hallucination problem, i.e., they tend to generate objects that are inconsistent with the target images in the descriptions.

Hallucination Object

EmbedMask: Embedding Coupling for One-stage Instance Segmentation

3 code implementations 4 Dec 2019 Hui Ying, Zhaojin Huang, Shu Liu, Tianjia Shao, Kun Zhou

The pixel-level clustering enables EmbedMask to generate high-resolution masks without missing details from repooling, and the existence of proposal embedding simplifies and strengthens the clustering procedure to achieve high speed with higher performance than segmentation-based methods.

Clustering Instance Segmentation +2

Transforming Spectrum and Prosody for Emotional Voice Conversion with Non-Parallel Training Data

1 code implementation 1 Feb 2020 Kun Zhou, Berrak Sisman, Haizhou Li

Many studies require parallel speech data between different emotional patterns, which is not practical in real life.

Voice Conversion

Predicting Loose-Fitting Garment Deformations Using Bone-Driven Motion Networks

1 code implementation 3 May 2022 Xiaoyu Pan, Jiaming Mai, Xinwei Jiang, Dongxue Tang, Jingxiang Li, Tianjia Shao, Kun Zhou, Xiaogang Jin, Dinesh Manocha

We present a learning algorithm that uses bone-driven motion networks to predict the deformation of loose-fitting garment meshes at interactive rates.

SimANS: Simple Ambiguous Negatives Sampling for Dense Text Retrieval

1 code implementation 21 Oct 2022 Kun Zhou, Yeyun Gong, Xiao Liu, Wayne Xin Zhao, Yelong Shen, Anlei Dong, Jingwen Lu, Rangan Majumder, Ji-Rong Wen, Nan Duan, Weizhu Chen

Thus, we propose a simple ambiguous negatives sampling method, SimANS, which incorporates a new sampling probability distribution to sample more ambiguous negatives.

Retrieval Text Retrieval

MASTER: Multi-task Pre-trained Bottlenecked Masked Autoencoders are Better Dense Retrievers

1 code implementation 15 Dec 2022 Kun Zhou, Xiao Liu, Yeyun Gong, Wayne Xin Zhao, Daxin Jiang, Nan Duan, Ji-Rong Wen

Pre-trained Transformers (e.g., BERT) have been commonly used in existing dense retrieval methods for parameter initialization, and recent studies are exploring more effective pre-training tasks for further improving the quality of dense vectors.

Passage Retrieval Retrieval

Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-stage Sequence-to-Sequence Training

2 code implementations 31 Mar 2021 Kun Zhou, Berrak Sisman, Haizhou Li

In stage 2, we perform emotion training with a limited amount of emotional speech data, to learn how to disentangle emotional style and linguistic information from the speech.

Voice Conversion

Filter-enhanced MLP is All You Need for Sequential Recommendation

2 code implementations 28 Feb 2022 Kun Zhou, Hui Yu, Wayne Xin Zhao, Ji-Rong Wen

Recently, deep neural networks such as RNN, CNN and Transformer have been applied in the task of sequential recommendation, which aims to capture the dynamic preference characteristics from logged user behavior data for accurate recommendation.

Sequential Recommendation

From NeRFLiX to NeRFLiX++: A General NeRF-Agnostic Restorer Paradigm

1 code implementation 10 Jun 2023 Kun Zhou, Wenbo Li, Nianjuan Jiang, Xiaoguang Han, Jiangbo Lu

To address this, we propose NeRFLiX, a general NeRF-agnostic restorer paradigm that learns a degradation-driven inter-viewpoint mixer.

Computational Efficiency Novel View Synthesis

Converting Anyone's Emotion: Towards Speaker-Independent Emotional Voice Conversion

1 code implementation 13 May 2020 Kun Zhou, Berrak Sisman, Mingyang Zhang, Haizhou Li

We consider that there is a common code between speakers for emotional expression in a spoken language; therefore, a speaker-independent mapping between emotional states is possible.

Voice Conversion

Towards Unified Conversational Recommender Systems via Knowledge-Enhanced Prompt Learning

1 code implementation 19 Jun 2022 Xiaolei Wang, Kun Zhou, Ji-Rong Wen, Wayne Xin Zhao

Our approach unifies the recommendation and conversation subtasks into the prompt learning paradigm, and utilizes knowledge-enhanced prompts based on a fixed pre-trained language model (PLM) to fulfill both subtasks in a unified approach.

Language Modelling Recommendation Systems +1

Image Inpainting via Iteratively Decoupled Probabilistic Modeling

2 code implementations 6 Dec 2022 Wenbo Li, Xin Yu, Kun Zhou, Yibing Song, Zhe Lin, Jiaya Jia

To achieve high-quality results with low computational cost, we present a novel pixel spread model (PSM) that iteratively employs decoupled probabilistic modeling, combining the optimization efficiency of GANs with the prediction tractability of probabilistic models.

Denoising Image Inpainting

AutoSweep: Recovering 3D Editable Objects from a Single Photograph

1 code implementation 27 May 2020 Xin Chen, Yuwei Li, Xi Luo, Tianjia Shao, Jingyi Yu, Kun Zhou, Youyi Zheng

We base our work on the assumption that most human-made objects are constituted by parts and these parts can be well represented by generalized primitives.

3D Reconstruction Instance Segmentation +1

UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph

1 code implementation 2 Dec 2022 Jinhao Jiang, Kun Zhou, Wayne Xin Zhao, Ji-Rong Wen

Multi-hop Question Answering over Knowledge Graph (KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question on a large-scale Knowledge Graph (KG).

Language Modelling Multi-hop Question Answering +2

Diffusion Models for Non-autoregressive Text Generation: A Survey

1 code implementation 12 Mar 2023 YiFan Li, Kun Zhou, Wayne Xin Zhao, Ji-Rong Wen

In this survey, we review the recent progress in diffusion models for NAR text generation.

Text Generation

ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models

1 code implementation 23 May 2023 Zhipeng Chen, Kun Zhou, Beichen Zhang, Zheng Gong, Wayne Xin Zhao, Ji-Rong Wen

Although large language models (LLMs) have achieved excellent performance in a variety of evaluation benchmarks, they still struggle in complex reasoning tasks which require specific knowledge and multi-hop reasoning.

Math

A Locality-based Neural Solver for Optical Motion Capture

1 code implementation 1 Sep 2023 Xiaoyu Pan, Bowen Zheng, Xinwei Jiang, Guanglong Xu, Xianli Gu, Jingxiang Li, Qilong Kou, He Wang, Tianjia Shao, Kun Zhou, Xiaogang Jin

Finally, we propose a training regime based on representation learning and data augmentation, by training the model on data with masking.

Data Augmentation Representation Learning

SketchGNN: Semantic Sketch Segmentation with Graph Neural Networks

1 code implementation 2 Mar 2020 Lumin Yang, Jiajie Zhuang, Hongbo Fu, Xiangzhi Wei, Kun Zhou, Youyi Zheng

We introduce SketchGNN, a convolutional graph neural network for semantic segmentation and labeling of freehand vector sketches.

Segmentation Semantic Segmentation

Debiased Contrastive Learning of Unsupervised Sentence Representations

1 code implementation ACL 2022 Kun Zhou, Beichen Zhang, Wayne Xin Zhao, Ji-Rong Wen

In DCLR, we design an instance weighting method to punish false negatives and generate noise-based negatives to guarantee the uniformity of the representation space.

Contrastive Learning Semantic Textual Similarity +1

JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding

1 code implementation 13 Jun 2022 Wayne Xin Zhao, Kun Zhou, Zheng Gong, Beichen Zhang, Yuanhang Zhou, Jing Sha, Zhigang Chen, Shijin Wang, Cong Liu, Ji-Rong Wen

Considering the complex nature of mathematical texts, we design a novel curriculum pre-training approach for improving the learning of mathematical PLMs, consisting of both basic and advanced courses.

Language Modelling Math

Neural Sentence Ordering Based on Constraint Graphs

1 code implementation 27 Jan 2021 Yutao Zhu, Kun Zhou, Jian-Yun Nie, Shengchao Liu, Zhicheng Dou

Our experiments on five benchmark datasets show that our method outperforms all the existing baselines significantly, achieving a new state-of-the-art performance.

Sentence Sentence Ordering

BASAR: Black-box Attack on Skeletal Action Recognition

1 code implementation CVPR 2021 Yunfeng Diao, Tianjia Shao, Yong-Liang Yang, Kun Zhou, He Wang

The robustness of skeleton-based activity recognizers has been questioned recently, which shows that they are vulnerable to adversarial attacks when the full-knowledge of the recognizer is accessible to the attacker.

Action Recognition Adversarial Attack +1

Understanding the Vulnerability of Skeleton-based Human Activity Recognition via Black-box Attack

4 code implementations 21 Nov 2022 Yunfeng Diao, He Wang, Tianjia Shao, Yong-Liang Yang, Kun Zhou, David Hogg

Via BASAR, we find on-manifold adversarial samples are extremely deceitful and rather common in skeletal motions, in contrast to the common belief that adversarial samples only exist off-manifold.

Adversarial Attack Human Activity Recognition +2

What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Instruction Tuning

1 code implementation 2 Nov 2023 Yifan Du, Hangyu Guo, Kun Zhou, Wayne Xin Zhao, Jinpeng Wang, Chuyuan Wang, Mingchen Cai, Ruihua Song, Ji-Rong Wen

By conducting a comprehensive empirical study, we find that instructions focused on complex visual reasoning tasks are particularly effective in improving the performance of MLLMs on evaluation benchmarks.

Visual Reasoning Zero-shot Generalization

C2-CRS: Coarse-to-Fine Contrastive Learning for Conversational Recommender System

1 code implementation 4 Jan 2022 Yuanhang Zhou, Kun Zhou, Wayne Xin Zhao, Cheng Wang, Peng Jiang, He Hu

To implement this framework, we design both coarse-grained and fine-grained procedures for modeling user preference, where the former focuses on more general, coarse-grained semantic fusion and the latter focuses on more specific, fine-grained semantic fusion.

Contrastive Learning Recommendation Systems +2

Evaluating and Improving Tool-Augmented Computation-Intensive Math Reasoning

1 code implementation NeurIPS 2023 Beichen Zhang, Kun Zhou, Xilin Wei, Wayne Xin Zhao, Jing Sha, Shijin Wang, Ji-Rong Wen

Based on this finding, we propose a new approach that can deliberate the reasoning steps with tool interfaces, namely DELI.

Math

Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint

1 code implementation 11 Jan 2024 Zhipeng Chen, Kun Zhou, Wayne Xin Zhao, Junchen Wan, Fuzheng Zhang, Di Zhang, Ji-Rong Wen

To address it, we propose a new RL method named RLMEC that incorporates a generative model as the reward model, which is trained by the erroneous solution rewriting task under the minimum editing constraint, and can produce token-level rewards for RL training.

Question Answering Reinforcement Learning (RL)

Understanding the Robustness of Skeleton-based Action Recognition under Adversarial Attack

1 code implementation CVPR 2021 He Wang, Feixiang He, Zhexi Peng, Tianjia Shao, Yong-Liang Yang, Kun Zhou, David Hogg

In this paper, we examine the robustness of state-of-the-art action recognizers against adversarial attack, which has been rarely investigated so far.

Action Recognition Adversarial Attack +4

Proactive Retrieval-based Chatbots based on Relevant Knowledge and Goals

1 code implementation 18 Jul 2021 Yutao Zhu, Jian-Yun Nie, Kun Zhou, Pan Du, Hao Jiang, Zhicheng Dou

The final response is selected according to the predicted knowledge, the goal to achieve, and the context.

Multi-Task Learning Retrieval

Unsupervised Image Generation with Infinite Generative Adversarial Networks

1 code implementation ICCV 2021 Hui Ying, He Wang, Tianjia Shao, Yin Yang, Kun Zhou

Image generation has been heavily investigated in computer vision, where one core research challenge is to generate images from arbitrarily complex distributions with little supervision.

Image Generation

Improving Conversational Recommendation Systems via Counterfactual Data Simulation

1 code implementation 5 Jun 2023 Xiaolei Wang, Kun Zhou, Xinyu Tang, Wayne Xin Zhao, Fan Pan, Zhao Cao, Ji-Rong Wen

To develop our approach, we characterize user preference and organize the conversation flow by the entities involved in the dialogue, and design a multi-stage recommendation dialogue simulator based on a conversation flow language model.

counterfactual Data Augmentation +2

Adaptive Local Basis Functions for Shape Completion

1 code implementation 17 Jul 2023 Hui Ying, Tianjia Shao, He Wang, Yin Yang, Kun Zhou

Quantitative and qualitative experiments demonstrate that our method outperforms the state-of-the-art methods in shape completion, detail preservation, generalization to unseen geometries, and computational cost.

DeepWarp: DNN-based Nonlinear Deformation

1 code implementation 24 Mar 2018 Ran Luo, Tianjia Shao, Huamin Wang, Weiwei Xu, Kun Zhou, Yin Yang

DeepWarp is an efficient and highly re-usable deep neural network (DNN) based nonlinear deformable simulation framework.

Graphics

Content Selection Network for Document-grounded Retrieval-based Chatbots

1 code implementation 21 Jan 2021 Yutao Zhu, Jian-Yun Nie, Kun Zhou, Pan Du, Zhicheng Dou

It is thus crucial to select the part of document content relevant to the current conversation context.

Retrieval

DeepSketchHair: Deep Sketch-based 3D Hair Modeling

1 code implementation 20 Aug 2019 Yuefan Shen, Changgeng Zhang, Hongbo Fu, Kun Zhou, Youyi Zheng

The key enablers of our system are two carefully designed neural networks, namely, S2ONet, which converts an input sketch to a dense 2D hair orientation field; and O2VNet, which maps the 2D orientation field to a 3D vector field.

Graphics

Alleviating the Long-Tail Problem in Conversational Recommender Systems

1 code implementation 21 Jul 2023 Zhipeng Zhao, Kun Zhou, Xiaolei Wang, Wayne Xin Zhao, Fan Pan, Zhao Cao, Ji-Rong Wen

Conversational recommender systems (CRS) aim to provide the recommendation service via natural language conversations.

Recommendation Systems Retrieval

Visually-augmented pretrained language models for NLP tasks without images

1 code implementation 15 Dec 2022 Hangyu Guo, Kun Zhou, Wayne Xin Zhao, Qinyu Zhang, Ji-Rong Wen

Although pre-trained language models (PLMs) have shown impressive performance by text-only self-supervised training, they are found to lack visual semantics or commonsense.

Retrieval

Generating by Understanding: Neural Visual Generation with Logical Symbol Groundings

1 code implementation 26 Oct 2023 Yifei Peng, Yu Jin, Zhexu Luo, Yao-Xiang Ding, Wang-Zhou Dai, Zhong Ren, Kun Zhou

There are two levels of symbol grounding problems among the core challenges: the first is symbol assignment, i.e., mapping latent factors of neural visual generators to semantic-meaningful symbolic factors from the reasoning systems by learning from limited labeled data.

Intrinsic Light Field Images

no code implementations 15 Aug 2016 Elena Garces, Jose I. Echevarria, Wen Zhang, Hongzhi Wu, Kun Zhou, Diego Gutierrez

We present a method to automatically decompose a light field into its intrinsic shading and albedo components.

Segmentation Rectification for Video Cutout via One-Class Structured Learning

no code implementations 16 Feb 2016 Junyan Wang, Sai-Kit Yeung, Jue Wang, Kun Zhou

Comprehensive experiments on both RGB and RGB-D data demonstrate that our simple and effective method significantly outperforms the segmentation propagation methods adopted in the state-of-the-art video cutout systems, and the results also suggest the potential usefulness of our method in image cutout systems.

Segmentation

FBI-Pose: Towards Bridging the Gap between 2D Images and 3D Human Poses using Forward-or-Backward Information

no code implementations 25 Jun 2018 Yulong Shi, Xiaoguang Han, Nianjuan Jiang, Kun Zhou, Kui Jia, Jiangbo Lu

Although significant advances have been made in the area of human pose estimation from images using deep Convolutional Neural Networks (ConvNets), it remains a big challenge to perform 3D pose inference in-the-wild.

3D Human Pose Estimation

CaricatureShop: Personalized and Photorealistic Caricature Sketching

no code implementations 24 Jul 2018 Xiaoguang Han, Kangcheng Hou, Dong Du, Yuda Qiu, Yizhou Yu, Kun Zhou, Shuguang Cui

To construct the mapping between 2D sketches and a vertex-wise scaling field, a novel deep learning architecture is developed.

Caricature Face Model

Adversarial 3D Human Pose Estimation via Multimodal Depth Supervision

no code implementations 21 Sep 2018 Kun Zhou, Jinmiao Cai, Yao Li, Yulong Shi, Xiaoguang Han, Nianjuan Jiang, Kui Jia, Jiangbo Lu

In this paper, a novel deep-learning based framework is proposed to infer 3D human poses from a single image.

3D Human Pose Estimation

Bayesian Depth-from-Defocus with Shading Constraints

no code implementations CVPR 2013 Chen Li, Shuochen Su, Yasuyuki Matsushita, Kun Zhou, Stephen Lin

We present a method that enhances the performance of depth-from-defocus (DFD) through the use of shading information.

Depth Estimation

A Geodesic-Preserving Method for Image Warping

no code implementations CVPR 2015 Dongping Li, Kaiming He, Jian Sun, Kun Zhou

The image projections will turn the straight lines into curved "geodesic lines", and it is fundamentally impossible to keep all these lines straight.

Image Manipulation

Specular Highlight Removal in Facial Images

no code implementations CVPR 2017 Chen Li, Stephen Lin, Kun Zhou, Katsushi Ikeuchi

An important practical feature of the proposed method is that the skin color model is utilized in a way that does not require color calibration of the camera.

highlight removal

Radiometric Calibration From Faces in Images

no code implementations CVPR 2017 Chen Li, Stephen Lin, Kun Zhou, Katsushi Ikeuchi

We present a method for radiometric calibration of cameras from a single image that contains a human face.

SMART: Skeletal Motion Action Recognition aTtack

no code implementations 16 Nov 2019 He Wang, Feixiang He, Zhexi Peng, Yong-Liang Yang, Tianjia Shao, Kun Zhou, David Hogg

In this paper, we propose a method, SMART, to attack action recognizers which rely on 3D skeletal motions.

Action Recognition Adversarial Attack +2

Improving Multi-Turn Response Selection Models with Complementary Last-Utterance Selection by Instance Weighting

no code implementations 18 Feb 2020 Kun Zhou, Wayne Xin Zhao, Yutao Zhu, Ji-Rong Wen, Jingsong Yu

Open-domain retrieval-based dialogue systems require a considerable amount of training data to learn their parameters.

Retrieval

VAW-GAN for Singing Voice Conversion with Non-parallel Training Data

no code implementations 10 Aug 2020 Junchen Lu, Kun Zhou, Berrak Sisman, Haizhou Li

We train an encoder to disentangle singer identity and singing prosody (F0 contour) from phonetic content.

Generative Adversarial Network Voice Conversion

Spectrum and Prosody Conversion for Cross-lingual Voice Conversion with CycleGAN

no code implementations 11 Aug 2020 Zongyang Du, Kun Zhou, Berrak Sisman, Haizhou Li

It relies on non-parallel training data from two different languages, hence, is more challenging than mono-lingual voice conversion.

Voice Conversion

Mesh Guided One-shot Face Reenactment using Graph Convolutional Networks

no code implementations 18 Aug 2020 Guangming Yao, Yi Yuan, Tianjia Shao, Kun Zhou

In this paper, we introduce a method for one-shot face reenactment, which uses the reconstructed 3D meshes (i.e., the source mesh and driving mesh) as guidance to learn the optical flow needed for the reenacted face synthesis.

Face Generation Face Reenactment +2

Leveraging Historical Interaction Data for Improving Conversational Recommender System

no code implementations 19 Aug 2020 Kun Zhou, Wayne Xin Zhao, Hui Wang, Sirui Wang, Fuzheng Zhang, Zhongyuan Wang, Ji-Rong Wen

Most of the existing CRS methods focus on learning effective preference representations for users from conversation data alone.

Attribute Recommendation Systems

Dynamic Future Net: Diversified Human Motion Generation

no code implementations 25 Aug 2020 Wenheng Chen, He Wang, Yi Yuan, Tianjia Shao, Kun Zhou

We evaluate our model on a wide range of motions and compare it with the state-of-the-art methods.

Second-order Neural Network Training Using Complex-step Directional Derivative

no code implementations 15 Sep 2020 Siyuan Shen, Tianjia Shao, Kun Zhou, Chenfanfu Jiang, Feng Luo, Yin Yang

We believe our method will inspire a wide-range of new algorithms for deep learning and numerical optimization.

Second-order methods

Learning to Match Jobs with Resumes from Sparse Interaction Data using Multi-View Co-Teaching Network

no code implementations 25 Sep 2020 Shuqing Bian, Xu Chen, Wayne Xin Zhao, Kun Zhou, Yupeng Hou, Yang song, Tao Zhang, Ji-Rong Wen

Compared with pure text-based matching models, the proposed approach is able to learn better data representations from limited or even sparse interaction data, and is more resistant to noise in training data.

Text Matching

VAW-GAN for Disentanglement and Recomposition of Emotional Elements in Speech

no code implementations 3 Nov 2020 Kun Zhou, Berrak Sisman, Haizhou Li

Emotional voice conversion (EVC) aims to convert the emotion of speech from one state to another while preserving the linguistic content and speaker identity.

Disentanglement Generative Adversarial Network +1

Structure-aware Person Image Generation with Pose Decomposition and Semantic Correlation

no code implementations 5 Feb 2021 Jilin Tang, Yi Yuan, Tianjia Shao, Yong liu, Mengmeng Wang, Kun Zhou

In this paper we tackle the problem of pose guided person image generation, which aims to transfer a person image from the source pose to a novel target pose while maintaining the source appearance.

Image Generation

In-game Residential Home Planning via Visual Context-aware Global Relation Learning

no code implementations 8 Feb 2021 Lijuan Liu, Yin Yang, Yi Yuan, Tianjia Shao, He Wang, Kun Zhou

In this paper, we propose an effective global relation learning algorithm to recommend an appropriate location of a building unit for in-game customization of residential home complex.

Graph Generation Relation

One-shot Face Reenactment Using Appearance Adaptive Normalization

no code implementations 8 Feb 2021 Guangming Yao, Yi Yuan, Tianjia Shao, Shuang Li, Shanqi Liu, Yong liu, Mengmeng Wang, Kun Zhou

The paper proposes a novel generative adversarial network for one-shot face reenactment, which can animate a single face image to a different pose-and-expression (provided by a driving image) while keeping its original appearance.

Face Reenactment Generative Adversarial Network

High-order Differentiable Autoencoder for Nonlinear Model Reduction

no code implementations 19 Feb 2021 Siyuan Shen, Yang Yin, Tianjia Shao, He Wang, Chenfanfu Jiang, Lei Lan, Kun Zhou

This paper provides a new avenue for exploiting deep neural networks to improve physics-based simulation.

Vocal Bursts Intensity Prediction

Learning Efficient Photometric Feature Transform for Multi-view Stereo

no code implementations ICCV 2021 Kaizhang Kang, Cihui Xie, Ruisheng Zhu, Xiaohe Ma, Ping Tan, Hongzhi Wu, Kun Zhou

We present a novel framework to learn to convert the per-pixel photometric information at each view into spatially distinctive and view-invariant low-level features, which can be plugged into existing multi-view stereo pipelines for enhanced 3D reconstruction.

3D Reconstruction

Curriculum Pre-Training Heterogeneous Subgraph Transformer for Top-N Recommendation

no code implementations 12 Jun 2021 Hui Wang, Kun Zhou, Wayne Xin Zhao, Jingyuan Wang, Ji-Rong Wen

Due to the flexibility in modelling data heterogeneity, heterogeneous information network (HIN) has been adopted to characterize complex and heterogeneous auxiliary data in top-N recommender systems, called HIN-based recommendation.

Recommendation Systems

Learn with Noisy Data via Unsupervised Loss Correction for Weakly Supervised Reading Comprehension

no code implementations COLING 2020 Xuemiao Zhang, Kun Zhou, Sirui Wang, Fuzheng Zhang, Zhongyuan Wang, Junfei Liu

Weakly supervised machine reading comprehension (MRC) task is practical and promising for its easily available and massive training data, but inevitably introduces noise.

Machine Reading Comprehension

Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conversion

no code implementations 20 Oct 2021 Zongyang Du, Berrak Sisman, Kun Zhou, Haizhou Li

Expressive voice conversion performs identity conversion for emotional speakers by jointly converting speaker identity and emotional style.

Disentanglement Voice Conversion

Learning Implicit Body Representations from Double Diffusion Based Neural Radiance Fields

no code implementations 23 Dec 2021 Guangming Yao, Hongzhi Wu, Yi Yuan, Lincheng Li, Kun Zhou, Xin Yu

In this paper, we present a novel double diffusion based neural radiance field, dubbed DD-NeRF, to reconstruct human body geometry and render the human body appearance in novel views from a sparse set of images.

Novel View Synthesis

Pose Guided Image Generation from Misaligned Sources via Residual Flow Based Correction

no code implementations 2 Feb 2022 Jiawei Lu, He Wang, Tianjia Shao, Yin Yang, Kun Zhou

However, as source images are often misaligned due to the large disparities among the camera settings, strong assumptions have been made in the past with respect to the camera(s) or/and the object in interest, limiting the application of such techniques.

Pose-Guided Image Generation

DiFT: Differentiable Differential Feature Transform for Multi-View Stereo

no code implementations 16 Mar 2022 Kaizhang Kang, Chong Zeng, Hongzhi Wu, Kun Zhou

We present a novel framework to automatically learn to transform the differential cues from a stack of images densely captured with a rotational motion into spatially discriminative and view-invariant per-pixel features at each view.

3D Reconstruction

NeuralReshaper: Single-image Human-body Retouching with Deep Neural Networks

no code implementations 20 Mar 2022 Beijia Chen, Yuefan Shen, Hongbo Fu, Xiang Chen, Kun Zhou, Youyi Zheng

In this paper, we present NeuralReshaper, a novel method for semantic reshaping of human bodies in single images using deep generative networks.

Efficient Reflectance Capture with a Deep Gated Mixture-of-Experts

no code implementations 29 Mar 2022 Xiaohe Ma, Yaxin Yu, Hongzhi Wu, Kun Zhou

A common, pre-trained latent transform module is also appended to each decoder, to offset the burden of the increased number of decoders.

NeuralHDHair: Automatic High-fidelity Hair Modeling from a Single Image Using Implicit Neural Representations

no code implementations CVPR 2022 Keyu Wu, Yifan Ye, Lingchen Yang, Hongbo Fu, Kun Zhou, Youyi Zheng

To improve the efficiency of a traditional hair growth algorithm, we adopt a local neural implicit function to grow strands based on the estimated 3D hair geometric features.

Speech Synthesis with Mixed Emotions

no code implementations 11 Aug 2022 Kun Zhou, Berrak Sisman, Rajib Rana, B. W. Schuller, Haizhou Li

We then incorporate our formulation into a sequence-to-sequence emotional text-to-speech framework.

Attribute Emotional Speech Synthesis

Mixed-EVC: Mixed Emotion Synthesis and Control in Voice Conversion

no code implementations 25 Oct 2022 Kun Zhou, Berrak Sisman, Carlos Busso, Bin Ma, Haizhou Li

To achieve this, we propose a novel EVC framework, Mixed-EVC, which only leverages discrete emotion training labels.

Attribute Voice Conversion

Mutual Guidance and Residual Integration for Image Enhancement

no code implementations 25 Nov 2022 Kun Zhou, Kenkun Liu, Wenbo Li, Xiaoguang Han, Jiangbo Lu

To address those issues, we propose a novel mutual guidance network (MGN) to perform effective bidirectional global-local information exchange while keeping a compact architecture.

Computational Efficiency Image Enhancement +1

Diffusion-NAT: Self-Prompting Discrete Diffusion for Non-Autoregressive Text Generation

no code implementations 6 May 2023 Kun Zhou, YiFan Li, Wayne Xin Zhao, Ji-Rong Wen

To solve it, we propose Diffusion-NAT, which introduces discrete diffusion models (DDM) into NAR text-to-text generation and integrates BART to improve the performance.

Denoising Text Generation

A Unified Spatial-Angular Structured Light for Single-View Acquisition of Shape and Reflectance

no code implementations CVPR 2023 Xianmin Xu, Yuxin Lin, Haoyang Zhou, Chong Zeng, Yaxin Yu, Kun Zhou, Hongzhi Wu

We propose a unified structured light, consisting of an LED array and an LCD mask, for high-quality acquisition of both shape and reflectance from a single view.

Learning Photometric Feature Transform for Free-form Object Scan

no code implementations 7 Aug 2023 Xiang Feng, Kaizhang Kang, Fan Pei, Huakeng Ding, Jinjiang You, Ping Tan, Kun Zhou, Hongzhi Wu

We propose a novel framework to automatically learn to aggregate and transform photometric measurements from multiple unstructured views into spatially distinctive and view-invariant low-level features, which are fed to a multi-view stereo method to enhance 3D reconstruction.

3D Reconstruction Object

SPGM: Prioritizing Local Features for enhanced speech separation performance

1 code implementation 22 Sep 2023 Jia Qi Yip, Shengkui Zhao, Yukun Ma, Chongjia Ni, Chong Zhang, Hao Wang, Trung Hieu Nguyen, Kun Zhou, Dianwen Ng, Eng Siong Chng, Bin Ma

Dual-path is a popular architecture for speech separation models (e.g., Sepformer) that splits long sequences into overlapping chunks for its intra- and inter-blocks, which separately model intra-chunk local features and inter-chunk global relationships.

Speech Separation

A Real-time Method for Inserting Virtual Objects into Neural Radiance Fields

no code implementations 9 Oct 2023 Keyang Ye, Hongzhi Wu, Xin Tong, Kun Zhou

We present the first real-time method for inserting a rigid virtual object into a neural radiance field, which produces realistic lighting and shadowing effects, as well as allows interactive manipulation of the object.

Lighting Estimation Object

Don't Make Your LLM an Evaluation Benchmark Cheater

no code implementations 3 Nov 2023 Kun Zhou, Yutao Zhu, Zhipeng Chen, Wentong Chen, Wayne Xin Zhao, Xu Chen, Yankai Lin, Ji-Rong Wen, Jiawei Han

Large language models (LLMs) have greatly advanced the frontiers of artificial intelligence, attaining remarkable improvement in model capacity.

Animatable 3D Gaussians for High-fidelity Synthesis of Human Motions

no code implementations 22 Nov 2023 Keyang Ye, Tianjia Shao, Kun Zhou

The learnable code serves as a pose-dependent appearance embedding for refining the erroneous appearance caused by geometric transformation of Gaussians, based on which an appearance refinement model is learned to produce residual Gaussian properties to match the appearance in target pose.

Text-Guided 3D Face Synthesis -- From Generation to Editing

no code implementations 1 Dec 2023 Yunjie Wu, Yapeng Meng, Zhipeng Hu, Lincheng Li, Haoqian Wu, Kun Zhou, Weiwei Xu, Xin Yu

In the editing stage, we first employ a pre-trained diffusion model to update facial geometry or texture based on the texts.

Face Generation Texture Synthesis

QPoser: Quantized Explicit Pose Prior Modeling for Controllable Pose Generation

no code implementations 2 Dec 2023 Yumeng Li, YaoXiang Ding, Zhong Ren, Kun Zhou

Explicit pose prior models compress human poses into latent representations for use in pose-related downstream tasks.

ReasoningLM: Enabling Structural Subgraph Reasoning in Pre-trained Language Models for Question Answering over Knowledge Graph

no code implementations 30 Dec 2023 Jinhao Jiang, Kun Zhou, Wayne Xin Zhao, Yaliang Li, Ji-Rong Wen

To better perform reasoning on KG, recent work typically adopts a pre-trained language model (PLM) to model the question, and a graph neural network (GNN) based module to perform multi-hop reasoning on the KG.

Language Modelling Question Answering

Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning

no code implementations 7 Jan 2024 Yingqian Min, Kun Zhou, Dawei Gao, Wayne Xin Zhao, He Hu, Yaliang Li

Recently, multi-task instruction tuning has been applied to sentence representation learning, which endows models with the capability of generating specific representations under the guidance of task instructions, exhibiting strong generalization ability on new tasks.

Representation Learning Sentence +1

Gaussian Splashing: Dynamic Fluid Synthesis with Gaussian Splatting

no code implementations 27 Jan 2024 Yutao Feng, Xiang Feng, Yintong Shang, Ying Jiang, Chang Yu, Zeshun Zong, Tianjia Shao, Hongzhi Wu, Kun Zhou, Chenfanfu Jiang, Yin Yang

We demonstrate the feasibility of integrating physics-based animations of solids and fluids with 3D Gaussian Splatting (3DGS) to create novel effects in virtual scenes reconstructed using 3DGS.

AniDress: Animatable Loose-Dressed Avatar from Sparse Views Using Garment Rigging Model

no code implementations 27 Jan 2024 Beijia Chen, Yuefan Shen, Qing Shuai, Xiaowei Zhou, Kun Zhou, Youyi Zheng

In this paper, we introduce AniDress, a novel method for generating animatable human avatars in loose clothes using very sparse multi-view videos (4-8 in our setting).

KG-Agent: An Efficient Autonomous Agent Framework for Complex Reasoning over Knowledge Graph

no code implementations 17 Feb 2024 Jinhao Jiang, Kun Zhou, Wayne Xin Zhao, Yang song, Chen Zhu, HengShu Zhu, Ji-Rong Wen

To guarantee effectiveness, we leverage a programming language to formulate the multi-hop reasoning process over the KG, and synthesize a code-based instruction dataset to fine-tune the base LLM.

Knowledge Graphs

Less is More: Data Value Estimation for Visual Instruction Tuning

no code implementations 14 Mar 2024 Zikang Liu, Kun Zhou, Wayne Xin Zhao, Dawei Gao, Yaliang Li, Ji-Rong Wen

To investigate this issue, we conduct a series of empirical studies, which reveal significant redundancy within visual instruction datasets and show that greatly reducing the amount of data in several instruction datasets does not even affect performance.
