Search Results for author: Zhongyuan Wang

Found 74 papers, 37 papers with code

Masked Face Recognition Dataset and Application

3 code implementations • 20 Mar 2020 • Zhongyuan Wang, Guangcheng Wang, Baojin Huang, Zhangyang Xiong, Qi Hong, Hao Wu, Peng Yi, Kui Jiang, Nanxi Wang, Yingjiao Pei, Heling Chen, Yu Miao, Zhibing Huang, Jinbi Liang

These datasets are freely available to industry and academia, based on which various applications on masked faces can be developed.

Face Detection Face Recognition

1,916

KwaiAgents: Generalized Information-seeking Agent System with Large Language Models

1 code implementation • 8 Dec 2023 • Haojie Pan, Zepeng Zhai, Hao Yuan, Yaojia LV, Ruiji Fu, Ming Liu, Zhongyuan Wang, Bing Qin

Driven by curiosity, humans have continually sought to explore and understand the world around them, leading to the invention of various tools to satiate this inquisitiveness.

958

S^3-Rec: Self-Supervised Learning for Sequential Recommendation with Mutual Information Maximization

2 code implementations • 18 Aug 2020 • Kun Zhou, Hui Wang, Wayne Xin Zhao, Yutao Zhu, Sirui Wang, Fuzheng Zhang, Zhongyuan Wang, Ji-Rong Wen

To tackle this problem, we propose the model S^3-Rec, which stands for Self-Supervised learning for Sequential Recommendation, based on the self-attentive neural architecture.

Attribute Self-Supervised Learning +1

228

Knowledge-aware Graph Neural Networks with Label Smoothness Regularization for Recommender Systems

5 code implementations • 11 May 2019 • Hongwei Wang, Fuzheng Zhang, Mengdi Zhang, Jure Leskovec, Miao Zhao, Wenjie Li, Zhongyuan Wang

Here we propose Knowledge-aware Graph Neural Networks with Label Smoothness regularization (KGNN-LS) to provide better recommendations.

Ranked #1 on Recommendation Systems on Dianping-Food

Feature Engineering Inductive Bias +2

157

Multi-Scale Progressive Fusion Network for Single Image Deraining

3 code implementations • CVPR 2020 • Kui Jiang, Zhongyuan Wang, Peng Yi, Chen Chen, Baojin Huang, Yimin Luo, Jiayi Ma, Junjun Jiang

In this work, we explore the multi-scale collaborative representation for rain streaks from the perspective of input image scales and hierarchical deep features in a unified framework, termed multi-scale progressive fusion network (MSPFN) for single image rain streak removal.

Ranked #6 on Single Image Deraining on Test2800

Single Image Deraining

154

CAT: Cross Attention in Vision Transformer

1 code implementation • 10 Jun 2021 • Hezheng Lin, Xing Cheng, Xiangyu Wu, Fan Yang, Dong Shen, Zhongyuan Wang, Qing Song, Wei Yuan

In this paper, we propose a new attention mechanism in Transformer termed Cross Attention, which alternates between attention within each image patch, rather than over the whole image, to capture local information and attention between image patches, which are divided from single-channel feature maps, to capture global information.

132

End-to-end training of Multimodal Model and ranking Model

2 code implementations • 9 Apr 2024 • Xiuqi Deng, Lu Xu, Xiyao Li, Jinkai Yu, Erpeng Xue, Zhongyuan Wang, Di Zhang, Zhaojie Liu, Guorui Zhou, Yang song, Na Mou, Shen Jiang, Han Li

In this paper, we propose an industrial multimodal recommendation framework named EM3: End-to-end training of Multimodal Model and ranking Model, which sufficiently utilizes multimodal information and allows personalized ranking tasks to directly train the core modules in the multimodal model to obtain more task-oriented content features, without overburdening resource consumption.

Contrastive Learning Multimodal Recommendation

131

ESimCSE: Enhanced Sample Building Method for Contrastive Learning of Unsupervised Sentence Embedding

2 code implementations • COLING 2022 • Xing Wu, Chaochen Gao, Liangjun Zang, Jizhong Han, Zhongyuan Wang, Songlin Hu

Unsup-SimCSE takes dropout as a minimal data augmentation method, and passes the same input sentence to a pre-trained Transformer encoder (with dropout turned on) twice to obtain the two corresponding embeddings to build a positive pair.

Contrastive Learning Data Augmentation +5

Smoothed Contrastive Learning for Unsupervised Sentence Embedding

2 code implementations • COLING 2022 • Xing Wu, Chaochen Gao, Yipeng Su, Jizhong Han, Zhongyuan Wang, Songlin Hu

Contrastive learning has been gradually applied to learn high-quality unsupervised sentence embedding.

Contrastive Learning Sentence +4

DistilCSE: Effective Knowledge Distillation For Contrastive Sentence Embeddings

1 code implementation • 10 Dec 2021 • Chaochen Gao, Xing Wu, Peng Wang, Jue Wang, Liangjun Zang, Zhongyuan Wang, Songlin Hu

To tackle that, we propose an effective knowledge distillation framework for contrastive sentence embeddings, termed DistilCSE.

Contrastive Learning Knowledge Distillation +5

InfoCSE: Information-aggregated Contrastive Learning of Sentence Embeddings

2 code implementations • 8 Oct 2022 • Xing Wu, Chaochen Gao, Zijia Lin, Jizhong Han, Zhongyuan Wang, Songlin Hu

Contrastive learning has been extensively studied in sentence embedding learning, which assumes that the embeddings of different views of the same sentence are closer.

Contrastive Learning Language Modelling +5

Domain Generalization via Shuffled Style Assembly for Face Anti-Spoofing

1 code implementation • CVPR 2022 • Zhuo Wang, Zezheng Wang, Zitong Yu, Weihong Deng, Jiahong Li, Tingting Gao, Zhongyuan Wang

A novel Shuffled Style Assembly Network (SSAN) is proposed to extract and reassemble different content and style features for a stylized feature space.

Contrastive Learning Domain Generalization +1

Paragraph-to-Image Generation with Information-Enriched Diffusion Model

1 code implementation • 24 Nov 2023 • Weijia Wu, Zhuang Li, Yefei He, Mike Zheng Shou, Chunhua Shen, Lele Cheng, Yan Li, Tingting Gao, Di Zhang, Zhongyuan Wang

In this paper, we introduce an information-enriched diffusion model for paragraph-to-image generation task, termed ParaDiffusion, which delves into the transference of the extensive semantic comprehension capabilities of large language models to the task of image generation.

Image Generation Language Modelling +1

Stable Segment Anything Model

1 code implementation • 27 Nov 2023 • Qi Fan, Xin Tao, Lei Ke, Mingqiao Ye, Yuan Zhang, Pengfei Wan, Zhongyuan Wang, Yu-Wing Tai, Chi-Keung Tang

Thus, our solution, termed Stable-SAM, offers several advantages: 1) improved SAM's segmentation stability across a wide range of prompt qualities, while 2) retaining SAM's powerful promptable segmentation efficiency and generality, with 3) minimal learnable parameters (0.08 M) and fast adaptation (by 1 training epoch).

Segmentation

DVIS++: Improved Decoupled Framework for Universal Video Segmentation

1 code implementation • 20 Dec 2023 • Tao Zhang, Xingye Tian, Yikang Zhou, Shunping Ji, Xuebo Wang, Xin Tao, Yuan Zhang, Pengfei Wan, Zhongyuan Wang, Yu Wu

We present the Decoupled VIdeo Segmentation (DVIS) framework, a novel approach for the challenging task of universal video segmentation, including video instance segmentation (VIS), video semantic segmentation (VSS), and video panoptic segmentation (VPS).

Ranked #1 on Video Semantic Segmentation on VSPW

Contrastive Learning Denoising +6

SuperPCA: A Superpixelwise PCA Approach for Unsupervised Feature Extraction of Hyperspectral Imagery

1 code implementation • 26 Jun 2018 • Junjun Jiang, Jiayi Ma, Chen Chen, Zhongyuan Wang, Zhihua Cai, Lizhe Wang

(1) Unlike the traditional PCA method based on a whole image, SuperPCA takes into account the diversity in different homogeneous regions, that is, different regions should have different projections.

Dimensionality Reduction General Classification

Kuaipedia: a Large-scale Multi-modal Short-video Encyclopedia

1 code implementation • 28 Oct 2022 • Haojie Pan, Zepeng Zhai, Yuzhou Zhang, Ruiji Fu, Ming Liu, Yangqiu Song, Zhongyuan Wang, Bing Qin

In this paper, we propose Kuaipedia, a large-scale multi-modal encyclopedia consisting of items, aspects, and short videos linked to them, which was extracted from billions of videos of Kuaishou (Kwai), a well-known short-video platform in China.

Entity Linking Entity Typing

MlTr: Multi-label Classification with Transformer

1 code implementation • 11 Jun 2021 • Xing Cheng, Hezheng Lin, Xiangyu Wu, Fan Yang, Dong Shen, Zhongyuan Wang, Nian Shi, Honglin Liu

The task of multi-label image classification is to recognize all the object labels presented in an image.

Ranked #12 on Multi-Label Classification on MS-COCO

Classification Multi-Label Classification +1

CogGPT: Unleashing the Power of Cognitive Dynamics on Large Language Models

1 code implementation • 6 Jan 2024 • Yaojia LV, Haojie Pan, Ruiji Fu, Ming Liu, Zhongyuan Wang, Bing Qin

Cognitive dynamics are pivotal to advance human understanding of the world.

Combining Self-Training and Self-Supervised Learning for Unsupervised Disfluency Detection

1 code implementation • EMNLP 2020 • Shaolei Wang, Zhongyuan Wang, Wanxiang Che, Ting Liu

Most existing approaches to disfluency detection heavily rely on human-annotated corpora, which are expensive to obtain in practice.

Self-Supervised Learning Word Embeddings

Text Smoothing: Enhance Various Data Augmentation Methods on Text Classification Tasks

1 code implementation • ACL 2022 • Xing Wu, Chaochen Gao, Meng Lin, Liangjun Zang, Zhongyuan Wang, Songlin Hu

Before entering the neural network, a token is generally converted to the corresponding one-hot representation, which is a discrete distribution of the vocabulary.

Data Augmentation Language Modelling +3

ConTextual Masked Auto-Encoder for Dense Passage Retrieval

2 code implementations • 16 Aug 2022 • Xing Wu, Guangyuan Ma, Meng Lin, Zijia Lin, Zhongyuan Wang, Songlin Hu

Dense passage retrieval aims to retrieve the relevant passages of a query from a large corpus based on dense representations (i.e., vectors) of the query and the passages.

Passage Retrieval Retrieval +1

Degrade is Upgrade: Learning Degradation for Low-light Image Enhancement

1 code implementation • 19 Mar 2021 • Kui Jiang, Zhongyuan Wang, Zheng Wang, Chen Chen, Peng Yi, Tao Lu, Chia-Wen Lin

Different from existing methods tending to accomplish the relighting task directly by ignoring the fidelity and naturalness recovery, we investigate the intrinsic degradation and relight the low-light image while refining the details and color in two steps.

Low-Light Image Enhancement

When Face Recognition Meets Occlusion: A New Benchmark

1 code implementation • 4 Mar 2021 • Baojin Huang, Zhongyuan Wang, Guangcheng Wang, Kui Jiang, Kangli Zeng, Zhen Han, Xin Tian, Yuhong Yang

In particular, we first collect a variety of glasses and masks as occlusion, and randomly combine the occlusion attributes (occlusion objects, textures, and colors) to achieve a large number of more realistic occlusion types.

Face Recognition

Face Hallucination via Split-Attention in Split-Attention Network

1 code implementation • 22 Oct 2020 • Tao Lu, Yuanzhi Wang, Yanduo Zhang, Yu Wang, Wei Liu, Zhongyuan Wang, Junjun Jiang

However, most of them fail to take into account the overall facial profile and fine texture details simultaneously, resulting in reduced naturalness and fidelity of the reconstructed face, and further impairing the performance of downstream tasks (e.g., face detection, facial recognition).

Face Detection Face Hallucination +4

Learning Inverse Rendering of Faces from Real-world Videos

1 code implementation • 26 Mar 2020 • Yuda Qiu, Zhangyang Xiong, Kai Han, Zhongyuan Wang, Zixiang Xiong, Xiaoguang Han

To alleviate this problem, we propose a weakly supervised training approach to train our model on real face videos, based on the assumption of consistency of albedo and normal across different frames, thus bridging the gap between real and synthetic face images.

Inverse Rendering

Consistency Regularization for Deep Face Anti-Spoofing

1 code implementation • 24 Nov 2021 • Zezheng Wang, Zitong Yu, Xun Wang, Yunxiao Qin, Jiahong Li, Chenxu Zhao, Zhen Lei, Xin Liu, Size Li, Zhongyuan Wang

Face anti-spoofing (FAS) plays a crucial role in securing face recognition systems.

Face Anti-Spoofing Face Recognition

Magic ELF: Image Deraining Meets Association Learning and Transformer

1 code implementation • 21 Jul 2022 • Kui Jiang, Zhongyuan Wang, Chen Chen, Zheng Wang, Laizhong Cui, Chia-Wen Lin

Convolutional neural network (CNN) and Transformer have achieved great success in multimedia applications.

Rain Removal

Answer-Driven Visual State Estimator for Goal-Oriented Visual Dialogue

1 code implementation • 1 Oct 2020 • Zipeng Xu, Fangxiang Feng, Xiaojie Wang, Yushu Yang, Huixing Jiang, Zhongyuan Wang

In this paper, we propose an Answer-Driven Visual State Estimator (ADVSE) to impose the effects of different answers on visual states.

Question Generation Question-Generation +1

RaP: Redundancy-aware Video-language Pre-training for Text-Video Retrieval

1 code implementation • 13 Oct 2022 • Xing Wu, Chaochen Gao, Zijia Lin, Zhongyuan Wang, Jizhong Han, Songlin Hu

Sparse sampling is also likely to miss important frames corresponding to some text portions, resulting in textual redundancy.

Contrastive Learning Retrieval +1

Adaptive Unsupervised Self-training for Disfluency Detection

1 code implementation • COLING 2022 • Zhongyuan Wang, YiXuan Wang, Shaolei Wang, Wanxiang Che

Supervised methods have achieved remarkable results in disfluency detection.

Selection bias

Code-Style In-Context Learning for Knowledge-Based Question Answering

1 code implementation • 9 Sep 2023 • Zhijie Nie, Richong Zhang, Zhongyuan Wang, Xudong Liu

Current methods for Knowledge-Based Question Answering (KBQA) usually rely on complex training techniques and model frameworks, leading to many limitations in practical applications.

Code Generation In-Context Learning +2

Contrastive Learning of Semantic and Visual Representations for Text Tracking

1 code implementation • 30 Dec 2021 • Zhuang Li, Weijia Wu, Mike Zheng Shou, Jiahong Li, Size Li, Zhongyuan Wang, Hong Zhou

Semantic representation is of great benefit to the video text tracking (VTT) task that requires simultaneously classifying, detecting, and tracking texts in the video.

Contrastive Learning

Real-time End-to-End Video Text Spotter with Contrastive Representation Learning

1 code implementation • 18 Jul 2022 • Wejia Wu, Zhuang Li, Jiahong Li, Chunhua Shen, Hong Zhou, Size Li, Zhongyuan Wang, Ping Luo

Our contributions are three-fold: 1) CoText simultaneously addresses the three tasks (e.g., text detection, tracking, recognition) in a real-time end-to-end trainable framework.

Contrastive Learning Representation Learning +2

Augmentation-Aware Self-Supervision for Data-Efficient GAN Training

1 code implementation • NeurIPS 2023 • Liang Hou, Qi Cao, Yige Yuan, Songtao Zhao, Chongyang Ma, Siyuan Pan, Pengfei Wan, Zhongyuan Wang, HuaWei Shen, Xueqi Cheng

Training generative adversarial networks (GANs) with limited data is challenging because the discriminator is prone to overfitting.

Data Augmentation Representation Learning

Syntactic Parsing of Web Queries

no code implementations • EMNLP 2016 • Xiangyan Sun, Haixun Wang, Yanghua Xiao, Zhongyuan Wang

Probabilistic Prototype Model for Serendipitous Property Mining

no code implementations • COLING 2016 • Taesung Lee, Seung-won Hwang, Zhongyuan Wang

Besides providing the relevant information, amusing users has been an important role of the web.

Question Generation

Earlier Attention? Aspect-Aware LSTM for Aspect-Based Sentiment Analysis

no code implementations • 19 May 2019 • Bowen Xing, Lejian Liao, Dandan song, Jingang Wang, Fuzheng Zhang, Zhongyuan Wang, He-Yan Huang

This paper proposes a novel variant of LSTM, termed as aspect-aware LSTM (AA-LSTM), which incorporates aspect information into LSTM cells in the context modeling stage before the attention mechanism.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA)

An End-to-End Network for Co-Saliency Detection in One Single Image

no code implementations • 25 Oct 2019 • Yuanhao Yue, Qin Zou, Hongkai Yu, Qian Wang, Zhongyuan Wang, Song Wang

Co-saliency detection within a single image is a common vision problem that has received little attention and has not yet been well addressed.

Clustering Co-Salient Object Detection +1

Leveraging Historical Interaction Data for Improving Conversational Recommender System

no code implementations • 19 Aug 2020 • Kun Zhou, Wayne Xin Zhao, Hui Wang, Sirui Wang, Fuzheng Zhang, Zhongyuan Wang, Ji-Rong Wen

Most of the existing CRS methods focus on learning effective preference representations for users from conversation data alone.

Attribute Recommendation Systems

Query-aware Tip Generation for Vertical Search

no code implementations • 19 Oct 2020 • Yang Yang, Junmei Hao, Canjia Li, Zili Wang, Jingang Wang, Fuzheng Zhang, Rao Fu, Peixu Hou, Gong Zhang, Zhongyuan Wang

Existing work on tip generation does not take query into consideration, which limits the impact of tips in search scenarios.

Decision Making

Table Fact Verification with Structure-Aware Transformer

no code implementations • EMNLP 2020 • Hongzhi Zhang, Yingyao Wang, Sirui Wang, Xuezhi Cao, Fuzheng Zhang, Zhongyuan Wang

Verifying facts on semi-structured evidence like tables requires the ability to encode structural information and perform symbolic reasoning.

Fact Verification

Converse, Focus and Guess -- Towards Multi-Document Driven Dialogue

1 code implementation • 4 Feb 2021 • Han Liu, Caixia Yuan, Xiaojie Wang, Yushu Yang, Huixing Jiang, Zhongyuan Wang

We propose a novel task, Multi-Document Driven Dialogue (MD3), in which an agent can guess the target document that the user is interested in by leading a dialogue.

Attribute

Metric Learning for Anti-Compression Facial Forgery Detection

no code implementations • 15 Mar 2021 • Shenhao Cao, Qin Zou, Xiuqing Mao, Zhongyuan Wang

Detecting facial forgery images and videos is an increasingly important topic in multimedia forensics.

Metric Learning

Frequency-aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection

no code implementations • CVPR 2021 • Jiaming Li, Hongtao Xie, Jiahong Li, Zhongyuan Wang, Yongdong Zhang

Face forgery detection is raising ever-increasing interest in computer vision since facial manipulation technologies cause serious worries.

Omniscient Video Super-Resolution

no code implementations • ICCV 2021 • Peng Yi, Zhongyuan Wang, Kui Jiang, Junjun Jiang, Tao Lu, Xin Tian, Jiayi Ma

Most recent video super-resolution (SR) methods either adopt an iterative manner to deal with low-resolution (LR) frames from a temporally sliding window, or leverage the previously estimated SR output to help reconstruct the current frame recurrently.

Ranked #5 on Video Super-Resolution on Vid4 - 4x upscaling - BD degradation

Video Super-Resolution

HiT: Hierarchical Transformer with Momentum Contrast for Video-Text Retrieval

no code implementations • ICCV 2021 • Song Liu, Haoqi Fan, Shengsheng Qian, Yiru Chen, Wenkui Ding, Zhongyuan Wang

Video-Text Retrieval has been a hot research topic with the growth of multimedia data on the internet.

Retrieval Text Retrieval +1

Combining ResNet and Transformer for Chinese Grammatical Error Diagnosis

no code implementations • 1 Oct 2020 • Shaolei Wang, Baoxin Wang, Jiefu Gong, Zhongyuan Wang, Xiao Hu, Xingyi Duan, Zizhuo Shen, Gang Yue, Ruiji Fu, Dayong Wu, Wanxiang Che, Shijin Wang, Guoping Hu, Ting Liu

Grammatical error diagnosis is an important task in natural language processing.

Position

Learn with Noisy Data via Unsupervised Loss Correction for Weakly Supervised Reading Comprehension

no code implementations • COLING 2020 • Xuemiao Zhang, Kun Zhou, Sirui Wang, Fuzheng Zhang, Zhongyuan Wang, Junfei Liu

The weakly supervised machine reading comprehension (MRC) task is practical and promising because its training data are massive and easily available, but it inevitably introduces noise.

Machine Reading Comprehension

Syntactic Graph Convolutional Network for Spoken Language Understanding

no code implementations • COLING 2020 • Keqing He, Shuyu Lei, Yushu Yang, Huixing Jiang, Zhongyuan Wang

Slot filling and intent detection are two major tasks for spoken language understanding.

Intent Detection Multi-Task Learning +3

TANet: A new Paradigm for Global Face Super-resolution via Transformer-CNN Aggregation Network

no code implementations • 16 Sep 2021 • Yuanzhi Wang, Tao Lu, Yanduo Zhang, Junjun Jiang, JiaMing Wang, Zhongyuan Wang, Jiayi Ma

Recently, face super-resolution (FSR) methods either feed the whole face image into convolutional neural networks (CNNs) or utilize extra facial priors (e.g., facial parsing maps, facial landmarks) to focus on facial structure, thereby maintaining the consistency of the facial structure while restoring facial details.

Face Reconstruction Super-Resolution

whu-nercms at trecvid2021:instance search task

no code implementations • 30 Oct 2021 • Yanrui Niu, Jingyao Yang, Ankang Lu, Baojin Huang, Yue Zhang, Ji Huang, Shishi Wen, Dongshu Xu, Chao Liang, Zhongyuan Wang, Jun Chen

In this paper, we briefly introduce the experimental methods and results of WHU-NERCMS at TRECVID2021.

Action Detection Face Detection +5

ITTR: Unpaired Image-to-Image Translation with Transformers

no code implementations • 30 Mar 2022 • Wanfeng Zheng, Qiang Li, Guoxin Zhang, Pengfei Wan, Zhongyuan Wang

Unpaired image-to-image translation is to translate an image from a source domain to a target domain without paired training data.

Image-to-Image Translation Translation

Diagnosing Ensemble Few-Shot Classifiers

no code implementations • 9 Jun 2022 • Weikai Yang, Xi Ye, Xingxing Zhang, Lanxi Xiao, Jiazhi Xia, Zhongyuan Wang, Jun Zhu, Hanspeter Pfister, Shixia Liu

The base learners and labeled samples (shots) in an ensemble few-shot classifier greatly affect the model performance.

Deepfake Face Traceability with Disentangling Reversing Network

no code implementations • 8 Jul 2022 • Jiaxin Ai, Zhongyuan Wang, Baojin Huang, Zhen Han

Deepfake faces not only violate the privacy of personal identity, but also confuse the public and cause huge social harm.

DeepFake Detection Face Swapping

TokenFlow: Rethinking Fine-grained Cross-modal Alignment in Vision-Language Retrieval

no code implementations • 28 Sep 2022 • Xiaohan Zou, Changqiao Wu, Lele Cheng, Zhongyuan Wang

Most existing methods in vision-language retrieval match two modalities by either comparing their global feature vectors which misses sufficient information and lacks interpretability, detecting objects in images or videos and aligning the text with fine-grained features which relies on complicated model designs, or modeling fine-grained interaction via cross-attention upon visual and textual tokens which suffers from inferior efficiency.

Retrieval Text Retrieval +1

Bridging CLIP and StyleGAN through Latent Alignment for Image Editing

no code implementations • 10 Oct 2022 • Wanfeng Zheng, Qiang Li, Xiaoyan Guo, Pengfei Wan, Zhongyuan Wang

More specifically, our efforts consist of three parts: 1) a data-free training strategy to train latent mappers to bridge the latent space of CLIP and StyleGAN; 2) for more precise mapping, temporal relative consistency is proposed to address the knowledge distribution bias problem among different latent spaces; 3) to refine the mapped latent in s space, adaptive style mixing is also proposed.

Image Manipulation Language Modelling +1

Back-Translation-Style Data Augmentation for Mandarin Chinese Polyphone Disambiguation

no code implementations • 17 Nov 2022 • Chunyu Qiang, Peng Yang, Hao Che, Jinba Xiao, Xiaorui Wang, Zhongyuan Wang

In this paper, we propose a simple back-translation-style data augmentation method for Mandarin Chinese polyphone disambiguation, utilizing a large amount of unlabeled text data.

Data Augmentation Machine Translation +3

A Unified Model for Video Understanding and Knowledge Embedding with Heterogeneous Knowledge Graph Dataset

no code implementations • 19 Nov 2022 • Jiaxin Deng, Dong Shen, Haojie Pan, Xiangyu Wu, Ximan Liu, Gaofeng Meng, Fan Yang, Size Li, Ruiji Fu, Zhongyuan Wang

Furthermore, based on this dataset, we propose an end-to-end model that jointly optimizes the video understanding objective with knowledge graph embedding, which can not only better inject factual knowledge into video understanding but also generate effective multi-modal entity embedding for KG.

Common Sense Reasoning Knowledge Graph Embedding +4

A Scale-Arbitrary Image Super-Resolution Network Using Frequency-domain Information

no code implementations • 8 Dec 2022 • Jing Fang, Yinbo Yu, Zhongyuan Wang, Xin Ding, Ruimin Hu

Image super-resolution (SR) is a technique to recover lost high-frequency information in low-resolution (LR) images.

Image Super-Resolution valid

Style-Label-Free: Cross-Speaker Style Transfer by Quantized VAE and Speaker-wise Normalization in Speech Synthesis

no code implementations • 13 Dec 2022 • Chunyu Qiang, Peng Yang, Hao Che, Xiaorui Wang, Zhongyuan Wang

In order to improve the style extraction ability of the reference encoder, a style invariant and contrastive data augmentation method is proposed.

Data Augmentation Speech Synthesis +1

Improving Prosody for Cross-Speaker Style Transfer by Semi-Supervised Style Extractor and Hierarchical Modeling in Speech Synthesis

no code implementations • 14 Mar 2023 • Chunyu Qiang, Peng Yang, Hao Che, Ying Zhang, Xiaorui Wang, Zhongyuan Wang

Cross-speaker style transfer in speech synthesis aims at transferring a style from a source speaker to synthesized speech with a target speaker's timbre.

Prosody Prediction Speech Synthesis +1

Ranking Aggregation with Interactive Feedback for Collaborative Person Re-identification

1 code implementation • The 33rd British Machine Vision Conference 2022 • Ji Huang, Chao Liang, Yue Zhang, Zhongyuan Wang, Chunjie Zhang

Existing RA work can be generally divided into unsupervised methods and fully-supervised methods.

Person Re-Identification Re-Ranking +1

Implicit Identity Driven Deepfake Face Swapping Detection

no code implementations • CVPR 2023 • Baojin Huang, Zhongyuan Wang, Jifan Yang, Jiaxin Ai, Qin Zou, Qian Wang, Dengpan Ye

Face swapping aims to replace the target face with the source face and generate a fake face that humans cannot distinguish from a real one.

Face Swapping

LSTFE-Net: Long Short-Term Feature Enhancement Network for Video Small Object Detection

no code implementations • CVPR 2023 • Jinsheng Xiao, Yuanxu Wu, Yunhua Chen, Shurui Wang, Zhongyuan Wang, Jiayi Ma

We find that context information from the long-term frame and temporal information from the short-term frame are two useful cues for video small object detection.

Object object-detection +1

Towards Practical Capture of High-Fidelity Relightable Avatars

no code implementations • 8 Sep 2023 • Haotian Yang, Mingwu Zheng, Wanquan Feng, Haibin Huang, Yu-Kun Lai, Pengfei Wan, Zhongyuan Wang, Chongyang Ma

Specifically, TRAvatar is trained with dynamic image sequences captured in a Light Stage under varying lighting conditions, enabling realistic relighting and real-time animation for avatars in diverse scenes.

KwaiYiiMath: Technical Report

no code implementations • 11 Oct 2023 • Jiayi Fu, Lei Lin, Xiaoyang Gao, Pengli Liu, Zhengzong Chen, Zhirui Yang, ShengNan Zhang, Xue Zheng, Yan Li, Yuliang Liu, Xucheng Ye, Yiqiao Liao, Chao Liao, Bin Chen, Chengru Song, Junchen Wan, Zijia Lin, Fuzheng Zhang, Zhongyuan Wang, Di Zhang, Kun Gai

Recent advancements in large language models (LLMs) have demonstrated remarkable abilities in handling a variety of natural language processing (NLP) downstream tasks, even on mathematical tasks requiring multi-step reasoning.

Ranked #87 on Arithmetic Reasoning on GSM8K (using extra training data)

Arithmetic Reasoning GSM8K +1

Parrot: Enhancing Multi-Turn Chat Models by Learning to Ask Questions

no code implementations • 11 Oct 2023 • Yuchong Sun, Che Liu, Jinwen Huang, Ruihua Song, Fuzheng Zhang, Di Zhang, Zhongyuan Wang, Kun Gai

In this paper, we address these challenges by introducing Parrot, a highly scalable solution designed to automatically generate high-quality instruction-tuning data, which are then used to enhance the effectiveness of chat models in multi-turn conversations.

Attribute Instruction Following

HGCVAE: Integrating Generative and Contrastive Learning for Heterogeneous Graph Learning

no code implementations • 17 Oct 2023 • Yulan Hu, Zhirui Yang, Sheng Ouyang, Junchen Wan, Fuzheng Zhang, Zhongyuan Wang, Yong liu

In this study, we aim to explore the problem of generative SSL in the context of heterogeneous graph learning (HGL).

Attribute Contrastive Learning +3

Graph Ranking Contrastive Learning: A Extremely Simple yet Efficient Method

no code implementations • 23 Oct 2023 • Yulan Hu, Sheng Ouyang, Jingyu Liu, Ge Chen, Zhirui Yang, Junchen Wan, Fuzheng Zhang, Zhongyuan Wang, Yong liu

Thus, we propose GraphRank, a simple yet efficient graph contrastive learning method that addresses the problem of false negative samples by redefining the concept of negative samples to a certain extent, thereby avoiding the issue of false negative samples.

Contrastive Learning Graph Learning +1

Improving Vision-and-Language Reasoning via Spatial Relations Modeling

no code implementations • 9 Nov 2023 • Cheng Yang, Rui Xu, Ye Guo, Peixiang Huang, Yiru Chen, Wenkui Ding, Zhongyuan Wang, Hong Zhou

Further, we design two pre-training tasks named object position regression (OPR) and spatial relation classification (SRC) to learn to reconstruct the spatial relation graph respectively.

Position regression Relation +3

Ask One More Time: Self-Agreement Improves Reasoning of Language Models in (Almost) All Scenarios

no code implementations • 14 Nov 2023 • Lei Lin, Jiayi Fu, Pengli Liu, Qingyang Li, Yan Gong, Junchen Wan, Fuzheng Zhang, Zhongyuan Wang, Di Zhang, Kun Gai

Although chain-of-thought (CoT) prompting combined with language models has achieved encouraging results on complex reasoning tasks, the naive greedy decoding used in CoT prompting usually causes repetitiveness and local optimality.

Language Modelling

Temporal-Aware Refinement for Video-based Human Pose and Shape Recovery

no code implementations • 16 Nov 2023 • Ming Chen, Yan Zhou, Weihua Jian, Pengfei Wan, Zhongyuan Wang

Though significant progress in human pose and shape recovery from monocular RGB images has been made in recent years, obtaining 3D human motion with high accuracy and temporal consistency from videos remains challenging.

TAR

Not all Layers of LLMs are Necessary during Inference

no code implementations • 4 Mar 2024 • Siqi Fan, Xin Jiang, Xiang Li, Xuying Meng, Peng Han, Shuo Shang, Aixin Sun, Yequan Wang, Zhongyuan Wang

To answer this question, we first indicate that Not all Layers are Necessary during Inference by statistically analyzing the activated layers across tasks.

In-Context Learning
