Search Results for author: Wenhao Jiang

Found 26 papers, 13 papers with code

Learning to Guide Decoding for Image Captioning

no code implementations • 3 Apr 2018 • Wenhao Jiang, Lin Ma, Xinpeng Chen, Hanwang Zhang, Wei Liu

Recently, much progress has been made in image captioning, and an encoder-decoder framework has achieved outstanding performance for this task.

Attribute Decoder +1

Theoretic Analysis and Extremely Easy Algorithms for Domain Adaptive Feature Learning

no code implementations • 5 Sep 2015 • Wenhao Jiang, Cheng Deng, Wei Liu, Feiping Nie, Fu-Lai Chung, Heng Huang

Domain adaptation problems arise in a variety of applications, where a training dataset from the source domain and a test dataset from the target domain typically follow different distributions.

Domain Adaptation

Recurrent Fusion Network for Image Captioning

no code implementations • ECCV 2018 • Wenhao Jiang, Lin Ma, Yu-Gang Jiang, Wei Liu, Tong Zhang

In this paper, in order to exploit the complementary information from multiple encoders, we propose a novel Recurrent Fusion Network (RFNet) for tackling image captioning.

Decoder Image Captioning

Real-Time Neural Style Transfer for Videos

no code implementations • CVPR 2017 • Hao-Zhi Huang, Hao Wang, Wenhan Luo, Lin Ma, Wenhao Jiang, Xiaolong Zhu, Zhifeng Li, Wei Liu

More specifically, a hybrid loss is proposed to capitalize on the content information of input frames, the style information of a given style image, and the temporal information of consecutive frames.

Style Transfer Video Style Transfer

Hierarchical Photo-Scene Encoder for Album Storytelling

no code implementations • 2 Feb 2019 • Bairui Wang, Lin Ma, Wei Zhang, Wenhao Jiang, Feng Zhang

In this paper, we propose a novel model with a hierarchical photo-scene encoder and a reconstructor for the task of album storytelling.

Ranked #5 on Image-guided Story Ending Generation on VIST-E

Decoder Image-guided Story Ending Generation

Respiratory Motion Correction in Abdominal MRI using a Densely Connected U-Net with GAN-guided Training

no code implementations • 24 Jun 2019 • Wenhao Jiang, Zhiyu Liu, Kit-Hang Lee, Shihui Chen, Yui-Lun Ng, Qi Dou, Hing-Chiu Chang, Ka-Wai Kwok

Abdominal magnetic resonance imaging (MRI) provides a straightforward way of characterizing tissue and locating lesions of patients as in standard diagnosis.

Generative Adversarial Network

Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos

no code implementations • ECCV 2020 • Shaoxiang Chen, Wenhao Jiang, Wei Liu, Yu-Gang Jiang

Inspired by the fact that there exist cross-modal interactions in the human brain, we propose a novel method for learning pairwise modality interactions in order to better exploit complementary information for each pair of modalities in videos and thus improve performances on both tasks.

Sentence

Poisoning MorphNet for Clean-Label Backdoor Attack to Point Clouds

no code implementations • 11 May 2021 • Guiyu Tian, Wenhao Jiang, Wei Liu, Yadong Mu

To this end, MorphNet jointly optimizes two objectives for sample-adaptive poisoning: a reconstruction loss that preserves the visual similarity between benign and poisoned point clouds, and a classification loss that pushes a modern point cloud recognition model to misclassify the poisoned sample into a pre-specified target category.

Backdoor Attack Denoising

Can Decentralized Stochastic Minimax Optimization Algorithms Converge Linearly for Finite-Sum Nonconvex-Nonconcave Problems?

no code implementations • 24 Apr 2023 • Yihan Zhang, Wenhao Jiang, Feng Zheng, Chiu C. Tan, Xinghua Shi, Hongchang Gao

This motivates us to study decentralized minimax optimization algorithms for the nonconvex-nonconcave problem.

Prefix-Tuning Based Unsupervised Text Style Transfer

no code implementations • 23 Oct 2023 • Huiyu Mai, Wenhao Jiang, Zhihong Deng

Unsupervised text style transfer aims at training a generative model that can alter the style of the input sentence while preserving its content without using any parallel data.

Sentence Style Transfer +2

RigLSTM: Recurrent Independent Grid LSTM for Generalizable Sequence Learning

no code implementations • 3 Nov 2023 • Ziyu Wang, Wenhao Jiang, Zixuan Zhang, Wei Tang, Junchi Yan

Sequential processes in the real world often comprise a combination of simple subsystems that interact with each other in certain forms.

feature selection

Mitigating Catastrophic Forgetting in Multi-domain Chinese Spelling Correction by Multi-stage Knowledge Transfer Framework

no code implementations • 18 Feb 2024 • Peng Xing, Yinghui Li, Shirong Ma, Xinnian Liang, Haojing Huang, Yangning Li, Hai-Tao Zheng, Wenhao Jiang, Ying Shen

Chinese Spelling Correction (CSC) aims to detect and correct spelling errors in given sentences.

Continual Learning Spelling Correction +1

Rethinking the Roles of Large Language Models in Chinese Grammatical Error Correction

no code implementations • 18 Feb 2024 • Yinghui Li, Shang Qin, Jingheng Ye, Shirong Ma, Yangning Li, Libo Qin, Xuming Hu, Wenhao Jiang, Hai-Tao Zheng, Philip S. Yu

To promote the CGEC field to better adapt to the era of LLMs, we rethink the roles of LLMs in the CGEC task so that they can be better utilized and explored in CGEC.

Grammatical Error Correction

Few-Shot Class-Incremental Learning with Prior Knowledge

1 code implementation • 2 Feb 2024 • Wenhao Jiang, Duo Li, Menghan Hu, Guangtao Zhai, Xiaokang Yang, Xiao-Ping Zhang

To tackle the issues of catastrophic forgetting and overfitting in few-shot class-incremental learning (FSCIL), previous work has primarily concentrated on preserving the memory of old knowledge during the incremental phase.

Few-Shot Class-Incremental Learning Incremental Learning

UltraWiki: Ultra-fine-grained Entity Set Expansion with Negative Seed Entities

1 code implementation • 7 Mar 2024 • Yangning Li, Qingsong Lv, Tianyu Yu, Yinghui Li, Shulin Huang, Tingwei Lu, Xuming Hu, Wenhao Jiang, Hai-Tao Zheng, Hui Wang

To solve this issue, we first introduce negative seed entities in the inputs, which belong to the same fine-grained semantic class as the positive seed entities but differ in certain attributes.

Attribute Contrastive Learning +1

MRI Reconstruction Using Deep Bayesian Estimation

1 code implementation • 3 Sep 2019 • GuanXiong Luo, Na Zhao, Wenhao Jiang, Edward S. Hui, Peng Cao

Purpose: To develop a deep learning-based Bayesian inference for MRI reconstruction.

MRI Reconstruction

VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix

1 code implementation • 17 Jun 2022 • Teng Wang, Wenhao Jiang, Zhichao Lu, Feng Zheng, Ran Cheng, Chengguo Yin, Ping Luo

Existing vision-language pre-training (VLP) methods primarily rely on paired image-text datasets, which are either annotated with enormous human labor or crawled from the internet and then cleaned with elaborate data cleaning techniques.

Contrastive Learning Data Augmentation +2

Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos

1 code implementation • 11 Mar 2023 • Teng Wang, Jinrui Zhang, Feng Zheng, Wenhao Jiang, Ran Cheng, Ping Luo

Our framework is easily extensible to tasks covering visually-grounded language understanding and generation.

Ranked #1 on Natural Language Moment Retrieval on ActivityNet Captions

Dense Video Captioning Natural Language Moment Retrieval +2

SWEM: Towards Real-Time Video Object Segmentation with Sequential Weighted Expectation-Maximization

1 code implementation • CVPR 2022 • Zhihui Lin, Tianyu Yang, Maomao Li, Ziyu Wang, Chun Yuan, Wenhao Jiang, Wei Liu

Matching-based methods, especially those based on space-time memory, are significantly ahead of other solutions in semi-supervised video object segmentation (VOS).

Ranked #6 on Semi-Supervised Video Object Segmentation on DAVIS (no YouTube-VOS training)

Semantic Segmentation Semi-Supervised Video Object Segmentation +1

Temporally Grounding Language Queries in Videos by Contextual Boundary-aware Prediction

1 code implementation • 11 Sep 2019 • Jingwen Wang, Lin Ma, Wenhao Jiang

The task of temporally grounding language queries in videos is to temporally localize the best matched video segment corresponding to a given language (sentence).

Sentence

Controllable Video Captioning with POS Sequence Guidance Based on Gated Fusion Network

1 code implementation • ICCV 2019 • Bairui Wang, Lin Ma, Wei Zhang, Wenhao Jiang, Jingwen Wang, Wei Liu

In this paper, we propose to guide the video caption generation with Part-of-Speech (POS) information, based on a gated fusion of multiple representations of input videos.

Caption Generation Decoder +3

Regularizing RNNs for Caption Generation by Reconstructing The Past with The Present

1 code implementation • CVPR 2018 • Xinpeng Chen, Lin Ma, Wenhao Jiang, Jian Yao, Wei Liu

Recently, caption generation with an encoder-decoder framework has been extensively studied and applied in different domains, such as image captioning, code captioning, and so on.

Caption Generation Decoder +1

VideoMoCo: Contrastive Video Representation Learning with Temporally Adversarial Examples

1 code implementation • CVPR 2021 • Tian Pan, Yibing Song, Tianyu Yang, Wenhao Jiang, Wei Liu

By empowering the temporal robustness of the encoder and modeling the temporal decay of the keys, our VideoMoCo improves MoCo temporally based on contrastive learning.

Ranked #76 on Action Recognition on HMDB-51

Action Recognition Contrastive Learning +1

Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning

1 code implementation • CVPR 2018 • Jingwen Wang, Wenhao Jiang, Lin Ma, Wei Liu, Yong Xu

We propose a bidirectional proposal method that effectively exploits both past and future contexts to make proposal predictions.

Decoder Dense Video Captioning

DynaMixer: A Vision MLP Architecture with Dynamic Mixing

2 code implementations • 28 Jan 2022 • Ziyu Wang, Wenhao Jiang, Yiming Zhu, Li Yuan, Yibing Song, Wei Liu

In contrast with vision transformers and CNNs, the success of MLP-like models shows that simple information fusion operations among tokens and channels can yield a good representation power for deep recognition models.

Image Classification

LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

4 code implementations • 3 Oct 2023 • Bin Zhu, Bin Lin, Munan Ning, Yang Yan, Jiaxi Cui, Hongfa Wang, Yatian Pang, Wenhao Jiang, Junwu Zhang, Zongwei Li, Wancai Zhang, Zhifeng Li, Wei Liu, Li Yuan

We thus propose VIDAL-10M, which comprises Video, Infrared, Depth, Audio and their corresponding Language.

Ranked #1 on Zero-shot Audio Classification on VGG-Sound (using extra training data)

Audio Classification Contrastive Learning +11
