Search Results for author: Minheng Ni

Found 13 papers, 6 papers with code

M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-training

1 code implementation CVPR 2021 Minheng Ni, Haoyang Huang, Lin Su, Edward Cui, Taroon Bharti, Lijuan Wang, Jianfeng Gao, Dongdong Zhang, Nan Duan

We present M3P, a Multitask Multilingual Multimodal Pre-trained model that combines multilingual pre-training and multimodal pre-training into a unified framework via multitask pre-training.

Image Captioning Image Retrieval +4

CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot Cross-Lingual NLP

1 code implementation 11 Jun 2020 Libo Qin, Minheng Ni, Yue Zhang, Wanxiang Che

Compared with the existing work, our method does not rely on bilingual sentences for training, and requires only one training process for multiple target languages.

Data Augmentation

Co-GAT: A Co-Interactive Graph Attention Network for Joint Dialog Act Recognition and Sentiment Classification

1 code implementation 24 Dec 2020 Libo Qin, Zhouyang Li, Wanxiang Che, Minheng Ni, Ting Liu

The dialog context information (contextual information) and the mutual interaction information are two key factors that contribute to the two related tasks.

Graph Attention Sentiment Analysis +1

ImaginaryNet: Learning Object Detectors without Real Images and Annotations

1 code implementation 13 Oct 2022 Minheng Ni, Zitong Huang, Kailai Feng, WangMeng Zuo

Given a class label, the language model is used to generate a full description of a scene with a target object, and the text-to-image model is deployed to generate a photo-realistic image.

Image Generation Language Modelling +3

ORES: Open-vocabulary Responsible Visual Synthesis

1 code implementation 26 Aug 2023 Minheng Ni, Chenfei Wu, Xiaodong Wang, Shengming Yin, Lijuan Wang, Zicheng Liu, Nan Duan

In this work, we formalize a new task, Open-vocabulary Responsible Visual Synthesis (ORES), where the synthesis model is able to avoid forbidden visual concepts while allowing users to input any desired content.

Image Generation Language Modelling

Multi-Domain Spoken Language Understanding Using Domain- and Task-Aware Parameterization

no code implementations 30 Apr 2020 Libo Qin, Minheng Ni, Yue Zhang, Wanxiang Che, Yangming Li, Ting Liu

Spoken language understanding has been addressed as a supervised learning problem, where a set of training data is available for each domain.

Spoken Language Understanding

DCR-Net: A Deep Co-Interactive Relation Network for Joint Dialog Act Recognition and Sentiment Classification

no code implementations 16 Aug 2020 Libo Qin, Wanxiang Che, Yangming Li, Minheng Ni, Ting Liu

In dialog systems, dialog act recognition and sentiment classification are two correlated tasks for capturing speakers' intentions, where dialog act and sentiment indicate the explicit and the implicit intentions, respectively.

Relation Relation Network +2

NÜWA-LIP: Language Guided Image Inpainting with Defect-free VQGAN

no code implementations 10 Feb 2022 Minheng Ni, Chenfei Wu, Haoyang Huang, Daxin Jiang, WangMeng Zuo, Nan Duan

Language guided image inpainting aims to fill in the defective regions of an image under the guidance of text while keeping non-defective regions unchanged.

Image Inpainting

NUWA-LIP: Language-Guided Image Inpainting With Defect-Free VQGAN

no code implementations CVPR 2023 Minheng Ni, Xiaoming Li, WangMeng Zuo

Language-guided image inpainting aims to fill the defective regions of an image under the guidance of text while keeping the non-defective regions unchanged.

Image Inpainting

Ref-Diff: Zero-shot Referring Image Segmentation with Generative Models

no code implementations 31 Aug 2023 Minheng Ni, Yabo Zhang, Kailai Feng, Xiaoming Li, Yiwen Guo, WangMeng Zuo

In this work, we introduce a novel Referring Diffusional segmentor (Ref-Diff) for this task, which leverages the fine-grained multi-modal information from generative models.

Image Segmentation Instance Segmentation +2

Responsible Visual Editing

1 code implementation 8 Apr 2024 Minheng Ni, Yeli Shen, Lei Zhang, WangMeng Zuo

To mitigate the negative implications of harmful images on research, we create a transparent and public dataset, AltBear, which expresses harmful information using teddy bears instead of humans.
