Search Results for author: Weizhi Wang

Found 14 papers, 7 papers with code

MM-Diff: High-Fidelity Image Personalization via Multi-Modal Condition Integration

no code implementations · 22 Mar 2024 · Zhichao Wei, Qingkun Su, Long Qin, Weizhi Wang

CLS embeddings are used, on the one hand, to augment the text embeddings and, on the other, together with patch embeddings to derive a small number of detail-rich subject embeddings; both are efficiently integrated into the diffusion model through a well-designed multimodal cross-attention mechanism.

Image Generation
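The snippet above describes injecting two condition streams through multimodal cross-attention. A minimal sketch of how diffusion latents might attend jointly over concatenated text and subject embeddings follows; all names, shapes, and the single-head formulation are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multimodal_cross_attention(latents, text_emb, subject_emb):
    """Illustrative only: latent tokens (queries) attend over the
    concatenation of text embeddings and detail-rich subject
    embeddings (keys/values), mixing both condition streams."""
    context = np.concatenate([text_emb, subject_emb], axis=0)  # (Lt + Ls, d)
    d = latents.shape[-1]
    scores = latents @ context.T / np.sqrt(d)                  # (N, Lt + Ls)
    weights = softmax(scores, axis=-1)                         # rows sum to 1
    return weights @ context                                   # (N, d)

rng = np.random.default_rng(0)
out = multimodal_cross_attention(rng.normal(size=(16, 8)),   # 16 latent tokens
                                 rng.normal(size=(10, 8)),   # 10 text tokens
                                 rng.normal(size=(4, 8)))    # 4 subject tokens
print(out.shape)  # → (16, 8)
```

Because both streams share one attention map, each latent token can trade off global text guidance against fine subject detail in a single pass.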

EffiVED: Efficient Video Editing via Text-instruction Diffusion Models

no code implementations · 18 Mar 2024 · Zhenghao Zhang, Zuozhuo Dai, Long Qin, Weizhi Wang

Large-scale text-to-video models have shown remarkable abilities, but their direct application in video editing remains challenging due to limited available datasets.

Video Editing

Finetuned Multimodal Language Models Are High-Quality Image-Text Data Filters

no code implementations · 5 Mar 2024 · Weizhi Wang, Khalil Mrini, Linjie Yang, Sateesh Kumar, Yu Tian, Xifeng Yan, Heng Wang

Our MLM filter can generalize to different models and tasks, and be used as a drop-in replacement for CLIPScore.

GPT-4V(ision) as a Generalist Evaluator for Vision-Language Tasks

no code implementations · 2 Nov 2023 · Xinlu Zhang, Yujie Lu, Weizhi Wang, An Yan, Jun Yan, Lianke Qin, Heng Wang, Xifeng Yan, William Yang Wang, Linda Ruth Petzold

Automatically evaluating vision-language tasks is challenging, especially when it comes to reflecting human judgments due to limitations in accounting for fine-grained details.

Image Generation

Augmenting Language Models with Long-Term Memory

no code implementations · NeurIPS 2023 · Weizhi Wang, Li Dong, Hao Cheng, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei

Such a decoupled memory design can easily cache and update long-term past contexts for memory retrieval without suffering from memory staleness.

In-Context Learning · Language Modelling +1
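The decoupled memory described above (a cache of long-term past context that can be updated without touching the frozen LM) can be sketched roughly as a fixed-capacity key/value store with nearest-neighbor retrieval. Class and method names here are illustrative assumptions, not the paper's API:

```python
import numpy as np

class DecoupledMemory:
    """Illustrative sketch: cache past-context (key, value) pairs
    outside the language model; oldest entries are evicted first,
    so retrieval is cheap to keep fresh without retraining."""
    def __init__(self, dim, capacity=1024):
        self.keys = np.empty((0, dim))
        self.values = np.empty((0, dim))
        self.capacity = capacity

    def cache(self, keys, values):
        # Append the newest context and drop the oldest beyond capacity.
        self.keys = np.concatenate([self.keys, keys])[-self.capacity:]
        self.values = np.concatenate([self.values, values])[-self.capacity:]

    def retrieve(self, query, k=4):
        # Inner-product similarity; return the k best-matching values.
        sims = self.keys @ query
        top = np.argsort(sims)[-k:][::-1]
        return self.values[top]

rng = np.random.default_rng(1)
mem = DecoupledMemory(dim=8, capacity=32)
mem.cache(rng.normal(size=(40, 8)), rng.normal(size=(40, 8)))  # 40 > capacity
hits = mem.retrieve(rng.normal(size=8), k=4)
print(mem.keys.shape, hits.shape)  # → (32, 8) (4, 8)
```

Keeping the store outside the model is what avoids memory staleness: updating the cache never requires recomputing or retraining the LM itself.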

STEPS: A Benchmark for Order Reasoning in Sequential Tasks

no code implementations · 7 Jun 2023 · Weizhi Wang, Hong Wang, Xifeng Yan

Therefore, to verify the order reasoning capability of current neural models in sequential tasks, we propose a challenging benchmark named STEPS.

In-Context Learning

Bot or Human? Detecting ChatGPT Imposters with A Single Question

1 code implementation · 10 May 2023 · Hong Wang, Xuan Luo, Weizhi Wang, Xifeng Yan

Large language models like ChatGPT have recently demonstrated impressive capabilities in natural language understanding and generation, enabling various applications including translation, essay writing, and chit-chatting.

Language Modelling · Large Language Model +2

Non-Parametric Domain Adaptation for End-to-End Speech Translation

1 code implementation · 23 May 2022 · Yichao Du, Weizhi Wang, Zhirui Zhang, Boxing Chen, Tong Xu, Jun Xie, Enhong Chen

End-to-End Speech Translation (E2E-ST) has received increasing attention due to its potential for less error propagation, lower latency, and fewer parameters.

Domain Adaptation · Translation

Visually-Augmented Language Modeling

1 code implementation · 20 May 2022 · Weizhi Wang, Li Dong, Hao Cheng, Haoyu Song, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei

With the visually-augmented context, VaLM uses a visual knowledge fusion layer to enable multimodal grounded language modeling by attending to both text context and visual knowledge in images.

Image Retrieval · Language Modelling +1

Regularizing End-to-End Speech Translation with Triangular Decomposition Agreement

1 code implementation · 21 Dec 2021 · Yichao Du, Zhirui Zhang, Weizhi Wang, Boxing Chen, Jun Xie, Tong Xu

In this paper, we attempt to model the joint probability of transcription and translation based on the speech input to directly leverage such triplet data.

Automatic Speech Recognition · Automatic Speech Recognition (ASR) +4
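The joint modeling above can be made concrete: for speech x, transcription z, and translation y, the joint p(z, y | x) factors along two triangular paths, and (per the title) an agreement term can penalize their disagreement. A toy numeric sketch under assumed log-probabilities, not the paper's training code:

```python
import math

# Two triangular factorizations of log p(z, y | x):
#   transcription-first: log p(z | x) + log p(y | x, z)
#   translation-first:   log p(y | x) + log p(z | x, y)
log_p_z_given_x = math.log(0.6)    # assumed model scores for one triplet
log_p_y_given_xz = math.log(0.5)
log_p_y_given_x = math.log(0.4)
log_p_z_given_xy = math.log(0.7)

asr_first = log_p_z_given_x + log_p_y_given_xz   # log(0.6 * 0.5) = log 0.30
mt_first = log_p_y_given_x + log_p_z_given_xy    # log(0.4 * 0.7) = log 0.28

# Agreement regularizer: both paths should assign the same joint probability,
# so their gap can be added to the training loss as a penalty.
agreement_penalty = abs(asr_first - mt_first)
print(round(asr_first, 4), round(mt_first, 4), round(agreement_penalty, 4))
```

With triplet data (x, z, y), both conditionals in each path are directly supervised, and the penalty ties the two decompositions to a single consistent joint.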

Rethinking Zero-shot Neural Machine Translation: From a Perspective of Latent Variables

1 code implementation Findings (EMNLP) 2021 Weizhi Wang, Zhirui Zhang, Yichao Du, Boxing Chen, Jun Xie, Weihua Luo

However, it usually suffers from capturing spurious correlations between the output language and language invariant semantics due to the maximum likelihood training objective, leading to poor transfer performance on zero-shot translation.

Denoising · Machine Translation +2

Task-Oriented Dialogue System as Natural Language Generation

1 code implementation · 31 Aug 2021 · Weizhi Wang, Zhirui Zhang, Junliang Guo, Yinpei Dai, Boxing Chen, Weihua Luo

In this paper, we propose to formulate the task-oriented dialogue system as a purely natural language generation task, so as to fully leverage large-scale pre-trained models like GPT-2 and simplify the complicated delexicalization preprocessing.

Text Generation · Transfer Learning

Clustering tweets using Wikipedia concepts

no code implementations · LREC 2014 · Guoyu Tang, Yunqing Xia, Weizhi Wang, Raymond Lau, Fang Zheng

We address the polysemy issue with a Bayesian model, and the synonymy issue by exploiting the Wikipedia redirections.

Clustering · Text Clustering
