Search Results for author: Ning Shi

Found 15 papers, 10 with code

Cross-Modal Consistency in Multimodal Large Language Models

no code implementations • 14 Nov 2024 • Xiang Zhang, Senyu Li, Ning Shi, Bradley Hauer, Zijun Wu, Grzegorz Kondrak, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan

Recent developments in multimodal methodologies have marked the beginning of an exciting era for models adept at processing diverse data types, encompassing text, audio, and visual content.

Image Captioning • Object Detection +1

MIO: A Foundation Model on Multimodal Tokens

1 code implementation • 26 Sep 2024 • Zekun Wang, King Zhu, Chunpu Xu, Wangchunshu Zhou, Jiaheng Liu, Yibo Zhang, Jiashuo Wang, Ning Shi, Siyu Li, Yizhi Li, Haoran Que, Zhaoxiang Zhang, Yuanxing Zhang, Ge Zhang, Ke Xu, Jie Fu, Wenhao Huang

In this paper, we introduce MIO, a novel foundation model built on multimodal tokens, capable of understanding and generating speech, text, images, and videos in an end-to-end, autoregressive manner.

model • Text Generation
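The phrase "multimodal tokens" above refers to mapping different modalities into one discrete vocabulary so a single autoregressive model can read and emit interleaved text, images, and other content. The sketch below is a toy illustration of that idea only; the vocabulary sizes, offset scheme, and function names are invented for the example and are not MIO's actual design.

```python
# Toy sketch of a shared discrete vocabulary for multimodal tokens:
# text-token ids and image "codebook" ids live in one id space, so an
# autoregressive model can consume and produce interleaved modalities.
# All sizes and offsets here are made up for illustration.

TEXT_VOCAB = 100           # hypothetical text vocabulary size
IMAGE_OFFSET = TEXT_VOCAB  # image codes are shifted past the text ids

def text_ids(ids):
    # Text ids already live in [0, TEXT_VOCAB).
    return list(ids)

def image_ids(codes):
    # Shift image-quantizer codes into the shared id space.
    return [IMAGE_OFFSET + c for c in codes]

# One interleaved sequence: some text, an image patch, then more text.
sequence = text_ids([5, 17]) + image_ids([3, 8]) + text_ids([42])
print(sequence)
# → [5, 17, 103, 108, 42]
```

A decoder then routes each generated id back to the right detokenizer by checking which range it falls in.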

Action Controlled Paraphrasing

1 code implementation • 18 May 2024 • Ning Shi, Zijun Wu

To address the inference gap, we introduce an optional action token as a placeholder that encourages the model to determine the appropriate action independently when users' intended actions are not provided.

Paraphrase Generation • Representation Learning
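The "optional action token" idea above can be sketched as follows. This is a minimal illustration under assumed conventions: the placeholder name `[ACT]` and the action-prefixing input format are hypothetical, not the paper's exact scheme.

```python
# Hypothetical sketch of an optional action token: when the user supplies
# per-token actions, they condition the input; when absent, a placeholder
# token stands in so the model can decide the actions itself at inference.

ACTION_PLACEHOLDER = "[ACT]"  # hypothetical special-token name

def build_input(source_tokens, actions=None):
    """Prefix each source token with its action, or with the placeholder."""
    if actions is None:
        actions = [ACTION_PLACEHOLDER] * len(source_tokens)
    assert len(actions) == len(source_tokens)
    return " ".join(f"{act} {tok}" for act, tok in zip(actions, source_tokens))

# Training-style input with explicit user actions:
print(build_input(["the", "cat", "sat"], ["keep", "change", "keep"]))
# → keep the change cat keep sat

# Inference-style input when no actions are provided:
print(build_input(["the", "cat", "sat"]))
# → [ACT] the [ACT] cat [ACT] sat
```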

Optimizing Negative Prompts for Enhanced Aesthetics and Fidelity in Text-To-Image Generation

no code implementations • 12 Mar 2024 • Michael Ogezi, Ning Shi

In text-to-image generation, using negative prompts, which describe undesirable image characteristics, can significantly boost image quality.

Text-to-Image Generation
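For context on how a negative prompt typically enters a diffusion sampler: its embedding commonly takes the place of the empty unconditional prompt in classifier-free guidance, so the update pushes the sample toward the positive prompt and away from the negative one. The toy sketch below shows only that guidance arithmetic; the vectors are stand-ins for the model's noise predictions, not real text-encoder outputs, and it is not the paper's optimization method.

```python
import numpy as np

def guided_noise(eps_positive, eps_negative, guidance_scale=7.5):
    """Classifier-free guidance with a negative prompt.

    eps_positive: noise prediction conditioned on the desired prompt.
    eps_negative: noise prediction conditioned on the negative prompt
                  (standing in for the unconditional prediction).
    """
    return eps_negative + guidance_scale * (eps_positive - eps_negative)

eps_pos = np.array([1.0, 0.0])  # prediction under the positive prompt
eps_neg = np.array([0.0, 1.0])  # prediction under the negative prompt
print(guided_noise(eps_pos, eps_neg, guidance_scale=2.0))
```

With a larger guidance scale, the denoising direction is pulled further from whatever the negative prompt describes.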

Lost in Translation: When GPT-4V(ision) Can't See Eye to Eye with Text. A Vision-Language-Consistency Analysis of VLLMs and Beyond

no code implementations • 19 Oct 2023 • Xiang Zhang, Senyu Li, Zijun Wu, Ning Shi

Expanding on our findings, we introduce "Vision Description Prompting," a method that effectively improves performance in challenging vision-related tasks.

Image Captioning • Language Modeling +3

From Adversarial Arms Race to Model-centric Evaluation: Motivating a Unified Automatic Robustness Evaluation Framework

1 code implementation • 29 May 2023 • Yangyi Chen, Hongcheng Gao, Ganqu Cui, Lifan Yuan, Dehan Kong, Hanlu Wu, Ning Shi, Bo Yuan, Longtao Huang, Hui Xue, Zhiyuan Liu, Maosong Sun, Heng Ji

In our experiments, we conduct a robustness evaluation of RoBERTa models to demonstrate the effectiveness of our evaluation framework, and further show the soundness of each component in the framework.

Adversarial Attack

Interactive Natural Language Processing

no code implementations • 22 May 2023 • Zekun Wang, Ge Zhang, Kexin Yang, Ning Shi, Wangchunshu Zhou, Shaochun Hao, Guangzheng Xiong, Yizhi Li, Mong Yuan Sim, Xiuying Chen, Qingqing Zhu, Zhenzhu Yang, Adam Nik, Qi Liu, Chenghua Lin, Shi Wang, Ruibo Liu, Wenhu Chen, Ke Xu, Dayiheng Liu, Yike Guo, Jie Fu

Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within the field of NLP, aimed at addressing limitations in existing frameworks while aligning with the ultimate goals of artificial intelligence.

Decision Making

RoChBert: Towards Robust BERT Fine-tuning for Chinese

1 code implementation • 28 Oct 2022 • Zihan Zhang, Jinfeng Li, Ning Shi, Bo Yuan, Xiangyu Liu, Rong Zhang, Hui Xue, Donghong Sun, Chao Zhang

Despite their superb performance on a wide range of tasks, pre-trained language models (e.g., BERT) have been proven vulnerable to adversarial texts.

Data Augmentation • Language Modeling +1

Text Editing as Imitation Game

1 code implementation • 21 Oct 2022 • Ning Shi, Bin Tang, Bo Yuan, Longtao Huang, Yewen Pu, Jie Fu, Zhouhan Lin

Text editing, such as grammatical error correction, arises naturally from imperfect textual data.

Action Generation • Grammatical Error Correction +1
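Framing text editing as action execution, as the entry above does, can be illustrated with a toy executor: an agent emits one action per source token, and the corrected text is recovered by running those actions in order. The action vocabulary below (`KEEP`, `DELETE`, `REPLACE_x`) is a hypothetical simplification, not the paper's exact action set.

```python
# Toy executor for edit actions: one action per source token, applied
# left to right to produce the edited text.

def apply_actions(tokens, actions):
    out = []
    for tok, act in zip(tokens, actions):
        if act == "KEEP":
            out.append(tok)               # copy the token unchanged
        elif act == "DELETE":
            continue                      # drop the token
        elif act.startswith("REPLACE_"):
            out.append(act[len("REPLACE_"):])  # substitute a new token

    return out

# Grammatical error correction as an action sequence:
src = ["She", "go", "to", "to", "school"]
acts = ["KEEP", "REPLACE_goes", "KEEP", "DELETE", "KEEP"]
print(apply_actions(src, acts))
# → ['She', 'goes', 'to', 'school']
```

Predicting such actions instead of free-form text keeps the output tightly anchored to the source, which is why this framing suits tasks like grammatical error correction.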

Recurrent Inference in Text Editing

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Ning Shi, Ziheng Zeng, Haotian Zhang, Yichen Gong

In neural text editing, prevalent sequence-to-sequence approaches directly map the unedited text either to the edited text or to the editing operations, and their performance is degraded by limited source-text encoding and long, varying decoding steps.
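The recurrent-inference idea can be sketched as iterating a small editor until it reaches a fixed point, rather than producing the final text in one long decoding pass. In the toy sketch below, `edit_once` is a hand-written rule (dropping one immediate duplicate token per pass) standing in for the learned editor; only the outer until-convergence loop illustrates the recurrence.

```python
# Recurrent inference sketch: apply a small editor repeatedly until it
# proposes no further change. `edit_once` is a toy stand-in editor that
# removes one immediate duplicate token per pass.

def edit_once(tokens):
    out = []
    prev = None
    for tok in tokens:
        if tok == prev:       # drop one immediate duplicate this pass
            prev = None
            continue
        out.append(tok)
        prev = tok
    return out

def edit_until_converged(tokens, max_steps=10):
    for _ in range(max_steps):
        edited = edit_once(tokens)
        if edited == tokens:  # fixed point: editor proposes no change
            break
        tokens = edited
    return tokens

print(edit_until_converged(["I", "I", "I", "like", "it", "it"]))
# → ['I', 'like', 'it']
```

Each pass only needs to make a short, local fix, which is the appeal of recurrent editing over one long decode.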
