Search Results for author: Qifan Yu

Found 4 papers, 3 papers with code

HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data

1 code implementation • 22 Nov 2023 • Qifan Yu, Juncheng Li, Longhui Wei, Liang Pang, Wentao Ye, Bosheng Qin, Siliang Tang, Qi Tian, Yueting Zhuang

Multi-modal Large Language Models (MLLMs) tuned on machine-generated instruction-following data have demonstrated remarkable performance in various multi-modal understanding and generation tasks.

Tasks: Attribute, Counterfactual +3

Dancing Avatar: Pose and Text-Guided Human Motion Videos Synthesis with Image Diffusion Model

no code implementations • 15 Aug 2023 • Bosheng Qin, Wentao Ye, Qifan Yu, Siliang Tang, Yueting Zhuang

Our approach employs a pretrained T2I diffusion model to generate each video frame in an autoregressive fashion.

Tasks: Image Inpainting
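
The excerpt above describes generating each video frame with a pretrained text-to-image model in an autoregressive loop. Below is a minimal sketch of that loop, using Hugging Face diffusers with an OpenPose ControlNet as a stand-in for pose conditioning; the checkpoint names, the prompt, the `pose_sequence` files, and the fixed-seed trick are all illustrative assumptions, not the authors' actual Dancing Avatar pipeline.

```python
# Sketch: pose-conditioned, frame-by-frame T2I generation. ControlNet is an
# assumed stand-in for the paper's pose guidance, not its actual mechanism.
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

prompt = "a dancer in a red dress, studio lighting"  # hypothetical prompt
# Hypothetical per-frame pose skeleton maps extracted beforehand.
pose_sequence = [Image.open(f"pose_{i:03d}.png") for i in range(16)]

frames = []
for pose_image in pose_sequence:
    # Re-seeding each frame starts every denoising run from the same initial
    # noise, a crude way to keep the subject's appearance consistent.
    generator = torch.Generator("cuda").manual_seed(0)
    frame = pipe(
        prompt, image=pose_image, num_inference_steps=20, generator=generator
    ).images[0]
    frames.append(frame)
```

The frames can then be stacked into a video; the paper's own consistency modules (beyond what a fixed seed provides) are not reproduced here.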

Interactive Data Synthesis for Systematic Vision Adaptation via LLMs-AIGCs Collaboration

1 code implementation • 22 May 2023 • Qifan Yu, Juncheng Li, Wentao Ye, Siliang Tang, Yueting Zhuang

Recent text-to-image generation models have shown promising results in generating high-fidelity photo-realistic images.

Tasks: Data Augmentation, Prompt Engineering +1

Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World

1 code implementation • ICCV 2023 • Qifan Yu, Juncheng Li, Yu Wu, Siliang Tang, Wei Ji, Yueting Zhuang

Based on that, we further introduce a novel Entangled cross-modal prompt approach for open-world predicate scene graph generation (Epic), where models can generalize to unseen predicates in a zero-shot manner.

Tasks: Graph Generation, Language Modelling +1
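
The excerpt above does not spell out Epic's entangled cross-modal prompts, so the sketch below only illustrates the underlying zero-shot idea: scoring a subject-object region against text embeddings of predicate names, so that an unseen predicate needs only a new text string. It uses a vanilla CLIP model from Hugging Face transformers as a generic stand-in; the prompt template, predicate list, and region file are hypothetical.

```python
# Sketch: zero-shot predicate scoring with CLIP (a generic stand-in, not
# Epic's actual entangled cross-modal prompt mechanism).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Hypothetical predicate vocabulary; unseen predicates are just extra strings.
predicates = ["riding", "holding", "standing on", "looking at"]
texts = [f"a photo of a person {p} something" for p in predicates]

# Union crop covering a subject-object pair (hypothetical file).
region = Image.open("union_region.png")

inputs = processor(text=texts, images=region, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image  # shape: (1, num_predicates)
probs = logits.softmax(dim=-1)
print(dict(zip(predicates, probs[0].tolist())))
```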
