1 code implementation • COLING 2022 • Lele Sha, Yuheng Li, Dragan Gasevic, Guanliang Chen
Pretrained Language Models (PLMs), though popular, have been diagnosed to encode bias against protected groups in the representations they learn, which may harm the prediction fairness of downstream models.
no code implementations • 24 Apr 2024 • Xuxin Chen, Yuheng Li, Mingzhe Hu, Ella Salari, Xiaoqian Chen, Richard L. J. Qiu, Bin Zheng, Xiaofeng Yang
For framework evaluation, we assembled two datasets retrospectively.
no code implementations • 18 Jan 2024 • Thao Nguyen, Utkarsh Ojha, Yuheng Li, Haotian Liu, Yong Jae Lee
With increased human control, it is now possible to edit an image in a plethora of ways: from specifying in text what we want to change, to directly dragging the contents of the image in an interactive, point-based manner.
5 code implementations • 5 Oct 2023 • Haotian Liu, Chunyuan Li, Yuheng Li, Yong Jae Lee
Large multimodal models (LMMs) have recently shown encouraging progress with visual instruction tuning.
Ranked #3 on visual instruction following on LLaVA-Bench
1 code implementation • 26 Jul 2023 • Thao Nguyen, Yuheng Li, Utkarsh Ojha, Yong Jae Lee
Given pairs of examples that represent the "before" and "after" images of an edit, our goal is to learn a text-based editing direction that can be used to perform the same edit on new images.
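The before/after idea above can be sketched roughly as follows, assuming images have already been mapped into an embedding space. The simple mean-difference formulation and all names here are illustrative stand-ins, not the paper's actual optimization procedure.

```python
import numpy as np

# Rough sketch: a reusable edit direction learned from "before"/"after"
# embedding pairs, then applied to a new image's embedding.

def edit_direction(before: np.ndarray, after: np.ndarray) -> np.ndarray:
    """Average per-pair embedding difference, shape (dim,)."""
    return (after - before).mean(axis=0)

def apply_edit(embed: np.ndarray, direction: np.ndarray,
               strength: float = 1.0) -> np.ndarray:
    """Shift a new image's embedding along the learned direction."""
    return embed + strength * direction

before = np.array([[0.0, 0.0], [2.0, 1.0]])   # toy "before" embeddings
after = np.array([[1.0, 2.0], [3.0, 3.0]])    # toy "after" embeddings
d = edit_direction(before, after)
print(apply_edit(np.zeros(2), d))  # [1. 2.]
```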
2 code implementations • 23 Jul 2023 • Zijie Zeng, Lele Sha, Yuheng Li, Kaixun Yang, Dragan Gašević, Guanliang Chen
We then proposed a two-step approach in which we (1) separated AI-generated content from human-written content during the encoder training process; and (2) calculated the distance between every pair of adjacent prototypes, assuming that a boundary lies between the two adjacent prototypes that are furthest from each other.
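Step (2) above — placing a decision boundary at the largest gap between adjacent prototypes — can be sketched as follows; the prototype values and function names are illustrative assumptions, not the paper's code.

```python
import numpy as np

def find_boundary(prototypes: np.ndarray) -> int:
    """Return index i such that the boundary falls between
    prototypes[i] and prototypes[i + 1] (the most distant adjacent pair)."""
    gaps = np.linalg.norm(np.diff(prototypes, axis=0), axis=1)
    return int(np.argmax(gaps))

# Example: four 2-D prototypes ordered along the human-vs-AI spectrum;
# the largest gap (4.0) sits between the 2nd and 3rd prototypes.
protos = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0], [6.0, 0.0]])
print(find_boundary(protos))  # 1, i.e. boundary between indices 1 and 2
```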
no code implementations • 29 Jun 2023 • Yuheng Li, Haotian Liu, Yangming Wen, Yong Jae Lee
Text-to-image diffusion models have attracted considerable interest due to their wide applicability across diverse fields.
no code implementations • 9 Jun 2023 • Mu Cai, Zeyi Huang, Yuheng Li, Haohan Wang, Yong Jae Lee
By leveraging the XML-based textual descriptions of SVG representations instead of raster images, we aim to bridge the gap between the visual and textual modalities, allowing LLMs to directly understand and manipulate images without the need for parameterized visual components.
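The SVG-as-text idea can be illustrated with a minimal sketch: because the image is plain XML, a text model can inspect and edit it directly, with no rasterized pixels involved. The toy SVG and the attribute edit below are purely illustrative, not taken from the paper.

```python
import xml.etree.ElementTree as ET

# A tiny SVG image expressed as text: one red circle on a 32x32 canvas.
svg = ('<svg xmlns="http://www.w3.org/2000/svg" width="32" height="32">'
       '<circle cx="10" cy="10" r="5" fill="red"/></svg>')

root = ET.fromstring(svg)
circle = root[0]
circle.set("cx", "20")    # a textual edit is simultaneously a visual edit
print(circle.get("cx"))   # 20
```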
1 code implementation • 31 May 2023 • Shaoyan Pan, Elham Abouei, Jacob Wynne, Tonghe Wang, Richard L. J. Qiu, Yuheng Li, Chih-Wei Chang, Junbo Peng, Justin Roper, Pretesh Patel, David S. Yu, Hui Mao, Xiaofeng Yang
The proposed model consists of two processes: a forward process which adds Gaussian noise to real CT scans, and a reverse process in which a shifted-window transformer V-net (Swin-Vnet) denoises the noisy CT scans conditioned on the MRI from the same patient to produce noise-free CT scans.
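The forward process described above is the standard DDPM closed-form noising step. A minimal NumPy sketch follows; the linear schedule, step count, and array shapes are illustrative assumptions, and the Swin-Vnet reverse (denoising) process conditioned on MRI is omitted.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)        # illustrative linear noise schedule
alphas_cumprod = np.cumprod(1.0 - betas)  # cumulative product, \bar{alpha}_t

def q_sample(x0: np.ndarray, t: int, noise: np.ndarray) -> np.ndarray:
    """Closed-form sample x_t ~ q(x_t | x_0): a progressively noisier scan."""
    a_bar = alphas_cumprod[t]
    return np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * noise

ct = np.zeros((64, 64))                   # toy stand-in for a real CT slice
x_t = q_sample(ct, t=500, noise=np.random.randn(64, 64))
print(x_t.shape)  # (64, 64)
```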
no code implementations • 21 May 2023 • Mingzhe Hu, Yuheng Li, Xiaofeng Yang
We conducted a thorough investigation of the Segment Anything Model (SAM) for the task of interactive segmentation of breast tumors in ultrasound images.
no code implementations • 30 Apr 2023 • Yuheng Li, Jacob Wynne, Jing Wang, Richard L. J. Qiu, Justin Roper, Shaoyan Pan, Ashesh B. Jani, Tian Liu, Pretesh R. Patel, Hui Mao, Xiaofeng Yang
We introduce a novel end-to-end Cross-Shaped windows (CSwin) transformer UNet model, CSwin UNet, to detect clinically significant prostate cancer (csPCa) in prostate bi-parametric MR imaging (bpMRI) and demonstrate the effectiveness of our proposed self-supervised pre-training framework.
1 code implementation • 29 Apr 2023 • Yuheng Li, Mingzhe Hu, Xiaofeng Yang
In this study, we propose Poly-SAM, a finetuned SAM model for polyp segmentation, and compare its performance to several state-of-the-art polyp segmentation models.
no code implementations • 27 Apr 2023 • Mingzhe Hu, Yuheng Li, Xiaofeng Yang
Skin cancer is a prevalent and potentially fatal disease that requires accurate and efficient diagnosis and treatment.
no code implementations • 11 Apr 2023 • Mingzhe Hu, Shaoyan Pan, Yuheng Li, Xiaofeng Yang
In this paper, we aimed to provide a review and tutorial for researchers in the field of medical imaging using language models to improve their tasks at hand.
no code implementations • 17 Mar 2023 • Lixiang Yan, Lele Sha, Linxuan Zhao, Yuheng Li, Roberto Martinez-Maldonado, Guanliang Chen, Xinyu Li, Yueqiao Jin, Dragan Gašević
Educational technology innovations leveraging large language models (LLMs) have shown the potential to automate the laborious process of generating and analysing textual content.
1 code implementation • CVPR 2023 • Utkarsh Ojha, Yuheng Li, Yong Jae Lee
In this work, we first show that the existing paradigm, which consists of training a deep network for real-vs-fake classification, fails to detect fake images from newer breeds of generative models when trained to detect GAN fake images.
1 code implementation • CVPR 2023 • Yuheng Li, Haotian Liu, Qingyang Wu, Fangzhou Mu, Jianwei Yang, Jianfeng Gao, Chunyuan Li, Yong Jae Lee
Large-scale text-to-image diffusion models have made remarkable advances.
Ranked #4 on Conditional Text-to-Image Synthesis on COCO-MIG
no code implementations • 4 Nov 2022 • Yuheng Li, Yijun Li, Jingwan Lu, Eli Shechtman, Yong Jae Lee, Krishna Kumar Singh
We introduce a new method for diverse foreground generation with explicit control over various factors.
1 code implementation • CVPR 2022 • Yang Xue, Yuheng Li, Krishna Kumar Singh, Yong Jae Lee
3D-aware generative models have shown that the introduction of 3D information can lead to more controllable image generation.
no code implementations • ICCV 2021 • Yuheng Li, Yijun Li, Jingwan Lu, Eli Shechtman, Yong Jae Lee, Krishna Kumar Singh
We propose a new approach for high resolution semantic image synthesis.
1 code implementation • 23 May 2021 • Hao Huang, Yongtao Wang, Zhaoyu Chen, Yuze Zhang, Yuheng Li, Zhi Tang, Wei Chu, Jingdong Chen, Weisi Lin, Kai-Kuang Ma
Then, we design a two-level perturbation fusion strategy to alleviate the conflict between the adversarial watermarks generated by different facial images and models.
3 code implementations • CVPR 2020 • Yuheng Li, Krishna Kumar Singh, Utkarsh Ojha, Yong Jae Lee
We present MixNMatch, a conditional generative model that learns to disentangle and encode background, object pose, shape, and texture from real images with minimal supervision, for mix-and-match image generation.