Search Results for author: Tan Wang

Found 18 papers, 12 papers with code

Unified Text-to-Image Generation and Retrieval

no code implementations 9 Jun 2024 Leigang Qu, Haochuan Li, Tan Wang, Wenjie Wang, Yongqi Li, Liqiang Nie, Tat-Seng Chua

Subsequently, we unify generation and retrieval in an autoregressive generation way and propose an autonomous decision module to choose the best-matched one between generated and retrieved images as the response to the text query.

Image Retrieval Retrieval +1

Relevant or Random: Can LLMs Truly Perform Analogical Reasoning?

no code implementations 19 Apr 2024 Chengwei Qin, Wenhan Xia, Tan Wang, Fangkai Jiao, Yuchen Hu, Bosheng Ding, Ruirui Chen, Shafiq Joty

One key finding in psychology is that compared with irrelevant past experiences, recalling relevant ones can help humans better handle new tasks.

GSM8K

Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model

1 code implementation CVPR 2024 Zhicai Wang, Longhui Wei, Tan Wang, Heyu Chen, Yanbin Hao, Xiang Wang, Xiangnan He, Qi Tian

Text-to-image (T2I) generative models have recently emerged as a powerful tool, enabling the creation of photo-realistic images and giving rise to a multitude of applications.

Data Augmentation Diversity +1

DisCo: Disentangled Control for Realistic Human Dance Generation

1 code implementation CVPR 2024 Tan Wang, Linjie Li, Kevin Lin, Yuanhao Zhai, Chung-Ching Lin, Zhengyuan Yang, Hanwang Zhang, Zicheng Liu, Lijuan Wang

In this paper, we depart from the traditional paradigm of human motion transfer and emphasize two additional critical attributes for the synthesis of human dance content in social media contexts: (i) Generalizability: the model should be able to generalize beyond generic human viewpoints as well as unseen human subjects, backgrounds, and poses; (ii) Compositionality: it should allow for the seamless composition of seen/unseen subjects, backgrounds, and poses from different sources.

Attribute

Explaining Language Models' Predictions with High-Impact Concepts

no code implementations 3 May 2023 Ruochen Zhao, Shafiq Joty, Yongjie Wang, Tan Wang

The emergence of large-scale pretrained language models has posed unprecedented challenges in deriving explanations of why the model has made some predictions.

Fairness Vocal Bursts Intensity Prediction

Equivariant Similarity for Vision-Language Foundation Models

1 code implementation ICCV 2023 Tan Wang, Kevin Lin, Linjie Li, Chung-Ching Lin, Zhengyuan Yang, Hanwang Zhang, Zicheng Liu, Lijuan Wang

Unlike the existing image-text similarity objective which only categorizes matched pairs as similar and unmatched pairs as dissimilar, equivariance also requires similarity to vary faithfully according to the semantic changes.

Image-text Retrieval Text Retrieval +2

Equivariance and Invariance Inductive Bias for Learning from Insufficient Data

1 code implementation 25 Jul 2022 Tan Wang, Qianru Sun, Sugiri Pranata, Karlekar Jayashree, Hanwang Zhang

We are interested in learning robust models from insufficient data, without the need for any externally pre-trained checkpoints.

Inductive Bias

ClothFormer: Taming Video Virtual Try-on in All Module

1 code implementation 26 Apr 2022 Jianbin Jiang, Tan Wang, He Yan, Junhui Liu

Moreover, there are two other key challenges: 1) how to generate accurate warping when occlusions appear in the clothing region; 2) how to generate clothes and non-target body parts (e.g. arms, neck) in harmony with the complicated background. To address them, we propose a novel video virtual try-on framework, ClothFormer, which successfully synthesizes realistic, harmonious, and spatio-temporally consistent results in complicated environments.

Optical Flow Estimation Virtual Try-on

Class Re-Activation Maps for Weakly-Supervised Semantic Segmentation

1 code implementation CVPR 2022 Zhaozheng Chen, Tan Wang, Xiongwei Wu, Xian-Sheng Hua, Hanwang Zhang, Qianru Sun

Specifically, due to the sum-over-class pooling nature of BCE, each pixel in CAM may be responsive to multiple classes co-occurring in the same receptive field.

Weakly supervised Semantic Segmentation Weakly-Supervised Semantic Segmentation

ClothFormer: Taming Video Virtual Try-On in All Module

no code implementations CVPR 2022 Jianbin Jiang, Tan Wang, He Yan, Junhui Liu

Moreover, there are two other key challenges: 1) how to generate accurate warping when occlusions appear in the clothing region; 2) how to generate clothes and non-target body parts (e.g. arms, neck) in harmony with the complicated background. To address them, we propose a novel video virtual try-on framework, ClothFormer, which successfully synthesizes realistic, harmonious, and spatio-temporally consistent results in complicated environments.

Optical Flow Estimation Virtual Try-on

Self-Supervised Learning Disentangled Group Representation as Feature

1 code implementation NeurIPS 2021 Tan Wang, Zhongqi Yue, Jianqiang Huang, Qianru Sun, Hanwang Zhang

A good visual representation is an inference map from observations (images) to features (vectors) that faithfully reflects the hidden modularized generative factors (semantics).

Colorization Contrastive Learning +1

Causal Attention for Unbiased Visual Recognition

1 code implementation ICCV 2021 Tan Wang, Chang Zhou, Qianru Sun, Hanwang Zhang

Attention module does not always help deep models learn causal features that are robust in any confounding context, e.g., a foreground object feature is invariant to different backgrounds.

Free Lunch for Co-Saliency Detection: Context Adjustment

no code implementations 4 Aug 2021 Lingdong Kong, Prakhar Ganesh, Tan Wang, Junhao Liu, Le Zhang, Yao Chen

We hope that the scale, diversity, and quality of our dataset can benefit researchers in this area and beyond.

counterfactual Saliency Detection +1

Counterfactual Zero-Shot and Open-Set Visual Recognition

1 code implementation CVPR 2021 Zhongqi Yue, Tan Wang, Hanwang Zhang, Qianru Sun, Xian-Sheng Hua

We show that the key reason is that the generation is not Counterfactual Faithful, and thus we propose a faithful one, whose generation is from the sample-specific counterfactual question: What would the sample look like, if we set its class attribute to a certain class, while keeping its sample attribute unchanged?

Attribute Binary Classification +3

DeVLBert: Learning Deconfounded Visio-Linguistic Representations

1 code implementation 16 Aug 2020 Shengyu Zhang, Tan Jiang, Tan Wang, Kun Kuang, Zhou Zhao, Jianke Zhu, Jin Yu, Hongxia Yang, Fei Wu

In this paper, we propose to investigate the problem of out-of-domain visio-linguistic pretraining, where the pretraining data distribution differs from that of downstream data on which the pretrained model will be fine-tuned.

Image Retrieval Question Answering +2

Visual Commonsense R-CNN

1 code implementation CVPR 2020 Tan Wang, Jianqiang Huang, Hanwang Zhang, Qianru Sun

We present a novel unsupervised feature representation learning method, Visual Commonsense Region-based Convolutional Neural Network (VC R-CNN), to serve as an improved visual region encoder for high-level tasks such as captioning and VQA.

Image Captioning Representation Learning +1

sWSI: A Low-cost and Commercial-quality Whole Slide Imaging System on Android and iOS Smartphones

no code implementations 1 Apr 2017 Shuoxin Ma, Tan Wang

In this paper, scalable Whole Slide Imaging (sWSI), a novel high-throughput, cost-effective and robust whole slide imaging system on both Android and iOS platforms is introduced and analyzed.
