Search Results for author: Keze Wang

Found 40 papers, 16 papers with code

Mind the Context: The Impact of Contextualization in Neural Module Networks for Grounding Visual Referring Expressions

no code implementations • EMNLP 2021 • Arjun Akula, Spandana Gella, Keze Wang, Song-Chun Zhu, Siva Reddy

Our model outperforms the state-of-the-art NMN model on the CLEVR-Ref+ dataset with a +8.1% improvement in accuracy on the single-referent test set and +4.3% on the full test set.

On Training Data Influence of GPT Models

1 code implementation • 11 Apr 2024 • Qingyi Liu, Yekun Chai, Shuohuan Wang, Yu Sun, Qiwei Peng, Keze Wang, Hua Wu

This paper presents GPTfluence, a novel approach that leverages a featurized simulation to assess the impact of training examples on the training dynamics of GPT models.

Natural Language Understanding

NeRF-VPT: Learning Novel View Representations with Neural Radiance Fields via View Prompt Tuning

1 code implementation • 2 Mar 2024 • Linsheng Chen, Guangrun Wang, Liuchun Yuan, Keze Wang, Ken Deng, Philip H. S. Torr

Furthermore, the cascading learning of NeRF-VPT introduces adaptability to scenarios with sparse inputs, resulting in a significant enhancement of accuracy for sparse-view novel view synthesis.

Novel View Synthesis

Video Super-Resolution Transformer with Masked Inter&Intra-Frame Attention

1 code implementation • 12 Jan 2024 • Xingyu Zhou, Leheng Zhang, Xiaorui Zhao, Keze Wang, Leida Li, Shuhang Gu

The core of MIA-VSR is leveraging feature-level temporal continuity between adjacent frames to reduce redundant computations and make more rational use of previously enhanced SR features.

Video Super-Resolution

Mimic: Speaking Style Disentanglement for Speech-Driven 3D Facial Animation

no code implementations • 18 Dec 2023 • Hui Fu, Zeqing Wang, Ke Gong, Keze Wang, Tianshui Chen, Haojie Li, Haifeng Zeng, Wenxiong Kang

Moreover, to facilitate disentangled representation learning, we introduce four well-designed constraints: an auxiliary style classifier, an auxiliary inverse classifier, a content contrastive loss, and a pair of latent cycle losses, which can effectively contribute to the construction of the identity-related style space and semantic-related content space.

Disentanglement

Towards Top-Down Reasoning: An Explainable Multi-Agent Approach for Visual Question Answering

no code implementations • 29 Nov 2023 • Zeqing Wang, Wentao Wan, Runmeng Chen, Qiqing Lao, Minjie Lang, Keze Wang

The Integrator agent combines information from the Seeker agent and the Responder agent to produce the final VQA answer.

Question Answering Visual Question Answering

SQLNet: Scale-Modulated Query and Localization Network for Few-Shot Class-Agnostic Counting

1 code implementation • 16 Nov 2023 • Hefeng Wu, Yandong Chen, Lingbo Liu, Tianshui Chen, Keze Wang, Liang Lin

In the localization stage, the Scale-aware Multi-head Localization (SAML) module utilizes the query tensor to predict the confidence, location, and size of each potential object.

A Continual Learning Paradigm for Non-differentiable Visual Programming Frameworks on Visual Reasoning Tasks

no code implementations • 18 Sep 2023 • Wentao Wan, Nan Kang, Zeqing Wang, Zhuojie Yang, Liang Lin, Keze Wang

Specifically, our CLVP distills the capabilities of well-trained task-specific models into the visual sub-modules in a stepwise and anti-forgetting manner.

Continual Learning Visual Reasoning

Towards CausalGPT: A Multi-Agent Approach for Faithful Knowledge Reasoning via Promoting Causal Consistency in LLMs

2 code implementations • 23 Aug 2023 • Ziyi Tang, Ruilin Wang, Weixing Chen, Keze Wang, Yang Liu, Tianshui Chen, Liang Lin

Despite advancements in LLMs, knowledge-based reasoning remains a longstanding issue due to the fragility of knowledge recall and inference.

counterfactual Science Question Answering

TCGL: Temporal Contrastive Graph for Self-supervised Video Representation Learning

2 code implementations • 7 Dec 2021 • Yang Liu, Keze Wang, Lingbo Liu, Haoyuan Lan, Liang Lin

To overcome these limitations, we take advantage of the multi-scale temporal dependencies within videos and propose a novel video self-supervised learning framework named Temporal Contrastive Graph Learning (TCGL), which jointly models the inter-snippet and intra-snippet temporal dependencies for temporal representation learning with a hybrid graph contrastive learning strategy.

Action Recognition Contrastive Learning +5

Enhancing Prototypical Few-Shot Learning by Leveraging the Local-Level Strategy

no code implementations • 8 Nov 2021 • Junying Huang, Fan Chen, Keze Wang, Liang Lin, Dongyu Zhang

Aiming at recognizing the samples from novel categories with few reference samples, few-shot learning (FSL) is a challenging problem.

Ranked #37 on Few-Shot Image Classification on Mini-Imagenet 5-way (1-shot)

Few-Shot Image Classification Few-Shot Learning +1

CX-ToM: Counterfactual Explanations with Theory-of-Mind for Enhancing Human Trust in Image Recognition Models

1 code implementation • 3 Sep 2021 • Arjun R. Akula, Keze Wang, Changsong Liu, Sari Saba-Sadiya, Hongjing Lu, Sinisa Todorovic, Joyce Chai, Song-Chun Zhu

More concretely, our CX-ToM framework generates a sequence of explanations in a dialog by mediating the differences between the minds of the machine and the human user.

counterfactual Explainable Artificial Intelligence (XAI)

Solving Inefficiency of Self-supervised Representation Learning

1 code implementation • ICCV 2021 • Guangrun Wang, Keze Wang, Guangcong Wang, Philip H. S. Torr, Liang Lin

In this paper, we reveal two contradictory phenomena in contrastive learning that we call under-clustering and over-clustering problems, which are major obstacles to learning efficiency.

Ranked #1 on Self-Supervised Person Re-Identification on SYSU-30k

Clustering Contrastive Learning +4

Temporal Contrastive Graph Learning for Video Action Recognition and Retrieval

no code implementations • 4 Jan 2021 • Yang Liu, Keze Wang, Haoyuan Lan, Liang Lin

To model multi-scale temporal dependencies, our TCGL integrates the prior knowledge about the frame and snippet orders into graph structures, i.e., the intra-/inter-snippet temporal contrastive graphs.

Action Recognition Contrastive Learning +5

Linguistically Routing Capsule Network for Out-of-Distribution Visual Question Answering

no code implementations • ICCV 2021 • Qingxing Cao, Wentao Wan, Keze Wang, Xiaodan Liang, Liang Lin

The experimental results show that our proposed method can improve current VQA models on OOD split without losing performance on the in-domain test data.

Novel Concepts Question Answering +1

Knowledge-Routed Visual Question Reasoning: Challenges for Deep Representation Embedding

1 code implementation • 14 Dec 2020 • Qingxing Cao, Bailin Li, Xiaodan Liang, Keze Wang, Liang Lin

Specifically, we generate the question-answer pair based on both the Visual Genome scene graph and an external knowledge base with controlled programs to disentangle the knowledge from other biases.

Question Answering Visual Question Answering

Continuous Transition: Improving Sample Efficiency for Continuous Control Problems via MixUp

1 code implementation • 30 Nov 2020 • Junfan Lin, Zhongzhan Huang, Keze Wang, Xiaodan Liang, Weiwei Chen, Liang Lin

Although deep reinforcement learning (RL) has been successfully applied to a variety of robotic control tasks, it's still challenging to apply it to real-world tasks, due to the poor sample efficiency.

Continuous Control Reinforcement Learning (RL)

Semantics-aware Adaptive Knowledge Distillation for Sensor-to-Vision Action Recognition

1 code implementation • 1 Sep 2020 • Yang Liu, Keze Wang, Guanbin Li, Liang Lin

In this paper, we propose a novel framework, named Semantics-aware Adaptive Knowledge Distillation Networks (SAKDN), to enhance action recognition in vision-sensor modality (videos) by adaptively transferring and distilling the knowledge from multiple wearable sensors.

Action Recognition Image Generation +3

Linguistically Driven Graph Capsule Network for Visual Question Reasoning

no code implementations • 23 Mar 2020 • Qingxing Cao, Xiaodan Liang, Keze Wang, Liang Lin

Inspired by the property of a capsule network that can carve a tree structure inside a regular convolutional neural network (CNN), we propose a hierarchical compositional reasoning model called the "Linguistically driven Graph Capsule Network", where the compositional process is guided by the linguistic parse tree.

Question Answering Visual Question Answering

Towards Causality-Aware Inferring: A Sequential Discriminative Approach for Medical Diagnosis

1 code implementation • 14 Mar 2020 • Junfan Lin, Keze Wang, Ziliang Chen, Xiaodan Liang, Liang Lin

To eliminate this bias and inspired by the propensity score matching technique with causal diagram, we propose a propensity-based patient simulator to effectively answer unrecorded inquiry by drawing knowledge from the other records; Bias (ii) inherently comes along with the passively collected data, and is one of the key obstacles for training the agent towards "learning how" rather than "remembering what".

Medical Diagnosis

Instance-Aware Representation Learning and Association for Online Multi-Person Tracking

no code implementations • 29 May 2019 • Hefeng Wu, Yafei Hu, Keze Wang, Hanhui Li, Lin Nie, Hui Cheng

Multi-Person Tracking (MPT) is often addressed within the detection-to-association paradigm.

Representation Learning

Face Hallucination by Attentive Sequence Optimization with Reinforcement Learning

no code implementations • 4 May 2019 • Yukai Shi, Guanbin Li, Qingxing Cao, Keze Wang, Liang Lin

Face hallucination is a domain-specific super-resolution problem that aims to generate a high-resolution (HR) face image from a low-resolution (LR) input.

Face Hallucination Hallucination +3

Adaptively Connected Neural Networks

1 code implementation • CVPR 2019 • Guangrun Wang, Keze Wang, Liang Lin

This paper presents a novel adaptively connected neural network (ACNet) to improve the traditional convolutional neural networks (CNNs) in two aspects.

Ranked #1 on Document Classification on Cora

Document Classification Image Classification +1

3D Human Pose Machines with Self-supervised Learning

2 code implementations • arXiv.org 2019 • Keze Wang, Liang Lin, Chenhan Jiang, Chen Qian, Pengxu Wei

Driven by recent computer vision and robotic applications, recovering 3D human poses has become increasingly important and attracted growing interest.

Ranked #263 on 3D Human Pose Estimation on Human3.6M

3D Human Pose Estimation Self-Supervised Learning

Teaching to Teach by Structured Dark Knowledge

no code implementations • 27 Sep 2018 • Ziliang Chen, Keze Wang, Liang Lin

We evaluate T2T across different learners, teachers, and tasks, which significantly demonstrates that structured knowledge can be inherited by the teachers to further benefit learners' training.

Cost-effective Object Detection: Active Sample Mining with Switchable Selection Criteria

1 code implementation • 30 Jun 2018 • Keze Wang, Liang Lin, Xiaopeng Yan, Ziliang Chen, Dongyu Zhang, Lei Zhang

The proposed process can be compatible with mini-batch based training (i.e., using a batch of unlabeled or partially labeled data as a one-time input) for object detection.

Active Learning object-detection +2

Flow Guided Recurrent Neural Encoder for Video Salient Object Detection

no code implementations • CVPR 2018 • Guanbin Li, Yuan Xie, Tianhao Wei, Keze Wang, Liang Lin

Image saliency detection has recently witnessed significant progress due to deep convolutional neural networks.

Ranked #2 on Video Salient Object Detection on DAVSOD-Difficult20 (using extra training data)

Object object-detection +4

Towards Human-Machine Cooperation: Self-supervised Sample Mining for Object Detection

no code implementations • CVPR 2018 • Keze Wang, Xiaopeng Yan, Dongyu Zhang, Lei Zhang, Liang Lin

Though quite challenging, leveraging large-scale unlabeled or partially labeled images in a cost-effective way has attracted increasing interest for its great importance to computer vision.

Active Learning Object +2

Recurrent 3D Pose Sequence Machines

no code implementations • CVPR 2017 • Mude Lin, Liang Lin, Xiaodan Liang, Keze Wang, Hui Cheng

3D human articulated pose recovery from monocular image sequences is very challenging due to diverse appearances, viewpoints, and occlusions, and because the human 3D pose is inherently ambiguous in monocular imagery.

Ranked #20 on 3D Human Pose Estimation on HumanEva-I

3D Human Pose Estimation 3D Pose Estimation

Deep Co-Space: Sample Mining Across Feature Transformation for Semi-Supervised Learning

no code implementations • 28 Jul 2017 • Ziliang Chen, Keze Wang, Xiao Wang, Pai Peng, Ebroul Izquierdo, Liang Lin

Aiming at improving performance of visual classification in a cost-effective manner, this paper proposes an incremental semi-supervised learning paradigm called Deep Co-Space (DCS).

Classification General Classification +1

Structure-Preserving Image Super-resolution via Contextualized Multi-task Learning

no code implementations • 26 Jul 2017 • Yukai Shi, Keze Wang, Chongyu Chen, Li Xu, Liang Lin

Single image super resolution (SR), which refers to reconstructing a higher-resolution (HR) image from the observed low-resolution (LR) image, has received substantial attention due to its tremendous application potential.

Computational Efficiency Image Restoration +2

Active Self-Paced Learning for Cost-Effective and Progressive Face Identification

no code implementations • 13 Jan 2017 • Liang Lin, Keze Wang, Deyu Meng, WangMeng Zuo, Lei Zhang

By naturally combining two recently rising techniques: active learning (AL) and self-paced learning (SPL), our framework is capable of automatically annotating new instances and incorporating them into training under weak expert re-certification.

Active Learning Face Identification

Cost-Effective Active Learning for Deep Image Classification

3 code implementations • 13 Jan 2017 • Keze Wang, Dongyu Zhang, Ya Li, Ruimao Zhang, Liang Lin

In this paper, we propose a novel active learning framework, which is capable of building a competitive classifier with optimal feature representation via a limited amount of labeled training instances in an incremental learning manner.

Active Learning Classification +5

Human Pose Estimation from Depth Images via Inference Embedded Multi-task Learning

no code implementations • 13 Aug 2016 • Keze Wang, Shengfu Zhai, Hui Cheng, Xiaodan Liang, Liang Lin

In this paper, we propose a novel inference-embedded multi-task learning framework for predicting human pose from still depth images, which is implemented with a deep architecture of neural networks.

Multi-Task Learning Pose Estimation +1

Local- and Holistic- Structure Preserving Image Super Resolution via Deep Joint Component Learning

no code implementations • 25 Jul 2016 • Yukai Shi, Keze Wang, Li Xu, Liang Lin

Recently, machine learning based single image super resolution (SR) approaches focus on jointly learning representations for high-resolution (HR) and low-resolution (LR) image patch pairs to improve the quality of the super-resolved images.

Image Super-Resolution Representation Learning

Dictionary Pair Classifier Driven Convolutional Neural Networks for Object Detection

no code implementations • CVPR 2016 • Keze Wang, Liang Lin, WangMeng Zuo, Shuhang Gu, Lei Zhang

Feature representation and object category classification are two key components of most object detection methods.

General Classification Novel Object Detection +4

A Deep Structured Model with Radius-Margin Bound for 3D Human Activity Recognition

no code implementations • 5 Dec 2015 • Liang Lin, Keze Wang, WangMeng Zuo, Meng Wang, Jiebo Luo, Lei Zhang

Understanding human activity is very challenging even with the recently developed 3D/depth sensors.

Human Activity Recognition

PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Edge-Preserving Coherence

no code implementations • CVPR 2013 • Keze Wang, Liang Lin, Jiangbo Lu, Chenglong Li, Keyang Shi

In this paper, we propose a unified framework called PISA, which stands for Pixelwise Image Saliency Aggregating various bottom-up cues and priors.

Image Segmentation Object Recognition +2

3D Human Activity Recognition with Reconfigurable Convolutional Neural Networks

no code implementations • 26 Jan 2015 • Keze Wang, Xiaolong Wang, Liang Lin, Meng Wang, WangMeng Zuo

Our model thus advances existing approaches in two aspects: (i) it acts directly on the raw inputs (grayscale-depth data) to conduct recognition instead of relying on hand-crafted features, and (ii) the model structure can be dynamically adjusted accounting for the temporal variations of human activities, i.e., the network configuration is allowed to be partially activated during inference.

Human Activity Recognition

PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Spatial Priors

no code implementations • CVPR 2013 • Keyang Shi, Keze Wang, Jiangbo Lu, Liang Lin

By fusing complementary contrast measures in such a pixelwise adaptive manner, the detection effectiveness is significantly boosted.

Image Segmentation Object Recognition +2
