Search Results for author: Yu Kong

Found 27 papers, 10 papers with code

Facial Affective Behavior Analysis with Instruction Tuning

no code implementations • 7 Apr 2024 • YiFan Li, Anh Dao, Wentao Bao, Zhen Tan, Tianlong Chen, Huan Liu, Yu Kong

Our initiative on the dataset and benchmarks reveal the nature and rationale of facial affective behaviors, i. e., fine-grained facial movement, interpretability, and reasoning.

Instruction Following

Paper
Add Code

The Wolf Within: Covert Injection of Malice into MLLM Societies via an MLLM Operative

1 code implementation • 20 Feb 2024 • Zhen Tan, Chengshuai Zhao, Raha Moraffah, YiFan Li, Yu Kong, Tianlong Chen, Huan Liu

Unlike direct harmful output generation for MLLMs, our research demonstrates how a single MLLM agent can be subtly influenced to generate prompts that, in turn, induce other MLLM agents in the society to output malicious content.

Misinformation

Paper
Code

CSGNN: Conquering Noisy Node labels via Dynamic Class-wise Selection

no code implementations • 20 Nov 2023 • YiFan Li, Zhen Tan, Kai Shu, Zongsheng Cao, Yu Kong, Huan Liu

Graph Neural Networks (GNNs) have emerged as a powerful tool for representation learning on graphs, but they often suffer from overfitting and label noise issues, especially when the data is scarce or imbalanced.

Memorization Representation Learning

Paper
Add Code

Latent Space Energy-based Model for Fine-grained Open Set Recognition

no code implementations • 19 Sep 2023 • Wentao Bao, Qi Yu, Yu Kong

A recent trend in OSR shows the benefit of generative models to discriminative unknown detection.

Attribute Density Estimation +1

Paper
Add Code

On Model Explanations with Transferable Neural Pathways

no code implementations • 18 Sep 2023 • Xinmiao Lin, Wentao Bao, Qi Yu, Yu Kong

Neural pathways as model explanations consist of a sparse set of neurons that provide the same level of prediction performance as the whole model.

Paper
Add Code

ATM: Action Temporality Modeling for Video Question Answering

no code implementations • 5 Sep 2023 • Junwen Chen, Jie Zhu, Yu Kong

Despite significant progress in video question answering (VideoQA), existing methods fall short of questions that require causal/temporal reasoning across frames.

Ranked #17 on Video Question Answering on NExT-QA

Contrastive Learning Optical Flow Estimation +2

Paper
Add Code

Uncertainty-aware State Space Transformer for Egocentric 3D Hand Trajectory Forecasting

1 code implementation • ICCV 2023 • Wentao Bao, Lele Chen, Libing Zeng, Zhong Li, Yi Xu, Junsong Yuan, Yu Kong

In this paper, we set up an egocentric 3D hand trajectory forecasting task that aims to predict hand trajectories in a 3D space from early observed RGB videos in a first-person view.

3D Human Pose Tracking Trajectory Forecasting +1

Paper
Code

Prompting Language-Informed Distribution for Compositional Zero-Shot Learning

no code implementations • 23 May 2023 • Wentao Bao, Lichang Chen, Heng Huang, Yu Kong

Orthogonal to the existing literature of soft, hard, or distributional prompts, our method advocates prompting the LLM-supported class distribution that leads to a better zero-shot generalization.

Compositional Zero-Shot Learning Informativeness +1

Paper
Add Code

Catch Missing Details: Image Reconstruction with Frequency Augmented Variational Autoencoder

1 code implementation • CVPR 2023 • Xinmiao Lin, Yikang Li, Jenhao Hsiao, Chiuman Ho, Yu Kong

The popular VQ-VAE models reconstruct images through learning a discrete codebook but suffer from a significant issue in the rapid quality degradation of image reconstruction as the compression rate rises.

Image Generation Image Reconstruction

Paper
Code

GateHUB: Gated History Unit with Background Suppression for Online Action Detection

no code implementations • CVPR 2022 • Junwen Chen, Gaurav Mittal, Ye Yu, Yu Kong, Mei Chen

We present GateHUB, Gated History Unit with Background Suppression, that comprises a novel position-guided gated cross-attention mechanism to enhance or suppress parts of the history as per how informative they are for current frame prediction.

Ranked #1 on Online Action Detection on TVSeries

Online Action Detection Optical Flow Estimation

Paper
Add Code

A Dynamic Meta-Learning Model for Time-Sensitive Cold-Start Recommendations

no code implementations • 3 Apr 2022 • Krishna Prasad Neupane, Ervine Zheng, Yu Kong, Qi Yu

We present a novel dynamic recommendation model that focuses on users who have interactions in the past but turn relatively inactive recently.

Meta-Learning Recommendation Systems

Paper
Add Code

Learning of Global Objective for Network Flow in Multi-Object Tracking

no code implementations • CVPR 2022 • Shuai Li, Yu Kong, Hamid Rezatofighi

This paper concerns the problem of multi-object tracking based on the min-cost flow (MCF) formulation, which is conventionally studied as an instance of linear program.

Multi-Object Tracking

Paper
Add Code

OpenTAL: Towards Open Set Temporal Action Localization

1 code implementation • CVPR 2022 • Wentao Bao, Qi Yu, Yu Kong

The OpenTAL is general to enable existing TAL models for open set scenarios, and experimental results on THUMOS14 and ActivityNet1. 3 benchmarks show the effectiveness of our method.

Action Classification Classification +2

Paper
Code

An Eye for an Eye: Defending against Gradient-based Attacks with Gradients

no code implementations • 2 Feb 2022 • Hanbin Hong, Yuan Hong, Yu Kong

In this paper, we show that the gradients can also be exploited as a powerful weapon to defend against adversarial attacks.

Paper
Add Code

Gradient Frequency Modulation for Visually Explaining Video Understanding Models

no code implementations • 1 Nov 2021 • Xinmiao Lin, Wentao Bao, Matthew Wright, Yu Kong

In many applications, it is essential to understand why a machine learning model makes the decisions it does, but this is inhibited by the black-box nature of state-of-the-art neural networks.

Action Recognition Temporal Action Localization +1

Paper
Add Code

DRIVE: Deep Reinforced Accident Anticipation with Visual Explanation

1 code implementation • ICCV 2021 • Wentao Bao, Qi Yu, Yu Kong

Traffic accident anticipation aims to accurately and promptly predict the occurrence of a future accident from dashcam videos, which is vital for a safety-guaranteed self-driving system.

Accident Anticipation Decision Making

Paper
Code

Evidential Deep Learning for Open Set Action Recognition

2 code implementations • ICCV 2021 • Wentao Bao, Qi Yu, Yu Kong

Different from image data, video actions are more challenging to be recognized in an open-set setting due to the uncertain temporal dynamics and static bias of human actions.

Open Set Action Recognition Open Set Learning +1

114

Paper
Code

Explainable Video Entailment With Grounded Visual Evidence

no code implementations • ICCV 2021 • Junwen Chen, Yu Kong

Video entailment aims at determining if a hypothesis textual statement is entailed or contradicted by a premise video.

Visual Grounding

Paper
Add Code

Group Activity Prediction with Sequential Relational Anticipation Model

1 code implementation • ECCV 2020 • Junwen Chen, Wentao Bao, Yu Kong

Our model explicitly anticipates both activity features and positions by two graph auto-encoders, aiming to learn a discriminative group representation for group activity prediction.

Activity Prediction

Paper
Code

Uncertainty-based Traffic Accident Anticipation with Spatio-Temporal Relational Learning

2 code implementations • 1 Aug 2020 • Wentao Bao, Qi Yu, Yu Kong

The derived uncertainty-based ranking loss is found to significantly boost model performance by improving the quality of relational features.

Ranked #2 on Accident Anticipation on CCD

Accident Anticipation Activity Prediction +4

Paper
Code

Object-Aware Centroid Voting for Monocular 3D Object Detection

no code implementations • 20 Jul 2020 • Wentao Bao, Qi Yu, Yu Kong

Monocular 3D object detection aims to detect objects in a 3D physical world from a single camera.

Depth Estimation Monocular 3D Object Detection +3

Paper
Add Code

Cascaded Detail-Preserving Networks for Super-Resolution of Document Images

no code implementations • 25 Nov 2019 • Zhichao Fu, Yu Kong, Yingbin Zheng, Hao Ye, Wenxin Hu, Jing Yang, Liang He

The accuracy of OCR is usually affected by the quality of the input document image and different kinds of marred document images hamper the OCR results.

Image Super-Resolution Optical Character Recognition (OCR)

Paper
Add Code

Residual Dense Network for Image Restoration

3 code implementations • 25 Dec 2018 • Yulun Zhang, Yapeng Tian, Yu Kong, Bineng Zhong, Yun Fu

We fully exploit the hierarchical features from all the convolutional layers.

Ranked #1 on JPEG Artifact Correction on LIVE1 (Quality 30 Grayscale)

Deblurring Image Compression +5

537

Paper
Code

Human Action Recognition and Prediction: A Survey

no code implementations • 28 Jun 2018 • Yu Kong, Yun Fu

Derived from rapid advances in computer vision and machine learning, video analysis tasks have been moving from inferring the present state to predicting the future state.

Action Recognition Autonomous Driving +3

Paper
Add Code

Residual Dense Network for Image Super-Resolution

16 code implementations • CVPR 2018 • Yulun Zhang, Yapeng Tian, Yu Kong, Bineng Zhong, Yun Fu

In this paper, we propose a novel residual dense network (RDN) to address this problem in image SR. We fully exploit the hierarchical features from all the convolutional layers.

Ranked #3 on Color Image Denoising on CBSD68 sigma50

Color Image Denoising Image Super-Resolution

4,495

Paper
Code

Deep Sequential Context Networks for Action Prediction

no code implementations • CVPR 2017 • Yu Kong, Zhiqiang Tao, Yun Fu

Different from after-the-fact action recognition, action prediction task requires action labels to be predicted from these partially observed videos.

Action Recognition Temporal Action Localization

Paper
Add Code

Bilinear Heterogeneous Information Machine for RGB-D Action Recognition

no code implementations • CVPR 2015 • Yu Kong, Yun Fu

Rich heterogeneous RGB and depth data are effectively compressed and projected to a learned shared space, in order to reduce noise and capture useful information for recognition.

Action Recognition Temporal Action Localization

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.