Search Results for author: Angela Yao

Found 80 papers, 34 papers with code

InstructHumans: Editing Animated 3D Human Textures with Instructions

no code implementations 5 Apr 2024 Jiayin Zhu, Linlin Yang, Angela Yao

We present InstructHumans, a novel framework for instruction-driven 3D human texture editing.

AID: Attention Interpolation of Text-to-Image Diffusion

1 code implementation 26 Mar 2024 Qiyuan He, Jinghao Wang, Ziwei Liu, Angela Yao

To that end, we introduce a novel training-free technique named Attention Interpolation via Diffusion (AID).

Spatial Interpolation

Make Me a BNN: A Simple Strategy for Estimating Bayesian Uncertainty from Pre-trained Models

no code implementations 23 Dec 2023 Gianni Franchi, Olivier Laurent, Maxence Leguéry, Andrei Bursuc, Andrea Pilzer, Angela Yao

Deep Neural Networks (DNNs) are powerful tools for various computer vision tasks, yet they often struggle with reliable uncertainty quantification - a critical requirement for real-world applications.

Image Classification Semantic Segmentation +1

On the Calibration of Human Pose Estimation

no code implementations 28 Nov 2023 Kerui Gu, Rongyu Chen, Angela Yao

Most 2D human pose estimation frameworks estimate keypoint confidence in an ad-hoc manner, using heuristics such as the maximum value of heatmaps.

2D Human Pose Estimation Pose Estimation

Deep Imbalanced Regression via Hierarchical Classification Adjustment

no code implementations 26 Oct 2023 Haipeng Xiong, Angela Yao

To improve regression performance over the entire range of data, we propose to construct hierarchical classifiers for solving imbalanced regression tasks.

Age Estimation Classification +4

Can I Trust Your Answer? Visually Grounded Video Question Answering

1 code implementation 4 Sep 2023 Junbin Xiao, Angela Yao, Yicong Li, Tat Seng Chua

We study visually grounded VideoQA in response to the emerging trends of utilizing pretraining techniques for video-language understanding.

Question Answering Video Grounding +2

HiFiHR: Enhancing 3D Hand Reconstruction from a Single Image via High-Fidelity Texture

1 code implementation 25 Aug 2023 Jiayin Zhu, Zhuoran Zhao, Linlin Yang, Angela Yao

We present HiFiHR, a high-fidelity hand reconstruction approach that utilizes render-and-compare in the learning-based framework from a single image, capable of generating visually plausible and accurate 3D hand meshes while recovering realistic textures.

Opening the Vocabulary of Egocentric Actions

1 code implementation NeurIPS 2023 Dibyadip Chatterjee, Fadime Sener, Shugao Ma, Angela Yao

Given a set of verbs and objects observed during training, the goal is to generalize the verbs to an open vocabulary of actions with seen and novel objects.

Ranked #1 on Open Vocabulary Action Recognition on Assembly101 (using extra training data)

Object Open Vocabulary Action Recognition

Learning to Generate Training Datasets for Robust Semantic Segmentation

no code implementations 1 Aug 2023 Marwane Hariat, Olivier Laurent, Rémi Kazmierczak, Shihao Zhang, Andrei Bursuc, Angela Yao, Gianni Franchi

We propose a novel approach to improve the robustness of semantic segmentation techniques by leveraging the synergy between label-to-image generators and image-to-label segmentation models.

Generative Adversarial Network Segmentation +1

Every Mistake Counts in Assembly

no code implementations 31 Jul 2023 Guodong Ding, Fadime Sener, Shugao Ma, Angela Yao

Our framework constructs a knowledge base with spatial and temporal beliefs based on observed mistakes.

Enhancing Video Super-Resolution via Implicit Resampling-based Alignment

1 code implementation arXiv 2024 Kai Xu, Ziwei Yu, Xin Wang, Michael Bi Mi, Angela Yao

We show that bilinear interpolation inherently attenuates high-frequency information while an MLP-based coordinate network can approximate more frequencies.

Video Super-Resolution

Contrastive Video Question Answering via Video Graph Transformer

1 code implementation 27 Feb 2023 Junbin Xiao, Pan Zhou, Angela Yao, Yicong Li, Richang Hong, Shuicheng Yan, Tat-Seng Chua

CoVGT's uniqueness and superiority are three-fold: 1) It proposes a dynamic graph transformer module which encodes video by explicitly capturing the visual objects, their relations and dynamics, for complex spatio-temporal reasoning.

Ranked #11 on Video Question Answering on NExT-QA (using extra training data)

Contrastive Learning Question Answering +1

Bias-Compensated Integral Regression for Human Pose Estimation

no code implementations 25 Jan 2023 Kerui Gu, Linlin Yang, Michael Bi Mi, Angela Yao

Experimental results on both the human body and hand benchmarks show that BCIR is faster to train and more accurate than the original integral regression, making it competitive with state-of-the-art detection methods.

Hand Pose Estimation regression

Improving Deep Regression with Ordinal Entropy

1 code implementation 21 Jan 2023 Shihao Zhang, Linlin Yang, Michael Bi Mi, Xiaoxu Zheng, Angela Yao

In computer vision, it is often observed that formulating regression problems as classification tasks yields better performance.

Classification Crowd Counting +2

Analyzing and Diagnosing Pose Estimation With Attributions

no code implementations CVPR 2023 Qiyuan He, Linlin Yang, Kerui Gu, Qiuxia Lin, Angela Yao

We present Pose Integrated Gradient (PoseIG), the first interpretability technique designed for pose estimation.

Hand Pose Estimation

Cross-Domain 3D Hand Pose Estimation With Dual Modalities

no code implementations CVPR 2023 Qiuxia Lin, Linlin Yang, Angela Yao

To solve this problem, we present a framework for cross-domain semi-supervised hand pose estimation and target the challenging scenario of learning models from labelled multi-modal synthetic data and unlabelled real-world data.

3D Hand Pose Estimation Contrastive Learning +2

MHEntropy: Entropy Meets Multiple Hypotheses for Pose and Shape Recovery

no code implementations ICCV 2023 Rongyu Chen, Linlin Yang, Angela Yao

For monocular RGB-based 3D pose and shape estimation, multiple solutions are often feasible due to factors like occlusion and truncation.

C2F-TCN: A Framework for Semi and Fully Supervised Temporal Action Segmentation

no code implementations 20 Dec 2022 Dipika Singhania, Rahul Rahaman, Angela Yao

For the task of temporal action segmentation, we propose an encoder-decoder-style architecture named C2F-TCN featuring a "coarse-to-fine" ensemble of decoder outputs.

Action Segmentation Representation Learning +1

UV-Based 3D Hand-Object Reconstruction with Grasp Optimization

no code implementations 24 Nov 2022 Ziwei Yu, Linlin Yang, You Xie, Ping Chen, Angela Yao

We propose a novel framework for 3D hand shape reconstruction and hand-object grasp optimization from a single RGB image.

Object Object Reconstruction

Temporal Action Segmentation: An Analysis of Modern Techniques

2 code implementations 19 Oct 2022 Guodong Ding, Fadime Sener, Angela Yao

Temporal action segmentation (TAS) in videos aims at densely identifying video frames in minutes-long videos with multiple action classes.

Action Segmentation Segmentation +1

Perception-Distortion Balanced ADMM Optimization for Single-Image Super-Resolution

1 code implementation 5 Aug 2022 Yuehan Zhang, Bo Ji, Jia Hao, Angela Yao

In image super-resolution, both pixel-wise accuracy and perceptual fidelity are desirable.

Image Super-Resolution

Discrete-Constrained Regression for Local Counting Models

1 code implementation 20 Jul 2022 Haipeng Xiong, Angela Yao

Through a series of experiments on carefully controlled synthetic data, we show that this counter-intuitive result is caused by imprecise ground truth local counts.

Age Estimation Classification +2

A Generalized & Robust Framework For Timestamp Supervision in Temporal Action Segmentation

no code implementations 20 Jul 2022 Rahul Rahaman, Dipika Singhania, Alexandre Thiery, Angela Yao

In temporal action segmentation, timestamp supervision requires only a handful of labelled frames per video sequence.

Action Segmentation TAG

Leveraging Action Affinity and Continuity for Semi-supervised Temporal Action Segmentation

no code implementations 18 Jul 2022 Guodong Ding, Angela Yao

To this end, we propose two novel loss functions for the unlabelled data: an action affinity loss and an action continuity loss.

Action Segmentation

A Closer Look at Branch Classifiers of Multi-exit Architectures

no code implementations 28 Apr 2022 Shaohui Lin, Bo Ji, Rongrong Ji, Angela Yao

Multi-exit architectures consist of a backbone and branch classifiers that offer shortened inference pathways to reduce the run-time of deep neural networks.

TemporalUV: Capturing Loose Clothing with Temporally Coherent UV Coordinates

no code implementations CVPR 2022 You Xie, Huiqi Mao, Angela Yao, Nils Thuerey

We propose a novel approach to generate temporally coherent UV coordinates for loose clothing.

Multi-Scale Memory-Based Video Deblurring

1 code implementation CVPR 2022 Bo Ji, Angela Yao

Video deblurring has achieved remarkable progress thanks to the success of deep neural networks.

Analog Video Restoration Deblurring

Local and Global Point Cloud Reconstruction for 3D Hand Pose Estimation

1 code implementation 13 Dec 2021 Ziwei Yu, Linlin Yang, Shicheng Chen, Angela Yao

This paper addresses the 3D point cloud reconstruction and 3D pose estimation of the human hand from a single RGB image.

3D Hand Pose Estimation 3D Point Cloud Reconstruction +2

Video as Conditional Graph Hierarchy for Multi-Granular Question Answering

1 code implementation 12 Dec 2021 Junbin Xiao, Angela Yao, Zhiyuan Liu, Yicong Li, Wei Ji, Tat-Seng Chua

To align with the multi-granular essence of linguistic concepts in language queries, we propose to model video as a conditional graph hierarchy which weaves together visual facts of different granularity in a level-wise manner, with the guidance of corresponding textual cues.

Question Answering Video Question Answering +1

Iterative Contrast-Classify For Semi-supervised Temporal Action Segmentation

1 code implementation 2 Dec 2021 Dipika Singhania, Rahul Rahaman, Angela Yao

Our method hinges on unsupervised representation learning, which, for temporal action segmentation, poses unique challenges.

Action Segmentation Representation Learning +2

Weakly-Supervised Dense Action Anticipation

1 code implementation 15 Nov 2021 Haotong Zhang, Fuhai Chen, Angela Yao

We present a (semi-) weakly supervised method using only a small number of fully-labelled sequences and predominantly sequences in which only the (one) upcoming action is labelled.

Action Anticipation

Dive Deeper Into Integral Pose Regression

no code implementations ICLR 2022 Kerui Gu, Linlin Yang, Angela Yao

We do a deep dive on the inference and back-propagation of integral pose regression to better understand the causes behind the performance and training differences.

Hand Pose Estimation regression

Temporal Action Segmentation with High-level Complex Activity Labels

no code implementations 15 Aug 2021 Guodong Ding, Angela Yao

Due to the lack of action-level supervision, we adopt the Hungarian matching algorithm to relate latent action prototypes to ground truth semantic classes for evaluation.

Action Recognition Action Segmentation +2

Robust Semantic Segmentation with Superpixel-Mix

1 code implementation 2 Aug 2021 Gianni Franchi, Nacim Belkhir, Mai Lan Ha, Yufei Hu, Andrei Bursuc, Volker Blanz, Angela Yao

Along with predictive performance and runtime speed, reliability is a key requirement for real-world semantic segmentation.

Data Augmentation Segmentation +2

Accelerating Video Object Segmentation with Compressed Video

2 code implementations CVPR 2022 Kai Xu, Angela Yao

We propose an efficient plug-and-play acceleration framework for semi-supervised video object segmentation by exploiting the temporal redundancies in videos presented by the compressed bitstream.

Object Segmentation +3

NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions

1 code implementation CVPR 2021 Junbin Xiao, Xindi Shang, Angela Yao, Tat-Seng Chua

We introduce NExT-QA, a rigorously designed video question answering (VideoQA) benchmark to advance video understanding from describing to explaining the temporal actions.

Question Answering Video Question Answering +2

Learning Deep Morphological Networks with Neural Architecture Search

1 code implementation 14 Jun 2021 Yufei Hu, Nacim Belkhir, Jesus Angulo, Angela Yao, Gianni Franchi

Using a combination of linear and non-linear procedures is critical for generating a sufficiently deep feature space.

Edge Detection Meta-Learning +1

Transferring Knowledge from Text to Video: Zero-Shot Anticipation for Procedural Actions

no code implementations 6 Jun 2021 Fadime Sener, Rishabh Saraf, Angela Yao

Can we teach a robot to recognize and make predictions for activities that it has never seen before?

Zero-Shot Learning

Transformed ROIs for Capturing Visual Transformations in Videos

no code implementations 6 Jun 2021 Abhinav Rai, Fadime Sener, Angela Yao

Modeling the visual changes that an action brings to a scene is critical for video understanding.

Action Recognition Video Understanding

Towards Compact Single Image Super-Resolution via Contrastive Self-distillation

8 code implementations 25 May 2021 Yanbo Wang, Shaohui Lin, Yanyun Qu, Haiyan Wu, Zhizhong Zhang, Yuan Xie, Angela Yao

Convolutional neural networks (CNNs) are highly successful for super-resolution (SR) but often require sophisticated architectures with heavy memory cost and computational overhead, which significantly restricts their practical deployment on resource-limited devices.

Image Super-Resolution SSIM +1

Coarse to Fine Multi-Resolution Temporal Convolutional Network

1 code implementation 23 May 2021 Dipika Singhania, Rahul Rahaman, Angela Yao

In this work, we propose a novel temporal encoder-decoder to tackle the problem of sequence fragmentation.

Action Segmentation Segmentation +2

NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions

2 code implementations 18 May 2021 Junbin Xiao, Xindi Shang, Angela Yao, Tat-Seng Chua

We introduce NExT-QA, a rigorously designed video question answering (VideoQA) benchmark to advance video understanding from describing to explaining the temporal actions.

Question Answering Video Question Answering +2

Removing the Bias of Integral Pose Regression

no code implementations ICCV 2021 Kerui Gu, Linlin Yang, Angela Yao

Heatmap-based detection methods are dominant for 2D human pose estimation even though regression is more intuitive.

2D Human Pose Estimation 2D Pose Estimation +2

SemiHand: Semi-Supervised Hand Pose Estimation With Consistency

no code implementations ICCV 2021 Linlin Yang, Shicheng Chen, Angela Yao

By design, we introduce data augmentation of differing difficulties, consistency regularizer, label correction and sample selection for RGB-based 3D hand pose estimation.

3D Hand Pose Estimation Data Augmentation

Multi-Stage Fusion for One-Click Segmentation

no code implementations 19 Oct 2020 Soumajit Majumder, Ansh Khurana, Abhinav Rai, Angela Yao

Segmenting objects of interest in an image is an essential building block of applications such as photo-editing and image analysis.

Instance Segmentation Interactive Segmentation +2

Localized Interactive Instance Segmentation

no code implementations 18 Oct 2020 Soumajit Majumder, Angela Yao

In current interactive instance segmentation works, the user is granted a free hand when providing clicks to segment an object; clicks are allowed on background pixels and other object instances far from the target object.

Instance Segmentation Interactive Segmentation +3

Rethinking CNN Models for Audio Classification

3 code implementations 22 Jul 2020 Kamalesh Palanisamy, Dipika Singhania, Angela Yao

Besides, we show that even though we use the pretrained model weights for initialization, there is variance in performance in various output runs of the same model.

Environmental Sound Classification General Classification +2

Neural network compression via learnable wavelet transforms

1 code implementation 20 Apr 2020 Moritz Wolter, Shaohui Lin, Angela Yao

Linear layers still occupy a significant portion of the parameters in recurrent neural networks (RNNs).

Data Compression Neural Network Compression

Bonn Activity Maps: Dataset Description

no code implementations 13 Dec 2019 Julian Tanke, Oh-Hun Kwon, Patrick Stotko, Radu Alexandru Rosu, Michael Weinmann, Hassan Errami, Sven Behnke, Maren Bennewitz, Reinhard Klein, Andreas Weber, Angela Yao, Juergen Gall

The key prerequisite for accessing the huge potential of current machine learning techniques is the availability of large databases that capture the complex relations of interest.

Activity Recognition

Dual Grid Net: hand mesh vertex regression from single depth maps

no code implementations ECCV 2020 Chengde Wan, Thomas Probst, Luc van Gool, Angela Yao

In the first stage, the network estimates a dense correspondence field for every pixel on the depth map or image grid to the mesh grid.

regression

Sequence Prediction using Spectral RNNs

2 code implementations 13 Dec 2018 Moritz Wolter, Juergen Gall, Angela Yao

Fourier methods have a long and proven track record as an excellent tool in data processing.

Time Series Time Series Analysis

Supervised Deep Kriging for Single-Image Super-Resolution

no code implementations 10 Dec 2018 Gianni Franchi, Angela Yao, Andreas Kolb

We propose a novel single-image super-resolution approach based on the geostatistical method of kriging.

Image Super-Resolution Spatial Interpolation

Learning Style Compatibility for Furniture

no code implementations 9 Dec 2018 Divyansh Aggarwal, Elchin Valiyev, Fadime Sener, Angela Yao

When judging style, a key question that often arises is whether or not a pair of objects are compatible with each other.

Attribute

Zero-Shot Anticipation for Instructional Activities

no code implementations ICCV 2019 Fadime Sener, Angela Yao

How can we teach a robot to predict what will happen next for an activity it has never seen before?

Zero-Shot Learning

Disentangling Latent Hands for Image Synthesis and Pose Estimation

no code implementations CVPR 2019 Linlin Yang, Angela Yao

Hand image synthesis and pose estimation from RGB images are both highly challenging tasks due to the large discrepancy between factors of variation ranging from image background content to camera viewpoint.

Image Generation Pose Estimation

Complex Gated Recurrent Neural Networks

1 code implementation NeurIPS 2018 Moritz Wolter, Angela Yao

Complex numbers have long been favoured for digital signal processing, yet complex representations rarely appear in deep learning architectures.

Human motion prediction motion prediction +3

Unsupervised Learning and Segmentation of Complex Activities from Video

no code implementations CVPR 2018 Fadime Sener, Angela Yao

This paper presents a new method for unsupervised segmentation of complex activities from video into multiple steps, or sub-activities, without any textual input.

Dense 3D Regression for Hand Pose Estimation

1 code implementation CVPR 2018 Chengde Wan, Thomas Probst, Luc van Gool, Angela Yao

Specifically, we decompose the pose parameters into a set of per-pixel estimations, i.e., 2D heat maps, 3D heat maps and unit 3D directional vector fields.

3D Hand Pose Estimation regression

Crossing Nets: Combining GANs and VAEs with a Shared Latent Space for Hand Pose Estimation

no code implementations CVPR 2017 Chengde Wan, Thomas Probst, Luc van Gool, Angela Yao

Regressing the hand pose can then be done by learning a discriminator to estimate the posterior of the latent pose given some depth maps.

3D Hand Pose Estimation

Direction matters: hand pose estimation from local surface normals

no code implementations 10 Apr 2016 Chengde Wan, Angela Yao, Luc van Gool

We present a hierarchical regression framework for estimating hand joint positions from single depth images based on local surface normals.

Hand Pose Estimation regression

Efficient Unsupervised Temporal Segmentation of Motion Data

no code implementations 22 Oct 2015 Björn Krüger, Anna Vögele, Tobias Willig, Angela Yao, Reinhard Klein, Andreas Weber

We introduce a method for automated temporal segmentation of human motion data into distinct actions and compositing motion primitives based on self-similar structures in the motion sequence.

Clustering Markerless Motion Capture +1

Learning Probabilistic Non-Linear Latent Variable Models for Tracking Complex Activities

no code implementations NeurIPS 2011 Angela Yao, Juergen Gall, Luc V. Gool, Raquel Urtasun

A common approach for handling the complexity and inherent ambiguities of 3D human pose estimation is to use pose priors learned from training data.

3D Human Pose Estimation
