Search Results for author: Yifei HUANG

Found 22 papers, 10 papers with code

CLRNet: Cross Layer Refinement Network for Lane Detection

2 code implementations 19 Mar 2022 Tu Zheng, Yifei HUANG, Yang Liu, Wenjian Tang, Zheng Yang, Deng Cai, Xiaofei He

In this way, we can exploit more contextual information to detect lanes while leveraging local detailed lane features to improve localization accuracy.

Lane Detection

Leveraging Human Selective Attention for Medical Image Analysis with Limited Training Data

no code implementations 2 Dec 2021 Yifei HUANG, Xiaoxiao Li, Lijin Yang, Lin Gu, Yingying Zhu, Hirofumi Seo, Qiuming Meng, Tatsuya Harada, Yoichi Sato

Then we design a novel Auxiliary Attention Block (AAB) to allow information from SAN to be utilized by the backbone encoder to focus on selective areas.

Tumor Segmentation

Stacked Temporal Attention: Improving First-person Action Recognition by Emphasizing Discriminative Clips

no code implementations 2 Dec 2021 Lijin Yang, Yifei HUANG, Yusuke Sugano, Yoichi Sato

Previous works tried to address this problem by applying temporal attention but failed to consider the global context of the full video, which is critical for determining which parts are relatively significant.

Action Recognition, Video Understanding

Ego4D: Around the World in 3,000 Hours of Egocentric Video

no code implementations 13 Oct 2021 Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina Gonzalez, James Hillis, Xuhua Huang, Yifei HUANG, Wenqi Jia, Weslie Khoo, Jachym Kolar, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbelaez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik

We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite.


Spatio-Temporal Perturbations for Video Attribution

no code implementations 1 Sep 2021 Zhenqiang Li, Weimin WANG, Zuoyue Li, Yifei HUANG, Yoichi Sato

The attribution method provides a direction for interpreting opaque neural networks in a visual way by identifying and visualizing the input regions/pixels that dominate the output of a network.

Video Understanding

FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning

1 code implementation ICCV 2021 Chenxu Zhang, Yifan Zhao, Yifei HUANG, Ming Zeng, Saifeng Ni, Madhukar Budagavi, Xiaohu Guo

In this paper, we propose a talking face generation method that takes an audio signal as input and a short target video clip as reference, and synthesizes a photo-realistic video of the target face with natural lip motions, head poses, and eye blinks that are in-sync with the input audio signal.

3D Face Animation, Talking Face Generation

EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge for Action Recognition 2021: Team M3EM Technical Report

no code implementations 18 Jun 2021 Lijin Yang, Yifei HUANG, Yusuke Sugano, Yoichi Sato

In this report, we describe the technical details of our submission to the 2021 EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge for Action Recognition.

Action Recognition, Unsupervised Domain Adaptation

Goal-Oriented Gaze Estimation for Zero-Shot Learning

1 code implementation CVPR 2021 Yang Liu, Lei Zhou, Xiao Bai, Yifei HUANG, Lin Gu, Jun Zhou, Tatsuya Harada

Therefore, we introduce a novel goal-oriented gaze estimation module (GEM) to improve the discriminative attribute localization based on the class-level attributes for ZSL.

Gaze Estimation, Generalized Zero-Shot Learning

Commonsense Knowledge Aware Concept Selection For Diverse and Informative Visual Storytelling

no code implementations 5 Feb 2021 Hong Chen, Yifei HUANG, Hiroya Takamura, Hideki Nakayama

To enrich the candidate concepts, a commonsense knowledge graph is created for each image sequence from which the concept candidates are proposed.

Informativeness, Visual Storytelling

Adversarial Robustness of Stabilized NeuralODEs Might be from Obfuscated Gradients

1 code implementation 28 Sep 2020 Yifei Huang, Yaodong Yu, Hongyang Zhang, Yi Ma, Yuan YAO

Even replacing only the first layer of a ResNet by such an ODE block can exhibit further improvement in robustness, e.g., under PGD-20 ($\ell_\infty=0.031$) attack on the CIFAR-10 dataset, it achieves 91.57% natural accuracy and 62.35% robust accuracy, while a counterpart architecture of ResNet trained with TRADES achieves natural and robust accuracy of 76.29% and 45.24%, respectively.

Adversarial Defense, Adversarial Robustness

Improving Action Segmentation via Graph-Based Temporal Reasoning

no code implementations CVPR 2020 Yifei Huang, Yusuke Sugano, Yoichi Sato

In this paper, we propose a network module called Graph-based Temporal Reasoning Module (GTRM) that can be built on top of existing action segmentation models to learn the relation of multiple action segments in various time spans.

Action Segmentation

Towards Visually Explaining Video Understanding Networks with Perturbation

1 code implementation 1 May 2020 Zhenqiang Li, Weimin WANG, Zuoyue Li, Yifei HUANG, Yoichi Sato

"Making black box models explainable" is a vital problem that accompanies the development of deep learning networks.

Video Understanding

Discovery of Bias and Strategic Behavior in Crowdsourced Performance Assessment

no code implementations 5 Aug 2019 Yifei Huang, Matt Shum, Xi Wu, Jason Zezhong Xiao

With the industry trend of shifting from a traditional hierarchical approach to flatter management structure, crowdsourced performance assessment gained mainstream popularity.



no code implementations ICLR 2019 Yifei HUANG, Yuan YAO, Weizhi Zhu

A long-standing belief in machine learning holds that enlarging margins over training data accounts for the resistance of models to overfitting by increasing the robustness.

Generalization Bounds

An Evaluation of Transfer Learning for Classifying Sales Engagement Emails at Large Scale

no code implementations 19 Apr 2019 Yong Liu, Pavel Dmitriev, Yifei HUANG, Andrew Brooks, Li Dong

Our results show that fine-tuning of the BERT model outperforms with as few as 300 labeled samples, but underperforms with fewer than 300 labeled samples, relative to all the feature-based approaches using different embeddings.

Language Modelling, Transfer Learning

Manipulation-skill Assessment from Videos with Spatial Attention Network

no code implementations 9 Jan 2019 Zhenqiang Li, Yifei Huang, Minjie Cai, Yoichi Sato

Recent advances in computer vision have made it possible to automatically assess from videos the manipulation skills of humans in performing a task, which breeds many important applications in domains such as health rehabilitation and manufacturing.

Mutual Context Network for Jointly Estimating Egocentric Gaze and Actions

no code implementations 7 Jan 2019 Yifei Huang, Zhenqiang Li, Minjie Cai, Yoichi Sato

In this work, we address two coupled tasks of gaze prediction and action recognition in egocentric videos by exploring their mutual context.

Action Recognition, Gaze Prediction

Differentiable Fine-grained Quantization for Deep Neural Network Compression

1 code implementation NIPS Workshop CDNNRIA 2018 Hsin-Pai Cheng, Yuanjun Huang, Xuyang Guo, Yifei HUANG, Feng Yan, Hai Li, Yiran Chen

Thus judiciously selecting different precision for different layers/structures can potentially produce more efficient models compared to traditional quantization methods by striking a better balance between accuracy and compression rate.

Neural Network Compression, Quantization

Semantic Aware Attention Based Deep Object Co-segmentation

3 code implementations 16 Oct 2018 Hong Chen, Yifei HUANG, Hideki Nakayama

Object co-segmentation is the task of segmenting the same objects from multiple images.

Rethinking Breiman's Dilemma in Neural Networks: Phase Transitions of Margin Dynamics

1 code implementation 8 Oct 2018 Weizhi Zhu, Yifei HUANG, Yuan YAO

In this paper, we revisit Breiman's dilemma in deep neural networks with recently proposed spectrally normalized margins, from a novel perspective based on phase transitions of normalized margin distributions in training dynamics.

Generalization Bounds

Predicting Gaze in Egocentric Video by Learning Task-dependent Attention Transition

2 code implementations ECCV 2018 Yifei Huang, Minjie Cai, Zhenqiang Li, Yoichi Sato

We present a new computational model for gaze prediction in egocentric videos by exploring patterns in temporal shift of gaze fixations (attention transition) that are dependent on egocentric manipulation tasks.

Gaze Prediction, Saliency Prediction

