1 code implementation • COLING 2022 • Heng-yang Lu, Chenyou Fan, Jun Yang, Cong Hu, Wei Fang, Xiao-Jun Wu
Based on the predicted P2P, four effective strategies are introduced to show the BDA performance.
no code implementations • 9 Mar 2023 • Junjie Hu, Chenyou Fan, Liguang Zhou, Qing Gao, Honghai Liu, Tin Lun Lam
In this paper, we seek to enable lifelong learning for MDE, which performs cross-domain depth learning sequentially, to achieve high plasticity on a new domain and maintain good stability on original domains.
1 code implementation • 29 Aug 2022 • Junjie Hu, Chenyou Fan, Mete Ozay, Hua Feng, Yuan Gao, Tin Lun Lam
In this paper, we introduce the ground-to-aerial perception knowledge transfer and propose a progressive semi-supervised learning framework that enables drone perception using only labeled data of ground viewpoint and unlabeled data of flying viewpoints.
no code implementations • 26 Aug 2022 • Junjie Hu, Chenyou Fan, Mete Ozay, Hualie Jiang, Tin Lun Lam
We study data-free knowledge distillation (KD) for monocular depth estimation (MDE), which learns a lightweight model for real-world depth perception tasks by compressing it from a trained teacher model while lacking training data in the target domain.
no code implementations • 11 May 2022 • Junjie Hu, Chenyu Bao, Mete Ozay, Chenyou Fan, Qing Gao, Honghai Liu, Tin Lun Lam
Depth completion aims at predicting dense pixel-wise depth from an extremely sparse map captured from a depth sensor, e. g., LiDARs.
no code implementations • 8 Sep 2021 • Chongyang Wang, Yuan Gao, Chenyou Fan, Junjie Hu, Tin Lun Lam, Nicholas D. Lane, Nadia Bianchi-Berthouze
For such issues, we propose a novel Learning to Agreement (Learn2Agree) framework to tackle the challenge of learning from multiple annotators without objective ground truth.
2 code implementations • 13 May 2021 • Junjie Hu, Chenyou Fan, Hualie Jiang, Xiyue Guo, Yuan Gao, Xiangyong Lu, Tin Lun Lam
In this paper, we aim to achieve accurate depth estimation with a light-weight network.
no code implementations • 1 Apr 2021 • Chenyou Fan, Jianwei Huang
In this paper, we propose a federated few-shot learning (FedFSL) framework to learn a few-shot classification model that can classify unseen data classes with only a few labeled samples.
no code implementations • NeurIPS 2020 • Tianyi Lin, Chenyou Fan, Nhat Ho, Marco Cuturi, Michael. I. Jordan
Projection robust Wasserstein (PRW) distance, or Wasserstein projection pursuit (WPP), is a robust variant of the Wasserstein distance.
no code implementations • 7 May 2020 • Chenyou Fan, Ping Liu
This work studies training generative adversarial networks under the federated learning setting.
1 code implementation • CVPR 2019 • Chenyou Fan, Xiaofan Zhang, Shu Zhang, Wensheng Wang, Chi Zhang, Heng Huang
In this paper, we propose a novel end-to-end trainable Video Question Answering (VideoQA) framework with three major components: 1) a new heterogeneous memory which can effectively learn global context information from appearance and motion features; 2) a redesigned question memory which helps understand the complex semantics of question and highlights queried subjects; and 3) a new multimodal fusion layer which performs multi-step reasoning by attending to relevant visual and textual hints with self-updated attention.
Ranked #21 on
Visual Question Answering (VQA)
on MSVD-QA
1 code implementation • 1 Jun 2018 • Tianyi Lin, Chenyou Fan, Mengdi Wang, Michael. I. Jordan
Convex composition optimization is an emerging topic that covers a wide range of applications arising from stochastic optimal control, reinforcement learning and multi-stage stochastic programming.
no code implementations • ECCV 2018 • Mingze Xu, Chenyou Fan, Yuchen Wang, Michael S. Ryoo, David J. Crandall
In this paper, we wish to solve two specific problems: (1) given two or more synchronized third-person videos of a scene, produce a pixel-level segmentation of each visible person and identify corresponding people across different views (i. e., determine who in camera A corresponds with whom in camera B), and (2) given one or more synchronized third-person videos as well as a first-person video taken by a mobile or wearable camera, segment and identify the camera wearer in the third-person videos.
no code implementations • 7 Feb 2018 • Tianyi Lin, Chenyou Fan, Mengdi Wang
We consider the nonsmooth convex composition optimization problem where the objective is a composition of two finite-sum functions and analyze stochastic compositional variance reduced gradient (SCVRG) methods for them.
1 code implementation • 11 Jan 2018 • Mingze Xu, Chenyou Fan, John D Paden, Geoffrey C. Fox, David J. Crandall
Deep learning methods have surpassed the performance of traditional techniques on a wide range of problems in computer vision, but nearly all of this work has studied consumer photos, where precisely correct output is often not critical.
no code implementations • 20 May 2017 • Chenyou Fan, Jangwon Lee, Michael S. Ryoo
The key idea is that (1) an intermediate representation of a convolutional object recognition model abstracts scene information in its frame and that (2) we can predict (i. e., regress) such representations corresponding to the future frames based on that of the current frame.
no code implementations • CVPR 2017 • Chenyou Fan, Jang-Won Lee, Mingze Xu, Krishna Kumar Singh, Yong Jae Lee, David J. Crandall, Michael S. Ryoo
We consider scenarios in which we wish to perform joint scene understanding, object tracking, activity recognition, and other tasks in environments in which multiple people are wearing body-worn cameras while a third-person static camera also captures the scene.
1 code implementation • 12 Aug 2016 • Chenyou Fan, David J. Crandall
Lifelogging cameras capture everyday life from a first-person perspective, but generate so much data that it is hard for users to browse and organize their image collections effectively.
1 code implementation • 26 May 2016 • AJ Piergiovanni, Chenyou Fan, Michael S. Ryoo
In this paper, we newly introduce the concept of temporal attention filters, and describe how they can be used for human activity recognition from videos.
Ranked #1 on
Activity Recognition In Videos
on DogCentric