1 code implementation • 6 Mar 2023 • Yi-Lun Lee, Yi-Hsuan Tsai, Wei-Chen Chiu, Chen-Yu Lee
In this paper, we tackle two challenges in multimodal learning for visual recognition: 1) when a modality is missing during either training or testing, as occurs in real-world situations; and 2) when the computation resources are not available to finetune heavy transformer models.
1 code implementation • 6 Feb 2023 • Kuniaki Saito, Kihyuk Sohn, Xiang Zhang, Chun-Liang Li, Chen-Yu Lee, Kate Saenko, Tomas Pfister
Existing methods rely on supervised learning of CIR models using labeled triplets consisting of the query image, text specification, and the target image.
no code implementations • 12 Jan 2023 • Ruoxi Sun, Chun-Liang Li, Sercan O. Arik, Michael W. Dusenberry, Chen-Yu Lee, Tomas Pfister
Accurate estimation of output quantiles is crucial in many use cases where modeling the range of possible outcomes is desired.
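For context, quantile estimation is commonly trained with the standard pinball (quantile) loss; the sketch below is a generic illustration of that loss, not necessarily the method proposed in the paper above.

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau):
    """Standard quantile (pinball) loss for quantile level tau in (0, 1)."""
    diff = y_true - y_pred
    return np.mean(np.maximum(tau * diff, (tau - 1.0) * diff))

# A model trained to minimize pinball_loss(..., tau=0.9) is pushed toward
# predicting the 90th percentile of the output distribution.
```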
no code implementations • 15 Nov 2022 • Zilong Wang, Yichao Zhou, Wei Wei, Chen-Yu Lee, Sandeep Tata
Understanding visually-rich business documents to extract structured data and automate business workflows has been receiving attention both in academia and industry.
no code implementations • 14 Nov 2022 • Zifeng Wang, Zizhao Zhang, Jacob Devlin, Chen-Yu Lee, Guolong Su, Hao Zhang, Jennifer Dy, Vincent Perot, Tomas Pfister
Zero-shot transfer learning for document understanding is a crucial yet under-investigated scenario to help reduce the high cost involved in annotating document entities.
no code implementations • 2 Jun 2022 • Kuniaki Saito, Kihyuk Sohn, Xiang Zhang, Chun-Liang Li, Chen-Yu Lee, Kate Saenko, Tomas Pfister
However, a naive unification of the real caption and the prompt sentences could lead to a complication in learning, as the distribution shift in text may not be handled properly in the language encoder.
2 code implementations • 10 Apr 2022 • Zifeng Wang, Zizhao Zhang, Sayna Ebrahimi, Ruoxi Sun, Han Zhang, Chen-Yu Lee, Xiaoqi Ren, Guolong Su, Vincent Perot, Jennifer Dy, Tomas Pfister
Continual learning aims to enable a single model to learn a sequence of tasks without catastrophic forgetting.
no code implementations • ACL 2022 • Chen-Yu Lee, Chun-Liang Li, Timothy Dozat, Vincent Perot, Guolong Su, Nan Hua, Joshua Ainslie, Renshen Wang, Yasuhisa Fujii, Tomas Pfister
Sequence modeling has demonstrated state-of-the-art performance on natural language and document understanding tasks.
no code implementations • 10 Jan 2022 • Vishnu Suresh Lokhande, Kihyuk Sohn, Jinsung Yoon, Madeleine Udell, Chen-Yu Lee, Tomas Pfister
Such a requirement is impractical in situations where the data labeling efforts for minority or rare groups are significantly laborious or where the individuals comprising the dataset choose to conceal sensitive information.
no code implementations • 21 Dec 2021 • Kihyuk Sohn, Jinsung Yoon, Chun-Liang Li, Chen-Yu Lee, Tomas Pfister
We define a distance function between images, each represented as a bag of embeddings, as the Euclidean distance between their weighted-average embeddings.
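For context, a minimal NumPy sketch of this distance, assuming each image comes with per-embedding weights (the weighting scheme itself is part of the paper's method and is not reproduced here):

```python
import numpy as np

def image_distance(emb_a, weights_a, emb_b, weights_b):
    """Distance between two images, each a bag of embeddings of shape (n_i, d).
    Each bag is collapsed to a weighted average; the distance is the Euclidean
    distance between the two averaged embeddings."""
    mean_a = np.average(emb_a, axis=0, weights=weights_a)
    mean_b = np.average(emb_b, axis=0, weights=weights_b)
    return np.linalg.norm(mean_a - mean_b)
```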
1 code implementation • CVPR 2022 • Zifeng Wang, Zizhao Zhang, Chen-Yu Lee, Han Zhang, Ruoxi Sun, Xiaoqi Ren, Guolong Su, Vincent Perot, Jennifer Dy, Tomas Pfister
The mainstream paradigm behind continual learning has been to adapt the model parameters to non-stationary data distributions, where catastrophic forgetting is the central challenge.
no code implementations • 29 Sep 2021 • Vishnu Suresh Lokhande, Kihyuk Sohn, Jinsung Yoon, Madeleine Udell, Chen-Yu Lee, Tomas Pfister
Such a requirement is impractical in situations where the data labeling efforts for minority or rare groups are significantly laborious or where the individuals comprising the dataset choose to conceal sensitive information.
no code implementations • 29 Sep 2021 • Justin Lazarow, Kihyuk Sohn, Chun-Liang Li, Zizhao Zhang, Chen-Yu Lee, Tomas Pfister
While remarkable progress in imbalanced supervised learning has been made recently, less attention has been given to the setting of imbalanced semi-supervised learning (SSL), where not only is little labeled data provided, but the underlying data distribution can also be severely imbalanced.
no code implementations • ACL 2021 • Chen-Yu Lee, Chun-Liang Li, Chu Wang, Renshen Wang, Yasuhisa Fujii, Siyang Qin, Ashok Popat, Tomas Pfister
Natural reading orders of words are crucial for information extraction from form-like documents.
no code implementations • 11 Jun 2021 • Jinsung Yoon, Kihyuk Sohn, Chun-Liang Li, Sercan O. Arik, Chen-Yu Lee, Tomas Pfister
We demonstrate our method on various unsupervised AD tasks with image and tabular data.
no code implementations • 11 Jan 2021 • Kunpeng Li, Zizhao Zhang, Guanhang Wu, Xuehan Xiong, Chen-Yu Lee, Zhichao Lu, Yun Fu, Tomas Pfister
To address this issue, we introduce a new method for pre-training video action recognition models using queried web videos.
no code implementations • 1 Jan 2021 • Kunpeng Li, Zizhao Zhang, Guanhang Wu, Xuehan Xiong, Chen-Yu Lee, Yun Fu, Tomas Pfister
To address this issue, we introduce a new method for pre-training video action recognition models using queried web videos.
no code implementations • ICML 2020 • Pengsheng Guo, Chen-Yu Lee, Daniel Ulbricht
Training multiple tasks jointly in one deep network yields reduced inference latency and better performance than the single-task counterparts by sharing certain layers of the network.
6 code implementations • 10 May 2020 • Kihyuk Sohn, Zizhao Zhang, Chun-Liang Li, Han Zhang, Chen-Yu Lee, Tomas Pfister
Semi-supervised learning (SSL) has the potential to improve the predictive performance of machine learning models using unlabeled data.
Ranked #10 on Semi-Supervised Object Detection on COCO 100% labeled data (using extra training data)
2 code implementations • CVPR 2019 • Chen-Yu Lee, Tanmay Batra, Mohammad Haris Baig, Daniel Ulbricht
In this work, we connect two distinct concepts for unsupervised domain adaptation: feature distribution alignment between domains by utilizing the task-specific decision boundary, and the Wasserstein metric (a minimal sketch of the sliced Wasserstein computation follows below).
Ranked #17 on Domain Adaptation on VisDA2017
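For context, a minimal NumPy sketch of a sliced Wasserstein discrepancy between the outputs of two classifiers; the number of projections and the squared-difference aggregation are assumptions rather than the paper's exact configuration.

```python
import numpy as np

def sliced_wasserstein_discrepancy(p1, p2, num_projections=128, rng=None):
    """Approximate sliced Wasserstein distance between two sets of classifier
    outputs p1, p2 of shape (n, k): project onto random unit directions,
    sort each 1-D projection, and average the squared differences."""
    rng = np.random.default_rng() if rng is None else rng
    proj = rng.normal(size=(p1.shape[1], num_projections))
    proj /= np.linalg.norm(proj, axis=0, keepdims=True)  # unit-norm directions
    p1_sorted = np.sort(p1 @ proj, axis=0)  # shape (n, num_projections)
    p2_sorted = np.sort(p2 @ proj, axis=0)
    return np.mean((p1_sorted - p2_sorted) ** 2)
```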
4 code implementations • ICML 2018 • Zhao Chen, Vijay Badrinarayanan, Chen-Yu Lee, Andrew Rabinovich
Deep multitask networks, in which one neural network produces multiple predictive outputs, can offer better speed and performance than their single-task counterparts but are challenging to train properly.
1 code implementation • ICCV 2017 • Chen-Yu Lee, Vijay Badrinarayanan, Tomasz Malisiewicz, Andrew Rabinovich
This paper focuses on the task of room layout estimation from a monocular RGB image.
no code implementations • CVPR 2016 • Chen-Yu Lee, Simon Osindero
We present recursive recurrent neural networks with attention modeling (R$^2$AM) for lexicon-free optical character recognition in natural scene images.
2 code implementations • 30 Sep 2015 • Chen-Yu Lee, Patrick W. Gallagher, Zhuowen Tu
We seek to improve deep neural networks by generalizing the pooling operations that play a central role in current architectures (one such generalization is sketched below).
Ranked #19 on Image Classification on SVHN
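For context, one simple instance of such a generalization is a learned mix of max and average pooling; in the sketch below the mixing weight alpha is passed in as a plain scalar rather than a trained parameter.

```python
import numpy as np

def mixed_pool_2x2(x, alpha):
    """Mixed max-average pooling over non-overlapping 2x2 windows.
    x: feature map of shape (H, W) with even H and W; alpha in [0, 1]
    interpolates between average pooling (0) and max pooling (1)."""
    h, w = x.shape
    windows = (x.reshape(h // 2, 2, w // 2, 2)
                .transpose(0, 2, 1, 3)
                .reshape(h // 2, w // 2, 4))
    return alpha * windows.max(axis=-1) + (1.0 - alpha) * windows.mean(axis=-1)
```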
1 code implementation • 11 May 2015 • Liwei Wang, Chen-Yu Lee, Zhuowen Tu, Svetlana Lazebnik
One of the most promising ways of improving the performance of deep convolutional neural networks is by increasing the number of convolutional layers.
1 code implementation • 18 Sep 2014 • Chen-Yu Lee, Saining Xie, Patrick Gallagher, Zhengyou Zhang, Zhuowen Tu
Our proposed deeply-supervised nets (DSN) method simultaneously minimizes classification error while making the learning process of hidden layers direct and transparent (the companion-loss objective is sketched below).
Ranked #25 on Image Classification on SVHN
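For context, a minimal sketch of a deeply-supervised objective: the final classification loss plus a weighted companion loss from an auxiliary classifier attached to each hidden layer. The aux_classifiers callables and the companion weights are hypothetical placeholders, not the paper's exact formulation.

```python
import numpy as np

def softmax_xent(logits, label):
    """Cross-entropy of a single example given raw class logits."""
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

def dsn_loss(hidden_feats, final_logits, label, aux_classifiers, weights):
    """Final-layer cross-entropy plus weighted companion losses, one per
    hidden layer; aux_classifiers[i] maps hidden features to class logits."""
    loss = softmax_xent(final_logits, label)
    for feat, clf, w in zip(hidden_feats, aux_classifiers, weights):
        loss += w * softmax_xent(clf(feat), label)
    return loss
```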
no code implementations • CVPR 2014 • Chen-Yu Lee, Anurag Bhardwaj, Wei Di, Vignesh Jagadeesh, Robinson Piramuthu
We present a new feature representation method for the scene text recognition problem, particularly focusing on improving scene character recognition.