1 code implementation • 29 Nov 2022 • Manel Baradad, Chun-Fu Chen, Jonas Wulff, Tongzhou Wang, Rogerio Feris, Antonio Torralba, Phillip Isola
Learning image representations using synthetic data allows training neural networks without some of the concerns associated with real images, such as privacy and bias.
no code implementations • 18 Oct 2022 • Chun-Fu Chen, Shaohan Hu, Zhonghao Shi, Prateek Gulati, Bill Moriarty, Marco Pistoia, Vincenzo Piuri, Pierangela Samarati
The recent rapid advances in machine learning technologies largely depend on the vast amount of data available today, in terms of both quantity and richness of content.
no code implementations • 30 Nov 2021 • Samarth Mishra, Rameswar Panda, Cheng Perng Phoo, Chun-Fu Chen, Leonid Karlinsky, Kate Saenko, Venkatesh Saligrama, Rogerio S. Feris
It is thus better to tailor synthetic pre-training data to the specific downstream task for best performance.
no code implementations • 29 Sep 2021 • Quanfu Fan, Kaidi Xu, Chun-Fu Chen, Sijia Liu, Gaoyuan Zhang, David Daniel Cox, Xue Lin
Physical adversarial attacks apply carefully crafted adversarial perturbations onto real objects to maliciously alter the prediction of object classifiers or detectors.
1 code implementation • ICCV 2021 • Ximeng Sun, Rameswar Panda, Chun-Fu Chen, Aude Oliva, Rogerio Feris, Kate Saenko
Deep convolutional networks have recently achieved great success in video recognition, yet their practical realization remains a challenge due to the large amount of computational resources required to achieve robust recognition.
1 code implementation • NeurIPS 2021 • Ashraful Islam, Chun-Fu Chen, Rameswar Panda, Leonid Karlinsky, Rogerio Feris, Richard J. Radke
As the base dataset and unlabeled dataset are from different domains, projecting the target images in the class-domain of the base dataset with a fixed pretrained model might be sub-optimal.
3 code implementations • ICLR 2022 • Chun-Fu Chen, Rameswar Panda, Quanfu Fan
The regional-to-local attention proceeds in two steps: first, regional self-attention extracts global information by attending over all regional tokens; then, local self-attention exchanges information between each regional token and its associated local tokens.
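The two-step scheme maps directly onto stacked attention calls. Below is a minimal PyTorch sketch, assuming a shared embedding dimension and standard `nn.MultiheadAttention`; the residual connections, layer norms, and window layout of the actual implementation are omitted, and all shapes and names are illustrative only.

```python
import torch
import torch.nn as nn

class RegionalToLocalAttention(nn.Module):
    """Step 1: regional tokens attend globally to each other.
    Step 2: each regional token exchanges information with its own local tokens."""
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.regional_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.local_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, regional, local):
        # regional: (B, R, D) -- one token per region
        # local:    (B, R, L, D) -- L local tokens per region
        B, R, L, D = local.shape
        # Step 1: global self-attention among all regional tokens.
        regional, _ = self.regional_attn(regional, regional, regional)
        # Step 2: per-region self-attention over [regional token; its local tokens].
        tokens = torch.cat([regional.unsqueeze(2), local], dim=2).reshape(B * R, 1 + L, D)
        tokens, _ = self.local_attn(tokens, tokens, tokens)
        tokens = tokens.reshape(B, R, 1 + L, D)
        return tokens[:, :, 0], tokens[:, :, 1:]  # updated regional and local tokens
```

Because step 2 runs within each region, the quadratic attention cost is paid over only 1 + L tokens per region rather than over all R * L local tokens at once, which is the efficiency argument behind the design.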
1 code implementation • ICCV 2021 • Rameswar Panda, Chun-Fu Chen, Quanfu Fan, Ximeng Sun, Kate Saenko, Aude Oliva, Rogerio Feris
Specifically, given a video segment, a multi-modal policy network is used to decide what modalities should be used for processing by the recognition model, with the goal of improving both accuracy and efficiency.
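A common way to keep such discrete keep/skip decisions trainable end-to-end is a straight-through Gumbel-softmax relaxation. The sketch below illustrates only that idea; the feature extractor, head layout, and sampling schedule of the authors' policy network are assumptions, not their released code.

```python
import torch.nn as nn
import torch.nn.functional as F

class ModalityPolicy(nn.Module):
    """Per-segment policy: for each modality, decide keep/skip from a
    lightweight segment feature, using Gumbel-softmax so the hard choice
    still passes gradients to the policy."""
    def __init__(self, feat_dim, num_modalities):
        super().__init__()
        self.num_modalities = num_modalities
        # One binary (keep/skip) head per modality.
        self.head = nn.Linear(feat_dim, num_modalities * 2)

    def forward(self, segment_feat, tau=1.0):
        # segment_feat: (B, feat_dim) cheap summary of the video segment
        logits = self.head(segment_feat).view(-1, self.num_modalities, 2)
        # Hard one-hot decisions with straight-through gradients.
        decisions = F.gumbel_softmax(logits, tau=tau, hard=True)
        return decisions[..., 0]  # (B, num_modalities); 1 = process this modality
```

The efficiency gain comes from never running the backbones of skipped modalities on a given segment, while the relaxation keeps the selection differentiable during training.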
1 code implementation • ICCV 2021 • Assaf Arbelle, Sivan Doveh, Amit Alfassy, Joseph Shtok, Guy Lev, Eli Schwartz, Hilde Kuehne, Hila Barak Levi, Prasanna Sattigeri, Rameswar Panda, Chun-Fu Chen, Alex Bronstein, Kate Saenko, Shimon Ullman, Raja Giryes, Rogerio Feris, Leonid Karlinsky
In this work, we focus on the task of Detector-Free WSG (DF-WSG) to solve WSG without relying on a pre-trained detector.
Ranked #1 on Phrase Grounding on Visual Genome
13 code implementations • ICCV 2021 • Chun-Fu Chen, Quanfu Fan, Rameswar Panda
To this end, we propose a dual-branch transformer that combines image patches (i.e., tokens in a transformer) of different sizes to produce stronger image features (see the cross-branch fusion sketch below).
Ranked #419 on Image Classification on ImageNet
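The fusion between the two branches can be sketched as cross-attention in which each branch's CLS token queries the patch tokens of the other branch. The version below assumes both branches share one embedding dimension; the actual model projects between the branches' different dimensions, so treat this as an illustration of the mechanism rather than the reference implementation.

```python
import torch
import torch.nn as nn

class CrossBranchFusion(nn.Module):
    """Fuse two branches with different patch sizes: each branch's CLS token
    attends (as query) to the patch tokens of the other branch."""
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.attn_s2l = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.attn_l2s = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, small, large):
        # small, large: (B, 1 + N, D); token 0 is each branch's CLS token
        cls_s, cls_l = small[:, :1], large[:, :1]
        fused_s, _ = self.attn_s2l(cls_s, large, large)  # small CLS reads large-patch tokens
        fused_l, _ = self.attn_l2s(cls_l, small, small)  # large CLS reads small-patch tokens
        small = torch.cat([fused_s, small[:, 1:]], dim=1)
        large = torch.cat([fused_l, large[:, 1:]], dim=1)
        return small, large
```

Using only the CLS token as the query keeps the cross-attention linear in the number of patch tokens, since each branch exchanges a single summary token rather than all patches.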
2 code implementations • ICCV 2021 • Ashraful Islam, Chun-Fu Chen, Rameswar Panda, Leonid Karlinsky, Richard Radke, Rogerio Feris
Tremendous progress has been made in visual representation learning, notably with the recent success of self-supervised contrastive learning methods.
no code implementations • 2 Mar 2021 • Ximeng Sun, Rameswar Panda, Chun-Fu Chen, Naigang Wang, Bowen Pan, Kailash Gopalakrishnan, Aude Oliva, Rogerio Feris, Kate Saenko
To effectively transfer knowledge, we develop a dynamic block swapping method that randomly replaces blocks in the lower-precision student network with the corresponding blocks in the higher-precision teacher network.
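A minimal sketch of that swapping step, assuming the student and teacher share a common block structure; `swap_prob` and the uniform per-block sampling are illustrative placeholders for however the paper actually schedules the swaps.

```python
import random

def swapped_forward(x, student_blocks, teacher_blocks, swap_prob=0.5):
    """Forward pass where each lower-precision student block is randomly
    replaced by the corresponding (frozen) higher-precision teacher block."""
    for s_blk, t_blk in zip(student_blocks, teacher_blocks):
        if random.random() < swap_prob:
            x = t_blk(x)  # higher-precision teacher block; kept frozen
        else:
            x = s_blk(x)  # lower-precision student block being trained
    return x
```

Routing activations through teacher blocks exposes the remaining student blocks to higher-precision intermediate features during training, which is the knowledge-transfer intuition behind the swap.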
no code implementations • 20 Nov 2020 • Ulrich Finkler, Michele Merler, Rameswar Panda, Mayoore S. Jaiswal, Hui Wu, Kandan Ramakrishnan, Chun-Fu Chen, Minsik Cho, David Kung, Rogerio Feris, Bishwaranjan Bhattacharjee
Neural Architecture Search (NAS) is a powerful tool to automatically design deep neural networks for many tasks, including image classification.
1 code implementation • CVPR 2021 • Chun-Fu Chen, Rameswar Panda, Kandan Ramakrishnan, Rogerio Feris, John Cohn, Aude Oliva, Quanfu Fan
In recent years, a number of approaches based on 2D or 3D convolutional neural networks (CNN) have emerged for video action recognition, achieving state-of-the-art results on several large-scale benchmark datasets.
no code implementations • 23 Jun 2020 • Rameswar Panda, Michele Merler, Mayoore Jaiswal, Hui Wu, Kandan Ramakrishnan, Ulrich Finkler, Chun-Fu Chen, Minsik Cho, David Kung, Rogerio Feris, Bishwaranjan Bhattacharjee
The typical way of conducting large scale NAS is to search for an architectural building block on a small dataset (either using a proxy set from the large dataset or a completely different small scale dataset) and then transfer the block to a larger dataset.
1 code implementation • NeurIPS 2019 • Quanfu Fan, Chun-Fu Chen, Hilde Kuehne, Marco Pistoia, David Cox
Current state-of-the-art models for video action recognition are mostly based on expensive 3D ConvNets.
Ranked #78 on Action Recognition on Something-Something V2 (using extra training data)
no code implementations • 7 Aug 2018 • Chun-Fu Chen, Quanfu Fan, Marco Pistoia, Gwo Giun Lee
We propose a new method to create compact convolutional neural networks (CNNs) by exploiting sparse convolutions.
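As a rough illustration of what exploiting sparse convolutions can mean in the simplest case, the snippet below zeroes out the smallest-magnitude weights of a conv layer (generic magnitude pruning); the paper's actual construction of compact CNNs is more involved, so this is only a stand-in for the general idea.

```python
import torch
import torch.nn as nn

def sparsify_conv(conv: nn.Conv2d, sparsity: float = 0.8) -> nn.Conv2d:
    """Zero out the smallest-magnitude weights of a conv layer, leaving a
    sparse kernel that a sparse-convolution backend could exploit."""
    with torch.no_grad():
        w = conv.weight.abs().flatten()
        k = int(sparsity * w.numel())
        if k > 0:
            threshold = w.kthvalue(k).values       # k-th smallest magnitude
            mask = conv.weight.abs() > threshold   # keep only larger weights
            conv.weight.mul_(mask)
    return conv
```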
3 code implementations • ICLR 2019 • Chun-Fu Chen, Quanfu Fan, Neil Mallinar, Tom Sercu, Rogerio Feris
The proposed approach demonstrates improvement of model efficiency and performance on both object recognition and speech recognition tasks, using popular architectures including ResNet and ResNeXt.
no code implementations • CVPR 2018 • Ruichi Yu, Ang Li, Chun-Fu Chen, Jui-Hsin Lai, Vlad I. Morariu, Xintong Han, Mingfei Gao, Ching-Yung Lin, Larry S. Davis
In contrast, we argue that it is essential to prune neurons across the entire neural network jointly, based on a unified goal: minimizing the reconstruction error of important responses in the "final response layer" (FRL), the second-to-last layer before classification, so that the pruned network retains its predictive power.
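This unified goal suggests a simple backward pass over importance scores: starting from the importance of the FRL responses, each earlier neuron is scored by how strongly it feeds important neurons. The sketch below, assuming fully connected weight matrices and a precomputed FRL importance vector, is a schematic reading of that propagation, not the authors' exact procedure.

```python
import torch

def propagate_importance(weights, frl_importance):
    """Back-propagate neuron importance from the final response layer (FRL).
    `weights` is a list of (out_dim, in_dim) tensors ordered from the first
    layer up to the FRL; `frl_importance` is an (out_dim,) score vector."""
    scores = [None] * (len(weights) + 1)
    scores[-1] = frl_importance  # importance of FRL responses
    for i in reversed(range(len(weights))):
        # A neuron is important if it feeds important neurons with large weights.
        scores[i] = weights[i].abs().t() @ scores[i + 1]
    return scores  # per-layer importance; prune the lowest-scored neurons
```

Neurons with the lowest propagated scores can then be pruned jointly across all layers at once, rather than layer by layer against purely local criteria.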