1 code implementation • 11 Mar 2025 • Kai Qiu, Xiang Li, Jason Kuen, Hao Chen, Xiaohao Xu, Jiuxiang Gu, Yinyi Luo, Bhiksha Raj, Zhe Lin, Marios Savvides
With the latent perturbation, we further propose (1) a novel tokenizer evaluation metric, i.e., pFID, which successfully correlates tokenizer performance with generation quality, and (2) a plug-and-play tokenizer training scheme, which significantly enhances the robustness of the tokenizer, thus boosting generation quality and convergence speed.
Ranked #17 on
Image Generation
on ImageNet 256x256
no code implementations • 28 Oct 2024 • Zhantao Yang, Han Zhang, Fangyi Chen, Anudeepsekhar Bolimera, Marios Savvides
For e-commerce, an efficient and low-cost automated knowledge graph construction method is the foundation of enabling various successful downstream applications.
1 code implementation • 16 Aug 2024 • Kai Qiu, Xiang Li, Hao Chen, Jie Sun, Jinglu Wang, Zhe Lin, Marios Savvides, Bhiksha Raj
Audio generation has achieved remarkable progress with the advance of sophisticated generative models, such as diffusion models (DMs) and autoregressive (AR) models.
no code implementations • 25 Jul 2024 • Yu-Kai Huang, Yutong Zheng, Yen-Shuo Su, Anudeepsekhar Bolimera, Han Zhang, Fangyi Chen, Marios Savvides
Facial attribute editing plays a crucial role in synthesizing realistic faces with specific characteristics while maintaining realistic appearances.
1 code implementation • 30 May 2024 • Fangyi Chen, Han Zhang, Zhantao Yang, Hao Chen, Kai Hu, Marios Savvides
Open-vocabulary object detection (OVD) requires solid modeling of the region-semantic relationship, which could be learned from massive region-text pairs.
Ranked #12 on
Open Vocabulary Object Detection
on LVIS v1.0
(using extra training data)
no code implementations • 14 Apr 2023 • Xuan-Bac Nguyen, Chi Nhan Duong, Marios Savvides, Kaushik Roy, Hugh Churchill, Khoa Luu
Promoting fairness for deep clustering models in unsupervised clustering settings to reduce demographic bias is a challenging goal.
4 code implementations • 26 Jan 2023 • Hao Chen, Ran Tao, Yue Fan, Yidong Wang, Jindong Wang, Bernt Schiele, Xing Xie, Bhiksha Raj, Marios Savvides
The critical challenge of Semi-Supervised Learning (SSL) is how to effectively leverage the limited labeled data and massive unlabeled data to improve the model's generalization performance.
no code implementations • CVPR 2023 • Ran Tao, Hao Chen, Marios Savvides
This observation further motivates us to propose the Transductive Fine-tuning with Margin-based uncertainty weighting and Probability regularization (TF-MP), which learns a more balanced class marginal distribution.
2 code implementations • CVPR 2023 • Fangyi Chen, Han Zhang, Kai Hu, Yu-Kai Huang, Chenchen Zhu, Marios Savvides
This paper investigates a phenomenon where query-based object detectors mispredict at the last decoding stage while predicting correctly at an intermediate stage.
Ranked #14 on
Object Detection
on COCO 2017 val
no code implementations • 20 Nov 2022 • Hao Chen, Yue Fan, Yidong Wang, Jindong Wang, Bernt Schiele, Xing Xie, Marios Savvides, Bhiksha Raj
While standard SSL assumes uniform data distribution, we consider a more realistic and challenging setting called imbalanced SSL, where imbalanced class distributions occur in both labeled and unlabeled data.
no code implementations • 11 Sep 2022 • Thanh-Dat Truong, Chi Nhan Duong, Ngan Le, Marios Savvides, Khoa Luu
We therefore introduce a new method named Attention-based Bijective Generative Adversarial Networks in a Distillation framework (DAB-GAN) to synthesize faces of a subject given his/her extracted face recognition features.
1 code implementation • 15 Aug 2022 • Hao Chen, Ran Tao, Han Zhang, Yidong Wang, Xiang Li, Wei Ye, Jindong Wang, Guosheng Hu, Marios Savvides
Beyond classification, Conv-Adapter can generalize to detection and segmentation tasks with more than 50% reduction of parameters but comparable performance to the traditional full fine-tuning.
5 code implementations • 12 Aug 2022 • Yidong Wang, Hao Chen, Yue Fan, Wang Sun, Ran Tao, Wenxin Hou, RenJie Wang, Linyi Yang, Zhi Zhou, Lan-Zhe Guo, Heli Qi, Zhen Wu, Yu-Feng Li, Satoshi Nakamura, Wei Ye, Marios Savvides, Bhiksha Raj, Takahiro Shinozaki, Bernt Schiele, Jindong Wang, Xing Xie, Yue Zhang
We further provide the pre-trained versions of the state-of-the-art neural models for CV tasks to make the cost affordable for further tuning.
6 code implementations • 15 May 2022 • Yidong Wang, Hao Chen, Qiang Heng, Wenxin Hou, Yue Fan, Zhen Wu, Jindong Wang, Marios Savvides, Takahiro Shinozaki, Bhiksha Raj, Bernt Schiele, Xing Xie
Semi-supervised Learning (SSL) has witnessed great success owing to the impressive performances brought by various methods based on pseudo labeling and consistency regularization.
no code implementations • 7 Apr 2022 • Ran Tao, Han Zhang, Yutong Zheng, Marios Savvides
Class-agnostic bias is defined as the distribution shift introduced by domain difference, which we propose a Distribution Calibration Module (DCM) to reduce.
no code implementations • 1 Apr 2022 • Fangyi Chen, Han Zhang, Zaiwang Li, Jiachen Dou, Shentong Mo, Hao Chen, Yongxin Zhang, Uzair Ahmed, Chenchen Zhu, Marios Savvides
To make full use of computer vision technology in stores, it is required to consider the actual needs that fit the characteristics of the retail scene.
Ranked #1 on
Dense Object Detection
on SKU-110K
no code implementations • 25 Aug 2021 • Ngan Le, Vidhiwar Singh Rathour, Kashu Yamazaki, Khoa Luu, Marios Savvides
In this work, we provide a detailed review of recent and state-of-the-art research advances of deep reinforcement learning in computer vision.
no code implementations • ICLR 2021 • Zhiqiang Shen, Zechun Liu, Dejia Xu, Zitian Chen, Kwang-Ting Cheng, Marios Savvides
This work aims to empirically clarify a recently discovered perspective that label smoothing is incompatible with knowledge distillation.
no code implementations • CVPR 2021 • Yutong Zheng, Yu-Kai Huang, Ran Tao, Zhiqiang Shen, Marios Savvides
We propose a method to disentangle linear-encoded facial semantics from StyleGAN without external supervision.
no code implementations • CVPR 2021 • Chenchen Zhu, Fangyi Chen, Uzair Ahmed, Zhiqiang Shen, Marios Savvides
In this work, we investigate utilizing this semantic relation together with the visual information and introduce explicit relation reasoning into the learning of novel object detection.
Ranked #16 on
Few-Shot Object Detection
on MS-COCO (30-shot)
1 code implementation • CVPR 2021 • Zhiqiang Shen, Zechun Liu, Jie Qin, Lei Huang, Kwang-Ting Cheng, Marios Savvides
In this paper, we focus on this more difficult scenario: learning networks where both weights and activations are binary, without any human-annotated labels.
no code implementations • 8 Feb 2021 • Zhiqiang Shen, Zechun Liu, Jie Qin, Marios Savvides, Kwang-Ting Cheng
A common practice for this task is to train a model on the base set first and then transfer to novel classes through fine-tuning (here, the fine-tuning procedure is defined as transferring knowledge from base to novel data, i.e., learning to transfer in the few-shot scenario).
no code implementations • ICCV 2021 • Kai Hu, Jie Shao, YuAn Liu, Bhiksha Raj, Marios Savvides, Zhiqiang Shen
To address this, we present a contrast-and-order representation (CORP) framework for learning self-supervised video representations that can automatically capture both the appearance information within each frame and temporal information across different frames.
Ranked #3 on
Self-Supervised Action Recognition Linear
on UCF101
no code implementations • 1 Jan 2021 • Ran Tao, Marios Savvides
By addressing the difference between feature distributions of base and novel classes, we propose the adaptive feature distribution method which is to finetune one scale vector using the support set of novel classes.
no code implementations • 3 Dec 2020 • Ngan Le, Kashu Yamazaki, Dat Truong, Kha Gia Quach, Marios Savvides
The first objective is performed by our proposed contextual brain tumor detection network, which plays the role of an attention gate and focuses only on the region around the brain tumor while ignoring the distant background, which is less correlated with the tumor.
1 code implementation • ECCV 2020 • Devesh Walawalkar, Zhiqiang Shen, Marios Savvides
This paper presents a novel knowledge distillation based model compression framework consisting of a student ensemble.
1 code implementation • 17 Sep 2020 • Zhiqiang Shen, Marios Savvides
Our result can be regarded as a strong baseline using knowledge distillation, and to the best of our knowledge, this is also the first method that is able to boost vanilla ResNet-50 to surpass 80% on ImageNet without architecture modification or additional training data.
Ranked #11 on
Image Classification
on OmniBenchmark
no code implementations • CVPR 2020 • Hai Phan, Zechun Liu, Dang Huynh, Marios Savvides, Kwang-Ting Cheng, Zhiqiang Shen
Inspired by one-shot architecture search frameworks, we manipulate the idea of group convolution to design efficient 1-Bit Convolutional Neural Networks (CNNs), assuming an approximately optimal trade-off between computational cost and model accuracy.
1 code implementation • 29 Mar 2020 • Devesh Walawalkar, Zhiqiang Shen, Zechun Liu, Marios Savvides
In this paper, we propose Attentive CutMix, a naturally enhanced augmentation strategy based on CutMix.
3 code implementations • 11 Mar 2020 • Zhiqiang Shen, Zechun Liu, Zhuang Liu, Marios Savvides, Trevor Darrell, Eric Xing
This drawback hinders the model from learning subtle variance and fine-grained information.
4 code implementations • ECCV 2020 • Zechun Liu, Zhiqiang Shen, Marios Savvides, Kwang-Ting Cheng
In this paper, we propose several ideas for enhancing a binary network to close its accuracy gap from real-valued networks without incurring any additional computational cost.
2 code implementations • 12 Feb 2020 • Han Zhang, Fangyi Chen, Zhiqiang Shen, Qiqi Hao, Chenchen Zhu, Marios Savvides
In this paper, we introduce a superior solution called Background Recalibration Loss (BRL) that can automatically re-calibrate the loss signals according to the pre-defined IoU threshold and input image.
1 code implementation • arXiv 2019 • Chenchen Zhu, Fangyi Chen, Zhiqiang Shen, Marios Savvides
In this work, we aim at finding a new balance of speed and accuracy for anchor-free detectors.
2 code implementations • ECCV 2020 • Chenchen Zhu, Fangyi Chen, Zhiqiang Shen, Marios Savvides
In this work, we boost the performance of the anchor-point detector over the key-point counterparts while maintaining the speed advantage.
Ranked #3 on
Dense Object Detection
on SKU-110K
no code implementations • 24 Nov 2019 • Dipan K. Pal, Sreena Nallamothu, Marios Savvides
Overall, this paper aims to shed light on the phenomenon of visual transformation based self-supervision.
no code implementations • 13 Nov 2019 • Dipan K. Pal, Akshay Chawla, Marios Savvides
One of the fundamental problems in supervised classification, and in machine learning in general, is the modelling of non-parametric invariances that exist in data.
1 code implementation • 6 Nov 2019 • Zhiqiang Shen, Harsh Maheshwari, Weichen Yao, Marios Savvides
Unsupervised domain adaptive object detection aims to learn a robust detector in the domain shift circumstance, where the training (source) domain is label-rich with bounding box annotations, while the testing (target) domain is label-agnostic and the feature distributions between training and testing domains are dissimilar or even totally different.
no code implementations • 22 Aug 2019 • Zhiqiang Shen, Zhankui He, Wanyun Cui, Jiahui Yu, Yutong Zheng, Chenchen Zhu, Marios Savvides
In order to distill diverse knowledge from different trained (teacher) models, we propose to use an adversarial-based learning strategy where we define a block-wise training loss to guide and optimize the predefined student network to recover the knowledge in teacher models, and simultaneously to promote the discriminator network to distinguish teacher vs. student features.
no code implementations • 29 Jul 2019 • Hai Phan, Dang Huynh, Yihui He, Marios Savvides, Zhiqiang Shen
MobileNet and Binary Neural Networks are two of the most widely used techniques to construct deep learning models for performing a variety of tasks on mobile and embedded platforms. In this paper, we present a simple yet efficient scheme to exploit MobileNet binarization of both activation functions and model weights.
no code implementations • 17 Mar 2019 • Raied Aljadaany, Dipan K. Pal, Marios Savvides
This is in contrast to the common practice in literature of having the prior to be fixed and fully instantiated even during training stages.
4 code implementations • CVPR 2019 • Chenchen Zhu, Yihui He, Marios Savvides
The general concept of the FSAF module is online feature selection applied to the training of multi-level anchor-free branches.
Ranked #140 on
Object Detection
on COCO test-dev
1 code implementation • 19 Dec 2018 • Rahul Dey, Felix Juefei-Xu, Vishnu Naresh Boddeti, Marios Savvides
We present a new stage-wise learning paradigm for training generative adversarial networks (GANs).
no code implementations • 10 Oct 2018 • T. Hoang Ngan Le, Raajitha Gummadi, Marios Savvides
In each step, the Convolutional Layer is fed with the LevelSet map to obtain a brain tumor feature map.
4 code implementations • CVPR 2019 • Yihui He, Chenchen Zhu, Jianren Wang, Marios Savvides, Xiangyu Zhang
Large-scale object detection datasets (e.g., MS-COCO) try to define the ground truth bounding boxes as clearly as possible.
Ranked #21 on
Object Detection
on PASCAL VOC 2007
3 code implementations • CVPR 2018 • Felix Juefei-Xu, Vishnu Naresh Boddeti, Marios Savvides
Convolutional neural networks are witnessing wide adoption in computer vision systems with numerous applications across a range of visual recognition tasks.
no code implementations • CVPR 2018 • Yutong Zheng, Dipan K. Pal, Marios Savvides
We motivate and present Ring loss, a simple and elegant feature normalization approach for deep networks designed to augment standard loss functions such as Softmax.
no code implementations • CVPR 2018 • Chenchen Zhu, Ran Tao, Khoa Luu, Marios Savvides
This paper introduces a novel anchor design to support anchor-based face detection for superior scale-invariant performance, especially on tiny faces.
no code implementations • 14 Jan 2018 • Dipan K. Pal, Marios Savvides
In this paper, we introduce a new class of deep convolutional architectures called Non-Parametric Transformation Networks (NPTNs) which can learn general invariances and symmetries directly from data.
2 code implementations • 4 Dec 2017 • Zhiqiang Shen, Honghui Shi, Jiahui Yu, Hai Phan, Rogerio Feris, Liangliang Cao, Ding Liu, Xinchao Wang, Thomas Huang, Marios Savvides
In this paper, we present a simple and parameter-efficient drop-in module for one-stage object detectors like SSD when learning from scratch (i.e., without pre-trained models).
no code implementations • NeurIPS 2017 • Dipan Pal, Ashwin Kannan, Gautam Arakalgud, Marios Savvides
The study of representations invariant to common transformations of the data is important to learning.
no code implementations • 28 Nov 2017 • Chi Nhan Duong, Kha Gia Quach, Khoa Luu, T. Hoang Ngan Le, Marios Savvides, Tien D. Bui
The proposed model is evaluated on both face aging synthesis and cross-age face verification.
no code implementations • 26 Oct 2017 • Pokkalla Harsha Vardhan, Kunal Sekhri, Dipan K. Pal, Marios Savvides
The problem of object localization has become one of the mainstream problems of vision.
no code implementations • 24 Oct 2017 • Dipan K. Pal, Ashwin A. Kannan, Gautam Arakalgud, Marios Savvides
The study of representations invariant to common transformations of the data is important to learning.
no code implementations • ICCV 2017 • Chandrasekhar Bhagavatula, Chenchen Zhu, Khoa Luu, Marios Savvides
We present our novel approach to simultaneously extract the 3D shape of the face and the semantically consistent 2D alignment through a 3D Spatial Transformer Network (3DSTN) to model both the camera projection matrix and the warping parameters of a 3D model.
Ranked #11 on
Face Alignment
on AFLW2000-3D
1 code implementation • 17 Apr 2017 • Felix Juefei-Xu, Vishnu Naresh Boddeti, Marios Savvides
A recent advance called the WGAN, based on the Wasserstein distance, can improve on the KL- and JS-divergence-based GANs, and alleviate the gradient vanishing, instability, and mode collapse issues that are common in GAN training.
no code implementations • 12 Apr 2017 • T. Hoang Ngan Le, Chi Nhan Duong, Ligong Han, Khoa Luu, Marios Savvides, Dipan Pal
Designed as extremely deep architectures, deep residual networks, which provide a rich visual representation and offer robust convergence behaviors, have recently achieved exceptional performance in numerous computer vision problems.
1 code implementation • 12 Apr 2017 • Ngan Le, Kha Gia Quach, Khoa Luu, Marios Savvides, Chenchen Zhu
To address these issues and boost the classic variational LS methods to a new level of the learnable deep learning approaches, we propose a novel definition of contour evolution named Recurrent Level Set (RLS) to employ a Gated Recurrent Unit under the energy minimization of a variational LS functional.
no code implementations • ICCV 2017 • Chi Nhan Duong, Kha Gia Quach, Khoa Luu, T. Hoang Ngan Le, Marios Savvides
Modeling the long-term facial aging process is extremely challenging due to the presence of large and non-linear variations during the face development stages.
no code implementations • 24 Feb 2017 • Dipan K. Pal, Marios Savvides
In this paper, we theoretically address three fundamental problems involving deep convolutional networks regarding invariance, depth and hierarchy.
no code implementations • 30 Jan 2017 • Dipan K. Pal, Vishnu Boddeti, Marios Savvides
We illustrate the general notion of selective invariance through object categorization experiments on large-scale datasets such as SVHN and ILSVRC 2012.
no code implementations • 16 Dec 2016 • Yutong Zheng, Chenchen Zhu, Khoa Luu, Chandrasekhar Bhagavatula, T. Hoang Ngan Le, Marios Savvides
Robust face detection is one of the most important pre-processing steps to support facial expression analysis, facial landmarking, face recognition, pose estimation, building of 3D facial models, etc.
7 code implementations • CVPR 2017 • Felix Juefei-Xu, Vishnu Naresh Boddeti, Marios Savvides
We propose local binary convolution (LBC), an efficient alternative to convolutional layers in standard convolutional neural networks (CNN).
no code implementations • 17 Jun 2016 • Chenchen Zhu, Yutong Zheng, Khoa Luu, Marios Savvides
Robust face detection in the wild is one of the ultimate components to support various facial related problems, i.e., unconstrained face recognition, facial periocular recognition, facial landmarking and pose estimation, facial expression recognition, 3D facial model construction, etc.
Ranked #29 on
Face Detection
on WIDER Face (Medium)
no code implementations • CVPR 2016 • Dipan K. Pal, Felix Juefei-Xu, Marios Savvides
We propose an explicitly discriminative and `simple' approach to generate invariance to nuisance transformations modeled as unitary.
no code implementations • 18 Nov 2015 • Dipan K. Pal, Marios Savvides
The study of representations invariant to common transformations of the data is important to learning.