Learning Trailer Moments in Full-Length Movies with Co-Contrastive Attention

no code implementations ECCV 2020 Lezi Wang, Dong Liu, Rohit Puri, Dimitris N. Metaxas

We introduce a novel ranking network that utilizes the Co-Attention between movies and trailers as guidance to generate the training pairs, where the moments highly correlated with trailers are expected to be scored higher than the uncorrelated moments.

A Data-scalable Transformer for Medical Image Segmentation: Architecture, Model Efficiency, and Benchmark

2 code implementations 28 Feb 2022 Yunhe Gao, Mu Zhou, Di Liu, Zhennan Yan, Shaoting Zhang, Dimitris N. Metaxas

However, existing vision Transformers struggle to learn with limited medical data and are unable to generalize on diverse medical image tasks.

Learning Transferable Reward for Query Object Localization with Policy Adaptation

1 code implementation ICLR 2022 Tingfeng Li, Shaobo Han, Martin Renqiang Min, Dimitris N. Metaxas

We propose a reinforcement learning based approach to query object localization, for which an agent is trained to localize objects of interest specified by a small exemplary set.

Learned Half-Quadratic Splitting Network for Magnetic Resonance Image Reconstruction

1 code implementation 17 Dec 2021 Bingyu Xin, Timothy S. Phan, Leon Axel, Dimitris N. Metaxas

Magnetic Resonance (MR) image reconstruction from highly undersampled $k$-space data is critical in accelerated MR imaging (MRI) techniques.

Hybrid Supervision Learning for Pathology Whole Slide Image Classification

1 code implementation 2 Jul 2021 Jiahui Li, Wen Chen, Xiaodi Huang, Zhiqiang Hu, Qi Duan, Hongsheng Li, Dimitris N. Metaxas, Shaoting Zhang

To handle this problem, we propose a hybrid supervision learning framework for this kind of high-resolution image, with sufficient image-level coarse annotations and a few pixel-level fine labels.

Improved Transformer for High-Resolution GANs

2 code implementations NeurIPS 2021 Long Zhao, Zizhao Zhang, Ting Chen, Dimitris N. Metaxas, Han Zhang

Attention-based models, exemplified by the Transformer, can effectively model long-range dependencies, but suffer from the quadratic complexity of the self-attention operation, making them difficult to adopt for high-resolution image generation based on Generative Adversarial Networks (GANs).

More Than Just Attention: Improving Cross-Modal Attentions with Contrastive Constraints for Image-Text Matching

no code implementations 20 May 2021 Yuxiao Chen, Jianbo Yuan, Long Zhao, Rui Luo, Larry Davis, Dimitris N. Metaxas

Cross-modal attention mechanisms have been widely applied to the image-text matching task and have achieved remarkable improvements thanks to their capability of learning fine-grained relevance across different modalities.

SCPM-Net: An Anchor-free 3D Lung Nodule Detection Network using Sphere Representation and Center Points Matching

1 code implementation 12 Apr 2021 Xiangde Luo, Tao Song, Guotai Wang, Jieneng Chen, Yinan Chen, Kang Li, Dimitris N. Metaxas, Shaoting Zhang

To overcome these problems, we propose a 3D sphere representation-based center-points matching detection network that is anchor-free and automatically predicts the position, radius, and offset of nodules without the manual design of nodule/anchor parameters.

Deep Animation Video Interpolation in the Wild

1 code implementation CVPR 2021 Li SiYao, Shiyu Zhao, Weijiang Yu, Wenxiu Sun, Dimitris N. Metaxas, Chen Change Loy, Ziwei Liu

In the animation industry, cartoon videos are usually produced at a low frame rate, since hand-drawing such frames is costly and time-consuming.

Variational Bayesian Sequence-to-Sequence Networks for Memory-Efficient Sign Language Translation

no code implementations 11 Feb 2021 Harris Partaourides, Andreas Voskou, Dimitrios Kosmopoulos, Sotirios Chatzis, Dimitris N. Metaxas

Memory-efficient continuous Sign Language Translation is a significant challenge for the development of assistive technologies with real-time applicability for the deaf.

Unity of Opposites: SelfNorm and CrossNorm for Model Robustness

no code implementations 1 Jan 2021 Zhiqiang Tang, Yunhe Gao, Yi Zhu, Zhi Zhang, Mu Li, Dimitris N. Metaxas

CrossNorm exchanges styles between feature channels to perform style augmentation, diversifying the content and style mixtures.

Semantic Aware Data Augmentation for Cell Nuclei Microscopical Images With Artificial Neural Networks

no code implementations ICCV 2021 Alireza Naghizadeh, Hongye Xu, Mohab Mohamed, Dimitris N. Metaxas, Dongfang Liu

The importance of this subject lies in the amount of training data that artificial neural networks need to accurately identify and segment objects in images, and in the infeasibility of acquiring a sufficient dataset within the biomedical field.

Multi-modal AsynDGAN: Learn From Distributed Medical Image Data without Sharing Private Information

no code implementations 15 Dec 2020 Qi Chang, Zhennan Yan, Lohendran Baskaran, Hui Qu, Yikai Zhang, Tong Zhang, Shaoting Zhang, Dimitris N. Metaxas

As deep learning technologies advance, increasingly more data is necessary to generate general and robust models for various tasks.

Learning Trailer Moments in Full-Length Movies

no code implementations 19 Aug 2020 Lezi Wang, Dong Liu, Rohit Puri, Dimitris N. Metaxas

A movie's key moments stand out of the screenplay to grab an audience's attention and make movie browsing efficient.

Weakly Supervised Deep Nuclei Segmentation Using Partial Points Annotation in Histopathology Images

no code implementations 10 Jul 2020 Hui Qu, Pengxiang Wu, Qiaoying Huang, Jingru Yi, Zhennan Yan, Kang Li, Gregory M. Riedlinger, Subhajyoti De, Shaoting Zhang, Dimitris N. Metaxas

To alleviate such tedious and manual effort, in this paper we propose a novel weakly supervised segmentation framework based on partial points annotation, i.e., only a small portion of nuclei locations in each image are labeled.

Knowledge as Priors: Cross-Modal Knowledge Generalization for Datasets without Superior Knowledge

no code implementations CVPR 2020 Long Zhao, Xi Peng, Yuxiao Chen, Mubbasir Kapadia, Dimitris N. Metaxas

Our key idea is to generalize the distilled cross-modal knowledge learned from a Source dataset, which contains paired examples from both modalities, to the Target dataset by modeling knowledge as priors on parameters of the Student.

Vertebra-Focused Landmark Detection for Scoliosis Assessment

1 code implementation 9 Jan 2020 Jingru Yi, Pengxiang Wu, Qiaoying Huang, Hui Qu, Dimitris N. Metaxas

The comparison results demonstrate the merits of our method in both Cobb angle measurement and landmark detection on low-contrast and ambiguous X-ray images.

Object-Guided Instance Segmentation for Biological Images

no code implementations 20 Nov 2019 Jingru Yi, Hui Tang, Pengxiang Wu, Bo Liu, Daniel J. Hoeppner, Dimitris N. Metaxas, Lianyi Han, Wei Fan

Along with the instance normalization, the model is able to recover the target object distribution and suppress the distribution of neighboring attached objects.

ASSD: Attentive Single Shot Multibox Detector

1 code implementation 27 Sep 2019 Jingru Yi, Pengxiang Wu, Dimitris N. Metaxas

This paper proposes a new deep neural network for object detection.

Collaborative Multi-agent Learning for MR Knee Articular Cartilage Segmentation

no code implementations 13 Aug 2019 Chaowei Tan, Zhennan Yan, Shaoting Zhang, Kang Li, Dimitris N. Metaxas

However, effective and efficient delineation of all the knee articular cartilages in large-sized and high-resolution 3D MR knee data is still an open challenge.

Greedy AutoAugment

2 code implementations 2 Aug 2019 Alireza Naghizadeh, Mohammadsajad Abavisani, Dimitris N. Metaxas

This is a challenging problem and requires exploration for data augmentation policies to ensure their effectiveness in covering the search space.

2nd Place Solution to the GQA Challenge 2019

no code implementations 16 Jul 2019 Shijie Geng, Ji Zhang, Hang Zhang, Ahmed Elgammal, Dimitris N. Metaxas

We present a simple method that achieves unexpectedly superior performance for Visual Question Answering that involves complex reasoning.

Sharpen Focus: Learning with Attention Separability and Consistency

1 code implementation ICCV 2019 Lezi Wang, Ziyan Wu, Srikrishna Karanam, Kuan-Chuan Peng, Rajat Vikram Singh, Bo Liu, Dimitris N. Metaxas

Recent developments in gradient-based attention modeling have seen attention maps emerge as a powerful tool for interpreting convolutional neural networks.

CR-GAN: Learning Complete Representations for Multi-view Generation

1 code implementation 28 Jun 2018 Yu Tian, Xi Peng, Long Zhao, Shaoting Zhang, Dimitris N. Metaxas

Generating multi-view images from a single-view input is an essential yet challenging problem.

Scenarios: A New Representation for Complex Scene Understanding

no code implementations 16 Feb 2018 Zachary A. Daniels, Dimitris N. Metaxas

The ability of computational agents to reason about the high-level content of real-world scene images is important for many applications.

RED-Net: A Recurrent Encoder-Decoder Network for Video-based Face Alignment

no code implementations 17 Jan 2018 Xi Peng, Rogerio S. Feris, Xiaoyu Wang, Dimitris N. Metaxas

We propose a novel method for real-time face alignment in videos based on a recurrent encoder-decoder network model.

Multispectral Deep Neural Networks for Pedestrian Detection

1 code implementation 8 Nov 2016 Jingjing Liu, Shaoting Zhang, Shu Wang, Dimitris N. Metaxas

Multispectral pedestrian detection is essential for around-the-clock applications, e.g., surveillance and autonomous driving.

Track Facial Points in Unconstrained Videos

no code implementations 9 Sep 2016 Xi Peng, Qiong Hu, Junzhou Huang, Dimitris N. Metaxas

Our approach takes advantage of part-based representation and cascade regression for robust and efficient alignment on each frame.

A Recurrent Encoder-Decoder Network for Sequential Face Alignment

no code implementations 19 Aug 2016 Xi Peng, Rogerio S. Feris, Xiaoyu Wang, Dimitris N. Metaxas

We propose a novel recurrent encoder-decoder network model for real-time video-based face alignment.

Visual Tracking via Reliable Memories

no code implementations 4 Feb 2016 Shu Wang, Shaoting Zhang, Wei Liu, Dimitris N. Metaxas

In this paper, we propose a novel visual tracking framework that intelligently discovers reliable patterns from a wide range of video to resist drift error for long-term tracking tasks.

PIEFA: Personalized Incremental and Ensemble Face Alignment

no code implementations ICCV 2015 Xi Peng, Shaoting Zhang, Yu Yang, Dimitris N. Metaxas

Face alignment, especially on real-time or large-scale sequential images, is a challenging task with broad applications.

