Search Results for author: Dimitris Metaxas

Found 78 papers, 41 papers with code

Recognition of Nonmanual Markers in American Sign Language (ASL) Using Non-Parametric Adaptive 2D-3D Face Tracking

no code implementations • LREC 2012 • Dimitris Metaxas, Bo Liu, Fei Yang, Peng Yang, Nicholas Michael, Carol Neidle

This paper addresses the problem of automatically recognizing linguistically significant nonmanual expressions in American Sign Language from video.

Sign Language Recognition

Adaptive low rank and sparse decomposition of video using compressive sensing

no code implementations • 6 Feb 2013 • Fei Yang, Hong Jiang, Zuowei Shen, Wei Deng, Dimitris Metaxas

We address the problem of reconstructing and analyzing surveillance videos using compressive sensing.

Compressive Sensing Video Reconstruction

Handling Noise in Single Image Deblurring Using Directional Filters

no code implementations • CVPR 2013 • Lin Zhong, Sunghyun Cho, Dimitris Metaxas, Sylvain Paris, Jue Wang

Based on this observation, our method applies a series of directional filters at different orientations to the input image, and estimates an accurate Radon transform of the blur kernel from each filtered image.

Deblurring Image Deblurring +2

A New Framework for Sign Language Recognition based on 3D Handshape Identification and Linguistic Modeling

no code implementations • LREC 2014 • Mark Dilsizian, Polina Yanovich, Shu Wang, Carol Neidle, Dimitris Metaxas

Current approaches to sign recognition by computer generally have at least some of the following limitations: they rely on laboratory conditions for sign production, are limited to a small vocabulary, rely on 2D modeling (and therefore cannot deal with occlusions and off-plane rotations), and/or achieve limited success.

3D Reconstruction Sign Language Recognition +1

3D Face Tracking and Multi-Scale, Spatio-temporal Analysis of Linguistically Significant Facial Expressions and Head Positions in ASL

no code implementations • LREC 2014 • Bo Liu, Jingjing Liu, Xiang Yu, Dimitris Metaxas, Carol Neidle

Essential grammatical information is conveyed in signed languages by clusters of events involving facial expressions and movements of the head and upper body.

Sign Language Recognition

Mode Estimation for High Dimensional Discrete Tree Graphical Models

no code implementations • NeurIPS 2014 • Chao Chen, Han Liu, Dimitris Metaxas, Tianqi Zhao

Though the mode finding problem is generally intractable in high dimensions, this paper unveils that, if the distribution can be approximated well by a tree graphical model, mode characterization is significantly easier.

Vocal Bursts Intensity Prediction

Detection of Major ASL Sign Types in Continuous Signing For ASL Recognition

no code implementations • LREC 2016 • Polina Yanovich, Carol Neidle, Dimitris Metaxas

In American Sign Language (ASL) as well as other signed languages, different classes of signs (e.g., lexical signs, fingerspelled signs, and classifier constructions) have different internal structural properties.

Multiple Instance Learning

SPDA-CNN: Unifying Semantic Part Detection and Abstraction for Fine-Grained Recognition

no code implementations • CVPR 2016 • Han Zhang, Tao Xu, Mohamed Elhoseiny, Xiaolei Huang, Shaoting Zhang, Ahmed Elgammal, Dimitris Metaxas

In this paper, we propose a new CNN architecture that integrates semantic part detection and abstraction (SPDA-CNN) for fine-grained classification.

General Classification Object Recognition +1

StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks

21 code implementations • ICCV 2017 • Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas

Synthesizing high-quality images from text descriptions is a challenging problem in computer vision and has many practical applications.

Ranked #3 on Text-to-Image Generation on Oxford 102 Flowers (Inception score metric)

Text-to-Image Generation

1,850 GitHub stars

Reconstruction-Based Disentanglement for Pose-invariant Face Recognition

no code implementations • ICCV 2017 • Xi Peng, Xiang Yu, Kihyuk Sohn, Dimitris Metaxas, Manmohan Chandraker

Finally, we propose a new feature reconstruction metric learning to explicitly disentangle identity and pose, by demanding alignment between the feature reconstructions through various combinations of identity and pose features, which is obtained from two images of the same subject.

Disentanglement Face Recognition +2

Automatic Vertebra Labeling in Large-Scale 3D CT using Deep Image-to-Image Network with Message Passing and Sparsity Regularization

no code implementations • 17 May 2017 • Dong Yang, Tao Xiong, Daguang Xu, Qiangui Huang, David Liu, S. Kevin Zhou, Zhoubing Xu, Jin-Hyeong Park, Mingqing Chen, Trac D. Tran, Sang Peter Chin, Dimitris Metaxas, Dorin Comaniciu

In this paper, we propose an automatic and fast algorithm to localize and label the vertebra centroids in 3D CT volumes.

Automatic Liver Segmentation Using an Adversarial Image-to-Image Network

no code implementations • 25 Jul 2017 • Dong Yang, Daguang Xu, S. Kevin Zhou, Bogdan Georgescu, Mingqing Chen, Sasa Grbic, Dimitris Metaxas, Dorin Comaniciu

Automatic liver segmentation in 3D medical images is essential in many clinical applications, such as pathological diagnosis of hepatic diseases, surgical planning, and postoperative assessment.

Liver Segmentation Segmentation

StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks

16 code implementations • 19 Oct 2017 • Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas

In this paper, we propose Stacked Generative Adversarial Networks (StackGAN) aiming at generating high-resolution photo-realistic images.

Ranked #5 on Text-to-Image Generation on Oxford 102 Flowers

Generative Adversarial Network Text-to-Image Generation

1,850 GitHub stars

Interactive Reinforcement Learning for Object Grounding via Self-Talking

no code implementations • 2 Dec 2017 • Yan Zhu, Shaoting Zhang, Dimitris Metaxas

In this paper, we introduce an interactive training method to improve the natural language conversation system for a visual grounding task.

Object reinforcement-learning +2

Toward Marker-free 3D Pose Estimation in Lifting: A Deep Multi-view Solution

no code implementations • 6 Feb 2018 • Rahil Mehrizi, Xi Peng, Zhiqiang Tang, Xu Xu, Dimitris Metaxas, Kang Li

The results are also compared with state-of-the-art methods on HumanEva-I dataset, which demonstrates the superior performance of our approach.

3D Pose Estimation

Improving GANs Using Optimal Transport

2 code implementations • ICLR 2018 • Tim Salimans, Han Zhang, Alec Radford, Dimitris Metaxas

We present Optimal Transport GAN (OT-GAN), a variant of generative adversarial nets minimizing a new metric measuring the distance between the generator distribution and the data distribution.

Image Generation

Linguistically-driven Framework for Computationally Efficient and Scalable Sign Recognition

no code implementations • LREC 2018 • Dimitris Metaxas, Mark Dilsizian, Carol Neidle

Time Series Analysis

Self-Attention Generative Adversarial Networks

49 code implementations • arXiv 2018 • Han Zhang, Ian Goodfellow, Dimitris Metaxas, Augustus Odena

In this paper, we propose the Self-Attention Generative Adversarial Network (SAGAN) which allows attention-driven, long-range dependency modeling for image generation tasks.

Ranked #20 on Conditional Image Generation on ImageNet 128x128

Conditional Image Generation Generative Adversarial Network

17,583 GitHub stars

Jointly Optimize Data Augmentation and Network Training: Adversarial Data Augmentation in Human Pose Estimation

no code implementations • CVPR 2018 • Xi Peng, Zhiqiang Tang, Fei Yang, Rogerio Feris, Dimitris Metaxas

Random data augmentation is a critical technique to avoid overfitting in training deep neural network models.

Ranked #3 on Pose Estimation on Leeds Sports Poses

Data Augmentation Pose Estimation

Show Me a Story: Towards Coherent Neural Story Illustration

1 code implementation • CVPR 2018 • Hareesh Ravi, Lezi Wang, Carlos Muniz, Leonid Sigal, Dimitris Metaxas, Mubbasir Kapadia

We propose an end-to-end network for the visual illustration of a sequence of sentences forming a story.

Sentence Story Visualization

Learning to Forecast and Refine Residual Motion for Image-to-Video Generation

1 code implementation • ECCV 2018 • Long Zhao, Xi Peng, Yu Tian, Mubbasir Kapadia, Dimitris Metaxas

We consider the problem of image-to-video translation, where an input image is translated into an output video containing motions of a single object.

Human Pose Forecasting Image to Video Generation +1

Quantized Densely Connected U-Nets for Efficient Landmark Localization

1 code implementation • ECCV 2018 • Zhiqiang Tang, Xi Peng, Shijie Geng, Lingfei Wu, Shaoting Zhang, Dimitris Metaxas

Finally, to reduce the memory consumption and high precision operations both in training and testing, we further quantize weights, inputs, and gradients of our localization network to low bit-width numbers.

Ranked #19 on Pose Estimation on MPII Human Pose

Face Alignment Pose Estimation

225 GitHub stars

MRI Reconstruction via Cascaded Channel-wise Attention Network

1 code implementation • 18 Oct 2018 • Qiaoying Huang, Dong Yang, Pengxiang Wu, Hui Qu, Jingru Yi, Dimitris Metaxas

We consider an MRI reconstruction problem with input of k-space data at a very low undersampled rate.

MRI Reconstruction

Brain Segmentation from k-space with End-to-end Recurrent Attention Network

no code implementations • 5 Dec 2018 • Qiaoying Huang, Xiao Chen, Dimitris Metaxas, Mariappan S. Nadar

The task of medical image segmentation commonly involves an image reconstruction step to convert acquired raw data to images before any analysis.

Brain Image Segmentation Brain Segmentation +4

Effective 3D Humerus and Scapula Extraction using Low-contrast and High-shape-variability MR Data

no code implementations • 22 Feb 2019 • Xiaoxiao He, Chaowei Tan, Yuting Qiao, Virak Tan, Dimitris Metaxas, Kang Li

For the initial shoulder preoperative diagnosis, it is essential to obtain a three-dimensional (3D) bone mask from medical images, e.g., magnetic resonance (MR).

Segmentation

Unsupervised Domain Adaptation via Calibrating Uncertainties

1 code implementation • 25 Jul 2019 • Ligong Han, Yang Zou, Ruijiang Gao, Lezi Wang, Dimitris Metaxas

Unsupervised domain adaptation (UDA) aims at inferring class labels for unlabeled target domain given a related labeled source dataset.

Unsupervised Domain Adaptation

Label Cleaning with Likelihood Ratio Test

no code implementations • 25 Sep 2019 • Songzhu Zheng, Pengxiang Wu, Aman Goswami, Mayank Goswami, Dimitris Metaxas, Chao Chen

To collect large-scale annotated data, it is inevitable to introduce label noise, i.e., incorrect class labels.

Robust Conditional GAN from Uncertainty-Aware Pairwise Comparisons

1 code implementation • 21 Nov 2019 • Ligong Han, Ruijiang Gao, Mun Kim, Xin Tao, Bo Liu, Dimitris Metaxas

Conditional generative adversarial networks have shown exceptional generation performance over the past few years.

Attribute Generative Adversarial Network

Point Cloud Processing via Recurrent Set Encoding

no code implementations • 25 Nov 2019 • Pengxiang Wu, Chao Chen, Jingru Yi, Dimitris Metaxas

The spatial layout of the beams is regular, and this allows the beam features to be further fed into an efficient 2D convolutional neural network (CNN) for hierarchical feature aggregation.

MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird's Eye View Maps

2 code implementations • CVPR 2020 • Pengxiang Wu, Siheng Chen, Dimitris Metaxas

The backbone of MotionNet is a novel spatio-temporal pyramid network, which extracts deep spatial and temporal features in a hierarchical fashion.

3D Object Detection Autonomous Driving +2

168 GitHub stars

Synthetic Learning: Learn From Distributed Asynchronized Discriminator GAN Without Sharing Medical Image Data

1 code implementation • CVPR 2020 • Qi Chang, Hui Qu, Yikai Zhang, Mert Sabuncu, Chao Chen, Tong Zhang, Dimitris Metaxas

In this paper, we propose a data privacy-preserving and communication efficient distributed GAN learning framework named Distributed Asynchronized Discriminator GAN (AsynDGAN).

Privacy Preserving

Unbiased Auxiliary Classifier GANs with MINE

1 code implementation • 13 Jun 2020 • Ligong Han, Anastasis Stathopoulos, Tao Xue, Dimitris Metaxas

To remedy this, Twin Auxiliary Classifier GAN (TAC-GAN) introduces a twin classifier to the min-max game.

OnlineAugment: Online Data Augmentation with Less Domain Knowledge

1 code implementation • ECCV 2020 • Zhiqiang Tang, Yunhe Gao, Leonid Karlinsky, Prasanna Sattigeri, Rogerio Feris, Dimitris Metaxas

First is that most if not all modern augmentation search methods are offline and learning policies are isolated from their usage.

Data Augmentation Image Classification

Learn distributed GAN with Temporary Discriminators

1 code implementation • ECCV 2020 • Hui Qu, Yikai Zhang, Qi Chang, Zhennan Yan, Chao Chen, Dimitris Metaxas

Our proposed method tackles the challenge of training GAN in the federated learning manner: How to update the generator with a flow of temporary discriminators?

Federated Learning

Oriented Object Detection in Aerial Images with Box Boundary-Aware Vectors

1 code implementation • 17 Aug 2020 • Jingru Yi, Pengxiang Wu, Bo Liu, Qiaoying Huang, Hui Qu, Dimitris Metaxas

To address this issue, in this work we extend the horizontal keypoint-based object detector to the oriented object detection task.

Ranked #11 on Oriented Object Detection on DOTA 1.0

Object object-detection +3

455 GitHub stars

PC-U Net: Learning to Jointly Reconstruct and Segment the Cardiac Walls in 3D from CT Data

no code implementations • 18 Aug 2020 • Meng Ye, Qiaoying Huang, Dong Yang, Pengxiang Wu, Jingru Yi, Leon Axel, Dimitris Metaxas

The 3D volumetric shape of the heart's left ventricle (LV) myocardium (MYO) wall provides important information for diagnosis of cardiac disease and invasive procedure navigation.

Image Segmentation Segmentation +1

Enhanced MRI Reconstruction Network using Neural Architecture Search

no code implementations • 19 Aug 2020 • Qiaoying Huang, Dong Yang, Yikun Xian, Pengxiang Wu, Jingru Yi, Hui Qu, Dimitris Metaxas

The accurate reconstruction of under-sampled magnetic resonance imaging (MRI) data using modern deep learning technology requires significant effort to design the necessary complex neural network architectures.

MRI Reconstruction Neural Architecture Search

Measure Anatomical Thickness from Cardiac MRI with Deep Neural Networks

no code implementations • 25 Aug 2020 • Qiaoying Huang, Eric Z. Chen, Hanchao Yu, Yimo Guo, Terrence Chen, Dimitris Metaxas, Shanhui Sun

We also analyze thickness patterns on different cardiac pathologies with a standard clinical model and the results demonstrate the potential clinical value of our method for thickness based cardiac disease diagnosis.

Deep Learning based NAS Score and Fibrosis Stage Prediction from CT and Pathology Data

1 code implementation • 22 Sep 2020 • Ananya Jana, Hui Qu, Puru Rattan, Carlos D. Minacapelli, Vinod Rustgi, Dimitris Metaxas

In this work, we propose a novel method to automatically predict NAS score and fibrosis stage from CT data that is non-invasive and inexpensive to obtain compared with liver biopsy.

Maximum-Entropy Adversarial Data Augmentation for Improved Generalization and Robustness

1 code implementation • NeurIPS 2020 • Long Zhao, Ting Liu, Xi Peng, Dimitris Metaxas

In this paper, we propose a novel and effective regularization term for adversarial data augmentation.

Data Augmentation

Error-Bounded Correction of Noisy Labels

3 code implementations • ICML 2020 • Songzhu Zheng, Pengxiang Wu, Aman Goswami, Mayank Goswami, Dimitris Metaxas, Chao Chen

To be robust against label noise, many successful methods rely on the noisy classifiers (i.e., models trained on the noisy training data) to determine whether a label is trustworthy.

Ranked #40 on Image Classification on Clothing1M

Image Classification

Deep Subspace Clustering with Data Augmentation

no code implementations • NeurIPS 2020 • Mahdi Abavisani, Alireza Naghizadeh, Dimitris Metaxas, Vishal Patel

In particular, we introduce a temporal ensembling component to the objective function of DSC algorithms to enable the DSC networks to maintain consistent subspaces for random transformations in the input data.

Clustering Data Augmentation

Learning View-Disentangled Human Pose Representation by Contrastive Cross-View Mutual Information Maximization

1 code implementation • CVPR 2021 • Long Zhao, Yuxiao Wang, Jiaping Zhao, Liangzhe Yuan, Jennifer J. Sun, Florian Schroff, Hartwig Adam, Xi Peng, Dimitris Metaxas, Ting Liu

To evaluate the power of the learned representations, in addition to the conventional fully-supervised action recognition settings, we introduce a novel task called single-shot cross-view action recognition.

Action Recognition Contrastive Learning +1

32,808 GitHub stars

A Topological Filter for Learning with Label Noise

1 code implementation • NeurIPS 2020 • Pengxiang Wu, Songzhu Zheng, Mayank Goswami, Dimitris Metaxas, Chao Chen

Noisy labels can impair the performance of deep neural networks.

CrossNorm and SelfNorm for Generalization under Distribution Shifts

1 code implementation • ICCV 2021 • Zhiqiang Tang, Yunhe Gao, Yi Zhu, Zhi Zhang, Mu Li, Dimitris Metaxas

Can we develop new normalization methods to improve generalization robustness under distribution shifts?

122 GitHub stars

Training Federated GANs with Theoretical Guarantees: A Universal Aggregation Approach

1 code implementation • 9 Feb 2021 • Yikai Zhang, Hui Qu, Qi Chang, Huidong Liu, Dimitris Metaxas, Chao Chen

A federated GAN jointly trains a centralized generator and multiple private discriminators hosted at different sites.

Federated Learning

DeepTag: An Unsupervised Deep Learning Method for Motion Tracking on Cardiac Tagging Magnetic Resonance Images

1 code implementation • CVPR 2021 • Meng Ye, Mikael Kanski, Dong Yang, Qi Chang, Zhennan Yan, Qiaoying Huang, Leon Axel, Dimitris Metaxas

Cardiac tagging magnetic resonance imaging (t-MRI) is the gold standard for regional myocardium deformation and cardiac strain estimation.

Image Registration Landmark Tracking

Liver Fibrosis and NAS scoring from CT images using self-supervised learning and texture encoding

1 code implementation • 5 Mar 2021 • Ananya Jana, Hui Qu, Carlos D. Minacapelli, Carolyn Catalano, Vinod Rustgi, Dimitris Metaxas

The severity and treatment of NAFLD are determined by NAFLD Activity Scores (NAS) and liver fibrosis stage, which are usually obtained from liver biopsy.

Self-Supervised Learning Transfer Learning

Enabling Data Diversity: Efficient Automatic Augmentation via Regularized Adversarial Training

1 code implementation • 30 Mar 2021 • Yunhe Gao, Zhiqiang Tang, Mu Zhou, Dimitris Metaxas

Data augmentation has proved extremely useful by increasing training data variance to alleviate overfitting and improve deep neural networks' generalization performance.

Data Augmentation Skin Cancer Classification

UTNet: A Hybrid Transformer Architecture for Medical Image Segmentation

1 code implementation • 2 Jul 2021 • Yunhe Gao, Mu Zhou, Dimitris Metaxas

In this study, we present UTNet, a simple yet powerful hybrid Transformer architecture that integrates self-attention into a convolutional neural network for enhancing medical image segmentation.

Image Segmentation Inductive Bias +2

168 GitHub stars

Dual Projection Generative Adversarial Networks for Conditional Image Generation

1 code implementation • ICCV 2021 • Ligong Han, Martin Renqiang Min, Anastasis Stathopoulos, Yu Tian, Ruijiang Gao, Asim Kadav, Dimitris Metaxas

We then propose an improved cGAN model with Auxiliary Classification that directly aligns the fake and real conditionals $P(\text{class}|\text{image})$ by minimizing their $f$-divergence.

Conditional Image Generation
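
As background for the term used in the abstract above: for discrete distributions P and Q, an f-divergence is D_f(P || Q) = sum_i q_i * f(p_i / q_i), where f is convex and f(1) = 0. The sketch below only illustrates that definition on two categorical class posteriors. The function names and the example probabilities are assumptions made for illustration, and this is not the paper's training objective.

```python
import math

def f_divergence(p, q, f):
    """D_f(P || Q) = sum_i q_i * f(p_i / q_i) for discrete distributions.

    Assumes every q_i > 0 so that the ratio p_i / q_i is defined.
    """
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

def kl_generator(t):
    # f(t) = t * log(t) recovers the KL divergence; f(0) is taken as 0.
    return t * math.log(t) if t > 0 else 0.0

# Hypothetical class posteriors P(class | image) for a single image.
real_posterior = [0.7, 0.2, 0.1]
fake_posterior = [0.5, 0.3, 0.2]

kl = f_divergence(real_posterior, fake_posterior, kl_generator)
```

Other choices of f (for example, one yielding the reverse KL or Jensen-Shannon divergence) can be passed in the same way.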

Global and Local Interpretation of black-box Machine Learning models to determine prognostic factors from early COVID-19 data

1 code implementation • 10 Sep 2021 • Ananya Jana, Carlos D. Minacapelli, Vinod Rustgi, Dimitris Metaxas

We explore one of the most recent techniques called symbolic metamodeling to find the mathematical expression of the machine learning models for COVID-19.

BIG-bench Machine Learning Explainable Models +1

Semi-Supervised Segmentation of Radiation-Induced Pulmonary Fibrosis from Lung CT Scans with Multi-Scale Guided Dense Attention

1 code implementation • 29 Sep 2021 • Guotai Wang, Shuwei Zhai, Giovanni Lasio, Baoshe Zhang, Byong Yi, Shifeng Chen, Thomas J. Macvittie, Dimitris Metaxas, Jinghao Zhou, Shaoting Zhang

Computed Tomography (CT) plays an important role in monitoring radiation-induced Pulmonary Fibrosis (PF), where accurate segmentation of the PF lesions is highly desired for diagnosis and treatment follow-up.

Computed Tomography (CT) Lesion Segmentation +1

AE-StyleGAN: Improved Training of Style-Based Auto-Encoders

1 code implementation • 17 Oct 2021 • Ligong Han, Sri Harsha Musunuri, Martin Renqiang Min, Ruijiang Gao, Yu Tian, Dimitris Metaxas

StyleGANs have shown impressive results on data generation and manipulation in recent years, thanks to their disentangled style latent space.

ASL Video Corpora & Sign Bank: Resources Available through the American Sign Language Linguistic Research Project (ASLLRP)

no code implementations • 19 Jan 2022 • Carol Neidle, Augustine Opoku, Dimitris Metaxas

These data have been used for many types of research in linguistics and in computer-based sign language recognition from video; examples of such research are provided in the latter part of this article.

Sign Language Recognition

Contrastive and Selective Hidden Embeddings for Medical Image Segmentation

1 code implementation • 21 Jan 2022 • Zhuowei Li, Zihao Liu, Zhiqiang Hu, Qing Xia, Ruiqin Xiong, Shaoting Zhang, Dimitris Metaxas, Tingting Jiang

Medical image segmentation has been widely recognized as a pivot procedure for clinical diagnosis, analysis, and treatment planning.

Contrastive Learning feature selection +4

Modality Bank: Learn multi-modality images across data centers without sharing medical data

no code implementations • 22 Jan 2022 • Qi Chang, Hui Qu, Zhennan Yan, Yunhe Gao, Lohendran Baskaran, Dimitris Metaxas

Multi-modality images have been widely used and provide comprehensive information for medical image analysis.

Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning

1 code implementation • CVPR 2022 • Ligong Han, Jian Ren, Hsin-Ying Lee, Francesco Barbieri, Kyle Olszewski, Shervin Minaee, Dimitris Metaxas, Sergey Tulyakov

In addition, our model can extract visual information as suggested by the text prompt, e.g., "an object in image one is moving northeast", and generate corresponding videos.

Self-Learning Text Augmentation +1

186 GitHub stars

Region Proposal Rectification Towards Robust Instance Segmentation of Biological Images

no code implementations • 6 Mar 2022 • Qilong Zhangli, Jingru Yi, Di Liu, Xiaoxiao He, Zhaoyang Xia, Qi Chang, Ligong Han, Yunhe Gao, Song Wen, Haiming Tang, He Wang, Mu Zhou, Dimitris Metaxas

Top-down instance segmentation framework has shown its superiority in object detection compared to the bottom-up framework.

Instance Segmentation object-detection +4

TransFusion: Multi-view Divergent Fusion for Medical Image Segmentation with Transformers

no code implementations • 21 Mar 2022 • Di Liu, Yunhe Gao, Qilong Zhangli, Ligong Han, Xiaoxiao He, Zhaoyang Xia, Song Wen, Qi Chang, Zhennan Yan, Mu Zhou, Dimitris Metaxas

Combining information from multi-view images is crucial to improve the performance and robustness of automated methods for disease diagnosis.

Image Segmentation Medical Image Segmentation +2

Global Matching with Overlapping Attention for Optical Flow Estimation

1 code implementation • CVPR 2022 • Shiyu Zhao, Long Zhao, Zhixing Zhang, Enyu Zhou, Dimitris Metaxas

In this paper, inspired by the traditional matching-optimization methods where matching is introduced to handle large displacements before energy-based optimizations, we introduce a simple but effective global matching step before the direct regression and develop a learning-based matching-optimization framework, namely GMFlowNet.

Ranked #4 on Optical Flow Estimation on KITTI 2015

Optical Flow Estimation regression

A Manifold View of Adversarial Risk

no code implementations • 24 Mar 2022 • Wenjia Zhang, Yikai Zhang, Xiaoling Hu, Mayank Goswami, Chao Chen, Dimitris Metaxas

Assuming data lies in a manifold, we investigate two new types of adversarial risk, the normal adversarial risk due to perturbation along normal direction, and the in-manifold adversarial risk due to perturbation within the manifold.

DeepRecon: Joint 2D Cardiac Segmentation and 3D Volume Reconstruction via A Structure-Specific Generative Method

no code implementations • 14 Jun 2022 • Qi Chang, Zhennan Yan, Mu Zhou, Di Liu, Khalid Sawalha, Meng Ye, Qilong Zhangli, Mikael Kanski, Subhi Al Aref, Leon Axel, Dimitris Metaxas

Joint 2D cardiac segmentation and 3D volume reconstruction are fundamental to building statistical cardiac anatomy models and understanding functional mechanisms from motion patterns.

3D Reconstruction 3D Shape Reconstruction +5

A Dynamic Data Driven Approach for Explainable Scene Understanding

no code implementations • 18 Jun 2022 • Zachary A Daniels, Dimitris Metaxas

Suppose that an agent utilizing one or more sensors is placed in an unknown environment, and based on its sensory input, the agent needs to assign some label to the perceived scene.

Autonomous Driving Scene Understanding

Exploiting Unlabeled Data with Vision and Language Models for Object Detection

1 code implementation • 18 Jul 2022 • Shiyu Zhao, Zhixing Zhang, Samuel Schulter, Long Zhao, Vijay Kumar B. G, Anastasis Stathopoulos, Manmohan Chandraker, Dimitris Metaxas

We propose a novel method that leverages the rich semantics available in recent vision and language models to localize and classify objects in unlabeled images, effectively generating pseudo labels for object detection.

Ranked #15 on Open Vocabulary Object Detection on MSCOCO (using extra training data)

Object object-detection +3

Automatic Tooth Segmentation from 3D Dental Model using Deep Learning: A Quantitative Analysis of what can be learnt from a Single 3D Dental Model

1 code implementation • 16 Sep 2022 • Ananya Jana, Hrebesh Molly Subhash, Dimitris Metaxas

We conclude that the segmentation methods can learn a great deal of information from a single 3D tooth point cloud scan under suitable conditions, e.g., data augmentation.

Data Augmentation Point Cloud Segmentation +2

SINE: SINgle Image Editing with Text-to-Image Diffusion Models

1 code implementation • CVPR 2023 • Zhixing Zhang, Ligong Han, Arnab Ghosh, Dimitris Metaxas, Jian Ren

We propose a novel model-based guidance built upon the classifier-free guidance so that the knowledge from the model trained on a single image can be distilled into the pre-trained diffusion model, enabling content creation even with one given image.

Image Generation

173 GitHub stars

Diffusion Guided Domain Adaptation of Image Generators

no code implementations • 8 Dec 2022 • Kunpeng Song, Ligong Han, Bingchen Liu, Dimitris Metaxas, Ahmed Elgammal

Can a text-to-image diffusion model be used as a training objective for adapting a GAN generator to another domain?

Domain Adaptation

SVDiff: Compact Parameter Space for Diffusion Fine-Tuning

1 code implementation • ICCV 2023 • Ligong Han, Yinxiao Li, Han Zhang, Peyman Milanfar, Dimitris Metaxas, Feng Yang

Diffusion models have achieved remarkable success in text-to-image generation, enabling the creation of high-quality images from text prompts or other modalities.

Data Augmentation Efficient Diffusion Personalization +1

353 GitHub stars

Constructive Assimilation: Boosting Contrastive Learning Performance through View Generation Strategies

no code implementations • 2 Apr 2023 • Ligong Han, Seungwook Han, Shivchander Sudalairaj, Charlotte Loh, Rumen Dangovski, Fei Deng, Pulkit Agrawal, Dimitris Metaxas, Leonid Karlinsky, Tsui-Wei Weng, Akash Srivastava

Recently, several attempts have been made to replace such domain-specific, human-designed transformations with generated views that are learned.

Contrastive Learning Representation Learning

OmniLabel: A Challenging Benchmark for Language-Based Object Detection

no code implementations • ICCV 2023 • Samuel Schulter, Vijay Kumar B G, Yumin Suh, Konstantinos M. Dafnis, Zhixing Zhang, Shiyu Zhao, Dimitris Metaxas

With more than 28K unique object descriptions on over 25K images, OmniLabel provides a challenging benchmark with diverse and complex object descriptions in a naturally open-vocabulary setting.

Object object-detection +1

Learning Articulated Shape with Keypoint Pseudo-labels from Web Images

no code implementations • CVPR 2023 • Anastasis Stathopoulos, Georgios Pavlakos, Ligong Han, Dimitris Metaxas

It is based on two key insights: (1) 2D keypoint estimation networks trained on as few as 50-150 images of a given object category generalize well and generate reliable pseudo-labels; (2) a data selection mechanism can automatically create a "curated" subset of the unlabeled web images that can be used for training -- we evaluate four data selection methods.

3D Reconstruction Keypoint Estimation +1

Improving Tuning-Free Real Image Editing with Proximal Guidance

1 code implementation • 8 Jun 2023 • Ligong Han, Song Wen, Qi Chen, Zhixing Zhang, Kunpeng Song, Mengwei Ren, Ruijiang Gao, Anastasis Stathopoulos, Xiaoxiao He, Yuxiao Chen, Di Liu, Qilong Zhangli, Jindong Jiang, Zhaoyang Xia, Akash Srivastava, Dimitris Metaxas

Null-text inversion (NTI) optimizes null embeddings to align the reconstruction and inversion trajectories with larger CFG scales, enabling real image editing with cross-attention control.
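
The "CFG scale" in the abstract above refers to classifier-free guidance, which extrapolates from the unconditional noise prediction toward the conditional one. A minimal sketch on plain Python lists follows. The function name and the toy values are assumptions for illustration, and the proximal guidance proposed in the paper is not shown.

```python
def classifier_free_guidance(eps_uncond, eps_cond, scale):
    """Combine noise predictions: eps = eps_uncond + scale * (eps_cond - eps_uncond)."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]

# Toy noise predictions standing in for the outputs of a diffusion model
# evaluated with the null (unconditional) and the text (conditional) embedding.
eps_uncond = [0.0, 1.0, -1.0]
eps_cond = [1.0, 1.0, 0.0]

guided = classifier_free_guidance(eps_uncond, eps_cond, scale=7.5)
```

With `scale=1.0` the result equals the conditional prediction, and larger scales push further away from the unconditional one.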

On the Challenges and Perspectives of Foundation Models for Medical Image Analysis

no code implementations • 9 Jun 2023 • Shaoting Zhang, Dimitris Metaxas

This article discusses the opportunities, applications and future directions of large-scale pre-trained models, i.e., foundation models, for analyzing medical images.

Neural Deformable Models for 3D Bi-Ventricular Heart Shape Reconstruction and Modeling from 2D Sparse Cardiac Magnetic Resonance Imaging

no code implementations • ICCV 2023 • Meng Ye, Dong Yang, Mikael Kanski, Leon Axel, Dimitris Metaxas

We model the bi-ventricular shape using blended deformable superquadrics, which are parameterized by a set of geometric parameter functions and are capable of deforming globally and locally.
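
As background for the abstract above, a plain (undeformed) superquadric ellipsoid can be sketched in a few lines. The parameterization below is the standard textbook one, with scales a1, a2, a3 and squareness exponents e1, e2. All names are assumptions for illustration, and the paper's blending and its global and local deformation functions are not modeled here.

```python
import math

def spow(base, exponent):
    """Signed power: sign(base) * |base| ** exponent."""
    return math.copysign(abs(base) ** exponent, base)

def superquadric_point(eta, omega, a1, a2, a3, e1, e2):
    """Surface point for latitude eta in [-pi/2, pi/2] and longitude omega in [-pi, pi)."""
    x = a1 * spow(math.cos(eta), e1) * spow(math.cos(omega), e2)
    y = a2 * spow(math.cos(eta), e1) * spow(math.sin(omega), e2)
    z = a3 * spow(math.sin(eta), e1)
    return x, y, z

def inside_outside(x, y, z, a1, a2, a3, e1, e2):
    """Implicit function: 1 on the surface, below 1 inside, above 1 outside."""
    xy = abs(x / a1) ** (2 / e2) + abs(y / a2) ** (2 / e2)
    return xy ** (e2 / e1) + abs(z / a3) ** (2 / e1)

# Arbitrary example parameters; the surface point should satisfy the implicit equation.
params = dict(a1=2.0, a2=1.0, a3=1.5, e1=0.8, e2=1.2)
point = superquadric_point(0.3, 1.1, **params)
value = inside_outside(*point, **params)
```

Setting e1 = e2 = 1 gives an ordinary ellipsoid, while smaller exponents make the shape more box-like.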

Improving Compositional Text-to-image Generation with Large Vision-Language Models

no code implementations • 10 Oct 2023 • Song Wen, Guian Fang, Renrui Zhang, Peng Gao, Hao Dong, Dimitris Metaxas

However, compositional text-to-image models frequently encounter difficulties in generating high-quality images that accurately align with input texts describing multiple objects, variable attributes, and intricate spatial relationships.

Attribute Text-to-Image Generation

AVID: Any-Length Video Inpainting with Diffusion Model

1 code implementation • 6 Dec 2023 • Zhixing Zhang, Bichen Wu, Xiaoyan Wang, Yaqiao Luo, Luxin Zhang, Yinan Zhao, Peter Vajda, Dimitris Metaxas, Licheng Yu

Given a video, a masked region at its initial frame, and an editing prompt, it requires a model to do infilling at each frame following the editing guidance while keeping the out-of-mask region intact.

Image Inpainting Video Inpainting

Score-Guided Diffusion for 3D Human Recovery

1 code implementation • 14 Mar 2024 • Anastasis Stathopoulos, Ligong Han, Dimitris Metaxas

We present Score-Guided Human Mesh Recovery (ScoreHMR), an approach for solving inverse problems for 3D human pose and shape reconstruction.

Denoising Human Mesh Recovery

233 GitHub stars
