This article presents a general Bayesian learning framework for multi-modal groupwise registration on medical images.
Current visual instruction datasets, generated with large language models, focus on creating question/answer pairs for individual image patches, which on their own may lack diagnostic capacity in histopathology; the problem is compounded by the absence of spatial grounding in histopathology image captions.
From YouTube, we curate QUILT: a large-scale vision-language dataset consisting of 802,144 image and text pairs.
However, the domain shift between natural images and digital pathology images calls for further research on designing MAEs for patch-level WSIs.
On top of BAR, we propose a supervised contrastive loss that accepts soft labels, aiming to learn relative similarities between representations that reflect how strongly the synthetic MRIs are mixed under our soft labels.
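As a rough illustration (not the paper's exact formulation), the sketch below implements a supervised contrastive loss whose positive weights come from soft-label agreement; the function name, temperature, and weighting scheme are assumptions.

```python
import torch
import torch.nn.functional as F

def soft_supcon_loss(z, soft_labels, temperature=0.1):
    """Soft-label supervised contrastive loss (hypothetical sketch).

    z: (N, d) embeddings; soft_labels: (N, C) rows summing to 1,
    e.g. mixing coefficients of synthetic MRIs."""
    z = F.normalize(z, dim=1)
    n = z.size(0)
    mask = ~torch.eye(n, dtype=torch.bool, device=z.device)  # exclude self-pairs
    logits = (z @ z.t() / temperature).masked_fill(~mask, float('-inf'))
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    w = (soft_labels @ soft_labels.t()) * mask       # soft positive weights
    w = w / w.sum(dim=1, keepdim=True).clamp_min(1e-8)
    return -(w * log_prob.masked_fill(~mask, 0.0)).sum(dim=1).mean()

z = torch.randn(8, 128)                              # a batch of embeddings
y = torch.softmax(torch.randn(8, 3), dim=1)          # toy soft labels
loss = soft_supcon_loss(z, y)
```

Pairs whose soft labels agree more receive larger positive weight, so the loss pulls together representations of similarly mixed samples rather than enforcing hard positive/negative splits.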
1 code implementation • 11 Aug 2021 • Beibin Li, Nicholas Nuechterlein, Erin Barney, Claire Foster, Minah Kim, Monique Mahony, Adham Atyabi, Li Feng, Quan Wang, Pamela Ventola, Linda Shapiro, Frederick Shic
Identifying oculomotor behaviors relevant for eye-tracking applications is a critical but often challenging task.
Given a segmentation mask defining the layout of the semantic regions in the texture map, our network generates high-resolution textures in a variety of styles that are then used for rendering.
HATNet extends the bag-of-words approach and uses self-attention to encode global information, allowing it to learn representations from clinically relevant tissue structures without any explicit supervision.
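A minimal sketch of the global-encoding step, assuming standard multi-head self-attention over a bag of patch embeddings; the class name, dimensions, and mean-pooling head are illustrative, and the actual HATNet uses a hierarchical bag-of-words-to-sentence design.

```python
import torch
import torch.nn as nn

class BagSelfAttention(nn.Module):
    """Self-attention over a bag of patch embeddings (illustrative)."""
    def __init__(self, dim=256, heads=4, num_classes=2):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, patches):                        # patches: (B, N, dim)
        ctx, _ = self.attn(patches, patches, patches)  # global context
        ctx = self.norm(patches + ctx)                 # residual + layer norm
        return self.head(ctx.mean(dim=1))              # pool the bag, classify

bag = torch.randn(2, 64, 256)                          # 2 images, 64 patches each
logits = BagSelfAttention()(bag)                       # (2, 2)
```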
Traditional methods for image-based 3D face reconstruction and facial motion retargeting fit a 3D morphable model (3DMM) to the face, which has limited modeling capacity and fails to generalize well to in-the-wild data.
In genomic analysis, biomarker discovery, image recognition, and other systems involving machine learning, input variables can often be organized into different groups by their source or semantic category.
We present DeepExpr, a novel expression transfer system from humans to multiple stylized characters via deep learning.
In this paper, we introduce an end-to-end machine learning-based system for classifying autism spectrum disorder (ASD) using facial attributes such as expressions, action units, arousal, and valence.
Compared to YOLOv2 on MS-COCO object detection, ESPNetv2 delivers 4.4% higher accuracy with 6x fewer FLOPs.
Ranked #41 on Semantic Segmentation on PASCAL VOC 2012 test
In this paper, we introduce a conceptually simple network for generating discriminative tissue-level segmentation masks for the purpose of breast cancer diagnosis.
We introduce a fast and efficient convolutional neural network, ESPNet, for semantic segmentation of high resolution images under resource constraints.
Ranked #48 on Semantic Segmentation on PASCAL VOC 2012 test
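For intuition, here is a sketch of an ESP-style block: a point-wise channel reduction followed by parallel dilated convolutions merged with hierarchical feature fusion (HFF). The dilation rates and channel split follow the paper's description; normalization and activation placement are simplified assumptions.

```python
import torch
import torch.nn as nn

class ESP(nn.Module):
    """Simplified ESP block: reduce, split, transform, merge with HFF."""
    def __init__(self, channels, K=4):
        super().__init__()
        d = channels // K                    # channels per branch
        self.reduce = nn.Conv2d(channels, d, 1, bias=False)
        self.branches = nn.ModuleList(
            nn.Conv2d(d, d, 3, padding=2**k, dilation=2**k, bias=False)
            for k in range(K)                # dilation rates 1, 2, 4, 8
        )
        self.bn = nn.BatchNorm2d(d * K)

    def forward(self, x):
        r = self.reduce(x)
        outs = [b(r) for b in self.branches]
        for k in range(1, len(outs)):        # HFF: cumulative sums remove
            outs[k] = outs[k] + outs[k - 1]  # gridding artifacts
        return torch.relu(self.bn(torch.cat(outs, dim=1))) + x

x = torch.randn(1, 64, 56, 56)
y = ESP(64)(x)                               # same shape as x
```

The dilated branches give a large effective receptive field at low cost, which is what makes the network efficient under resource constraints.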
We present an approach for identifying the most walkable direction for navigation using a hand-held camera.
We trained and applied an encoder-decoder model to semantically segment breast biopsy images into biologically meaningful tissue labels.
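As a minimal illustration of the encoder-decoder setup, a toy per-pixel labeling network is sketched below; the paper's actual architecture and tissue label set differ.

```python
import torch
import torch.nn as nn

class TinyEncDec(nn.Module):
    """Toy encoder-decoder for per-pixel tissue labels (illustrative)."""
    def __init__(self, in_ch=3, num_labels=8):
        super().__init__()
        self.enc = nn.Sequential(             # downsample by 4x
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.dec = nn.Sequential(             # upsample back to input size
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, num_labels, 4, stride=2, padding=1),
        )

    def forward(self, x):                     # x: (B, 3, H, W)
        return self.dec(self.enc(x))          # (B, num_labels, H, W)

x = torch.randn(1, 3, 128, 128)
logits = TinyEncDec()(x)                      # per-pixel tissue logits
```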
Interactive video segmentation systems aim at producing sub-pixel-level object boundaries for visual effect applications.
This paper proposes a new supervised semantic edge and gradient extraction approach that allows the user to roughly scribble over a desired region to extract the semantically dominant and coherent edges within it.