Search Results for author: Mahtab Bigverdi

Found 4 papers, 1 paper with code

Perception Tokens Enhance Visual Reasoning in Multimodal Language Models

no code implementations • 4 Dec 2024 • Mahtab Bigverdi, Zelun Luo, Cheng-Yu Hsieh, Ethan Shen, Dongping Chen, Linda G. Shapiro, Ranjay Krishna

For example, in a depth-related task, an MLM augmented with perception tokens can reason by generating a depth map as tokens, enabling it to solve the problem effectively.

Depth Estimation • object-detection +2
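The snippet above only gestures at the mechanism, so here is a minimal, self-contained sketch (the names, bin count, and vocabulary size are assumptions, not taken from the paper) of how a depth map could be quantized into discrete "perception tokens" appended after an MLM's text vocabulary, letting the model emit a coarse depth map as an intermediate token sequence before answering:

```python
# Hypothetical sketch: map a depth map to discrete token IDs reserved beyond the text vocab.
import numpy as np

N_DEPTH_BINS = 64          # assumed number of discrete depth levels
BASE_VOCAB_SIZE = 32000    # assumed size of the original text vocabulary

def depth_map_to_tokens(depth: np.ndarray) -> np.ndarray:
    """Quantize an HxW depth map into perception-token IDs placed after the text vocab."""
    span = depth.max() - depth.min()
    d = (depth - depth.min()) / (span + 1e-8)                    # normalize to [0, 1]
    bins = np.clip((d * N_DEPTH_BINS).astype(int), 0, N_DEPTH_BINS - 1)
    return BASE_VOCAB_SIZE + bins.flatten()                      # reserved token-ID range

def tokens_to_depth_map(tokens: np.ndarray, shape: tuple) -> np.ndarray:
    """Invert the mapping so a generated token sequence reads back as a coarse depth map."""
    bins = tokens.reshape(shape) - BASE_VOCAB_SIZE
    return bins.astype(float) / (N_DEPTH_BINS - 1)

# A toy 4x4 depth map becomes 16 perception tokens the model could generate
# as intermediate reasoning before producing its final textual answer.
toy_depth = np.random.rand(4, 4)
perception_tokens = depth_map_to_tokens(toy_depth)
recovered = tokens_to_depth_map(perception_tokens, toy_depth.shape)
```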

Gene-Level Representation Learning via Interventional Style Transfer in Optical Pooled Screening

no code implementations • 11 Jun 2024 • Mahtab Bigverdi, Burkhard Hockendorf, Heming Yao, Phil Hanslovsky, Romain Lopez, David Richmond

Optical pooled screening (OPS) combines automated microscopy and genetic perturbations to systematically study gene function in a scalable and cost-effective way.

Clustering • Representation Learning +1

Data Alignment for Zero-Shot Concept Generation in Dermatology AI

no code implementations • 19 Apr 2024 • Soham Gadgil, Mahtab Bigverdi

Our goal is to use these models to generate caption text that aligns well with both the clinical lexicon and with the natural human language used in CLIP's pre-training data.
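As a hedged illustration of what aligning caption text with CLIP's pre-training language could look like in practice (this is not the paper's pipeline; the model checkpoint and the two example captions are assumptions), one can compare a clinical phrasing and a natural-language phrasing of the same finding in CLIP's text-embedding space:

```python
# Illustrative only: measure how close two caption styles sit in CLIP's text space.
import torch
from transformers import CLIPModel, CLIPTokenizer

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")       # assumed checkpoint
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

clinical_caption = "erythematous plaque with silvery scale on the extensor surface"  # assumed example
natural_caption = "a red, scaly patch of skin on the elbow"                          # assumed example

inputs = tokenizer([clinical_caption, natural_caption], padding=True, return_tensors="pt")
with torch.no_grad():
    text_features = model.get_text_features(**inputs)
text_features = text_features / text_features.norm(dim=-1, keepdim=True)

# Cosine similarity between the clinical and natural phrasings in CLIP text space.
similarity = (text_features[0] @ text_features[1]).item()
print(f"CLIP text-space similarity: {similarity:.3f}")
```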

MIMIC: Masked Image Modeling with Image Correspondences

1 code implementation • 27 Jun 2023 • Kalyani Marathe, Mahtab Bigverdi, Nishat Khan, Tuhin Kundu, Patrick Howe, Sharan Ranjit S, Anand Bhattad, Aniruddha Kembhavi, Linda G. Shapiro, Ranjay Krishna

We train multiple models with different masked image modeling objectives to showcase the following findings: Representations trained on our automatically generated MIMIC-3M outperform those learned from expensive crowdsourced datasets (ImageNet-1K) and those learned from synthetic environments (MULTIVIEW-HABITAT) on two dense geometric tasks: depth estimation on NYUv2 (1.7%), and surface normals estimation on Taskonomy (2.05%).

Depth Estimation • Pose Estimation +3
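For context on the training objective mentioned above, here is a generic masked image modeling sketch (a toy stand-in, not the MIMIC architecture; the patch size, mask ratio, and tiny encoder-decoder are assumptions): random patches are hidden and the model is trained to reconstruct only those patches.

```python
# Toy masked image modeling step: patchify, mask, reconstruct masked patches.
import torch
import torch.nn as nn

PATCH, IMG, MASK_RATIO = 16, 224, 0.75          # assumed patch size, image size, mask ratio
NUM_PATCHES = (IMG // PATCH) ** 2

def patchify(images: torch.Tensor) -> torch.Tensor:
    """(B, 3, H, W) -> (B, num_patches, patch_dim)."""
    b = images.shape[0]
    x = images.unfold(2, PATCH, PATCH).unfold(3, PATCH, PATCH)   # (B, 3, h, w, P, P)
    return x.permute(0, 2, 3, 1, 4, 5).reshape(b, NUM_PATCHES, -1)

class ToyMAE(nn.Module):
    """Tiny stand-in for a masked autoencoder: embeds patches and predicts pixels."""
    def __init__(self, patch_dim=3 * PATCH * PATCH, dim=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(patch_dim, dim), nn.GELU())
        self.decoder = nn.Linear(dim, patch_dim)

    def forward(self, patches, mask):
        latent = self.encoder(patches * (~mask).unsqueeze(-1))   # zero out masked patches
        return self.decoder(latent)

images = torch.rand(2, 3, IMG, IMG)                              # placeholder images
patches = patchify(images)
mask = torch.rand(2, NUM_PATCHES) < MASK_RATIO                   # True = masked patch
model = ToyMAE()
pred = model(patches, mask)
loss = ((pred - patches) ** 2)[mask].mean()                      # loss only on masked patches
loss.backward()
```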
