Search Results for author: Mahmoud Khademi

Found 9 papers, 1 paper with code

Training Small Multimodal Models to Bridge Biomedical Competency Gap: A Case Study in Radiology Imaging

no code implementations • 12 Mar 2024 • Juan Manuel Zambrano Chaves, Shih-Cheng Huang, Yanbo Xu, Hanwen Xu, Naoto Usuyama, Sheng Zhang, Fei Wang, Yujia Xie, Mahmoud Khademi, ZiYi Yang, Hany Awadalla, Julia Gong, Houdong Hu, Jianwei Yang, Chunyuan Li, Jianfeng Gao, Yu Gu, Cliff Wong, Mu Wei, Tristan Naumann, Muhao Chen, Matthew P. Lungren, Serena Yeung-Levy, Curtis P. Langlotz, Sheng Wang, Hoifung Poon

Frontier models such as GPT-4V still have major competency gaps in multimodal capabilities for biomedical applications.

Cross-Modal Retrieval

CoDi-2: In-Context, Interleaved, and Interactive Any-to-Any Generation

no code implementations • 30 Nov 2023 • Zineng Tang, ZiYi Yang, Mahmoud Khademi, Yang Liu, Chenguang Zhu, Mohit Bansal

We present CoDi-2, a versatile and interactive Multimodal Large Language Model (MLLM) that can follow complex multimodal interleaved instructions, conduct in-context learning (ICL), reason, chat, edit, etc., in an any-to-any input-output modality paradigm.

Image Generation • In-Context Learning • +3

i-Code Studio: A Configurable and Composable Framework for Integrative AI

no code implementations • 23 May 2023 • Yuwei Fang, Mahmoud Khademi, Chenguang Zhu, ZiYi Yang, Reid Pryzant, Yichong Xu, Yao Qian, Takuya Yoshioka, Lu Yuan, Michael Zeng, Xuedong Huang

Artificial General Intelligence (AGI) requires comprehensive understanding and generation capabilities for a variety of tasks spanning different modalities and functionalities.

Question Answering • Retrieval • +4

i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data

no code implementations • 21 May 2023 • ZiYi Yang, Mahmoud Khademi, Yichong Xu, Reid Pryzant, Yuwei Fang, Chenguang Zhu, Dongdong Chen, Yao Qian, Mei Gao, Yi-Ling Chen, Robert Gmyr, Naoyuki Kanda, Noel Codella, Bin Xiao, Yu Shi, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang

The convergence of text, visual, and audio data is a key step towards human-like artificial intelligence; however, the current Vision-Language-Speech landscape is dominated by encoder-only models, which lack generative abilities.

Multimodal Neural Graph Memory Networks for Visual Question Answering

no code implementations • ACL 2020 • Mahmoud Khademi

The MN-GMN uses graph structure with different region features as node attributes and applies a recently proposed powerful graph neural network model, Graph Network (GN), to reason about objects and their interactions in an image.

Question Answering • Visual Question Answering

Learning to Represent Programs with Graphs

2 code implementations • ICLR 2018 • Miltiadis Allamanis, Marc Brockschmidt, Mahmoud Khademi

Learning tasks on source code (i.e., formal languages) have been considered recently, but most work has tried to transfer natural language methods and does not capitalize on the unique opportunities offered by code's known syntax.

372

Relative Facial Action Unit Detection

no code implementations • 1 May 2014 • Mahmoud Khademi, Louis-Philippe Morency

This paper presents a subject-independent facial action unit (AU) detection method by introducing the concept of relative AU detection, for scenarios where the neutral face is not provided.

Action Unit Detection • Facial Action Unit Detection • +1

Extended Active Learning Method

no code implementations • 10 Nov 2010 • Ali Akbar Kiaei, Saeed Bagheri Shouraki, Seyed Hossein Khasteh, Mahmoud Khademi, Alireza Ghatreh Samani

Active Learning Method (ALM) is a soft computing method which is used for modeling and control, based on fuzzy logic.

Active Learning

Extended Two-Dimensional PCA for Efficient Face Representation and Recognition

no code implementations • 6 Apr 2010 • Mehran Safayani, Mohammad T. Manzuri-Shalmani, Mahmoud Khademi

r = 1 produces the covariance of 2DPCA, r = n that of PCA.

Vocal Bursts Valence Prediction
