Search Results for author: Gokul Kannan

Found 1 paper, 1 paper with code

SurgicalGPT: End-to-End Language-Vision GPT for Visual Question Answering in Surgery

1 code implementation • 19 Apr 2023 • Lalithkumar Seenivasan, Mobarakol Islam, Gokul Kannan, Hongliang Ren

Given the limitations of unidirectional attention in GPT models and their ability to generate coherent long paragraphs, we carefully sequence the word tokens before vision tokens, mimicking the human thought process of understanding the question to infer an answer from an image.
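As a rough illustration of the ordering described in this abstract (a sketch, not the authors' implementation), the snippet below places word tokens before vision tokens and builds a unidirectional (causal) attention mask. The token values and the helper names `build_sequence` and `causal_mask` are hypothetical and chosen only for this example.

```python
def build_sequence(word_tokens, vision_tokens):
    """Place question (word) tokens before vision tokens."""
    return list(word_tokens) + list(vision_tokens)


def causal_mask(length):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(length)] for i in range(length)]


# Hypothetical inputs: a tokenized question and placeholder image-patch tokens.
word_tokens = ["what", "is", "the", "tool", "doing", "?"]
vision_tokens = ["v0", "v1", "v2", "v3"]

sequence = build_sequence(word_tokens, vision_tokens)
mask = causal_mask(len(sequence))

first_vision = len(word_tokens)

# Every vision position can attend to all question tokens that precede it.
vision_sees_question = all(
    mask[i][j]
    for i in range(first_vision, len(sequence))
    for j in range(first_vision)
)

# No question token can attend to a vision token, since those come later.
question_sees_vision = any(
    mask[i][j]
    for i in range(first_vision)
    for j in range(first_vision, len(sequence))
)
```

Under a causal mask, each position attends only to earlier positions, so with this ordering the vision tokens are processed with the full question in context, while the question tokens never see the image.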

Question Answering, Scene Segmentation, +1
