no code implementations • 23 Jun 2021 • Yichao Zhou, Chelsea Ju, J. Harry Caufield, Kevin Shih, Calvin Chen, Yizhou Sun, Kai-Wei Chang, Peipei Ping, Wei Wang
To facilitate various downstream applications using clinical case reports (CCRs), we pre-train two deep contextualized language models, Clinical Embeddings from Language Model (C-ELMo) and Clinical Contextual String Embeddings (C-Flair), on clinical-related corpora from PubMed Central.
3 code implementations • ICLR 2021 • Rafael Valle, Kevin Shih, Ryan Prenger, Bryan Catanzaro
In this paper we propose Flowtron: an autoregressive flow-based generative network for text-to-speech synthesis with control over speech variation and style transfer.
Ranked #1 on Text-To-Speech Synthesis on LJSpeech (Pleasantness MOS metric, using extra training data)
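The Flowtron abstract names an autoregressive flow: each frame is mapped through an invertible affine transform whose parameters depend only on previous frames, so the mapping is exactly invertible with a tractable log-determinant. A minimal numpy sketch of that idea (the `conditioner` below is a hypothetical toy stand-in for Flowtron's conditioning network, not the paper's architecture):

```python
import numpy as np

def conditioner(context):
    # Toy stand-in for a learned conditioning network (hypothetical):
    # invertibility only requires that the affine parameters for step t
    # depend on frames strictly before t.
    m = float(context.mean()) if context.size else 0.0
    return m, 0.1 * m  # (shift mu_t, log-scale s_t)

def forward(x):
    """Map a 1-D frame sequence x to latent z, accumulating log|det J|."""
    z = np.empty_like(x)
    log_det = 0.0
    for t in range(len(x)):
        mu, s = conditioner(x[:t])
        z[t] = (x[t] - mu) * np.exp(-s)
        log_det += -s  # each step contributes -s_t to the log-determinant
    return z, log_det

def inverse(z):
    """Recover x frame by frame; step t reuses the frames already decoded."""
    x = np.empty_like(z)
    for t in range(len(z)):
        mu, s = conditioner(x[:t])
        x[t] = z[t] * np.exp(s) + mu
    return x

x = np.array([0.5, -1.2, 0.3, 0.9])
z, log_det = forward(x)
assert np.allclose(inverse(z), x)  # exact invertibility
```

Because the forward pass at step t only reads `x[:t]`, the inverse can rebuild the sequence one frame at a time from its own partial output, which is what makes autoregressive flows exactly invertible.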
no code implementations • 21 Nov 2018 • Ji Zhang, Kevin Shih, Andrew Tao, Bryan Catanzaro, Ahmed Elgammal
We propose an efficient and interpretable scene graph generator.
no code implementations • 1 Nov 2018 • Ji Zhang, Kevin Shih, Andrew Tao, Bryan Catanzaro, Ahmed Elgammal
This article describes the model we built that achieved 1st place in the OpenImage Visual Relationship Detection Challenge on Kaggle.
no code implementations • ICCV 2017 • Tanmay Gupta, Kevin Shih, Saurabh Singh, Derek Hoiem
In this paper, we investigate a vision-language embedding as a core representation and show that it leads to better cross-task transfer than standard multi-task learning.
no code implementations • 19 Nov 2014 • Kevin Shih, Wei Di, Vignesh Jagadeesh, Robinson Piramuthu
Text is ubiquitous in the artificial world and easily attainable when it comes to book titles and author names.