Search Results for author: Thao Nguyen

Found 25 papers, 9 papers with code

Yo'LLaVA: Your Personalized Language and Vision Assistant

no code implementations 13 Jun 2024 Thao Nguyen, Haotian Liu, Yuheng Li, Mu Cai, Utkarsh Ojha, Yong Jae Lee

In this paper, we introduce the novel task of personalizing LMMs, so that they can have conversations about a specific subject.

Multilingual Diversity Improves Vision-Language Representations

no code implementations 27 May 2024 Thao Nguyen, Matthew Wallingford, Sebastin Santy, Wei-Chiu Ma, Sewoong Oh, Ludwig Schmidt, Pang Wei Koh, Ranjay Krishna

By translating all multilingual image-text pairs from a raw web crawl to English and re-filtering them, we increase the prevalence of (translated) multilingual data in the resulting training set.

Text Retrieval

GLaD: Synergizing Molecular Graphs and Language Descriptors for Enhanced Power Conversion Efficiency Prediction in Organic Photovoltaic Devices

no code implementations 23 May 2024 Thao Nguyen, Tiara Torres-Flores, Changhyun Hwang, Carl Edwards, Ying Diao, Heng Ji

Especially, GLaD proves valuable for tasks in low-data regimes within the chemical space, as it enriches molecular representations by incorporating molecular property descriptions learned from large-scale pretraining.

Decision Making, Efficient Exploration

Edit One for All: Interactive Batch Image Editing

no code implementations CVPR 2024 Thao Nguyen, Utkarsh Ojha, Yuheng Li, Haotian Liu, Yong Jae Lee

With increased human control, it is now possible to edit an image in a plethora of ways, from specifying in text what we want to change to straight up dragging the contents of the image in an interactive point-based manner.

Probing clustering in neural network representations

no code implementations 14 Nov 2023 Thao Nguyen, Simon Kornblith

Neural network representations contain structure beyond what was present in the training labels.


Language-Conditioned Observation Models for Visual Object Search

no code implementations 13 Sep 2023 Thao Nguyen, Vladislav Hrosinkov, Eric Rosen, Stefanie Tellex

In this work, we bridge the gap in realistic object search by posing the search problem as a partially observable Markov decision process (POMDP) where the object detector and visual sensor noise in the observation model is determined by a single Deep Neural Network conditioned on complex language descriptions.


Guiding Image Captioning Models Toward More Specific Captions

no code implementations ICCV 2023 Simon Kornblith, Lala Li, ZiRui Wang, Thao Nguyen

We further explore the use of language models to guide the decoding process, obtaining small improvements over the Pareto frontier of reference-free vs. reference-based captioning metrics that arises from classifier-free guidance, and substantially improving the quality of captions generated from a model trained only on minimally curated web data.

Image Captioning, Image Retrieval

Visual Instruction Inversion: Image Editing via Visual Prompting

1 code implementation 26 Jul 2023 Thao Nguyen, Yuheng Li, Utkarsh Ojha, Yong Jae Lee

Given pairs of examples that represent the "before" and "after" images of an edit, our goal is to learn a text-based editing direction that can be used to perform the same edit on new images.

Visual Prompting

Detecting COVID-19 from digitized ECG printouts using 1D convolutional neural networks

no code implementations 10 Aug 2022 Thao Nguyen, Hieu H. Pham, Huy Khiem Le, Anh Tu Nguyen, Ngoc Tien Thanh, Cuong Do

Experiments on the COVID-19 ECG images dataset demonstrate that the proposed digitization method is able to capture correctly the original signals, with a mean absolute error of 28.11 ms. Our proposed 1D-CNN model, which is trained on the digitized ECG signals, allows identifying individuals with COVID-19 and other subjects accurately, with classification accuracies of 98.42%, 95.63%, and 98.50% for classifying COVID-19 vs. Normal, COVID-19 vs. Abnormal Heartbeats, and COVID-19 vs. other classes, respectively.

Quality Not Quantity: On the Interaction between Dataset Design and Robustness of CLIP

1 code implementation 10 Aug 2022 Thao Nguyen, Gabriel Ilharco, Mitchell Wortsman, Sewoong Oh, Ludwig Schmidt

Web-crawled datasets have enabled remarkable generalization capabilities in recent image-text models such as CLIP (Contrastive Language-Image pre-training) or Flamingo, but little is known about the dataset creation processes.

A novel deep learning-based approach for sleep apnea detection using single-lead ECG signals

no code implementations 5 Aug 2022 Anh-Tu Nguyen, Thao Nguyen, Huy-Khiem Le, Huy-Hieu Pham, Cuong Do

In this study, a novel method of feature extraction based on the detection of S peaks is proposed to enhance the detection of adjacent SA segments using a single-lead ECG.

Feature Engineering, Sleep apnea detection

On the Origins of the Block Structure Phenomenon in Neural Network Representations

1 code implementation 15 Feb 2022 Thao Nguyen, Maithra Raghu, Simon Kornblith

Recent work has uncovered a striking phenomenon in large-capacity neural networks: they contain blocks of contiguous hidden layers with highly similar representations.

Dominant Datapoints and the Block Structure Phenomenon in Neural Network Hidden Representations

no code implementations 29 Sep 2021 Thao Nguyen, Maithra Raghu, Simon Kornblith

Recent work has uncovered a striking phenomenon in large-capacity neural networks: they contain blocks of contiguous hidden layers with highly similar representations.

Lipstick ain't enough: Beyond Color Matching for In-the-Wild Makeup Transfer

1 code implementation CVPR 2021 Thao Nguyen, Anh Tran, Minh Hoai

However, existing works overlooked the latter components and confined makeup transfer to color manipulation, focusing only on light makeup styles.

Color Manipulation, Facial Makeup Transfer

Robust and Private Learning of Halfspaces

no code implementations 30 Nov 2020 Badih Ghazi, Ravi Kumar, Pasin Manurangsi, Thao Nguyen

In this work, we study the trade-off between differential privacy and adversarial robustness under L2-perturbations in the context of learning halfspaces.

Adversarial Robustness

Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth

4 code implementations ICLR 2021 Thao Nguyen, Maithra Raghu, Simon Kornblith

We begin by investigating how varying depth and width affects model hidden representations, finding a characteristic block structure in the hidden representations of larger capacity (wider or deeper) models.

Concept Bottleneck Models

4 code implementations ICML 2020 Pang Wei Koh, Thao Nguyen, Yew Siang Tang, Stephen Mussmann, Emma Pierson, Been Kim, Percy Liang

We seek to learn models that we can interact with using high-level concepts: if the model did not think there was a bone spur in the x-ray, would it still predict severe arthritis?

Robot Object Retrieval with Contextual Natural Language Queries

1 code implementation 23 Jun 2020 Thao Nguyen, Nakul Gopalan, Roma Patel, Matt Corsaro, Ellie Pavlick, Stefanie Tellex

The model takes in a language command containing a verb, for example "Hand me something to cut," and RGB images of candidate objects and selects the object that best satisfies the task specified by the verb.

Natural Language Queries, Object

Grounding Language Attributes to Objects using Bayesian Eigenobjects

no code implementations 30 May 2019 Vanya Cohen, Benjamin Burchfiel, Thao Nguyen, Nakul Gopalan, Stefanie Tellex, George Konidaris

Our system is able to disambiguate between novel objects, observed via depth images, based on natural language descriptions.

3D Shape Representation, Object

Planning with State Abstractions for Non-Markovian Task Specifications

2 code implementations 28 May 2019 Yoonseon Oh, Roma Patel, Thao Nguyen, Baichuan Huang, Ellie Pavlick, Stefanie Tellex

Oftentimes, we specify tasks for a robot using temporal language that can also span different levels of abstraction.
