Search Results for author: Joyce Chai

Found 41 papers, 27 papers with code

GROUNDHOG: Grounding Large Language Models to Holistic Segmentation

no code implementations 26 Feb 2024 Yichi Zhang, Ziqiao Ma, Xiaofeng Gao, Suhaila Shakiah, Qiaozi Gao, Joyce Chai

Most multimodal large language models (MLLMs) learn language-to-object grounding through causal language modeling where grounded objects are captured by bounding boxes as sequences of location tokens.

Ranked #1 on Generalized Referring Expression Segmentation on gRefCOCO (using extra training data)

Causal Language Modeling Generalized Referring Expression Segmentation +2

Inversion-Free Image Editing with Natural Language

1 code implementation 7 Dec 2023 Sihan Xu, Yidong Huang, Jiayi Pan, Ziqiao Ma, Joyce Chai

We show that when the initial sample is known, a special variance schedule reduces the denoising step to the same form as the multi-step consistency sampling.

Image Manipulation Text-based Image Editing

Efficient In-Context Learning in Vision-Language Models for Egocentric Videos

1 code implementation 28 Nov 2023 Keunwoo Peter Yu, Zheyuan Zhang, Fengyuan Hu, Joyce Chai

Recent advancements in text-only large language models (LLMs) have highlighted the benefit of in-context learning for adapting to new tasks with a few demonstrations.

In-Context Learning

GIPCOL: Graph-Injected Soft Prompting for Compositional Zero-Shot Learning

1 code implementation 9 Nov 2023 Guangyue Xu, Joyce Chai, Parisa Kordjamshidi

In this work, we propose GIP-COL (Graph-Injected Soft Prompting for COmpositional Learning) to better explore the compositional zero-shot learning (CZSL) ability of VLMs within the prompt-based learning framework.

Attribute Compositional Zero-Shot Learning

MetaReVision: Meta-Learning with Retrieval for Visually Grounded Compositional Concept Acquisition

no code implementations 2 Nov 2023 Guangyue Xu, Parisa Kordjamshidi, Joyce Chai

Inspired by this observation, in this paper, we propose MetaReVision, a retrieval-enhanced meta-learning model to address the visually grounded compositional concept learning problem.

Meta-Learning Retrieval

Can Foundation Models Watch, Talk and Guide You Step by Step to Make a Cake?

1 code implementation 1 Nov 2023 Yuwei Bao, Keunwoo Peter Yu, Yichi Zhang, Shane Storks, Itamar Bar-Yossef, Alexander De La Iglesia, Megan Su, Xiao Lin Zheng, Joyce Chai

Despite tremendous advances in AI, it remains a significant challenge to develop interactive task guidance systems that can offer situated, personalized guidance and assist humans in various tasks.

Decision Making

Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans?

1 code implementation 31 Oct 2023 Yichi Zhang, Jiayi Pan, Yuchen Zhou, Rui Pan, Joyce Chai

Vision-Language Models (VLMs) are trained on vast amounts of data captured by humans emulating our understanding of the world.

Towards A Holistic Landscape of Situated Theory of Mind in Large Language Models

1 code implementation 30 Oct 2023 Ziqiao Ma, Jacob Sansom, Run Peng, Joyce Chai

Such situated evaluation provides a more comprehensive assessment of mental states and potentially mitigates the risk of shortcuts and data leakage.

Position Theory of Mind Modeling

From Heuristic to Analytic: Cognitively Motivated Strategies for Coherent Physical Commonsense Reasoning

1 code implementation 24 Oct 2023 Zheyuan Zhang, Shane Storks, Fengyuan Hu, Sungryull Sohn, Moontae Lee, Honglak Lee, Joyce Chai

We incorporate these interlinked dual processes in fine-tuning and in-context learning with PLMs, applying them to two language understanding tasks that require coherent physical commonsense reasoning.

In-Context Learning Physical Commonsense Reasoning

CycleNet: Rethinking Cycle Consistency in Text-Guided Diffusion for Image Manipulation

1 code implementation NeurIPS 2023 Sihan Xu, Ziqiao Ma, Yidong Huang, Honglak Lee, Joyce Chai

Our empirical studies show that CycleNet is superior in translation consistency and quality, and can generate high-quality images for out-of-domain distributions with a simple change of the textual prompt.

2k Image Generation +2

Think, Act, and Ask: Open-World Interactive Personalized Robot Navigation

1 code implementation 12 Oct 2023 Yinpei Dai, Run Peng, Sikai Li, Joyce Chai

To address these limitations, we introduce Zero-shot Interactive Personalized Object Navigation (ZIPON), where robots need to navigate to personalized goal objects while engaging in conversations with users.

Navigate Object +1

LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent

1 code implementation 21 Sep 2023 Jianing Yang, Xuweiyi Chen, Shengyi Qian, Nikhil Madaan, Madhavan Iyengar, David F. Fouhey, Joyce Chai

While existing approaches often rely on extensive labeled data or exhibit limitations in handling complex language queries, we propose LLM-Grounder, a novel zero-shot, open-vocabulary, Large Language Model (LLM)-based 3D visual grounding pipeline.

Language Modelling Large Language Model +3

World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Language Models

1 code implementation 14 Jun 2023 Ziqiao Ma, Jiayi Pan, Joyce Chai

The ability to connect language units to their referents in the physical world, referred to as grounding, is crucial to learning and understanding grounded meanings of words.

Grounded Open Vocabulary Acquisition Language Modelling

In-Context Analogical Reasoning with Pre-Trained Language Models

1 code implementation 28 May 2023 Xiaoyang Hu, Shane Storks, Richard L. Lewis, Joyce Chai

Analogical reasoning is a fundamental capacity of human cognition that allows us to reason abstractly about novel situations by relating them to past experiences.

In-Context Learning Relational Reasoning

NLP Reproducibility For All: Understanding Experiences of Beginners

3 code implementations 26 May 2023 Shane Storks, Keunwoo Peter Yu, Ziqiao Ma, Joyce Chai

As natural language processing (NLP) has recently seen an unprecedented level of excitement, and more people are eager to enter the field, it is unclear whether current research reproducibility efforts are sufficient for this group of beginners to apply the latest developments.

Towards Collaborative Plan Acquisition through Theory of Mind Modeling in Situated Dialogue

1 code implementation 18 May 2023 Cristian-Paul Bara, Ziqiao Ma, Yingzhuo Yu, Julie Shah, Joyce Chai

To complete these tasks, agents need to engage in situated communication with their partners and coordinate their partial plans towards a complete plan to achieve a joint task goal.

Collaborative Plan Acquisition Theory of Mind Modeling

BAD: BiAs Detection for Large Language Models in the context of candidate screening

1 code implementation 17 May 2023 Nam Ho Koh, Joseph Plata, Joyce Chai

Application Tracking Systems (ATS) have allowed talent managers, recruiters, and college admissions committees to process large volumes of potential candidate applications efficiently.

Bias Detection Fairness

Prompting Large Pre-trained Vision-Language Models For Compositional Concept Learning

no code implementations 9 Nov 2022 Guangyue Xu, Parisa Kordjamshidi, Joyce Chai

This work explores the zero-shot compositional learning ability of large pre-trained vision-language models (VLMs) within the prompt-based learning framework and proposes a model (PromptCompVL) to solve the compositional zero-shot learning (CZSL) problem.

Zero-Shot Learning

DOROTHIE: Spoken Dialogue for Handling Unexpected Situations in Interactive Autonomous Driving Agents

1 code implementation 22 Oct 2022 Ziqiao Ma, Ben VanDerPloeg, Cristian-Paul Bara, Yidong Huang, Eui-In Kim, Felix Gervits, Matthew Marge, Joyce Chai

To this end, we introduce Dialogue On the ROad To Handle Irregular Events (DOROTHIE), a novel interactive simulation platform that enables the creation of unexpected situations on the fly to support empirical studies on situated communication with autonomous driving agents.

Autonomous Driving Dialogue Act Classification +2

Reproducibility Beyond the Research Community: Experience from NLP Beginners

no code implementations 4 May 2022 Shane Storks, Keunwoo Peter Yu, Joyce Chai

As NLP research attracts public attention and excitement, it becomes increasingly important for it to be accessible to a broad audience.

Learning to Mediate Disparities Towards Pragmatic Communication

1 code implementation ACL 2022 Yuwei Bao, Sayan Ghosh, Joyce Chai

The PRS attempts to learn the speaker-listener disparity and adjust the speech accordingly, by adding a lightweight disparity adjustment layer into working memory on top of the speaker's long-term memory system.

Partition-Based Active Learning for Graph Neural Networks

1 code implementation 23 Jan 2022 Jiaqi Ma, Ziqiao Ma, Joyce Chai, Qiaozhu Mei

We study the problem of semi-supervised learning with Graph Neural Networks (GNNs) in an active learning setup.

Active Learning Node Classification

MindCraft: Theory of Mind Modeling for Situated Dialogue in Collaborative Tasks

1 code implementation EMNLP 2021 Cristian-Paul Bara, Sky CH-Wang, Joyce Chai

An ideal integration of autonomous agents in a human world implies that they are able to collaborate on human terms.

Theory of Mind Modeling

Beyond the Tip of the Iceberg: Assessing Coherence of Text Classifiers

1 code implementation Findings (EMNLP) 2021 Shane Storks, Joyce Chai

As large-scale, pre-trained language models achieve human-level and superhuman accuracy on existing language understanding tasks, statistical bias in benchmark data and probing studies have recently called into question their true capabilities.

Text Classification

Tiered Reasoning for Intuitive Physics: Toward Verifiable Commonsense Language Understanding

1 code implementation Findings (EMNLP) 2021 Shane Storks, Qiaozi Gao, Yichi Zhang, Joyce Chai

However, evaluations only based on end task performance shed little light on machines' true ability in language understanding and reasoning.

Hierarchical Task Learning from Language Instructions with Unified Transformers and Self-Monitoring

1 code implementation Findings (ACL) 2021 Yichi Zhang, Joyce Chai

On the ALFRED benchmark for task learning, the published state-of-the-art system only achieves a task success rate of less than 10% in an unseen environment, compared to the human performance of over 90%.

Experience Grounds Language

2 code implementations EMNLP 2020 Yonatan Bisk, Ari Holtzman, Jesse Thomason, Jacob Andreas, Yoshua Bengio, Joyce Chai, Mirella Lapata, Angeliki Lazaridou, Jonathan May, Aleksandr Nisnevich, Nicolas Pinto, Joseph Turian

Language understanding research is held back by a failure to relate language to the physical world it describes and to the social interactions it facilitates.

Representation Learning

Commonsense Justification for Action Explanation

1 code implementation EMNLP 2018 Shaohua Yang, Qiaozi Gao, Sari Sadiya, Joyce Chai

To enable collaboration and communication between humans and agents, this paper investigates learning to acquire commonsense evidence for action justification.

Decision Making

What Action Causes This? Towards Naive Physical Action-Effect Prediction

no code implementations ACL 2018 Qiaozi Gao, Shaohua Yang, Joyce Chai, Lucy Vanderwende

Despite recent advances in knowledge representation, automated reasoning, and machine learning, artificial agents still lack the ability to understand basic action-effect relations regarding the physical world, for example, the action of cutting a cucumber most likely leads to the state where the cucumber is broken apart into smaller pieces.

Interactive Learning of Grounded Verb Semantics towards Human-Robot Communication

no code implementations ACL 2017 Lanbo She, Joyce Chai

To enable human-robot communication and collaboration, previous works represent grounded verb semantics as the potential change of state to the physical world caused by these verbs.
