1 code implementation • 31 Oct 2024 • Jiajun Xi, Yinong He, Jianing Yang, Yinpei Dai, Joyce Chai
In real-world scenarios, it is desirable for embodied agents to have the ability to leverage human language to gain explicit or implicit knowledge for learning tasks.
1 code implementation • 22 Oct 2024 • Zheyuan Zhang, Fengyuan Hu, Jayjun Lee, Freda Shi, Parisa Kordjamshidi, Joyce Chai, Ziqiao Ma
Spatial expressions in situated communication can be ambiguous, as their meanings vary depending on the frames of reference (FoR) adopted by speakers and listeners.
no code implementations • 23 Sep 2024 • Yinpei Dai, Jayjun Lee, Nima Fazeli, Joyce Chai
Developing robust and correctable visuomotor policies for robotic manipulation is challenging due to the lack of self-recovery mechanisms from failures and the limitations of simple language instructions in guiding robot actions.
1 code implementation • 9 Jul 2024 • Yue Zhang, Ziqiao Ma, Jialu Li, Yanyuan Qiao, Zun Wang, Joyce Chai, Qi Wu, Mohit Bansal, Parisa Kordjamshidi
Vision-and-Language Navigation (VLN) has gained increasing attention in recent years, and many approaches have emerged to advance its development.
1 code implementation • 8 Jul 2024 • Xuweiyi Chen, Ziqiao Ma, Xuejun Zhang, Sihan Xu, Shengyi Qian, Jianing Yang, David F. Fouhey, Joyce Chai
Large vision language models (LVLMs) often suffer from object hallucination, producing objects not present in the given images.
1 code implementation • 13 Jun 2024 • Hua Shen, Tiffany Knearem, Reshmi Ghosh, Kenan Alkiek, Kundan Krishna, Yachuan Liu, Ziqiao Ma, Savvas Petridis, Yi-Hao Peng, Li Qiwei, Sushrita Rakshit, Chenglei Si, Yutong Xie, Jeffrey P. Bigham, Frank Bentley, Joyce Chai, Zachary Lipton, Qiaozhu Mei, Rada Mihalcea, Michael Terry, Diyi Yang, Meredith Ringel Morris, Paul Resnick, David Jurgens
From this, we present a conceptual framework of "Bidirectional Human-AI Alignment" to organize the literature from a human-centered perspective.
1 code implementation • 7 Jun 2024 • Jianing Yang, Xuweiyi Chen, Nikhil Madaan, Madhavan Iyengar, Shengyi Qian, David F. Fouhey, Joyce Chai
The integration of language and 3D perception is crucial for developing embodied agents and robots that comprehend and interact with the physical world.
1 code implementation • 7 Jun 2024 • Zhongmou He, Jing Zhu, Shengyi Qian, Joyce Chai, Danai Koutra
To address the efficiency challenges at inference time, we introduce a retrieval-reranking scheme.
1 code implementation • 5 Jun 2024 • Yidong Huang, Jacob Sansom, Ziqiao Ma, Felix Gervits, Joyce Chai
Recent advancements in foundation models (FMs) have unlocked new prospects in autonomous driving, yet the experimental settings of these studies are preliminary, over-simplified, and fail to capture the complexity of real-world driving scenarios in human environments.
1 code implementation • 22 May 2024 • Ziqiao Ma, Zekun Wang, Joyce Chai
In this work, we aim to examine how corrective feedback from interactions influences neural language acquisition from the ground up through systematically controlled experiments, assessing whether it contributes to learning efficiency in language models.
no code implementations • CVPR 2024 • Yichi Zhang, Ziqiao Ma, Xiaofeng Gao, Suhaila Shakiah, Qiaozi Gao, Joyce Chai
Most multimodal large language models (MLLMs) learn language-to-object grounding through causal language modeling where grounded objects are captured by bounding boxes as sequences of location tokens.
Ranked #2 on Referring Expression Segmentation on PhraseCut
1 code implementation • CVPR 2024 • Sihan Xu, Yidong Huang, Jiayi Pan, Ziqiao Ma, Joyce Chai
Despite recent advances in inversion-based editing, text-guided image manipulation remains challenging for diffusion models.
1 code implementation • 7 Dec 2023 • Sihan Xu, Yidong Huang, Jiayi Pan, Ziqiao Ma, Joyce Chai
We show that when the initial sample is known, a special variance schedule reduces the denoising step to the same form as the multi-step consistency sampling.
Ranked #1 on Text-based Image Editing on PIE-Bench
1 code implementation • 28 Nov 2023 • Keunwoo Peter Yu, Zheyuan Zhang, Fengyuan Hu, Shane Storks, Joyce Chai
Our results, analysis, and EILEV-trained models yield numerous insights about the emergence of in-context learning over video and text, creating a foundation for future work to optimize and scale VLMs for open-domain video understanding and reasoning.
1 code implementation • 9 Nov 2023 • Guangyue Xu, Joyce Chai, Parisa Kordjamshidi
In this work, we propose GIP-COL (Graph-Injected Soft Prompting for COmpositional Learning) to better explore the compositional zero-shot learning (CZSL) ability of VLMs within the prompt-based learning framework.
no code implementations • 2 Nov 2023 • Guangyue Xu, Parisa Kordjamshidi, Joyce Chai
Inspired by this observation, in this paper, we propose MetaReVision, a retrieval-enhanced meta-learning model to address the visually grounded compositional concept learning problem.
1 code implementation • 1 Nov 2023 • Yuwei Bao, Keunwoo Peter Yu, Yichi Zhang, Shane Storks, Itamar Bar-Yossef, Alexander De La Iglesia, Megan Su, Xiao Lin Zheng, Joyce Chai
Despite tremendous advances in AI, it remains a significant challenge to develop interactive task guidance systems that can offer situated, personalized guidance and assist humans in various tasks.
1 code implementation • 31 Oct 2023 • Yichi Zhang, Jiayi Pan, Yuchen Zhou, Rui Pan, Joyce Chai
Vision-Language Models (VLMs) are trained on vast amounts of data captured by humans, emulating our understanding of the world.
1 code implementation • 30 Oct 2023 • Ziqiao Ma, Jacob Sansom, Run Peng, Joyce Chai
Such situated evaluation provides a more comprehensive assessment of mental states and potentially mitigates the risk of shortcuts and data leakage.
1 code implementation • 24 Oct 2023 • Zheyuan Zhang, Shane Storks, Fengyuan Hu, Sungryull Sohn, Moontae Lee, Honglak Lee, Joyce Chai
We incorporate these interlinked dual processes in fine-tuning and in-context learning with PLMs, applying them to two language understanding tasks that require coherent physical commonsense reasoning.
1 code implementation • NeurIPS 2023 • Sihan Xu, Ziqiao Ma, Yidong Huang, Honglak Lee, Joyce Chai
Our empirical studies show that CycleNet is superior in translation consistency and quality, and can generate high-quality images for out-of-domain distributions with a simple change of the textual prompt.
1 code implementation • 12 Oct 2023 • Yinpei Dai, Run Peng, Sikai Li, Joyce Chai
To address these limitations, we introduce Zero-shot Interactive Personalized Object Navigation (ZIPON), where robots need to navigate to personalized goal objects while engaging in conversations with users.
1 code implementation • 21 Sep 2023 • Jianing Yang, Xuweiyi Chen, Shengyi Qian, Nikhil Madaan, Madhavan Iyengar, David F. Fouhey, Joyce Chai
While existing approaches often rely on extensive labeled data or exhibit limitations in handling complex language queries, we propose LLM-Grounder, a novel zero-shot, open-vocabulary, Large Language Model (LLM)-based 3D visual grounding pipeline.
1 code implementation • 5 Jul 2023 • Yuwei Bao, Barrett Martin Lattimer, Joyce Chai
Human language acquisition is an efficient, supervised, and continual process.
1 code implementation • 14 Jun 2023 • Ziqiao Ma, Jiayi Pan, Joyce Chai
The ability to connect language units to their referents in the physical world, referred to as grounding, is crucial to learning and understanding grounded meanings of words.
1 code implementation • 28 May 2023 • Xiaoyang Hu, Shane Storks, Richard L. Lewis, Joyce Chai
Analogical reasoning is a fundamental capacity of human cognition that allows us to reason abstractly about novel situations by relating them to past experiences.
3 code implementations • 26 May 2023 • Shane Storks, Keunwoo Peter Yu, Ziqiao Ma, Joyce Chai
As natural language processing (NLP) has recently seen an unprecedented level of excitement, and more people are eager to enter the field, it is unclear whether current research reproducibility efforts are sufficient for this group of beginners to apply the latest developments.
1 code implementation • 18 May 2023 • Cristian-Paul Bara, Ziqiao Ma, Yingzhuo Yu, Julie Shah, Joyce Chai
To complete these tasks, agents need to engage in situated communication with their partners and coordinate their partial plans towards a complete plan to achieve a joint task goal.
1 code implementation • 17 May 2023 • Nam Ho Koh, Joseph Plata, Joyce Chai
Application Tracking Systems (ATS) have allowed talent managers, recruiters, and college admissions committees to process large volumes of potential candidate applications efficiently.
Ranked #1 on Bias Detection on ICAT LLM bias
no code implementations • 9 Nov 2022 • Guangyue Xu, Parisa Kordjamshidi, Joyce Chai
This work explores the zero-shot compositional learning ability of large pre-trained vision-language models (VLMs) within the prompt-based learning framework and proposes a model (PromptCompVL) to solve the compositional zero-shot learning (CZSL) problem.
1 code implementation • 22 Oct 2022 • Yichi Zhang, Jianing Yang, Jiayi Pan, Shane Storks, Nikhil Devraj, Ziqiao Ma, Keunwoo Peter Yu, Yuwei Bao, Joyce Chai
These reactive agents are insufficient for long-horizon complex tasks.
1 code implementation • 22 Oct 2022 • Ziqiao Ma, Ben VanDerPloeg, Cristian-Paul Bara, Huang Yidong, Eui-In Kim, Felix Gervits, Matthew Marge, Joyce Chai
To this end, we introduce Dialogue On the ROad To Handle Irregular Events (DOROTHIE), a novel interactive simulation platform that enables the creation of unexpected situations on the fly to support empirical studies on situated communication with autonomous driving agents.
no code implementations • 4 May 2022 • Shane Storks, Keunwoo Peter Yu, Joyce Chai
As NLP research attracts public attention and excitement, it becomes increasingly important for it to be accessible to a broad audience.
1 code implementation • ACL 2022 • Yuwei Bao, Sayan Ghosh, Joyce Chai
The PRS attempts to learn the speaker-listener disparity and adjust the speech accordingly by adding a lightweight disparity adjustment layer into working memory on top of the speaker's long-term memory system.
1 code implementation • 23 Jan 2022 • Jiaqi Ma, Ziqiao Ma, Joyce Chai, Qiaozhu Mei
We study the problem of semi-supervised learning with Graph Neural Networks (GNNs) in an active learning setup.
1 code implementation • EMNLP 2021 • Cristian-Paul Bara, Sky CH-Wang, Joyce Chai
An ideal integration of autonomous agents in a human world implies that they are able to collaborate on human terms.
1 code implementation • Findings (EMNLP) 2021 • Shane Storks, Joyce Chai
As large-scale, pre-trained language models achieve human-level and superhuman accuracy on existing language understanding tasks, statistical bias in benchmark data and probing studies have recently called into question their true capabilities.
1 code implementation • Findings (EMNLP) 2021 • Shane Storks, Qiaozi Gao, Yichi Zhang, Joyce Chai
However, evaluations only based on end task performance shed little light on machines' true ability in language understanding and reasoning.
1 code implementation • 3 Sep 2021 • Arjun R. Akula, Keze Wang, Changsong Liu, Sari Saba-Sadiya, Hongjing Lu, Sinisa Todorovic, Joyce Chai, Song-Chun Zhu
More concretely, our CX-ToM framework generates a sequence of explanations in a dialog by mediating the differences between the minds of the machine and the human user.
1 code implementation • Findings (ACL) 2021 • Yichi Zhang, Joyce Chai
On the ALFRED benchmark for task learning, the published state-of-the-art system only achieves a task success rate of less than 10% in an unseen environment, compared to the human performance of over 90%.
no code implementations • EMNLP 2020 • Yonatan Bisk, Ari Holtzman, Jesse Thomason, Jacob Andreas, Yoshua Bengio, Joyce Chai, Mirella Lapata, Angeliki Lazaridou, Jonathan May, Aleksandr Nisnevich, Nicolas Pinto, Joseph Turian
Language understanding research is held back by a failure to relate language to the physical world it describes and to the social interactions it facilitates.
1 code implementation • EMNLP 2018 • Shaohua Yang, Qiaozi Gao, Sari Sadiya, Joyce Chai
To enable collaboration and communication between humans and agents, this paper investigates learning to acquire commonsense evidence for action justification.
no code implementations • ACL 2018 • Qiaozi Gao, Shaohua Yang, Joyce Chai, Lucy Vanderwende
Despite recent advances in knowledge representation, automated reasoning, and machine learning, artificial agents still lack the ability to understand basic action-effect relations regarding the physical world; for example, the action of cutting a cucumber most likely leads to the state where the cucumber is broken apart into smaller pieces.
no code implementations • ACL 2017 • Lanbo She, Joyce Chai
To enable human-robot communication and collaboration, previous works represent grounded verb semantics as the potential change of state to the physical world caused by these verbs.