no code implementations • EACL (HCINLP) 2021 • Nikhil Krishnaswamy, Nada Alalyani
In this paper we argue that embodied multimodal agents, i.e., avatars, can play an important role in moving natural language processing toward “deep understanding.” Fully-featured interactive agents model encounters between two “people,” but a language-only agent has little environmental and situational awareness.
no code implementations • 17 Apr 2022 • Nikhil Krishnaswamy, Sadaf Ghaffari
In this paper we present a novel method for a naive agent to detect novel objects it encounters in an interaction.
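The abstract does not spell out the mechanism, so as a generic stand-in (not the paper's method), here is a minimal sketch of one common way to flag novelty: measuring the distance of an encountered object's feature vector from prototypes of known objects. The function name, feature representation, and threshold are all illustrative assumptions.

```python
import math

def is_novel(features, known_prototypes, threshold=1.0):
    """Flag an encountered object as novel by distance to known prototypes.

    A generic distance-threshold novelty test, used here only as a
    stand-in; the paper's own method is not reproduced.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    nearest = min(dist(features, p) for p in known_prototypes)
    return nearest > threshold

known = [(1.0, 0.0, 0.2), (0.1, 0.9, 0.4)]   # prototypes of familiar objects
print(is_novel((3.0, 2.5, 0.0), known))       # -> True (far from anything known)
```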
no code implementations • 5 Dec 2020 • Nikhil Krishnaswamy, James Pustejovsky
In recent years, data-intensive AI, particularly the domain of natural language processing and understanding, has seen significant progress driven by the advent of large datasets and deep neural networks that have sidelined more classic AI approaches to the field.
no code implementations • 13 Jul 2020 • Katherine Krajovic, Nikhil Krishnaswamy, Nathaniel J. Dimick, R. Pito Salas, James Pustejovsky
We present a new interface for controlling a navigation robot in novel environments using coordinated gesture and language.
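As a rough illustration of how coordinated gesture and language might jointly resolve a navigation goal, the sketch below intersects a deictic pointing ray with the set of landmarks mentioned in an utterance. The function `resolve_goal` and the angular-error fusion rule are hypothetical, not the interface described in the paper.

```python
import math

def resolve_goal(gesture_origin, gesture_dir, command, landmarks):
    """Fuse a deictic gesture ray with a spoken command to pick a goal.

    gesture_origin, gesture_dir: 2D point and direction of the pointing ray.
    command: spoken string, e.g. "go to the red box".
    landmarks: dict mapping names to (x, y) positions.
    The fusion rule here is illustrative, not the paper's interface.
    """
    # Keep only landmarks named in the utterance, if any were named.
    mentioned = {n: p for n, p in landmarks.items() if n in command}
    candidates = mentioned or landmarks

    def angular_error(pos):
        # Angle between the pointing ray and the direction to the landmark.
        dx, dy = pos[0] - gesture_origin[0], pos[1] - gesture_origin[1]
        target = math.atan2(dy, dx)
        pointed = math.atan2(gesture_dir[1], gesture_dir[0])
        return abs((target - pointed + math.pi) % (2 * math.pi) - math.pi)

    # The gesture disambiguates among the linguistically compatible candidates.
    name = min(candidates, key=lambda n: angular_error(candidates[n]))
    return name, candidates[name]

landmarks = {"red box": (2.0, 3.0), "blue box": (-1.0, 2.0)}
print(resolve_goal((0, 0), (0.55, 0.83), "go to the box", landmarks))
```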
no code implementations • LREC 2020 • Nikhil Krishnaswamy, James Pustejovsky
In this paper, we present an analysis of computationally generated mixed-modality definite referring expressions using combinations of gesture and linguistic descriptions.
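To make the modality trade-off concrete, here is a toy decision rule for when language alone, gesture alone, or an ensemble of both yields a successful definite reference. The heuristic and all names are illustrative assumptions, not the paper's generation model.

```python
def refer(target, objects, in_cone):
    """Choose a mixed-modality definite reference for `target`.

    objects: dict id -> set of attributes, e.g. {"red", "block"}.
    in_cone: ids covered by a pointing gesture aimed at the target.
    The decision rule is an illustrative heuristic, not the paper's model.
    """
    others = [objects[i] for i in objects if i != target]
    unique = objects[target] - set().union(*others) if others else objects[target]
    if unique:
        # Some attribute singles the target out: language alone suffices.
        return f"the {min(unique)} one"
    if set(in_cone) == {target}:
        # The gesture alone is unambiguous.
        return "[points] that one"
    # Ensemble: the gesture narrows candidates to the cone, and an
    # attribute distinguishes the target within it.
    cone_others = [objects[i] for i in in_cone if i != target]
    disc = objects[target] - set().union(*cone_others) if cone_others else objects[target]
    word = min(disc) if disc else min(objects[target])
    return f"[points] the {word} one"

objects = {"a": {"red", "block"}, "b": {"red", "block"}, "c": {"blue", "block"}}
print(refer("a", objects, in_cone=["a", "c"]))   # -> "[points] the red one"
```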
no code implementations • 18 Sep 2019 • Nikhil Krishnaswamy, James Pustejovsky
We present an architecture for integrating real-time, multimodal input into a computational agent's contextual model.
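A minimal sketch of one ingredient of such an architecture: a context model that accepts timestamped events from different modalities and pairs those that fall within a temporal alignment window. The class structure and the 1.0 s window are assumptions for illustration, not the paper's architecture.

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    modality: str   # "speech" or "gesture"
    content: str
    t: float        # timestamp in seconds

@dataclass
class ContextModel:
    """Toy fusion of real-time multimodal input into a shared context."""
    window: float = 1.0
    events: list = field(default_factory=list)

    def ingest(self, ev: Event):
        """Add an event; return pairs aligned across modalities in the window."""
        self.events.append(ev)
        return [(prior, ev) for prior in self.events[:-1]
                if prior.modality != ev.modality
                and abs(prior.t - ev.t) <= self.window]

ctx = ContextModel()
ctx.ingest(Event("gesture", "point:left", 0.2))
for prior, ev in ctx.ingest(Event("speech", "that one", 0.9)):
    print("aligned:", prior.content, "+", ev.content)
```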
no code implementations • WS 2019 • Nikhil Krishnaswamy, James Pustejovsky
Referring expressions and definite descriptions of objects in space exploit information both about object characteristics and locations.
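One standard way to combine object characteristics and locations in a referring expression is a classic incremental algorithm in the spirit of Dale and Reiter (1995), shown below; it is offered as background illustration, not as the paper's model. Attributes are added in a fixed preference order until all distractors are ruled out, with location treated as just another property.

```python
def incremental_re(target, domain, preference=("type", "color", "location")):
    """Build a distinguishing description from attributes and locations.

    domain: dict id -> dict of property -> value,
            e.g. {"type": "block", "color": "red", "location": "on the table"}.
    """
    distractors = {i for i in domain if i != target}
    description = {}
    for prop in preference:
        value = domain[target].get(prop)
        ruled_out = {i for i in distractors if domain[i].get(prop) != value}
        if ruled_out:
            # The property eliminates at least one distractor: keep it.
            description[prop] = value
            distractors -= ruled_out
        if not distractors:
            break
    return description

domain = {
    "b1": {"type": "block", "color": "red", "location": "on the table"},
    "b2": {"type": "block", "color": "red", "location": "on the shelf"},
    "b3": {"type": "ball",  "color": "red", "location": "on the table"},
}
print(incremental_re("b1", domain))
# {'type': 'block', 'location': 'on the table'} -> "the block on the table"
```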
no code implementations • 5 Feb 2019 • James Pustejovsky, Nikhil Krishnaswamy
In this paper, we argue that simulation platforms enable a novel type of embodied spatial reasoning, one facilitated by a formal model of object and event semantics that renders the continuous quantitative search space of an open-world, real-time environment tractable.
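As a very rough illustration of how a qualitative constraint carves a tractable subspace out of a continuous environment, the sketch below uses rejection sampling over a bounded region. The sampling strategy and all names are assumptions; the paper's formal model is not reproduced here.

```python
import random

def satisfy(constraint, region, trials=1000, seed=0):
    """Search a continuous region for a point meeting a qualitative constraint."""
    rng = random.Random(seed)
    (x0, x1), (y0, y1) = region
    for _ in range(trials):
        p = (rng.uniform(x0, x1), rng.uniform(y0, y1))
        if constraint(p):          # the qualitative predicate prunes the space
            return p
    return None

# "left of the box at (0.5, 0.5), roughly level with it":
left_of_box = lambda p: p[0] < 0.5 and abs(p[1] - 0.5) < 0.2
print(satisfy(left_of_box, region=((0, 1), (0, 1))))
```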
no code implementations • 27 Nov 2018 • Nikhil Krishnaswamy, Scott Friedman, James Pustejovsky
We present a novel approach to introducing new spatial structures to an AI agent, combining deep learning over qualitative spatial relations with various heuristic search algorithms.
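To give a flavor of the combination described, the sketch below computes crude qualitative spatial relations between grid cells and ranks candidate placements against a goal configuration. A trained network over qualitative relations would supply the score in the paper's setting; a simple overlap count is assumed here instead, and the exhaustive heap stands in for a real search frontier.

```python
import heapq

def qsr(a, b):
    """Crude qualitative spatial relation between 2D grid cells."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    horiz = "left" if dx < 0 else "right" if dx > 0 else "aligned"
    vert = "below" if dy < 0 else "above" if dy > 0 else "level"
    return (horiz, vert)

def score(relations, goal_relations):
    """Stand-in for a learned model scoring a qualitative configuration."""
    return sum(r in goal_relations for r in relations)

def best_first_place(anchor, goal_relations, candidates):
    """Heuristic search over candidate placements for a new object."""
    heap = [(-score([qsr(anchor, c)], goal_relations), c) for c in candidates]
    heapq.heapify(heap)
    return heap[0][1]  # best-scoring placement

candidates = [(x, y) for x in range(-2, 3) for y in range(-2, 3)]
print(best_first_place((0, 0), [("right", "above")], candidates))
```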
no code implementations • COLING 2018 • James Pustejovsky, Nikhil Krishnaswamy
Most work within the computational event modeling community has tended to focus on the interpretation and ordering of events that are associated with verbs and event nominals in linguistic expressions.
no code implementations • EACL 2017 • James Pustejovsky, Nikhil Krishnaswamy
Simulation and automatic visualization of events from natural language descriptions and supplementary modalities, such as gestures, allow humans to use their native capabilities as linguistic and visual interpreters to collaborate on tasks with an artificial agent or to put semantic intuitions to the test in an environment where user and agent share a common context. In previous work (Pustejovsky and Krishnaswamy, 2014; Pustejovsky, 2013a), we introduced a method for modeling natural language expressions within a 3D simulation environment built on top of the game development platform Unity (Goldstone, 2009).
no code implementations • COLING 2016 • Nikhil Krishnaswamy, James Pustejovsky
Much existing work in text-to-scene generation focuses on generating static scenes.
no code implementations • WS 2016 • James Pustejovsky, Tuan Do, Gitit Kehat, Nikhil Krishnaswamy
Human communication is a multimodal activity, involving not only speech and written expressions, but intonation, images, gestures, visual clues, and the interpretation of actions through perception.
no code implementations • SEMEVAL 2014 • James Pustejovsky, Nikhil Krishnaswamy
The generated simulations act as a conceptual "debugger" for the semantics of different motion verbs: that is, by testing for consistency and informativeness in the model, simulations expose the presuppositions associated with linguistic expressions and their compositions.
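The "debugger" idea can be illustrated with a toy consistency check: run a verb's constraints against a simulated trajectory and report which presuppositions fail. The lexicon entries below are illustrative toy constraints, not VoxML semantics.

```python
def simulate_check(verb, obj, trajectory, lexicon):
    """Test a motion verb's constraints against a simulated trajectory.

    Failures expose what the verb presupposes but the scene leaves unstated.
    """
    entry = lexicon[verb]
    problems = []
    if not entry["object_ok"](obj):
        problems.append(f"'{verb}' presupposes {entry['object_doc']}")
    if not entry["path_ok"](trajectory):
        problems.append(f"'{verb}' presupposes {entry['path_doc']}")
    return problems or ["consistent"]

lexicon = {
    "roll": {
        "object_ok": lambda o: o.get("round", False),
        "object_doc": "a round object (it must rotate as it translates)",
        "path_ok": lambda t: t[0] != t[-1],
        "path_doc": "a path with net translation",
    },
}
cube = {"name": "cube", "round": False}
print(simulate_check("roll", cube, [(0, 0), (1, 0)], lexicon))
# -> ["'roll' presupposes a round object (it must rotate as it translates)"]
```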
no code implementations • 5 Oct 2016 • Tuan Do, Nikhil Krishnaswamy, James Pustejovsky
This paper introduces the Event Capture Annotation Tool (ECAT), a user-friendly, open-source interface tool for annotating events and their participants in video, capable of extracting the 3D positions and orientations of objects in video captured by Microsoft's Kinect® hardware.
no code implementations • LREC 2016 • James Pustejovsky, Nikhil Krishnaswamy
We present the specification for a modeling language, VoxML, which encodes semantic knowledge of real-world objects represented as three-dimensional models, and of events and attributes related to and enacted over these objects.
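As a sketch of the kind of information a VoxML object entry encodes, the data structure below paraphrases the specification's categories (lexical predicate, geometric head, concavity, habitats, affordances, embodiment). The field names approximate those categories for illustration and are not the exact VoxML attribute syntax; consult the published specification for the real structure.

```python
from dataclasses import dataclass, field

@dataclass
class Voxeme:
    """Approximation of the categories a VoxML object entry encodes."""
    pred: str                      # lexical predicate, e.g. "cup"
    head: str                      # geometric primitive of the 3D model
    concavity: str                 # e.g. "concave", "flat", "convex"
    habitats: dict = field(default_factory=dict)     # orientations enabling use
    affordances: list = field(default_factory=list)  # behaviors the object affords
    movable: bool = True                             # embodiment property

cup = Voxeme(
    pred="cup",
    head="cylindroid",
    concavity="concave",
    habitats={"upright": "concave side along +Y"},
    affordances=["grasp", "contain liquid", "lift"],
)
print(cup.pred, "affords:", ", ".join(cup.affordances))
```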
no code implementations • 3 Oct 2016 • Nikhil Krishnaswamy, James Pustejovsky
In this paper, we describe a system for generating three-dimensional visual simulations of natural language motion expressions.
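To make the task concrete, here is a toy version of the core step: mapping a motion expression to a parameterized trajectory sampled frame by frame. The verb-to-trajectory mappings are illustrative stand-ins for the operational semantics a real system would supply.

```python
def simulate_motion(expr, start, goal, frames=5):
    """Turn a toy motion expression into a sequence of 3D positions."""
    verbs = {
        "slide": lambda u: 0.0,               # stays on the ground plane
        "jump": lambda u: 4 * u * (1 - u),    # parabolic arc in height
    }
    verb = next(v for v in verbs if v in expr)
    path = []
    for i in range(frames + 1):
        u = i / frames                         # normalized time in [0, 1]
        x = start[0] + u * (goal[0] - start[0])
        z = start[2] + u * (goal[2] - start[2])
        y = verbs[verb](u)                     # height profile from the verb
        path.append((round(x, 2), round(y, 2), round(z, 2)))
    return path

print(simulate_motion("the ball jumps to the table", (0, 0, 0), (2, 0, 2)))
```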