Search Results for author: Jiayuan Mao

Found 45 papers, 8 papers with code

Acquisition of Localization Confidence for Accurate Object Detection

4 code implementations ECCV 2018 Borui Jiang, Ruixuan Luo, Jiayuan Mao, Tete Xiao, Yuning Jiang

The network acquires a localization confidence, which improves the NMS procedure by preserving accurately localized bounding boxes.

General Classification · Object +3
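
A minimal sketch of the localization-confidence-guided NMS idea described above: candidate boxes are suppressed greedily, but ranked by a predicted localization confidence instead of the classification score. The boxes, confidences, and threshold below are hypothetical, and the code illustrates the general mechanism rather than the paper's exact procedure.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all in (x1, y1, x2, y2) format."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)

def loc_confidence_nms(boxes, loc_conf, thresh=0.5):
    """Greedy NMS that ranks candidates by predicted localization confidence."""
    order = np.argsort(-loc_conf)                 # best-localized boxes first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        overlaps = iou(boxes[i], boxes[order[1:]])
        order = order[1:][overlaps <= thresh]     # drop boxes that overlap the kept one
    return keep

boxes    = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=float)
loc_conf = np.array([0.6, 0.9, 0.8])              # hypothetical localization confidences
print(loc_confidence_nms(boxes, loc_conf))        # [1, 2]: the better-localized of the overlapping pair survives
```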

Neural Logic Machines

2 code implementations ICLR 2019 Honghua Dong, Jiayuan Mao, Tian Lin, Chong Wang, Lihong Li, Denny Zhou

We propose the Neural Logic Machine (NLM), a neural-symbolic architecture for both inductive learning and logic reasoning.

Decision Making · Inductive logic programming +1

Learning Visually-Grounded Semantics from Contrastive Adversarial Samples

1 code implementation COLING 2018 Haoyue Shi, Jiayuan Mao, Tete Xiao, Yuning Jiang, Jian Sun

Beginning with an insightful adversarial attack on VSE embeddings, we show the limitations of current frameworks and image-text datasets (e.g., MS-COCO) both quantitatively and qualitatively.

Adversarial Attack · Image Captioning

Visual Concept-Metaconcept Learning

1 code implementation NeurIPS 2019 Chi Han, Jiayuan Mao, Chuang Gan, Joshua B. Tenenbaum, Jiajun Wu

Humans reason with concepts and metaconcepts: we recognize red and green from visual input; we also understand that they describe the same property of objects (i.e., the color).

What's Left? Concept Grounding with Logic-Enhanced Foundation Models

1 code implementation 24 Oct 2023 Joy Hsu, Jiayuan Mao, Joshua B. Tenenbaum, Jiajun Wu

We propose the Logic-Enhanced Foundation Model (LEFT), a unified framework that learns to ground and reason with concepts across domains with a differentiable, domain-independent, first-order logic-based program executor.

Visual Reasoning
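
As a toy illustration of what a differentiable first-order executor computes (not LEFT's actual implementation), concepts can be treated as per-object scores in [0, 1], logical conjunction as multiplication, and existential quantification as a maximum over objects. All names and values below are hypothetical.

```python
import numpy as np

# Toy soft first-order logic executor (illustration only, not LEFT itself).
red         = np.array([0.9, 0.1, 0.8])    # P(object_i is red), hypothetical scores
left_of_box = np.array([0.2, 0.95, 0.7])   # P(object_i is left of the box)

conjunction  = red * left_of_box           # red(x) AND left_of(x, box), per object
exists_score = conjunction.max()           # exists x. red(x) AND left_of(x, box)
print(round(float(exists_score), 2))       # 0.56
```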

What Planning Problems Can A Relational Neural Network Solve?

1 code implementation NeurIPS 2023 Jiayuan Mao, Tomás Lozano-Pérez, Joshua B. Tenenbaum, Leslie Pack Kaelbling

Goal-conditioned policies are generally understood to be "feed-forward" circuits, in the form of neural networks that map from the current state and the goal specification to the next action to take.
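
For concreteness, a minimal sketch of such a feed-forward goal-conditioned policy (all dimensions are hypothetical placeholders; this is an illustration of the framing, not the circuit construction analyzed in the paper):

```python
import torch
import torch.nn as nn

class GoalConditionedPolicy(nn.Module):
    """Feed-forward policy: concatenate state and goal, map to action logits."""
    def __init__(self, state_dim=16, goal_dim=8, n_actions=6, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + goal_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state, goal):
        # Logits over the next action to take, given the current state and the goal.
        return self.net(torch.cat([state, goal], dim=-1))
```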

What Can Help Pedestrian Detection?

no code implementations CVPR 2017 Jiayuan Mao, Tete Xiao, Yuning Jiang, Zhimin Cao

Aggregating extra features has been considered an effective approach to boost traditional pedestrian detection methods.

Pedestrian Detection

Neural Phrase-to-Phrase Machine Translation

no code implementations 6 Nov 2018 Jiangtao Feng, Lingpeng Kong, Po-Sen Huang, Chong Wang, Da Huang, Jiayuan Mao, Kan Qiao, Dengyong Zhou

We also design an efficient dynamic programming algorithm to decode segments that allows the model to be trained faster than the existing neural phrase-based machine translation method by Huang et al. (2018).

Machine Translation · Translation
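
As an illustration of segment-level dynamic-programming decoding in general (not the paper's specific recursion over its phrase model), the sketch below finds the best segmentation of a token sequence under a hypothetical segment_score function:

```python
def decode_segments(tokens, segment_score, max_len=6):
    """Viterbi-style DP over segmentations: best[i] is the best score of tokens[:i]."""
    n = len(tokens)
    best = [0.0] + [float("-inf")] * n
    back = [0] * (n + 1)
    for i in range(1, n + 1):
        for j in range(max(0, i - max_len), i):
            s = best[j] + segment_score(tokens[j:i])
            if s > best[i]:
                best[i], back[i] = s, j
    # Recover the segmentation by following the backpointers.
    segments, i = [], n
    while i > 0:
        segments.append(tokens[back[i]:i])
        i = back[i]
    return best[n], segments[::-1]

# Example with a toy scorer that prefers segments found in a small phrase table.
phrases = {("new", "york"), ("in",), ("i", "live")}
score, segs = decode_segments(["i", "live", "in", "new", "york"],
                              lambda seg: 1.0 if tuple(seg) in phrases else -1.0)
print(score, segs)   # 3.0 [['i', 'live'], ['in'], ['new', 'york']]
```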

Explicit Recall for Efficient Exploration

no code implementations ICLR 2019 Honghua Dong, Jiayuan Mao, Xinyue Cui, Lihong Li

In this paper, we advocate the use of explicit memory for efficient exploration in reinforcement learning.

Decision Making · Efficient Exploration +2

Visually Grounded Neural Syntax Acquisition

no code implementations ACL 2019 Haoyue Shi, Jiayuan Mao, Kevin Gimpel, Karen Livescu

We define concreteness of constituents by their matching scores with images, and use it to guide the parsing of text.

Visual Grounding

Neurally-Guided Structure Inference

no code implementations 17 Jun 2019 Sidi Lu, Jiayuan Mao, Joshua B. Tenenbaum, Jiajun Wu

In this paper, we propose a hybrid inference algorithm, the Neurally-Guided Structure Inference (NG-SI), keeping the advantages of both search-based and data-driven methods.

Program-Guided Image Manipulators

no code implementations ICCV 2019 Jiayuan Mao, Xiuming Zhang, Yikai Li, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu

Humans are capable of building holistic representations for images at various levels, from local objects, to pairwise relations, to global structures.

Image Inpainting

Temporal and Object Quantification Nets

no code implementations 1 Jan 2021 Jiayuan Mao, Zhezheng Luo, Chuang Gan, Joshua B. Tenenbaum, Jiajun Wu, Leslie Pack Kaelbling, Tomer Ullman

We aim to learn generalizable representations for complex activities by quantifying over both entities and time, as in “the kicker is behind all the other players,” or “the player controls the ball until it moves toward the goal.” Such a structural inductive bias of object relations, object quantification, and temporal orders will enable the learned representation to generalize to situations with varying numbers of agents, objects, and time courses.

Event Detection · Inductive Bias +1
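
A toy illustration of what quantifying over objects and time looks like (hypothetical data; not the TOQ-Net architecture itself):

```python
import numpy as np

# behind[t, k] is True iff the kicker is behind player k at time step t,
# for 4 time steps and 3 other players (hypothetical data).
behind = np.array([[True,  True,  False],
                   [True,  True,  True],
                   [True,  True,  True],
                   [False, True,  True]])

behind_all = behind.all(axis=1)        # object quantification: "behind ALL other players"
print(behind_all)                      # [False  True  True False]
print(behind_all.any())                # temporal quantification: holds at SOME time step

def until(p, q):
    """Strong 'until': q eventually holds, and p holds at every step before q first holds."""
    if not q.any():
        return False
    first_q = int(np.argmax(q))
    return bool(p[:first_q].all())

controls          = np.array([True, True, True, False])    # "the player controls the ball"
moves_toward_goal = np.array([False, False, True, True])   # "it moves toward the goal"
print(until(controls, moves_toward_goal))                  # True
```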

Multi-Plane Program Induction with 3D Box Priors

no code implementations NeurIPS 2020 Yikai Li, Jiayuan Mao, Xiuming Zhang, William T. Freeman, Joshua B. Tenenbaum, Noah Snavely, Jiajun Wu

We consider two important aspects in understanding and editing images: modeling regular, program-like texture or patterns in 2D planes, and 3D posing of these planes in the scene.

Program induction · Program Synthesis

Object-Centric Diagnosis of Visual Reasoning

no code implementations 21 Dec 2020 Jianwei Yang, Jiayuan Mao, Jiajun Wu, Devi Parikh, David D. Cox, Joshua B. Tenenbaum, Chuang Gan

In contrast, symbolic and modular models have relatively better grounding and robustness, though at the cost of accuracy.

Object · Question Answering +2

Hierarchical Motion Understanding via Motion Programs

no code implementations CVPR 2021 Sumith Kulal, Jiayuan Mao, Alex Aiken, Jiajun Wu

We posit that adding higher-level motion primitives, which capture natural coarser units of motion such as a backswing or follow-through, can improve downstream analysis tasks.

Video Editing · Video Prediction

Temporal and Object Quantification Networks

no code implementations 10 Jun 2021 Jiayuan Mao, Zhezheng Luo, Chuang Gan, Joshua B. Tenenbaum, Jiajun Wu, Leslie Pack Kaelbling, Tomer D. Ullman

We present Temporal and Object Quantification Networks (TOQ-Nets), a new class of neuro-symbolic networks with a structural bias that enables them to learn to recognize complex relational-temporal events.

Object · Temporal Sequences

Efficient Training and Inference of Hypergraph Reasoning Networks

no code implementations 29 Sep 2021 Guangxuan Xiao, Leslie Pack Kaelbling, Jiajun Wu, Jiayuan Mao

To leverage the sparsity in hypergraph neural networks, SpaLoc represents the grounding of relationships such as parent and grandparent as sparse tensors and uses neural networks and finite-domain quantification operations to infer new facts based on the input.

Knowledge Graphs · Logical Reasoning +1
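
A toy illustration of the representation (hypothetical facts; not the SpaLoc network itself): a binary relation over a finite domain stored as a sparse tensor, with grandparent derived from parent by composing the relation, i.e., an existential quantification over the intermediate entity.

```python
import numpy as np
from scipy.sparse import csr_matrix

N = 5                                             # number of entities (hypothetical)
parent_pairs = [(0, 1), (1, 2), (1, 3), (3, 4)]   # parent(x, y) facts
rows, cols = zip(*parent_pairs)
parent = csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(N, N))

# grandparent(x, z) = exists y. parent(x, y) and parent(y, z)
grandparent = (parent @ parent) > 0
print([(int(i), int(j)) for i, j in zip(*grandparent.nonzero())])   # [(0, 2), (0, 3), (1, 4)]
```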

Learning Rational Skills for Planning from Demonstrations and Instructions

no code implementations 29 Sep 2021 Zhezheng Luo, Jiayuan Mao, Jiajun Wu, Tomás Lozano-Pérez, Joshua B. Tenenbaum, Leslie Pack Kaelbling

We present a framework for learning compositional, rational skill models (RatSkills) that support efficient planning and inverse planning for achieving novel goals and recognizing activities.

On the Expressiveness and Learning of Relational Neural Networks on Hypergraphs

no code implementations 29 Sep 2021 Zhezheng Luo, Jiayuan Mao, Joshua B. Tenenbaum, Leslie Pack Kaelbling

Our first contribution is a fine-grained analysis of the expressiveness of these neural networks, that is, the set of functions that they can realize and the set of problems that they can solve.

Grammar-Based Grounded Lexicon Learning

no code implementations NeurIPS 2021 Jiayuan Mao, Haoyue Shi, Jiajun Wu, Roger P. Levy, Joshua B. Tenenbaum

We present Grammar-Based Grounded Lexicon Learning (G2L2), a lexicalist approach toward learning a compositional and grounded meaning representation of language from grounded data, such as paired images and texts.

Network Embedding · Sentence +1

FALCON: Fast Visual Concept Learning by Integrating Images, Linguistic descriptions, and Conceptual Relations

no code implementations ICLR 2022 Lingjie Mei, Jiayuan Mao, Ziqi Wang, Chuang Gan, Joshua B. Tenenbaum

We present a meta-learning framework for learning new visual concepts quickly, from just one or a few examples, guided by multiple naturally occurring data streams: simultaneously looking at images, reading sentences that describe the objects in the scene, and interpreting supplemental sentences that relate the novel concept with other concepts.

Meta-Learning · Novel Concepts +1

Programmatic Concept Learning for Human Motion Description and Synthesis

no code implementations CVPR 2022 Sumith Kulal, Jiayuan Mao, Alex Aiken, Jiajun Wu

We introduce Programmatic Motion Concepts, a hierarchical motion representation for human actions that captures both low-level motion and high-level description as motion concepts.

Translating a Visual LEGO Manual to a Machine-Executable Plan

no code implementations 25 Jul 2022 Ruocheng Wang, Yunzhi Zhang, Jiayuan Mao, Chin-Yi Cheng, Jiajun Wu

We study the problem of translating an image-based, step-by-step assembly manual created by human designers into machine-interpretable instructions.

3D Pose Estimation · Keypoint Detection

PDSketch: Integrated Planning Domain Programming and Learning

no code implementations 9 Mar 2023 Jiayuan Mao, Tomás Lozano-Pérez, Joshua B. Tenenbaum, Leslie Pack Kaelbling

This paper studies a model learning and online planning approach towards building flexible and general robots.

Sparse and Local Networks for Hypergraph Reasoning

no code implementations 9 Mar 2023 Guangxuan Xiao, Leslie Pack Kaelbling, Jiajun Wu, Jiayuan Mao

Reasoning about the relationships between entities from input facts (e.g., whether Ari is a grandparent of Charlie) generally requires explicit consideration of other entities that are not mentioned in the query (e.g., the parents of Charlie).

Knowledge Graphs · World Knowledge

On the Expressiveness and Generalization of Hypergraph Neural Networks

no code implementations 9 Mar 2023 Zhezheng Luo, Jiayuan Mao, Joshua B. Tenenbaum, Leslie Pack Kaelbling

Next, we analyze the learning properties of these neural networks, especially focusing on how they can be trained on a finite set of small graphs and generalize to larger graphs, which we term structural generalization.

Learning Rational Subgoals from Demonstrations and Instructions

no code implementations 9 Mar 2023 Zhezheng Luo, Jiayuan Mao, Jiajun Wu, Tomás Lozano-Pérez, Joshua B. Tenenbaum, Leslie Pack Kaelbling

We present a framework for learning useful subgoals that support efficient long-term planning to achieve novel goals.

Programmatically Grounded, Compositionally Generalizable Robotic Manipulation

no code implementations 26 Apr 2023 Renhao Wang, Jiayuan Mao, Joy Hsu, Hang Zhao, Jiajun Wu, Yang Gao

Robots operating in the real world require both rich manipulation skills and the ability to semantically reason about when to apply those skills.

Imitation Learning

Compositional Diffusion-Based Continuous Constraint Solvers

no code implementations 2 Sep 2023 Zhutian Yang, Jiayuan Mao, Yilun Du, Jiajun Wu, Joshua B. Tenenbaum, Tomás Lozano-Pérez, Leslie Pack Kaelbling

This paper introduces an approach for learning to solve continuous constraint satisfaction problems (CCSP) in robotic reasoning and planning.

CLEVRER-Humans: Describing Physical and Causal Events the Human Way

no code implementations 5 Oct 2023 Jiayuan Mao, Xuelin Yang, Xikun Zhang, Noah D. Goodman, Jiajun Wu

First, there is a lack of diversity in both event types and natural language descriptions; second, causal relationships based on manually-defined heuristics are different from human judgments.

Causal Judgment · Data Augmentation +1

HandMeThat: Human-Robot Communication in Physical and Social Environments

no code implementations 5 Oct 2023 Yanming Wan, Jiayuan Mao, Joshua B. Tenenbaum

We introduce HandMeThat, a benchmark for a holistic evaluation of instruction understanding and following in physical and social environments.

Learning to Act from Actionless Videos through Dense Correspondences

no code implementations 12 Oct 2023 Po-Chen Ko, Jiayuan Mao, Yilun Du, Shao-Hua Sun, Joshua B. Tenenbaum

In this work, we present an approach to construct a video-based robot policy capable of reliably executing diverse tasks across different robots and environments from a few video demonstrations, without using any action annotations.

Learning Reusable Manipulation Strategies

no code implementations 6 Nov 2023 Jiayuan Mao, Joshua B. Tenenbaum, Tomás Lozano-Pérez, Leslie Pack Kaelbling

Humans demonstrate an impressive ability to acquire and generalize manipulation "tricks."

Object

Learning adaptive planning representations with natural language guidance

no code implementations 13 Dec 2023 Lionel Wong, Jiayuan Mao, Pratyusha Sharma, Zachary S. Siegel, Jiahai Feng, Noa Korneev, Joshua B. Tenenbaum, Jacob Andreas

Effective planning in the real world requires not only world knowledge, but also the ability to leverage that knowledge to build the right representation of the task at hand.

Decision Making · World Knowledge

Grounding Language Plans in Demonstrations Through Counterfactual Perturbations

no code implementations 25 Mar 2024 Yanwei Wang, Tsun-Hsuan Wang, Jiayuan Mao, Michael Hagenow, Julie Shah

Grounding the common-sense reasoning of Large Language Models in physical domains remains a pivotal yet unsolved problem for embodied AI.

Common Sense Reasoning · counterfactual +2
