Search Results for author: Zilong Zheng

Found 31 papers, 15 papers with code

SHARP: Search-Based Adversarial Attack for Structured Prediction

no code implementations Findings (NAACL) 2022 Liwen Zhang, Zixia Jia, Wenjuan Han, Zilong Zheng, Kewei Tu

Adversarial attack of structured prediction models faces various challenges such as the difficulty of perturbing discrete words, the sentence quality issue, and the sensitivity of outputs to small perturbations.

Adversarial Attack Dependency Parsing +4

LSTP: Language-guided Spatial-Temporal Prompt Learning for Long-form Video-Text Understanding

1 code implementation25 Feb 2024 Yuxuan Wang, Yueqian Wang, Pengfei Wu, Jianxin Liang, Dongyan Zhao, Zilong Zheng

Despite progress in video-language modeling, the computational challenge of interpreting long-form videos in response to task-specific linguistic queries persists, largely due to the complexity of high-dimensional video data and the misalignment between language and visual cues over space and time.

Computational Efficiency Language Modelling +3

Semi-automatic Data Enhancement for Document-Level Relation Extraction with Distant Supervision from Large Language Models

1 code implementation13 Nov 2023 Junpeng Li, Zixia Jia, Zilong Zheng

Document-level Relation Extraction (DocRE), which aims to extract relations from a long context, is a critical challenge in achieving fine-grained structural comprehension and generating interpretable document representations.

Document-level Relation Extraction In-Context Learning +4

JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models

no code implementations10 Nov 2023 ZiHao Wang, Shaofei Cai, Anji Liu, Yonggang Jin, Jinbing Hou, Bowei Zhang, Haowei Lin, Zhaofeng He, Zilong Zheng, Yaodong Yang, Xiaojian Ma, Yitao Liang

Achieving human-like planning and control with multimodal observations in an open world is a key milestone for more functional generalist agents.

LooGLE: Can Long-Context Language Models Understand Long Contexts?

1 code implementation8 Nov 2023 Jiaqi Li, Mengmeng Wang, Zilong Zheng, Muhan Zhang

In this paper, we present LooGLE, a Long Context Generic Language Evaluation benchmark for LLMs' long context understanding.

In-Context Learning Long-Context Understanding +1

MindAgent: Emergent Gaming Interaction

no code implementations18 Sep 2023 Ran Gong, Qiuyuan Huang, Xiaojian Ma, Hoi Vo, Zane Durante, Yusuke Noda, Zilong Zheng, Song-Chun Zhu, Demetri Terzopoulos, Li Fei-Fei, Jianfeng Gao

Large Language Models (LLMs) have the capacity of performing complex scheduling in a multi-agent system and can coordinate these agents into completing sophisticated tasks that require extensive collaboration.

In-Context Learning Scheduling

MindDial: Belief Dynamics Tracking with Theory-of-Mind Modeling for Situated Neural Dialogue Generation

no code implementations27 Jun 2023 Shuwen Qiu, Song-Chun Zhu, Zilong Zheng

We design an explicit mind module that can track three-level beliefs -- the speaker's belief, the speaker's prediction of the listener's belief, and the common belief based on the gap between the first two.

Dialogue Generation Theory of Mind Modeling

MoviePuzzle: Visual Narrative Reasoning through Multimodal Order Learning

no code implementations4 Jun 2023 Jianghui Wang, Yuxuan Wang, Dongyan Zhao, Zilong Zheng

We introduce MoviePuzzle, a novel challenge that targets visual narrative reasoning and holistic movie understanding.

Benchmarking Contrastive Learning +1

VSTAR: A Video-grounded Dialogue Dataset for Situated Semantic Understanding with Scene and Topic Transitions

1 code implementation30 May 2023 Yuxuan Wang, Zilong Zheng, Xueliang Zhao, Jinpeng Li, Yueqian Wang, Dongyan Zhao

Video-grounded dialogue understanding is a challenging problem that requires machine to perceive, parse and reason over situated semantics extracted from weakly aligned video and dialogues.

Dialogue Generation Dialogue Understanding +2

Shuo Wen Jie Zi: Rethinking Dictionaries and Glyphs for Chinese Language Pre-training

1 code implementation30 May 2023 Yuxuan Wang, Jianghui Wang, Dongyan Zhao, Zilong Zheng

We introduce CDBERT, a new learning paradigm that enhances the semantics understanding ability of the Chinese PLMs with dictionary knowledge and structure of Chinese characters.

Contrastive Learning

Large Language Models are In-Context Semantic Reasoners rather than Symbolic Reasoners

1 code implementation24 May 2023 Xiaojuan Tang, Zilong Zheng, Jiaqi Li, Fanxu Meng, Song-Chun Zhu, Yitao Liang, Muhan Zhang

On the whole, our analysis provides a novel perspective on the role of semantics in developing and evaluating language models' reasoning abilities.

Modeling Instance Interactions for Joint Information Extraction with Neural High-Order Conditional Random Field

1 code implementation17 Dec 2022 Zixia Jia, Zhaohui Yan, Wenjuan Han, Zilong Zheng, Kewei Tu

Prior works on joint Information Extraction (IE) typically model instance (e. g., event triggers, entities, roles, relations) interactions by representation enhancement, type dependencies scoring, or global decoding.

Variational Inference

SQA3D: Situated Question Answering in 3D Scenes

1 code implementation14 Oct 2022 Xiaojian Ma, Silong Yong, Zilong Zheng, Qing Li, Yitao Liang, Song-Chun Zhu, Siyuan Huang

We propose a new task to benchmark scene understanding of embodied agents: Situated Question Answering in 3D Scenes (SQA3D).

Question Answering Referring Expression +1

VGStore: A Multimodal Extension to SPARQL for Querying RDF Scene Graph

1 code implementation7 Sep 2022 Yanzeng Li, Zilong Zheng, Wenjuan Han, Lei Zou

Semantic Web technology has successfully facilitated many RDF models with rich data representation methods.

Relational Reasoning Semantic Similarity +1

Energy-Based Generative Cooperative Saliency Prediction

1 code implementation25 Jun 2021 Jing Zhang, Jianwen Xie, Zilong Zheng, Nick Barnes

In this paper, to model the uncertainty of visual saliency, we study the saliency prediction problem from the perspective of generative models by learning a conditional probability distribution over the saliency map given an input image, and treating the saliency prediction as a sampling process from the learned distribution.

Saliency Prediction

Patchwise Generative ConvNet: Training Energy-Based Models From a Single Natural Image for Internal Learning

no code implementations CVPR 2021 Zilong Zheng, Jianwen Xie, Ping Li

Exploiting internal statistics of a single natural image has long been recognized as a significant research paradigm where the goal is to learn the distribution of patches within the image without relying on external training data.

Descriptive Image Generation +1

Learning Triadic Belief Dynamics in Nonverbal Communication from Videos

1 code implementation CVPR 2021 Lifeng Fan, Shuwen Qiu, Zilong Zheng, Tao Gao, Song-Chun Zhu, Yixin Zhu

By aggregating different beliefs and true world states, our model essentially forms "five minds" during the interactions between two agents.

Scene Understanding

Learning Cycle-Consistent Cooperative Networks via Alternating MCMC Teaching for Unsupervised Cross-Domain Translation

no code implementations7 Mar 2021 Jianwen Xie, Zilong Zheng, Xiaolin Fang, Song-Chun Zhu, Ying Nian Wu

This paper studies the unsupervised cross-domain translation problem by proposing a generative framework, in which the probability distribution of each domain is represented by a generative cooperative network that consists of an energy-based model and a latent variable model.

Translation Unsupervised Image-To-Image Translation

Learning Energy-Based Model with Variational Auto-Encoder as Amortized Sampler

no code implementations29 Dec 2020 Jianwen Xie, Zilong Zheng, Ping Li

In this paper, we propose to learn a variational auto-encoder (VAE) to initialize the finite-step MCMC, such as Langevin dynamics that is derived from the energy function, for efficient amortized sampling of the EBM.

Joint Inference of States, Robot Knowledge, and Human (False-)Beliefs

no code implementations25 Apr 2020 Tao Yuan, Hangxin Liu, Lifeng Fan, Zilong Zheng, Tao Gao, Yixin Zhu, Song-Chun Zhu

Aiming to understand how human (false-)belief--a core socio-cognitive ability--would affect human interactions with robots, this paper proposes to adopt a graphical model to unify the representation of object states, robot knowledge, and human (false-)beliefs.

Object Object Tracking

Generative PointNet: Deep Energy-Based Learning on Unordered Point Sets for 3D Generation, Reconstruction and Classification

1 code implementation CVPR 2021 Jianwen Xie, Yifei Xu, Zilong Zheng, Song-Chun Zhu, Ying Nian Wu

We propose a generative model of unordered point sets, such as point clouds, in the form of an energy-based model, where the energy function is parameterized by an input-permutation-invariant bottom-up neural network.

3D Generation General Classification +3

Cooperative Training of Fast Thinking Initializer and Slow Thinking Solver for Conditional Learning

no code implementations7 Feb 2019 Jianwen Xie, Zilong Zheng, Xiaolin Fang, Song-Chun Zhu, Ying Nian Wu

This paper studies the problem of learning the conditional distribution of a high-dimensional output given an input, where the output and input may belong to two different domains, e. g., the output is a photo image and the input is a sketch image.

Image-to-Image Translation

Learning Dynamic Generator Model by Alternating Back-Propagation Through Time

no code implementations27 Dec 2018 Jianwen Xie, Ruiqi Gao, Zilong Zheng, Song-Chun Zhu, Ying Nian Wu

The non-linear transformation of this transition model can be parametrized by a feedforward neural network.

Learning Descriptor Networks for 3D Shape Synthesis and Analysis

1 code implementation CVPR 2018 Jianwen Xie, Zilong Zheng, Ruiqi Gao, Wenguan Wang, Song-Chun Zhu, Ying Nian Wu

This paper proposes a 3D shape descriptor network, which is a deep convolutional energy-based model, for modeling volumetric shape patterns.

Object

Cannot find the paper you are looking for? You can Submit a new open access paper.