Search Results for author: Pan Lu

Found 26 papers, 15 papers with code

Towards Socially Intelligent Agents with Mental State Transition and Human Value

no code implementations SIGDIAL (ACL) 2022 Liang Qiu, Yizhou Zhao, Yuan Liang, Pan Lu, Weiyan Shi, Zhou Yu, Song-Chun Zhu

One of these is to track the agent's mental state transitions and teach the agent to make decisions guided by its values, as a human would.

TheoremQA: A Theorem-driven Question Answering dataset

1 code implementation 21 May 2023 Wenhu Chen, Ming Yin, Max Ku, Pan Lu, Yixin Wan, Xueguang Ma, Jianyu Xu, Xinyi Wang, Tony Xia

We evaluate a wide spectrum of 16 large language and code models with different prompting strategies like Chain-of-Thoughts and Program-of-Thoughts.

GSM8K • Question Answering
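For context, the two prompting strategies named in the abstract differ in whether the model reasons in natural language or emits executable code. A minimal sketch, assuming hypothetical prompt templates (these are illustrative, not the exact prompts used in the TheoremQA evaluation):

```python
# Illustrative contrast between Chain-of-Thought (CoT) and
# Program-of-Thought (PoT) prompting. Templates are hypothetical.

def cot_prompt(question: str) -> str:
    # CoT: ask the model to reason step by step in natural language.
    return (
        f"Question: {question}\n"
        "Let's think step by step, then give the final answer."
    )

def pot_prompt(question: str) -> str:
    # PoT: ask the model to write a program whose execution result
    # is taken as the answer, instead of trusting free-form text.
    return (
        f"Question: {question}\n"
        "Write a Python program that computes the answer "
        "and stores it in a variable named `answer`."
    )

def run_pot(program: str) -> object:
    # Execute a PoT-style completion and read back `answer`.
    scope: dict = {}
    exec(program, scope)  # in practice this should be sandboxed
    return scope["answer"]

if __name__ == "__main__":
    q = "What is the sum of the first 10 positive integers?"
    print(cot_prompt(q))
    # A model's PoT completion might look like:
    print(run_pot("answer = sum(range(1, 11))"))  # 55
```

The practical difference is that PoT offloads the arithmetic to an interpreter, so the model only has to get the program right, not the computation.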

Multimodal Procedural Planning via Dual Text-Image Prompting

1 code implementation 2 May 2023 Yujie Lu, Pan Lu, Zhiyu Chen, Wanrong Zhu, Xin Eric Wang, William Yang Wang

The key challenges of MPP are to ensure the informativeness, temporal coherence, and accuracy of plans across modalities.

Informativeness • Text-to-Image Generation

LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model

2 code implementations 28 Apr 2023 Peng Gao, Jiaming Han, Renrui Zhang, Ziyi Lin, Shijie Geng, Aojun Zhou, Wei Zhang, Pan Lu, Conghui He, Xiangyu Yue, Hongsheng Li, Yu Qiao

This strategy effectively alleviates the interference between the two tasks of image-text alignment and instruction following and achieves strong multi-modal reasoning with only a small-scale image-text and instruction dataset.

Instruction Following • Optical Character Recognition (OCR)

Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models

1 code implementation19 Apr 2023 Pan Lu, Baolin Peng, Hao Cheng, Michel Galley, Kai-Wei Chang, Ying Nian Wu, Song-Chun Zhu, Jianfeng Gao

Chameleon synthesizes programs by composing various tools (e.g., LLMs, off-the-shelf vision models, web search engines, Python functions, and heuristic-based modules) to accomplish complex reasoning tasks.

Logical Reasoning • Mathematical Reasoning
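The tool-composition idea in the abstract can be sketched as a planner choosing a sequence of tools that each transform a shared context. A toy sketch in the spirit of plug-and-play composition; the tool names, stub behaviors, and list-based planner below are assumptions (the actual system uses an LLM planner and real tool backends):

```python
# Toy plug-and-play tool composition: each "tool" reads from and
# writes to a shared context dict, and a plan is just an ordered
# list of tool names. All tools here are illustrative stubs.
from typing import Callable, Dict, List

Tool = Callable[[dict], dict]

def web_search(ctx: dict) -> dict:
    # Stand-in for a web search engine call.
    ctx["evidence"] = f"results for: {ctx['query']}"
    return ctx

def calculator(ctx: dict) -> dict:
    # Stand-in for a Python-function tool; eval is illustrative only.
    ctx["answer"] = eval(ctx["expression"])
    return ctx

TOOLS: Dict[str, Tool] = {"web_search": web_search, "calculator": calculator}

def execute_plan(plan: List[str], ctx: dict) -> dict:
    # Run the composed tools in order, threading the context through.
    for name in plan:
        ctx = TOOLS[name](ctx)
    return ctx

result = execute_plan(
    ["web_search", "calculator"],
    {"query": "speed of light in m/s", "expression": "3e8 * 2"},
)
print(result["answer"])  # 600000000.0
```

The appeal of this structure is that adding a new capability means registering one more entry in the tool table, not retraining anything.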

A Survey of Deep Learning for Mathematical Reasoning

1 code implementation 20 Dec 2022 Pan Lu, Liang Qiu, Wenhao Yu, Sean Welleck, Kai-Wei Chang

Mathematical reasoning is a fundamental aspect of human intelligence and is applicable in various fields, including science, engineering, finance, and everyday life.

Mathematical Reasoning

UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression

2 code implementations 6 Dec 2022 Jiaqi Chen, Tong Li, Jinghui Qin, Pan Lu, Liang Lin, Chongyu Chen, Xiaodan Liang

Naturally, we also present a unified multi-task Geometric Transformer framework, Geoformer, that tackles calculation and proving problems simultaneously as sequence generation; the unified formulation improves reasoning ability on both tasks.

Logical Reasoning • Mathematical Reasoning

Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning

1 code implementation 29 Sep 2022 Pan Lu, Liang Qiu, Kai-Wei Chang, Ying Nian Wu, Song-Chun Zhu, Tanmay Rajpurohit, Peter Clark, Ashwin Kalyan

However, it is unknown if the models can handle more complex problems that involve math reasoning over heterogeneous information, such as tabular data.

Logical Reasoning • Mathematical Reasoning

Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering

1 code implementation 20 Sep 2022 Pan Lu, Swaroop Mishra, Tony Xia, Liang Qiu, Kai-Wei Chang, Song-Chun Zhu, Oyvind Tafjord, Peter Clark, Ashwin Kalyan

We further design language models to learn to generate lectures and explanations as the chain of thought (CoT) to mimic the multi-hop reasoning process when answering ScienceQA questions.

Multimodal Deep Learning • Multiple-choice +4
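The "lecture and explanation as chain of thought" target described above can be sketched as a simple output formatter. A minimal sketch, assuming hypothetical field names and an answer-then-explain layout (an assumption for illustration, not the dataset's exact schema):

```python
# Sketch of a CoT-style generation target in the spirit of the
# ScienceQA paper: the model learns to emit the answer followed by
# background (lecture) and question-specific reasoning (explanation).
# Field names and the template below are illustrative.

def format_cot_target(answer: str, lecture: str, explanation: str) -> str:
    # Pair the answer with the lecture and explanation so the model
    # is trained to justify its choice, mimicking multi-hop reasoning.
    return (
        f"The answer is {answer}.\n"
        f"BECAUSE: {lecture} {explanation}"
    )

target = format_cot_target(
    answer="(B) evaporation",
    lecture="Evaporation turns liquid water into water vapor.",
    explanation="The puddle shrinks because water evaporates.",
)
print(target)
```

Training on such targets turns explanation generation into ordinary sequence prediction, so no special architecture is needed for the reasoning chain.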

Triangular Character Animation Sampling with Motion, Emotion, and Relation

no code implementations 9 Mar 2022 Yizhou Zhao, Liang Qiu, Wensi Ai, Pan Lu, Song-Chun Zhu

We propose a Spatial-Temporal And-Or graph (ST-AOG), a stochastic grammar model, to encode the contextual relationship between motion, emotion, and relation, forming a triangle in a conditional random field.

ValueNet: A New Dataset for Human Value Driven Dialogue System

no code implementations 12 Dec 2021 Liang Qiu, Yizhou Zhao, Jinchao Li, Pan Lu, Baolin Peng, Jianfeng Gao, Song-Chun Zhu

To the best of our knowledge, ValueNet is the first large-scale text dataset for human value modeling, and we are the first to incorporate a value model into emotionally intelligent dialogue systems.

Dialogue Generation • Emotion Recognition +2

Learning from the Tangram to Solve Mini Visual Tasks

1 code implementation 12 Dec 2021 Yizhou Zhao, Liang Qiu, Pan Lu, Feng Shi, Tian Han, Song-Chun Zhu

Current pre-training methods in computer vision focus on natural images in the daily-life context.

Few-Shot Learning

IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning

1 code implementation 25 Oct 2021 Pan Lu, Liang Qiu, Jiaqi Chen, Tony Xia, Yizhou Zhao, Wei Zhang, Zhou Yu, Xiaodan Liang, Song-Chun Zhu

Also, we develop a strong IconQA baseline Patch-TRM that applies a pyramid cross-modal Transformer with input diagram embeddings pre-trained on the icon dataset.

Arithmetic Reasoning • Mathematical Reasoning +3

SocAoG: Incremental Graph Parsing for Social Relation Inference in Dialogues

no code implementations ACL 2021 Liang Qiu, Yuan Liang, Yizhou Zhao, Pan Lu, Baolin Peng, Zhou Yu, Ying Nian Wu, Song-Chun Zhu

Inferring social relations from dialogues is vital for building emotionally intelligent robots to interpret human language better and act accordingly.

Dialog Relation Extraction

Towards Socially Intelligent Agents with Mental State Transition and Human Utility

no code implementations 12 Mar 2021 Liang Qiu, Yizhou Zhao, Yuan Liang, Pan Lu, Weiyan Shi, Zhou Yu, Song-Chun Zhu

One of these is to track the agent's mental state transitions and teach the agent to make decisions guided by its values, as a human would.

Learning Long- and Short-Term User Literal-Preference with Multimodal Hierarchical Transformer Network for Personalized Image Caption

no code implementations AAAI Conference on Artificial Intelligence (AAAI 2020) 2020 Wei Zhang, Yue Ying, Pan Lu, Hongyuan Zha

Personalized image captioning, a natural extension of the standard image captioning task, requires generating brief image descriptions tailored to each user's writing style and traits, and is more practical for meeting users' real demands.

Image Captioning

Knowledge Aware Semantic Concept Expansion for Image-Text Matching

no code implementations International Joint Conferences on Artificial Intelligence (IJCAI) 2019 Botian Shi, Lei Ji, Pan Lu, Zhendong Niu, Nan Duan

In this paper, we develop a Scene Concept Graph (SCG) by aggregating image scene graphs and extracting frequently co-occurred concept pairs as scene common-sense knowledge.

Common Sense Reasoning • Content-Based Image Retrieval +2

Dynamic Fusion with Intra- and Inter- Modality Attention Flow for Visual Question Answering

no code implementations 13 Dec 2018 Gao Peng, Zhengkai Jiang, Haoxuan You, Pan Lu, Steven Hoi, Xiaogang Wang, Hongsheng Li

It can robustly capture the high-level interactions between the language and vision domains, thus significantly improving the performance of visual question answering.

Question Answering • Visual Question Answering

Question-Guided Hybrid Convolution for Visual Question Answering

no code implementations ECCV 2018 Peng Gao, Pan Lu, Hongsheng Li, Shuang Li, Yikang Li, Steven Hoi, Xiaogang Wang

Most state-of-the-art VQA methods fuse high-level textual and visual features from the neural network and abandon visual spatial information when learning multi-modal features. To address these problems, question-guided kernels generated from the input question are convolved with visual features to capture the textual-visual relationship at an early stage.

Question Answering • Visual Question Answering
