Search Results for author: Siyuan Qi

Found 22 papers, 11 papers with code

Configurable 3D Scene Synthesis and 2D Image Rendering with Per-Pixel Ground Truth using Stochastic Grammars

no code implementations • 1 Apr 2017 • Chenfanfu Jiang, Siyuan Qi, Yixin Zhu, Siyuan Huang, Jenny Lin, Lap-Fai Yu, Demetri Terzopoulos, Song-Chun Zhu

We propose a systematic learning-based approach to the generation of massive quantities of synthetic 3D scenes and arbitrary numbers of photorealistic 2D images thereof, with associated ground truth information, for the purposes of training, benchmarking, and diagnosing learning-based computer vision and robotics algorithms.

Benchmarking, Object, +2

Predicting Human Activities Using Stochastic Grammar

no code implementations • ICCV 2017 • Siyuan Qi, Siyuan Huang, Ping Wei, Song-Chun Zhu

This paper presents a novel method to predict future human activities from partially observed RGB-D videos.

Activity Prediction

Intent-aware Multi-agent Reinforcement Learning

no code implementations • 6 Mar 2018 • Siyuan Qi, Song-Chun Zhu

We test our algorithm on a real-world problem that is non-episodic and in which the number of agents and goals can vary over time.

Multi-agent Reinforcement Learning, reinforcement-learning, +1

Generalized Earley Parser: Bridging Symbolic Grammars and Sequence Data for Future Prediction

no code implementations • ICML 2018 • Siyuan Qi, Baoxiong Jia, Song-Chun Zhu

Future predictions on sequence data (e.g., video or audio) require algorithms to capture the non-Markovian and compositional properties of high-level semantics.

Activity Prediction, Future prediction

Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image

1 code implementation • ECCV 2018 • Siyuan Huang, Siyuan Qi, Yixin Zhu, Yinxue Xiao, Yuanlu Xu, Song-Chun Zhu

We propose a computational framework to jointly parse a single RGB image and reconstruct a holistic 3D configuration composed of a set of CAD models using a stochastic grammar model.

Ranked #4 on Monocular 3D Object Detection on SUN RGB-D (AP@0.15 (10 / PNet-30) metric)

Monocular 3D Object Detection, Object, +5

Learning Human-Object Interactions by Graph Parsing Neural Networks

1 code implementation • ECCV 2018 • Siyuan Qi, Wenguan Wang, Baoxiong Jia, Jianbing Shen, Song-Chun Zhu

For a given scene, GPNN infers a parse graph that includes i) the HOI graph structure represented by an adjacency matrix, and ii) the node labels.

Ranked #32 on Human-Object Interaction Detection on V-COCO

Human-Object Interaction Detection, Object

Human-centric Indoor Scene Synthesis Using Stochastic Grammar

1 code implementation • CVPR 2018 • Siyuan Qi, Yixin Zhu, Siyuan Huang, Chenfanfu Jiang, Song-Chun Zhu

We present a human-centric method to sample and synthesize 3D room layouts and 2D images thereof, to obtain large-scale 2D/3D image data with perfect per-pixel ground truth.

Indoor Scene Synthesis

Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation

1 code implementation • NeurIPS 2018 • Siyuan Huang, Siyuan Qi, Yinxue Xiao, Yixin Zhu, Ying Nian Wu, Song-Chun Zhu

Holistic 3D indoor scene understanding refers to jointly recovering the i) object bounding boxes, ii) room layout, and iii) camera pose, all in 3D.

Ranked #5 on Monocular 3D Object Detection on SUN RGB-D

Monocular 3D Object Detection, Object, +4

Reasoning Visual Dialogs with Structural and Partial Observations

1 code implementation • CVPR 2019 • Zilong Zheng, Wenguan Wang, Siyuan Qi, Song-Chun Zhu

The answer to a given question is represented by a node with a missing value.

Ranked #14 on Visual Dialog on VisDial v0.9 val

Visual Dialog

Holistic++ Scene Understanding: Single-view 3D Holistic Scene Parsing and Human Pose Estimation with Human-Object Interaction and Physical Commonsense

no code implementations • ICCV 2019 • Yixin Chen, Siyuan Huang, Tao Yuan, Siyuan Qi, Yixin Zhu, Song-Chun Zhu

We propose a new 3D holistic++ scene understanding problem, which jointly tackles two tasks from a single-view image: (i) holistic scene parsing and reconstruction (3D estimations of object bounding boxes, camera pose, and room layout), and (ii) 3D human pose estimation.

3D Human Pose Estimation, Human-Object Interaction Detection, +1

Theory-based Causal Transfer: Integrating Instance-level Induction and Abstract-level Structure Learning

no code implementations • 25 Nov 2019 • Mark Edmonds, Xiaojian Ma, Siyuan Qi, Yixin Zhu, Hongjing Lu, Song-Chun Zhu

Given these general theories, the goal is to train an agent by interactively exploring the problem space to (i) discover, form, and transfer useful abstract and structural knowledge, and (ii) induce useful knowledge from the instance-level attributes observed in the environment.

Reinforcement Learning (RL), Transfer Learning

PerspectiveNet: 3D Object Detection from a Single RGB Image via Perspective Points

no code implementations • NeurIPS 2019 • Siyuan Huang, Yixin Chen, Tao Yuan, Siyuan Qi, Yixin Zhu, Song-Chun Zhu

Detecting 3D objects from a single RGB image is intrinsically ambiguous, thus requiring appropriate prior knowledge and intermediate representations as constraints to reduce the uncertainty and improve the consistency between the 2D image plane and the 3D world coordinates.

Ranked #2 on Monocular 3D Object Detection on SUN RGB-D (AP@0.15 (10 / PNet-30) metric)

Monocular 3D Object Detection, Object, +1

Learning Compositional Neural Information Fusion for Human Parsing

1 code implementation • ICCV 2019 • Wenguan Wang, Zhijie Zhang, Siyuan Qi, Jianbing Shen, Yanwei Pang, Ling Shao

The bottom-up and top-down inferences explicitly model the compositional and decompositional relations in human bodies, respectively.

Human Parsing

Cascaded Human-Object Interaction Recognition

1 code implementation • CVPR 2020 • Tianfei Zhou, Wenguan Wang, Siyuan Qi, Haibin Ling, Jianbing Shen

The interaction recognition network has two crucial parts: a relation ranking module for high-quality HOI proposal selection and a triple-stream classifier for relation prediction.

Human-Object Interaction Detection, Object, +1

Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike Common Sense

no code implementations • 20 Apr 2020 • Yixin Zhu, Tao Gao, Lifeng Fan, Siyuan Huang, Mark Edmonds, Hangxin Liu, Feng Gao, Chi Zhang, Siyuan Qi, Ying Nian Wu, Joshua B. Tenenbaum, Song-Chun Zhu

We demonstrate the power of this perspective to develop cognitive AI systems with humanlike common sense by showing how to observe and apply FPICU with little training data to solve a wide range of challenging tasks, including tool use, planning, utility inference, and social learning.

Common Sense Reasoning, Small Data Image Classification

Heterogeneous Value Alignment Evaluation for Large Language Models

2 code implementations • 26 May 2023 • Zhaowei Zhang, Ceyao Zhang, Nian Liu, Siyuan Qi, Ziqi Rong, Song-Chun Zhu, Shuguang Cui, Yaodong Yang

We conduct evaluations with a new auto-metric, value rationality, to represent the ability of LLMs to align with specific values.

Attribute

E²VPT: An Effective and Efficient Approach for Visual Prompt Tuning

1 code implementation • ICCV 2023 • Cheng Han, Qifan Wang, Yiming Cui, Zhiwen Cao, Wenguan Wang, Siyuan Qi, Dongfang Liu

Specifically, we introduce a set of learnable key-value prompts and visual prompts into self-attention and input layers, respectively, to improve the effectiveness of model fine-tuning.

Visual Prompt Tuning

CodeApex: A Bilingual Programming Evaluation Benchmark for Large Language Models

1 code implementation • 5 Sep 2023 • Lingyue Fu, Huacan Chai, Shuang Luo, Kounianhua Du, Weiming Zhang, Longteng Fan, Jiayi Lei, Renting Rui, Jianghao Lin, Yuchen Fang, Yifan Liu, Jingkuan Wang, Siyuan Qi, Kangning Zhang, Weinan Zhang, Yong Yu

With the emergence of Large Language Models (LLMs), there has been a significant improvement in the programming capabilities of models, attracting growing attention from researchers.

Code Generation, Multiple-choice

Avalon's Game of Thoughts: Battle Against Deception through Recursive Contemplation

no code implementations • 2 Oct 2023 • Shenzhi Wang, Chang Liu, Zilong Zheng, Siyuan Qi, Shuo Chen, Qisen Yang, Andrew Zhao, Chaofei Wang, Shiji Song, Gao Huang

This study utilizes the intricate Avalon game as a testbed to explore LLMs' potential in deceptive environments.

Misinformation

CivRealm: A Learning and Reasoning Odyssey in Civilization for Decision-Making Agents

1 code implementation • 19 Jan 2024 • Siyuan Qi, Shuo Chen, Yexin Li, Xiangyu Kong, Junqi Wang, Bangcheng Yang, Pring Wong, Yifan Zhong, Xiaoyuan Zhang, Zhaowei Zhang, Nian Liu, Wei Wang, Yaodong Yang, Song-Chun Zhu

Within CivRealm, we provide interfaces for two typical agent types: tensor-based agents that focus on learning, and language-based agents that emphasize reasoning.

Decision Making

Facing the Elephant in the Room: Visual Prompt Tuning or Full Finetuning?

no code implementations • 23 Jan 2024 • Cheng Han, Qifan Wang, Yiming Cui, Wenguan Wang, Lifu Huang, Siyuan Qi, Dongfang Liu

As the scale of vision models continues to grow, the emergence of Visual Prompt Tuning (VPT) as a parameter-efficient transfer learning technique has gained attention due to its superior performance compared to traditional full-finetuning.

Transfer Learning, Visual Prompt Tuning

Panacea: Pareto Alignment via Preference Adaptation for LLMs

no code implementations • 3 Feb 2024 • Yifan Zhong, Chengdong Ma, Xiaoyuan Zhang, Ziran Yang, Qingfu Zhang, Siyuan Qi, Yaodong Yang

Our work marks a step forward in effectively and efficiently aligning models to diverse and intricate human preferences in a controllable and Pareto-optimal manner.

Language Modelling, Large Language Model
