Search Results for author: Sinan Tan

Found 11 papers, 7 papers with code

Multi-Agent Embodied Question Answering in Interactive Environments

no code implementations ECCV 2020 Sinan Tan, Weilai Xiang, Huaping Liu, Di Guo, Fuchun Sun

We investigate a new AI task, Multi-Agent Interactive Question Answering, where several agents explore the scene jointly in interactive environments to answer a question.

3D Reconstruction, Embodied Question Answering

A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation

1 code implementation 2 Oct 2024 Liang Chen, Sinan Tan, Zefan Cai, Weichu Xie, Haozhe Zhao, Yichi Zhang, Junyang Lin, Jinze Bai, Tianyu Liu, Baobao Chang

This work tackles the information loss bottleneck of vector-quantization (VQ) autoregressive image generation by introducing a novel model architecture called the 2-Dimensional Autoregression (DnD) Transformer.

Image Generation, Quantization

OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models

1 code implementation 8 Dec 2022 Jinze Bai, Rui Men, Hao Yang, Xuancheng Ren, Kai Dang, Yichang Zhang, Xiaohuan Zhou, Peng Wang, Sinan Tan, An Yang, Zeyu Cui, Yu Han, Shuai Bai, Wenbin Ge, Jianxin Ma, Junyang Lin, Jingren Zhou, Chang Zhou

As a starting point, we provide presets of 7 different modalities and 23 highly-diverse example tasks in OFASys, with which we also develop a first-in-kind, single model, OFA+, that can handle text, image, speech, video, and motion data.

Multi-Task Learning

Mixed Neural Voxels for Fast Multi-view Video Synthesis

1 code implementation ICCV 2023 Feng Wang, Sinan Tan, Xinghang Li, Zeyue Tian, Yafei Song, Huaping Liu

In this paper, we present a novel method named MixVoxels to better represent the dynamic scenes with fast training speed and competitive rendering qualities.

Self-supervised 3D Semantic Representation Learning for Vision-and-Language Navigation

no code implementations 26 Jan 2022 Sinan Tan, Mengmeng Ge, Di Guo, Huaping Liu, Fuchun Sun

In the Vision-and-Language Navigation task, the embodied agent follows linguistic instructions and navigates to a specific goal.

Representation Learning, Test unseen

An Automated Question-Answering Framework Based on Evolution Algorithm

no code implementations 26 Jan 2022 Sinan Tan, Hui Xue, Qiyu Ren, Huaping Liu, Jing Bai

Our framework is based on an innovative evolution algorithm, which is stable and suitable for multiple-dataset scenarios.

Question Answering

Towards Embodied Scene Description

no code implementations 30 Apr 2020 Sinan Tan, Huaping Liu, Di Guo, Xin-Yu Zhang, Fuchun Sun

Embodiment is an important characteristic for all intelligent agents (creatures and robots), while existing scene description tasks mainly focus on analyzing images passively and the semantic understanding of the scenario is separated from the interaction between the agent and the environment.

Imitation Learning, reinforcement-learning
