Search Results for author: Fan Shi

Found 7 papers, 3 papers with code

Towards Generative Abstract Reasoning: Completing Raven's Progressive Matrix via Rule Abstraction and Selection

no code implementations · 18 Jan 2024 · Fan Shi, Bin Li, Xiangyang Xue

In the odd-one-out task and two held-out configurations, RAISE can leverage acquired latent concepts and atomic rules to find the rule-breaking image in a matrix and handle problems with unseen combinations of rules and attributes.

Answer Generation, Attribute

RefineNet: Enhancing Text-to-Image Conversion with High-Resolution and Detail Accuracy through Hierarchical Transformers and Progressive Refinement

no code implementations · 27 Dec 2023 · Fan Shi

In this research, we introduce RefineNet, a novel architecture designed to address resolution limitations in text-to-image conversion systems.

Computational Efficiency, Image Generation

Abstracting Concept-Changing Rules for Solving Raven's Progressive Matrix Problems

1 code implementation · 15 Jul 2023 · Fan Shi, Bin Li, Xiangyang Xue

Finally, we conduct experiments to illustrate the interpretability of CRAB in concept learning, answer selection, and global rule abstraction.

Answer Generation, Answer Selection

High-order Spatial Interactions Enhanced Lightweight Model for Optical Remote Sensing Image-based Small Ship Detection

no code implementations · 7 Apr 2023 · Yifan Yin, Xu Cheng, Fan Shi, Xiufeng Liu, Huan Huo, ShengYong Chen

Accurate and reliable optical remote sensing image-based small-ship detection is crucial for maritime surveillance systems, but existing methods often struggle with balancing detection performance and computational complexity.

Object Detection, Small Object Detection

Compositional Law Parsing with Latent Random Functions

1 code implementation · 15 Sep 2022 · Fan Shi, Bin Li, Xiangyang Xue

The automatic parsing of these laws indicates the model's ability to understand the scene, which makes law parsing play a central role in many visual tasks.

Position, Visual Reasoning

ViDA-MAN: Visual Dialog with Digital Humans

no code implementations · 26 Oct 2021 · Tong Shen, Jiawei Zuo, Fan Shi, Jin Zhang, Liqin Jiang, Meng Chen, Zhengchen Zhang, Wei Zhang, Xiaodong He, Tao Mei

We demonstrate ViDA-MAN, a digital-human agent for multi-modal interaction, which offers real-time audio-visual responses to instant speech inquiries.

Speech Recognition

Raven's Progressive Matrices Completion with Latent Gaussian Process Priors

2 code implementations · 22 Mar 2021 · Fan Shi, Bin Li, Xiangyang Xue

In this paper we aim to solve the latter one by proposing a deep latent variable model, in which multiple Gaussian processes are employed as priors of latent variables to separately learn underlying abstract concepts from RPMs; thus the proposed model is interpretable in terms of concept-specific latent variables.

Answer Selection, Gaussian Processes
