Search Results for author: Siyu Zhang

Found 22 papers, 13 papers with code

Fast and Interpretable 2D Homography Decomposition: Similarity-Kernel-Similarity and Affine-Core-Affine Transformations

1 code implementation · 28 Feb 2024 · Shen Cai, Zhanhao Wu, Lingxi Guo, Jiachun Wang, Siyu Zhang, Junchi Yan, Shuhan Shen

Under the minimal $4$-point configuration, the first and the last similarity transformations in SKS are computed by two anchor points on target and source planes, respectively.

Computational Efficiency

KNVQA: A Benchmark for evaluation knowledge-based VQA

no code implementations · 21 Nov 2023 · Sirui Cheng, Siyu Zhang, Jiayi Wu, Muchen Lan

Within the multimodal field, large vision-language models (LVLMs) have made significant progress due to their strong perception and reasoning capabilities in the visual and language systems.

Hallucination, Visual Question Answering (VQA)

Multiscale Superpixel Structured Difference Graph Convolutional Network for VL Representation

no code implementations · 20 Oct 2023 · Siyu Zhang, Yeming Chen, Sirui Cheng, Yaoru Sun, Jun Yang, Lizhi Bai

It parses the entire image as a fine-to-coarse hierarchical structure of constituent visual patterns, and captures multiscale features by progressively merging adjacent superpixels as graph nodes.

Self-Supervised Learning, Superpixels, +1

Guided Cooperation in Hierarchical Reinforcement Learning via Model-based Rollout

1 code implementation · 24 Sep 2023 · Haoran Wang, Zeshen Tang, Leya Yang, Yaoru Sun, Fang Wang, Siyu Zhang, Yeming Chen

Here, we propose a goal-conditioned HRL framework named Guided Cooperation via Model-based Rollout (GCMR), aiming to bridge inter-layer information synchronization and cooperation by exploiting forward dynamics.

Hierarchical Reinforcement Learning, reinforcement-learning, +1

Artificial-Spiking Hierarchical Networks for Vision-Language Representation Learning

no code implementations · 18 Aug 2023 · Yeming Chen, Siyu Zhang, Yaoru Sun, Weijian Liang, Haoran Wang

In this work, we propose an efficient computation framework for multimodal alignment by introducing a novel visual semantic module to further improve the performance of the VL tasks.

Computational Efficiency, Contrastive Learning, +2

LittleMu: Deploying an Online Virtual Teaching Assistant via Heterogeneous Sources Integration and Chain of Teach Prompts

1 code implementation · 11 Aug 2023 · Shangqing Tu, Zheyuan Zhang, Jifan Yu, Chunyang Li, Siyu Zhang, Zijun Yao, Lei Hou, Juanzi Li

However, few MOOC platforms are providing human or virtual teaching assistants to support learning for massive online students due to the complexity of real-world online education scenarios and the lack of training data.

Language Modelling, Question Answering, +1

LOIS: Looking Out of Instance Semantics for Visual Question Answering

no code implementations · 26 Jul 2023 · Siyu Zhang, Yeming Chen, Yaoru Sun, Fang Wang, Haibo Shi, Haoran Wang

Visual question answering (VQA) has been intensively studied as a multimodal task that requires effort in bridging vision and language to infer answers correctly.

Question Answering, Visual Question Answering, +1

You Only Need Two Detectors to Achieve Multi-Modal 3D Multi-Object Tracking

1 code implementation · 18 Apr 2023 · Xiyang Wang, Chunyun Fu, JiaWei He, Mingguang Huang, Ting Meng, Siyu Zhang, Hangning Zhou, Ziyao Xu, Chi Zhang

In the classical tracking-by-detection (TBD) paradigm, detection and tracking are separately and sequentially conducted, and data association must be properly performed to achieve satisfactory tracking performance.

3D Multi-Object Tracking, Object, +3

PCAE: A Framework of Plug-in Conditional Auto-Encoder for Controllable Text Generation

1 code implementation · 7 Oct 2022 · Haoqin Tu, Zhongliang Yang, Jinshuai Yang, Siyu Zhang, Yongfeng Huang

Visualization of the local latent prior well confirms the primary devotion in hidden space of the proposed model.

Text Generation

An Efficient End-to-End 3D Voxel Reconstruction based on Neural Architecture Search

1 code implementation · 27 Feb 2022 · Yongdong Huang, Yuanzhan Li, Xulong Cao, Siyu Zhang, Shen Cai, Ting Lu, Jie Wang, Yuqi Liu

However, many previous works employ neural networks with a fixed architecture and size to represent different 3D objects, which leads to excessive network parameters for simple objects and limited reconstruction accuracy for complex objects.

Binary Classification, Neural Architecture Search, +1

High-fidelity 3D Model Compression based on Key Spheres

1 code implementation · 19 Jan 2022 · Yuanzhan Li, Yuqi Liu, Yujie Lu, Siyu Zhang, Shen Cai, Yanting Zhang

Compared to previous works, our method achieves high-fidelity, high-compression 3D object coding and reconstruction.

Model Compression, Object, +1

Provably Secure Generative Linguistic Steganography

1 code implementation · Findings (ACL) 2021 · Siyu Zhang, Zhongliang Yang, Jinshuai Yang, Yongfeng Huang

Generative linguistic steganography mainly utilized language models and applied steganographic sampling (stegosampling) to generate high-security steganographic text (stegotext).

Language Modelling, Linguistic steganography

You Don't Only Look Once: Constructing Spatial-Temporal Memory for Integrated 3D Object Detection and Tracking

no code implementations · ICCV 2021 · Jiaming Sun, Yiming Xie, Siyu Zhang, Linghao Chen, Guofeng Zhang, Hujun Bao, Xiaowei Zhou

In this work, we propose a novel system for integrated 3D object detection and tracking, which uses a dynamic object occupancy map and previous object states as spatial-temporal memory to assist object detection in future frames.

3D Object Detection, Object, +2

Unsupervised Deep Representation Learning and Few-Shot Classification of PolSAR Images

no code implementations · 27 Jun 2020 · Lamei Zhang, Siyu Zhang, Bin Zou, Hongwei Dong

To handle this problem, in this paper, learning transferrable representations from unlabeled PolSAR data through convolutional architectures is explored for the first time.

Contrastive Learning, General Classification, +3

InSphereNet: a Concise Representation and Classification Method for 3D Object

1 code implementation · 25 Dec 2019 · Hui Cao, Haikuan Du, Siyu Zhang, Shen Cai

Unlike previous methods that use points, voxels, or multi-view images as inputs to a deep neural network (DNN), the proposed method constructs a class of more representative features, named infilling spheres, from the signed distance field (SDF).

3D Object Classification, Classification, +1

Automatic Design of CNNs via Differentiable Neural Architecture Search for PolSAR Image Classification

no code implementations · 16 Nov 2019 · Hongwei Dong, Siyu Zhang, Bin Zou, Lamei Zhang

With DAS, the weight parameters and architecture parameters (which correspond to the hyperparameters but not the topologies) can be optimized by stochastic gradient descent during training.

Feature Engineering, General Classification, +2
