Search Results for author: Jiarui Zhang

Found 16 papers, 5 papers with code

Visual Cropping Improves Zero-Shot Question Answering of Multimodal Large Language Models

1 code implementation • 24 Oct 2023 • Jiarui Zhang, Mahyar Khayatkhoei, Prateek Chhikara, Filip Ilievski

In particular, we show that their zero-shot accuracy in answering visual questions is very sensitive to the size of the visual subject of the question, declining by up to $46\%$ as that size decreases.

Question Answering • Visual Question Answering
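
The cropping step named in the title is simple to prototype. Below is a rough, hypothetical sketch: crop the image around the region containing the visual subject (with a margin) and query the multimodal model on the zoomed view. The bounding box and the downstream VQA call are placeholders, not the paper's actual pipeline; only the PIL cropping is concrete.

# Hypothetical illustration of visual cropping for VQA; `bbox` and the
# commented-out model call are placeholders, not the paper's implementation.
from PIL import Image

def crop_around_subject(image_path, bbox, margin=0.2):
    """Crop the image around a (left, top, right, bottom) box, keeping a margin."""
    img = Image.open(image_path)
    left, top, right, bottom = bbox
    w, h = right - left, bottom - top
    box = (
        int(max(0, left - margin * w)),
        int(max(0, top - margin * h)),
        int(min(img.width, right + margin * w)),
        int(min(img.height, bottom + margin * h)),
    )
    return img.crop(box)

# The zoomed view is then passed to the multimodal model alongside (or instead of)
# the full image, e.g.:
#   answer = vqa_model.answer(crop_around_subject("scene.jpg", sign_bbox),
#                             "What colour is the sign?")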

PINN-based viscosity solution of HJB equation

no code implementations • 18 Sep 2023 • Tianyu Liu, Steven Ding, Jiarui Zhang, Liutao Zhou

This paper proposes a novel PINN-based viscosity solution for HJB equations.
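
The snippet above only names the approach. As a rough illustration of what "PINN-based" means here, the sketch below fits a small network to the HJB residual of a toy 1-D linear-quadratic problem (dynamics x' = u, running cost x^2 + u^2, terminal cost x(T)^2, which gives the residual V_t - V_x^2/4 + x^2 = 0). The problem choice, network, and training loop are assumptions for illustration, not the paper's formulation.

# Hypothetical PINN sketch for a toy 1-D HJB equation (not the paper's setup):
# dynamics x' = u, running cost x^2 + u^2, terminal cost x(T)^2,
# so the HJB residual is  V_t - V_x^2 / 4 + x^2 = 0  with  V(T, x) = x^2.
import torch
import torch.nn as nn

value_net = nn.Sequential(
    nn.Linear(2, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1),
)
opt = torch.optim.Adam(value_net.parameters(), lr=1e-3)
T = 1.0

for step in range(2000):
    # interior collocation points in (t, x)
    t = (torch.rand(256, 1) * T).requires_grad_(True)
    x = (torch.rand(256, 1) * 4 - 2).requires_grad_(True)
    V = value_net(torch.cat([t, x], dim=1))
    V_t = torch.autograd.grad(V.sum(), t, create_graph=True)[0]
    V_x = torch.autograd.grad(V.sum(), x, create_graph=True)[0]
    pde_loss = ((V_t - V_x ** 2 / 4 + x ** 2) ** 2).mean()

    # terminal condition V(T, x) = x^2
    xT = torch.rand(256, 1) * 4 - 2
    VT = value_net(torch.cat([torch.full_like(xT, T), xT], dim=1))
    term_loss = ((VT - xT ** 2) ** 2).mean()

    loss = pde_loss + term_loss
    opt.zero_grad(); loss.backward(); opt.step()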

Differentially private sliced inverse regression in the federated paradigm

no code implementations • 10 Jun 2023 • Shuaida He, Jiarui Zhang, Xin Chen

Sliced inverse regression (SIR), which includes linear discriminant analysis (LDA) as a special case, is a popular and powerful dimension reduction tool.

Dimensionality Reduction • Regression
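
For readers unfamiliar with SIR itself, here is a rough numpy sketch of the classical algorithm: standardize the predictors, slice the observations by the sorted response, and take the top eigenvectors of the weighted covariance of the slice means. This is the plain, single-site version for illustration only, not the paper's differentially private or federated procedure.

# Rough sketch of classical sliced inverse regression (SIR);
# non-private, non-federated, for illustration only.
import numpy as np

def sir_directions(X, y, n_slices=10, n_dirs=2):
    """Estimate dimension-reduction directions with classical SIR."""
    n, p = X.shape
    mu = X.mean(axis=0)
    L = np.linalg.cholesky(np.cov(X, rowvar=False))
    A = np.linalg.inv(L).T              # whitening map: cov((X - mu) @ A) = I
    Z = (X - mu) @ A

    # slice the data by the sorted response and average Z within each slice
    order = np.argsort(y)
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)

    # top eigenvectors of the between-slice covariance, mapped back to X scale
    _, eigvecs = np.linalg.eigh(M)
    eta = eigvecs[:, ::-1][:, :n_dirs]
    return A @ eta                      # columns span the estimated subspace

# Usage: B = sir_directions(X, y); X_reduced = (X - X.mean(axis=0)) @ B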

A Study of Situational Reasoning for Traffic Understanding

1 code implementation • 5 Jun 2023 • Jiarui Zhang, Filip Ilievski, Kaixin Ma, Aravinda Kollaa, Jonathan Francis, Alessandro Oltramari

Intelligent Traffic Monitoring (ITMo) technologies hold the potential for improving road safety/security and for enabling smart city infrastructure.

Decision Making • Knowledge Graphs • +2

Using Visual Cropping to Enhance Fine-Detail Question Answering of BLIP-Family Models

no code implementations • 31 May 2023 • Jiarui Zhang, Mahyar Khayatkhoei, Prateek Chhikara, Filip Ilievski

As our initial analysis of BLIP-family models revealed difficulty with answering fine-detail questions, we investigate the following question: Can visual cropping be employed to improve the performance of state-of-the-art visual question answering models on fine-detail questions?

Question Answering • Visual Question Answering

Knowledge-enhanced Agents for Interactive Text Games

no code implementations • 8 May 2023 • Prateek Chhikara, Jiarui Zhang, Filip Ilievski, Jonathan Francis, Kaixin Ma

Communication via natural language is a crucial aspect of intelligence, and it requires computational models to learn and reason about world concepts, with varying levels of supervision.

Instruction Following • Knowledge Graphs • +4

Deformable Model-Driven Neural Rendering for High-Fidelity 3D Reconstruction of Human Heads Under Low-View Settings

2 code implementations • ICCV 2023 • Baixin Xu, Jiarui Zhang, Kwan-Yee Lin, Chen Qian, Ying He

To address this, we propose geometry decomposition and adopt a two-stage, coarse-to-fine training strategy, allowing for progressively capturing high-frequency geometric details.

3D Reconstruction • Neural Rendering • +1

Utilizing Background Knowledge for Robust Reasoning over Traffic Situations

1 code implementation • 4 Dec 2022 • Jiarui Zhang, Filip Ilievski, Aravinda Kollaa, Jonathan Francis, Kaixin Ma, Alessandro Oltramari

Understanding novel situations in the traffic domain requires an intricate combination of domain-specific and causal commonsense knowledge.

Knowledge Graphs • Multiple-choice • +2

An Empirical Investigation of Commonsense Self-Supervision with Knowledge Graphs

no code implementations • 21 May 2022 • Jiarui Zhang, Filip Ilievski, Kaixin Ma, Jonathan Francis, Alessandro Oltramari

In this paper, we study the effect of knowledge sampling strategies and sizes that can be used to generate synthetic data for adapting language models.

Knowledge Graphs

CCPM: A Chinese Classical Poetry Matching Dataset

1 code implementation • 3 Jun 2021 • Wenhao Li, Fanchao Qi, Maosong Sun, Xiaoyuan Yi, Jiarui Zhang

We hope this dataset can further enhance the study on incorporating deep semantics into the understanding and generation system of Chinese classical poetry.

Translation

Multiview 2D/3D Rigid Registration via a Point-Of-Interest Network for Tracking and Triangulation ($\text{POINT}^2$)

no code implementations • 10 Mar 2019 • Haofu Liao, Wei-An Lin, Jiarui Zhang, Jingdan Zhang, Jiebo Luo, S. Kevin Zhou

As the POI tracker is shift-invariant, $\text{POINT}^2$ is more robust to the initial pose of the 3D pre-intervention image.
