Search Results for author: Yanjie Wang

Found 12 papers, 4 papers with code

A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding

no code implementations 2 Jul 2024 Jinghui Lu, Haiyang Yu, Yanjie Wang, YongJie Ye, Jingqun Tang, Ziwei Yang, Binghong Wu, Qi Liu, Hao Feng, Han Wang, Hao liu, Can Huang

Recently, many studies have demonstrated that exclusively incorporating OCR-derived text and spatial layouts with large language models (LLMs) can be highly effective for document understanding tasks.

document understanding Key Information Extraction +5

MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering

1 code implementation 20 May 2024 Jingqun Tang, Qi Liu, YongJie Ye, Jinghui Lu, Shu Wei, Chunhui Lin, Wanqing Li, Mohamad Fitri Faiz Bin Mahmood, Hao Feng, Zhen Zhao, Yanjie Wang, Yuliang Liu, Hao liu, Xiang Bai, Can Huang

Text-Centric Visual Question Answering (TEC-VQA) in its proper format not only facilitates human-machine interaction in text-centric visual environments but also serves as a de facto gold proxy to evaluate AI models in the domain of text-centric scene understanding.

Benchmarking Question Answering +4

Elysium: Exploring Object-level Perception in Videos via MLLM

1 code implementation 25 Mar 2024 Han Wang, Yanjie Wang, YongJie Ye, Yuxiang Nie, Can Huang

Multi-modal Large Language Models (MLLMs) have demonstrated their ability to perceive objects in still images, but their application in video-related tasks, such as object tracking, remains understudied.

Object Referring Expression +4

GloTSFormer: Global Video Text Spotting Transformer

1 code implementation 8 Jan 2024 Han Wang, Yanjie Wang, Yang Li, Can Huang

In this paper, we propose a novel Global Video Text Spotting Transformer GloTSFormer to model the tracking problem as global associations and utilize the Gaussian Wasserstein distance to guide the morphological correlation between frames.

Text Spotting

Rethinking Skip Connections in Encoder-decoder Networks for Monocular Depth Estimation

no code implementations 29 Aug 2022 Zhitong Lai, Haichao Sun, Rui Tian, Nannan Ding, Zhiguo Wu, Yanjie Wang

Skip connections are fundamental units in encoder-decoder networks, which are able to improve the feature propagation of the neural networks.

Decoder Monocular Depth Estimation

Learning Oriented Remote Sensing Object Detection via Naive Geometric Computing

no code implementations 1 Dec 2021 Yanjie Wang, Xu Zou, Zhijun Zhang, Wenhui Xu, Liqun Chen, Sheng Zhong, Luxin Yan, Guodong Wang

Detecting oriented objects along with estimating their rotation information is one crucial step for analyzing remote sensing images.

object-detection Object Detection +2

Can We Predict New Facts with Open Knowledge Graph Embeddings? A Benchmark for Open Link Prediction

1 code implementation ACL 2020 Samuel Broscheit, Kiril Gashteovski, Yanjie Wang, Rainer Gemulla

An evaluation in such a setup raises the question if a correct prediction is actually a new fact that was induced by reasoning over the open knowledge graph or if it can be trivially explained.

Knowledge Graph Embeddings Link Prediction +3

A Relational Tucker Decomposition for Multi-Relational Link Prediction

no code implementations 3 Feb 2019 Yanjie Wang, Samuel Broscheit, Rainer Gemulla

We propose the Relational Tucker3 (RT) decomposition for multi-relational link prediction in knowledge graphs.

Knowledge Graph Embedding Knowledge Graphs +1

On Evaluating Embedding Models for Knowledge Base Completion

no code implementations WS 2019 Yanjie Wang, Daniel Ruffinelli, Rainer Gemulla, Samuel Broscheit, Christian Meilicke

In this paper, we explore whether recent models work well for knowledge base completion and argue that the current evaluation protocols are more suited for question answering rather than knowledge base completion.

Knowledge Base Completion Question Answering

On Multi-Relational Link Prediction with Bilinear Models

no code implementations 14 Sep 2017 Yanjie Wang, Rainer Gemulla, Hui Li

Bilinear models belong to the most basic models for this task, they are comparably efficient to train and use, and they can provide good prediction performance.

Knowledge Graph Completion Link Prediction
