no code implementations • 22 Jul 2024 • Song Wang, Xun Wang, Jie Mei, Yujia Xie, Sean Muarray, Zhang Li, Lingfeng Wu, Si-Qing Chen, Wayne Xiong
Hallucination, a phenomenon where large language models (LLMs) produce output that is factually incorrect or unrelated to the input, is a major challenge for LLM applications that require accuracy and dependability.
1 code implementation • 9 May 2024 • Shuo Zhang, Biao Yang, Zhang Li, Zhiyin Ma, Yuliang Liu, Xiang Bai
To further explore the capabilities of LMM in complex text tasks, we propose the DT-VQA dataset, with 170k question-answer pairs.
no code implementations • 1 Apr 2024 • Yibin Ye, Xichao Teng, Shuo Chen, Yijie Bian, Tao Tan, Zhang Li
Optical-SAR image matching is a fundamental task for image fusion and visual navigation.
1 code implementation • 7 Mar 2024 • Yuliang Liu, Biao Yang, Qiang Liu, Zhang Li, Zhiyin Ma, Shuo Zhang, Xiang Bai
We present TextMonkey, a large multimodal model (LMM) tailored for text-centric tasks.
1 code implementation • CVPR 2024 • Zhang Li, Biao Yang, Qiang Liu, Zhiyin Ma, Shuo Zhang, Jingxu Yang, Yabo Sun, Yuliang Liu, Xiang Bai
Additionally, experiments on 18 datasets further demonstrate that Monkey surpasses existing LMMs in many tasks like Image Captioning and various Visual Question Answering formats.
Ranked #13 on MMR total on MRR-Benchmark (using extra training data)
1 code implementation • 13 May 2023 • Yuliang Liu, Zhang Li, Mingxin Huang, Biao Yang, Wenwen Yu, Chunyuan Li, XuCheng Yin, Cheng-Lin Liu, Lianwen Jin, Xiang Bai
In this paper, we conducted a comprehensive evaluation of Large Multimodal Models, such as GPT4V and Gemini, in various text-related visual tasks including Text Recognition, Scene Text-Centric Visual Question Answering (VQA), Document-Oriented VQA, Key Information Extraction (KIE), and Handwritten Mathematical Expression Recognition (HMER).
no code implementations • 21 Feb 2023 • Kun Wang, Zi Wang, Zhang Li, Ang Su, Xichao Teng, Minhao Liu, Qifeng Yu
Given the rapid development of this field, this paper aims to provide a comprehensive survey of recent advances in oriented object detection.
no code implementations • 16 Jan 2023 • Yiping Jiao, Jeroen van der Laak, Shadi Albarqouni, Zhang Li, Tao Tan, Abhir Bhalerao, Jiabo Ma, Jiamei Sun, Johnathan Pocock, Josien P. W. Pluim, Navid Alemi Koohbanani, Raja Muhammad Saad Bashir, Shan E Ahmed Raza, Sibo Liu, Simon Graham, Suzanne Wetstein, Syed Ali Khurram, Thomas Watson, Nasir Rajpoot, Mitko Veta, Francesco Ciompi
Additionally, we present post-competition results where we show how the presented methods perform on an independent set of lung cancer slides, which was not part of the initial competition, as well as a comparison on lymphocyte assessment between presented methods and a panel of pathologists.
no code implementations • 23 Dec 2022 • Zi Wang, Minglin Chen, Yulan Guo, Zhang Li, Qifeng Yu
Recently, unsupervised domain adaptation in satellite pose estimation has gained increasing attention, aiming at alleviating the annotation cost for training deep models.
no code implementations • 21 Aug 2020 • Zhang Li, Jiehua Zhang, Tao Tan, Xichao Teng, Xiaoliang Sun, Yang Li, Lihong Liu, Yang Xiao, Byungjae Lee, Yilong Li, Qianni Zhang, Shujiao Sun, Yushan Zheng, Junyu Yan, Ni Li, Yiyu Hong, Junsu Ko, Hyun Jung, Yanling Liu, Yu-cheng Chen, Ching-Wei Wang, Vladimir Yurovskiy, Pavel Maevskikh, Vahid Khanagha, Yi Jiang, Xiangjun Feng, Zhihong Liu, Daiqiang Li, Peter J. Schüffler, Qifeng Yu, Hui Chen, Yuling Tang, Geert Litjens
All methods were based on deep learning and categorized into two groups: multi-model method and single model method.
no code implementations • CVPR 2020 • Banglei Guan, Ji Zhao, Zhang Li, Fang Sun, Friedrich Fraundorfer
In this paper we present four cases of minimal solutions for two-view relative pose estimation by exploiting the affine transformation between feature points and we demonstrate efficient solvers for these cases.
no code implementations • 14 Mar 2018 • Zhang Li, Zheyu Hu, Jiaolong Xu, Tao Tan, Hui Chen, Zhi Duan, Ping Liu, Jun Tang, Guoping Cai, Quchang Ouyang, Yuling Tang, Geert Litjens, Qiang Li
Aim: Early detection and correct diagnosis of lung cancer are the most important steps in improving patient outcome.
no code implementations • 10 Feb 2018 • Tao Tan, Zhang Li, Haixia Liu, Ping Liu, Wenfang Tang, Hui Li, Yue Sun, Yusheng Yan, Keyu Li, Tao Xu, Shanshan Wan, Ke Lou, Jun Xu, Huiming Ying, Quchang Ouyang, Yuling Tang, Zheyu Hu, Qiang Li
To help doctors to be more selective on biopsies and provide a second opinion on diagnosis, in this work, we propose a computer-aided diagnosis (CAD) system for lung diseases including cancers and tuberculosis (TB).