1 code implementation • 12 Dec 2024 • Han Wang, Yuxiang Nie, YongJie Ye, Deng GuanYu, Yanjie Wang, Shuai Li, Haiyang Yu, Jinghui Lu, Can Huang
The application of Large Vision-Language Models (LVLMs) for analyzing images and videos is an exciting and rapidly evolving field.
no code implementations • 23 Oct 2024 • Rui Yang, Boming Yang, Aosong Feng, Sixun Ouyang, Moritz Blum, Tianwei She, Yuang Jiang, Freddy Lecue, Jinghui Lu, Irene Li
Using the Graphusion-constructed KG, we achieve a significant improvement on the benchmark, for example, a 9.2% accuracy improvement on sub-graph completion.
1 code implementation • 15 Jul 2024 • Rui Yang, Boming Yang, Sixun Ouyang, Tianwei She, Aosong Feng, Yuang Jiang, Freddy Lecue, Jinghui Lu, Irene Li
Specifically, we introduce TutorQA, a new expert-verified benchmark for graph reasoning and QA, comprising six tasks and a total of 1,200 QA pairs.
1 code implementation • 2 Jul 2024 • Jinghui Lu, Haiyang Yu, Yanjie Wang, YongJie Ye, Jingqun Tang, Ziwei Yang, Binghong Wu, Qi Liu, Hao Feng, Han Wang, Hao liu, Can Huang
Recently, many studies have demonstrated that exclusively incorporating OCR-derived text and spatial layouts with large language models (LLMs) can be highly effective for document understanding tasks.
no code implementations • 22 May 2024 • Zhaojun Guo, Jinghui Lu, Xuejing Liu, Rui Zhao, Zhenxing Qian, Fei Tan
Despite the notable advancements achieved by tuning pre-trained vision-language (VL) models with few-shot data for downstream tasks, our detailed empirical study shows that few-shot learning outcomes depend significantly on the careful selection of training examples, a facet previously overlooked in research.
1 code implementation • 20 May 2024 • Jingqun Tang, Qi Liu, YongJie Ye, Jinghui Lu, Shu Wei, Chunhui Lin, Wanqing Li, Mohamad Fitri Faiz Bin Mahmood, Hao Feng, Zhen Zhao, Yanjie Wang, Yuliang Liu, Hao liu, Xiang Bai, Can Huang
Text-Centric Visual Question Answering (TEC-VQA) in its proper format not only facilitates human-machine interaction in text-centric visual environments but also serves as a de facto gold proxy to evaluate AI models in the domain of text-centric scene understanding.
1 code implementation • 22 Feb 2024 • Rui Yang, Boming Yang, Sixun Ouyang, Tianwei She, Aosong Feng, Yuang Jiang, Freddy Lecue, Jinghui Lu, Irene Li
We assess LLMs' zero-shot performance in creating domain-specific concept graphs and introduce TutorQA, a new expert-verified NLP-focused benchmark for scientific graph reasoning and QA.
1 code implementation • 12 Feb 2024 • Dongsheng Zhu, Xunzhu Tang, Weidong Han, Jinghui Lu, Yukun Zhao, Guoliang Xing, Junfeng Wang, Dawei Yin
This paper presents VisLingInstruct, a novel approach to advancing Multi-Modal Language Models (MMLMs) in zero-shot learning.
1 code implementation • 7 Feb 2024 • Jinghui Lu, Ziwei Yang, Yanjie Wang, Xuejing Liu, Brian Mac Namee, Can Huang
In this study, we aim to reduce generation latency for Named Entity Recognition (NER) with Large Language Models (LLMs).
no code implementations • 13 Nov 2023 • Xuejing Liu, Wei Tang, Xinzhe Ni, Jinghui Lu, Rui Zhao, Zechao Li, Fei Tan
This pipeline achieved superior performance compared to the majority of existing Multimodal Large Language Models (MLLMs) on four text-rich VQA datasets.
1 code implementation • 21 Aug 2023 • Fan Gao, Hang Jiang, Rui Yang, Qingcheng Zeng, Jinghui Lu, Moritz Blum, Dairui Liu, Tianwei She, Yuang Jiang, Irene Li
Educational materials such as survey articles in specialized fields like computer science traditionally require tremendous expert inputs and are therefore expensive to create and update.
no code implementations • 19 Aug 2023 • Hao Feng, Zijian Wang, Jingqun Tang, Jinghui Lu, Wengang Zhou, Houqiang Li, Can Huang
However, existing advanced algorithms are limited in effectively utilizing the immense representation capabilities and rich world knowledge inherent in these large pre-trained models, and the beneficial connections among tasks in text-rich scenarios have not been sufficiently explored.
1 code implementation • 29 May 2023 • Xuejing Liu, Wei Tang, Jinghui Lu, Rui Zhao, Zhaojun Guo, Fei Tan
Recent advancements in multimodal foundation models (e.g., CLIP) have excelled in zero-shot generalization.
1 code implementation • 27 Nov 2022 • Jinghui Lu, Rui Zhao, Brian Mac Namee, Fei Tan
In this work, we present a ``versatile'' model -- the Prompting-based Unified NER system (PUnifiedNER) -- that works with data from different domains and can recognise up to 37 entity types simultaneously, with the potential to scale to many more.
1 code implementation • 8 Oct 2022 • Dongsheng Zhu, Zhenyu Mao, Jinghui Lu, Rui Zhao, Fei Tan
Contrastive learning has recently achieved compelling performance in unsupervised sentence representation.
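As background for the contrastive objective referenced above, a minimal sketch of in-batch InfoNCE over sentence embeddings; this is the standard formulation (as in SimCSE-style training), not necessarily the paper's exact implementation, and the function names are illustrative.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv)

def info_nce_loss(views_a, views_b, temperature=0.05):
    """In-batch InfoNCE: views_a[i] and views_b[i] are two views of sentence i;
    every other row of views_b serves as a negative for row i."""
    losses = []
    for i, a in enumerate(views_a):
        logits = [cosine(a, b) / temperature for b in views_b]
        log_denom = math.log(sum(math.exp(l) for l in logits))
        losses.append(log_denom - logits[i])  # -log softmax of the positive pair
    return sum(losses) / len(losses)
```

The loss pulls matched views together and pushes apart all other sentences in the batch; the temperature controls how sharply hard negatives are weighted.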
1 code implementation • 30 Sep 2022 • Jinghui Lu, Dongsheng Zhu, Weidong Han, Rui Zhao, Brian Mac Namee, Fei Tan
Current methods for prompt learning in zero-shot scenarios widely rely on a development set with sufficient human-annotated data to select the best-performing prompt template a posteriori.
1 code implementation • ACL 2022 • Jinghui Lu, Linyi Yang, Brian Mac Namee, Yue Zhang
We present a novel rationale-centric framework with human-in-the-loop -- Rationales-centric Double-robustness Learning (RDL) -- to boost model out-of-distribution performance in few-shot learning scenarios.
1 code implementation • 12 Jun 2021 • Jinghui Lu, Maeve Henchion, Ivan Bacher, Brian Mac Namee
While deep learning language models such as the recently emerged BERT can achieve reasonably good performance in document classification with few labelled instances, there is little evidence of the utility of BERT-like models for long document classification.
no code implementations • LREC 2020 • Jinghui Lu, Maeve Henchion, Brian Mac Namee
Jensen-Shannon divergence (JSD) is a distribution similarity measurement widely used in natural language processing.
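As a reminder of the measure named above, a minimal sketch of JSD for discrete distributions, using the base-2 logarithm so values lie in [0, 1]; function names are illustrative, not drawn from the paper.

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) in bits for aligned discrete probability lists."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    """Jensen-Shannon divergence: symmetrised KL against the mixture M = (P + Q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

Unlike KL divergence, JSD is symmetric and always finite, which is why it is a convenient similarity measure for comparing word or topic distributions.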
no code implementations • 21 Apr 2020 • Jinghui Lu, Brian Mac Namee
While simple vector representations such as bag-of-words and embedding-based representations built on techniques such as word2vec have been shown to be effective ways to represent documents during active learning, the emergence of representation mechanisms based on the pre-trained transformer-based neural network models popular in natural language processing research (e.g. BERT) offers a promising, and as yet not fully explored, alternative.
2 code implementations • 4 Oct 2019 • Jinghui Lu, Maeve Henchion, Brian Mac Namee
Active learning has been shown to be an effective way to exploit large collections of unlabelled data for machine learning tasks without needing to fully label them.