Search Results for author: Jinghui Lu

Found 21 papers, 15 papers with code

Dynamic-VLM: Simple Dynamic Visual Token Compression for VideoLLM

1 code implementation12 Dec 2024 Han Wang, Yuxiang Nie, YongJie Ye, Deng GuanYu, Yanjie Wang, Shuai Li, Haiyang Yu, Jinghui Lu, Can Huang

The application of Large Vision-Language Models (LVLMs) for analyzing images and videos is an exciting and rapidly evolving field.

Computational Efficiency

Graphusion: A RAG Framework for Knowledge Graph Construction with a Global Perspective

no code implementations23 Oct 2024 Rui Yang, Boming Yang, Aosong Feng, Sixun Ouyang, Moritz Blum, Tianwei She, Yuang Jiang, Freddy Lecue, Jinghui Lu, Irene Li

Using the Graphusion-constructed KG, we achieve a significant improvement on the benchmark, for example, a 9.2% accuracy improvement on sub-graph completion.

graph construction Knowledge Graphs +3

A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding

1 code implementation2 Jul 2024 Jinghui Lu, Haiyang Yu, Yanjie Wang, YongJie Ye, Jingqun Tang, Ziwei Yang, Binghong Wu, Qi Liu, Hao Feng, Han Wang, Hao liu, Can Huang

Recently, many studies have demonstrated that exclusively incorporating OCR-derived text and spatial layouts with large language models (LLMs) can be highly effective for document understanding tasks.

document understanding Key Information Extraction +6
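
As a loose illustration of that idea, the sketch below (hypothetical code, not the paper's implementation) serialises OCR words together with quantised bounding-box markers so a text-only LLM receives both content and spatial layout; the paper itself compresses each box into a single token, which this plain-text rendering only approximates.

    # Hypothetical sketch: interleave OCR text with layout by rendering each
    # word's bounding box as a compact marker next to its text.
    from typing import List, Tuple

    def serialize_ocr(words: List[Tuple[str, Tuple[int, int, int, int]]],
                      bin_size: int = 100) -> str:
        """Turn (word, (x0, y0, x1, y1)) pairs into a layout-aware string.

        Coordinates are quantised into coarse bins so each box costs only a
        few characters in the prompt.
        """
        parts = []
        for text, (x0, y0, x1, y1) in words:
            box = "<box_{}_{}_{}_{}>".format(x0 // bin_size, y0 // bin_size,
                                             x1 // bin_size, y1 // bin_size)
            parts.append(f"{box}{text}")
        return " ".join(parts)

    prompt = ("Extract the invoice total from the following OCR output:\n"
              + serialize_ocr([("Total:", (40, 900, 120, 930)),
                               ("$1,250.00", (130, 900, 260, 930))]))
    print(prompt)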

What Makes Good Few-shot Examples for Vision-Language Models?

no code implementations22 May 2024 Zhaojun Guo, Jinghui Lu, Xuejing Liu, Rui Zhao, Zhenxing Qian, Fei Tan

Despite the notable advancements achieved by leveraging pre-trained vision-language (VL) models through few-shot tuning for downstream tasks, our detailed empirical study highlights a significant dependence of few-shot learning outcomes on the careful selection of training examples - a facet that has been previously overlooked in research.

Active Learning Few-Shot Learning

MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering

1 code implementation20 May 2024 Jingqun Tang, Qi Liu, YongJie Ye, Jinghui Lu, Shu Wei, Chunhui Lin, Wanqing Li, Mohamad Fitri Faiz Bin Mahmood, Hao Feng, Zhen Zhao, Yanjie Wang, Yuliang Liu, Hao liu, Xiang Bai, Can Huang

Text-Centric Visual Question Answering (TEC-VQA) in its proper format not only facilitates human-machine interaction in text-centric visual environments but also serves as a de facto gold proxy to evaluate AI models in the domain of text-centric scene understanding.

Benchmarking Question Answering +4

Leveraging Large Language Models for Concept Graph Recovery and Question Answering in NLP Education

1 code implementation22 Feb 2024 Rui Yang, Boming Yang, Sixun Ouyang, Tianwei She, Aosong Feng, Yuang Jiang, Freddy Lecue, Jinghui Lu, Irene Li

We assess LLMs' zero-shot performance in creating domain-specific concept graphs and introduce TutorQA, a new expert-verified NLP-focused benchmark for scientific graph reasoning and QA.

Question Answering Text Generation

What Large Language Models Bring to Text-rich VQA?

no code implementations13 Nov 2023 Xuejing Liu, Wei Tang, Xinzhe Ni, Jinghui Lu, Rui Zhao, Zechao Li, Fei Tan

This pipeline achieved superior performance compared to the majority of existing Multimodal Large Language Models (MLLMs) on four text-rich VQA datasets.

Image Comprehension Optical Character Recognition (OCR) +2

Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts

1 code implementation21 Aug 2023 Fan Gao, Hang Jiang, Rui Yang, Qingcheng Zeng, Jinghui Lu, Moritz Blum, Dairui Liu, Tianwei She, Yuang Jiang, Irene Li

Educational materials such as survey articles in specialized fields like computer science traditionally require tremendous expert inputs and are therefore expensive to create and update.

Hallucination Machine Translation +2

UniDoc: A Universal Large Multimodal Model for Simultaneous Text Detection, Recognition, Spotting and Understanding

no code implementations19 Aug 2023 Hao Feng, Zijian Wang, Jingqun Tang, Jinghui Lu, Wengang Zhou, Houqiang Li, Can Huang

However, existing advanced algorithms are limited in how effectively they utilize the immense representational capabilities and rich world knowledge inherent to these large pre-trained models, and the beneficial connections among tasks in text-rich scenarios have not been sufficiently explored.

Instruction Following Text Detection +1

Deeply Coupled Cross-Modal Prompt Learning

1 code implementation29 May 2023 Xuejing Liu, Wei Tang, Jinghui Lu, Rui Zhao, Zhaojun Guo, Fei Tan

Recent multimodal foundation models (e.g., CLIP) have excelled in zero-shot generalization.

Domain Adaptation Few-Shot Learning +3

PUnifiedNER: A Prompting-based Unified NER System for Diverse Datasets

1 code implementation27 Nov 2022 Jinghui Lu, Rui Zhao, Brian Mac Namee, Fei Tan

In this work, we present a "versatile" model, the Prompting-based Unified NER system (PUnifiedNER), that works with data from different domains and can recognise up to 37 entity types simultaneously, with the potential to scale to even more.

named-entity-recognition Named Entity Recognition +1

What Makes Pre-trained Language Models Better Zero-shot Learners?

1 code implementation30 Sep 2022 Jinghui Lu, Dongsheng Zhu, Weidong Han, Rui Zhao, Brian Mac Namee, Fei Tan

Current methods for prompt learning in zero-shot scenarios widely rely on a development set with sufficient human-annotated data to select the best-performing prompt template a posteriori.

Language Modelling text-classification +2

A Rationale-Centric Framework for Human-in-the-loop Machine Learning

1 code implementation ACL 2022 Jinghui Lu, Linyi Yang, Brian Mac Namee, Yue Zhang

We present a novel rationale-centric framework with human-in-the-loop -- Rationales-centric Double-robustness Learning (RDL) -- to boost model out-of-distribution performance in few-shot learning scenarios.

BIG-bench Machine Learning Few-Shot Learning

A Sentence-level Hierarchical BERT Model for Document Classification with Limited Labelled Data

1 code implementation12 Jun 2021 Jinghui Lu, Maeve Henchion, Ivan Bacher, Brian Mac Namee

While, with the recent emergence of BERT, deep learning language models can achieve reasonably good performance in document classification with few labelled instances, there is little evidence on the utility of applying BERT-like models to long document classification.

Classification Document Classification +1
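
The sketch below is an assumed rendering of the sentence-level hierarchical idea, not the authors' released code: each sentence is encoded independently with a pre-trained BERT, and the resulting sentence vectors are aggregated by a small recurrent layer before classification, which keeps long documents within BERT's input limit.

    # Illustrative sketch of a sentence-level hierarchical BERT classifier.
    import torch
    import torch.nn as nn
    from transformers import AutoModel, AutoTokenizer

    class HierarchicalBertClassifier(nn.Module):
        def __init__(self, num_labels, model_name="bert-base-uncased"):
            super().__init__()
            self.tokenizer = AutoTokenizer.from_pretrained(model_name)
            self.encoder = AutoModel.from_pretrained(model_name)
            hidden = self.encoder.config.hidden_size
            self.sentence_rnn = nn.GRU(hidden, hidden, batch_first=True)
            self.classifier = nn.Linear(hidden, num_labels)

        def forward(self, sentences):
            # Encode each sentence independently; keep the [CLS] vector.
            batch = self.tokenizer(sentences, padding=True, truncation=True,
                                   max_length=128, return_tensors="pt")
            cls = self.encoder(**batch).last_hidden_state[:, 0, :]
            # Treat the sentence sequence as one document and pool over it.
            _, doc = self.sentence_rnn(cls.unsqueeze(0))
            return self.classifier(doc.squeeze(0))

    model = HierarchicalBertClassifier(num_labels=2)
    logits = model(["The first sentence of a long document.",
                    "Another sentence, encoded separately then aggregated."])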

Diverging Divergences: Examining Variants of Jensen Shannon Divergence for Corpus Comparison Tasks

no code implementations LREC 2020 Jinghui Lu, Maeve Henchion, Brian Mac Namee

Jensen-Shannon divergence (JSD) is a distribution similarity measurement widely used in natural language processing.
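
For reference, the snippet below computes the standard equal-weight JSD that the paper's variants build on; the example corpora are illustrative only, not the paper's data.

    # Minimal sketch of Jensen-Shannon divergence between two distributions:
    # JSD(P || Q) = 0.5 * KL(P || M) + 0.5 * KL(Q || M), with M = (P + Q) / 2.
    import numpy as np

    def jensen_shannon_divergence(p, q):
        p = np.asarray(p, dtype=float); p = p / p.sum()
        q = np.asarray(q, dtype=float); q = q / q.sum()
        m = 0.5 * (p + q)

        def kl(a, b):
            mask = a > 0
            return float(np.sum(a[mask] * np.log2(a[mask] / b[mask])))

        return 0.5 * kl(p, m) + 0.5 * kl(q, m)

    # Example: word-frequency counts from two corpora over a shared vocabulary.
    corpus_a = [10, 5, 0, 3]
    corpus_b = [8, 6, 2, 1]
    print(jensen_shannon_divergence(corpus_a, corpus_b))  # in [0, 1] with log base 2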

Investigating the Effectiveness of Representations Based on Pretrained Transformer-based Language Models in Active Learning for Labelling Text Datasets

no code implementations21 Apr 2020 Jinghui Lu, Brian MacNamee

While simple vector representations such as bag-of-words, and embedding-based representations built on techniques such as word2vec, have been shown to be effective ways to represent documents during active learning, the emergence of representation mechanisms based on the pre-trained transformer-based neural network models popular in natural language processing research (e.g. BERT) offers a promising, and as yet not fully explored, alternative.

Active Learning Word Embeddings
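
The sketch below illustrates the kind of pool-based uncertainty-sampling loop such studies evaluate, with documents represented by fixed, pre-computed embeddings (e.g. from BERT) and a simple downstream classifier; the query strategy, seed size, and round counts here are assumptions, not the paper's exact protocol.

    # Rough sketch of pool-based active learning over fixed document embeddings.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def uncertainty_sampling_loop(X, y, seed_size=10, query_size=10, rounds=5):
        """X: (n_docs, dim) document embeddings; y: labels revealed only when queried."""
        rng = np.random.default_rng(0)
        labelled = list(rng.choice(len(X), size=seed_size, replace=False))
        pool = [i for i in range(len(X)) if i not in labelled]
        clf = LogisticRegression(max_iter=1000)
        for _ in range(rounds):
            clf.fit(X[labelled], y[labelled])
            # Query the pool documents the classifier is least confident about.
            probs = clf.predict_proba(X[pool])
            uncertainty = 1.0 - probs.max(axis=1)
            picked = np.argsort(-uncertainty)[:query_size]
            newly = [pool[i] for i in picked]
            labelled.extend(newly)
            pool = [i for i in pool if i not in newly]
        return clf, labelled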

Investigating the Effectiveness of Representations Based on Word-Embeddings in Active Learning for Labelling Text Datasets

2 code implementations4 Oct 2019 Jinghui Lu, Maeve Henchion, Brian Mac Namee

Active learning has been shown to be an effective way to alleviate some of the effort required in utilising large collections of unlabelled data for machine learning tasks without needing to fully label them.

Active Learning BIG-bench Machine Learning +1
