no code implementations • EMNLP 2021 • Xueguang Ma, Minghan Li, Kai Sun, Ji Xin, Jimmy Lin
Recent work has shown that dense passage retrieval techniques achieve better ranking accuracy in open-domain question answering compared to sparse retrieval techniques such as BM25, but at the cost of large space and memory requirements.
no code implementations • NAACL (TrustNLP) 2022 • Minghan Li, Xueguang Ma, Jimmy Lin
The bi-encoder design of dense passage retriever (DPR) is a key factor in its success in open-domain question answering (QA), yet it is unclear how DPR’s question encoder and passage encoder individually contribute to overall performance, which we refer to as the encoder attribution problem.
1 code implementation • Findings (EMNLP) 2021 • Minghan Li, Ming Li, Kun Xiong, Jimmy Lin
Our method reaches state-of-the-art performance on 5 benchmark QA datasets, with up to 10% improvement in top-100 accuracy compared to a joint-training multi-task DPR on SQuAD.
no code implementations • 13 Mar 2024 • Minghan Li, Eric Gaussier
Experiments on standard dense retrieval and conversational dense retrieval models both demonstrate improvements over baseline models when they are fine-tuned on the pseudo-relevance labeled data.
1 code implementation • 28 Feb 2024 • Minghan Li, Shuai Li, Xindong Zhang, Lei Zhang
Despite the recent advances in unified image segmentation (IS), developing a unified video segmentation (VS) model remains a challenge.
Ranked #2 on Video Semantic Segmentation on VSPW (using extra training data)
no code implementations • 10 Dec 2023 • Shuai Li, Minghan Li, Pengfei Wang, Lei Zhang
To address these challenges, we present a universal transformer-based framework, abbreviated as OpenSD, which utilizes the same architecture and network parameters to handle open-vocabulary segmentation and detection tasks.
Ranked #8 on Zero Shot Segmentation on Segmentation in the Wild
no code implementations • 15 Nov 2023 • Minghan Li, Honglei Zhuang, Kai Hui, Zhen Qin, Jimmy Lin, Rolf Jagerman, Xuanhui Wang, Michael Bendersky
We first show that directly applying the expansion techniques in the current literature to state-of-the-art neural rankers can result in deteriorated zero-shot performance.
1 code implementation • 26 Mar 2023 • Minghan Li, Lei Zhang
As a result, the amount of pixel-wise annotations in existing video instance segmentation (VIS) datasets is small, limiting the generalization capability of trained VIS models.
Ranked #16 on Video Instance Segmentation on YouTube-VIS 2021
1 code implementation • CVPR 2023 • Minghan Li, Shuai Li, Wangmeng Xiang, Lei Zhang
The proposed MDQE is the first VIS method with per-clip input that achieves state-of-the-art results on challenging videos and competitive performance on simple videos.
Ranked #13 on Video Instance Segmentation on YouTube-VIS 2021
1 code implementation • CVPR 2023 • Shuai Li, Minghan Li, Ruihuang Li, Chenhang He, Lei Zhang
The positive and negative weights of these soft anchors are dynamically adjusted during training so that they can contribute more to "representation learning" in the early training stage, and contribute more to "duplicated prediction removal" in the later stage.
1 code implementation • 15 Feb 2023 • Sheng-Chieh Lin, Akari Asai, Minghan Li, Barlas Oguz, Jimmy Lin, Yashar Mehdad, Wen-tau Yih, Xilun Chen
We hence propose a new DA approach with diverse queries and sources of supervision to progressively train a generalizable DR. As a result, DRAGON, our dense retriever trained with diverse augmentation, is the first BERT-base-sized DR to achieve state-of-the-art effectiveness in both supervised and zero-shot evaluations and even competes with models using more complex late interaction (ColBERTv2 and SPLADE++).
no code implementations • 13 Feb 2023 • Xinyu Zhang, Minghan Li, Jimmy Lin
Recent progress in information retrieval finds that representing queries and documents as multiple vectors yields a bi-encoder retriever that is robust on out-of-distribution datasets.
1 code implementation • 13 Feb 2023 • Minghan Li, Sheng-Chieh Lin, Xueguang Ma, Jimmy Lin
Multi-vector retrieval methods have demonstrated their effectiveness on various retrieval datasets, and among them, ColBERT is the most established method based on the late interaction of contextualized token embeddings of pre-trained language models.
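As an illustration of the late interaction mentioned above, the following is a minimal sketch of the commonly described MaxSim scoring rule: each query token embedding is matched to its most similar document token embedding, and the per-token maxima are summed. The function name `maxsim_score` and the toy two-dimensional vectors are assumptions made for this example; real systems use learned contextualized embeddings and batched tensor operations.

```python
def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: for every query token, take the maximum
    dot product against all document tokens, then sum those maxima."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Toy embeddings (hypothetical values, one vector per token).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
doc_b = [[0.5, 0.5]]

score_a = maxsim_score(query, doc_a)  # each query token finds an exact match
score_b = maxsim_score(query, doc_b)  # each query token only partially matches
```

Because every query token is scored independently, a document is rewarded for covering each query token well rather than for matching a single pooled vector.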
no code implementations • 13 Dec 2022 • Minghan Li, Eric Gaussier
To address this issue, researchers have resorted to adversarial learning and query generation approaches; both approaches nevertheless resulted in limited improvements.
1 code implementation • 18 Nov 2022 • Minghan Li, Sheng-Chieh Lin, Barlas Oguz, Asish Ghoshal, Jimmy Lin, Yashar Mehdad, Wen-tau Yih, Xilun Chen
In this paper, we unify different multi-vector retrieval models from a token routing viewpoint and propose conditional token interaction via dynamic lexical routing, namely CITADEL, for efficient and effective multi-vector retrieval.
no code implementations • 13 Oct 2022 • Linqing Liu, Minghan Li, Jimmy Lin, Sebastian Riedel, Pontus Stenetorp
To balance these two considerations, we propose a combination of an effective filtering strategy and fusion of the retrieved documents based on the generation probability of each context.
1 code implementation • 31 Jul 2022 • Sheng-Chieh Lin, Minghan Li, Jimmy Lin
Pre-trained language models have been successful in many knowledge-intensive NLP tasks.
1 code implementation • 19 May 2022 • Minghan Li, Xinyu Zhang, Ji Xin, Hongyang Zhang, Jimmy Lin
For example, on MS MARCO Passage v1, our method yields an average candidate set size of 27 out of 1,000, which increases the reranking speed by about 37 times, while the MRR@10 is greater than a pre-specified value of 0.38 with about 90% empirical coverage; the empirical baselines fail to provide such a guarantee.
1 code implementation • CVPR 2022 • Yabin Zhang, Minghan Li, Ruihuang Li, Kui Jia, Lei Zhang
In this work, to the best of our knowledge for the first time, we propose to perform Exact Feature Distribution Matching (EFDM) by exactly matching the empirical Cumulative Distribution Functions (eCDFs) of image features, which can be implemented by applying Exact Histogram Matching (EHM) in the image feature space.
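One simple way to match two empirical CDFs exactly is by sorting, sketched below under stated assumptions: both feature vectors have the same length, and ties are broken by position. The function name `exact_histogram_match` is hypothetical, and this is not taken from the authors' released code; it only illustrates the idea that the output keeps the rank order of the content features while taking on exactly the values of the style features.

```python
def exact_histogram_match(content, style):
    """Return a vector with the rank order of `content` and exactly the
    values of `style`, so its empirical CDF equals that of `style`.

    Assumption: both inputs are flat lists of equal length.
    """
    if len(content) != len(style):
        raise ValueError("content and style must have the same length")

    # Positions of content elements, from smallest to largest value.
    order = sorted(range(len(content)), key=lambda i: content[i])
    sorted_style = sorted(style)

    matched = [0.0] * len(content)
    for rank, position in enumerate(order):
        # The element with the k-th smallest content value receives
        # the k-th smallest style value.
        matched[position] = sorted_style[rank]
    return matched


content_feat = [0.3, -1.2, 2.0, 0.9]     # hypothetical content features
style_feat = [10.0, 40.0, 20.0, 30.0]    # hypothetical style features
result = exact_histogram_match(content_feat, style_feat)
```

In this toy case the smallest content value (at index 1) receives the smallest style value, and so on up the ranks, so the sorted output is identical to the sorted style features.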
no code implementations • 12 Mar 2022 • Minghan Li, Lei Zhang
Based on the fact that adjacent frames in a short clip are highly coherent in content, we propose to extend the one-stage FiFo framework to a clip-in clip-out (CiCo) one, which performs VIS clip by clip.
1 code implementation • 18 Nov 2021 • Minghan Li, Diana Nicoleta Popa, Johan Chagnon, Yagmur Gizem Cinar, Eric Gaussier
On a wide range of natural language processing and information retrieval tasks, transformer-based models, particularly pre-trained language models like BERT, have demonstrated tremendous effectiveness.
no code implementations • 4 Oct 2021 • Minghan Li, Jimmy Lin
Previous work on the generalization of DPR mainly focuses on testing both encoders in tandem on out-of-distribution (OOD) question-answering (QA) tasks, which is also known as domain adaptation.
1 code implementation • 3 May 2021 • Thibaut Thonet, Yagmur Gizem Cinar, Eric Gaussier, Minghan Li, Jean-Michel Renders
To address this shortcoming, we propose SmoothI, a smooth approximation of rank indicators that serves as a basic building block to devise differentiable approximations of IR metrics.
1 code implementation • CVPR 2021 • Minghan Li, Shuai Li, Lida Li, Lei Zhang
To further explore temporal correlation among video frames, we aggregate a temporal fusion module to infer instance masks from each frame to its adjacent frames, which helps our framework handle videos with challenges such as motion blur, partial occlusion, and unusual object-to-camera poses.
Ranked #24 on Video Instance Segmentation on YouTube-VIS 2021
no code implementations • 31 Jul 2020 • Minghan Li, Xialei Liu, Joost Van de Weijer, Bogdan Raducanu
Active learning emerged as an alternative to reduce the effort of labeling huge amounts of data for data-hungry applications (such as image/video indexing and retrieval, autonomous driving, etc.).
1 code implementation • 18 Sep 2019 • Hong Wang, Yichen Wu, Minghan Li, Qian Zhao, Deyu Meng
Rain removal from video or a single image has thus attracted much research attention in the field of computer vision and pattern recognition, and various methods have been proposed for this task in recent years.
1 code implementation • 13 Sep 2019 • Minghan Li, Xiangyong Cao, Qian Zhao, Lei Zhang, Chenqiang Gao, Deyu Meng
Furthermore, a transformation operator imposed on the background scenes is embedded into the proposed model, which finely conveys the dynamic background transformations, such as rotations, scalings, and distortions, that inevitably exist in a real video sequence.
no code implementations • 3 Dec 2018 • Minghan Li, Tanli Zuo, Ruicheng Li, Martha White, Wei-Shi Zheng
Knowledge distillation is an effective technique that transfers knowledge from a large teacher model to a shallow student.
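For readers unfamiliar with the technique, the sketch below shows the widely used temperature-softened formulation of distillation: the student is trained to match the teacher's softened output distribution via a KL divergence scaled by the squared temperature. This is the generic formulation, not necessarily the variant studied in this paper, and the function names and default temperature are assumptions made for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    kl = sum(
        p * math.log(p / q)
        for p, q in zip(teacher_probs, student_probs)
        if p > 0.0
    )
    return kl * temperature ** 2


# Hypothetical logits for a three-class problem.
same = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
different = distillation_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0])
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge.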
no code implementations • CVPR 2018 • Minghan Li, Qi Xie, Qian Zhao, Wei Wei, Shuhang Gu, Jing Tao, Deyu Meng
Based on such understanding, we specifically formulate both characteristics into a multiscale convolutional sparse coding (MS-CSC) model for the video rain streak removal task.